Use logsoftmax operator in InfiniCore #52

Open

gongchensu wants to merge 3 commits into InfiniTensor:main from gongchensu:feature/use_logsoft_max_in_ppl

gongchensu commented Sep 19, 2025 (edited)

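Integrate the logsoftmax operator from the InfiniCore branch.
Add a completions endpoint so that, after launch_server is started, logprobs for requests with max_tokens=0 can be computed over HTTP.
Replace the torch log_softmax operator used in test_ppl and jiuge_ppl.
Align test_ppl's token chunking with jiuge_ppl so that both produce the same perplexity results.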
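
For readers unfamiliar with how log-softmax feeds into perplexity, the following is a minimal pure-Python sketch, not the InfiniCore operator or the code in test_ppl/jiuge_ppl. The function names, the toy logits, and the chunk sizes are all illustrative assumptions. It shows a numerically stable log-softmax and a perplexity computed by summing negative log-likelihoods chunk by chunk.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def chunk_nll(logits_per_pos, targets):
    """Summed negative log-likelihood of the target tokens in one chunk."""
    return -sum(log_softmax(row)[t] for row, t in zip(logits_per_pos, targets))


def perplexity(logits_per_pos, targets, chunk_size):
    """exp(mean NLL), accumulated over fixed-size chunks (illustrative helper)."""
    total_nll = 0.0
    total_tokens = 0
    for start in range(0, len(targets), chunk_size):
        end = start + chunk_size
        total_nll += chunk_nll(logits_per_pos[start:end], targets[start:end])
        total_tokens += len(targets[start:end])
    return math.exp(total_nll / total_tokens)


# Toy data (made up): 5 positions, vocabulary of 3 tokens.
logits = [
    [2.0, 0.5, -1.0],
    [0.1, 0.2, 0.3],
    [1.5, -0.5, 0.0],
    [0.0, 3.0, 1.0],
    [-1.0, 0.0, 2.0],
]
targets = [0, 2, 0, 1, 2]

ppl_whole = perplexity(logits, targets, chunk_size=len(targets))
ppl_chunked = perplexity(logits, targets, chunk_size=2)
print(ppl_whole, ppl_chunked)
```

With fixed logits, the chunk size does not change the result, because the sum of log-probabilities is the same however it is grouped. In a real model the chunk boundaries also determine how much context each token is conditioned on, so two scripts generally need the same chunking scheme for their perplexities to agree.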

gongchensu added 3 commits

September 17, 2025 15:25


          add completions endpoint

4c1528e


          Add completions endpoint, only support max_tokens=0

b9f48f8


          Replace torch's log_softmax with InfiniCore's logSoftmax operator.

189d8d5

consistent with jiuge_ppl's chunk method.
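
As a rough illustration of what a max_tokens=0 request to the new completions endpoint might look like, here is a hedged sketch that only builds the HTTP request and does not send it. Only `max_tokens=0` and the logprobs purpose come from this PR. The host, port, path, model name, and the `echo` and `logprobs` fields are assumptions modeled on OpenAI-style completions APIs, and the actual server may expect something different.

```python
import json
import urllib.request

# Hypothetical values: the real host, port, and path depend on launch_server.
BASE_URL = "http://localhost:8000"
ENDPOINT = "/completions"


def build_logprobs_request(prompt, model="jiuge"):
    """Build (but do not send) a request that scores the prompt only."""
    payload = {
        "model": model,      # hypothetical model name
        "prompt": prompt,
        "max_tokens": 0,     # from the PR: generate nothing, only score the prompt
        "echo": True,        # assumption: return the prompt tokens themselves
        "logprobs": 1,       # assumption: ask for per-token log-probabilities
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_logprobs_request("The quick brown fox")
body = json.loads(req.data)
print(req.get_method(), req.full_url, body)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server. The shape of the response is not described in this PR, so it is left out here.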

