【Feature】: add GLM-47 tool parser and support thinking/non-thinking mode toggle`#151
Conversation
|
@kurkol Please review this PR. |
|
Thanks for your work on adapting the GLM-4.7 model. However, during the adaptation, please avoid overwriting entire vLLM source files. Instead, use localized replacements or override specific class methods to maximize the vllm-kunlun plugin’s compatibility with different vLLM versions. You can refer to other adaptation PRs for concrete examples. |
|
In particular, you can follow the approach used in this PR, which replaces the relevant class methods: #75 |
696e007 to
db9bb28
Compare
To keep the PR focused, I have only modified two methods within this class to support toggling between thinking and non-thinking modes. This ensures minimal impact on the existing logic. |
OK, then just fix the file naming — it should be glm, not gim. |
Signed-off-by: zhangzhenyi <zhangzhenyi@baidu.com>
db9bb28 to
6710fcf
Compare
done |
Signed-off-by: zhangzhenyi <zhangzhenyi@baidu.com>
Signed-off-by: zhangzhenyi <zhangzhenyi@baidu.com> Co-authored-by: Li Wei <liwei.109@outlook.com>
PR Description
1. Background
Currently, vLLM does not support Function Calling for the GLM-47 (GLM4) series models. This PR introduces a dedicated tool parser for GLM-47 and adds a control mechanism for its hybrid "thinking" (reasoning) mode.
2. Changes
glm47tool parser. Users can enable it by passing the command-line argument:chat_template_kwargs."chat_template_kwargs": {"enable_thinking": true/false}.</think>existed in the prompt) to determine the reasoning state, which was unreliable in multi-turn dialogues.enable_thinkingparameter, ensuring consistent behavior across complex conversation histories.3. Test Plan
--tool-call-parser glm47flag.enable_thinkingcorrectly toggles the model's reasoning output without state confusion.Notes: This PR improves the robustness of GLM4 integration by moving away from prompt-parsing heuristics to explicit parameter control.