@marceldev89 marceldev89 commented Oct 24, 2025

Note

Original work and PR by bold84 @ #15019

This pull request resolves #15012 and introduces comprehensive support for the Qwen3-Coder model family's XML-based tool-calling format. It includes a new, robust XML parser and updated chat template detection logic to ensure reliable function calling.

Key Changes:

  1. New XML Parser (common/chat-parser.cpp):

    • A dedicated, non-streaming XML parser has been implemented to handle Qwen3-Coder's specific output format.
    • Features include robust attribute parsing, improved error reporting, and efficient function lookups using a hash set.
  2. Chat Template Detection (common/chat.h, common/chat.cpp):

    • The chat template detection logic has been updated to correctly identify Qwen3-Coder models, preventing conflicts with other formats like Hermes 2.
    • Ensures the QWEN3_CODER_XML format is applied consistently, even when no tools are explicitly provided in the request.
  3. Comprehensive tests (tests/test-chat.cpp):

    • Tests covering the parser logic have been added.
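
To make the parser description above more concrete, here is a minimal sketch (C++17) of non-streaming parsing with a hash-set function lookup. It is not the code from this PR. The `<tool_call>` / `<function=NAME>` / `<parameter=KEY>` layout is an assumption based on the Qwen3-Coder chat template, and `ToolCall`, `trim`, and `parse_tool_call` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <unordered_set>

// Hypothetical result type for this sketch (not a llama.cpp type).
struct ToolCall {
    std::string name;
    std::map<std::string, std::string> params;
};

// Strip surrounding whitespace; the model tends to put values on their own lines.
static std::string trim(const std::string & s) {
    const char * ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parse the first tool call in `text`. Returns std::nullopt when the block is
// missing, malformed, or names a function that is not in `known`.
static std::optional<ToolCall> parse_tool_call(const std::string & text,
                                               const std::unordered_set<std::string> & known) {
    const std::string fn_open = "<function=";
    const std::string fn_close = "</function>";
    const std::string p_open = "<parameter=";
    const std::string p_close = "</parameter>";
    const std::size_t npos = std::string::npos;

    const std::size_t start = text.find("<tool_call>");
    if (start == npos) return std::nullopt;
    const std::size_t fn = text.find(fn_open, start);
    if (fn == npos) return std::nullopt;
    const std::size_t name_end = text.find('>', fn);
    const std::size_t fn_end = text.find(fn_close, fn);
    if (name_end == npos || fn_end == npos || name_end > fn_end) return std::nullopt;

    ToolCall call;
    call.name = trim(text.substr(fn + fn_open.size(), name_end - fn - fn_open.size()));
    // Average O(1) lookup of the function name in the hash set.
    if (known.count(call.name) == 0) return std::nullopt;

    std::size_t pos = name_end + 1;
    while (true) {
        const std::size_t p = text.find(p_open, pos);
        if (p == npos || p >= fn_end) break;  // no more parameters in this function
        const std::size_t key_end = text.find('>', p);
        const std::size_t val_end = text.find(p_close, p);
        if (key_end == npos || val_end == npos || key_end > val_end || val_end > fn_end) {
            return std::nullopt;  // unterminated or overlapping parameter tag
        }
        const std::string key = trim(text.substr(p + p_open.size(), key_end - p - p_open.size()));
        call.params[key] = trim(text.substr(key_end + 1, val_end - key_end - 1));
        pos = val_end + p_close.size();
    }
    return call;
}
```

In this sketch every parameter value stays a string. A real parser would also need to convert values according to the tool's JSON schema and decide what to do when the opening `<tool_call>` tag is missing.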

Known issues:

  • The model (Qwen3-Coder-30B-A3B-Instruct-UD-Q*_K_XL.gguf) occasionally omits the opening <tool_call> tag before a tool call. This seems to be an issue with the model itself rather than the parser, but that is unconfirmed.

bold84 and others added 23 commits August 2, 2025 02:02
…r_edit

Fix grammar, hide tool_call from output
Add missing closing brace to terminate test_template_output_parsers() function. This resolves compilation errors that prevented successful build of the test-chat target.
Co-authored-by: Kashyap Jois <[email protected]>
Co-authored-by: Kashyap Jois <[email protected]>
Co-authored-by: Marcel de Vries <[email protected]>
Co-authored-by: Marcel de Vries <[email protected]>
…ranches; add tests

- chat-parser: support schema.type as array (e.g. ["number","null"]) in convert_qwen3_param_value()
- chat: resolve $refs; allow unions including "string" as freeform; sanitize empty {"not":{}} in anyOf/oneOf before add_schema
- tests: add Qwen3-Coder regression ensuring grammar builds with unions and ignores {"not":{}}
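
The first commit note above mentions accepting `schema.type` given as an array such as `["number","null"]`. The sketch below illustrates one way such a conversion could behave. It is not the body of `convert_qwen3_param_value()`. The helper names `has_type` and `convert_param_sketch` are hypothetical, the schema types are passed as a plain vector instead of a JSON object, and the number check is looser than the strict JSON number grammar.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <vector>

// True when the schema's type list contains `t`.
static bool has_type(const std::vector<std::string> & types, const std::string & t) {
    return std::find(types.begin(), types.end(), t) != types.end();
}

// Turn the raw text of a parameter into a JSON literal, trying each allowed
// schema type in turn and falling back to a JSON string.
static std::string convert_param_sketch(const std::string & raw,
                                        const std::vector<std::string> & types) {
    if (has_type(types, "null") && raw == "null") return "null";
    if (has_type(types, "boolean") && (raw == "true" || raw == "false")) return raw;
    if (has_type(types, "number") || has_type(types, "integer")) {
        // Only digits, sign, dot, and exponent characters are allowed, so that
        // strtod extensions such as "inf" or hex floats are rejected.
        const bool plausible = !raw.empty() &&
            (raw[0] == '-' || (raw[0] >= '0' && raw[0] <= '9')) &&
            raw.find_first_not_of("0123456789+-.eE") == std::string::npos;
        if (plausible) {
            char * end = nullptr;
            std::strtod(raw.c_str(), &end);
            if (end == raw.c_str() + raw.size()) return raw;  // whole string consumed
        }
    }
    // Fallback: emit a JSON string, escaping the most common special characters.
    std::string out = "\"";
    for (const char c : raw) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out + "\"";
}
```

With a type list of `{"number", "null"}`, the raw text `null` becomes the JSON literal null and `42` stays a number, while the same text `null` under `{"string"}` is kept as a quoted string.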
@github-actions bot added the "testing" label (Everything test related) on Oct 24, 2025
@coder543

Anecdotally, I observed that the previous PR (and presumably this PR too) essentially fixed tool calling for qwen3-coder. Although when trying to use it with codex, qwen3-coder absolutely refuses to use the apply_patch tool, opting to use sed instead, which is probably just a training issue?

It would be nice to get this PR merged in.

@marceldev89
Author

marceldev89 commented Oct 24, 2025

> Anecdotally, I observed that the previous PR (and presumably this PR too) essentially fixed tool calling for qwen3-coder. Although when trying to use it with codex, qwen3-coder absolutely refuses to use the apply_patch tool, opting to use sed instead, which is probably just a training issue?
>
> It would be nice to get this PR merged in.

I guess you could test it through openrouter or something and check if you see the same behavior there as well. My guess would be that it's a model thing and not so much this PR. Or maybe even a codex thing since it's probably heavily optimized for GPT models in terms of system prompt and tool descriptions.

@MartyLake

Hey, just confirming that running this branch fixes the integration with Qwen3-Coder-30B-A3B.

Reproduction steps:

# Compile this branch
mkdir -p "$HOME/bin" && cd "$HOME/bin"
git clone https://github.com/marceldev89/llama.cpp.git llama.cpp-fork-sources && cd llama.cpp-fork-sources
cmake -Bbuild && cmake --build build --target llama-server --parallel

# Install qwen
brew install qwen-coder

# Launch model
$HOME/bin/llama.cpp-fork-sources/build/bin/llama-server --port 8012 --host 0.0.0.0 --jinja -ngl 99 -c 300000 -m $HOME/.lmstudio/models/hf.co/hf.co-unsloth-Qwen3-Coder-30B-A3B-Instruct-GGUF-UD-Q4-K-XL-GGUF/hf.co-unsloth-Qwen3-Coder-30B-A3B-Instruct-GGUF-UD-Q4-K-XL.gguf

# Launch qwen
OPENAI_API_KEY=no OPENAI_BASE_URL=http://localhost:8012/v1 OPENAI_MODEL=models/hf.co-unsloth-Qwen3-Coder-30B-A3B-Instruct-GGUF-UD-Q4-K-XL.gguf qwen

PS: I opened too many tabs while figuring this out, and I can't find the sources any more to credit them properly. I invented nothing here; credit goes to whoever wrote the pieces first.


Development

Successfully merging this pull request may close these issues.

Feature Request: Qwen3-Coder Tool Call Parser

4 participants