Improve Hermes tool-call recovery #206
We had some free compute available for another, deeper pass here, especially since this is stacked on #204 and would help a lot with the Hermes/Qwen autonomous coding path. These are suggestions from a multi-pass review, not a formal requested-changes review. 🙂 I trimmed this to the pieces that still look specific to #206 after #204 is merged/rebased.
Suggestions to check before merge
That means a real string value containing formatted JSON file content is parsed and reserialized, losing whitespace and the trailing newline:
content = "{\n \"compilerOptions\": {\n \"strict\": true\n }\n}\n"
text = "Calling tool: write_file({\"path\":\"/tmp/tsconfig.json\", \"content\": " + json.dumps(content) + "})"
_, calls = parse_tool_calls(text, request)
args = json.loads(calls[0].function.arguments)
# current:
args["content"] == "{\"compilerOptions\": {\"strict\": true}}"
# expected:
args["content"] == content
Suggestion: infer the schema type first. If the schema explicitly says
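The schema-first idea can be sketched roughly as follows. This is only an illustration, not the project's implementation: `coerce_argument` and `STRING_TYPES` are hypothetical names, and the set of string-like type names is an assumption.

```python
import json

# Assumed set of schema type names that mean "keep the value as text".
STRING_TYPES = {"string", "str", "text"}


def coerce_argument(value, schema_type):
    """Hypothetical helper: decide how to treat one tool-call argument."""
    if schema_type in STRING_TYPES:
        # The schema asks for a string: never parse it, so file content
        # keeps its indentation and trailing newline byte for byte.
        if isinstance(value, str):
            return value
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, str) and value.strip().startswith(("{", "[")):
        # Only for non-string schema types: try to recover nested JSON
        # that the model emitted as a quoted string.
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value
```

With a check like this, the `write_file` case above would return `content` unchanged, while an argument declared as `object` could still be recovered from a quoted JSON string.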
Examples that currently return
"The best way to test this API is with pytest."
"The article explains how to write plugins for this API."
"Path is \"path\" in JSON schema."
"I will summarize the API."
Suggestion: narrow the retry to stronger deferred-tool signals, maybe exact assistant promise patterns plus raw partial tool-call tails, and consider respecting
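A narrower trigger along those lines might look like the sketch below. Everything here is hypothetical: the function name `looks_like_deferred_tool_call`, both regular expressions, and the assumption that raw calls use the `Calling tool: name(` prefix seen in the earlier example. The partial-tail check is a heuristic and will miss tails whose content contains a closing parenthesis.

```python
import re

# Hypothetical "assistant promise" pattern: an explicit statement that a
# named tool is about to be called, anchored at the start of a line.
PROMISE_RE = re.compile(
    r"^\s*(?:let me|i(?:'ll| will)(?: now)?)\s+"
    r"(?:call|use|invoke|run)\s+the\s+[\w.-]+\s+tool\b",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical "raw partial tail" pattern: a tool call that opens its
# argument list but never closes it before the text ends.
PARTIAL_TAIL_RE = re.compile(r"Calling tool:\s*[\w.-]+\s*\([^)]*\Z")


def looks_like_deferred_tool_call(text: str) -> bool:
    """Return True only for strong signs that a tool call was intended."""
    return bool(PROMISE_RE.search(text) or PARTIAL_TAIL_RE.search(text))
```

Ordinary prose that merely mentions an API, a path, or a schema does not match either pattern, so it would not start a hidden retry.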
If the hidden retry loop stays, a few details are worth tightening:
The balanced raw JSON extraction handles braces inside string/file content now, which is good. The added test covers
A public
Validated locally on #206 head:
SOP §0–§2 review. Holding: depends on #204 outcome + has its own scope concerns.
§0 Necessity: ✅ partially. The JSON-content-in-string fix is needed (it actually addresses P1#1 from my #204 review). The deferred-tool-use retry is more speculative.
Relationship to #204: I left review on #204 requesting a fix for
if schema_type in ("string", "str", "text", "varchar", "char", "enum"):
    if isinstance(value, str):
        return value
    return json.dumps(value, ensure_ascii=False)
That's exactly the right fix. Suggested path: backport this
§2, issues with #206-specific scope (the retry logic): In
Suggested fixes for the retry path:
§3 + §4: I'll run targeted tests after #204 lands and you've rebased. To summarize the path I recommend:
Thanks for the work. The parser fixes are excellent; the retry logic just needs a tighter trigger.
Based on PR #204.
This keeps the PR #204 tool-call fixes and adds the smallest extra recovery needed for Hermes/Qwen agent runs to complete autonomous file-writing tasks.
Changes:
Validation:
uv run pytest tests/test_chat_tool_retry.py tests/test_tool_calling.py tests/test_tool_parsers.py tests/test_upstream_regression.py tests/test_postprocessor.py
uv run ruff check vllm_mlx/routes/chat.py vllm_mlx/api/tool_calling.py vllm_mlx/service/helpers.py tests/test_tool_calling.py tests/test_chat_tool_retry.py
bun test passed with 19 tests.