fix(codex): merge consecutive assistant messages & raise idle timeout for MiMo multi-round hang #3896
lssuzie wants to merge 11 commits into
… for MiMo multi-round hang

Two bugs caused multi-round tool calls to hang with MiMo/MiniMax providers:

1. **content: null → content: ""**: Strict Chat Completions providers reject null content on assistant messages with tool_calls and return HTTP 400. Replace it with an empty string in both `flush_pending_tool_calls` and `responses_message_item_to_chat_message`.
2. **Consecutive assistant messages**: In the Responses API, a message item (assistant with text) and a function_call item are separate output items. When converted to Chat Completions, they must be merged into a single assistant message carrying both content and tool_calls. Without merging, strict providers reject the request for containing invalid consecutive assistant messages.

Two further changes are included:

3. **Inline think state machine short-circuit**: When reasoning was already captured via reasoning_content (MiMo/DeepSeek/Kimi), skip the inline thinking block state machine to avoid buffering lag.
4. **streaming_idle_timeout raised to 300s**: Long-reasoning models often exceed the old 120s default between reasoning bursts. The default is now 300s, with a v10→v11 migration that preserves user overrides.

Fixes farion1231#3561
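The conversion described in items 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `transform_codex_chat.rs`: the types `ResponseItem`, `ChatMessage`, and `ToolCall` and the function `to_chat_messages` are hypothetical stand-ins, and the real implementation works on JSON values rather than these simplified structs.

```rust
// Hypothetical, simplified stand-ins for Responses API output items.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseItem {
    Message { text: String },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub arguments: String,
}

// Hypothetical, simplified Chat Completions messages.
// `content` is a plain String, so it can be empty but never null.
#[derive(Debug, Clone, PartialEq)]
pub enum ChatMessage {
    Assistant { content: String, tool_calls: Vec<ToolCall> },
    Tool { tool_call_id: String, content: String },
}

/// Convert output items to chat messages, folding adjacent assistant
/// items (text and function calls) into a single assistant message.
pub fn to_chat_messages(items: &[ResponseItem]) -> Vec<ChatMessage> {
    let mut out: Vec<ChatMessage> = Vec::new();
    for item in items {
        match item {
            ResponseItem::Message { text } => match out.last_mut() {
                // Previous message is already an assistant turn: append the text.
                Some(ChatMessage::Assistant { content, .. }) => content.push_str(text),
                _ => out.push(ChatMessage::Assistant {
                    content: text.clone(),
                    tool_calls: Vec::new(),
                }),
            },
            ResponseItem::FunctionCall { call_id, name, arguments } => {
                let call = ToolCall {
                    id: call_id.clone(),
                    name: name.clone(),
                    arguments: arguments.clone(),
                };
                match out.last_mut() {
                    // Attach the call to the preceding assistant message.
                    Some(ChatMessage::Assistant { tool_calls, .. }) => tool_calls.push(call),
                    // No text preceded the call: use "" rather than null content.
                    _ => out.push(ChatMessage::Assistant {
                        content: String::new(),
                        tool_calls: vec![call],
                    }),
                }
            }
            ResponseItem::FunctionCallOutput { call_id, output } => {
                out.push(ChatMessage::Tool {
                    tool_call_id: call_id.clone(),
                    content: output.clone(),
                });
            }
        }
    }
    out
}
```

With this shape, a text item followed by a function call yields one assistant message, and a function call that follows a tool result starts a new assistant message whose content is an empty string.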
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d70024ba2
The CI Check Rust formatting step failed because the original fix commit was not run through cargo fmt. This commit applies cargo fmt to the three affected files:

- src-tauri/src/proxy/providers/streaming_codex_chat.rs
- src-tauri/src/proxy/providers/transform_codex_chat.rs
- src-tauri/src/proxy/types.rs

No logic changes.
When a `<think>` open tag is split across SSE chunks, the previous implementation emitted the second chunk's content as plain text, leaking the model's reasoning to the user. This was flagged by codex-bot in the PR farion1231#3896 review.

Add a short-delta buffer in `content_when_reasoning_already_seen`:

- Deltas of 16 chars or less that contain a `<` are buffered.
- The next delta is concatenated with the buffer before processing, so the combined text can be correctly identified as a think block.
- At `finalize()` time, any still-buffered content is emitted as text, so content is never silently dropped (e.g. if the stream ends mid-tag).

A new test (`split_think_open_tag_across_chunks_does_not_leak`) exercises this path: reasoning_content arrives, then chunk 1 is `<think>` (7 chars) and chunk 2 is `plan</think>\nanswer`. The test verifies that the word `plan` does not leak as plain text output.
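The buffering rule above can be sketched as a small standalone filter. This is an illustrative approximation only: `ThinkFilter`, `push`, and `strip_think` are hypothetical names rather than the PR's actual functions, and the sketch does not handle a long delta that ends in a partial tag.

```rust
/// Deltas at or below this length that contain '<' are held back (value from the commit message).
const SHORT_DELTA_MAX_CHARS: usize = 16;

/// Hypothetical filter that hides inline <think>...</think> blocks from visible output.
#[derive(Debug, Default)]
pub struct ThinkFilter {
    pending: String, // short delta held back until the next chunk arrives
    in_think: bool,  // true while inside an unclosed <think> block
}

impl ThinkFilter {
    /// Feed one streamed delta; returns the text that is safe to show.
    pub fn push(&mut self, delta: &str) -> String {
        // Hold back a short delta that might be (part of) a tag.
        // Only one delta is buffered at a time, so output cannot stall forever.
        if self.pending.is_empty()
            && delta.chars().count() <= SHORT_DELTA_MAX_CHARS
            && delta.contains('<')
        {
            self.pending.push_str(delta);
            return String::new();
        }
        // Prepend whatever was buffered, then process the combined text.
        let mut combined = std::mem::take(&mut self.pending);
        combined.push_str(delta);
        self.strip_think(&combined)
    }

    /// Called at end of stream: emit buffered text instead of dropping it.
    pub fn finalize(&mut self) -> String {
        std::mem::take(&mut self.pending)
    }

    /// Remove think blocks from `rest`, tracking state across calls.
    fn strip_think(&mut self, mut rest: &str) -> String {
        let mut visible = String::new();
        loop {
            if self.in_think {
                match rest.find("</think>") {
                    Some(end) => {
                        rest = &rest[end + "</think>".len()..];
                        self.in_think = false;
                    }
                    // Block still open: everything left is reasoning.
                    None => return visible,
                }
            } else {
                match rest.find("<think>") {
                    Some(start) => {
                        visible.push_str(&rest[..start]);
                        rest = &rest[start + "<think>".len()..];
                        self.in_think = true;
                    }
                    None => {
                        visible.push_str(rest);
                        return visible;
                    }
                }
            }
        }
    }
}
```

The trade-off is one chunk of added latency for short deltas containing `<`, in exchange for never emitting a partial tag or the reasoning that follows it.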
@farion1231 The CI workflow is awaiting your first-time-contributor approval — could you approve it so the checks can run? I also pushed a follow-up commit that addresses the codex-bot review feedback about partial tag leaks across SSE chunks (short-delta buffer in content_when_reasoning_already_seen, plus a new regression test). New commits since the last push:
The original 3 commits (merge consecutive assistant messages, content: null, idle timeout, etc.) are unchanged. Let me know if the new buffering approach looks right or if you prefer a different solution.
…tion test

The `streaming_idle_timeout` column is normally added by startup compatibility code, not in any versioned migration. The test `schema_migration_v4_adds_pricing_model_columns` seeds the database at v4 and runs the full migration chain, skipping the startup path. When v10→v11 runs, the column doesn't exist.

Fix: add the `streaming_first_byte_timeout`, `streaming_idle_timeout`, and `non_streaming_timeout` columns in `migrate_v4_to_v5`, before any later migration references them.
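The ordering problem can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: the real migrations run SQL against the database, `Row` and `ensure_column` are hypothetical, and the rule that only rows still holding the old 120s default are bumped to 300s is a guess at how user overrides are preserved.

```rust
use std::collections::BTreeMap;

const OLD_IDLE_TIMEOUT_SECS: u64 = 120; // old default named in the PR
const NEW_IDLE_TIMEOUT_SECS: u64 = 300; // new default named in the PR

/// Toy stand-in for one config row: column name -> value (None models NULL).
pub type Row = BTreeMap<String, Option<u64>>;

/// Add a column only if it is missing, so re-running is harmless
/// and existing user values are never overwritten.
pub fn ensure_column(row: &mut Row, name: &str, default: Option<u64>) {
    row.entry(name.to_string()).or_insert(default);
}

/// Sketch of the fix: create the timeout columns early in the chain.
pub fn migrate_v4_to_v5(row: &mut Row) {
    // Defaults for the other two columns are not stated in the PR; NULL is a placeholder.
    ensure_column(row, "streaming_first_byte_timeout", None);
    ensure_column(row, "streaming_idle_timeout", Some(OLD_IDLE_TIMEOUT_SECS));
    ensure_column(row, "non_streaming_timeout", None);
}

/// Sketch of the later migration: fails if the column was never created.
pub fn migrate_v10_to_v11(row: &mut Row) -> Result<(), String> {
    let slot = row
        .get_mut("streaming_idle_timeout")
        .ok_or_else(|| "no such column: streaming_idle_timeout".to_string())?;
    // Assumed override rule: only values equal to the old default are raised.
    if *slot == Some(OLD_IDLE_TIMEOUT_SECS) {
        *slot = Some(NEW_IDLE_TIMEOUT_SECS);
    }
    Ok(())
}
```

Run against an empty row, `migrate_v10_to_v11` alone returns an error, which mirrors the failing test; running `migrate_v4_to_v5` first lets it succeed while a custom value such as 45 is left untouched.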
Fix multi-round tool call hang when using MiMo/MiniMax providers via Codex Chat Completions routing.
Two bugs fixed:
Additionally:
53 tests pass, including 5 new regression tests specific to MiMo/MiniMax:
Fixes #3561