fix(codex): merge consecutive assistant messages & raise idle timeout for MiMo multi-round hang #3896

Open
lssuzie wants to merge 11 commits into farion1231:main from lssuzie:main

lssuzie commented Jun 8, 2026


Fix multi-round tool call hang when using MiMo/MiniMax providers via Codex Chat Completions routing.

Two bugs fixed:

  1. content: null → content: "" — Strict Chat Completions providers reject null content on assistant messages with tool_calls
  2. Consecutive assistant messages — In Responses API, message item (assistant text) and function_call item are separate output items. They must be merged into a single assistant message in Chat Completions format.
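
The two fixes above can be sketched as follows. This is a minimal, std-only illustration rather than the PR's code: `ChatMessage`, `merge_consecutive_assistant`, and the choice to join text parts with a newline are all assumptions made for the example, and tool calls are reduced to their ids.

```rust
/// Simplified Chat Completions message (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: String,
    /// `None` would serialize as JSON `null`.
    content: Option<String>,
    /// Tool call ids only; real tool calls also carry a name and arguments.
    tool_calls: Vec<String>,
}

/// Merges runs of consecutive assistant messages into one message, then
/// replaces a missing `content` with an empty string on assistant messages
/// that carry tool calls.
fn merge_consecutive_assistant(messages: Vec<ChatMessage>) -> Vec<ChatMessage> {
    let mut merged: Vec<ChatMessage> = Vec::new();
    for msg in messages {
        let can_merge = msg.role == "assistant"
            && merged.last().map_or(false, |prev| prev.role == "assistant");
        if can_merge {
            let prev = merged.last_mut().expect("checked above");
            if let Some(text) = msg.content {
                let existing = prev.content.get_or_insert_with(String::new);
                // Assumption: separate merged text parts with a newline.
                if !existing.is_empty() && !text.is_empty() {
                    existing.push('\n');
                }
                existing.push_str(&text);
            }
            prev.tool_calls.extend(msg.tool_calls);
        } else {
            merged.push(msg);
        }
    }
    for msg in &mut merged {
        // content: null -> content: "" for assistant messages with tool calls.
        if msg.role == "assistant" && !msg.tool_calls.is_empty() && msg.content.is_none() {
            msg.content = Some(String::new());
        }
    }
    merged
}
```

With this shape, an assistant text message followed by a function call becomes a single assistant message holding both the text and the tool call, and a lone function call becomes an assistant message with `content: ""`.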

Additionally:

  • streaming_idle_timeout raised from 120s to 300s (MiMo is slow between reasoning bursts)
  • Short-circuit inline think state machine when reasoning_content already captured

53 tests pass, including 5 new regression tests specific to MiMo/MiniMax:

  • responses_tool_call_produces_empty_string_content_not_null
  • responses_multi_turn_tool_call_no_null_content
  • responses_consecutive_assistant_messages_are_merged
  • responses_three_round_codex_no_consecutive_assistant_no_null_content
  • responses_assistant_text_then_function_call_merges

Fixes #3561

… for MiMo multi-round hang

This commit makes four changes. The first two fix bugs that caused multi-round tool calls to hang with MiMo/MiniMax providers:

1. **content: null → content: ""**: Strict Chat Completions providers reject
   null content on assistant messages with tool_calls, returning HTTP 400.
   Replace with empty string in both flush_pending_tool_calls and
   responses_message_item_to_chat_message.

2. **Consecutive assistant messages**: In the Responses API, a message item
   (assistant with text) and a function_call item are separate output items.
   When converted to Chat Completions, they must be merged into a single
   assistant message with both content and tool_calls. Without merging,
   strict providers reject the request as having invalid consecutive
   assistant messages.

3. **Inline think state machine short-circuit**: When reasoning was already
   captured via reasoning_content (MiMo/DeepSeek/Kimi), skip the inline
   thinking block state machine to avoid buffering lag.

4. **streaming_idle_timeout raised to 300s**: Long-reasoning models often
   exceed the old 120s default between reasoning bursts. Now defaults to
   300s with a v10→v11 migration preserving user overrides.
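
As a rough illustration of item 4, the sketch below shows what an idle timeout bounds and one way a default bump can leave overrides alone. It is not the PR's implementation: the channel-based reader, `collect_with_idle_timeout`, and `migrate_idle_timeout` are hypothetical, and the rule "only values still equal to the old default are raised" is an assumption about how overrides might be preserved.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

const OLD_IDLE_TIMEOUT_SECS: u64 = 120;
const NEW_IDLE_TIMEOUT_SECS: u64 = 300;

/// Assumed migration rule: a stored value equal to the old default is
/// treated as "never customized" and raised; anything else is kept.
fn migrate_idle_timeout(stored_secs: u64) -> u64 {
    if stored_secs == OLD_IDLE_TIMEOUT_SECS {
        NEW_IDLE_TIMEOUT_SECS
    } else {
        stored_secs
    }
}

/// Why a stream ended without completing normally.
#[derive(Debug, PartialEq)]
enum StreamError {
    /// No chunk arrived within the idle window.
    IdleTimeout,
}

/// Collects chunks until the sender closes the channel. The timer restarts
/// on every chunk, so only the gap between chunks is bounded, not the total
/// duration of the stream.
fn collect_with_idle_timeout(
    rx: &Receiver<String>,
    idle: Duration,
) -> Result<Vec<String>, StreamError> {
    let mut chunks = Vec::new();
    loop {
        match rx.recv_timeout(idle) {
            Ok(chunk) => chunks.push(chunk),
            Err(RecvTimeoutError::Disconnected) => return Ok(chunks),
            Err(RecvTimeoutError::Timeout) => return Err(StreamError::IdleTimeout),
        }
    }
}
```

Because the window applies per gap, a model that pauses for longer than the window between reasoning bursts trips the timeout even though the stream is still healthy, which is the failure the larger default is meant to avoid.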

Fixes farion1231#3561
@farion1231

Owner

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d70024ba2

Comment thread src-tauri/src/proxy/providers/streaming_codex_chat.rs
OpenClaw added 3 commits June 8, 2026 18:52
The CI "Check Rust formatting" step failed because the original fix commit was not run through cargo fmt. This commit applies cargo fmt to the three affected files:

- src-tauri/src/proxy/providers/streaming_codex_chat.rs
- src-tauri/src/proxy/providers/transform_codex_chat.rs
- src-tauri/src/proxy/types.rs

No logic changes.

When a <think> open tag is split across SSE chunks, the previous
implementation emitted the second chunk's content as plain text, leaking
the model's reasoning to the user. This was flagged by codex-bot in the
PR farion1231#3896 review.

Add a short-delta buffer in content_when_reasoning_already_seen:
- Deltas of 16 chars or less containing a < are buffered.
- The next delta is concatenated with the buffer before processing,
  so the combined text can be correctly identified as a think block.
- At finalize() time, any still-buffered content is emitted as text
  so we never silently drop content (e.g. if the stream ends mid-tag).

A new test (split_think_open_tag_across_chunks_does_not_leak) exercises
this path: reasoning_content arrives, then chunk 1 is <think> (7 chars)
and chunk 2 is plan</think>\nanswer. The test verifies that the word
plan does not leak as plain text output.
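
The buffering rule described in this commit message can be sketched as below. This is an illustrative, std-only reconstruction, not the code in streaming_codex_chat.rs: `ShortDeltaBuffer`, `SHORT_DELTA_MAX_CHARS`, and `strip_leading_think` are hypothetical names, and the think-block handling is reduced to stripping one leading block.

```rust
/// Longest delta (in chars) that is held back when it contains `<`.
const SHORT_DELTA_MAX_CHARS: usize = 16;

/// Holds back short deltas that might be the start of a tag.
#[derive(Default)]
struct ShortDeltaBuffer {
    pending: String,
}

impl ShortDeltaBuffer {
    /// Returns text that is ready for processing, or `None` while the
    /// delta is being held back.
    fn push(&mut self, delta: &str) -> Option<String> {
        if !self.pending.is_empty() {
            // Join the held fragment with the next delta before processing.
            let mut combined = std::mem::take(&mut self.pending);
            combined.push_str(delta);
            return Some(combined);
        }
        if delta.contains('<') && delta.chars().count() <= SHORT_DELTA_MAX_CHARS {
            self.pending.push_str(delta);
            return None;
        }
        Some(delta.to_string())
    }

    /// Emits whatever is still buffered so content is never silently dropped.
    fn finalize(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}

/// Drops one leading `<think>…</think>` block; other text passes through.
fn strip_leading_think(text: &str) -> &str {
    match text
        .strip_prefix("<think>")
        .and_then(|rest| rest.split_once("</think>"))
    {
        Some((_reasoning, visible)) => visible,
        None => text,
    }
}
```

Feeding `<think>` and then `plan</think>\nanswer` yields nothing for the first delta and the combined string for the second, so the think block can be recognized as a whole. A stream that ends while a fragment is still held gets it back from `finalize`.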
lssuzie commented Jun 8, 2026

Author

@farion1231 The CI workflow is awaiting your first-time-contributor approval — could you approve it so the checks can run?

I also pushed a follow-up commit that addresses the codex-bot review feedback about partial tag leaks across SSE chunks (short-delta buffer in content_when_reasoning_already_seen, plus a new regression test). New commits since the last push:

  • fix(streaming): buffer short deltas to prevent partial <think> leak
  • style: cargo fmt

The original 3 commits (merge consecutive assistant messages, content: null, idle timeout, etc.) are unchanged. Let me know if the new buffering approach looks right or if you prefer a different solution.

…tion test

The streaming_idle_timeout column is normally added by startup
compatibility code, not in any versioned migration. The test
schema_migration_v4_adds_pricing_model_columns seeds the database at
v4 and runs the full migration chain, skipping the startup path.
When v10→v11 runs, the column doesn't exist.

Fix: add streaming_first_byte_timeout, streaming_idle_timeout, and
non_streaming_timeout columns in migrate_v4_to_v5, before any later
migration references them.
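
The ordering problem can be modeled without a database. The sketch below is a toy, in-memory stand-in for the schema: `Table`, `add_column_if_missing`, and the bodies of the two migration functions are hypothetical, and the real migrations run SQL against SQLite. Only the three column names and the two migration steps come from the commit message.

```rust
use std::collections::BTreeSet;

/// In-memory stand-in for a table's column set (toy model, not SQLite).
struct Table {
    columns: BTreeSet<String>,
}

impl Table {
    /// Adds the column unless it already exists; returns true if added.
    /// Being idempotent, it is safe even if startup code added it first.
    fn add_column_if_missing(&mut self, name: &str) -> bool {
        self.columns.insert(name.to_string())
    }
}

const TIMEOUT_COLUMNS: [&str; 3] = [
    "streaming_first_byte_timeout",
    "streaming_idle_timeout",
    "non_streaming_timeout",
];

/// Model of the fix: ensure the timeout columns exist early in the chain.
fn migrate_v4_to_v5(table: &mut Table) {
    for column in TIMEOUT_COLUMNS.iter() {
        table.add_column_if_missing(column);
    }
}

/// Model of the later step: it fails if the column it updates is missing.
fn migrate_v10_to_v11(table: &Table) -> Result<(), String> {
    if table.columns.contains("streaming_idle_timeout") {
        Ok(())
    } else {
        Err("no such column: streaming_idle_timeout".to_string())
    }
}
```

A table seeded at v4 that skips the startup path fails the later step unless the earlier migration has added the columns first.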
Development

Successfully merging this pull request may close these issues.

[codex][mimo] Xiaomi MiMo call error

2 participants