Fix/max tokens to max completion tokens 2101 #2143

Open
Muran-prog wants to merge 4 commits into router-for-me:main from Muran-prog:fix/max-tokens-to-max-completion-tokens-2101

Conversation

@Muran-prog (Contributor)

Fixes #2101

Newer OpenAI models (o3, o4-mini, gpt-5, etc.) reject the legacy max_tokens parameter:
400 Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.

Added promoteMaxTokens() in the openai-compat executor; it renames max_tokens to max_completion_tokens before forwarding the request upstream. If max_completion_tokens is already set, it just removes the old field.

Applied in both Execute and ExecuteStream paths.
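The described behavior can be sketched with the standard library alone. Note this is only an illustration: the PR itself operates on raw bytes with gjson/sjson, and the real signature may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promoteMaxTokens mirrors the behavior described above using encoding/json;
// the actual helper in the PR edits raw bytes with gjson/sjson.
func promoteMaxTokens(payload []byte) []byte {
	var m map[string]any
	if err := json.Unmarshal(payload, &m); err != nil {
		return payload // leave invalid (or empty) JSON untouched
	}
	if v, ok := m["max_tokens"]; ok {
		// Only promote when the new field is not already present.
		if _, exists := m["max_completion_tokens"]; !exists {
			m["max_completion_tokens"] = v
		}
		// The legacy field is always removed.
		delete(m, "max_tokens")
	}
	out, err := json.Marshal(m)
	if err != nil {
		return payload
	}
	return out
}

func main() {
	fmt.Println(string(promoteMaxTokens([]byte(`{"model":"gpt-5","max_tokens":1024}`))))
	// → {"max_completion_tokens":1024,"model":"gpt-5"}
}
```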

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an incompatibility issue with newer OpenAI models by updating how token limits are handled. It introduces a mechanism to automatically translate the deprecated max_tokens parameter to the required max_completion_tokens parameter, ensuring that requests to these models are processed correctly without errors. This change improves the robustness and forward compatibility of the system when interacting with evolving OpenAI APIs.

Highlights

  • OpenAI API Compatibility: Implemented a fix to ensure compatibility with newer OpenAI models that reject the legacy max_tokens parameter in favor of max_completion_tokens.
  • Parameter Promotion Logic: Introduced a promoteMaxTokens function that renames max_tokens to max_completion_tokens if max_completion_tokens is not already present. If max_completion_tokens exists, max_tokens is simply removed.
  • Execution Path Integration: Applied the promoteMaxTokens logic to both the Execute and ExecuteStream paths within the OpenAICompatExecutor to cover all relevant API calls.
  • Comprehensive Testing: Added a new test file with dedicated unit tests for the promoteMaxTokens function, covering various scenarios like renaming, existing max_completion_tokens, no max_tokens, and empty payloads.


Changelog
  • internal/runtime/executor/openai_compat_executor.go
    • Imported the gjson package for JSON parsing.
    • Called promoteMaxTokens in the Execute method to adapt request payloads.
    • Called promoteMaxTokens in the ExecuteStream method to adapt streaming request payloads.
    • Added the promoteMaxTokens function to handle the renaming and removal of max_tokens.
  • internal/runtime/executor/promote_max_tokens_test.go
    • Added a new test file.
    • Included TestPromoteMaxTokens_Rename to verify renaming functionality.
    • Included TestPromoteMaxTokens_AlreadySet to ensure existing max_completion_tokens are preserved.
    • Included TestPromoteMaxTokens_NoOp to confirm no changes when max_tokens is absent.
    • Included TestPromoteMaxTokens_EmptyPayload to test handling of nil or empty input payloads.
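The listed test scenarios can be sketched as a self-contained, table-driven check. The helper below re-implements the described behavior with encoding/json purely for illustration; the repo's version operates on raw bytes with gjson/sjson, and the real tests use Go's testing package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-in for the helper under test (the repo version uses gjson/sjson).
func promoteMaxTokens(payload []byte) []byte {
	var m map[string]any
	if json.Unmarshal(payload, &m) != nil {
		return payload // covers the nil/empty-payload case
	}
	if v, ok := m["max_tokens"]; ok {
		if _, exists := m["max_completion_tokens"]; !exists {
			m["max_completion_tokens"] = v
		}
		delete(m, "max_tokens")
	}
	out, err := json.Marshal(m)
	if err != nil {
		return payload
	}
	return out
}

func main() {
	// Case names follow the changelog's test list above.
	cases := []struct{ name, in, want string }{
		{"Rename", `{"max_tokens":5}`, `{"max_completion_tokens":5}`},
		{"AlreadySet", `{"max_completion_tokens":9,"max_tokens":5}`, `{"max_completion_tokens":9}`},
		{"NoOp", `{"model":"o3"}`, `{"model":"o3"}`},
		{"EmptyPayload", ``, ``},
	}
	for _, c := range cases {
		got := string(promoteMaxTokens([]byte(c.in)))
		fmt.Printf("%s: got=%q ok=%v\n", c.name, got, got == c.want)
	}
}
```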

@gemini-code-assist (bot) left a comment
Code Review

This pull request correctly addresses the need to rename max_tokens to max_completion_tokens for newer OpenAI models by introducing a promoteMaxTokens function. The change is applied correctly in both streaming and non-streaming execution paths. However, I've identified a potential bug in the error handling within the new function that could lead to data loss under certain conditions, for which I've provided a suggestion.

Comment on lines +407 to +410
    if !gjson.GetBytes(payload, "max_completion_tokens").Exists() {
        payload, _ = sjson.SetBytes(payload, "max_completion_tokens", mt.Value())
    }
    payload, _ = sjson.DeleteBytes(payload, "max_tokens")

Severity: high

There's a potential for data loss here if the payload is not valid JSON. If sjson.SetBytes fails, it returns the original payload and the error is ignored. Subsequently, sjson.DeleteBytes is called on the original payload. If DeleteBytes succeeds, max_tokens is removed, but max_completion_tokens was never set, effectively losing the max_tokens value.

It's better to handle the error from sjson.SetBytes and return early to prevent this inconsistent state.

    if !gjson.GetBytes(payload, "max_completion_tokens").Exists() {
        var err error
        payload, err = sjson.SetBytes(payload, "max_completion_tokens", mt.Value())
        if err != nil {
            log.Warnf("promoteMaxTokens: failed to set max_completion_tokens, returning original payload: %v", err)
            return payload
        }
    }
    payload, _ = sjson.DeleteBytes(payload, "max_tokens")

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1100f46cca


    if !gjson.GetBytes(payload, "max_completion_tokens").Exists() {
        payload, _ = sjson.SetBytes(payload, "max_completion_tokens", mt.Value())
    }
    payload, _ = sjson.DeleteBytes(payload, "max_tokens")


P2: Preserve token-limit echo for Responses conversions

Deleting max_tokens here breaks a non-stream /v1/responses round-trip: ConvertOpenAIChatCompletionsResponseToOpenAIResponsesNonStream currently reconstructs max_output_tokens only from max_output_tokens or max_tokens in requestRawJSON (see internal/translator/openai/openai/responses/openai_openai-responses_response.go around lines 632-637). After this change, translated requests carry only max_completion_tokens, so the final Responses payload silently drops max_output_tokens, which is a behavior regression for clients relying on that echoed field.


@luispater (Collaborator) left a comment

This fixes the upstream request rewrite for newer OpenAI-style models, but it also changes the request payload that the response translators use for /v1/responses round-trips.

Blocking:

  • promoteMaxTokens() rewrites the translated payload before TranslateNonStream(...), but the non-stream Responses converter currently reconstructs max_output_tokens only from max_output_tokens or max_tokens. After this change, that path sees only max_completion_tokens, so /v1/responses responses can silently lose max_output_tokens.

Non-blocking:

  • The rewrite is applied to every openai-compatibility provider even though issue #2101 is specifically about GitHub Models / Azure-style newer models. Consider scoping this by provider/model or making it configurable.
  • The new tests only cover the helper and do not exercise executor-level round-trips for /v1/responses, which is why the regression above is not caught.

Test plan:

  • Reviewed PR metadata, diff, inline comments, and check results with gh.
  • Traced the non-stream execution path from OpenAICompatExecutor.Execute() into sdktranslator.TranslateNonStream() and the Responses response converter.
  • Did not run the PR branch locally.

…allback in Responses converter

Address review feedback:
- promoteMaxTokens now returns original payload on sjson.SetBytes error
- Responses converter (stream + non-stream) recognizes max_completion_tokens
  so max_output_tokens is preserved after promotion
@Muran-prog (Contributor, Author)

Updated based on feedback:

  • Responses converter (both stream and non-stream paths in openai_openai-responses_response.go) now uses max_completion_tokens as a fallback for max_output_tokens.
  • Fallback priority: max_output_tokens > max_completion_tokens > max_tokens.
  • promoteMaxTokens() now returns the original payload on sjson.SetBytes errors to prevent silent data loss.
  • Added 5 tests for the non-stream Responses converter covering all token limit field combinations (direct, promoted, legacy, etc.).
  • Kept provider scoping global for now, as max_completion_tokens is the standard for current OpenAI-compatible models.
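The fallback priority above can be sketched as follows. The helper name resolveMaxOutputTokens is hypothetical, and the actual converter reads the raw request JSON with gjson; this stdlib version only illustrates the ordering.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolveMaxOutputTokens applies the priority described above:
// max_output_tokens > max_completion_tokens > max_tokens.
// Hypothetical helper for illustration; the repo's converter differs.
func resolveMaxOutputTokens(requestRawJSON []byte) (int64, bool) {
	var m map[string]any
	if json.Unmarshal(requestRawJSON, &m) != nil {
		return 0, false
	}
	// First matching key wins, in priority order.
	for _, key := range []string{"max_output_tokens", "max_completion_tokens", "max_tokens"} {
		if v, ok := m[key].(float64); ok {
			return int64(v), true
		}
	}
	return 0, false
}

func main() {
	// A request whose max_tokens was promoted still echoes its limit.
	v, ok := resolveMaxOutputTokens([]byte(`{"max_completion_tokens":2048,"max_tokens":1024}`))
	fmt.Println(v, ok) // → 2048 true
}
```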

@Muran-prog Muran-prog requested a review from luispater March 15, 2026 05:33


Development

Successfully merging this pull request may close these issues.

400 Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
