
fix: increase Anthropic max_tokens and warn on truncation#29

Open
efecnc wants to merge 1 commit into altaidevorg:main from efecnc:fix/anthropic-max-tokens-configurable

Conversation

@efecnc (Contributor) commented May 6, 2026

Summary

  • Increases default max_tokens for Anthropic Opus/Sonnet models from 8192 to 16384
  • Adds with_max_tokens() builder method for explicit configuration
  • Logs a warning when stop_reason == "max_tokens" so truncation is visible

Root Cause

max_tokens was hardcoded at 8192 for all Anthropic models. When output was truncated, stop_reason was never checked, so incomplete tool_use JSON was silently parsed with unwrap_or_default() — tools received empty args and failed mysteriously.

Test plan

  • Verify cargo check passes
  • Test with Claude Sonnet — long tool-heavy responses should no longer be truncated
  • Verify truncation warning appears in logs when output hits the limit

Commit message

max_tokens was hardcoded at 8192 for all Anthropic models. Newer models
(Opus, Sonnet) support much higher output. Truncated responses could
produce incomplete tool_use JSON that silently breaks tool execution.

- Default to 16384 for opus/sonnet models, 8192 for others
- Add with_max_tokens() builder for explicit configuration
- Log warning when stop_reason is max_tokens so truncation is visible
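
The default selection and builder described in the bullets above could look roughly like this. The struct shape and constructor are assumptions; only the names `with_max_tokens` and the 16384/8192 split come from the PR:

```rust
// Minimal sketch of the defaulting logic; not the actual provider struct.
struct AnthropicProvider {
    model: String,
    max_tokens: u32,
}

impl AnthropicProvider {
    fn new(model: &str) -> Self {
        // Case-insensitive match, as suggested in review.
        let model_lower = model.to_lowercase();
        let max_tokens = if model_lower.contains("opus") || model_lower.contains("sonnet") {
            16384
        } else {
            8192
        };
        Self { model: model.to_string(), max_tokens }
    }

    /// Explicit override for callers that know their model's output limit.
    fn with_max_tokens(mut self, max_tokens: u32) -> Self {
        self.max_tokens = max_tokens;
        self
    }
}

fn main() {
    let p = AnthropicProvider::new("claude-sonnet-4");
    assert_eq!(p.max_tokens, 16384);

    let p = AnthropicProvider::new("claude-haiku").with_max_tokens(2048);
    assert_eq!(p.max_tokens, 2048);
}
```

The builder keeps the common case zero-config while letting callers pin a limit when the heuristic is wrong for their model.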

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the AnthropicProvider to dynamically set default max_tokens based on the model name, adds a with_max_tokens builder method, and introduces logging for truncated responses. Feedback indicates that the proposed token limits for Opus and Sonnet models exceed current API constraints and may result in errors, suggesting lower defaults and case-insensitive model name matching.

Comment thread: src/provider.rs, lines +87 to +91

```rust
let max_tokens = if model.contains("opus") || model.contains("sonnet") {
    16384
} else {
    8192
};
```
Severity: high

The proposed default of 16384 tokens for Opus and Sonnet models is likely to cause 400 Bad Request errors from the Anthropic API. As of the current API version (2023-06-01), the maximum output tokens for Claude 3 Opus is 4096, and for Claude 3.5 Sonnet it is 8192. While a beta feature exists for 16384 tokens on Sonnet 3.5, it requires the anthropic-beta: max-tokens-3-5-sonnet-2024-07-15 header, which is not currently included in the request. Additionally, the model name check is case-sensitive; using to_lowercase() would be more robust.

```rust
let model_lower = model.to_lowercase();
let max_tokens = if model_lower.contains("sonnet") {
    8192
} else if model_lower.contains("opus") {
    4096
} else {
    8192
};
```
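
Separately from the limit itself, the truncation warning the PR adds could be sketched like this. The response struct and function name are assumptions; the `"max_tokens"` stop reason string matches the Anthropic Messages API:

```rust
// Hypothetical slice of the API response; the real type has more fields.
struct MessagesResponse {
    stop_reason: Option<String>,
}

/// Returns true (and logs) when the response was cut off at the token limit.
fn check_truncation(resp: &MessagesResponse, max_tokens: u32) -> bool {
    if resp.stop_reason.as_deref() == Some("max_tokens") {
        eprintln!(
            "warning: response truncated at max_tokens={max_tokens}; \
             tool_use blocks may contain incomplete JSON"
        );
        true
    } else {
        false
    }
}

fn main() {
    let truncated = MessagesResponse { stop_reason: Some("max_tokens".into()) };
    assert!(check_truncation(&truncated, 8192));

    let done = MessagesResponse { stop_reason: Some("end_turn".into()) };
    assert!(!check_truncation(&done, 8192));
}
```

Running the check before any tool_use parsing makes the empty-args failure mode diagnosable from the logs.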

