23 changes: 22 additions & 1 deletion src/provider.rs
@@ -83,12 +83,18 @@ pub struct AnthropicProvider {

 impl AnthropicProvider {
     pub fn new(base_url: &str, api_key: &str, model: &str) -> Self {
+        // Use a higher default for newer models that support longer output
+        let max_tokens = if model.contains("opus") || model.contains("sonnet") {
+            16384
+        } else {
+            8192
+        };
Comment on lines +87 to +91
Contributor


Severity: high

The proposed default of 16384 tokens for Opus and Sonnet models is likely to cause 400 Bad Request errors from the Anthropic API. As of the current API version (2023-06-01), the maximum output tokens for Claude 3 Opus is 4096, and for Claude 3.5 Sonnet it is 8192. While a beta feature exists for 16384 tokens on Sonnet 3.5, it requires the anthropic-beta: max-tokens-3-5-sonnet-2024-07-15 header, which is not currently included in the request. Additionally, the model name check is case-sensitive; using to_lowercase() would be more robust.

        let model_lower = model.to_lowercase();
        let max_tokens = if model_lower.contains("sonnet") {
            8192
        } else if model_lower.contains("opus") {
            4096
        } else {
            8192
        };
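The suggested change can be pulled out into a standalone function and exercised in isolation. A minimal sketch (the per-family caps of 8192 for Sonnet and 4096 for Opus are taken from the review comment above, not verified against the live API, and may change as models evolve):

```rust
// Choose a max_tokens default by model family, matching case-insensitively
// so that names like "Claude-3-Opus" are still recognized. Caps are
// assumptions from the review comment, not authoritative API limits.
fn default_max_tokens(model: &str) -> u32 {
    let model_lower = model.to_lowercase();
    if model_lower.contains("sonnet") {
        8192
    } else if model_lower.contains("opus") {
        4096
    } else {
        // Conservative fallback for unrecognized model names
        8192
    }
}
```

Factoring the logic out of `new` this way also makes the case-sensitivity concern directly testable.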

         Self {
             api_key: api_key.to_string(),
             model: model.to_string(),
             base_url: base_url.to_string(),
             temperature: 0.3,
-            max_tokens: 8192,
+            max_tokens,
             client: build_reqwest_client(),
         }
     }
@@ -98,6 +104,11 @@ impl AnthropicProvider {
         self
     }
 
+    pub fn with_max_tokens(mut self, max_tokens: u32) -> Self {
+        self.max_tokens = max_tokens;
+        self
+    }
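The new `with_max_tokens` setter follows Rust's consuming-builder convention: it takes `self` by value, mutates one field, and returns the struct so calls chain fluently. A stripped-down stand-in illustrates the pattern (field names mirror the real struct, everything else is simplified):

```rust
// Minimal stand-in for AnthropicProvider, showing the consuming-builder
// style used by with_max_tokens and with_temperature. Defaults match the
// constructor in the diff above.
struct ProviderConfig {
    temperature: f32,
    max_tokens: u32,
}

impl ProviderConfig {
    fn new() -> Self {
        ProviderConfig { temperature: 0.3, max_tokens: 8192 }
    }

    // Each setter consumes self and returns it, enabling chained calls.
    fn with_temperature(mut self, temperature: f32) -> Self {
        self.temperature = temperature;
        self
    }

    fn with_max_tokens(mut self, max_tokens: u32) -> Self {
        self.max_tokens = max_tokens;
        self
    }
}
```

Callers can then override any subset of defaults in one expression, e.g. `ProviderConfig::new().with_temperature(0.7).with_max_tokens(4096)`.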

/// Convert OpenAI-format tool definitions to Anthropic format.
fn convert_tools(tools: &Value) -> Value {
// OpenAI: [{ "type": "function", "function": { "name", "description", "parameters" } }]
@@ -366,6 +377,16 @@ impl Provider for AnthropicProvider {
}
}

+        // Check if response was truncated due to max_tokens
+        let stop_reason = json_resp["stop_reason"].as_str().unwrap_or("");
+        if stop_reason == "max_tokens" {
+            log::warn!(
+                "Anthropic response truncated (stop_reason=max_tokens, max_tokens={}). \
+                 Tool calls may be incomplete.",
+                self.max_tokens
+            );
+        }
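The truncation check reduces to a predicate on the response's `stop_reason`. A minimal sketch, assuming the value set from Anthropic's documented response format (`"max_tokens"` for a reply cut at the output limit, values like `"end_turn"` or `"tool_use"` when the model stopped on its own):

```rust
// True when the response was cut off at the output-token limit.
// An absent or unrecognized stop_reason is treated as not truncated,
// matching the unwrap_or("") behavior in the diff above.
fn is_truncated(stop_reason: Option<&str>) -> bool {
    stop_reason.unwrap_or("") == "max_tokens"
}
```

Isolating the predicate makes it easy to unit-test the warning condition without constructing a full API response.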

         let usage = json_resp.get("usage").map(|u| TokenUsage {
             prompt_tokens: u["input_tokens"].as_u64().unwrap_or(0) as u32,
             completion_tokens: u["output_tokens"].as_u64().unwrap_or(0) as u32,