
Conversation


@fenilfaldu fenilfaldu commented Nov 14, 2025

Closes #2258
Before
Screenshot 2025-11-14 at 6 05 50 AM

After
Screenshot 2025-11-14 at 6 06 55 AM


Note

Capture LLM model name and detailed token usage in BaseOpenAIChatCompletionClient.create and .create_stream, with streaming usage injection and end-of-stream attribute setting.

  • Instrumentation: OpenAI ChatCompletion (_wrappers.py)
    • LLM token metrics: Extract token usage from CreateResult.usage (prompt, completion, total) with details (prompt: cache_read/audio/cache_input; completion: reasoning/audio) and set via get_llm_token_count_attributes, plus explicit reasoning/audio completion detail attributes.
    • Model metadata: Add model name via get_llm_model_name_attributes to LLM spans.
    • Streaming support: Ensure include_usage is set (via extra_create_args.stream_options or the include_usage parameter); during streaming, accumulate output, tool-call, and token attributes and set them on the span after the stream completes (see the sketch below this note).
    • Outputs/Tools: Continue setting output attributes and extracting tool call attributes from responses.

Written by Cursor Bugbot for commit ab0a81a.
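To make the streaming behavior concrete, here is a minimal sketch (hypothetical helper names, not the PR's actual _wrappers.py code) of forcing the usage chunk on streamed requests and reading token counts off autogen's CreateResult.usage once the stream ends; the llm.token_count.* attribute names follow the OpenInference conventions the wrapper targets:

```python
from typing import Any, Dict

from opentelemetry.trace import Span


def ensure_include_usage(create_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical helper: ask the OpenAI API to append a final usage chunk to
    # the stream; without stream_options.include_usage there is nothing to count.
    extra = dict(create_kwargs.get("extra_create_args") or {})
    stream_options = dict(extra.get("stream_options") or {})
    stream_options.setdefault("include_usage", True)
    extra["stream_options"] = stream_options
    create_kwargs["extra_create_args"] = extra
    return create_kwargs


def set_token_count_attributes(span: Span, result: Any) -> None:
    # Called after the stream completes (or after a non-streaming create):
    # CreateResult.usage is assumed to carry prompt_tokens/completion_tokens.
    usage = getattr(result, "usage", None)
    if usage is None:
        return
    span.set_attribute("llm.token_count.prompt", usage.prompt_tokens)
    span.set_attribute("llm.token_count.completion", usage.completion_tokens)
    span.set_attribute(
        "llm.token_count.total", usage.prompt_tokens + usage.completion_tokens
    )
```

In the streaming path these attributes can only be set after the last chunk is consumed, which is why the wrapper accumulates output and tool-call attributes during iteration and flushes everything onto the span at end of stream.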

@fenilfaldu fenilfaldu requested a review from a team as a code owner November 14, 2025 00:48
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 14, 2025
@fenilfaldu fenilfaldu changed the title from "Add token metrics support to autogen-agentchat instrumentation with streaming support" to "feat:Add token metrics support to autogen-agentchat instrumentation with streaming support" Nov 14, 2025
if details:
    prompt_details = _extract_details_from_object(
        details,
        {"cache_read": "cached_tokens", "audio": "audio_tokens", "cache_input": "text_tokens"},
    )

Bug: Token Attribution Fails for Cached Input

The mapping includes "cache_input": "text_tokens", but get_llm_token_count_attributes doesn't handle the cache_input key in prompt_details. If text_tokens exists in the response, it gets extracted into the token usage dict but is never converted to a span attribute, making this extraction ineffective.
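One possible remedy, sketched under the assumption that the missing detail can simply be emitted explicitly (the attribute name llm.token_count.prompt_details.cache_input is an assumption, not a confirmed OpenInference convention), is to set each extracted prompt detail on the span directly rather than relying on get_llm_token_count_attributes to recognize every key:

```python
from typing import Any, Mapping, Optional

from opentelemetry.trace import Span


def set_prompt_detail_attributes(
    span: Span, prompt_details: Optional[Mapping[str, Any]]
) -> None:
    # Sketch of a possible fix: emit every extracted prompt detail as its own
    # span attribute so keys the shared helper doesn't know about (such as
    # "cache_input") are not silently dropped.
    if not prompt_details:
        return
    for key, value in prompt_details.items():
        if value is not None:
            span.set_attribute(f"llm.token_count.prompt_details.{key}", value)
```

An alternative would be to drop the "cache_input": "text_tokens" entry from the mapping entirely, since extracting a value that never reaches the span only adds noise.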




Development

Successfully merging this pull request may close these issues.

[bug] cannot get token counts and cost from streaming llm clients in autogen-agentchat
