
Support reasoning summary models in AzureOpenAIEvalClient #216

Merged
taniokay merged 10 commits into main from azure-responses
Nov 3, 2025
Conversation

@taniokay
Contributor

@taniokay taniokay commented Nov 3, 2025

LiteLLMEvalClient already supports the Responses API, but litellm.responses is still in beta, so this PR adds Responses API support to AzureOpenAIEvalClient as well.


Note

Adds Responses API-based reasoning summary to OpenAI and Azure eval clients, adjusts outputs/logprobs behavior, improves error logging, and bumps version.

  • Eval Clients (OpenAI/Azure):
    • Add optional reasoning summary support (use_reasoning_summary, reasoning_effort, reasoning_summary).
    • Route requests via new _dispatch to switch between Chat Completions and Responses API.
    • In get_text_responses, extract content and append reasoning summaries when enabled.
    • Disallow log-likelihood retrieval when reasoning summary is enabled; refine logprobs config handling.
  • Error Handling:
    • Print full stack traces for caught exceptions in LiteLLM/OpenAI paths.
  • Version:
    • Bump __version__ and project version to 0.10.0.dev12.

Written by Cursor Bugbot for commit d98842e. This will update automatically on new commits.
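The routing described in the summary above might look roughly like the following sketch. Only the `_dispatch` name and the `use_reasoning_summary` / `reasoning_effort` / `reasoning_summary` parameter names come from the PR summary; the class name and helper method bodies here are hypothetical stand-ins, not the actual implementation.

```python
from typing import Any


class EvalClientSketch:
    """Hypothetical sketch of routing between Chat Completions and Responses."""

    def __init__(
        self,
        use_reasoning_summary: bool = False,
        reasoning_effort: str = "medium",
        reasoning_summary: str = "auto",
    ) -> None:
        self._use_reasoning_summary = use_reasoning_summary
        self._reasoning_effort = reasoning_effort
        self._reasoning_summary = reasoning_summary

    def _dispatch(self, messages: list[dict[str, str]], **kwargs: Any) -> str:
        # Route to the Responses API only when reasoning summaries are
        # requested; otherwise stay on Chat Completions.
        if self._use_reasoning_summary:
            return self._responses_call(messages, **kwargs)
        return self._chat_completions_call(messages, **kwargs)

    def _responses_call(self, messages: list, **kwargs: Any) -> str:
        return "responses"  # stand-in for client.responses.create(...)

    def _chat_completions_call(self, messages: list, **kwargs: Any) -> str:
        return "chat_completions"  # stand-in for client.chat.completions.create(...)
```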

@kennysong
Contributor

bugbot run

@kennysong kennysong requested a review from Copilot November 3, 2025 08:17
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for OpenAI's reasoning summary feature to the OpenAI evaluation clients, along with enhanced error logging using traceback for better debugging. The changes introduce new parameters to enable reasoning summaries and refactor the API dispatch logic to support both the Chat Completions API and the Responses API.

Key changes:

  • Added reasoning summary parameters (use_reasoning_summary, reasoning_effort, reasoning_summary) to OpenAIEvalClient and AzureOpenAIEvalClient
  • Introduced a new _dispatch method to handle routing between Chat Completions API and Responses API
  • Enhanced exception handling with traceback.print_exception() calls in multiple places
  • Updated response processing logic to extract and format reasoning summaries
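As a rough illustration of the response-processing change, the sketch below appends a reasoning summary after the extracted content. The `OutputItem` layout is a simplified, hypothetical stand-in, not the actual OpenAI SDK types.

```python
from dataclasses import dataclass


@dataclass
class OutputItem:
    """Simplified stand-in for a Responses API output item."""

    type: str  # "message" or "reasoning"
    text: str = ""
    summary: str = ""


def extract_text(items: list[OutputItem], use_reasoning_summary: bool) -> str:
    # Collect the assistant's message content first.
    text = "\n".join(i.text for i in items if i.type == "message")
    if use_reasoning_summary:
        summaries = [i.summary for i in items if i.type == "reasoning" and i.summary]
        if summaries:
            # Append the reasoning summary after the main content.
            text += "\n\n" + "\n".join(summaries)
    return text
```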

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
src/langcheck/metrics/eval_clients/_openai.py | Added reasoning summary support with new parameters, refactored API dispatch logic, enhanced error logging with traceback
src/langcheck/metrics/eval_clients/_litellm.py | Added traceback printing for better error debugging
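The traceback-based error logging mentioned for _litellm.py presumably follows the standard library pattern; a minimal sketch (the wrapper function is hypothetical, only the use of the `traceback` module comes from the review):

```python
import traceback
from typing import Any, Callable


def call_with_logging(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run fn, printing a full stack trace (not just the message) on failure."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        # A full stack trace is far more useful for debugging API failures
        # than printing str(exc) alone.
        traceback.print_exception(type(exc), exc, exc.__traceback__)
        return None
```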
Comments suppressed due to low confidence (3)

src/langcheck/metrics/eval_clients/_openai.py:554

  • Corrected spelling of 'Intialize' to 'Initialize'.
        Intialize the Azure OpenAI evaluation client.

src/langcheck/metrics/eval_clients/_openai.py:56

  • The docstring is missing documentation for the newly added parameters: use_reasoning_summary, reasoning_effort, and reasoning_summary. These should be documented to explain their purpose and usage.
            openai_client (Optional): The OpenAI client to use.
            openai_args (Optional): dict of additional args to pass in to the
            `client.chat.completions.create` function.
            use_async: If True, the async client will be used. Defaults to
                False.
            system_prompt (Optional): The system prompt to use. If not provided,
                no system prompt will be used.
            extractor (Optional): The extractor to use. If not provided, the
                default extractor will be used.

src/langcheck/metrics/eval_clients/_openai.py:572

  • The docstring is missing documentation for the newly added parameters: use_reasoning_summary, reasoning_effort, and reasoning_summary. These should be documented to explain their purpose and usage.
            text_model_name (Optional): The text model name you want to use with
                the Azure OpenAI API. The name is used as
                `{ "model": text_model_name }` parameter when calling the Azure
                OpenAI API for text models.
            embedding_model_name (Optional): The text model name you want to
                use with the Azure OpenAI API. The name is used as
                `{ "model": embedding_model_name }` parameter when calling the
                Azure OpenAI API for embedding models.
            azure_openai_client (Optional): The Azure OpenAI client to use.
            openai_args (Optional): dict of additional args to pass in to the
                `client.chat.completions.create` function.
            use_async (Optional): If True, the async client will be used.
            system_prompt (Optional): The system prompt to use. If not provided,
                no system prompt will be used.
            extractor (Optional): The extractor to use. If not provided, the
                default extractor will be used.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

@kennysong kennysong left a comment


Done reviewing!

}

# seed and logprobs are not supported in responses API.
return self._client.responses.create(
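The comment in the diff implies that Chat Completions-only arguments must be dropped before calling the Responses API. A hypothetical sketch of that filtering; the set below reflects only the `seed` and `logprobs` named in the comment:

```python
# Hypothetical filter; only seed and logprobs are named in the diff comment.
UNSUPPORTED_BY_RESPONSES_API = {"seed", "logprobs"}


def responses_api_args(openai_args: dict) -> dict:
    """Drop arguments the Responses API does not accept."""
    return {
        key: value
        for key, value in openai_args.items()
        if key not in UNSUPPORTED_BY_RESPONSES_API
    }
```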

Ah nice, I was wondering if we set this properly to avoid logging prompts

Co-authored-by: Kenny Song <kenny.ysong@gmail.com>


@kennysong
Contributor

kennysong commented Nov 3, 2025

LGTM after bumping the version!

@taniokay
Contributor Author

taniokay commented Nov 3, 2025

Thanks for your quick review!

@taniokay taniokay merged commit d6e492f into main Nov 3, 2025
27 checks passed
@taniokay taniokay deleted the azure-responses branch November 3, 2025 11:11

3 participants