Add fallback_on_response parameter to FallbackModel
#3786
Draft
sarth6 wants to merge 13 commits into pydantic:main from sarth6:fallback-on-response
+880
−5
Adds response-based fallback to `FallbackModel`, allowing fallback based on response content rather than just exceptions. This is useful when a model returns a successful HTTP response but the content indicates a semantic failure (e.g., a builtin tool like `web_fetch` reporting failure).

- Add `fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool]` parameter
- Support streaming with response-based fallback via `BufferedStreamedResponse`
- Add 15 tests covering non-streaming and streaming scenarios
- Document the feature with a practical example

Closes pydantic#3640

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
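The control flow this commit describes can be sketched without pydantic-ai installed. The sketch below is an illustration under stated assumptions, not the PR's implementation: `StubResponse`, `Messages`, `StubModel`, `run_with_fallback`, `primary`, and `backup` are hypothetical stand-ins, and only the callback shape `(response, messages) -> bool` is taken from the commit message.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StubResponse:
    """Hypothetical stand-in for pydantic-ai's ModelResponse."""
    model_name: str
    text: str


# Hypothetical stand-in for list[ModelMessage]; plain strings keep this stdlib-only.
Messages = list[str]
StubModel = Callable[[Messages], StubResponse]
# Same shape as the callback described in the commit: (response, messages) -> bool.
FallbackOnResponse = Callable[[StubResponse, Messages], bool]


def web_fetch_failed(response: StubResponse, messages: Messages) -> bool:
    """Return True to reject the response and try the next model."""
    return 'URL_RETRIEVAL_STATUS_FAILED' in response.text


def run_with_fallback(
    models: list[StubModel],
    messages: Messages,
    fallback_on_response: FallbackOnResponse,
) -> StubResponse:
    rejected: list[str] = []
    for model in models:
        response = model(messages)  # no exception raised: the call "succeeded"
        if fallback_on_response(response, messages):
            rejected.append(response.model_name)
            continue  # semantic failure: move on to the next model
        return response
    raise RuntimeError(f'all models rejected: {rejected}')


def primary(messages: Messages) -> StubResponse:
    return StubResponse('primary', 'web_fetch: URL_RETRIEVAL_STATUS_FAILED')


def backup(messages: Messages) -> StubResponse:
    return StubResponse('backup', 'Fetched page summary')


result = run_with_fallback([primary, backup], ['Summarize the page'], web_fetch_failed)
```

Here the first stub returns normally but carries a failure marker, so the callback rejects it and the second stub's response is returned.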
Adds a new `fallback_on_part` parameter to `FallbackModel` that enables checking each response part during streaming. This allows early abort when failure conditions are detectable before the full response completes, such as when a WebFetch tool fails in the first chunk.

- Adds FallbackOnPart type alias and callback parameter
- Checks parts on PartEndEvent during streaming iteration
- Can be combined with fallback_on_response for layered checks
- Tracks part_rejections separately in error reporting
- Documentation and tests for the new feature
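The early-abort idea can be illustrated roughly as follows. This is a sketch under assumptions rather than the PR's code: parts are modelled as plain strings, and `PartRejected` and `buffer_until_rejected` are hypothetical names. The point it shows is that once a part is rejected, the rest of the stream is never consumed.

```python
from typing import Callable, Iterable


class PartRejected(Exception):
    """Hypothetical signal that a completed part failed the check."""


def buffer_until_rejected(
    parts: Iterable[str],
    fallback_on_part: Callable[[str], bool],
) -> list[str]:
    buffered: list[str] = []
    for part in parts:
        # Check each part as soon as it is complete, before reading further.
        if fallback_on_part(part):
            raise PartRejected(part)
        buffered.append(part)
    return buffered


consumed: list[str] = []


def fake_stream() -> Iterable[str]:
    # Records which parts were actually pulled from the "provider".
    for part in ['web_fetch: FAILED', 'text chunk 1', 'text chunk 2']:
        consumed.append(part)
        yield part


try:
    buffer_until_rejected(fake_stream(), lambda part: 'FAILED' in part)
    rejected_early = False
except PartRejected:
    rejected_early = True
```

Because the generator is lazy, only the first part is pulled before the rejection, which is where the token savings mentioned in a later commit would come from.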

- Changed gemini-2.0-flash to gemini-2.5-flash (2.5 is current stable)
- Kept claude-sonnet-4-5 (already current)

Add warning that fallback_on_part buffers the response like fallback_on_response. The caller doesn't receive progressive streaming; the benefit is token savings and faster fallback when failures are detected early.

Flatten nested conditionals using guard clauses for better readability. The examples already use pydantic-ai types correctly via builtin_tool_calls and isinstance checks with BuiltinToolReturnPart.
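A guard-clause version of such a check might look like the sketch below. It assumes a hypothetical `StubToolReturn` class in place of `BuiltinToolReturnPart`, and the `'status'` key inside the tool content is an assumption made for illustration, not a documented payload shape.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StubToolReturn:
    """Hypothetical stand-in for BuiltinToolReturnPart."""
    tool_name: str
    content: Any


def web_fetch_failed(parts: list[object]) -> bool:
    for part in parts:
        # Each guard clause skips an uninteresting part instead of nesting deeper.
        if not isinstance(part, StubToolReturn):
            continue
        if part.tool_name != 'web_fetch':
            continue
        if not isinstance(part.content, dict):
            continue
        # 'status' is an assumed key, used only for this illustration.
        if part.content.get('status') == 'URL_RETRIEVAL_STATUS_FAILED':
            return True
    return False


failed_parts = [
    'plain text part',
    StubToolReturn('web_fetch', {'status': 'URL_RETRIEVAL_STATUS_FAILED'}),
]
ok_parts = [StubToolReturn('web_fetch', {'status': 'URL_RETRIEVAL_STATUS_SUCCESS'})]
```

Each `continue` replaces one level of nested `if`, so the failure condition stays at a single indentation level.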

- Add provider_url property to BufferedStreamedResponse
- Fix type hints from TextPart to ModelResponsePart

- Fix type hints from TextPart to ModelResponsePart for fallback_on_part callbacks
- Add isinstance checks before accessing .content on ModelResponsePart
- Add proper type annotation for dict content in web_fetch_failed test

The response-hooks-design.md was a planning document that shouldn't be included in the final PR.

- Add explicit imports for BuiltinToolCallPart and BuiltinToolReturnPart
- Add type annotations for loop variables
- Add dict[str, Any] annotation for content variable
- Use guard clause pattern for clearer flow

The test framework runs ruff linting separately from code execution. test="skip" only skips execution, not linting. Adding lint="skip" ensures these example snippets don't fail CI linting.

- Add pragma: no branch to guard clauses in test helper
- Add pragma: no cover to unreachable lines inside pytest.raises blocks

- Move pragma: no cover to continue statements (the actual uncovered lines)
- Use cast(dict[str, Any], content) to fix pyright type errors
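The `cast(dict[str, Any], content)` pattern from the last commit can be shown in isolation. In this sketch `fetch_status` is a hypothetical helper and `'status'` is an assumed key. The reasoning is that an `isinstance(content, dict)` check narrows the value to a dict whose key and value types are unknown to the type checker, and `cast` supplies them with no effect at runtime.

```python
from typing import Any, Optional, cast


def fetch_status(content: object) -> Optional[str]:
    # Guard clause: non-dict content has no status field to inspect.
    if not isinstance(content, dict):
        return None
    # cast() only informs the type checker; it does nothing at runtime.
    typed = cast(dict[str, Any], content)
    status = typed.get('status')  # 'status' is an assumed key for illustration
    return status if isinstance(status, str) else None
```

A call such as `fetch_status({'status': 'URL_RETRIEVAL_STATUS_FAILED'})` returns the string, while non-dict content or a non-string status yields `None`.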
Summary
- Add `fallback_on_response` parameter to `FallbackModel` for response-based fallback
- Support streaming via `BufferedStreamedResponse`

Motivation
Closes #3640

When using built-in tools like `web_fetch`, a model may return a successful HTTP response (no exception) but the response content indicates a semantic failure. For example, Google's `WebFetchTool` returns `URL_RETRIEVAL_STATUS_FAILED` in the response rather than throwing an exception.

This PR adds a `fallback_on_response` callback that inspects the response content and triggers fallback when appropriate.

Changes
FallbackModel:
- Add `fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool] | None` parameter
- The callback returns `True` to reject the response and try the next model

Streaming support:
- When `fallback_on_response` is set with streaming, the entire response is buffered before evaluation
- `BufferedStreamedResponse` replays the buffered events to the caller

Example
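The buffer-then-replay behaviour listed under streaming support can be sketched with plain async generators. This is an illustration under assumptions, not the PR's `BufferedStreamedResponse`: events are modelled as strings, and `buffer_stream`, `replay`, `stream_with_fallback`, and `fake_stream` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Callable


async def buffer_stream(stream: AsyncIterator[str]) -> list[str]:
    # Consume the whole stream first; nothing reaches the caller yet.
    return [event async for event in stream]


async def replay(events: list[str]) -> AsyncIterator[str]:
    # Re-emit the buffered events in their original order.
    for event in events:
        yield event


async def stream_with_fallback(
    streams: list[AsyncIterator[str]],
    fallback_on_response: Callable[[str], bool],
) -> AsyncIterator[str]:
    for stream in streams:
        events = await buffer_stream(stream)
        # The check runs on the complete response, after buffering.
        if fallback_on_response(''.join(events)):
            continue
        return replay(events)
    raise RuntimeError('all streams rejected')


async def fake_stream(chunks: list[str]) -> AsyncIterator[str]:
    for chunk in chunks:
        yield chunk


async def main() -> list[str]:
    accepted = await stream_with_fallback(
        [
            fake_stream(['URL_RETRIEVAL_', 'STATUS_FAILED']),
            fake_stream(['hello ', 'world']),
        ],
        lambda text: 'URL_RETRIEVAL_STATUS_FAILED' in text,
    )
    return [event async for event in accepted]


replayed = asyncio.run(main())
```

The failure marker is split across two chunks in the first stream, which is why the check is applied to the joined buffer instead of to individual events. The trade-off is that the caller sees no events until a full response has been accepted.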
Test plan
- `fallback_on_response` …
- `fallback_on_response` …

🤖 Generated with Claude Code