Add `fallback_on_response` parameter to `FallbackModel` #3786

sarth6 · 2025-12-20T22:01:41Z

Summary

Add fallback_on_response parameter to FallbackModel for response-based fallback
Support streaming with response-based fallback via BufferedStreamedResponse
Add 15 tests covering both non-streaming and streaming scenarios

Motivation

When using built-in tools like web_fetch, a model may return a successful HTTP response (no exception) but the response content indicates a semantic failure. For example, Google's WebFetchTool returns URL_RETRIEVAL_STATUS_FAILED in the response rather than throwing an exception.

This PR adds a fallback_on_response callback that inspects the response content and triggers fallback when appropriate.

Changes

FallbackModel:

New fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool] | None parameter
When set, the callback is invoked after each successful model response
Returns True to reject the response and try the next model

Streaming support:

When fallback_on_response is set with streaming, the entire response is buffered before evaluation
BufferedStreamedResponse replays the buffered events to the caller
Trade-off documented: caller won't receive partial results until full response is ready

Example

from pydantic_ai.models.fallback import FallbackModel

def web_fetch_failed(response: ModelResponse, messages: list[ModelMessage]) -> bool:
    for call, result in response.builtin_tool_calls:
        if call.tool_name == 'web_fetch':
            content = result.content
            if isinstance(content, dict):
                status = content.get('url_retrieval_status', '')
                if status and status != 'URL_RETRIEVAL_STATUS_SUCCESS':
                    return True
    return False

fallback_model = FallbackModel(
    google_model,
    anthropic_model,
    fallback_on_response=web_fetch_failed,
)

Test plan

9 non-streaming tests for fallback_on_response
6 streaming tests for fallback_on_response
100% test coverage maintained
All 32 fallback tests pass

🤖 Generated with Claude Code

Adds response-based fallback to `FallbackModel`, allowing fallback based on response content rather than just exceptions. This is useful when a model returns a successful HTTP response but the content indicates a semantic failure (e.g., a builtin tool like `web_fetch` reporting failure). - Add `fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool]` parameter - Support streaming with response-based fallback via `BufferedStreamedResponse` - Add 15 tests covering non-streaming and streaming scenarios - Document the feature with a practical example Closes pydantic#3640 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Adds a new `fallback_on_part` parameter to `FallbackModel` that enables checking each response part during streaming. This allows early abort when failure conditions are detectable before the full response completes, such as when a WebFetch tool fails in the first chunk. - Adds FallbackOnPart type alias and callback parameter - Checks parts on PartEndEvent during streaming iteration - Can be combined with fallback_on_response for layered checks - Tracks part_rejections separately in error reporting - Documentation and tests for the new feature 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Changed gemini-2.0-flash to gemini-2.5-flash (2.5 is current stable) - Kept claude-sonnet-4-5 (already current) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Add warning that fallback_on_part buffers the response like fallback_on_response. The caller doesn't receive progressive streaming—the benefit is token savings and faster fallback when failures are detected early. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Flatten nested conditionals using guard clauses for better readability. The examples already use pydantic-ai types correctly via builtin_tool_calls and isinstance checks with BuiltinToolReturnPart. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add provider_url property to BufferedStreamedResponse - Fix type hints from TextPart to ModelResponsePart 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Fix type hints from TextPart to ModelResponsePart for fallback_on_part callbacks - Add isinstance checks before accessing .content on ModelResponsePart - Add proper type annotation for dict content in web_fetch_failed test 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

The response-hooks-design.md was a planning document that shouldn't be included in the final PR. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add explicit imports for BuiltinToolCallPart and BuiltinToolReturnPart - Add type annotations for loop variables - Add dict[str, Any] annotation for content variable - Use guard clause pattern for clearer flow 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

The test framework runs ruff linting separately from code execution. test="skip" only skips execution, not linting. Adding lint="skip" ensures these example snippets don't fail CI linting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add pragma: no branch to guard clauses in test helper - Add pragma: no cover to unreachable lines inside pytest.raises blocks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Move pragma: no cover to continue statements (the actual uncovered lines) - Use cast(dict[str, Any], content) to fix pyright type errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

sarth6 and others added 13 commits December 20, 2025 17:01

Update documentation to use current model names

8a86efe

- Changed gemini-2.0-flash to gemini-2.5-flash (2.5 is current stable) - Kept claude-sonnet-4-5 (already current) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

WIP: Fix CI failures - add provider_url, fix type hints

c140148

- Add provider_url property to BufferedStreamedResponse - Fix type hints from TextPart to ModelResponsePart 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Merge remote-tracking branch 'upstream/main' into fallback-on-response

2315a54

Remove design document from PR

874d503

The response-hooks-design.md was a planning document that shouldn't be included in the final PR. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Fix test coverage issues

43a7015

- Add pragma: no branch to guard clauses in test helper - Add pragma: no cover to unreachable lines inside pytest.raises blocks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add `fallback_on_response` parameter to `FallbackModel` #3786

Add `fallback_on_response` parameter to `FallbackModel` #3786

sarth6 commented Dec 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add fallback_on_response parameter to FallbackModel #3786

Are you sure you want to change the base?

Add fallback_on_response parameter to FallbackModel #3786

Conversation

sarth6 commented Dec 20, 2025

Summary

Motivation

Changes

Example

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add `fallback_on_response` parameter to `FallbackModel` #3786

Add `fallback_on_response` parameter to `FallbackModel` #3786