diff --git a/docs/agents.md b/docs/agents.md
index 640df1e24c..9833ab0c01 100644
--- a/docs/agents.md
+++ b/docs/agents.md
@@ -124,13 +124,13 @@ It also takes an optional `event_stream_handler` argument that you can use to ga
 
 The example below shows how to stream events and text output. You can also [stream structured output](output.md#streaming-structured-output).
 
-!!! note
-    The `run_stream()` method will consider the first output that matches the [output type](output.md#structured-output) to be the final output of the agent run, even when the model generates tool calls after this "final" output.
+!!! note "Streaming Methods Choose Final Result Differently"
+    [`run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream] and [`run_stream_sync()`][pydantic_ai.agent.AbstractAgent.run_stream_sync] will consider the first tool call that can produce a final result (both [output](output.md#tool-output) and [deferred](deferred-tools.md) tools) to be the final output. This is different from all [other](#running-agents) run methods, which prioritize **output** tools over deferred tools.
 
-    These "dangling" tool calls will not be executed unless the agent's [`end_strategy`][pydantic_ai.agent.Agent.end_strategy] is set to `'exhaustive'`, and even then their results will not be sent back to the model as the agent run will already be considered completed.
+    "Dangling" tool calls generated after the "final output" will not be executed unless the agent's [`end_strategy`][pydantic_ai.agent.Agent.end_strategy] is set to `'exhaustive'`, and even then their results will not be sent back to the model as the agent run will already be considered completed.
 
-    If you want to always keep running the agent when it performs tool calls, and stream all events from the model's streaming response and the agent's execution of tools,
-    use [`agent.run_stream_events()`][pydantic_ai.agent.AbstractAgent.run_stream_events] or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead, as described in the following sections.
+!!! note
+    If you want to always keep running the agent when it performs tool calls, and stream all events from the model's streaming response and the agent's execution of tools, use [`agent.run_stream_events()`][pydantic_ai.agent.AbstractAgent.run_stream_events] or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead, as described in the following sections.
 
 ```python {title="run_stream_event_stream_handler.py"}
 import asyncio
@@ -226,6 +226,8 @@ if __name__ == '__main__':
     """
 ```
 
+_(This example is complete, it can be run "as is")_
+
 ### Streaming All Events
 
 Like `agent.run_stream()`, [`agent.run()`][pydantic_ai.agent.AbstractAgent.run_stream] takes an optional `event_stream_handler`
diff --git a/docs/output.md b/docs/output.md
index dbbb84c6e9..bde9439d38 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -319,6 +319,12 @@ When the model calls other tools in parallel with an output tool, you can contro
 
 The `'exhaustive'` strategy is useful when tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute.
 
+!!! warning "Difference in choosing Final Result in Streaming Methods"
+    [`run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream] and [`run_stream_sync()`][pydantic_ai.agent.AbstractAgent.run_stream_sync] methods select the first tool call that can produce a final result (both [output](#tool-output) and [deferred](deferred-tools.md)) as final output, while all [other](agents.md#running-agents) run methods prioritize [output tools](#tool-output) first.
+
+    See [Streaming Events and Final Output](agents.md#streaming-events-and-final-output) for a detailed explanation.
+
+
 #### Native Output
 
 Native Output mode uses a model's native "Structured Outputs" feature (aka "JSON Schema response format"), where the model is forced to only output text matching the provided JSON schema. Note that this is not supported by all models, and sometimes comes with restrictions. For example, Gemini cannot use tools at the same time as structured output, and attempting to do so will result in an error.
diff --git a/tests/test_agent.py b/tests/test_agent.py
index 41b697d84e..e4b78e6010 100644
--- a/tests/test_agent.py
+++ b/tests/test_agent.py
@@ -3354,7 +3354,14 @@ def deferred_tool(x: int) -> int: # pragma: no cover
         )
 
     def test_early_strategy_with_external_tool_call(self):
-        """Test that early strategy handles external tool calls correctly."""
+        """Test that early strategy handles external tool calls correctly.
+
+        Streaming and sync modes differ in how they choose the final result:
+        - Streaming: First tool call (in response order) that can produce a final result (output or deferred)
+        - Sync: First output tool (if none called, all deferred tools become final result)
+
+        See https://github.com/pydantic/pydantic-ai/issues/3636#issuecomment-3618800480 for details.
+        """
         tool_called: list[str] = []
 
         def return_model(_: list[ModelMessage], info: AgentInfo) -> ModelResponse:
diff --git a/tests/test_streaming.py b/tests/test_streaming.py
index f1ea8eaa17..208ab057b3 100644
--- a/tests/test_streaming.py
+++ b/tests/test_streaming.py
@@ -1137,9 +1137,11 @@ def deferred_tool(x: int) -> int: # pragma: no cover
     async def test_early_strategy_with_external_tool_call(self):
         """Test that early strategy handles external tool calls correctly.
 
-        Streaming mode expects the first output tool call to be the final result,
-        and has different behavior from sync mode in this regard.
-        See https://github.com/pydantic/pydantic-ai/issues/3636 for details.
+        Streaming and sync modes differ in how they choose the final result:
+        - Streaming: First tool call (in response order) that can produce a final result (output or deferred)
+        - Sync: First output tool (if none called, all deferred tools become final result)
+
+        See https://github.com/pydantic/pydantic-ai/issues/3636#issuecomment-3618800480 for details.
         """
 
         tool_called: list[str] = []