Add ToolErrorRecovery capability#171

Open
DouweM wants to merge 2 commits into main from capability/tool-error-recovery

Conversation

@DouweM
Contributor

@DouweM DouweM commented Apr 10, 2026

Summary

  • Adds ToolErrorRecovery capability that catches unhandled tool execution errors and recovers gracefully, preventing agent run crashes
  • Three configurable strategies: 'inform' (the default; returns the error message to the model), ('retry', N) (retries up to N times, then informs), and ('fallback', value) (returns a static value)
  • Per-tool strategy configuration via tool_strategies dict, with default_strategy for unconfigured tools
  • Per-run state isolation via for_run() for retry count tracking
  • Convenience constructors retry() and fallback() for readable strategy definitions

Test plan

  • Unit tests for convenience constructors (retry, fallback) including validation
  • Unit tests for strategy validation (_validate_strategy) covering all valid and invalid forms
  • Unit tests for error formatting with and without traceback
  • Construction validation tests (valid defaults, custom strategies, invalid input)
  • for_run() isolation: fresh instance with reset retry counts
  • inform strategy: error message returned to model (with/without traceback)
  • fallback strategy: static value returned (None, string, dict)
  • retry strategy: success on first attempt, success after failures, exhaustion falls back to inform
  • Non-retry strategies pass through to on_tool_execute_error (wrap_tool_execute re-raises)
  • Strategy resolution: tool-specific overrides and default fallthrough
  • Retry count resets on success across multiple calls
  • Public API import from pydantic_harness
  • 100% code coverage, pyright strict mode (0 errors), ruff lint and format clean

Closes #61

🤖 Generated with Claude Code

DouweM and others added 2 commits April 2, 2026 05:33
…acefully

Catches unhandled tool execution errors and applies configurable recovery
strategies (inform, retry, fallback) per tool, preventing agent run crashes
and enabling the model to self-correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r to ToolErrorRecovery

- retry() now accepts retry_delay (base seconds for 2^attempt backoff) and
  retryable_exceptions (tuple of exception types eligible for retry)
- ToolErrorRecovery gains max_total_errors: after N total errors across all
  tools, recovery stops and errors propagate as-is
- Per-run state (_total_errors) is reset by for_run() alongside _retry_counts
- Full test coverage for all new features including backoff timing verification,
  exception subclass matching, cross-tool budget exhaustion, and validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
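The backoff schedule implied by this commit message (base seconds times 2^attempt) and the subclass matching of retryable_exceptions can be sketched as follows. Both the formula and the isinstance-based matching are inferred from the description above, not copied from the source.

```python
# Inferred from the commit message: retry_delay is the base in seconds,
# and attempt k waits retry_delay * 2**k before retrying.
def backoff_delays(retry_delay: float, attempts: int) -> list[float]:
    return [retry_delay * (2 ** attempt) for attempt in range(attempts)]

# retryable_exceptions is a tuple of exception types; subclasses match,
# exactly as with isinstance().
def is_retryable(exc: BaseException,
                 retryable_exceptions: tuple[type[BaseException], ...]) -> bool:
    return isinstance(exc, retryable_exceptions)

backoff_delays(0.5, 4)                                    # → [0.5, 1.0, 2.0, 4.0]
is_retryable(ConnectionResetError(), (ConnectionError,))  # → True (subclass match)
is_retryable(ValueError("bad"), (ConnectionError,))       # → False
```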

@devin-ai-integration (bot) left a comment


Devin Review found 2 potential issues.

View 2 additional findings in Devin Review.


Comment on lines +304 to +306
    # If the exception isn't retryable, stop immediately.
    if not isinstance(exc, retryable_exceptions):
        return _format_error(call.tool_name, exc, include_traceback=self.include_traceback)

🟡 Non-retryable exceptions bypass the max_total_errors budget check in wrap_tool_execute

In wrap_tool_execute, the non-retryable exception check at line 305 returns an inform message before the budget exhaustion check at line 309. This means when a tool configured with a retry strategy and a custom retryable_exceptions filter encounters a non-retryable exception, it will always be "recovered" (returned as an inform string) even if max_total_errors budget is already exhausted. This contradicts the documented contract of max_total_errors (src/pydantic_harness/tool_error_recovery.py:222-228): "Once the budget is exhausted, subsequent errors propagate as-is instead of being recovered."

Concrete scenario triggering the bug

With ToolErrorRecovery(tool_strategies={'t': retry(3, retryable_exceptions=(ConnectionError,))}, max_total_errors=0), raising a ValueError will increment _total_errors to 1, then hit the non-retryable check and return an inform message — even though _budget_exhausted() would return True (1 > 0). The budget check on line 309 is never reached.
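The ordering problem reduces to a toy control-flow model. The stand-in below mirrors the order of checks described in the finding; it is illustrative only, not the actual wrap_tool_execute source.

```python
def handle_error(exc: BaseException,
                 retryable_exceptions: tuple[type[BaseException], ...],
                 budget_exhausted: bool) -> str:
    """Mirrors the order of checks as written in the PR under review."""
    if not isinstance(exc, retryable_exceptions):  # checked first...
        return "inform"       # ...so a non-retryable error is always recovered
    if budget_exhausted:      # ...and this check is never reached for it
        raise exc
    return "retry"

# A non-retryable ValueError is 'recovered' even with the budget exhausted:
handle_error(ValueError("boom"), (ConnectionError,), budget_exhausted=True)
# → 'inform', contradicting the documented max_total_errors contract
```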

Suggested change

Before:

    # If the exception isn't retryable, stop immediately.
    if not isinstance(exc, retryable_exceptions):
        return _format_error(call.tool_name, exc, include_traceback=self.include_traceback)

After:

    # If the error budget is exhausted, let the error propagate.
    if self._budget_exhausted():
        raise
    # If the exception isn't retryable, stop immediately.
    if not isinstance(exc, retryable_exceptions):
        return _format_error(call.tool_name, exc, include_traceback=self.include_traceback)

Comment on lines +308 to +310
    # If the error budget is exhausted, let the error propagate.
    if self._budget_exhausted():
        raise

🚩 Potential double-counting of _total_errors if framework calls on_tool_execute_error after wrap_tool_execute raises

When the retry strategy's budget is exhausted, wrap_tool_execute re-raises the exception at line 310 (after already incrementing _total_errors at line 301). If the PydanticAI framework then calls on_tool_execute_error for this propagated exception, _total_errors would be incremented again at src/pydantic_harness/tool_error_recovery.py:337. This depends on the framework's hook dispatch behavior — specifically whether on_tool_execute_error fires for exceptions that escape wrap_tool_execute. Without access to the pydantic-ai AbstractCapability source, I can't confirm whether this happens. If it does, the error count would be inflated, though in practice it wouldn't change behavior since the budget is already exhausted at that point.


@DouweM
Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #158 comment (PR closed due to history rewrite)

Audit vs prior art: ToolErrorRecovery

Worth adding now:

  • Exponential backoff for retries
  • max_total_errors budget across all tools
  • retryable_exceptions filter

Follow-up opportunities:

  • Error categorization, metrics/reporting



Development

Successfully merging this pull request may close these issues.

Tool Error Recovery capability

2 participants