Conversation
Implements a capability that runs configurable verification checks (e.g. lint, test, build) after agent completion and automatically retries with failure feedback if any check fails, up to a configurable maximum number of retries.

Closes #79

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add `parallel: bool = True` parameter to run verifiers concurrently via `asyncio.gather` (falls back to sequential for a single verifier)
- Improve retry feedback prompt to explicitly say "ONLY fix the failing checks, do not make other changes"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
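A minimal sketch of the `parallel` option described above: run verifier coroutines concurrently with `asyncio.gather`, falling back to sequential execution for a single verifier. The names here (`run_verifiers`, the `(name, ok, message)` tuple shape, the sample `lint`/`tests` checks) are illustrative, not the PR's exact API.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Illustrative verifier signature: a no-arg coroutine returning
# (check name, passed?, failure message).
Verifier = Callable[[], Awaitable[tuple[str, bool, str]]]


async def run_verifiers(
    verifiers: list[Verifier], parallel: bool = True
) -> list[tuple[str, str]]:
    """Return (name, message) pairs for every failing verifier."""
    if parallel and len(verifiers) > 1:
        # Concurrent path: one awaitable per verifier; gather preserves
        # input order in its results.
        results = await asyncio.gather(*(v() for v in verifiers))
    else:
        # Sequential fallback: no point scheduling tasks for one check.
        results = [await v() for v in verifiers]
    return [(name, msg) for name, ok, msg in results if not ok]


async def lint() -> tuple[str, bool, str]:
    return ("lint", True, "")


async def tests() -> tuple[str, bool, str]:
    return ("test", False, "2 tests failed")


failures = asyncio.run(run_verifiers([lint, tests]))
# failures == [("test", "2 tests failed")]
```

Since `gather` preserves input order, the failure summary stays deterministic regardless of which check finishes first.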
```python
self._in_retry = True
try:
    result = await agent.run(
        feedback,
        message_history=result.all_messages(),
    )
finally:
    self._in_retry = False
```
🔴 Instance-level _in_retry flag causes verification to be silently skipped during concurrent agent runs
The _in_retry boolean is mutable shared state on the VerificationLoop instance. When the agent retries, it sets self._in_retry = True (line 157) and then await agent.run(...) which yields control. If another concurrent agent.run() call enters wrap_run on the same agent (and thus the same capability instance) while the first run is retrying, it will see _in_retry = True at line 130 and skip all verification, returning the result unchecked. After the first run finishes its retry, _in_retry is reset to False (line 164), but the damage is done — the second run silently bypassed verification entirely. Since PydanticAI agents are designed to be reusable across concurrent calls, this is a realistic scenario. A per-call token (e.g., using contextvars or checking the RunContext identity) would avoid this.
Prompt for agents
The _in_retry flag at line 112 is instance-level mutable state that is shared across all concurrent wrap_run invocations on the same VerificationLoop instance. When one run sets _in_retry = True and awaits agent.run() (yielding control), another concurrent run entering wrap_run will see _in_retry as True and skip verification entirely.
To fix this, use a per-call mechanism instead of a shared boolean. Options include:
1. Use a contextvars.ContextVar to track whether the current execution context is a retry, so each async task has its own value.
2. Pass a unique run identifier through the RunContext and track which run IDs are retries in a set.
3. Use an asyncio.Lock or counter (e.g., an integer tracking nested retry depth per task) instead of a plain boolean.
The fix needs to ensure that (a) retry runs on the same call chain still skip verification (to prevent infinite recursion), while (b) unrelated concurrent agent.run() calls on the same agent are not affected.
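A minimal sketch of option 1, assuming nothing beyond the standard library: a `contextvars.ContextVar` is copied into each asyncio task's context, so setting it inside one run's retry cannot leak into a concurrent run. The `retrying_run`/`concurrent_run` names are hypothetical stand-ins for the two call chains described above.

```python
import asyncio
import contextvars

# Per-context retry flag: every asyncio task gets its own copy, so one
# run's retry cannot make a concurrent run on the same instance skip
# verification the way a shared `self._in_retry` boolean would.
_in_retry: contextvars.ContextVar[bool] = contextvars.ContextVar(
    "in_retry", default=False
)


async def retrying_run() -> None:
    # Stands in for the retry branch: set the flag, then await agent.run().
    token = _in_retry.set(True)
    try:
        await asyncio.sleep(0.02)  # simulate the awaited retry call
    finally:
        _in_retry.reset(token)


async def concurrent_run() -> bool:
    # Stands in for an unrelated run entering wrap_run mid-retry.
    await asyncio.sleep(0.01)
    return _in_retry.get()  # False here; a shared bool would read True


async def demo() -> bool:
    _, saw_retry_flag = await asyncio.gather(retrying_run(), concurrent_run())
    return saw_retry_flag
```

Requirement (a) still holds because `await agent.run(...)` executes in the same context that set the flag, so the nested `wrap_run` sees `True`; requirement (b) holds because the concurrent task's context was copied before (and independently of) the `set`.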
```python
for attempt in range(1, self.max_retries + 1):
    failures = await self._run_verifiers()
    if not failures:
        return result

    failure_summary = '; '.join(f'{name}: {msg}' for name, msg in failures)
    feedback = self._build_feedback(failures, attempt)
    logger.info(
        'Verification failed (attempt %d/%d): %s',
        attempt,
        self.max_retries,
        failure_summary,
    )

    if agent is None:  # pragma: no cover — defensive; agent is always set in practice
        warnings.warn(
            'Verification failed but agent is not available on RunContext for retry. Returning last result.',
            stacklevel=2,
        )
        return result

    # Mark that the next run is a retry so wrap_run passes through.
    self._in_retry = True
    try:
        result = await agent.run(
            feedback,
            message_history=result.all_messages(),
        )
    finally:
        self._in_retry = False

# Final verification after last retry.
failures = await self._run_verifiers()
if not failures:
    return result

warnings.warn(
    f'Verification still failing after {self.max_retries} retries: '
    + '; '.join(f'{name}: {msg}' for name, msg in failures),
    stacklevel=2,
)
return result
```
🚩 Verification still runs even with max_retries=0
With max_retries=0, the for loop at line 135 is range(1, 1) which is empty, so no retries happen. However, the final verification block at lines 167-176 still executes, running verifiers and potentially emitting a warning like 'Verification still failing after 0 retries'. This means max_retries=0 does not disable verification — it disables retries but still verifies once. This may be the intended behavior (verify but don't retry), but it's worth documenting since a user might expect max_retries=0 to skip verification entirely.
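The behavior above can be illustrated with a minimal model of the control flow (a hypothetical `loop_semantics` helper, not the PR's code), counting attempts for an always-failing verifier:

```python
def loop_semantics(max_retries: int) -> tuple[int, int]:
    """Return (retries_attempted, verifier_runs) when every check fails."""
    retries = 0
    verifier_runs = 0
    for _attempt in range(1, max_retries + 1):  # empty when max_retries == 0
        verifier_runs += 1  # verify before each retry
        retries += 1        # retry with feedback
    verifier_runs += 1      # the final verification after the loop always runs
    return retries, verifier_runs


# max_retries=0 disables retries but still verifies (and may warn) once.
assert loop_semantics(0) == (0, 1)
```

So if "verify once, never retry" is the intended reading of `max_retries=0`, the docstring should say so; skipping verification entirely would need a separate switch.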
Audit vs prior art: VerificationLoop

Worth adding now:

Follow-up opportunities:
Summary
- `VerificationLoop` capability that runs configurable verification checks after agent completion and retries with failure feedback on failure
- New classes: `VerificationLoop`, `Verifier`, `VerificationResult`
- Uses the `wrap_run` hook to orchestrate the verify-fix-retry loop with `ctx.agent.run()` for retries, passing accumulated message history plus structured failure feedback
- Configurable `max_retries` (default 3); emits `UserWarning` if all retries are exhausted

Test plan

- Unit tests for the `_build_feedback` and `_run_verifiers` helpers
- Integration tests with `TestModel`: pass on first try, retry then pass, max retries exceeded, partial failures, no verifiers, feedback content verification, final-check-after-loop pass
- `ruff check` and `ruff format` pass
- `pyright` strict mode passes on both `src/` and `tests/`
- `coverage report` shows 100% across all files

Closes #79
🤖 Generated with Claude Code