Add StuckLoopDetection capability#186

Draft
DouweM wants to merge 2 commits into main from
capability/stuck-loop-detection
Conversation

Contributor

@DouweM DouweM commented Apr 10, 2026

Summary

  • Adds StuckLoopDetection capability that detects three patterns of repetitive agent behavior: repeated identical tool calls, alternating A-B-A-B call patterns, and no-op calls (same tool returning the same result)
  • Configurable threshold (max_repeated_calls, default 3) and action (warn via ModelRetry or error via StuckLoopError)
  • Uses for_run() for per-run state isolation so concurrent runs don't interfere
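
The three patterns and the two actions listed above can be sketched as follows. This is a minimal illustration, not the PR's implementation: `LoopDetectorSketch`, `call_key`, `record_call`, and `record_result` are hypothetical names, the two exception classes are local stand-ins so the snippet runs on its own, and requiring `max_repeated_calls` full A-B pairs for the alternating check is an assumption.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any, Literal


class StuckLoopError(RuntimeError):
    """Local stand-in for the PR's StuckLoopError."""


class ModelRetry(Exception):
    """Local stand-in for the framework's ModelRetry exception."""


def call_key(tool_name: str, args: dict[str, Any]) -> str:
    # Order-independent fingerprint of one tool call (hypothetical helper).
    return f"{tool_name}:{json.dumps(args, sort_keys=True, default=str)}"


@dataclass
class LoopDetectorSketch:
    max_repeated_calls: int = 3
    action: Literal["warn", "error"] = "warn"
    calls: list[str] = field(default_factory=list)
    results: list[tuple[str, str]] = field(default_factory=list)

    def for_run(self) -> LoopDetectorSketch:
        # Fresh, empty history per run so concurrent runs never share state.
        return LoopDetectorSketch(self.max_repeated_calls, self.action)

    def _check_repeated(self) -> str | None:
        n = self.max_repeated_calls
        tail = self.calls[-n:]
        if len(tail) == n and len(set(tail)) == 1:
            return f"identical call repeated {n} times"
        return None

    def _check_alternating(self) -> str | None:
        # Assumption: n full A-B pairs (2 * n calls) count as a loop.
        n = self.max_repeated_calls
        tail = self.calls[-2 * n:]
        if len(tail) < 2 * n or tail[0] == tail[1]:
            return None
        a, b = tail[0], tail[1]
        if all(c == (a if i % 2 == 0 else b) for i, c in enumerate(tail)):
            return f"alternating A-B pattern repeated {n} times"
        return None

    def _check_noop(self) -> str | None:
        n = self.max_repeated_calls
        tail = self.results[-n:]
        if len(tail) == n and len(set(tail)) == 1:
            return f"same tool returned the same result {n} times"
        return None

    def _trigger(self, reason: str) -> None:
        if self.action == "error":
            raise StuckLoopError(reason)
        raise ModelRetry(reason)

    def record_call(self, tool_name: str, args: dict[str, Any]) -> None:
        self.calls.append(call_key(tool_name, args))
        reason = self._check_repeated() or self._check_alternating()
        if reason is not None:
            self._trigger(reason)

    def record_result(self, tool_name: str, result: Any) -> None:
        self.results.append((tool_name, repr(result)))
        reason = self._check_noop()
        if reason is not None:
            self._trigger(reason)
```

In this sketch a shared template object holds only configuration; each run calls `for_run()` once and records calls and results on the copy it gets back.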

Closes #71.

Test plan

  • 32 tests covering all detection scenarios, edge cases, and configuration options
  • 100% code coverage (branch and statement)
  • Passes ruff lint and format checks
  • Passes pyright strict type checking
  • Verify with a real agent that ModelRetry successfully redirects the model

🤖 Generated with Claude Code

DouweM and others added 2 commits April 2, 2026 05:27
Implements a capability that monitors tool-call patterns via
after_model_request and after_tool_execute hooks, detecting three
scenarios: repeated identical calls, alternating A-B-A-B patterns,
and no-op calls (same result returned). Configurable threshold and
action (warn via ModelRetry or abort via StuckLoopError).

Closes #71.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents unbounded memory growth during long agent runs by discarding
oldest entries (from the left) when lists exceed the configured limit.
Defaults to 50 entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
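
The trimming described in the second commit message might look roughly like this. It is a sketch under assumptions: `trim_history` is a hypothetical helper name, and the parameter name `max_history_length` is borrowed from the audit comment later in this conversation rather than from the diff itself.

```python
from typing import TypeVar

T = TypeVar("T")


def trim_history(history: list[T], max_history_length: int = 50) -> None:
    """Drop the oldest entries in place so at most max_history_length remain."""
    excess = len(history) - max_history_length
    if excess > 0:
        # The oldest entries sit at the left end of the list.
        del history[:excess]
```

A `collections.deque(maxlen=...)` would give the same drop-from-the-left behavior automatically. Either way, the limit needs to be at least as large as the longest detection window, otherwise the checks can never see enough history to fire.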

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Comment on lines +172 to +180
# --- Check for repeated identical calls ---
reason = self._check_repeated()
if reason is None:
    reason = self._check_alternating()

if reason is not None:
    self._trigger(reason)

return response
🚩 ModelRetry after detection does not clear history — could cause repeated warnings

When action='warn' and a loop is detected, ModelRetry is raised but _call_history is not cleared. If the model retries with the same call again, the history still contains the old repeated entries plus the new one, and detection will trigger again immediately. This creates a cycle of ModelRetry → same call → ModelRetry that will exhaust max_result_retries and abort the run. This is arguably the correct behavior (the agent genuinely is stuck), but it means action='warn' may effectively behave like action='error' if the model doesn't change strategy after the first retry. A design where the history is partially reset after a warning (to give the model a fresh chance) was presumably considered and rejected. This is not a bug but worth noting for documentation.
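
The trade-off described above can be illustrated with a small simulation. This is not code from the PR: `count_warnings` is a hypothetical function, the model is assumed to repeat the exact same call on every turn, and a full `clear()` stands in for whatever partial reset a real design might use.

```python
def count_warnings(total_calls: int, threshold: int = 3, reset_on_warn: bool = False) -> int:
    """Count how often detection fires while the model keeps repeating one call."""
    history: list[str] = []
    warnings = 0
    for _ in range(total_calls):
        history.append("search:{}")  # the model never changes strategy
        tail = history[-threshold:]
        if len(tail) == threshold and len(set(tail)) == 1:
            warnings += 1
            if reset_on_warn:
                # Give the model a fresh chance after each warning.
                history.clear()
    return warnings


print(count_warnings(6))                      # 4: fires on calls 3, 4, 5 and 6
print(count_warnings(6, reset_on_warn=True))  # 2: fires on calls 3 and 6
```

Without a reset, every call after the first detection triggers again, which is why `action='warn'` can use up the retry budget as quickly as the comment describes. With a reset, the model gets `threshold - 1` quiet calls between warnings.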

Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #130 comment (PR was recreated)

Note: Currently uses StuckLoopError (propagates as unhandled exception) or ModelRetry (warns the model). A cleaner termination path would use #145 (StopConditionMet exception) once available.

Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #130 comment (PR was recreated)

Audit vs prior art: StuckLoopDetection

Worth adding now:

  • Sliding-window fuzzy matching (same tool, slightly different args) — OpenHands pattern
  • max_history_length cap to bound memory
  • Configurable string similarity threshold for no-op detection
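
One way the fuzzy-matching and similarity-threshold ideas above could be approached is with the standard library's `difflib.SequenceMatcher`. This is only a sketch: `is_similar`, `fuzzy_repeats`, the `window` parameter, and the 0.9 default are assumptions for illustration, and it is not a description of how OpenHands implements its check.

```python
import json
from difflib import SequenceMatcher
from typing import Any


def is_similar(a: str, b: str, threshold: float = 0.9) -> bool:
    # ratio() is 1.0 for identical strings and approaches 0.0 as they diverge.
    return SequenceMatcher(None, a, b).ratio() >= threshold


def fuzzy_repeats(
    calls: list[tuple[str, dict[str, Any]]],
    window: int = 3,
    threshold: float = 0.9,
) -> bool:
    """True if the last `window` calls use one tool with near-identical args."""
    tail = calls[-window:]
    if len(tail) < window:
        return False
    if len({name for name, _ in tail}) != 1:
        return False
    first = json.dumps(tail[0][1], sort_keys=True)
    return all(
        is_similar(first, json.dumps(args, sort_keys=True), threshold)
        for _, args in tail[1:]
    )
```

The same `is_similar` helper could compare stringified tool results for no-op detection, so that results differing only in a timestamp or counter still count as "the same".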

Follow-up opportunities:

  • Higher-level semantic loop detection
  • "3 identical commands with no edits" pattern (SWE-agent)

@DouweM DouweM marked this pull request as draft April 10, 2026 15:13
@dmmihov

dmmihov commented Apr 13, 2026

Noticed the community alternatives section is blank. vstorm just released their stuck loop detection as well; thought it might be worth bringing to your attention.

https://github.com/vstorm-co/pydantic-deepagents/blob/main/pydantic_deep/capabilities/stuck_loop.py#L75

Development

Successfully merging this pull request may close these issues.

Stuck/Loop Detection capability

2 participants