Add AdaptiveReasoning capability #174

Draft
DouweM wants to merge 2 commits into main from capability/adaptive-reasoning

Conversation

DouweM (Contributor) commented Apr 10, 2026

Summary

  • Adds AdaptiveReasoning capability that dynamically adjusts model thinking effort per step using get_model_settings() with a callable
  • Built-in heuristic: high effort on first step and after tool errors, low effort on simple follow-ups
  • Supports custom effort_fn(RunContext) -> Literal['low', 'medium', 'high'] for domain-specific logic
  • Exported from pydantic_harness package

Test plan

  • 18 tests covering _has_tool_errors helper, default_effort_fn heuristic, AdaptiveReasoning capability with default and custom effort functions
  • 100% code coverage
  • pyright strict mode passes
  • ruff lint and format pass

Closes #84

🤖 Generated with Claude Code

DouweM and others added 2 commits April 2, 2026 05:30
Implements a capability that dynamically adjusts model thinking effort
based on task complexity signals via `get_model_settings()` returning a
callable. Built-in heuristic uses high effort on first step and after
tool errors, low effort on simple follow-ups. Supports custom effort
functions for domain-specific logic.

Closes #84

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Step 2 now returns medium (was incorrectly going straight to low)
- Step 3+ returns low as intended
- Removed unreachable default branch
- Added many-tool-calls signal: if the last ModelResponse had 3+
  ToolCallParts, use medium effort regardless of step number

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
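
The rules described in the two commit messages can be sketched roughly as follows. This is only an illustration under stated assumptions: `StepInfo` and `sketch_effort` are hypothetical stand-ins (the PR's real function is `default_effort_fn`, which reads a `RunContext`), and the relative order of the tool-error and tool-count checks is a guess, since the PR text only says that the rules are evaluated in order.

```python
from dataclasses import dataclass
from typing import Literal

Effort = Literal['low', 'medium', 'high']


@dataclass
class StepInfo:
    """Hypothetical stand-in for the parts of RunContext the heuristic reads."""

    run_step: int
    last_tool_errors: int = 0  # failed tool results seen before this step
    last_tool_calls: int = 0  # ToolCallParts in the last ModelResponse


def sketch_effort(step: StepInfo) -> Effort:
    # First step (the review notes the code uses `<= 1`, so step 0 counts too).
    if step.run_step <= 1:
        return 'high'
    # After a tool error the model has to reason about what went wrong.
    if step.last_tool_errors > 0:
        return 'high'
    # Many tool calls in the last response: medium regardless of step number.
    if step.last_tool_calls >= 3:
        return 'medium'
    # Step 2 eases down to medium; step 3 and later use low.
    return 'medium' if step.run_step == 2 else 'low'


for info in (StepInfo(1), StepInfo(2), StepInfo(3), StepInfo(4, last_tool_calls=3)):
    print(info.run_step, sketch_effort(info))
```

Keeping the heuristic a pure function of a few counters makes each rule easy to test in isolation, which is presumably why the PR tests `default_effort_fn` separately from the capability.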

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 potential issues.

View 3 additional findings in Devin Review.

"""Built-in heuristic effort selector.

Rules (evaluated in order):
1. First step (``run_step == 1``): ``'high'`` -- understand the task.

🟡 Docstring says run_step == 1 but code uses run_step <= 1

The docstring for default_effort_fn documents rule 1 as "First step (run_step == 1)" but the actual implementation at line 68 uses ctx.run_step <= 1, which also matches run_step == 0. The test test_step_zero_high (tests/test_adaptive_reasoning.py:152-154) confirms that run_step=0 returns 'high', matching the code but not the docstring. The PLAN.md:38 correctly documents this as run_step <= 1.

Suggested change
- 1. First step (``run_step == 1``): ``'high'`` -- understand the task.
+ 1. First step (``run_step <= 1``): ``'high'`` -- understand the task.
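
The gap between the two conditions only shows up at `run_step == 0`, which is the case `test_step_zero_high` exercises. A minimal sketch of the difference, using two hypothetical helper functions that are not part of the PR:

```python
def is_first_step_strict(run_step: int) -> bool:
    # What the docstring currently claims.
    return run_step == 1


def is_first_step_inclusive(run_step: int) -> bool:
    # What the implementation is reported to do.
    return run_step <= 1


for run_step in (0, 1, 2):
    print(run_step, is_first_step_strict(run_step), is_first_step_inclusive(run_step))
# 0 False True
# 1 True True
# 2 False False
```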

Comment on lines +83 to +113
@dataclass
class AdaptiveReasoning(AbstractCapability[Any]):
    """Dynamically adjusts model thinking effort per step.

    By default a built-in heuristic is used:

    * **First step** -> ``'high'`` (understand the task)
    * **After tool errors** -> ``'high'`` (reason about what went wrong)
    * **Simple follow-ups** -> ``'low'`` (just incorporating tool results)

    Supply a custom ``effort_fn`` to override these rules::

        def my_effort(ctx: RunContext[MyDeps]) -> Literal['low', 'medium', 'high']:
            if ctx.run_step > 5:
                return 'high'  # wrap-up needs careful thought
            return 'medium'

        agent = Agent(..., capabilities=[AdaptiveReasoning(effort_fn=my_effort)])
    """

    effort_fn: Callable[[RunContext[Any]], Literal['low', 'medium', 'high']] = field(default=default_effort_fn)
    """Callable that receives the current ``RunContext`` and returns an effort level."""

    def get_model_settings(self) -> Callable[[RunContext[Any]], ModelSettings]:
        """Return a dynamic model-settings callable that sets ``thinking`` per step."""

        def _resolve(ctx: RunContext[Any]) -> ModelSettings:
            effort = self.effort_fn(ctx)
            return ModelSettings(thinking=_EFFORT_TO_THINKING[effort])

        return _resolve
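
The closure pattern in `get_model_settings()` can be tried without the framework. The sketch below is not the PR's code: `FakeContext`, `AdaptiveSketch`, and `high_then_low` are hypothetical stand-ins, settings are plain dicts, and the `EFFORT_TO_THINKING` values are assumed, because the real `_EFFORT_TO_THINKING` mapping is not shown in the quoted diff.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Literal

Effort = Literal['low', 'medium', 'high']

# Assumed identity mapping; the real _EFFORT_TO_THINKING values are not shown.
EFFORT_TO_THINKING: dict[str, str] = {'low': 'low', 'medium': 'medium', 'high': 'high'}


@dataclass
class FakeContext:
    """Stand-in for RunContext that carries only the step counter."""

    run_step: int


def high_then_low(ctx: FakeContext) -> Effort:
    return 'high' if ctx.run_step <= 1 else 'low'


@dataclass
class AdaptiveSketch:
    effort_fn: Callable[[FakeContext], Effort] = field(default=high_then_low)

    def get_model_settings(self) -> Callable[[FakeContext], dict[str, Any]]:
        # Return a callable so the effort is recomputed on every step,
        # rather than being fixed once when the capability is created.
        def _resolve(ctx: FakeContext) -> dict[str, Any]:
            return {'thinking': EFFORT_TO_THINKING[self.effort_fn(ctx)]}

        return _resolve


resolve = AdaptiveSketch().get_model_settings()
print(resolve(FakeContext(run_step=1)))  # {'thinking': 'high'}
print(resolve(FakeContext(run_step=3)))  # {'thinking': 'low'}
```

Because `_resolve` looks up `self.effort_fn` at call time, swapping in a custom function through the dataclass field changes the per-step behavior without touching the capability itself.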

🚩 Capability only overrides get_model_settings, no other hooks

The AdaptiveReasoning class only implements get_model_settings() from the AbstractCapability interface. Without being able to inspect the AbstractCapability source (the pydantic-ai-slim package isn't installed in the review environment), I could not verify whether there are other required or recommended methods (e.g., get_system_prompt, get_tools). The tests do verify isinstance(cap, AbstractCapability) passes, and the tests presumably run in CI, so this is likely fine.
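
One way such a design can work is for the base class to give every hook a no-op default, so a subclass overrides only what it needs. The sketch below illustrates that pattern only; `CapabilitySketch` and its hook names are hypothetical and make no claim about the actual `AbstractCapability` interface, which the review could not inspect.

```python
from abc import ABC
from typing import Any, Callable, Optional


class CapabilitySketch(ABC):
    """Hypothetical base class in which every hook has a no-op default."""

    def get_model_settings(self) -> Optional[Callable[[Any], dict[str, Any]]]:
        return None

    def get_instructions(self) -> Optional[str]:
        return None


class OnlySettings(CapabilitySketch):
    """Overrides a single hook and inherits the rest."""

    def get_model_settings(self) -> Optional[Callable[[Any], dict[str, Any]]]:
        return lambda ctx: {'thinking': 'high'}


cap = OnlySettings()
print(isinstance(cap, CapabilitySketch))  # True
print(cap.get_instructions())  # None
```

If any hook on the real base class were declared abstract, instantiating the subclass would raise `TypeError`, so a passing `isinstance` check on a constructed instance is reasonable evidence that no required method is missing.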


DouweM (Contributor, Author) commented Apr 10, 2026

Originally posted by @DouweM in #155 comment (PR was recreated)

Audit vs prior art: AdaptiveReasoning

Worth adding now:

  • Medium effort for step 2 (currently jumps straight from high to low)
  • Tool-count signal: 3+ tool calls in last response -> bump effort
  • Long tool output signal: >2000 chars -> medium effort

Follow-up opportunities:

  • Model-routed effort prediction, phase-based effort
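
The long-tool-output signal from the list above is not described in either commit message, so the following is only a sketch of how it might look. The function name, the choice to raise effort only from `'low'`, and passing tool outputs as plain strings are all assumptions made for illustration.

```python
from typing import Literal, Sequence

Effort = Literal['low', 'medium', 'high']

LONG_OUTPUT_CHARS = 2000  # threshold taken from the audit comment


def effort_for_tool_outputs(outputs: Sequence[str], base: Effort = 'low') -> Effort:
    """Raise 'low' to 'medium' when any tool output exceeds the threshold."""
    if base == 'low' and any(len(text) > LONG_OUTPUT_CHARS for text in outputs):
        return 'medium'
    # Never lower an effort level that another rule already raised.
    return base


print(effort_for_tool_outputs(['ok']))  # low
print(effort_for_tool_outputs(['x' * 2001]))  # medium
print(effort_for_tool_outputs(['x' * 2001], base='high'))  # high
```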

DouweM marked this pull request as draft April 10, 2026 15:12

Development

Successfully merging this pull request may close these issues.

Adaptive Reasoning Effort (per-step thinking budget selection)