Skip to content

Commit eea580a

Browse files
committed
feat(code-reviewer): add conditional Performance lens, extend existing agents
The three existing review lenses left performance-sensitive concerns (N+1 queries, algorithmic complexity, unbounded memory growth) in nobody's scope. Adding a conditional 4th subagent keeps the typical case at 3 agents while providing deeper analysis when the diff touches hot paths. Folding standards, docs, and dependency concerns into existing agents closes additional gaps without coordination overhead.
1 parent 583a6d4 commit eea580a

2 files changed

Lines changed: 48 additions & 10 deletions

File tree

src/ai_rules/config/skills/code-reviewer/SKILL.md

Lines changed: 33 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,17 @@ Compute from the gathered diff:
8080

8181
Crossfire (external model perspectives) is only available for Medium/Large diffs when `crossfire` keyword is in args.
8282

83+
### Performance Relevance
84+
85+
For Medium/Large diffs, scan the diff text to determine whether the Performance & Scalability agent should be activated. Set `performance_relevant = true` when the diff contains ANY of:
86+
- Database/ORM query construction (`.filter(`, `.all(`, `.query(`, `.execute(`, `SELECT`, `JOIN`, `WHERE`, raw SQL strings)
87+
- Loops over collections of indeterminate size (`for x in results`, `for item in data`, `while` loops processing external input)
88+
- New function calls, I/O operations, or subprocess invocations inside loops
89+
- Data structures that grow proportionally with input volume (appending to lists/dicts in loops, accumulation patterns)
90+
- Explicit performance-related references in comments/docstrings (`performance`, `latency`, `throughput`, `cache`, `O(n`)
91+
92+
Do NOT activate for: config-only changes, test-only changes, documentation-only changes, UI/template changes, import reorganization, type annotation changes.
93+
8394
## Review Methodology
8495

8596
### Phase 1: Context Gathering
@@ -100,19 +111,22 @@ For Small complexity diffs, execute the review inline using all four lenses sequ
100111
- Is this the right location/abstraction level for this functionality?
101112
- Would this be better in a library or separate module?
102113
- Does the overall design approach make sense for this system?
114+
- If this diff introduces loops, query patterns, or data structure operations: are there obvious algorithmic complexity concerns (e.g., O(n^2) where O(n) is possible) or unnecessary repeated I/O?
103115

104116
**Lens 1: Simplicity & Maintainability**
105117
- Could this be simpler while maintaining functionality?
106118
- Will future developers understand this easily?
107119
- Is there unnecessary complexity or over-engineering?
108120
- Is this solving present needs or hypothetical future problems?
109121
- Are there opportunities to reduce duplication (3+ occurrences)?
122+
- Does the code follow project-specific conventions from AGENTS.md or CLAUDE.md? (naming, directory structure, tooling mandates)
110123

111124
**Lens 2: Security & Reliability**
112125
- Are there security vulnerabilities? (SQL injection, XSS, auth bypass, data exposure)
113126
- Is error handling adequate for external dependencies?
114127
- Are edge cases properly handled?
115128
- Could this cause data corruption or loss?
129+
- If dependency files changed: are new dependencies well-maintained, version-pinned, and free of known vulnerabilities?
116130

117131
**Lens 3: Functionality & Testing**
118132
- Does the code do what the developer intended?
@@ -122,6 +136,7 @@ For Small complexity diffs, execute the review inline using all four lenses sequ
122136
- Do tests verify behavior, not implementation details?
123137
- Is coverage sufficient for the risk level?
124138
- Are tests focused on what matters, not trivial cases?
139+
- For changed APIs or function signatures: are docstrings and documentation still accurate?
125140

126141
After applying all lenses, proceed directly to Phase 3.
127142

@@ -140,13 +155,16 @@ Gather for subagent briefings:
140155

141156
Load the briefing template from `references/subagent-template.md` and construct one briefing per specialist. Launch all agents in parallel — this is critical for speed.
142157

143-
**Claude subagents (always, for Medium/Large):**
158+
**Claude subagents (for Medium/Large):**
159+
160+
| Agent | Model | Lens Focus | Scope Boundaries | Condition |
161+
|-------|-------|------------|-----------------|-----------|
162+
| Security & Reliability | `sonnet` | Injection, auth, data exposure, error handling, edge cases, dependency hygiene | Do NOT review for design fit, over-engineering, test coverage, or performance | Always |
163+
| Design & Simplicity | `sonnet` | Architecture fit, abstraction level, over-engineering, duplication, maintainability, project conventions | Do NOT review for security vulnerabilities, test coverage, or performance cost | Always |
164+
| Functionality & Testing | `sonnet` | Correctness, intended behavior, test coverage, test quality, user-facing edge cases, API contract accuracy | Do NOT review for security vulnerabilities, design patterns, or performance | Always |
165+
| Performance & Scalability | `sonnet` | Algorithmic complexity, query efficiency, I/O patterns, memory growth, hot-path regressions | Do NOT review for security, design architecture, correctness, or test quality | Only when `performance_relevant = true` |
144166

145-
| Agent | Model | Lens Focus | Scope Boundaries |
146-
|-------|-------|------------|-----------------|
147-
| Security & Reliability | `sonnet` | Injection, auth, data exposure, error handling, edge cases | Do NOT review for design fit, over-engineering, or test coverage |
148-
| Design & Simplicity | `sonnet` | Architecture fit, abstraction level, over-engineering, duplication, maintainability | Do NOT review for security vulnerabilities or test coverage |
149-
| Functionality & Testing | `sonnet` | Correctness, intended behavior, test coverage, test quality, user-facing edge cases | Do NOT review for security vulnerabilities or design patterns |
167+
If the diff was flagged as performance-relevant in the Performance Relevance classification, launch all four agents. Otherwise, launch only the three core agents (Security & Reliability, Design & Simplicity, Functionality & Testing).
150168

151169
Each subagent receives: the full diff, instruction to read modified files in full (not just diff hunks), its assigned lens with key questions from the template, explicit scope boundaries, and the severity framework (🔴 MUST FIX / 🟡 SHOULD FIX / 🟢 CONSIDER).
152170

@@ -313,8 +331,16 @@ Synthesize findings from ALL sources (Claude subagents + optional crossfire):
313331
- Identical findings from multiple agents: keep the one with the most specific file:line citation
314332
- Map crossfire severities: CRITICAL → 🔴, IMPORTANT → 🟡, MINOR → 🟢
315333

334+
**Step 2.5: Cross-agent verification**
335+
336+
Before producing the final output, perform two verification checks:
337+
338+
1. **Contradiction check:** Scan for cases where one agent's findings assume something another agent's findings contradict. When detected, apply orchestrator judgment — explain which finding holds and why, rather than presenting both uncritically.
339+
340+
2. **Gap check:** Ask: "Are there concerns that fall between the scope boundaries of the agents that none of them would have been positioned to catch?" Surface any such concerns as orchestrator-attributed findings with appropriate severity.
341+
316342
**Step 3: Produce unified output**
317-
Organize findings by severity tier (🔴 then 🟡 then 🟢), NOT by which agent found them. For each finding, note if it was confirmed by multiple sources. Include a methodology note (e.g., "Reviewed via 3 parallel Claude specialists" or "Reviewed via 3 Claude specialists + Codex + Gemini").
343+
Organize findings by severity tier (🔴 then 🟡 then 🟢), NOT by which agent found them. For each finding, note if it was confirmed by multiple sources. Include a methodology note (e.g., "Reviewed via 3 parallel Claude specialists", "Reviewed via 4 Claude specialists (incl. Performance)" or "Reviewed via 4 Claude specialists + Codex + Gemini").
318344

319345
**Net verdict (PR Mode only):**
320346
- REQUEST_CHANGES if any HIGH-confidence 🔴 MUST FIX exists

src/ai_rules/config/skills/code-reviewer/references/subagent-template.md

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -85,25 +85,37 @@ CRITICAL RULES:
8585
## Lens Definitions
8686

8787
### Security & Reliability Agent
88-
**Focus:** Injection, authentication, authorization, data exposure, error handling, edge cases, data corruption risks
88+
**Focus:** Injection, authentication, authorization, data exposure, error handling, edge cases, data corruption risks, dependency hygiene
8989
**Key questions (adapt per diff):**
9090
- Are there input validation gaps where external data enters the system?
9191
- Could any change introduce auth bypass, data leakage, or injection vulnerabilities?
9292
- Is error handling adequate for external dependencies and failure modes?
9393
- Are edge cases handled that could cause data corruption or crashes?
94+
- If dependency files (pyproject.toml, uv.lock, package.json, Cargo.toml, etc.) are in the diff: are new dependencies well-maintained, pinned to a specific version range, and free of known security concerns?
9495

9596
### Design & Simplicity Agent
96-
**Focus:** Architecture fit, abstraction level, over-engineering, code duplication, maintainability, naming clarity
97+
**Focus:** Architecture fit, abstraction level, over-engineering, code duplication, maintainability, naming clarity, project conventions
9798
**Key questions (adapt per diff):**
9899
- Does this change integrate well with the existing architecture and patterns?
99100
- Is there unnecessary complexity or over-engineering for the problem being solved?
100101
- Are there duplication opportunities (3+ occurrences of similar logic)?
101102
- Will future developers understand this code without the PR context?
103+
- Are there violations of project-specific conventions documented in AGENTS.md, CLAUDE.md, or similar convention files? (Read these files as part of your review context.)
102104

103105
### Functionality & Testing Agent
104-
**Focus:** Correctness, intended behavior, user-facing edge cases, test coverage, test quality
106+
**Focus:** Correctness, intended behavior, user-facing edge cases, test coverage, test quality, API contract accuracy
105107
**Key questions (adapt per diff):**
106108
- Does the code actually do what the developer intended? Any logical errors?
107109
- Are critical paths covered by tests that verify behavior (not implementation)?
108110
- Are there edge cases end users will encounter that aren't handled?
109111
- For UI changes: will the user experience work as expected?
112+
- For any changed public API, function signature, or user-facing behavior: is the corresponding documentation (docstrings, README, changelog) still accurate?
113+
114+
### Performance & Scalability Agent
115+
**Focus:** Algorithmic complexity, query efficiency, I/O patterns, memory growth, hot-path regressions, resource utilization
116+
**Condition:** Only activated when the diff touches performance-sensitive code (database queries, loops over collections, data structure operations, I/O in loops)
117+
**Key questions (adapt per diff):**
118+
- Does this change introduce any operations whose cost scales worse than linearly with input size?
119+
- Are there database queries or I/O operations inside loops that could be batched or hoisted?
120+
- Could any new data structures grow unboundedly in proportion to user/data volume?
121+
- Are there repeated computations or I/O calls that could be cached or deduplicated?

0 commit comments

Comments
 (0)