feat: add AGENT_DEV_CMD / AGENT_REVIEW_CMD per-side overrides (INV-37)#156
Conversation
Adds:
- docs/pipeline/per-side-agent-cmd.md — spec for AGENT_DEV_CMD /
AGENT_REVIEW_CMD operator knobs that let dev and review run on
different agent CLIs in the same project (e.g. claude for dev,
agy for review). Naming follows AGENT_DEV_MODEL /
AGENT_REVIEW_MODEL precedent. Defaults to AGENT_CMD so existing
deployments are unchanged.
- docs/pipeline/README.md — index entry.
INV-37 (the precedence rule) is referenced in the spec but lands in
invariants.md as part of the implementation PR.
…spec
Review findings landed inline:
- A1: clarify AGENT_LAUNCHER guard runs at lib-agent source time,
BEFORE the wrapper's AGENT_CMD override
- A3: tighten AGENT_PERMISSION_MODE rationale — claude-only branch
consumes it; rare inverse pattern (agy-dev / claude-review)
explicitly deferred
- Q1: pin override-line placement (immediately after source
lib-agent, before any other source)
- Q3: drop step 7 (cross-link in agy-cli-support.md is duplication;
autonomous.conf.example already covers the deployment pattern)
- T1: add PSC-S11 (launcher guard with both sides non-claude) so
refactor can't silently collapse the guard's || into AND
Test count now eleven (S1-S11). No behavior change to the planned
implementation; spec is more precise.
Adds GSTACK REVIEW REPORT section (eng-review CLEARED, 4 issues all resolved inline). Fixes stale 'ten test cases' reference at step 5 (now eleven, matching the test table).
Bite-sized TDD plan, six tasks with red-green-commit cadence: T1 — lib-agent init (2 vars + guard rewrite) + PSC-S1..S8 + S11 T2 — autonomous-dev.sh override + PSC-S9 T3 — autonomous-review.sh override + PSC-S10 T4 — autonomous.conf.example operator block T5 — INV-37 in invariants.md T6 — final regression + push + PR Self-review confirms full spec coverage, no placeholders, type consistency across tasks.
Adds per-side AGENT_CMD overrides to lib-agent.sh. Default to AGENT_CMD so existing deployments are byte-for-byte unchanged. The AGENT_LAUNCHER guard is rewritten to read AGENT_DEV_CMD / AGENT_REVIEW_CMD directly. The new check is strictly more permissive than the old one — it rejects the same set of bad configs (any non-claude side with launcher set) plus correctly accepts the new 'both sides claude' default case via the per-side path. Tests: PSC-S1..S8 + S11 (defaults, single-side override, both-side, empty-string fallback, launcher guard 4 ways). PSC-S9/S10 (wrapper structural placement) land in Tasks 2 and 3. Existing test-lib-agent-*.sh suites all pass — zero regressions.
… placeholders Code-quality review nit: shellcheck SC2034 (declared but not used) on the two wrapper-path variables. They're consumed by PSC-S9 / PSC-S10 which land in Tasks 2 and 3 of the plan; defining them here keeps the test scaffolding in one place across all three tasks. Add an explanatory comment + 'shellcheck disable=SC2034' so a default- severity shellcheck run is clean without changing what the variables mean to a human reader.
…urce Adds AGENT_CMD=$AGENT_DEV_CMD immediately after source lib-agent.sh so the case statements in run_agent / resume_agent dispatch to the dev-side CLI when the operator has set per-side overrides. Default behavior (no overrides) is unchanged because AGENT_DEV_CMD defaults to $AGENT_CMD inside lib-agent.sh. Test: PSC-S9 asserts the line lands within 2 lines of the source statement so a future refactor can't accidentally move it past code that consumes $AGENT_CMD (earliest consumer is line 177).
Spec-compliance review surfaced an internal inconsistency in the plan: Step 2.4 specs a 3-line WHY-rationale comment before AGENT_CMD= "$AGENT_DEV_CMD", but Step 2.1's awk window of "+2" only allowed 1 intervening line, forcing one or the other to be sacrificed. Resolve by restoring the spec'd 3-line comment (it carries useful WHY-documentation that justifies the override pattern for future maintainers) and widening the awk window to "+4". The window is still tight enough to catch a refactor that moves the override past the first $AGENT_CMD consumer (line 177 was the original concern).
…ent source Mirrors the dev-side change. AGENT_CMD=$AGENT_REVIEW_CMD lands immediately after source lib-agent.sh so the run_agent dispatch and the 'Reviewed HEAD: ... agent X' trailer (line ~636) report the review-side CLI correctly. Test: PSC-S10 asserts placement structurally with a +5 line window (accommodates the 4-line WHY-rationale comment, same as the dev-side fix in commit 7bea617).
…onf.example Adds a 17-line operator-facing comment block between the EXTRA_ARGS declarations and the per-CLI blocks. Includes a worked example for the 'claude for dev, agy for review' deployment pattern (the motivating use case) and the AGENT_LAUNCHER constraint reminder.
INV-37 documents the AGENT_DEV_CMD / AGENT_REVIEW_CMD precedence rule that lets one project run dev and review on different agent CLIs. Defaults preserve back-compat (both vars fall back to AGENT_CMD via :- semantics). AGENT_LAUNCHER guard tightens to require both sides claude. Anchored to docs/pipeline/per-side-agent-cmd.md (which already exists on this branch with cross-reference [INV-37] resolved by this commit).
PR-review fixes (Critical + High + Medium):
C1 (CLAUDE.md No Private Repo Refs): replace 'podcast-curation'
with 'a downstream consumer project' / 'the motivating use case'
in two committed docs (per-side-agent-cmd.md failure-mode table
and the implementation plan PR body example).
H1: drop stale lib-agent.sh:106-108 line ref for the AGENT_LAUNCHER
guard (the guard moved during the rewrite); reference by name
instead so the doc survives churn.
H2: drop stale ~lines 588, 800 line refs for AGENT_PERMISSION_MODE
consumption (claude branch consumes it; line numbers were ~12
off from origin/main and would drift further).
M1: spec said the operator-block lands 'after AGENT_PERMISSION_MODE,
before per-CLI blocks' — actual placement (and where the
autonomous.conf.example edit lands) is after AGENT_*_EXTRA_ARGS
defaults. Update to match.
M2 + M3: PSC-S9 / PSC-S10 spec table descriptions said 'at most
one blank line of separation' / 'immediately after' — actual
awk windows are +4 (dev, 3-line WHY comment) and +5 (review,
4-line WHY comment) per the lessons learned during Tasks 2/3.
Spec table now matches the implementation.
Test plan: 14/14 PSC assertions still pass; all peer test-lib-agent-*
suites still pass; shellcheck clean. (Note: pre-existing private-repo
references in unrelated tracked docs are out of scope per CLAUDE.md
'what I WRITE' clause.)
agy CLI review (Antigravity 2.0, v1.0.2)I ran the local Finding 1:
|
…-completed (agy review #156) Address Findings 2 + 3 from the agy CLI review of PR #156. The dispatcher tick does not source lib-agent.sh and so cannot inherit the wrapper-level AGENT_CMD override (added by this same PR for run_agent/resume_agent dispatch). Two helpers in lib-dispatch.sh need explicit per-side awareness: Finding 2 — _pgid_has_agent_process now accepts an optional 2nd argument: the per-side CLI to match against process comms in the group. Callers updated: dev_near_success → "${AGENT_DEV_CMD:-${AGENT_CMD:-claude}}" review_near_success → "${AGENT_REVIEW_CMD:-${AGENT_CMD:-claude}}" Empty/missing 2nd arg falls back to AGENT_CMD for back-compat with existing single-arg callers and test mocks. Without this, a project running AGENT_REVIEW_CMD=agy would have its live agy review process classified as DEAD by the dispatcher's *claude* substring match. Finding 3 — is_session_completed gates on the dev-side CLI (${AGENT_DEV_CMD:-${AGENT_CMD:-claude}}) because it parses the dev wrapper's log file. Without this, a project with AGENT_DEV_CMD=codex and the dispatcher's AGENT_CMD=claude default would attempt to parse codex JSONL with claude rules, producing wrong completion-state verdicts. Test plan: - new tests/unit/test-pgid-has-agent-process.sh — 9 assertions covering per-side arg, fallback, back-compat 1-arg call, bad-PGID guards. Spawns a child via cp /bin/sleep + setsid so its comm matches a chosen name. - extends tests/unit/test-is-session-completed.sh with TC-WH-005b: 3 assertions covering AGENT_DEV_CMD ≠ AGENT_CMD split config. - all existing peer tests still pass (test-dev-near-success, test-dispatcher-review-near-success, test-is-session-completed, test-is-session-completed-end-ts). Spec: docs/pipeline/per-side-agent-cmd.md gains a §Dispatcher-side coupling section. INV-37 Consumer field extended to name the two dispatcher-side reads. Test-cases doc at docs/test-cases/dispatcher-per-side-cmd-coupling.md tracks the fixes.
Findings 2 + 3 addressed in commit 7675a90Pushed a fix commit to this branch addressing Findings 2 + 3 from the agy review above. What changedFinding 2 —
Finding 3 — Tests:
Spec/doc:
Finding 1 (pre-existing Verification
|
agy re-review (after commit 7675a90)Re-ran the local agy CLI to re-review after pushing the fix commit. agy resumed the prior conversation ( Verdict: CLEARED
Verified by agy:
Finding 1 ( PR is ready for merge from agy's perspective. CI green (Unit Tests + ShellCheck + pipeline-docs gate all pass). |
…ide overrides (#158) Adds AGENT_DEV_LAUNCHER and AGENT_REVIEW_LAUNCHER operator knobs that let dev and review wrappers each have their own launcher prefix. Both default to \${AGENT_LAUNCHER:-} so existing deployments are byte-for- byte unchanged. INV-38 documents the precedence rule. The single INV-37 'both sides claude' launcher guard is replaced by two independent per-side guards: each side's launcher is gated on THAT side's AGENT_CMD. Strictly more permissive than the INV-37 form. The motivating use case: a project with AGENT_CMD=claude + AGENT_DEV_LAUNCHER=cc-bridge (Bedrock dev) + AGENT_REVIEW_CMD=kiro (no launcher needed for review). Pre-INV-38 was blocked by INV-37's both- sides-claude check. Post-INV-38 works. Includes: - skills/autonomous-dispatcher/scripts/lib-agent.sh — init + per- side argv tokenization mirroring the existing AGENT_LAUNCHER eval; two per-side guards replace the single guard - skills/autonomous-dispatcher/scripts/autonomous-dev.sh — AGENT_LAUNCHER_ARGV rebind to AGENT_DEV_LAUNCHER_ARGV after the existing AGENT_CMD rebind from #156 - skills/autonomous-dispatcher/scripts/autonomous-review.sh — symmetric rebind - skills/autonomous-dispatcher/scripts/autonomous.conf.example — operator-facing comment block with worked example + caveat about the kiro --trust-all-tools stage-vs-side gotcha (pre-existing) - tests/unit/test-lib-agent-per-side-launcher.sh — 14 assertions (PSL-S1..S10): defaults, fallback, per-side override, per-side guard pass/fail with RC=1 lock, structural placement - tests/unit/test-lib-agent-per-side-cmd.sh — PSC-S7/S8/S11 needles updated to match the new per-side error messages - docs/pipeline/per-side-launcher.md — full spec - docs/pipeline/invariants.md — INV-38 - docs/pipeline/README.md — index entry agy CLI review CLEARED with clean sweep — no other AGENT_LAUNCHER reads exist on the dispatcher side.
Summary
Adds two operator knobs in
lib-agent.shthat let dev and review wrappers run on different agent CLIs in the same project:AGENT_DEV_CMD— defaults to${AGENT_CMD:-claude}AGENT_REVIEW_CMD— defaults to${AGENT_CMD:-claude}Each wrapper sets
AGENT_CMDto its side's value immediately after sourcinglib-agent.sh. TheAGENT_LAUNCHERclaude-only guard generalizes to reject any non-claude side (strictly more permissive than the old single-side check — never rejects a config that previously passed).The motivating use case: a downstream consumer project wants
AGENT_CMD="claude"for heavy dev andAGENT_REVIEW_CMD="agy"for cheaper review.What's in the PR
skills/autonomous-dispatcher/scripts/lib-agent.sh— 7-line init + guard rewriteskills/autonomous-dispatcher/scripts/autonomous-dev.sh— 4-line override block aftersource lib-agent.shskills/autonomous-dispatcher/scripts/autonomous-review.sh— 5-line override block (symmetric)skills/autonomous-dispatcher/scripts/autonomous.conf.example— operator-facing comment block with worked exampletests/unit/test-lib-agent-per-side-cmd.sh— 14 assertions (PSC-S1..S11): defaults, single-side override, both-side, empty-string, launcher guard ×4, wrapper structural placementdocs/pipeline/per-side-agent-cmd.md— specdocs/pipeline/invariants.md— INV-37docs/pipeline/README.md— index entrydocs/superpowers/plans/2026-05-25-per-side-agent-cmd.md— implementation plan (kept for traceability)Backwards compatibility
Existing deployments don't set the new vars. Defaults make both equal to
$AGENT_CMD, so behavior is byte-for-byte identical pre/post. Verified: all 7 peertest-lib-agent-*.shsuites still pass.Spec & invariant
docs/pipeline/per-side-agent-cmd.mdINV-37indocs/pipeline/invariants.mdPipeline-docs gate
Touches
lib-agent.sh(watched path), sodocs/pipeline/must also change. Satisfied:docs/pipeline/per-side-agent-cmd.mddocs/pipeline/invariants.md(INV-37)docs/pipeline/README.mdTest plan
bash tests/unit/test-lib-agent-per-side-cmd.sh— 14/14 PASStests/unit/test-lib-agent-*.shregression tests PASSshellcheck -S erroron lib-agent.sh + both wrappers + new testbash -n autonomous.conf.examplesyntax OKunit-tests+shellcheck+pipeline-docs-gategreenReview process
The PR went through
/pr-review-toolkit:review-prwith four specialized agents (code-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer). Findings addressed inline:podcast-curationprivate-repo refs from two committed docs (CLAUDE.md "No Private Repo Refs" rule)+4/+5reflecting the actual WHY-rationale comment sizes)Silent-failure-hunter found 0 critical issues; 2 cosmetic observations (no fix recommended). Test analyzer noted 3 improvement opportunities (rated 5-7) deferred to follow-up.
Future work (filed as suggestions in review, NOT blocking this PR)
_LIB_AGENT_AGY_MODEL_WARNED-style WARN-once gate for the (rare) inverse pattern (agy-dev / claude-review with conflicting permission needs) — currently theAGENT_PERMISSION_MODEvalue is silently dropped on agy side, which is fine for the common pattern but not for the inverseAGENT_CMDrebinding actually steerscase "$AGENT_CMD"dispatch (PSC-S9/S10 are structural-only)env -iisolation in test sandboxes (mitigates hypothetical dev-boxAUTONOMOUS_CONFambient export)