chore(agent): log SDK options and capture claude subprocess stderr #16

Merged
rebelopsio merged 1 commit into main from diag/agent-stderr-and-options
May 13, 2026

@rebelopsio
Owner

Summary

./archy daily --force was reporting a successful agent run with turns=0, zero cost, zero tool calls, empty text, and a duration of 1.6s, which means the claude subprocess was exiting immediately without consulting the model. The actual error was on the subprocess's stderr, which archy was dropping. This PR makes that stderr visible.

SDK investigation: the partio-io SDK exposes claude.WithStderrCallback(func(string)) Option (option.go:225). Option A from the spec applied directly — no need to fall back.

Changes:

  • New RunResult.SubprocessStderr field. Run registers a WithStderrCallback that appends each line into a mutex-guarded buffer (callback fires on the SDK's drain goroutine), and writes the buffer into RunResult.SubprocessStderr on every return path. The iteration-error path also includes stderr inline in the ErrRun error string.
  • New logSDKInvocation helper prints a structured one-time summary of the agent setup to stderr (skill, model, max_turns, permission_mode, cwd, cli_path, archy_binary, sdk_option_count, enabled MCP servers, skills dirs). No secrets are logged.
  • Runtime.stderrLog io.Writer field defaults to os.Stderr; tests swap in io.Discard to keep output quiet.
  • cmd/daily_run.go: explainAgentOutcome surfaces SubprocessStderr (truncated to 2000 chars) in the verification-failure error.
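
For readers who want the shape of the capture pattern without opening the diff, here is a minimal, self-contained sketch. It does not use the partio-io SDK: `stderrBuffer`, `runResult`, `run`, and `emit` are hypothetical stand-ins, and the goroutine inside `run` only simulates the SDK's drain goroutine calling a `func(string)` callback. The deferred assignment to a named result is one way to cover every return path; the PR may do this differently.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// stderrBuffer collects subprocess stderr lines. The callback can fire on a
// different goroutine than the one that reads the result, so every access
// goes through the mutex.
type stderrBuffer struct {
	mu    sync.Mutex
	lines []string
}

// Append has the func(string) shape that a stderr callback expects.
func (b *stderrBuffer) Append(line string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.lines = append(b.lines, line)
}

// String returns everything captured so far, one entry per line.
func (b *stderrBuffer) String() string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return strings.Join(b.lines, "\n")
}

// runResult is a hypothetical stand-in for archy's RunResult.
type runResult struct {
	SubprocessStderr string
}

// run simulates an agent run. emit stands in for the SDK's drain goroutine
// and receives the callback it should invoke for each stderr line.
func run(emit func(cb func(string))) (res runResult, err error) {
	buf := &stderrBuffer{}
	// Assigning to the named result in a defer covers every return path.
	defer func() { res.SubprocessStderr = buf.String() }()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		emit(buf.Append)
	}()
	wg.Wait()
	return res, nil
}

func main() {
	res, _ := run(func(cb func(string)) {
		cb("error: invalid model")
		cb("exiting")
	})
	fmt.Println(res.SubprocessStderr)
}
```

With the real SDK, `buf.Append` would be the function passed to `claude.WithStderrCallback`.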

No logging framework, --debug flag, or trace file. Plain stderr output for now.
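
A rough sketch of the "plain stderr, swappable writer" approach described above. The type and function names here (`agentRuntime`, `invocation`, `renderInvocation`, `logInvocation`) and the exact line format are illustrative assumptions, not the code in this PR; only the idea of an `io.Writer` field that defaults to `os.Stderr` and is replaced by `io.Discard` in tests comes from the description.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// invocation holds a few of the non-secret settings worth logging.
// The field set is a hypothetical subset of what the PR prints.
type invocation struct {
	Skill      string
	Model      string
	MaxTurns   int
	Cwd        string
	MCPServers []string
}

// agentRuntime is a hypothetical stand-in for archy's Runtime.
type agentRuntime struct {
	stderrLog io.Writer // nil means os.Stderr
}

// logWriter returns the configured writer, falling back to os.Stderr.
func (r *agentRuntime) logWriter() io.Writer {
	if r.stderrLog == nil {
		return os.Stderr
	}
	return r.stderrLog
}

// renderInvocation formats the one-line summary as key=value pairs.
func renderInvocation(inv invocation) string {
	return fmt.Sprintf(
		"agent invocation: skill=%s model=%s max_turns=%d cwd=%s mcp_servers=[%s]",
		inv.Skill, inv.Model, inv.MaxTurns, inv.Cwd,
		strings.Join(inv.MCPServers, ","),
	)
}

// logInvocation writes the summary once, before the agent starts.
func (r *agentRuntime) logInvocation(inv invocation) {
	fmt.Fprintln(r.logWriter(), renderInvocation(inv))
}

func main() {
	inv := invocation{
		Skill:      "daily",
		Model:      "example-model",
		MaxTurns:   10,
		Cwd:        "/tmp/work",
		MCPServers: []string{"files", "git"},
	}

	// Default behaviour: the summary goes to os.Stderr.
	(&agentRuntime{}).logInvocation(inv)

	// In tests, swapping in io.Discard keeps output quiet.
	(&agentRuntime{stderrLog: io.Discard}).logInvocation(inv)
}
```

Keeping the formatting in a pure function makes the summary easy to assert on without capturing stderr.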

Test plan

  • go build ./...
  • go test -race ./... (14 packages green)
  • go vet ./...
  • golangci-lint run (0 issues)
  • New: TestRunDaily_VerificationErrorIncludesSubprocessStderr
  • Manual: rebuild, ./archy daily --force — the invocation summary prints first; on verification failure the error includes claude stderr: <whatever> so we can finally see why the subprocess exits
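
To illustrate how a verification-failure error might carry truncated stderr, here is a small sketch. Everything in it is assumed for illustration: `explainOutcome`, `errVerification`, the message wording, and the truncation suffix are hypothetical, and the sketch counts runes, whereas the PR only says "2000 chars" without specifying bytes or runes.

```go
package main

import (
	"errors"
	"fmt"
)

// maxStderrChars mirrors the 2000-character limit mentioned in the PR.
const maxStderrChars = 2000

// errVerification is a hypothetical sentinel for a run that did no work.
var errVerification = errors.New("agent run failed verification")

// truncate keeps at most max runes and marks the cut so readers know
// the output is incomplete. Counting runes is an assumption.
func truncate(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "... (truncated)"
}

// explainOutcome returns nil for a run that took at least one turn and
// otherwise wraps the sentinel, appending captured stderr when present.
func explainOutcome(turns int, stderr string) error {
	if turns > 0 {
		return nil
	}
	if stderr == "" {
		return fmt.Errorf("%w: turns=0", errVerification)
	}
	return fmt.Errorf("%w: turns=0; claude stderr: %s",
		errVerification, truncate(stderr, maxStderrChars))
}

func main() {
	err := explainOutcome(0, "error: invalid API key")
	fmt.Println(err)
	fmt.Println(errors.Is(err, errVerification))
}
```

Wrapping with `%w` keeps the sentinel matchable via `errors.Is` while still showing the subprocess output in the message.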

Surface the exact configuration archy passes to the partio-io SDK
(model, cwd, MCP servers, skills dirs) and capture the subprocess's
stderr so we can see what claude actually says when it exits without
consulting the model.

Uses the SDK's WithStderrCallback option (callback fires from the
SDK's drain goroutine, guarded by a mutex). Adds
RunResult.SubprocessStderr; surfaces it in verification-failure
errors via cmd/daily_run.go. Plain stderr output for now; structured
logging is a separate concern.

Tests get io.Discard for the invocation log to keep output quiet.
rebelopsio changed the title from "diag(agent): log SDK options and capture claude subprocess stderr" to "chore(agent): log SDK options and capture claude subprocess stderr" on May 13, 2026
rebelopsio merged commit e7be5fe into main on May 13, 2026
6 of 7 checks passed