feat(provider): claudecode provider for Claude Max/Pro subscriptions by adelin-b · Pull Request #64 · m0n0x41d/haft

adelin-b · 2026-04-19T22:43:20Z

Summary

Adds a claudecode LLM provider that wraps the claude CLI as a subprocess. Lets Pro/Max subscribers run haft agent without ANTHROPIC_API_KEY — auth is delegated entirely to the CLI.

New provider + tests in internal/provider/claudecode.go / claudecode_test.go
Factory dispatch + longest-prefix-match fix (claude-code was shadowed by claude-)
Registry entries: claude-code, claude-code:opus, claude-code:sonnet, claude-code:haiku
haft doctor surfaces the CLI presence + version
Docs at docs/claude-code-provider.md
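
The prefix-routing fix can be pictured with a minimal sketch. It assumes an ordered table that is scanned most-specific-first; the names `prefixRoute`, `routes`, and `providerForModel` are hypothetical, and the `gpt-` entry is included only for illustration, not taken from haft's actual table.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixRoute maps a model-id prefix to a provider id (hypothetical type).
type prefixRoute struct {
	prefix   string
	provider string
}

// routes is ordered most-specific first, so "claude-code" is tested before
// the broader "claude-" prefix. The "gpt-" entry is an illustrative assumption.
var routes = []prefixRoute{
	{"claude-code", "claudecode"},
	{"claude-", "anthropic"},
	{"gpt-", "openai"},
}

// providerForModel returns the provider of the first matching prefix,
// or "" when nothing matches.
func providerForModel(model string) string {
	for _, r := range routes {
		if strings.HasPrefix(model, r.prefix) {
			return r.provider
		}
	}
	return ""
}

func main() {
	for _, m := range []string{"claude-code", "claude-code:sonnet", "claude-opus-4-20250514"} {
		fmt.Printf("%s -> %s\n", m, providerForModel(m))
	}
}
```

With an unordered map, iteration order could let `claude-` win for `claude-code:sonnet`; an ordered slice makes the precedence explicit.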

Why subprocess and not the SDK

haft is Go. The Vercel AI SDK's ai-sdk-provider-claude-code is TypeScript-only, and the Claude Agent SDK itself only ships Python + TS bindings. Subprocess wrapping is the only path that reuses Claude Code's subscription auth without taking a non-Go dep (or building a CGO bridge to Bun/Node).

Scope (intentionally narrow)

In this PR:

Flatten haft messages → (system_prompt, user_prompt) pair
Pipe to claude -p --output-format stream-json --verbose --allowed-tools ''
Parse NDJSON, stream text deltas, return assistant Message
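
A rough sketch of the flattening step, assuming a simplified `Message` type with only a role and text. The real haft message type, the block labels, and the separator are not shown in this PR description, so everything below is illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a simplified stand-in for haft's message type (assumption).
type Message struct {
	Role string // "system", "user", or "assistant"
	Text string
}

// flatten merges all system messages into one system prompt and renders the
// remaining turns as labeled blocks in a single user prompt. Turns whose text
// is empty after trimming are skipped.
func flatten(msgs []Message) (system, user string) {
	var sys, body []string
	for _, m := range msgs {
		text := strings.TrimSpace(m.Text)
		if text == "" {
			continue // empty-turn skip
		}
		if m.Role == "system" {
			sys = append(sys, text)
			continue
		}
		// The "[role]" label format is a hypothetical choice.
		body = append(body, fmt.Sprintf("[%s]\n%s", m.Role, text))
	}
	return strings.Join(sys, "\n\n"), strings.Join(body, "\n\n")
}

func main() {
	system, user := flatten([]Message{
		{Role: "system", Text: "You are a haft agent."},
		{Role: "user", Text: "hello"},
		{Role: "assistant", Text: "hi there"},
		{Role: "user", Text: "what did I say?"},
	})
	fmt.Println("SYSTEM:", system)
	fmt.Println("USER:")
	fmt.Println(user)
}
```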

Out of scope (follow-ups):

Tool-use: haft's ToolSchema isn't translated to the CLI's MCP surface yet, so the model emits text only. Documented as a clear limitation — tool-driven flows should keep using the anthropic/openai providers. The right next step is `--mcp-config` pointing at a local haft MCP server so `haft_note`/`haft_problem`/etc. are callable.
Session reuse via --resume to amortize per-turn startup cost.
Propagating token counts from the result event.

I'd rather ship the scaffold small and iterate than bundle everything and risk a wholesale reject.

Design notes

Stdin pipe for the prompt (avoids argv size limits on long conversations).
--allowed-tools '' disables CLI built-ins so the model doesn't write files under the user's feet when haft is supposed to own the agent surface.
--no-session-persistence keeps each turn ephemeral (matches existing Anthropic/OpenAI providers' stateless shape).
ModelID() reports the haft-facing id (claude-code or claude-code:<sub>), not a real Anthropic model name — avoids confusing the registry.
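
The invocation described in these notes could be assembled roughly as follows. This is a sketch using only the flags named above; `buildClaudeCmd` is a hypothetical helper, and how the system prompt is passed is deliberately left out.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// buildClaudeCmd assembles the CLI invocation with the flags listed in the
// design notes. The prompt travels over stdin rather than argv, so long
// conversations do not hit argv size limits.
func buildClaudeCmd(userPrompt string) *exec.Cmd {
	cmd := exec.Command("claude",
		"-p",
		"--output-format", "stream-json",
		"--verbose",
		"--allowed-tools", "", // empty value: no CLI built-in tools
		"--no-session-persistence",
	)
	cmd.Stdin = strings.NewReader(userPrompt)
	return cmd
}

func main() {
	cmd := buildClaudeCmd("hello")
	// Only the argument vector is printed; nothing is executed here.
	fmt.Printf("%q\n", cmd.Args)
}
```

A caller would typically attach a pipe to `cmd.Stdout` before calling `cmd.Start()` so the NDJSON events can be read as they arrive.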

Tests

ok  github.com/m0n0x41d/haft/internal/provider    8.031s

Unit tests cover:

flattenConversation — system merging, labeled body blocks, empty-turn skip
renderParts — text, tool_call, tool_result (incl. error variant)
parseClaudeStream — text delta extraction, result event (success / error), malformed line skip
guessProviderFromPrefix — regression: claude-code* routes to claudecode, not anthropic

Not yet in CI: end-to-end subprocess invocation against the real claude binary. Open to adding a -tags integration suite gated on claude being on PATH if you want it.

Test plan

`go test ./internal/provider/...` (clean)
`go build ./internal/provider/... ./internal/config/... ./internal/cli/...` (clean)
Manual: `model: claude-code` in `~/.haft/config.yaml`, run `haft "hello"`, see streamed response
`haft doctor` reports Claude Code CLI when `claude` is on PATH
Prefix routing: `--model claude-opus-4-20250514` still dispatches to anthropic (not claudecode)

Open questions for the maintainer

Model id shape: is `claude-code:sonnet` acceptable, or prefer a flat list like `claude-code-sonnet`?
Tool-use follow-up: MCP-bridge approach OK, or would you rather expose haft tools via `--allowed-tools` on custom MCP servers differently?
Happy to gate the doctor check behind a config flag if you don't want CLI detection to run when the user isn't using this provider.

Wraps the `claude` CLI (Claude Code) as an LLMProvider so users with a Pro/Max subscription can run haft's interactive agent without setting ANTHROPIC_API_KEY. Auth is delegated entirely to the CLI (OAuth, keychain, or pass-through env var, whichever Claude Code is configured with).

Scope (MVP, intentionally narrow):
- Flattens haft's structured message history into a single (system, user) prompt pair and pipes it to `claude -p --output-format stream-json`.
- Parses NDJSON events, forwards text deltas as StreamDelta, returns the assembled assistant Message on `result`.
- Tool schemas are **not** translated to the CLI's MCP surface yet, so the model emits text only. Tool-driven agent loops (haft_note etc.) should keep using the anthropic/openai providers until a follow-up PR wires --mcp-config. This is documented in both the package doc comment and docs/claude-code-provider.md so operators pick the right provider.

Changes:
- internal/provider/claudecode.go: new provider implementation.
- internal/provider/claudecode_test.go: unit tests for flatten/render/parse plus the prefix-routing regression.
- internal/provider/factory.go: dispatch "claudecode" provider id; rewrite guessProviderFromPrefix as ordered list so "claude-code" beats "claude-".
- internal/config/config.go: same ordered-prefix fix for ProviderForModel.
- internal/provider/registry.go: register Claude Code (CLI) provider with "claude-code" (+ :opus/:sonnet/:haiku sub-model variants). Cost fields left at zero since the subscription isn't per-token.
- internal/cli/doctor.go: surface `claude` CLI presence + version.
- docs/claude-code-provider.md: setup, model ids, limitations, follow-up.

Constraint: haft is Go, so the Vercel AI SDK's `ai-sdk-provider-claude-code` doesn't apply and the TS/Python Claude Agent SDK can't be imported. Subprocess wrapping is the only path that reuses Claude Code's subscription auth.
Rejected: embedding an Anthropic SDK OAuth flow | Max auth goes through the CLI, not the public API.
Rejected: CGO bridge to the TS SDK | pulls Bun/Node into the build.
Confidence: medium. Text path proven by tests; subprocess path exercised only manually (no real CLI in CI yet).
Scope-risk: narrow. Additive provider, no existing call sites touched.
Not-tested: end-to-end subprocess invocation; CLI auth-failure surface text; large-conversation stdin > argv-safe sizes.

…illing

The Apr 2026 fix to claude-code #43333 makes `claude -p` draw from an active Max/Pro subscription when OAuth'd and no ANTHROPIC_API_KEY is present. If the user has a stray ANTHROPIC_API_KEY exported in their shell, the CLI silently routes to per-token API billing instead (this is the $1,800-in-two-days foot-gun tracked in #37686).

Strip ANTHROPIC_API_KEY from the child env in the provider itself so the subscription path is taken by default. Parent env is untouched. Also clean up the docs to reflect the post-fix reality and point users at the anthropic provider when they *want* API-key billing.

Constraint: can't require the user to shell-unset before every call; the goal of this provider is subscription-by-default ergonomics.
Rejected: require --unsetenv flag from the caller | still leaks on forgotten exports.
Rejected: error out when the key is set | false-positives for users who intentionally want API billing but happened to pick this provider.
Confidence: high. Env strip is standard, covered by a unit test.
Scope-risk: narrow. Only affects the claudecode subprocess.
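
The env strip can be sketched as a small filter over `KEY=value` entries. `stripEnv` is a hypothetical name; the matching rule (exact key match, so keys that merely share a prefix survive) is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// stripEnv returns a copy of env without entries whose name exactly matches
// one of keys. The input slice is not modified, so the parent environment
// stays untouched.
func stripEnv(env []string, keys ...string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		name, _, _ := strings.Cut(kv, "=")
		drop := false
		for _, k := range keys {
			if name == k {
				drop = true
				break
			}
		}
		if !drop {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	parent := []string{"PATH=/usr/bin", "ANTHROPIC_API_KEY=sk-example", "HOME=/home/me"}
	child := stripEnv(parent, "ANTHROPIC_API_KEY")
	fmt.Println(child)
	fmt.Println(len(parent)) // parent slice still has all three entries
}
```

In the provider this would be applied as something like `cmd.Env = stripEnv(os.Environ(), "ANTHROPIC_API_KEY")` before starting the subprocess.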

Generate a per-turn --mcp-config tmpfile that spawns `haft serve` as the backing MCP server, so the model can call haft_note / haft_problem / haft_decision / haft_query / haft_refresh / haft_solution during a turn. The CLI also keeps its built-in Read/Write/Bash/etc. under --permission-mode bypassPermissions so file ops work without interactive approval.

Execution happens entirely inside the CLI subprocess; haft's outer agent loop receives the final assistant text after all round-trips. Operators who need per-tool hooks or cycle tracking must keep using the anthropic / openai providers; this is documented.

Opt-out: HAFT_CLAUDECODE_NO_MCP=1 drops the bridge and restores the previous text-only behavior (--allowed-tools '').

Constraint: schlunsen/claude-agent-sdk-go is the most maintained Go port but its in-process MCP server support is TODO, so adding it as a dep wouldn't solve this. Raw subprocess + --mcp-config pointing at the haft binary itself is the smallest viable bridge.
Rejected: port haft's full tool surface to an in-process MCP server | big refactor, duplicates the existing `haft serve` path.
Rejected: fail hard when no haft project root is found | makes the provider unusable in quick-chat contexts; we fall back to text-only.
Confidence: medium. Tmpfile + config shape are unit-tested; live tool round-trip verified manually against Max subscription.
Scope-risk: narrow. Additive to the provider; env opt-out preserves the previous MVP behavior.
Not-tested: long conversations exceeding 1MB stdin; CLI subprocess killed mid-turn leaving stale haft serve children.

`haft agent` first-run gate (ensureConfigured) called setup.Run() whenever no provider had an APIKey or AccessToken. The setup stub only knows how to read OPENAI_API_KEY, so `model: claude-code` users hit:

    First run — let's set up Haft.
    Error: setup: OPENAI_API_KEY not set — set it in your environment

Auth for claudecode is owned by the `claude` CLI, so no credentials ever live in ~/.haft/config.yaml for this provider. IsConfigured() now short-circuits true when the selected model resolves to the claudecode provider.

Covered by four unit tests in internal/config (previously no test file).
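
The short-circuit could look roughly like this. The `Config` and `ProviderConfig` shapes are hypothetical simplifications, and a plain prefix check stands in for whatever model-to-provider resolution haft actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// ProviderConfig and Config are simplified stand-ins for haft's config types.
type ProviderConfig struct {
	APIKey      string
	AccessToken string
}

type Config struct {
	Model     string
	Providers map[string]ProviderConfig
}

// IsConfigured reports whether the first-run setup can be skipped.
// The claudecode provider keeps no credentials in haft's config, so a
// claude-code model counts as configured on its own.
func (c Config) IsConfigured() bool {
	if strings.HasPrefix(c.Model, "claude-code") {
		return true
	}
	for _, p := range c.Providers {
		if p.APIKey != "" || p.AccessToken != "" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Config{Model: "claude-code:sonnet"}.IsConfigured())
	fmt.Println(Config{Model: "gpt-4o"}.IsConfigured())
}
```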

…, not append

--append-system-prompt concatenates on top of Claude Code's ~30K-token default prompt. Haft's FPF protocol instructions get diluted to the point the model behaves like vanilla Claude Code instead of a haft agent.

Switch to --system-prompt so haft owns the prompt outright. If the user wants Claude Code's built-in tool instructions, haft's own system prompt can describe them, but that's a haft-side authoring choice, not a provider concern.

Addressing feedback from the PR m0n0x41d#64 code review.

HIGH:
- Remove the dead `cleanup` variable. It was assigned a func that was never called (the defer above did the actual work) and was then abused as a nil-check bool. Replace with an explicit `mcpBridged` bool so the control flow is honest.
- Dedup the provider-prefix routing tables. `config.ProviderForModel` is now the single source of truth; `provider.guessProviderFromPrefix` delegates to it. The two tables had already drifted (`mistral` existed only in config) and this PR's addition made it worse.

MEDIUM:
- Explicitly chmod the MCP tmpfile to 0600 so a permissive umask can't leak the haft binary path + project root to other users.
- Use `--add-dir=<value>` (equals form) so a project root starting with `-` can't be mis-parsed as a CLI flag.
- Cap stderr at 64KB via a small `cappedBuffer` writer so a chatty --verbose session can't pressure parent memory.
- Truncate stderr text embedded in the exit error message to 8KB.
- docs: fix stale `--append-system-prompt` in the "How it works" example to match the actual code (`--system-prompt`).

No functional changes to the wire format or subprocess flags; tests still pass (provider + config).

Before: `internal/cli/util.go` had a private `findProjectRoot` that walked cwd up to a .haft/ dir. This PR's `internal/provider/claudecode.go` duplicated that logic (`findHaftProjectRoot`). Two copies of the same walker is the kind of thing that drifts.

Now:
- `internal/project.FindRoot(startDir)`: pure-ish public helper, reusable from any package, takes the starting dir as a parameter so it's testable without os.Chdir tricks.
- `internal/project.FindRootFromCwd()`: convenience wrapper.
- `internal/cli/util.go:findProjectRoot`: thin wrapper that preserves the existing error-returning signature its callers use.
- `internal/provider/claudecode.go`: uses `project.FindRootFromCwd` directly; local `findHaftProjectRoot` deleted.

Tests moved to `internal/project/findroot_test.go` where they can exercise the pure `FindRoot(startDir)` without chdir. Added a regression test that a regular file named `.haft` is ignored (only a directory counts as a project marker, which matches the existing cli behavior and the IsDir check both walkers already had).

No behavior change for any existing caller.

Reuse:
- desktop/app.go:findProjectRoot delegates to project.FindRootFromCwd like the cli and provider copies do. Three walkers collapsed into one. Fixes a latent bug in the desktop copy that would have matched a regular file named .haft (no IsDir check).

Quality:
- Drop the redundant `subModel` field on ClaudeCodeProvider; derive from `modelID` via cliSubModel() so there's one source of truth. A failing test on "claude-code" (bare) caught a real bug: the previous field-based code was never wrong because the prefix-check stripping happened up front, but the new derivation had to check the `ok` return from strings.CutPrefix, which is now tested.
- registry.go: zero out ContextWindow / DefaultMaxOut for claude-code* entries. Those limits live in the CLI, not here; fabricating numbers would go stale. Comment explains why.

Efficiency:
- Cache `haftExe` and `projectRoot` on ClaudeCodeProvider at construction. Was re-running os.Executable + os.Getwd + filesystem walk on every turn. Per-turn cost drops to just the tmpfile (marshal + write + chmod + remove, ~microseconds vs the CLI's multi-second spawn). writeHaftMCPConfig is now a pure function of (exe, projectRoot) and easier to test without chdir tricks.
- cappedBuffer now keeps the TAIL of stderr, not the head. Real failures almost always print near the end; head-only meant error messages showed startup chatter and dropped the actual error for long --verbose sessions. Added tail / no-truncation tests.
- Removed the now-redundant "truncate error message" block in Stream(); the cappedBuffer does it itself.

No behavior change to the subprocess wire format.
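
To illustrate the tail-keeping idea, here is a small writer that retains only the last `max` bytes. `tailBuffer` is a hypothetical name and a sketch of the technique, not the PR's `cappedBuffer` implementation.

```go
package main

import "fmt"

// tailBuffer is an io.Writer that retains only the last max bytes written,
// so the end of a long stderr stream survives and the start is dropped.
type tailBuffer struct {
	max int
	buf []byte
}

// Write always reports the full length as written, so the subprocess is
// never blocked or failed by the cap.
func (b *tailBuffer) Write(p []byte) (int, error) {
	n := len(p)
	if n >= b.max {
		// The new chunk alone fills the buffer: keep only its tail.
		b.buf = append(b.buf[:0], p[n-b.max:]...)
		return n, nil
	}
	if over := len(b.buf) + n - b.max; over > 0 {
		// Drop the oldest bytes to make room; copy handles the overlap.
		kept := copy(b.buf, b.buf[over:])
		b.buf = b.buf[:kept]
	}
	b.buf = append(b.buf, p...)
	return n, nil
}

func (b *tailBuffer) String() string { return string(b.buf) }

func main() {
	b := &tailBuffer{max: 16}
	fmt.Fprint(b, "startup chatter... ")
	fmt.Fprint(b, "fatal: auth failed")
	fmt.Printf("%q\n", b.String())
}
```

It would be attached with `cmd.Stderr = &tailBuffer{max: 64 * 1024}`, matching the 64KB cap mentioned in the review-feedback commit.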

Applied the same frame → compare → decide reasoning the project enforces via FPF. Three of four items survived scrutiny as tiny wins; the fourth (--resume) deserves its own PR.

Done here:
- Cache the filtered child env on ClaudeCodeProvider at construction. `envWithout(os.Environ(), "ANTHROPIC_API_KEY")` was called once per turn; move it once-per-provider. The only env var we mask is rarely toggled mid-session. Documented the tradeoff (added-after-construction vars are missed; callers needing fresh env should rebuild).
- Pre-size the text `strings.Builder` in parseClaudeStream to 16KB. Skips ~4 grow-and-copy doublings for typical outputs; still grows naturally for long ones. Tiny allocation win, zero ergonomic cost.

Not done (and why):
- --resume session reuse is the real ~3-5x speedup but it changes the LLMProvider contract (stateless → stateful), needs delta-only message forwarding, session-TTL fallback, and cleanup. Too much for this PR. Moved the entry in docs/claude-code-provider.md from a one-liner under "Follow-up" to a full design sketch so whoever picks it up has a starting point instead of re-deriving the constraints.
- `tui_spawn.go` env-stripping near-duplicate: different semantics (multi-key + replace vs single-key strip). Wrapping adds indirection with no payoff. Left alone.

All four skipped items from the simplify pass done.

(1) Session reuse via --resume
- Provider grows sessionID + msgsSent + a mutex. Turn 1 spawns fresh, records the CLI's session id from stream-json events, and snapshots the conversation length. Turn 2+ checks that haft grew the message list by exactly one user message; if so, sends just that user text with --resume <id>, skipping --system-prompt. Any mismatch falls back to a fresh turn and clears state.
- On parse or subprocess error, invalidate the session so a stale id doesn't keep failing turn after turn.
- Dropped --no-session-persistence. Freshness is now controlled by whether we pass --resume, not by whether the session is stored.
- Opt-out: HAFT_CLAUDECODE_NO_RESUME=1 forces every turn fresh.
- Live-verified: turn 2 via --resume correctly recalled context from turn 1 ("my name is Zephyr" -> "Your name is Zephyr.").

(2) Shared env utility
- New internal/envutil.Strip(env, keys...) with its own tests, covering single-key, multi-key, shared-prefix, order-preservation.
- claudecode.go dropped its local envWithout; uses envutil.Strip.
- cli/tui_spawn.go:tuiProcessEnv collapsed from 20 lines to 5; same behavior, one filter call.

Other provider changes:
- parseClaudeStream now returns a claudeStreamResult struct (text, finishReason, sessionID) instead of three values. Cleaner, and extensible for future fields (token counts).
- streamEvent picks up session_id from every event; the last one wins. `result` carries it reliably, so settled-state is correct.

Tests: 6 new cases for takeResumeDecision (fresh/continue/gap/non-user-tail/env-optout) and session bookkeeping (invalidate/record). All green.
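
The resume decision can be sketched as a small pure function over the message list. The names `session` and `decide` are hypothetical, and the bookkeeping rule (the `seen` counter is assumed to already include the assistant reply from the previous turn) is an assumption; the commit message does not spell out how `msgsSent` is counted.

```go
package main

import "fmt"

// Message is a simplified stand-in for haft's message type (assumption).
type Message struct {
	Role string
	Text string
}

// session tracks the CLI session id and how many messages it already covers.
type session struct {
	id   string
	seen int
}

// decide returns the session id to resume and the single user text to send.
// ok is false when the turn must start fresh. On divergence the state is
// cleared so a stale id is not retried.
func (s *session) decide(msgs []Message, noResume bool) (resumeID, prompt string, ok bool) {
	if noResume || s.id == "" {
		return "", "", false
	}
	grewByOne := len(msgs) == s.seen+1
	if !grewByOne || msgs[len(msgs)-1].Role != "user" {
		*s = session{} // gap or non-user tail: invalidate
		return "", "", false
	}
	return s.id, msgs[len(msgs)-1].Text, true
}

func main() {
	s := &session{id: "sess-1", seen: 2}
	msgs := []Message{
		{Role: "user", Text: "my name is Zephyr"},
		{Role: "assistant", Text: "Nice to meet you."},
		{Role: "user", Text: "what is my name?"},
	}
	id, prompt, ok := s.decide(msgs, false)
	fmt.Println(id, prompt, ok)
}
```

In a caller, `noResume` would come from something like `os.Getenv("HAFT_CLAUDECODE_NO_RESUME") == "1"`; passing it in as a parameter keeps the function testable without touching the process environment.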

Moved the big --resume design sketch out of 'Future work' (it ships in this PR). Replaced with a 'Session reuse' section explaining the warm path, divergence fallback, and the opt-out env var.

Audited every file for quint/QUINT references. Kept the load-bearing ones (migration code in cli/init, cli/serve fallback, project/index, project.go legacy db read; historical attribution in README/CONTRIBUTING; naming-migration table in spec/AGENT_CONTRACT, OPEN_QUESTIONS; canonical install URL quint.codes in goreleaser).

Fixed the stale ones that were just mislabelled:
- internal/provider/claudecode.go + test: QUINT_PROJECT_ROOT → HAFT_PROJECT_ROOT. serve.go still accepts QUINT_PROJECT_ROOT as a fallback for old configs, but new invocations should use the modern env var.
- docs/claude-code-provider.md: matching env var rename.
- db/store_test.go: audit-log fixtures used quint_propose / quint_verify. Renamed to haft_propose / haft_verify. Audit labels are free-form, but the test should reflect current tool naming.
- internal/artifact/nav_test.go:TestContract_NoToolCallSyntax: the contract was "no tool-call syntax in NextAction" but only checked for quint_. Expanded to also reject haft_ so new code doesn't regress; defense in depth keeps both checks.
- .golangci.yml: G204 comment said "we call quint/git as subprocess". Now claude/git.

Also: .gitignore picks up .osgrep (tool cache, shouldn't be tracked).

adelin-b added 3 commits April 20, 2026 00:42

adelin-b marked this pull request as draft April 19, 2026 23:15

adelin-b added 9 commits April 20, 2026 01:16

docs(claude-code-provider): describe shipped --resume behavior

1159c1a

adelin-b wants to merge 12 commits into m0n0x41d:main from adelin-b:feat/claude-code-provider