feat(provider): claudecode provider for Claude Max/Pro subscriptions #64
Draft
adelin-b wants to merge 12 commits into m0n0x41d:main from
Conversation
added 3 commits
April 20, 2026 00:42
Wraps the `claude` CLI (Claude Code) as an LLMProvider so users with a Pro/Max subscription can run haft's interactive agent without setting ANTHROPIC_API_KEY. Auth is delegated entirely to the CLI (OAuth, keychain, or pass-through env var, whichever Claude Code is configured with).
Scope (MVP, intentionally narrow):
- Flattens haft's structured message history into a single (system, user) prompt pair and pipes it to `claude -p --output-format stream-json`.
- Parses NDJSON events, forwards text deltas as StreamDelta, returns the assembled assistant Message on `result`.
- Tool schemas are **not** translated to the CLI's MCP surface yet; the model emits text only. Tool-driven agent loops (haft_note etc.) should keep using the anthropic/openai providers until a follow-up PR wires --mcp-config. This is documented in both the package doc comment and docs/claude-code-provider.md so operators pick the right provider.
Changes:
- internal/provider/claudecode.go: new provider implementation.
- internal/provider/claudecode_test.go: unit tests for flatten/render/parse plus the prefix-routing regression.
- internal/provider/factory.go: dispatch "claudecode" provider id; rewrite guessProviderFromPrefix as ordered list so "claude-code" beats "claude-".
- internal/config/config.go: same ordered-prefix fix for ProviderForModel.
- internal/provider/registry.go: register Claude Code (CLI) provider with "claude-code" (+ :opus/:sonnet/:haiku sub-model variants). Cost fields left at zero since the subscription isn't per-token.
- internal/cli/doctor.go: surface `claude` CLI presence + version.
- docs/claude-code-provider.md: setup, model ids, limitations, follow-up.
Constraint: haft is Go, so the Vercel AI SDK's `ai-sdk-provider-claude-code` doesn't apply and the TS/Python Claude Agent SDK can't be imported. Subprocess wrapping is the only path that reuses Claude Code's subscription auth.
Rejected: embedding an Anthropic SDK OAuth flow | Max auth goes through the CLI, not the public API.
Rejected: CGO bridge to the TS SDK | pulls Bun/Node into the build.
Confidence: medium - text path proven by tests; subprocess path exercised only manually (no real CLI in CI yet).
Scope-risk: narrow - additive provider, no existing call sites touched.
Not-tested: end-to-end subprocess invocation; CLI auth-failure surface text; large-conversation stdin > argv-safe sizes.
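The ordered-prefix fix described above can be sketched as follows. This is a minimal illustration, not the code in factory.go or config.go: the `prefixRoute` type, the `providerForModel` name, the table contents, and the sample model ids are all assumptions made for the example. The point is only that a slice is scanned in a fixed order, so the more specific "claude-code" entry is tested before "claude-".

```go
package main

import (
	"fmt"
	"strings"
)

// prefixRoute pairs a model-id prefix with a provider id (hypothetical type).
type prefixRoute struct {
	prefix   string
	provider string
}

// routes is scanned top to bottom, so more specific prefixes must come first.
// The entries are illustrative, not haft's real routing table.
var routes = []prefixRoute{
	{"claude-code", "claudecode"},
	{"claude-", "anthropic"},
	{"gpt-", "openai"},
}

// providerForModel returns the provider of the first matching prefix,
// or "" when no prefix matches.
func providerForModel(model string) string {
	for _, r := range routes {
		if strings.HasPrefix(model, r.prefix) {
			return r.provider
		}
	}
	return ""
}

func main() {
	for _, m := range []string{"claude-code:opus", "claude-sonnet-4", "gpt-4o"} {
		fmt.Printf("%s -> %s\n", m, providerForModel(m))
	}
}
```

A map keyed by prefix would not give this guarantee, because Go map iteration order is unspecified.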
…illing
The Apr 2026 fix to claude-code #43333 makes `claude -p` draw from an active Max/Pro subscription when OAuth'd and no ANTHROPIC_API_KEY is present. If the user has a stray ANTHROPIC_API_KEY exported in their shell, the CLI silently routes to per-token API billing instead (this is the $1,800-in-two-days foot-gun tracked in #37686).
Strip ANTHROPIC_API_KEY from the child env in the provider itself so the subscription path is taken by default. Parent env is untouched.
Also clean up the docs to reflect the post-fix reality and point users at the anthropic provider when they *want* API-key billing.
Constraint: can't require the user to shell-unset before every call; the goal of this provider is subscription-by-default ergonomics.
Rejected: require --unsetenv flag from the caller | still leaks on forgotten exports.
Rejected: error out when the key is set | false-positives for users who intentionally want API billing but happened to pick this provider.
Confidence: high - env strip is standard, covered by a unit test.
Scope-risk: narrow - only affects the claudecode subprocess.
Generate a per-turn --mcp-config tmpfile that spawns `haft serve` as the backing MCP server, so the model can call haft_note / haft_problem / haft_decision / haft_query / haft_refresh / haft_solution during a turn. The CLI also keeps its built-in Read/Write/Bash/etc. under --permission-mode bypassPermissions so file ops work without interactive approval.
Execution happens entirely inside the CLI subprocess; haft's outer agent loop receives the final assistant text after all round-trips. Operators who need per-tool hooks or cycle tracking must keep using the anthropic / openai providers (this is documented).
Opt-out: HAFT_CLAUDECODE_NO_MCP=1 drops the bridge and restores the previous text-only behavior (--allowed-tools '').
Constraint: schlunsen/claude-agent-sdk-go is the most maintained Go port but its in-process MCP server support is TODO, so adding it as a dep wouldn't solve this. Raw subprocess + --mcp-config pointing at the haft binary itself is the smallest viable bridge.
Rejected: port haft's full tool surface to an in-process MCP server | big refactor, duplicates the existing `haft serve` path.
Rejected: fail hard when no haft project root is found | makes the provider unusable in quick-chat contexts; we fall back to text-only.
Confidence: medium - tmpfile + config shape are unit-tested; live tool round-trip verified manually against Max subscription.
Scope-risk: narrow - additive to the provider; env opt-out preserves the previous MVP behavior.
Not-tested: long conversations exceeding 1MB stdin; CLI subprocess killed mid-turn leaving stale haft serve children.
added 9 commits
April 20, 2026 01:16
`haft agent` first-run gate (ensureConfigured) called setup.Run() whenever no provider had an APIKey or AccessToken. The setup stub only knows how to read OPENAI_API_KEY, so `model: claude-code` users hit:
  First run — let's set up Haft.
  Error: setup: OPENAI_API_KEY not set — set it in your environment
Auth for claudecode is owned by the `claude` CLI, so no credentials ever live in ~/.haft/config.yaml for this provider. IsConfigured() now short-circuits true when the selected model resolves to the claudecode provider.
Covered by four unit tests in internal/config (previously no test file).
…, not append
--append-system-prompt concatenates on top of Claude Code's ~30K-token default prompt. Haft's FPF protocol instructions get diluted to the point the model behaves like vanilla Claude Code instead of a haft agent.
Switch to --system-prompt so haft owns the prompt outright. If the user wants Claude Code's built-in tool instructions, haft's own system prompt can describe them, but that's a haft-side authoring choice, not a provider concern.
Addressing feedback from the PR m0n0x41d#64 code review.
HIGH:
- Remove the dead `cleanup` variable. It was assigned a func that was never called (the defer above did the actual work) and was then abused as a nil-check bool. Replace with an explicit `mcpBridged` bool so the control flow is honest.
- Dedup the provider-prefix routing tables. `config.ProviderForModel` is now the single source of truth; `provider.guessProviderFromPrefix` delegates to it. The two tables had already drifted (`mistral` existed only in config) and this PR's addition made it worse.
MEDIUM:
- Explicitly chmod the MCP tmpfile to 0600 so a permissive umask can't leak the haft binary path + project root to other users.
- Use `--add-dir=<value>` (equals form) so a project root starting with `-` can't be mis-parsed as a CLI flag.
- Cap stderr at 64KB via a small `cappedBuffer` writer so a chatty --verbose session can't pressure parent memory.
- Truncate stderr text embedded in the exit error message to 8KB.
- docs: fix stale `--append-system-prompt` in the "How it works" example to match the actual code (`--system-prompt`).
No functional changes to the wire format or subprocess flags; tests still pass (provider + config).
Before: `internal/cli/util.go` had a private `findProjectRoot` that walked cwd up to a .haft/ dir. This PR's `internal/provider/claudecode.go` duplicated that logic (`findHaftProjectRoot`). Two copies of the same walker is the kind of thing that drifts.
Now:
- `internal/project.FindRoot(startDir)`: pure-ish public helper, reusable from any package, takes the starting dir as a parameter so it's testable without os.Chdir tricks.
- `internal/project.FindRootFromCwd()`: convenience wrapper.
- `internal/cli/util.go:findProjectRoot`: thin wrapper that preserves the existing error-returning signature its callers use.
- `internal/provider/claudecode.go`: uses `project.FindRootFromCwd` directly; local `findHaftProjectRoot` deleted.
Tests moved to `internal/project/findroot_test.go` where they can exercise the pure `FindRoot(startDir)` without chdir. Added a regression test that a regular file named `.haft` is ignored (only a directory counts as a project marker, which matches the existing cli behavior and the IsDir check both walkers already had).
No behavior change for any existing caller.
Reuse:
- desktop/app.go:findProjectRoot delegates to project.FindRootFromCwd like the cli and provider copies do. Three walkers collapsed into one. Fixes a latent bug in the desktop copy that would have matched a regular file named .haft (no IsDir check).
Quality:
- Drop the redundant `subModel` field on ClaudeCodeProvider; derive from `modelID` via cliSubModel() so there's one source of truth. A failing test on "claude-code" (bare) caught a real bug: the previous field-based code was never wrong because the prefix-check stripping happened up front, but the new derivation had to check the `ok` return from strings.CutPrefix, which is now tested.
- registry.go: zero out ContextWindow / DefaultMaxOut for claude-code* entries. Those limits live in the CLI, not here; fabricating numbers would go stale. Comment explains why.
Efficiency:
- Cache `haftExe` and `projectRoot` on ClaudeCodeProvider at construction. Was re-running os.Executable + os.Getwd + filesystem walk on every turn. Per-turn cost drops to just the tmpfile (marshal + write + chmod + remove, ~microseconds vs the CLI's multi-second spawn). writeHaftMCPConfig is now a pure function of (exe, projectRoot) and easier to test without chdir tricks.
- cappedBuffer now keeps the TAIL of stderr, not the head. Real failures almost always print near the end; head-only meant error messages showed startup chatter and dropped the actual error for long --verbose sessions. Added tail / no-truncation tests.
- Removed the now-redundant "truncate error message" block in Stream(); the cappedBuffer does it itself.
No behavior change to the subprocess wire format.
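A tail-keeping writer of the kind described above can be sketched like this. The type name `tailBuffer` and its fields are hypothetical and the PR's `cappedBuffer` may be implemented differently; the sketch only shows one way to retain the last `max` bytes while bounding memory at roughly twice the cap.

```go
package main

import "fmt"

// tailBuffer is an io.Writer that retains only the last max bytes written.
type tailBuffer struct {
	max int
	buf []byte
}

// Write never fails and always reports the full input length, so the writing
// side (for example a subprocess stderr pipe) is not interrupted.
func (b *tailBuffer) Write(p []byte) (int, error) {
	n := len(p)
	if len(p) > b.max {
		// Only the tail of an oversized write can survive anyway.
		p = p[len(p)-b.max:]
	}
	b.buf = append(b.buf, p...)
	if over := len(b.buf) - b.max; over > 0 {
		// Shift the newest max bytes to the front and drop the rest.
		copy(b.buf, b.buf[over:])
		b.buf = b.buf[:b.max]
	}
	return n, nil
}

// String returns the retained tail.
func (b *tailBuffer) String() string { return string(b.buf) }

func main() {
	stderr := &tailBuffer{max: 32}
	// Sample text is made up for the demonstration.
	fmt.Fprint(stderr, "startup chatter that is not useful for debugging\n")
	fmt.Fprint(stderr, "fatal: auth failed")
	fmt.Printf("%q\n", stderr.String())
}
```

Such a value can be assigned to `cmd.Stderr` on an `exec.Cmd`, since that field accepts any io.Writer.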
Applied the same frame → compare → decide reasoning the project enforces via FPF. Three of four items survived scrutiny as tiny wins; the fourth (--resume) deserves its own PR.
Done here:
- Cache the filtered child env on ClaudeCodeProvider at construction. `envWithout(os.Environ(), "ANTHROPIC_API_KEY")` was called once per turn; move it once-per-provider. The only env var we mask is rarely toggled mid-session. Documented the tradeoff (added-after-construction vars are missed; callers needing fresh env should rebuild).
- Pre-size the text `strings.Builder` in parseClaudeStream to 16KB. Skips ~4 grow-and-copy doublings for typical outputs; still grows naturally for long ones. Tiny allocation win, zero ergonomic cost.
Not done (and why):
- --resume session reuse is the real ~3-5x speedup but it changes the LLMProvider contract (stateless → stateful), needs delta-only message forwarding, session-TTL fallback, and cleanup. Too much for this PR. Moved the entry in docs/claude-code-provider.md from a one-liner under "Follow-up" to a full design sketch so whoever picks it up has a starting point instead of re-deriving the constraints.
- `tui_spawn.go` env-stripping near-duplicate: different semantics (multi-key + replace vs single-key strip). Wrapping adds indirection with no payoff. Left alone.
All four skipped items from the simplify pass done.
(1) Session reuse via --resume
- Provider grows sessionID + msgsSent + a mutex. Turn 1 spawns fresh,
records the CLI's session id from stream-json events, and snapshots
the conversation length. Turn 2+ checks that haft grew the message
list by exactly one user message; if so, sends just that user text
with --resume <id>, skipping --system-prompt. Any mismatch falls
back to a fresh turn and clears state.
- On parse or subprocess error, invalidate the session so a stale
id doesn't keep failing turn after turn.
- Dropped --no-session-persistence. Freshness is now controlled by
whether we pass --resume, not by whether the session is stored.
- Opt-out: HAFT_CLAUDECODE_NO_RESUME=1 forces every turn fresh.
- Live-verified: turn 2 via --resume correctly recalled context from
turn 1 ("my name is Zephyr" -> "Your name is Zephyr.").
(2) Shared env utility
- New internal/envutil.Strip(env, keys...) with its own tests,
covering single-key, multi-key, shared-prefix, order-preservation.
- claudecode.go dropped its local envWithout; uses envutil.Strip.
- cli/tui_spawn.go:tuiProcessEnv collapsed from 20 lines to 5; same
behavior, one filter call.
Other provider changes:
- parseClaudeStream now returns a claudeStreamResult struct (text,
finishReason, sessionID) instead of three values. Cleaner, and
extensible for future fields (token counts).
- streamEvent picks up session_id from every event; the last one
wins — `result` carries it reliably, so settled-state is correct.
Tests: 6 new cases for takeResumeDecision (fresh/continue/gap/non-user-
tail/env-optout) and session bookkeeping (invalidate/record). All green.
Moved the big --resume design sketch out of 'Future work' (it ships in this PR). Replaced with a 'Session reuse' section explaining the warm path, divergence fallback, and the opt-out env var.
Audited every file for quint/QUINT references. Kept the load-bearing ones (migration code in cli/init, cli/serve fallback, project/index, project.go legacy db read; historical attribution in README/CONTRIBUTING; naming-migration table in spec/AGENT_CONTRACT, OPEN_QUESTIONS; canonical install URL quint.codes in goreleaser).
Fixed the stale ones that were just mislabelled:
- internal/provider/claudecode.go + test: QUINT_PROJECT_ROOT → HAFT_PROJECT_ROOT. serve.go still accepts QUINT_PROJECT_ROOT as a fallback for old configs, but new invocations should use the modern env var.
- docs/claude-code-provider.md: matching env var rename.
- db/store_test.go: audit-log fixtures used quint_propose / quint_verify. Renamed to haft_propose / haft_verify. Audit labels are free-form, but the test should reflect current tool naming.
- internal/artifact/nav_test.go:TestContract_NoToolCallSyntax: the contract was "no tool-call syntax in NextAction" but only checked for quint_. Expanded to also reject haft_ so new code doesn't regress; defense in depth keeps both checks.
- .golangci.yml: G204 comment said "we call quint/git as subprocess". Now claude/git.
Also: .gitignore picks up .osgrep (tool cache, shouldn't be tracked).
Summary
Adds a `claudecode` LLM provider that wraps the `claude` CLI as a subprocess. Lets Pro/Max subscribers run `haft agent` without `ANTHROPIC_API_KEY`; auth is delegated entirely to the CLI.
- `internal/provider/claudecode.go` / `claudecode_test.go`
- Prefix routing (`claude-code` was shadowed by `claude-`)
- `claude-code`, `claude-code:opus`, `claude-code:sonnet`, `claude-code:haiku`
- `haft doctor` surfaces the CLI presence + version
- `docs/claude-code-provider.md`
Why subprocess and not the SDK
haft is Go. The Vercel AI SDK's `ai-sdk-provider-claude-code` is TypeScript-only, and the Claude Agent SDK itself only ships Python + TS bindings. Subprocess wrapping is the only path that reuses Claude Code's subscription auth without taking a non-Go dep (or building a CGO bridge to Bun/Node).
Scope (intentionally narrow)
In this PR:
- Flatten haft's message history into a `(system_prompt, user_prompt)` pair
- Pipe it to `claude -p --output-format stream-json --verbose --allowed-tools ''`
- Parse the NDJSON events and return the assembled `Message`
Out of scope (follow-ups):
- `ToolSchema` isn't translated to the CLI's MCP surface yet, so the model emits text only. Documented as a clear limitation: tool-driven flows should keep using the anthropic/openai providers. The right next step is `--mcp-config` pointing at a local haft MCP server so `haft_note`/`haft_problem`/etc. are callable.
- Session reuse via `--resume` to amortize per-turn startup cost.
- … `result` event.
I'd rather ship the scaffold small and iterate than bundle everything and risk a wholesale reject.
Design notes
- `--allowed-tools ''` disables CLI built-ins so the model doesn't write files under the user's feet when haft is supposed to own the agent surface.
- `--no-session-persistence` keeps each turn ephemeral (matches existing Anthropic/OpenAI providers' stateless shape).
- `ModelID()` reports the haft-facing id (`claude-code` or `claude-code:<sub>`), not a real Anthropic model name; this avoids confusing the registry.
Tests
Unit tests cover:
- `flattenConversation`: system merging, labeled body blocks, empty-turn skip
- `renderParts`: text, tool_call, tool_result (incl. error variant)
- `parseClaudeStream`: text delta extraction, `result` event (success / error), malformed line skip
- `guessProviderFromPrefix`: regression, `claude-code*` routes to claudecode, not anthropic
Not yet in CI: end-to-end subprocess invocation against the real `claude` binary. Open to adding a `-tags integration` suite gated on `claude` being on PATH if you want it.
Test plan
Open questions for the maintainer