feat(agent): agent-provider class — drive Claude over ACP on your own subscription (#813)#866
feat(agent): agent-provider class — drive Claude over ACP on your own subscription (#813)#866softmarshmallow wants to merge 3 commits into
Conversation
…e user's own subscription (#813) A second provider class where an EXTERNAL agent owns the loop: Grida acts as an ACP consumer, spawns the ACP-team bridge (@agentclientprotocol/claude-agent-acp), and drives Claude on the user's own Pro/Max subscription — zero API key. Distinct from the model-provider kinds (BYOK/endpoint): no ModelFactory is called; the runtime branches before resolution and streams from the external agent (ProviderChunk -> AI-SDK chunks). Working + verified via deterministic + gated-live tests: keyless run, native streaming, multi-turn continuity (session/resume), model picker (Opus 1M / Sonnet / Haiku, Claude-default), thinking content, tool-call UI, persistence, cancel, resume-fallback on stale id, stop-reason mapping. Adds a deterministic in-memory fake ACP agent (testing/fake-acp-agent.ts) + JTBD suite (agent-provider/jtbd.test.ts) so the user-outcome contract is testable without a real Claude; gated run.live.test.ts proves the real bridge. Spike: the keep/drop decision (Anthropic ToS + the permanent per-feature fork tax) is OPEN — see docs/wg/ai/agent/acp-provider.md. GRIDA-SEC-004 reviewed (sandbox/policy.ts allowlist, SECURITY.md).
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughAdds an "agent-provider" execution class to ChangesAgent-provider class (Claude Code ACP)
Sequence Diagram(s)sequenceDiagram
participant Client as Client (POST /agent/run)
participant AgentRuntime
participant runAgentProviderTurn
participant openClaudeSession
participant ClaudeBridge as Claude ACP Bridge (stdio)
participant SSE as SSE Response
Client->>AgentRuntime: run({ model_id: "claude-code/opus-4.8-1m", messages })
AgentRuntime->>AgentRuntime: isAgentProviderModel → makeAgentProvider
AgentRuntime->>AgentRuntime: extractLastUserText(messages)
AgentRuntime->>runAgentProviderTurn: prompt, cwd, resume_session_id, model, signal, emit
runAgentProviderTurn->>openClaudeSession: openProvider("claude", opts)
openClaudeSession->>ClaudeBridge: spawn npx claude-agent-acp
openClaudeSession->>ClaudeBridge: initialize + newSession/resumeSession
openClaudeSession-->>runAgentProviderTurn: AgentProviderSession
runAgentProviderTurn->>ClaudeBridge: session.prompt(text, chunkSink)
runAgentProviderTurn->>Client: emit start, start-step (SSE)
loop streaming ProviderChunks
ClaudeBridge-->>runAgentProviderTurn: ProviderChunk (text/reasoning/tool/error)
runAgentProviderTurn->>Client: emit text-delta, reasoning-delta, tool events (SSE)
end
ClaudeBridge-->>runAgentProviderTurn: TurnResult { stopReason, providerSessionId }
runAgentProviderTurn->>Client: emit finish-step, finish (SSE)
runAgentProviderTurn-->>AgentRuntime: TurnResult
AgentRuntime->>AgentRuntime: setAgentProviderSessionId (persist external session id)
AgentRuntime->>Client: close SSE
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6ffead327e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (isAgentProviderModel(req.model_id)) { | ||
| // Agent-provider class (issue #813): an external agent owns its own | ||
| // loop. No BYOK/endpoint resolution, no model factory — the runtime | ||
| // streams from the agent-provider consumer in startTurn. | ||
| provider = makeAgentProvider(AGENT_PROVIDER_MODELS[req.model_id].id); |
There was a problem hiding this comment.
Handle queued Claude Code sends
When the selected model is Claude Code, this special-case only runs for direct /agent/run calls. If the user submits while a Claude Code turn is busy, the desktop queue posts /sessions/:id/queue; the scheduler later calls drainTurn, which still resolves the persisted provider_id ("claude") through resolveProvider instead of recreating an agent-provider, so the drain throws after the queued row has already been dequeued and no turn starts. Mirror the agent-provider resolution in the drain path and pass the dequeued user text as the provider prompt.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Acknowledged, and intentionally deferred in this spike PR. The fix is clear, but its prompt-derivation half is behaviorally verifiable only via the gated live test (the deterministic fake injects at runAgentProviderTurn, not the runtime/host level). Tracked in docs/wg/ai/agent/acp-provider.plan.md (Queue drain), to land with a runtime-level test harness or a live run.
| const allow = | ||
| req.options.find((o) => o.kind === "allow_always") ?? | ||
| req.options.find((o) => o.kind === "allow_once") ?? | ||
| req.options[0]; | ||
| if (!allow) return { outcome: { outcome: "cancelled" } }; | ||
| return { outcome: { outcome: "selected", optionId: allow.optionId } }; |
There was a problem hiding this comment.
Do not auto-approve external agent tools
For any ACP permission request from Claude Code, this handler selects allow_always/allow_once without consulting the session mode or the supervised approval flow. In the default accept-edits posture, a prompt-injected external agent can therefore run write/shell/file tools without the user approval that Grida's built-in agent requires; route these requests through the host approval UI or fail closed instead of automatically granting them.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Acknowledged — this is the deferred mode/approval-UX item (tracked in acp-provider.plan.md). Auto-approve is the documented PoC posture; the real fix routes ACP session/request_permission to the host supervised Allow/Deny gate and maps the Auto / Accept-edits picker to setSessionMode. Fail-closed would make the spike non-functional, so the PoC behavior stays pending that design.
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (1)
packages/grida-ai-agent/src/__public-api__.test.ts (1)
236-238: ⚡ Quick winBroaden the public contract assertion for external-agent egress hosts.
Line [238] only pins one domain, so regressions on other required hosts can pass tests and fail runtime egress.
Suggested test hardening
- expect(policy.network.allowed_domains).toContain("api.anthropic.com"); + expect(policy.network.allowed_domains).toEqual( + expect.arrayContaining([ + "api.anthropic.com", + "anthropic.com", + "*.anthropic.com", + "claude.ai", + "*.claude.ai", + ]) + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/grida-ai-agent/src/__public-api__.test.ts` around lines 236 - 238, The test assertion for the external agent egress policy only validates a single domain (api.anthropic.com) in policy.network.allowed_domains, which is insufficient to catch regressions. Expand this test to verify that ALL required external agent egress hosts are present in the allowed_domains list, not just one. Identify all necessary vendor backend hosts that the external agent needs to reach and add assertions to ensure each one is included in the policy, so that missing domains are caught at test time rather than failing at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@editor/scaffolds/desktop/shared/model-picker.tsx`:
- Around line 46-56: The DEFAULT_MODEL_ID constant is currently set to
"claude-code/opus-4.8-1m" which requires external Claude authentication, causing
first-run to fail for signed-out users. Change the DEFAULT_MODEL_ID value to use
TIER_MODEL_IDS.pro instead, which references the hosted Sonnet 4.6 model that
does not require external authentication and allows new chat to succeed without
auth.
- Around line 67-76: The AGENT_PROVIDER_IDS set only includes the current model
ids from AGENT_PROVIDER_OPTIONS but excludes legacy claude-code model ids that
the runtime still supports for backward compatibility. Modify the
AGENT_PROVIDER_IDS set creation to also include legacy id variants, or create a
normalization function that can map legacy claude-code ids to their current
equivalents. This will ensure that sessions stored with legacy ids can be
properly validated and reseeded in the model picker state path.
In `@packages/grida-ai-agent/src/agent-provider/claude.ts`:
- Line 17: The CI spell-check pipeline is failing because the variable name
`ndJsonStream` at lines 17 and 63 is being flagged as a misspelling. Add an
allowlist or ignore entry for the token `ndJsonStream` in your spell-check
configuration file (typically a file like .spellcheckrc, .spelling,
pyproject.toml, or similar spell-checker config) to mark this as an intentional
term that should not trigger spelling errors and allow the CI job to pass.
- Around line 198-209: The requestPermission method in the claude.ts file is
currently auto-approving all tool execution requests by selecting the first
available permission option, which bypasses the authorization gate. To implement
fail-closed permission handling, refactor the requestPermission method to deny
permission by default and require explicit user or host approval before allowing
any tool to execute. Instead of automatically selecting from the provided
options, return an outcome with "cancelled" status or implement a mechanism that
routes permission requests to the host's permission UX for explicit consent,
ensuring that tools cannot be executed without explicit authorization.
- Around line 189-248: The transport spawned by spawnBridge at the beginning of
the function is not disposed if initialize, newSession, resumeSession, or
startFresh throw an error. Wrap all the initialization logic starting from the
ClientSideConnection instantiation through the sessionId assignment (including
the canResume check, the conditional session resume/start logic, and startFresh
invocation) in a try-catch block that ensures the transport is disposed before
re-throwing the error, so the bridge process is not left running if any of these
operations fail.
In `@packages/grida-ai-agent/src/agent-provider/consume.bin.ts`:
- Around line 41-56: The session.dispose() call is currently only executed on
the success path after session.prompt() completes. If session.prompt() throws an
error, the session resource will not be cleaned up before the process exits.
Wrap the block containing the session.prompt() call and the subsequent logging
and metrics calculation within a try-finally block, moving the session.dispose()
call into the finally block to ensure it always executes regardless of whether
session.prompt() succeeds or throws an error.
In `@packages/grida-ai-agent/src/agent-provider/run.live.test.ts`:
- Around line 77-89: The beforeEach hook deletes process.env.ANTHROPIC_API_KEY
but the afterEach hook does not restore it, causing process-wide state pollution
that affects subsequent tests. Store the original value of
process.env.ANTHROPIC_API_KEY before deleting it in the beforeEach hook, then
restore the original value in the afterEach hook after the cleanup operations to
prevent test interference.
In `@packages/grida-ai-agent/src/agent-provider/types.ts`:
- Around line 93-97: The isAgentProviderModel function uses the `in` operator to
check if modelId exists in AGENT_PROVIDER_MODELS, which checks the entire
prototype chain including inherited properties like __proto__ and toString. This
allows invalid model IDs to pass validation and later cause runtime errors when
indexing the object. Replace the `in` check with
Object.prototype.hasOwnProperty.call(AGENT_PROVIDER_MODELS, modelId) or
Object.hasOwn(AGENT_PROVIDER_MODELS, modelId) to only validate against own
properties of the allowlist object.
In `@packages/grida-ai-agent/src/runtime/run-input.ts`:
- Around line 107-108: The isAgentProviderModel function in
packages/grida-ai-agent/src/agent-provider/types.ts uses the `in` operator to
check for model IDs, which checks both own and inherited properties and allows
inherited properties like "toString" to bypass the guard. Replace the `in`
operator check with an own-property check using hasOwnProperty or
Object.prototype.hasOwnProperty.call to ensure only explicitly defined model IDs
pass the guard and prevent invalid provider access.
---
Nitpick comments:
In `@packages/grida-ai-agent/src/__public-api__.test.ts`:
- Around line 236-238: The test assertion for the external agent egress policy
only validates a single domain (api.anthropic.com) in
policy.network.allowed_domains, which is insufficient to catch regressions.
Expand this test to verify that ALL required external agent egress hosts are
present in the allowed_domains list, not just one. Identify all necessary vendor
backend hosts that the external agent needs to reach and add assertions to
ensure each one is included in the policy, so that missing domains are caught at
test time rather than failing at runtime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a9ac459d-bd1b-4e4a-a07f-85b8ff5061b4
📒 Files selected for processing (23)
SECURITY.mddocs/wg/ai/agent/acp-provider-codex.mddocs/wg/ai/agent/acp-provider.mddocs/wg/ai/agent/index.mdeditor/scaffolds/desktop/shared/model-picker.tsxpackages/grida-ai-agent/README.mdpackages/grida-ai-agent/src/__public-api__.test.tspackages/grida-ai-agent/src/agent-provider/claude.tspackages/grida-ai-agent/src/agent-provider/consume.bin.tspackages/grida-ai-agent/src/agent-provider/index.tspackages/grida-ai-agent/src/agent-provider/jtbd.test.tspackages/grida-ai-agent/src/agent-provider/run.live.test.tspackages/grida-ai-agent/src/agent-provider/types.tspackages/grida-ai-agent/src/providers/endpoints.live.test.tspackages/grida-ai-agent/src/providers/index.tspackages/grida-ai-agent/src/runtime/agent-provider-run.tspackages/grida-ai-agent/src/runtime/index.tspackages/grida-ai-agent/src/runtime/run-input.tspackages/grida-ai-agent/src/runtime/runtime.live.test.tspackages/grida-ai-agent/src/sandbox/policy.tspackages/grida-ai-agent/src/session/store.tspackages/grida-ai-agent/src/testing/fake-acp-agent.tspackages/grida-ai-agent/src/testing/sse.ts
…port cleanup, test hygiene - typos: allowlist the `ndJsonStream` identifier (CI typos check was failing on it) - isAgentProviderModel: use Object.hasOwn instead of `in` so prototype keys (toString/__proto__) can't pass the guard and reach AGENT_PROVIDER_MODELS[id] - claude.ts: close the bridge transport if the ACP handshake throws (otherwise the spawned npx process leaks — dispose() is unreachable pre-return) - consume.bin.ts: dispose the session in a finally block - run.live.test.ts: save/restore ANTHROPIC_API_KEY so the env mutation can't bleed into other suites - __public-api__.test.ts: assert the full agent-provider egress host set, not just api.anthropic.com - model-picker: keep the legacy bare `claude-code` id recognized so older sessions re-seed instead of being clobbered
…tem prompt (#813) Prompt registry: add src/prompts.ts as the single home for every authored prompt across the package's agent systems — model-provider core/svg/command-hint, agent-provider/ACP system, and the session titler/compactor subagents (moved out of agent/prompts.ts, agent/index.ts, session/*, agent-provider/config.ts). Tool-name-dependent prompts are dependency-free builders that take the tool-name bag as a parameter, so importing a prompt never drags in the tool modules; tool descriptions and the skill-index block stay own-contracted with their tool/mechanism. ACP config dial-board: add agent-provider/config.ts — one place to change how Grida configures the external agent (bridge package, client capabilities, MCP servers, thinking, system-prompt mode). claude.ts reads from it. System prompt: the agent-provider path now appends a managed system prompt (registry prompts.acp_system) onto Claude Code's preset via _meta.systemPrompt; locked by a JTBD test (the fake captures the session _meta). Verified: 531 unit tests, typecheck, oxlint clean; composeSystemPrompt output byte-identical after the move.
|
Followup tracked in #871 — productionize the transport: drop the runtime |
What
Adds the agent-provider class (#813): a second provider class where an external agent owns the loop. Grida acts as an ACP consumer, spawns the ACP-team bridge (
@agentclientprotocol/claude-agent-acp), and drives Claude on the user's own Pro/Max subscription — zero API key, zero signup.It is deliberately distinct from the model-provider kinds (BYOK / endpoint): no
ModelFactoryis ever called. The runtime branches before provider resolution and streams the external agent's output (ProviderChunk→ AI-SDK chunks).Why
BYOK already covers "bring your own key." This reaches the subscription crowd that BYOK can't — the lowest-friction on-ramp. It is a spike to inform a keep/drop decision, not a settled feature.
What's in it
Works + verified:
agent_message_chunkdeltas →text-delta)session/resume), with fallback to a fresh session when a stale id can't be resumedsession/cancel) and stop-reason → finish-reason mappingTest infrastructure: a deterministic in-memory fake ACP agent (
testing/fake-acp-agent.ts) + a JTBD suite (agent-provider/jtbd.test.ts) that asserts user-visible outcomes with no real Claude and nonpx; the gatedrun.live.test.tsproves the real bridge.Docs: decision/RFD
docs/wg/ai/agent/acp-provider.md, a README section, and aSECURITY.md(GRIDA-SEC-004) amendment.Reviewer notes
acp-provider.md.npxbridge running under the packaged srt/Seatbelt + proxy wrap has not been reproduced (only dev). This is the technical go/no-go.DEFAULT_MODEL_IDpoints at Claude Code, so a signed-out user's first new chat hitsauth_requireduntil the zero-config auto-detect lands. Intentional per the product ask — flag if you'd rather gate it on login detection now.ProviderChunkis the neutral seam andruntime/agent-provider-run.tsis the adapter.sandbox/policy.tsadds the Anthropic egress allowlist;claude.tsstripsANTHROPIC_API_KEY(subscription billing only).Tests
pnpm --filter @grida/agent test→ 530 passed, 16 skipped (gated live), 3 todo. Typecheck clean, oxlint clean.Summary by CodeRabbit