On-chain policy enforcement for Solana AI agents. Three layers of defense — program allow-listing, spending budgets, and a real-time AI kill switch — between any autonomous agent and the blockchain.
Built for Solana Frontier hackathon.
250,000+ AI agents operate on Solana today. Most hold unconstrained keys — if compromised, there's nothing on-chain stopping them from draining their treasury. Step Finance lost $40M to exactly this kind of failure.
Agent Guardrails Protocol is an on-chain policy layer that sits between an AI agent and the programs it calls. Every transaction goes through a guarded_execute instruction that enforces:
- Allow-listing — agent can only call whitelisted programs
- Spending budgets — per-transaction and rolling daily caps
- AI kill switch — a monitoring server watches all activity in real time, uses Claude to judge anomalies, and can pause an agent on-chain in under 3 seconds
Agent signs txn → Guardrails PDA validates → CPI to target program
↓ (events)
Helius webhook → Server → Claude judge
↓ (if suspicious)
On-chain PAUSE
graph TB
subgraph "On-Chain (Solana)"
Agent[AI Agent<br/>session key]
GR[Guardrails Program<br/>PDA policy check]
Target[Target Program<br/>Jupiter / System / Token]
Squads[Squads v4<br/>Multisig]
end
subgraph "Off-Chain Infrastructure"
Helius[Helius Webhooks]
Server[Server<br/>Express + Node.js]
Claude[Claude API<br/>Haiku judge / Opus reports]
DB[(Neon Postgres<br/>via Prisma)]
end
subgraph "Frontend"
Dashboard[Dashboard<br/>Next.js on Vercel]
Wallet[Phantom / Solflare<br/>Owner wallet]
end
Agent -->|signs guarded_execute| GR
GR -->|CPI via PDA signer| Target
GR -->|escalate if > threshold| Squads
GR -->|emit events| Helius
Helius -->|webhook POST| Server
Server -->|judge call| Claude
Server -->|persist txns + verdicts| DB
Server -->|pause_agent tx| GR
Server -->|SSE push| Dashboard
Dashboard -->|RPC reads| GR
Dashboard -->|REST API queries| Server
Wallet -->|sign txns| Dashboard
| Layer | Technology |
|---|---|
| Smart contract | Anchor 0.30.1 / Rust |
| Frontend | Next.js 14 (App Router) / Tailwind / shadcn/ui / Recharts |
| Wallet | @solana/wallet-adapter (Phantom, Solflare, Backpack) + SIWS |
| State management | TanStack Query + Zustand |
| Server | Express + Node.js 20 / TypeScript |
| Anomaly detection | Claude Haiku 4.5 (real-time) + Claude Opus 4.7 (incident reports) |
| Data ingestion | Helius Enhanced Webhooks |
| Database | Neon Postgres + Prisma ORM |
| Realtime | Server-Sent Events (SSE) |
| Session keys | Swig |
| Multisig escalation | Squads v4 |
| Testing | LiteSVM (in-process) via anchor-litesvm |
Four isolated sub-projects — no monorepo, each deploys independently:
agent-guardrails/
├── program/ # Anchor/Rust on-chain program → Solana devnet
├── sdk/ # IDL + TS client (source of truth, synced to consumers)
├── server/ # Express server: API + worker pipeline
├── dashboard/ # Next.js 14 frontend only → Vercel
├── docs/ # Architecture, data contracts, setup, deploy, demo runbook
├── scripts/ # SDK sync, devnet deploy
├── .github/workflows/ # CI + deploy workflows
└── .claude/ # Custom commands + specialized agents for Claude Code
Shared SDK is synced automatically — edit sdk/, a pre-commit hook copies to server/src/sdk/ and dashboard/lib/sdk/.
- Rust 1.75+, Solana CLI 1.18+, Anchor 0.30.1
- Node.js 20+, pnpm 9+
git clone https://github.com/iamasx/agent-guardrails.git
cd agent-guardrails
# Configure git hooks (auto-syncs SDK on commit)
git config core.hooksPath .githooks
# Build the program
cd program
pnpm install
anchor build
cd ..
# Sync IDL to consumers
bash scripts/sync-sdk.sh
# Run tests (LiteSVM, in-process — no validator needed)
cd program
anchor test --skip-local-validator --skip-deploycd server
pnpm install
cp .env.example .env # fill in values (DATABASE_URL, ANTHROPIC_API_KEY, etc.)
npx prisma migrate dev # create database tables
pnpm dev # http://localhost:8080cd dashboard
npm install
cp .env.example .env.local # fill in values (NEXT_PUBLIC_API_URL, etc.)
npm run dev # http://localhost:3000See docs/env-setup.md for the full local development guide.
The core is guarded_execute — a single instruction that validates and proxies agent transactions:
- Load the agent's
PermissionPolicyPDA - Assert the policy is active (not paused)
- Assert the session hasn't expired
- Assert the target program is whitelisted
- Assert the transaction amount is within the per-tx cap
- Roll the daily budget if 24h has elapsed
- Assert the daily budget isn't exceeded
- If amount exceeds the escalation threshold and Squads is configured, create a multisig proposal
- Emit a
GuardedTxnAttemptedevent - CPI to the target program with the PDA as signer
- Update the
SpendTracker - Emit
GuardedTxnExecutedorGuardedTxnRejected
The agent's keypair holds no funds. All funds live in token accounts owned by the policy PDA. The agent instructs, the PDA acts.
Helius webhook → Server
→ [Ingest] HMAC verify, parse, persist via Prisma → SSE push
→ [Prefilter] Cheap stat checks — skip LLM for routine txns (~70% skipped)
→ [Judge] Claude Haiku evaluates: ALLOW / FLAG / PAUSE → SSE push
→ [Executor] If PAUSE: sign + send pause_agent on-chain → SSE push
→ [Reporter] Queue Claude Opus incident report (async) → SSE push
Prefilter rules (skips LLM if all hold):
- Target is the agent's most-used program
- Amount < 50% of cap
- < 3 txns in last 60 seconds
- Within historical activity hours
Always invokes LLM if any hold:
- New program never seen before
- Amount > 70% of cap
- Burst > 5 txns in 60s
- Daily budget > 80% consumed
- Session expiring within 10 minutes
| Component | Platform | Command |
|---|---|---|
| Program | Solana devnet | cd program && anchor deploy |
| Database | Neon Postgres | cd server && npx prisma migrate deploy |
| Server | TBD | See docs/deploy.md |
| Dashboard | Vercel | Auto-deploys on push to main |
Deploy order matters: Program (need program ID) → Database (need connection string) → Server (need program ID + DB + webhook URL) → Dashboard (need server URL).
See docs/deploy.md for the full deployment guide.
- Swig — Session key issuance. Scoped agent keys with built-in expiry. Defense in depth with Guardrails.
- Squads v4 — Multisig escalation. High-value transactions require human approval via Squads proposal.
- Helius — Enhanced webhooks stream program events to the monitoring server in real time.
- Solana Agent Kit (SendAI) — Demo agents simulate honest and malicious behavior for the live demo.
Three agents run on devnet:
- Yield Bot — honest Jupiter swaps, ~30 txns/hour
- Staking Agent — honest Marinade staking
- Alpha Scanner — deliberately misbehaves (burst txns to unknown programs)
cd dashboard
npm run demo:setup # Create policies on devnet
npm run demo:simulate # Start agents — attacker triggers at T+60sThe dashboard shows normal activity streaming in via SSE, then the attack: FLAG → FLAG → PAUSE in real time. The agent is stopped on-chain in under 3 seconds. An Opus-generated incident report appears with a full timeline and root cause analysis.
| Document | Description |
|---|---|
| implementation-plan.md | High-level spec, week plan, demo script, risks |
| program/IMPLEMENTATION.md | On-chain program design |
| server/IMPLEMENTATION.md | Server pipeline, API, SSE, auth |
| dashboard/IMPLEMENTATION.md | Frontend components, data fetching, SSE |
| docs/architecture.md | System topology and data flow diagrams |
| docs/data-contracts.md | Account layouts, events, Prisma schemas, API contracts |
| docs/env-setup.md | Local development setup guide |
| docs/deploy.md | Deployment guide for all components |
| docs/demo-runbook.md | Demo day operator's guide |
| docs/walkthrough.md | End-to-end system walkthrough with demo example |
See contributing/CONTRIBUTING.md for the full guide: workflow, coding standards, PR process, and testing.
AI tool setup — this repo is built with Claude Code. For other tools:
- Cursor —
bash contributing/scripts/setup-cursor.sh - Codex —
bash contributing/scripts/setup-codex.sh - VS Code + Copilot —
bash contributing/scripts/setup-vscode.sh
See contributing/README.md for details.
Apache 2.0
Built with Claude Code