Kubernetes-native AI coding agent harness with a real-time intelligence layer.
RoboDev orchestrates autonomous developer agents (Claude Code, OpenAI Codex, Aider, OpenCode) inside isolated Kubernetes Jobs. It goes beyond job dispatching: a built-in intelligence layer streams live output from every running agent, scores productivity in real time, diagnoses failures causally, routes tasks to the best engine, and accumulates cross-task knowledge that improves future runs.
Most agent orchestrators are job launchers — they dispatch a task and wait for success or failure. RoboDev adds an intelligence layer that actively monitors, coaches, and learns from every running agent:
| Capability | What it does |
|---|---|
| Real-time streaming | Agent output flows over a live NDJSON stream — tool calls, cost updates, and content deltas are visible line-by-line as they happen, not after the job finishes |
| Process Reward Model (PRM) | Every tool call is scored for productivity. If a trajectory is deteriorating, targeted guidance is injected before the situation escalates — like having a senior engineer watching over the agent's shoulder |
| Episodic memory | A SQLite-backed knowledge graph accumulates facts, patterns, and engine profiles across tasks. Each new task receives relevant prior knowledge injected into its prompt. Confidence decays over time; stale knowledge is pruned automatically |
| Causal failure diagnosis | Failed tasks are not just retried. The failure mode is classified — permission error, test failure, ambiguous spec, environment issue — and a targeted recovery strategy is generated before the next attempt |
| Intelligent engine routing | Historical success rates, cost, and task-type affinity guide which engine handles each task. Poorly performing engines are penalised; good ones accumulate positive signal. The selection updates continuously |
| Predictive cost estimation | Before a task launches, a k-NN model estimates cost and duration from similar historical outcomes. Tasks projected to exceed `max_predicted_cost_per_job` are automatically rejected before the job starts |
| Competitive execution | Multiple engines run the same task simultaneously. A judge engine evaluates all outputs and selects the best result — useful for critical tasks where quality matters more than cost |
| Review-response loop | RoboDev monitors PRs it opens, classifies incoming review comments, and automatically spawns follow-up jobs to address actionable feedback — turning a single-pass agent into a review-responsive loop |
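The predictive cost estimation row can be illustrated with a minimal sketch. It assumes each historical outcome is reduced to a small numeric feature vector; the `Outcome` type, the feature choice, and the `estimate` and `admit` helpers are hypothetical names for illustration, not RoboDev's actual API. Only the `max_predicted_cost_per_job` threshold comes from the table above.

```python
import math
from dataclasses import dataclass


@dataclass
class Outcome:
    # Hypothetical features, e.g. (files touched, prompt size in thousands of tokens).
    features: tuple
    cost_usd: float
    duration_min: float


def estimate(history, features, k=3):
    """Average the cost and duration of the k nearest historical outcomes."""
    if not history:
        raise ValueError("no historical outcomes to compare against")
    # Rank past outcomes by Euclidean distance to the new task's features.
    nearest = sorted(history, key=lambda o: math.dist(o.features, features))[:k]
    cost = sum(o.cost_usd for o in nearest) / len(nearest)
    duration = sum(o.duration_min for o in nearest) / len(nearest)
    return cost, duration


def admit(history, features, max_predicted_cost_per_job, k=3):
    """Return False when the predicted cost exceeds the configured limit."""
    cost, _ = estimate(history, features, k)
    return cost <= max_predicted_cost_per_job


history = [
    Outcome((2.0, 1.0), 0.80, 6.0),
    Outcome((3.0, 1.5), 1.20, 9.0),
    Outcome((20.0, 8.0), 9.00, 55.0),
    Outcome((25.0, 9.0), 11.00, 70.0),
]

small_task_ok = admit(history, (2.5, 1.2), max_predicted_cost_per_job=5.00, k=2)
large_task_ok = admit(history, (22.0, 8.5), max_predicted_cost_per_job=5.00, k=2)
```

In this sketch the small task resembles the two cheap outcomes and is admitted, while the large task resembles the two expensive ones and is rejected before any job would start. A real estimator would likely normalise features and weight neighbours by distance.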
An adaptive watchdog detects repetitive tool-call loops, cost-velocity spikes, thrashing between files, and idle stalls — intervening with targeted actions (nudge, escalate, or terminate) rather than waiting for a blunt timeout.
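As a rough illustration of repetitive tool-call loop detection, the sketch below counts identical calls inside a sliding window and maps the repeat count to the three actions named above. The `LoopWatchdog` class, its thresholds, and the window size are assumptions made for this example; the real watchdog also considers cost velocity, file thrashing, and idle stalls, which are not modelled here.

```python
from collections import Counter, deque


class LoopWatchdog:
    """Hypothetical detector for repeated identical tool calls."""

    def __init__(self, window=10, nudge_at=3, escalate_at=5, terminate_at=7):
        # Only the most recent `window` calls are considered.
        self.recent = deque(maxlen=window)
        self.nudge_at = nudge_at
        self.escalate_at = escalate_at
        self.terminate_at = terminate_at

    def observe(self, tool, args):
        """Record one tool call and return the action to take."""
        call = (tool, args)
        self.recent.append(call)
        repeats = Counter(self.recent)[call]
        if repeats >= self.terminate_at:
            return "terminate"
        if repeats >= self.escalate_at:
            return "escalate"
        if repeats >= self.nudge_at:
            return "nudge"
        return "continue"


watchdog = LoopWatchdog()
actions = [watchdog.observe("read_file", "main.go") for _ in range(7)]
```

Graduated thresholds let the watchdog respond in proportion to the problem: a short repeat earns a nudge, and only a persistent loop is terminated.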
- Four execution engines — Claude Code, OpenAI Codex, Aider, and OpenCode; extensible via a gRPC plugin interface
- Event-driven ingestion — Webhooks for GitHub, GitLab, Slack, Shortcut, and generic sources with HMAC signature validation
- Plugin architecture — Ticketing, notifications, secrets, SCM, and review backends are all swappable; write plugins in any language with gRPC support
- Human-in-the-loop — Approval gates at `pre_start` and `pre_merge` hold execution until a human approves via Slack
- Secret resolution — SCM and notification credentials are resolved from Kubernetes Secrets or HashiCorp Vault and injected into agent pods; the `secretresolver` package provides the infrastructure for per-task references
- Defence in depth — Six layered safety boundaries: controller guard rails, engine hooks, quality gate, adaptive watchdog, NetworkPolicies, and secret resolution policy
- Kubernetes-native — Isolated agent pods with non-root, read-only-FS, dropped-all-capabilities security contexts; optional gVisor/Kata Containers sandboxing
- Enterprise-ready — Cost budgets per task, Prometheus metrics, and Grafana dashboards; multi-tenancy config schema is defined and planned for a future release
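The HMAC signature validation mentioned for webhook ingestion can be sketched as follows. This follows the scheme GitHub uses for its `X-Hub-Signature-256` header (`sha256=` followed by a hex HMAC-SHA256 of the raw request body); other sources may sign differently, and the `verify_signature` helper is a hypothetical name rather than RoboDev's own function.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style 'sha256=<hex>' signature over the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


secret = b"example-webhook-secret"  # illustrative value only
body = b'{"action": "labeled"}'
valid_header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

accepted = verify_signature(secret, body, valid_header)
tampered = verify_signature(secret, body + b" ", valid_header)
```

The signature must be computed over the exact bytes received, so validation should happen before the payload is parsed or re-serialised.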
```bash
kubectl create namespace robodev
kubectl create secret generic robodev-github-token \
  -n robodev --from-literal=token=ghp_YOUR_TOKEN
kubectl create secret generic robodev-anthropic-key \
  -n robodev --from-literal=api_key=sk-ant-YOUR_KEY
helm repo add robodev https://unitaryai.github.io/RoboDev/charts
helm install robodev robodev/robodev -n robodev -f robodev-config.yaml
```

Minimal `robodev-config.yaml`:
```yaml
ticketing:
  backend: github
  config:
    owner: "your-org"
    repo: "your-repo"
    token_secret: robodev-github-token
    labels: ["robodev"]

engines:
  default: claude-code
  claude-code:
    auth:
      method: api_key
      api_key_secret: robodev-anthropic-key

guardrails:
  max_cost_per_job: 5.00
  max_job_duration_minutes: 60
  allowed_repos:
    - "github.com/your-org/your-repo"
```

Label a GitHub issue `robodev` and the controller picks it up, launches a Claude Code agent, and opens a pull request with the changes.
For step-by-step setup guides, the full configuration reference, and instructions for enabling the intelligence layer features, see the documentation.
All external integrations are pluggable. Built-in plugins are compiled into the controller; third-party plugins run as gRPC subprocesses via hashicorp/go-plugin.
| Interface | Built-in | Third-party examples |
|---|---|---|
| Ticketing | GitHub Issues, Shortcut, Linear | Jira, Monday.com |
| Notifications | Slack, Telegram, Discord | Teams, PagerDuty |
| Approval | Slack | Teams |
| Secrets | K8s Secrets, HashiCorp Vault | AWS SM, 1Password |
| SCM | GitHub, GitLab | Bitbucket |
| Review | CodeRabbit | Custom |
| Engine | Claude Code, Codex, Aider, OpenCode | Custom |
SDK helper libraries for Python, Go, and TypeScript are in `sdk/`. Generated proto stubs are not checked in; run `make sdk-gen` to generate them from the proto definitions. See the plugin docs for details.
```bash
make test          # Run all unit tests
make build         # Build controller binary
make lint          # Run linter
make proto-lint    # Lint protobuf definitions
make docker-build  # Build all container images
```

Prerequisites: Go 1.23+, gofumpt, golangci-lint, buf.
We welcome contributions. See CONTRIBUTING.md for guidelines.
- Conventional commits: `feat:`, `fix:`, `docs:`, `test:`
- British English in all comments, docs, and user-facing strings
- Table-driven tests with `testify`
- `gofumpt` formatting and `golangci-lint` clean
Apache 2.0. See LICENCE.
