Area: New API Extension
Summary
Kelos agents today pass only flat metadata between dependent tasks — branch names, PR URLs, token counts (TaskStatus.Results as map[string]string). When a Linear investigation agent concludes, there is no efficient way to transfer its rich findings (root cause analysis, relevant code paths, reproduction steps) to a downstream solver agent. The solver either re-derives everything from scratch (burning tokens and time) or operates blind.
This proposal adds a standard structured handoff payload to TaskStatus — a versioned JSON format that agents write during execution and downstream agents consume via prompt templates. It is deliberately scoped to the data format and plumbing for inter-task context transfer, not control flow, pipeline management, or dynamic spawning.
Problem
1. Results are metadata, not context
The existing TaskStatus.Results (map[string]string) is designed for machine-readable execution metadata — branch names, commit SHAs, PR URLs, cost, token counts. These are outputs of kelos-capture, not the agent itself. There is no channel for the agent to pass substantive context to a downstream agent: investigation findings, analysis summaries, Slack thread context, or decision rationale.
The prompt template system (internal/controller/task_controller.go:849-886) injects dependency results into downstream prompts:
{{index .Deps "investigate" "Results" "branch"}}
But there is no Results key for "here's what I found and why." Agents would have to stuff freeform text into a flat string map, which has no schema, no size discipline, and no semantic separation between metadata and content.
2. Downstream agents waste tokens re-deriving context
In a pipeline like investigate → fix → open-pr, the fix agent currently receives only coordinates (which branch, which PR). It must re-read the issue, re-analyze the codebase, and re-identify the root cause — duplicating work the investigation agent already completed. This is the primary cost multiplier in multi-stage agent workflows.
With a structured handoff, the investigation agent's summary flows directly into the fix agent's prompt, eliminating redundant analysis.
3. No audit trail for inter-agent communication
When debugging why an agent made a particular decision, operators can inspect TaskStatus.Results for metadata but cannot see what context the agent received from its upstream dependency. A structured handoff stored in TaskStatus provides a persistent, kubectl-inspectable record of exactly what context was transferred between agents.
4. Existing proposals assume Results is sufficient
Several open proposals (#747 conditional dependencies, #792 historyContext, #829 parent-child tasks, #983 TaskPipeline CRD) reference dependency results as the primary inter-task data channel. All of them would benefit from a richer, standardized handoff format — but none of them define one.
Proposed Design
The Handoff Type
Add a new type to api/v1alpha1/task_types.go:
// TaskHandoff is the standard structured payload for agent-to-agent
// context transfer. Versioned for forward compatibility.
type TaskHandoff struct {
	// Version of the handoff schema.
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:default=1
	Version int `json:"version"`

	// Summary is a concise description of what was accomplished or found.
	// Intended to be token-efficient when injected into downstream prompts.
	// +kubebuilder:validation:MaxLength=4096
	Summary string `json:"summary"`

	// Detail contains rich context: findings, analysis, reasoning.
	// Downstream tasks can selectively include this when depth is needed.
	// +optional
	// +kubebuilder:validation:MaxLength=65536
	Detail string `json:"detail,omitempty"`

	// Data contains structured key-value pairs for machine-readable fields.
	// Supports templating in downstream prompts and CEL evaluation in
	// conditional dependencies (#747).
	// +optional
	Data map[string]string `json:"data,omitempty"`
}
Add the field to TaskStatus:
type TaskStatus struct {
	// ... existing fields ...

	// Handoff contains the structured context payload produced by the agent
	// for consumption by downstream dependent tasks.
	// +optional
	Handoff *TaskHandoff `json:"handoff,omitempty"`
}
Why separate from Results
| | Results | Handoff |
|---|---|---|
| Purpose | Execution metadata | Agent-to-agent context |
| Producer | kelos-capture (post-run binary) | The agent itself (during execution) |
| Content | Branch, PR URL, commit SHA, cost, tokens | Investigation findings, analysis, summaries |
| Format | Flat map[string]string | Versioned struct with summary/detail/data |
| Consumer | Controller (metrics, branch lock), prompt templates | Downstream agent prompts |
Keeping them separate means each can evolve independently. Results is part of the agent image interface; Handoff is part of the agent's output to other agents.
How agents produce handoffs
1. Well-known file path. The agent writes JSON to a path specified by the KELOS_HANDOFF_PATH environment variable (default: /tmp/kelos-handoff.json).
2. kelos-capture emits it. After the agent exits, kelos-capture reads the handoff file (if it exists), validates the schema, and emits it between markers in stdout:
---KELOS_HANDOFF_START---
{"version":1,"summary":"...","detail":"...","data":{"key":"value"}}
---KELOS_HANDOFF_END---
3. Controller parses it. The controller extracts the handoff JSON from pod logs (same mechanism as KELOS_OUTPUTS_START/END) and stores it in TaskStatus.Handoff.
4. No handoff = no error. If the agent doesn't write a handoff file, nothing happens. The field remains nil. This makes handoffs fully opt-in with zero impact on existing tasks.
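The controller side of steps 2 to 4 can be sketched as follows. This is a minimal illustration, not the project's parser: the `ParseHandoff` name comes from the file list in this proposal, but its signature, the choice to read the last marker pair in the logs, and the `(nil, nil)` return for "no handoff" are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const (
	handoffStart = "---KELOS_HANDOFF_START---"
	handoffEnd   = "---KELOS_HANDOFF_END---"
)

// TaskHandoff mirrors the proposed API type, without kubebuilder markers.
type TaskHandoff struct {
	Version int               `json:"version"`
	Summary string            `json:"summary"`
	Detail  string            `json:"detail,omitempty"`
	Data    map[string]string `json:"data,omitempty"`
}

// ParseHandoff extracts the handoff JSON emitted between the markers.
// It returns (nil, nil) when the logs contain no start marker, which keeps
// handoffs opt-in. Signature and behavior are assumptions for this sketch.
func ParseHandoff(logs string) (*TaskHandoff, error) {
	// Use the last start marker: kelos-capture runs after the agent exits,
	// so its block is expected to come last in the pod logs.
	start := strings.LastIndex(logs, handoffStart)
	if start < 0 {
		return nil, nil
	}
	rest := logs[start+len(handoffStart):]
	end := strings.Index(rest, handoffEnd)
	if end < 0 {
		return nil, fmt.Errorf("handoff start marker without end marker")
	}
	var h TaskHandoff
	if err := json.Unmarshal([]byte(strings.TrimSpace(rest[:end])), &h); err != nil {
		return nil, fmt.Errorf("malformed handoff JSON: %w", err)
	}
	return &h, nil
}

func main() {
	logs := "agent output\n" + handoffStart + "\n" +
		`{"version":1,"summary":"nil session in auth handler","data":{"severity":"critical"}}` +
		"\n" + handoffEnd + "\n"
	h, err := ParseHandoff(logs)
	if err != nil {
		panic(err)
	}
	fmt.Println(h.Version, h.Summary, h.Data["severity"])
}
```

A parse error here would most naturally be logged and treated as "no handoff" rather than failing the task, in keeping with step 4, though that policy is a design choice this proposal leaves open.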
How downstream tasks consume handoffs
The existing .Deps template context is extended to include Handoff:
deps[depName] = map[string]interface{}{
	"Outputs": depTask.Status.Outputs,
	"Results": depTask.Status.Results,
	"Handoff": depTask.Status.Handoff, // NEW
	"Name":    depName,
}
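Downstream prompts access handoff fields via Go templates. Note that index works on maps, slices, and arrays but not on struct fields, so the forms below assume the controller exposes Handoff to templates as a map keyed by Summary, Detail, and Data rather than as the raw *TaskHandoff:
prompt: |
  ## Investigation Summary
  {{index .Deps "investigate" "Handoff" "Summary"}}
  ## Detailed Findings
  {{index .Deps "investigate" "Handoff" "Detail"}}
  ## Specific Data
  Root cause file: {{index .Deps "investigate" "Handoff" "Data" "root_cause_file"}}
  Fix the issue described above on branch {{index .Deps "investigate" "Results" "branch"}}.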
The task author controls exactly what gets injected — Summary for token efficiency, Detail when depth is needed, specific Data values for targeted references.
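One way to make the index-style template access work is to convert the handoff into a map before it enters the .Deps context, since text/template's index function cannot index a struct. The sketch below assumes a hypothetical helper, handoffTemplateValue, which is not part of the existing controller; it also returns empty values for a missing handoff so templates render empty instead of failing.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// TaskHandoff mirrors the proposed API type, without kubebuilder markers.
type TaskHandoff struct {
	Version int
	Summary string
	Detail  string
	Data    map[string]string
}

// handoffTemplateValue is a hypothetical helper: it exposes a handoff as a
// map so that {{index ... "Handoff" "Summary"}} resolves, and it substitutes
// empty values when the upstream task produced no handoff.
func handoffTemplateValue(h *TaskHandoff) map[string]interface{} {
	if h == nil {
		h = &TaskHandoff{}
	}
	data := h.Data
	if data == nil {
		data = map[string]string{}
	}
	return map[string]interface{}{
		"Version": h.Version,
		"Summary": h.Summary,
		"Detail":  h.Detail,
		"Data":    data,
	}
}

// renderPrompt executes a prompt template against a .Deps context.
func renderPrompt(prompt string, deps map[string]interface{}) (string, error) {
	tmpl, err := template.New("prompt").Parse(prompt)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, map[string]interface{}{"Deps": deps}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	deps := map[string]interface{}{
		"investigate": map[string]interface{}{
			"Handoff": handoffTemplateValue(&TaskHandoff{
				Version: 1,
				Summary: "nil session in auth handler",
				Data:    map[string]string{"root_cause_file": "pkg/auth/auth.go"},
			}),
		},
		// An upstream task that wrote no handoff file.
		"triage": map[string]interface{}{"Handoff": handoffTemplateValue(nil)},
	}
	prompt := `Summary: {{index .Deps "investigate" "Handoff" "Summary"}}
File: {{index .Deps "investigate" "Handoff" "Data" "root_cause_file"}}
Triage: [{{index .Deps "triage" "Handoff" "Summary"}}]`
	out, err := renderPrompt(prompt, deps)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The alternative is to keep the struct in the context and write templates with field access, for example {{(index .Deps "investigate" "Handoff").Summary}}, but that form returns an error when the pointer is nil.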
Size limits
Summary: max 4 KB — forces conciseness, keeps downstream prompts lean
Detail: max 64 KB — room for rich context without blowing up etcd (object size limit ~1.5 MB)
Data: inherits the 64 KB detail limit for the overall handoff
kelos-capture validates and rejects oversized handoffs with a warning log
Agent-side experience
The agent writes a JSON file during execution. No new tools, no MCP server, no special SDK — any agent that can write a file can produce a handoff:
# During agent execution, the agent writes:
cat > "$KELOS_HANDOFF_PATH" <<'EOF'
{
  "version": 1,
  "summary": "Root cause: null pointer in auth.go:142 when session cookie is expired. The middleware skips validation but the handler assumes non-nil session.",
  "detail": "Full stack trace:\n auth.go:142 → session.Validate()\n middleware.go:89 → next.ServeHTTP()\n\nReproduction:\n curl -H 'Cookie: session=expired' https://api.example.com/protected\n\nThe fix requires a nil check before accessing session fields in the auth handler.",
  "data": {
    "root_cause_file": "pkg/auth/auth.go",
    "root_cause_line": "142",
    "severity": "critical",
    "issue_key": "LIN-423"
  }
}
EOF
For AI coding agents (Claude Code, Codex, etc.), the prompt can instruct the agent to write this file. The instruction can be part of the task prompt or injected via AgentConfig agentsMD.
Concrete Examples
Example 1: Investigation → Fix pipeline
# Stage 1: Investigate the issue
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: investigate
spec:
  type: claude-code
  credentials:
    type: oauth
    secretRef:
      name: claude-credentials
  workspaceRef:
    name: my-workspace
  prompt: |
    Investigate Linear issue LIN-423: "Users getting 500 errors on login."
    Analyze the codebase, identify the root cause, and find relevant code paths.
    Do NOT fix the issue; only investigate.
    When done, write your findings to $KELOS_HANDOFF_PATH as JSON:
    {
      "version": 1,
      "summary": "<concise root cause and location>",
      "detail": "<full analysis with code paths, stack traces, reproduction steps>",
      "data": {"root_cause_file": "<path>", "severity": "<low|medium|high|critical>"}
    }
---
# Stage 2: Fix the issue (receives investigation context)
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: fix
spec:
  type: claude-code
  credentials:
    type: oauth
    secretRef:
      name: claude-credentials
  workspaceRef:
    name: my-workspace
  branch: fix/lin-423
  dependsOn: [investigate]
  prompt: |
    ## Investigation Summary
    {{index .Deps "investigate" "Handoff" "Summary"}}
    ## Detailed Findings
    {{index .Deps "investigate" "Handoff" "Detail"}}
    Fix the issue described above. Write tests that cover the failure case.
    Commit and push your changes.
---
# Stage 3: Open PR
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: open-pr
spec:
  type: claude-code
  credentials:
    type: oauth
    secretRef:
      name: claude-credentials
  workspaceRef:
    name: my-workspace
  branch: fix/lin-423
  # investigate is listed explicitly because the prompt reads its handoff from .Deps
  dependsOn: [fix, investigate]
  prompt: |
    The fix for LIN-423 is ready on branch {{index .Deps "fix" "Results" "branch"}}.
    Investigation context:
    {{index .Deps "investigate" "Handoff" "Summary"}}
    Review the diff and open a pull request with `gh pr create`.
    Reference LIN-423 in the PR description.
The fix agent starts with full context immediately — no re-investigation, no wasted tokens. The PR agent also references the investigation summary for a well-written PR description.
Example 2: Slack-triggered triage with context preservation
apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: slack-triage
spec:
  when:
    genericWebhook:
      path: /slack-escalation
  taskTemplate:
    type: claude-code
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    promptTemplate: |
      A Slack escalation was received: {{.Body}}
      Triage this issue:
      1. Identify the affected service and severity
      2. Check recent deployments and error logs
      3. Determine if this is a known issue
      Write your triage findings to $KELOS_HANDOFF_PATH so the resolver
      agent has full context without re-reading the Slack thread.
The downstream resolver task (via dependsOn or a future TaskPipeline) gets the triage context without needing to re-fetch and re-parse the Slack thread.
Relationship to Existing Proposals
| Issue | What it does | How handoffs interact |
|---|---|---|
| #792 (historyContext) | Injects prior task outcomes from the same spawner into prompts | Handoff summaries from historical tasks could be included via includeKeys: [handoff_summary]. Different axes: #792 is temporal (across runs), handoffs are spatial (across pipeline stages). |
| #747 (conditional deps) | CEL-based routing on dependency results | Handoff Data fields become available for CEL evaluation: handoff.data["severity"] == "critical". Richer routing signals than flat Results alone. |
| #829 (parent-child tasks) | Agent-initiated dynamic task spawning via MCP | Parent agent reads child task handoffs via get_task_status. Child tasks inherit the parent's handoff context. The handoff format standardizes what flows in both directions. |
| #983 (TaskPipeline CRD) | First-class pipeline with stages, matrix fan-out | Handoff becomes the inter-stage data format. Pipeline stages access upstream handoffs via {{.Stages}} templates. Matrix stages could aggregate handoffs from parallel tasks. |
This proposal does not depend on any of these issues and does not block them. It is a standalone, additive primitive that each of them benefits from.
Files to Change
| File | Change |
|---|---|
| api/v1alpha1/task_types.go | Add TaskHandoff struct, add Handoff *TaskHandoff to TaskStatus |
| internal/controller/output_parser.go | Add ParseHandoff() for KELOS_HANDOFF_START/END markers |
| internal/controller/output_parser_test.go | Tests for handoff parsing, size validation, malformed JSON |
| internal/controller/task_controller.go | Parse handoff from pod logs, store in status; inject into .Deps template context; set KELOS_HANDOFF_PATH env var |
| internal/controller/task_controller_test.go | Tests for handoff in template resolution, env var injection |
| cmd/kelos-capture/main.go | Read the file at KELOS_HANDOFF_PATH (default /tmp/kelos-handoff.json), validate, emit between markers |
| docs/agent-image-interface.md | Document handoff file contract, env var, markers |
| examples/07-task-pipeline/pipeline.yaml | Update example to demonstrate handoff usage |
Estimated: ~60 lines of types + ~50 lines of parser + ~30 lines of controller wiring + tests.
Backward Compatibility
- Purely additive: New optional field on TaskStatus, no changes to existing behavior
- Zero config for existing users: Tasks that don't write a handoff file are completely unaffected
- Existing Results/Outputs unchanged: kelos-capture continues emitting KELOS_OUTPUTS_START/END as before; handoff markers are separate
- Template fallback needs care: a reference such as {{.Deps.X.Handoff.Summary}} renders empty when no handoff exists only if the controller exposes a missing handoff as an empty value. Go's text/template returns an error when a field is evaluated on a nil pointer, so a nil *TaskHandoff should not be passed to templates directly
- No new CRDs: Extends existing TaskStatus within the Task resource
/kind feature