- Status: Draft
- Tracking issue: to be filed
- Author: @shangdinggu
- Last updated: 2026-05-08
- Builds on: 0016-subprocess-agent-runner.md
A 10-minute LLM response, a git log with 50 MB of output, a
multi-megabyte web fetch — each is currently invisible to the
caller until the runner ships its single exit message. This
RFC adds a streaming primitive so callers see incremental
output as it's produced.
The substrate is intentionally narrow:
- One new IPC message kind: `chunk`. Emitted between `iteration_start` and `exit`; carries an arbitrary payload.
- `supervisor.wait` gets `on_chunk`: callers can register a callback that fires on each chunk arrival. Synchronous, inline with the wait loop.
- `RunnerExitInfo.chunks`: a tuple of all received chunks, populated for after-the-fact inspection.
- No protocol breakage: runners that don't emit `chunk` messages behave exactly as before. `chunks` defaults to an empty tuple.
What's not in this RFC:
- LLM runner integration with anthropic streaming: separate follow-up. Substrate ships first.
- Exec / Fetch streaming output: same.
- Caller-driven cancellation mid-stream: already covered by `stop()`.
- Chunk acknowledgement / backpressure: chunks are fire-and-forget. The receiver's callback has to be quick or the runner backs up at the pipe.
`kind` is a free-form classifier so callers can route
chunks (e.g., a UI surface separates `"text"` from `"tool_output"`).
`content` is typically a UTF-8 string, but binary callers can
base64-encode and use a custom `kind`.
The runner can emit zero or many `chunk` messages in any
iteration. They MUST appear before the iteration's
`iteration_done` (or before `exit`, for runners that don't track
iterations). The supervisor does not validate this placement; it
simply delivers chunks in the order the runner sent them, whatever
other message kinds are interleaved.
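The placement rule can be sketched from the runner's side as follows. This assumes, purely for illustration, that IPC messages are newline-delimited JSON written to a text stream and that `iteration_start` and `iteration_done` are carried in the same `op` field as `chunk`; the helpers `send` and `run_iteration` are hypothetical.

```python
import io
import json


def send(stream, message: dict) -> None:
    # Assumed framing: one JSON object per line.
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def run_iteration(stream, pieces) -> None:
    """Emit every chunk of an iteration before its iteration_done."""
    send(stream, {"op": "iteration_start"})
    for piece in pieces:
        send(stream, {"op": "chunk", "kind": "text", "content": piece, "metadata": {}})
    send(stream, {"op": "iteration_done"})


buf = io.StringIO()  # stands in for the pipe to the supervisor
run_iteration(buf, ["Hel", "lo"])
ops = [json.loads(line)["op"] for line in buf.getvalue().splitlines()]
```

Here `ops` lists the message kinds in send order, with both chunks sitting between the start and done markers.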
```python
def wait(self, pid: int, *,
         timeout: float | None = None,
         on_chunk: Callable[[dict], None] | None = None,
         ) -> RunnerExitInfo:
    ...
```

When `on_chunk` is supplied, the supervisor calls it for
each `chunk` IPC message received. The callable is invoked
synchronously inside the wait loop, so slow callbacks slow the
drain. Callers that need responsive UIs should hand off to a
queue inside the callback and process elsewhere.
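One possible shape for the queue hand-off is sketched below, using only the standard library. The loop that calls `on_chunk` directly is a stand-in for `supervisor.wait(pid, on_chunk=on_chunk)`; the `None` sentinel and the `rendered` list are conventions of this example, not part of the RFC.

```python
import queue
import threading

chunk_queue = queue.Queue()
rendered = []


def on_chunk(chunk: dict) -> None:
    # Keep the callback trivial: enqueue and return to the wait loop.
    chunk_queue.put(chunk)


def consumer() -> None:
    # Slow work (rendering, I/O) happens here, off the wait loop.
    while True:
        chunk = chunk_queue.get()
        if chunk is None:  # sentinel: no more chunks
            break
        rendered.append(chunk["content"])


worker = threading.Thread(target=consumer, daemon=True)
worker.start()

# Stand-in for the supervisor invoking the callback once per chunk.
for i in range(3):
    on_chunk({"op": "chunk", "kind": "text", "content": f"part-{i}", "metadata": {}})

chunk_queue.put(None)
worker.join(timeout=5)
```

Because the queue is unbounded, `put` returns immediately and the wait loop is never held up by the consumer; the trade-off is that a stalled consumer lets memory grow instead of pushing back on the runner.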
Callback exceptions are caught and dropped (logged, not reraised) — a bad UI thread shouldn't kill the runner.
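A minimal sketch of that isolation, assuming a hypothetical internal helper `dispatch_chunk` that the wait loop would call once per chunk message. Appending before invoking the callback is a choice made for this example; the RFC only requires that a raising callback affect neither later deliveries nor the recorded chunks.

```python
import logging

logger = logging.getLogger("supervisor_sketch")


def dispatch_chunk(message: dict, on_chunk, collected: list) -> None:
    """Record the chunk, then call the callback without letting it raise."""
    collected.append(message)
    if on_chunk is None:
        return
    try:
        on_chunk(message)
    except Exception:
        # Logged, not re-raised: the wait loop keeps draining.
        logger.exception("on_chunk callback raised; continuing")


seen = []


def flaky_callback(chunk: dict) -> None:
    if chunk["content"] == "b":
        raise RuntimeError("UI went away")
    seen.append(chunk["content"])


collected = []
for content in ("a", "b", "c"):
    dispatch_chunk(
        {"op": "chunk", "kind": "text", "content": content, "metadata": {}},
        flaky_callback,
        collected,
    )
```

The callback fails on the second chunk, yet the third is still delivered and all three end up in `collected`.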
```python
@dataclass(frozen=True)
class RunnerExitInfo:
    ...
    chunks: tuple = ()  # NEW
```

`chunks` is the full sequence of received chunks (each a
dict in the wire shape). Populated even when `on_chunk` was
also provided: the same chunks appear in both.
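For after-the-fact inspection, a caller might reassemble the text output from the tuple, roughly as below. `ExitInfoSketch` is a hypothetical stand-in that carries only the `chunks` field, since the other `RunnerExitInfo` fields are not shown here; `text_of` is likewise an illustrative helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExitInfoSketch:
    # Stand-in for RunnerExitInfo; only the new field is modelled.
    chunks: tuple = ()


def text_of(info) -> str:
    """Concatenate the content of all "text" chunks, in arrival order."""
    return "".join(c["content"] for c in info.chunks if c.get("kind") == "text")


info = ExitInfoSketch(chunks=(
    {"op": "chunk", "kind": "text", "content": "Hello, ", "metadata": {}},
    {"op": "chunk", "kind": "log", "content": "cache miss", "metadata": {}},
    {"op": "chunk", "kind": "text", "content": "world", "metadata": {}},
))
full_text = text_of(info)
empty_text = text_of(ExitInfoSketch())
```

Using a tuple on a frozen dataclass keeps the recorded sequence immutable once the runner has exited, which suits inspection and test assertions.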
A new `CC_RUNNER_BEHAVIOR=chunks=N` value emits N text
chunks (each `"chunk-i"` for i = 1..N), then exits cleanly.
This drives the supervisor end-to-end without needing an LLM
or real streaming source.
- `RunnerExitInfo.chunks` defaults to `()`. Existing fields and tests are unchanged.
- The `supervisor.wait` `on_chunk` parameter is optional with default `None`. Existing callers see no change.
- Runners that don't emit `chunk` messages receive `chunks=()` in their info, same as before.
A PR claiming this RFC must demonstrate that:
- An `op="chunk"` IPC message is recognised by the supervisor, sent to `on_chunk` if provided, and appended to `RunnerExitInfo.chunks`.
- Order is preserved: the callback fires in send order, and the tuple reflects the same order.
- Runner emitting 5 chunks → callback fires 5 times → tuple has 5 entries.
- Runner emitting no chunks → `chunks == ()`.
- Callback raising → subsequent chunks are still delivered and the tuple is still appended.
- Existing tests with no chunk involvement keep passing.
- No file outside `cc_kernel/`, `tests/`, `docs/RFC/` is modified.
Wire shape of a `chunk` message:

```json
{
  "op": "chunk",
  "kind": "text" | "tool_output" | "log" | <custom>,
  "content": "<the chunk payload, usually a string>",
  "metadata": { /* opaque */ }
}
```