Summary
gc sling <worker> <bead-id> sets gc.routed_to metadata and wraps the bead in a convoy/wisp, but does NOT send a wake signal to the target worker. As a result, routed beads sit unclaimed indefinitely against any worker that is not currently in an active turn cycle.
Root cause: --nudge defaults to false (cmd/gc/cmd_sling.go:105), and gc hook --inject is wired to the Stop hook only (internal/hooks/config/claude.json:42-51), not SessionStart. A warm-idle worker session (one that ran gc prime and has been sitting at its idle prompt with no active turn) never fires Stop, so the work-check never runs and routed work is never surfaced.
Relationship to #1027
#1027 addressed the cold-spawn pool case: a sling that spawns a fresh polecat now triggers a first turn via prompt_mode="arg" + PromptSuffix. That fix does NOT cover the warm-idle case (session already spawned, primed, and idle). @julianknutsen's closing comment on #1027 explicitly flagged this as separate follow-up:
Caveat: if we later want gc sling --nudge itself to enqueue a post-start reminder for cold pool targets, that remains separate follow-up work because the cold pool branch still does not enqueue a sling reminder before an instance exists.
Reproduction
- Let a worker session quiesce at its idle prompt (gc prime done, no active turn).
- From any caller: gc sling <rig>/<worker> <bead-id> (without --nudge).
- Observe:
  - The bead has metadata.gc.routed_to = <worker> ✓
  - The bead is wrapped in a convoy (sling-<id>) and a wisp with the target formula ✓
  - bd ready excludes the bead (it is considered hooked) ✓
  - gc hook <rig>/<worker> returns exit=1 (no work detected): the hook does not surface routed-to-this-worker beads as claimable ✓
  - Worker session stays at idle prompt, status on the bead stays open, assignee stays null ✗
Observed in practice (ds-research city)
- Slung scix_experiments-wqr.9.1 → scix-worker at T+0. Sat idle.
- Slung scix_experiments-wqr.9.2 → scix-worker-2 at T+0. Sat idle.
- Slung scix_experiments-wqr.9.6 → scix-worker-3 at T+50min. Claimed + closed within 2 minutes.
- Difference: scix-worker-3 happened to be in an active turn cycle at T+50min (finishing prior work). Its Stop hook fired, surfaced 9.6, next turn claimed it. wqr.9.1 and 9.2 were slung to sessions that were already quiescent; they never fired Stop, never saw the work.
Workaround (confirmed)
gc session nudge <active-session-id> "check for assigned work"
Any user-prompt nudge unsticks — it starts a turn, the turn ends, Stop fires, gc hook --inject surfaces routed work into the next turn. Time-to-claim: ~30-60s.
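The stall and the workaround both follow from the same rule: the work-check runs only when a turn ends. A toy model of that rule, as a minimal sketch: the event type and the surfaced function are hypothetical names invented for illustration, not part of gc, and the model assumes Stop fires only at the end of an active turn.

```go
package main

import "fmt"

// event is a hypothetical stand-in for the session lifecycle hooks in this issue.
type event string

const (
	sessionStart event = "SessionStart" // runs gc prime only
	userPrompt   event = "UserPrompt"   // e.g. a manual gc session nudge
	stop         event = "Stop"         // end of a turn; runs gc hook --inject
)

// surfaced reports whether routed work would be injected for a sequence of
// events, assuming the work-check is wired to Stop and nothing else.
func surfaced(events []event) bool {
	turnActive := false
	for _, e := range events {
		switch e {
		case sessionStart:
			// A fresh session is primed but has no turn in flight.
			turnActive = false
		case userPrompt:
			turnActive = true
		case stop:
			if turnActive {
				return true // the turn ended, so the work-check ran
			}
		}
	}
	return false
}

func main() {
	warmIdle := []event{sessionStart}
	nudged := []event{sessionStart, userPrompt, stop}
	reset := []event{sessionStart, sessionStart}

	fmt.Println("warm-idle surfaced:", surfaced(warmIdle))
	fmt.Println("nudged surfaced:   ", surfaced(nudged))
	fmt.Println("reset surfaced:    ", surfaced(reset))
}
```

In this model only the nudged sequence reaches the work-check, which matches the observation that any user-prompt nudge unsticks the worker while a reset does not.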
Non-workarounds
- gc session reset <sid>: spawns a fresh session, but the new SessionStart still only runs gc prime, no work-check.
- Session turnover from idle timeout — same reason.
- Just waiting — Stop never fires because the session has no turn to end.
Acceptance criteria
gc sling to an idle-but-running worker results in the bead being claimed (assignee set, worker started) within 60s without any manual session nudge. Behavior must be consistent whether the worker was previously active or quiescent.
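One way to check the 60s criterion in a test is to poll for the claim with a deadline. The following is a sketch under stated assumptions: claimProbe, claimedWithin, and errNotClaimed are hypothetical names, and the probe is a placeholder for whatever check reports that the bead's assignee is set; no real gc or bd API is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// claimProbe is a hypothetical check that reports whether the bead is claimed.
type claimProbe func() (bool, error)

var errNotClaimed = errors.New("bead not claimed before deadline")

// claimedWithin polls probe every interval until it reports true,
// returns an error, or timeout elapses.
func claimedWithin(timeout, interval time.Duration, probe claimProbe) error {
	deadline := time.NewTimer(timeout)
	defer deadline.Stop()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		ok, err := probe()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		select {
		case <-deadline.C:
			return errNotClaimed
		case <-ticker.C:
			// Poll again on the next loop iteration.
		}
	}
}

func main() {
	// Fake probe for demonstration: reports a claim on the third poll.
	polls := 0
	probe := func() (bool, error) {
		polls++
		return polls >= 3, nil
	}

	err := claimedWithin(60*time.Second, 10*time.Millisecond, probe)
	fmt.Println("polls:", polls, "err:", err)
}
```

Running the same check once against a previously active worker and once against a quiescent one would cover the consistency half of the criterion.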
Proposed fix
Auto-nudge when the sling target has a running session, regardless of the --nudge flag. Keep --nudge as an explicit override for the cold-target case (still useful for enqueueing against a target that hasn't spawned yet).
Implementation lives in the CLI layer at cmd/gc/cmd_sling.go post-DoSling dispatch (not sling.finalize), to preserve the Layer 2 → Layer 0 layering boundary and avoid threading runtime.Provider into SlingDeps. The API handler path (internal/api/handler_sling.go) remains nudge-free to avoid adding WaitIdle latency to HTTP requests.
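The proposed decision can be sketched as a small pure function evaluated after the sling dispatch returns. This is an illustration only, not the code in cmd/gc/cmd_sling.go: slingContext, its fields, and shouldNudge are hypothetical names, and how the CLI learns that the target has a running session is left as an assumption.

```go
package main

import "fmt"

// slingContext is a hypothetical summary of what the caller knows after the sling.
type slingContext struct {
	NudgeFlag     bool // value of --nudge as passed by the user
	TargetRunning bool // assumed: the target worker already has a running session
	ViaAPI        bool // request arrived through the HTTP handler, not the CLI
}

// shouldNudge reports whether a wake signal should follow the sling.
func shouldNudge(c slingContext) bool {
	if c.ViaAPI {
		// The API path stays nudge-free so HTTP requests never wait on WaitIdle.
		return false
	}
	// Running target: always nudge. Cold target: only when --nudge was given.
	return c.TargetRunning || c.NudgeFlag
}

func main() {
	cases := []slingContext{
		{TargetRunning: true},               // warm-idle worker, no flag
		{},                                  // cold target, no flag
		{NudgeFlag: true},                   // cold target, explicit --nudge
		{TargetRunning: true, ViaAPI: true}, // API request to a running worker
	}
	for _, c := range cases {
		fmt.Printf("%+v -> nudge=%v\n", c, shouldNudge(c))
	}
}
```

Keeping the decision in a function that takes plain booleans is one way to leave SlingDeps free of runtime.Provider, since the runtime lookup would happen in the CLI layer before the call.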
Environment
- Repro: local + ds-research city
- gc from main HEAD