fix(runtime-tmux,web): keep tmux session alive after agent exit #1758
Two related fixes:
1. runtime-tmux: append `exec "${SHELL:-/bin/bash}" -i` to the launch
command so the pane drops to an interactive shell when the agent
exits, instead of letting the empty pane take down the whole tmux
session. The lifecycle still detects agent termination via
`agent.isProcessRunning` and transitions the session to
`agent_process_exited` — the runtime just stays usable so the user
can run shell commands or manually re-launch the agent.
2. mux-websocket: add a `tmux has-session` guard at the top of
`pty.onExit`. When the tmux session is genuinely gone (e.g. `ao
stop` killed it out from under a still-subscribed dashboard), skip
the three doomed `attach-session` spawns introduced by the
MAX_REATTACH_ATTEMPTS bound in ComposioHQ#1640 and notify subscribers
immediately. The bound from ComposioHQ#1640 still covers transient
tmux-server hiccups where the session does still exist.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Greptile Summary
This PR fixes issue #1756 where pressing Ctrl-C on an agent inside a web terminal would destroy the entire tmux session, leaving the dashboard in a phantom "runtime lost" state. It also addresses unnecessary re-attach retries when a tmux session is genuinely gone.

Confidence Score: 5/5
Safe to merge: both changes are targeted and well-tested, with no regression risk to existing terminal lifecycle paths. The keep-alive shell tail is a minimal, well-scoped addition to the launch command that leaves all other runtime-tmux behavior unchanged. The has-session guard in mux-websocket correctly uses async/await to avoid event-loop blocking (the only concern from the prior review round, proactively addressed here), short-circuits naturally when there are no subscribers, and falls cleanly through to the existing re-attach logic when the session is alive. Test coverage for both new paths is thorough. No files require special attention.

| Filename | Overview |
|---|---|
| packages/web/server/mux-websocket.ts | onExit handler made async to await tmuxHasSession before deciding to re-attach; guard correctly skips doomed retries when the session is gone and notifies exit callbacks immediately. |
| packages/web/server/tmux-utils.ts | New async tmuxHasSession uses promisified execFile, = exact-match prefix to avoid tmux prefix ambiguity, injectable exec function for testability, and a null-tmuxPath guard. |
| packages/plugins/runtime-tmux/src/index.ts | KEEP_ALIVE_SHELL and withKeepAliveShell extracted and applied to both inline and temp-script code paths; trailing-newline stripping before append is correct. |
| packages/web/server/tests/mux-websocket.test.ts | Two new tests cover the has-session guard: session-gone path skips re-attach and fires exit callbacks; session-alive path still re-attaches. |
| packages/web/server/tests/tmux-utils.test.ts | Four unit tests cover tmuxHasSession: resolve/reject paths, exact-match prefix, and null-tmuxPath guard. |
Sequence Diagram
sequenceDiagram
participant User
participant Agent as Agent Process
participant Tmux as tmux session/pane
participant PTY as attach-session PTY
participant MW as TerminalManager.onExit
participant HS as tmuxHasSession
Note over User,HS: Ctrl-C agent (keep-alive shell active)
User->>Agent: Ctrl-C
Agent-->>Tmux: process exits
Tmux->>Tmux: exec $SHELL -i (pane stays alive)
Note over PTY: PTY stays connected (session alive)
Note over User,HS: ao stop (session destroyed)
User->>Tmux: ao stop kills session
Tmux-->>PTY: attach-session exits
PTY->>MW: onExit(exitCode)
MW->>MW: "terminal.pty = null"
MW->>HS: await tmuxHasSession(tmuxSessionId)
HS-->>MW: false (session gone)
MW->>MW: clearTimeout(resetTimer)
MW->>MW: fire exitCallbacks(exitCode)
MW-->>PTY: return (no re-attach)
Note over User,HS: transient PTY hiccup (session still alive)
PTY->>MW: onExit(exitCode)
MW->>MW: "terminal.pty = null"
MW->>HS: await tmuxHasSession(tmuxSessionId)
HS-->>MW: true (session alive)
MW->>MW: reattachAttempts++
MW->>PTY: open() new attach-session
Reviews (2): Last reviewed commit: "fix(mux): make tmuxHasSession async to a..."
Pull request overview
This PR improves the tmux-based runtime and web terminal mux so tmux sessions remain usable after the agent process exits, and so the mux server avoids futile re-attach retries when the underlying tmux session is actually gone.
Changes:
- Runtime tmux: append an interactive `$SHELL` keep-alive tail to the pane launch command (inline + temp-script paths) so the tmux session persists after agent exit.
- Web mux: add a `tmux has-session` guard to skip the re-attach loop when the tmux session no longer exists, notifying subscribers immediately instead.
- Tests + changeset updates covering the new behavior and guard logic.
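
The keep-alive tail described in the first bullet could look roughly like the sketch below. The names `KEEP_ALIVE_SHELL` and `withKeepAliveShell` come from this PR, but the function body, the newline separator, and the `my-agent --flag` command are assumptions made for illustration, not the code in `packages/plugins/runtime-tmux/src/index.ts`.

```typescript
// Sketch only: names mirror the PR, the body is an assumed reconstruction.
// Single quotes keep ${SHELL:-/bin/bash} literal so the pane's shell expands it.
const KEEP_ALIVE_SHELL = 'exec "${SHELL:-/bin/bash}" -i';

/**
 * Append the keep-alive tail to a launch command.
 * Trailing newlines are stripped first so the tail lands on exactly one
 * new line regardless of how the command was terminated.
 */
function withKeepAliveShell(command: string): string {
  const trimmed = command.replace(/\n+$/, "");
  // Assumption: a newline separates the agent command from the tail.
  return `${trimmed}\n${KEEP_ALIVE_SHELL}`;
}

// "my-agent --flag" is a hypothetical agent launch command.
console.log(withKeepAliveShell("my-agent --flag\n\n"));
```

Because the tail uses `exec`, the wrapper shell is replaced by the interactive shell rather than spawning a child, which is why the pane keeps a single process once the agent is gone.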
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| packages/plugins/runtime-tmux/src/index.ts | Appends a keep-alive interactive shell tail to the tmux pane launch command (inline + script). |
| packages/plugins/runtime-tmux/src/tests/index.test.ts | Updates/adds tests asserting the keep-alive tail is present in both launch paths. |
| packages/web/server/tmux-utils.ts | Adds tmuxHasSession() helper using exact-match =target semantics. |
| packages/web/server/mux-websocket.ts | Uses tmuxHasSession() in PTY exit handling to skip doomed re-attach retries when tmux session is gone. |
| packages/web/server/tests/tmux-utils.test.ts | Adds unit tests for tmuxHasSession() behavior and exact-match targeting. |
| packages/web/server/tests/mux-websocket.test.ts | Adds regression tests ensuring re-attach is skipped when has-session reports the session missing. |
| .changeset/keep-tmux-session-alive-after-agent-exit.md | Publishes patch releases for the tmux runtime plugin and web package with the new behavior. |
The has-session probe added in ComposioHQ#1756 ran via execFileSync inside node-pty's onExit callback, freezing every WebSocket connection, HTTP request, and in-flight terminal for up to the 5 s subprocess timeout whenever an agent exited and tmux was slow to respond. Switch tmuxHasSession to promisified execFile and await it from the onExit handler, mirroring the execFileAsync pattern in runtime-tmux.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
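
For illustration, an async probe along the lines this commit describes might look like the following. This is a sketch under assumptions: the `ExecFileAsync` type shape, the `defaultExec` helper, and the position of the optional `execFn` parameter are hypothetical, and only the ideas (promisified `execFile`, a 5 s timeout, the `=` exact-match target, the null-path guard, an injectable exec function) are taken from this PR.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

// Hypothetical adapter shape: resolves when the command exits 0, rejects otherwise.
type ExecFileAsync = (
  file: string,
  args: string[],
  options: { timeout: number },
) => Promise<unknown>;

// Default adapter: promisified execFile, so the event loop is never blocked.
const defaultExec: ExecFileAsync = (file, args, options) =>
  promisify(execFile)(file, args, options);

async function tmuxHasSession(
  tmuxPath: string | null,
  sessionName: string,
  execFn: ExecFileAsync = defaultExec, // injectable for tests (assumed position)
): Promise<boolean> {
  // No tmux binary resolved: report "gone" without spawning anything.
  if (!tmuxPath) return false;
  try {
    // The "=" prefix asks tmux for an exact session-name match,
    // so "ao-1" cannot resolve to "ao-15".
    await execFn(tmuxPath, ["has-session", "-t", `=${sessionName}`], {
      timeout: 5_000,
    });
    return true;
  } catch {
    // Non-zero exit, timeout, or spawn failure all count as "not there".
    return false;
  }
}
```

Passing a fake `execFn` lets a unit test exercise the resolve and reject paths without a real tmux server.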
i-trytoohard left a comment
Code Review — PR #1758
fix(runtime-tmux,web): keep tmux session alive after agent exit
Summary Assessment
This is a clean, well-scoped two-part fix:
- Keep-alive shell tail: appends `exec "${SHELL:-/bin/bash}" -i` after the agent launch command so the tmux pane (and session) survives agent exit. The `exec` replaces the shell wrapper, so `isProcessRunning` still correctly detects agent termination → `agent_process_exited` lifecycle transition.
- Defensive `has-session` guard: before the re-attach loop fires on PTY exit, probes `tmux has-session` (async, with `=` exact-match prefix). If the session is gone, skip the 3 doomed re-attach attempts and notify subscribers immediately.
Architecture ✅
- Single definition, two code paths: `KEEP_ALIVE_SHELL` + `withKeepAliveShell()` shared by both inline and temp-script paths. No duplication.
- `exec` not `&`: the right call. The shell replaces itself, so the pane PID stays the same. No orphan processes.
- Async probe: `tmuxHasSession` uses `promisify(execFile)` with a 5s timeout instead of `execFileSync`. The comment explains exactly why: node-pty's `onExit` fires on the main thread, and a sync probe would block the entire WebSocket server. Correct.
- `=` exact-match prefix: avoids tmux prefix matching (where `ao-1` would match `ao-15`). Already proven correct in #1714 for `attach-session`, now applied to `has-session` too.
- Injectable `execFn`: `tmuxHasSession` takes an optional exec adapter for testability. Clean pattern.
Edge Cases ✅
- `tmuxPath` is null: returns `false` immediately, no subprocess spawn. Correct.
- `subscribers.size === 0`: skips the guard and falls through to the existing re-attach logic. This is correct because with no subscribers, there's nobody to notify and the re-attach loop won't fire anyway (the `if` below also checks `subscribers.size > 0`).
- `terminal.resetTimer` cleanup: cleared before firing exit callbacks, preventing a stale timer from zeroing `reattachAttempts` after the terminal is already gone.
- `command.replace(/\n+$/, "")`: strips trailing newlines before appending the keep-alive tail. Prevents double-newline edge case. Good.
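
The branching these edge cases describe can be summarized as a small decision function. This is a hedged sketch, not the handler in `mux-websocket.ts`: `ExitState`, `ExitDecision`, and `decideOnExit` are hypothetical names, the `"stop"` outcome is an assumption about what happens when no re-attach fires, and the value 3 for `MAX_REATTACH_ATTEMPTS` is inferred from the "three doomed `attach-session` spawns" mentioned in the PR.

```typescript
// Hypothetical outcome labels for a PTY exit.
type ExitDecision = "notify-exit" | "reattach" | "stop";

// Hypothetical snapshot of the terminal state at exit time.
interface ExitState {
  subscriberCount: number;
  reattachAttempts: number;
}

// Assumed from "three doomed attach-session spawns" in the PR text.
const MAX_REATTACH_ATTEMPTS = 3;

async function decideOnExit(
  state: ExitState,
  hasSession: () => Promise<boolean>,
): Promise<ExitDecision> {
  // Guard: probe tmux only when somebody is subscribed. With no
  // subscribers there is nobody to notify, so the probe is skipped.
  if (state.subscriberCount > 0 && !(await hasSession())) {
    // Session is genuinely gone: the caller would clear the grace timer
    // and fire exit callbacks instead of retrying.
    return "notify-exit";
  }
  // Existing bounded re-attach branch: session alive, retries remaining.
  if (
    state.subscriberCount > 0 &&
    state.reattachAttempts < MAX_REATTACH_ATTEMPTS
  ) {
    return "reattach";
  }
  // No subscribers, or the retry budget is exhausted.
  return "stop";
}
```

Keeping the decision separate from the side effects (clearing `resetTimer`, firing callbacks, opening a new PTY) is one way to make each branch testable in isolation.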
Tests ✅
- 2 new regression tests for keep-alive tail (inline + temp script paths)
- 2 new tests for `tmuxHasSession` (session exists / missing)
- 2 new tests for the re-attach guard in `mux-websocket.test.ts` (session gone → skip, session alive → re-attach)
- All existing assertion updates are correct (previous expected values didn't include the keep-alive tail)
- 862 web tests pass, 31 runtime-tmux tests pass, CI all green
Minor Notes
- The `withKeepAliveShell` function is only used in `index.ts`. It could've been inlined, but extracting it with the `KEEP_ALIVE_SHELL` constant is better for readability and documentation. No issue.
- The `ExecFileAsync` type export is properly scoped: only used by `tmuxHasSession` and its tests.
Verdict: ✅ Approve
This is a textbook defensive fix. Clear problem statement, minimal surface-area change, shared constant for both code paths, async-aware implementation, solid test coverage for both the keep-alive and the guard. No issues found.
Reviewed by AO (Agent Orchestrator bot)
Summary
- Keep the tmux session alive after agent exit by dropping the pane to `$SHELL` in the workspace dir instead of tearing the session down.
- Add a `tmux has-session` guard in the mux-websocket re-attach loop so the cap from #1640 (fix(web): bound PTY re-attach loop with grace-period counter reset) doesn't spend three doomed `attach-session` spawns when the session is genuinely gone.

Why
Today, pressing Ctrl-C on the agent inside a web terminal nukes the entire tmux session (the pane's only process was the agent). The dashboard then loses the runtime, shows "Session terminated (runtime lost)", and the workspace is unrecoverable from the UI — even though the user just wanted to stop the current operation.
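After this PR:
- Ctrl-C on the agent drops the pane to `bash` (or `$SHELL`) in the workspace dir.
- The lifecycle detects the agent exit via `agent.isProcessRunning` and transitions to `agent_process_exited` (more accurate than the old `runtime_lost`).

Changes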
`packages/plugins/runtime-tmux`
- Append `exec "${SHELL:-/bin/bash}" -i` to the launch command in both code paths (inline + temp script).
- Extract `withKeepAliveShell` + `KEEP_ALIVE_SHELL` so both paths share one definition.

`packages/web/server`
- New `tmuxHasSession(tmuxPath, sessionName)` helper in `tmux-utils.ts`.
- `TerminalManager.open` `pty.onExit`: before the existing #1640 re-attach branch, check `has-session`. If the session is gone, clear the grace timer, fire `exitCallbacks(exitCode)`, and return with no retries. The bound from #1640 still covers transient hiccups where the session is alive.

Test plan
- `pnpm --filter @aoagents/ao-plugin-runtime-tmux test`: 31 pass (4 existing assertions updated, 2 new regression tests for the keep-alive tail in inline + script paths).
- `pnpm --filter @aoagents/ao-web test`: 862 pass / 4 skipped (2 new tests in `mux-websocket.test.ts` for the guard, 4 new tests for `tmuxHasSession` in `tmux-utils.test.ts`).
- `pnpm typecheck`: clean across all packages.
- `pnpm lint`: 0 errors (pre-existing warnings only).
- … `Re-attaching` / `Max re-attach attempts reached` log lines.

Related
🤖 Generated with Claude Code