This document outlines the implementation plan for Hive Phase 16, focusing on undo/redo commands, fixing sessions randomly going idle, and capping run tab output history.
The implementation is divided into 7 focused sessions, each with:
- Clear objectives
- Definition of done
- Testing criteria for verification
Phase 16 builds upon Phase 15 — all Phase 15 infrastructure is assumed to be in place.
Session 1 (Undo/Redo: Research OpenCode SDK) ── no deps
Session 2 (Undo/Redo: Backend IPC) ── blocked by Session 1
Session 3 (Undo/Redo: Frontend Integration) ── blocked by Session 2
Session 4 (Session Idle: Global Listener Fix) ── no deps
Session 5 (Session Idle: SessionView Fix) ── no deps (can parallel with S4)
Session 6 (Run Output Cap) ── no deps
Session 7 (Integration & Verification) ── blocked by Sessions 1-6
┌──────────────────────────────────────────────────────────────────────┐
│  Time →                                                              │
│                                                                      │
│  Track A: [S1: SDK Research] → [S2: IPC Backend] → [S3: Frontend]    │
│  Track B: [S4: Global Listener Fix]                                  │
│  Track C: [S5: SessionView Fix]                                      │
│  Track D: [S6: Run Output Cap]                                       │
│                                                                      │
│  All ─────────────────────────────────────────► [S7: Integration]    │
└──────────────────────────────────────────────────────────────────────┘
Maximum parallelism: Sessions 1, 4, 5, 6 are fully independent. Sessions 4 and 5 can run in parallel (they touch different files for the same bug). Session 2 depends on Session 1 (research findings). Session 3 depends on Session 2 (IPC endpoints).
Minimum total: 4 rounds:
- (S1, S4, S5, S6 in parallel)
- (S2 — after S1)
- (S3 — after S2)
- (S7)
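The four-round minimum follows directly from the dependency list: a session can start as soon as everything blocking it has finished. As a minimal sketch (the blockedBy map and planRounds helper are illustrative names, not part of the Hive codebase), the grouping can be computed like this:

```typescript
// Session dependency graph as stated in the plan (S7 waits on everything).
type SessionId = 'S1' | 'S2' | 'S3' | 'S4' | 'S5' | 'S6' | 'S7'

const blockedBy: Record<SessionId, SessionId[]> = {
  S1: [],
  S2: ['S1'],
  S3: ['S2'],
  S4: [],
  S5: [],
  S6: [],
  S7: ['S1', 'S2', 'S3', 'S4', 'S5', 'S6']
}

// Group sessions into rounds: a session joins the first round in which
// all of its blockers have already completed in an earlier round.
function planRounds(deps: Record<SessionId, SessionId[]>): SessionId[][] {
  const done = new Set<SessionId>()
  const pending = new Set<SessionId>(Object.keys(deps) as SessionId[])
  const rounds: SessionId[][] = []

  while (pending.size > 0) {
    const ready = Array.from(pending).filter((id) => deps[id].every((d) => done.has(d)))
    if (ready.length === 0) throw new Error('Dependency cycle detected')
    for (const id of ready) {
      pending.delete(id)
      done.add(id)
    }
    rounds.push(ready)
  }
  return rounds
}

const rounds = planRounds(blockedBy)
// rounds: [S1, S4, S5, S6], [S2], [S3], [S7]
console.log(rounds.map((r) => r.join(', ')))
```

The Track A chain (S1, S2, S3) is the critical path; the other tracks fit into the first round.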
Recommended serial order (if doing one at a time):
S6 → S4 → S5 → S1 → S2 → S3 → S7
Rationale: S6 is the simplest self-contained change. S4 and S5 fix the highest-impact bug (sessions randomly idling). S1-S3 are sequential for undo/redo (research → backend → frontend). S7 validates everything.
test/
├── phase-16/
│   ├── session-1/
│   │   └── opencode-undo-redo-research.test.ts
│   ├── session-2/
│   │   └── undo-redo-ipc.test.ts
│   ├── session-3/
│   │   └── undo-redo-frontend.test.tsx
│   ├── session-4/
│   │   └── global-listener-busy.test.ts
│   ├── session-5/
│   │   └── session-idle-debounce.test.ts
│   ├── session-6/
│   │   └── run-output-cap.test.ts
│   └── session-7/
│       └── integration-verification.test.ts
# No new dependencies — all features use existing packages:
# - zustand (stores — already installed)
# - lucide-react (icons — already installed)
# - sonner (toasts — already installed)
# - @opencode-ai/sdk (already installed)
- Study the opencode CLI client at <opencode-repo-path> to understand how undo/redo works
- Document the exact SDK API for undo and redo (method names, parameters, return types)
- Determine whether undo/redo triggers stream events that the renderer needs to handle
- Understand preconditions and error cases
Explore the codebase at <opencode-repo-path> to find:
- How undo/redo commands are defined and invoked
- What SDK client methods are called (client.session.undo, client.session.redo, or similar)
- What parameters are passed (session ID, directory, etc.)
- What the response shape looks like
Look at the @opencode-ai/sdk package in node_modules/@opencode-ai/sdk to find:
- The type definitions for undo/redo endpoints
- Whether these are REST calls, SSE subscriptions, or something else
- The request/response types
Create a brief summary of the API shape to guide Sessions 2 and 3. This should include:
- Exact method signatures
- Required and optional parameters
- Expected response types
- Any stream events triggered by undo/redo
- Error conditions (nothing to undo, session busy, etc.)
- <opencode-repo-path>: reference CLI client (undo/redo implementation)
- node_modules/@opencode-ai/sdk/dist/index.d.ts: SDK type definitions
- src/main/services/opencode-service.ts: existing SDK usage patterns in our codebase
- The exact SDK API for undo is documented (method, params, response)
- The exact SDK API for redo is documented (method, params, response)
- Stream event behavior after undo/redo is understood
- Error conditions are identified
- Findings are sufficient for Session 2 to implement the backend without further research
- This is a research session — verification is that the documented API matches what the SDK provides
- Cross-check documented method signatures against node_modules/@opencode-ai/sdk/dist/index.d.ts
// test/phase-16/session-1/opencode-undo-redo-research.test.ts
describe('Session 1: OpenCode SDK Undo/Redo API', () => {
test('SDK client has undo method on session namespace', () => {
// Verify the SDK type exports include session.undo
// This validates that the research correctly identified the API
})
test('SDK client has redo method on session namespace', () => {
// Verify the SDK type exports include session.redo
})
})
- Add undo() and redo() methods to opencode-service.ts that call the OpenCode SDK
- Add opencode:undo and opencode:redo IPC handlers in opencode-handlers.ts
- Add preload bridge methods and type declarations
In src/main/services/opencode-service.ts, add two new methods. The exact SDK API calls should match the findings from Session 1:
async undo(worktreePath: string, sessionId: string): Promise<{ success: boolean }> {
const instance = await this.getOrCreateInstance()
// Use the exact SDK method identified in Session 1 research
await instance.client.session.undo(/* params from research */)
return { success: true }
}
async redo(worktreePath: string, sessionId: string): Promise<{ success: boolean }> {
const instance = await this.getOrCreateInstance()
await instance.client.session.redo(/* params from research */)
return { success: true }
}
In src/main/ipc/opencode-handlers.ts, add two new handlers:
ipcMain.handle('opencode:undo', async (_event, { worktreePath, sessionId }) => {
try {
const result = await openCodeService.undo(worktreePath, sessionId)
return { success: true, ...result }
} catch (error) {
return {
success: false,
error: error instanceof Error ? error.message : String(error)
}
}
})
ipcMain.handle('opencode:redo', async (_event, { worktreePath, sessionId }) => {
try {
const result = await openCodeService.redo(worktreePath, sessionId)
return { success: true, ...result }
} catch (error) {
return {
success: false,
error: error instanceof Error ? error.message : String(error)
}
}
})
In src/preload/index.ts, inside the opencodeOps namespace:
undo: (worktreePath: string, sessionId: string) =>
ipcRenderer.invoke('opencode:undo', { worktreePath, sessionId }),
redo: (worktreePath: string, sessionId: string) =>
ipcRenderer.invoke('opencode:redo', { worktreePath, sessionId })
In src/preload/index.d.ts, inside the opencodeOps interface:
undo(worktreePath: string, sessionId: string): Promise<{ success: boolean; error?: string }>
redo(worktreePath: string, sessionId: string): Promise<{ success: boolean; error?: string }>
- src/main/services/opencode-service.ts: add undo() and redo() methods
- src/main/ipc/opencode-handlers.ts: add opencode:undo and opencode:redo handlers
- src/preload/index.ts: expose in opencodeOps namespace
- src/preload/index.d.ts: type declarations
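The two IPC handlers share the same try/catch shape, so the error normalization could be factored out. The following is only a sketch: toIpcFailure and runIpc are hypothetical helper names, not existing functions in the codebase, and the plan's handlers work as written without them.

```typescript
type IpcSuccess = { success: true }
type IpcFailure = { success: false; error: string }
type IpcResult = IpcSuccess | IpcFailure

// Normalize any thrown value into the failure shape used by the handlers.
function toIpcFailure(error: unknown): IpcFailure {
  return {
    success: false,
    error: error instanceof Error ? error.message : String(error)
  }
}

// Hypothetical wrapper: run an operation and always resolve with a result
// object, so the renderer never sees a rejected invoke() promise.
async function runIpc(op: () => Promise<unknown>): Promise<IpcResult> {
  try {
    await op()
    return { success: true }
  } catch (error) {
    return toIpcFailure(error)
  }
}

// Example of the normalization on its own.
const failure = toIpcFailure(new Error('Nothing to undo'))
console.log(failure)
```

With such a helper, each handler body would reduce to a single call along the lines of runIpc(() => openCodeService.undo(worktreePath, sessionId)).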
- opencode-service.ts has undo() and redo() methods that call the SDK
- opencode-handlers.ts has opencode:undo and opencode:redo handlers with error handling
- preload/index.ts exposes undo and redo in opencodeOps
- preload/index.d.ts has type declarations for both methods
- Error cases return { success: false, error: string }
- pnpm lint passes
- pnpm test passes
- Verify the IPC handlers are registered by checking no duplicate channel errors on startup
- Call window.opencodeOps.undo() from the dev console with a valid session and verify it returns { success: true } or a meaningful error
- Call window.opencodeOps.redo() similarly
// test/phase-16/session-2/undo-redo-ipc.test.ts
describe('Session 2: Undo/Redo IPC', () => {
test('undo handler calls openCodeService.undo with correct params', async () => {
// Mock openCodeService.undo to return { success: true }
// Invoke the handler with { worktreePath: '/path', sessionId: 'sess-1' }
// Verify openCodeService.undo called with '/path' and 'sess-1'
// Verify result is { success: true }
})
test('undo handler returns error on failure', async () => {
// Mock openCodeService.undo to throw Error('Nothing to undo')
// Invoke handler
// Verify result is { success: false, error: 'Nothing to undo' }
})
test('redo handler calls openCodeService.redo with correct params', async () => {
// Similar to undo test
})
test('redo handler returns error on failure', async () => {
// Similar to undo error test
})
})
- Define built-in /undo and /redo commands that appear in the slash command popover
- Route these commands to the dedicated IPC endpoints (not through the SDK command API)
- Reload messages from database after successful undo/redo
- Show toast notifications for success/failure
In src/renderer/src/components/sessions/SessionView.tsx, add a static array of built-in commands above the component:
const BUILT_IN_COMMANDS = [
{
name: 'undo',
description: 'Undo the last assistant response',
template: '/undo',
builtIn: true as const
},
{
name: 'redo',
description: 'Redo the last undone response',
template: '/redo',
builtIn: true as const
}
]
Update the commands prop passed to SlashCommandPopover to include built-in commands at the top:
const allCommands = useMemo(() => [...BUILT_IN_COMMANDS, ...slashCommands], [slashCommands])
Pass allCommands instead of slashCommands to SlashCommandPopover.
At the top of the slash command detection block in handleSend (before the existing SDK command matching), add built-in command handling:
if (trimmedValue.startsWith('/')) {
const spaceIndex = trimmedValue.indexOf(' ')
const commandName = spaceIndex > 0 ? trimmedValue.slice(1, spaceIndex) : trimmedValue.slice(1)
// Built-in commands — handled locally, no user message created
if (commandName === 'undo' || commandName === 'redo') {
setInputValue('')
setShowSlashCommands(false)
try {
const result =
commandName === 'undo'
? await window.opencodeOps.undo(worktreePath, opencodeSessionId)
: await window.opencodeOps.redo(worktreePath, opencodeSessionId)
if (result.success) {
await loadMessagesFromDatabase()
toast.success(commandName === 'undo' ? 'Undone' : 'Redone')
} else {
toast.error(result.error || `Nothing to ${commandName}`)
}
} catch {
toast.error(`${commandName === 'undo' ? 'Undo' : 'Redo'} failed`)
}
return
}
// ... existing SDK command matching below ...
}
Key behavior differences from SDK commands:
- Built-in commands do NOT save a user message to the database
- They do NOT display a user message in the chat
- They clear the input and execute immediately
- On success, messages are reloaded from DB to reflect the undone/redone state
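The routing decision in handleSend can be isolated as a pure function, which makes the built-in versus SDK split easy to unit test. This is a sketch under assumptions: parseSlashInput and the ParsedInput type are hypothetical names introduced here, and the parsing mirrors the indexOf/slice logic shown above.

```typescript
const BUILT_IN_NAMES: readonly string[] = ['undo', 'redo']

type ParsedInput =
  | { kind: 'message' }
  | { kind: 'built-in'; name: string }
  | { kind: 'sdk'; name: string; args: string }

// Classify raw input: plain message, built-in command (handled locally),
// or any other slash command (forwarded to the SDK command API).
function parseSlashInput(raw: string): ParsedInput {
  const trimmed = raw.trim()
  if (!trimmed.startsWith('/')) return { kind: 'message' }

  const spaceIndex = trimmed.indexOf(' ')
  const name = spaceIndex > 0 ? trimmed.slice(1, spaceIndex) : trimmed.slice(1)
  const args = spaceIndex > 0 ? trimmed.slice(spaceIndex + 1).trim() : ''

  if (BUILT_IN_NAMES.indexOf(name) !== -1) {
    return { kind: 'built-in', name }
  }
  return { kind: 'sdk', name, args }
}

console.log(parseSlashInput('/undo'))
console.log(parseSlashInput('/compact now'))
console.log(parseSlashInput('hello'))
```

Keeping the classification separate from the side effects (IPC call, toast, reload) means the "no user message for built-ins" rule depends on a single branch.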
In src/renderer/src/components/sessions/SlashCommandPopover.tsx, optionally add a subtle badge or styling for built-in commands to distinguish them from SDK commands. This is low priority and can be skipped if time is tight.
- src/renderer/src/components/sessions/SessionView.tsx: BUILT_IN_COMMANDS, merge with SDK commands, route in handleSend
- src/renderer/src/components/sessions/SlashCommandPopover.tsx: (optional) visual distinction
- Typing / shows /undo and /redo at the top of the command popover
- Selecting /undo from the popover fills the input with /undo
- Pressing Enter with /undo in the input calls window.opencodeOps.undo(), NOT window.opencodeOps.command()
- No user message is created in the chat for /undo or /redo
- On success, messages are reloaded from the database and a toast shows "Undone" / "Redone"
- On failure, an error toast shows the error message
- The input is cleared after execution regardless of success/failure
- /undo and /redo are filterable (typing /un shows /undo, typing /re shows /redo)
- Regular SDK slash commands still work as before
- pnpm lint passes
- pnpm test passes
- Open a session and send a message, wait for the response
- Type /undo and press Enter: verify the last assistant response is removed and the toast shows "Undone"
- Type /redo and press Enter: verify the response is restored and the toast shows "Redone"
- Type /un: verify the popover filters to show /undo
- Click /undo in the popover: verify the input becomes /undo, then press Enter
- Try /undo when there's nothing to undo: verify an error toast appears
- Use a regular SDK command like /compact: verify it still works through the SDK endpoint
- Verify no user message bubble appears in the chat for /undo or /redo
// test/phase-16/session-3/undo-redo-frontend.test.tsx
describe('Session 3: Undo/Redo Frontend', () => {
test('BUILT_IN_COMMANDS are merged with SDK commands', () => {
// Mock slashCommands with [{ name: 'compact', ... }]
// Verify allCommands contains undo, redo, and compact
// Verify undo and redo appear before compact
})
test('/undo calls window.opencodeOps.undo, not command', async () => {
const undoMock = vi.fn().mockResolvedValue({ success: true })
const commandMock = vi.fn()
// Mock window.opencodeOps.undo = undoMock
// Mock window.opencodeOps.command = commandMock
// Simulate handleSend with '/undo'
// Verify undoMock called, commandMock NOT called
})
test('/undo does not create a user message', async () => {
// Mock window.opencodeOps.undo to return { success: true }
// Mock window.db.message.create
// Simulate handleSend with '/undo'
// Verify window.db.message.create NOT called
})
test('/undo reloads messages on success', async () => {
// Mock window.opencodeOps.undo to return { success: true }
// Spy on loadMessagesFromDatabase
// Simulate handleSend with '/undo'
// Verify loadMessagesFromDatabase was called
})
test('/undo shows error toast on failure', async () => {
// Mock window.opencodeOps.undo to return { success: false, error: 'Nothing to undo' }
// Simulate handleSend with '/undo'
// Verify toast.error called with 'Nothing to undo'
})
test('/redo calls window.opencodeOps.redo', async () => {
// Similar to /undo test but for redo
})
test('unknown slash command still routes to SDK command API', async () => {
// Mock slashCommands with [{ name: 'compact', ... }]
// Simulate handleSend with '/compact'
// Verify window.opencodeOps.command called (not undo/redo)
})
})
- Handle session.status busy events in the global listener for background sessions
- Ensure background sessions transition from 'unread' back to 'working'/'planning' when they become busy again
- This addresses Gap 5 from the PRD (no busy-state tracking for background sessions)
In src/renderer/src/hooks/useOpenCodeGlobalListener.ts, the current code (around line 97) has:
if (status?.type !== 'idle') return
This ignores ALL non-idle statuses for background sessions. Change to explicitly handle busy:
if (status?.type === 'busy') {
// Background session became busy again — restore working/planning status
if (sessionId !== activeId) {
const currentMode = useSessionStore.getState().getSessionMode(sessionId)
useWorktreeStatusStore
.getState()
.setSessionStatus(sessionId, currentMode === 'plan' ? 'planning' : 'working')
}
return
}
if (status?.type !== 'idle') return
// ... existing idle handling unchanged below
This ensures that when a background session transitions from idle to busy (e.g., processing queued messages, continuing multi-step work, or resuming after a brief pause), the worktree sidebar correctly shows "Working" or "Planning" instead of staying at "Ready" or "Unread".
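The status decision for the global listener can be expressed as a small pure function. This is a sketch only: resolveBackgroundStatus is a hypothetical name, the mode values 'build' and 'plan' are taken from the test outline for this session, and the real idle branch may do more than set 'unread'.

```typescript
type SessionMode = 'build' | 'plan'
type SidebarStatus = 'working' | 'planning' | 'unread'

// Returns the sidebar status to set for a session.status event,
// or null when the global listener should leave the status untouched.
function resolveBackgroundStatus(
  statusType: string,
  mode: SessionMode,
  isActiveSession: boolean
): SidebarStatus | null {
  // The active session is handled by SessionView, not the global listener.
  if (isActiveSession) return null
  if (statusType === 'busy') return mode === 'plan' ? 'planning' : 'working'
  if (statusType === 'idle') return 'unread'
  // Any other status type is ignored, matching the existing early return.
  return null
}

console.log(resolveBackgroundStatus('busy', 'build', false)) // working
console.log(resolveBackgroundStatus('busy', 'plan', false)) // planning
console.log(resolveBackgroundStatus('busy', 'build', true)) // null
console.log(resolveBackgroundStatus('idle', 'build', false)) // unread
```

The four calls correspond one-to-one to the four test cases listed for this session.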
- src/renderer/src/hooks/useOpenCodeGlobalListener.ts: handle session.status busy for background sessions
- Background sessions transitioning from idle to busy show "Working" or "Planning" in the sidebar
- The mode-aware status ('working' vs 'planning') is correctly derived from getSessionMode
- Active session events are still skipped (handled by SessionView)
- Existing idle handling is unaffected
- pnpm lint passes
- pnpm test passes
- Open two worktrees, start a session in each
- Send a message in Worktree A, switch to Worktree B
- When Worktree A's session completes (shows "Unread"), queue another message
- Verify Worktree A's sidebar shows "Working" (not stuck at "Unread" or "Ready")
- When the queued message completes, verify it transitions back to "Unread"
// test/phase-16/session-4/global-listener-busy.test.ts
describe('Session 4: Global Listener Busy Handling', () => {
test('session.status busy sets working status for background session', () => {
// Set activeSessionId to 'session-A'
// Mock getSessionMode('session-B') to return 'build'
// Fire onStream with { type: 'session.status', sessionId: 'session-B', statusPayload: { type: 'busy' } }
// Verify setSessionStatus called with ('session-B', 'working')
})
test('session.status busy sets planning status for plan-mode background session', () => {
// Set activeSessionId to 'session-A'
// Mock getSessionMode('session-B') to return 'plan'
// Fire onStream with session.status busy for session-B
// Verify setSessionStatus called with ('session-B', 'planning')
})
test('session.status busy is ignored for the active session', () => {
// Set activeSessionId to 'session-A'
// Fire onStream with session.status busy for session-A
// Verify setSessionStatus NOT called (active session handled by SessionView)
})
test('session.status idle still sets unread for background session', () => {
// Set activeSessionId to 'session-A'
// Fire onStream with session.status idle for session-B
// Verify setSessionStatus called with ('session-B', 'unread')
})
})
- Debounce session.status idle finalization in SessionView to handle rapid idle-busy-idle SDK transitions
- Guard the session.idle (deprecated) fallback against firing while actively streaming
- Add diagnostic logging for status transitions
- This addresses Gaps 1, 2, 3 from the PRD
In src/renderer/src/components/sessions/SessionView.tsx, add an idleTimerRef and change the session.status idle handler from immediate to debounced:
const idleTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null)
In the session.status handler, replace the immediate idle handling:
Current code (conceptual):
if (status.type === 'idle') {
immediateFlush()
setIsSending(false)
// ... finalize immediately
}
New code:
if (status.type === 'busy') {
// Cancel any pending idle finalization
if (idleTimerRef.current) {
clearTimeout(idleTimerRef.current)
idleTimerRef.current = null
}
setIsStreaming(true)
newPromptPendingRef.current = false
return
}
if (status.type === 'idle') {
// Debounce: wait 300ms before finalizing, in case busy arrives immediately after
if (idleTimerRef.current) clearTimeout(idleTimerRef.current)
idleTimerRef.current = setTimeout(() => {
idleTimerRef.current = null
immediateFlush()
setIsSending(false)
setQueuedMessages([])
if (!hasFinalizedCurrentResponseRef.current) {
hasFinalizedCurrentResponseRef.current = true
void finalizeResponseFromDatabase()
}
const activeId = useSessionStore.getState().activeSessionId
const statusStore = useWorktreeStatusStore.getState()
if (activeId === sessionId) {
statusStore.clearSessionStatus(sessionId)
} else {
statusStore.setSessionStatus(sessionId, 'unread')
}
}, 300)
return
}
In the cleanup function of the stream subscription effect:
return () => {
unsubscribe()
if (idleTimerRef.current) {
clearTimeout(idleTimerRef.current)
idleTimerRef.current = null
}
}
In the session.idle handler, add a guard against firing while streaming is active:
} else if (event.type === 'session.idle') {
if (event.childSessionId) {
// ... existing child session handling — unchanged ...
return
}
// Guard: if we are actively streaming and haven't finalized yet,
// defer to session.status for authoritative finalization
if (isStreaming && !hasFinalizedCurrentResponseRef.current) {
console.warn(
`[SessionView] session.idle received while streaming (session=${sessionId}). Deferring to session.status.`
)
return
}
// ... existing fallback finalization ...
}
Add console.debug calls to status transition handlers to help diagnose any remaining issues:
// In session.status handler:
console.debug(`[SessionView] session.status`, {
type: status.type,
sessionId,
isStreaming,
hasFinalizedRef: hasFinalizedCurrentResponseRef.current,
isSending
})
- src/renderer/src/components/sessions/SessionView.tsx: debounced finalization, session.idle guard, diagnostic logging
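The debounce behavior (idle schedules finalization, busy cancels it, unmount disposes it) can be modelled outside React, which makes the timing rules testable without fake timers. This is a sketch under assumptions: IdleDebouncer, Scheduler and ManualScheduler are hypothetical names introduced for illustration, and the real component keeps this state in idleTimerRef as described above.

```typescript
// Injectable timer interface so the logic can run against a manual clock.
interface Scheduler<H> {
  set(fn: () => void, ms: number): H
  clear(handle: H): void
}

class IdleDebouncer<H> {
  private pending: H | null = null

  constructor(
    private readonly onFinalize: () => void,
    private readonly delayMs: number,
    private readonly scheduler: Scheduler<H>
  ) {}

  // idle: (re)start the countdown to finalization.
  onIdle(): void {
    this.cancel()
    this.pending = this.scheduler.set(() => {
      this.pending = null
      this.onFinalize()
    }, this.delayMs)
  }

  // busy: the session resumed, so drop any pending finalization.
  onBusy(): void {
    this.cancel()
  }

  // unmount: same as cancel, mirrors the effect cleanup.
  dispose(): void {
    this.cancel()
  }

  private cancel(): void {
    if (this.pending !== null) {
      this.scheduler.clear(this.pending)
      this.pending = null
    }
  }
}

// Deterministic clock for demonstration; time only moves via advance().
class ManualScheduler implements Scheduler<number> {
  private now = 0
  private nextId = 1
  private timers: { id: number; at: number; fn: () => void }[] = []

  set(fn: () => void, ms: number): number {
    const id = this.nextId++
    this.timers.push({ id, at: this.now + ms, fn })
    return id
  }

  clear(id: number): void {
    this.timers = this.timers.filter((t) => t.id !== id)
  }

  advance(ms: number): void {
    this.now += ms
    const due = this.timers.filter((t) => t.at <= this.now)
    this.timers = this.timers.filter((t) => t.at > this.now)
    due.forEach((t) => t.fn())
  }
}

const clock = new ManualScheduler()
let finalizedCount = 0
const debouncer = new IdleDebouncer(() => { finalizedCount++ }, 300, clock)

debouncer.onIdle() // idle at t=0
clock.advance(100)
debouncer.onBusy() // busy at t=100 cancels the pending finalization
clock.advance(400) // t=500
const countAfterCancel = finalizedCount
debouncer.onIdle() // idle again at t=500
clock.advance(300) // t=800, finalization fires once
console.log(countAfterCancel, finalizedCount)
```

In the component, the scheduler would simply wrap setTimeout and clearTimeout; the manual clock exists only so the idle-busy-idle sequence can be stepped through deterministically.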
- session.status idle finalization is debounced by 300ms
- session.status busy cancels any pending idle debounce timer
- session.idle fallback does NOT finalize if isStreaming is true (defers to session.status)
- The idle timer is cleaned up on component unmount
- Diagnostic logging is present for session.status transitions
- Sessions that briefly go idle between tool calls do NOT appear to finish
- Sessions that genuinely complete still finalize correctly (after 300ms)
- pnpm lint passes
- pnpm test passes
- Start a session that uses multiple tool calls in sequence
- Observe that the session stays in "Working" state throughout (no brief flicker to "Ready")
- When the session genuinely completes, verify it transitions to "Ready" within ~300ms
- Check browser console for [SessionView] session.status debug logs during streaming
- Start a session and switch tabs during streaming, then switch back and verify it's still showing as streaming
// test/phase-16/session-5/session-idle-debounce.test.ts
describe('Session 5: SessionView Idle Debounce', () => {
test('session.status idle does not finalize immediately', () => {
// Fire session.status { type: 'idle' }
// Verify finalizeResponseFromDatabase NOT called immediately
// Advance timers by 200ms
// Verify still NOT called
// Advance timers to 300ms total
// Verify NOW called
})
test('session.status busy cancels pending idle timer', () => {
// Fire session.status { type: 'idle' }
// Fire session.status { type: 'busy' } at +100ms
// Advance timers to 500ms
// Verify finalizeResponseFromDatabase NEVER called
})
test('rapid idle-busy-idle only finalizes on the last idle', () => {
// Fire idle → busy → idle sequence
// Only the last idle should trigger finalization after 300ms
})
test('session.idle is ignored while streaming', () => {
// Set isStreaming = true, hasFinalizedCurrentResponseRef = false
// Fire session.idle (without childSessionId)
// Verify finalizeResponseFromDatabase NOT called
// Verify console.warn logged
})
test('session.idle with childSessionId still updates subtask', () => {
// Fire session.idle with childSessionId = 'child-1'
// Verify subtask card status is updated (existing behavior preserved)
// Verify parent session is NOT finalized
})
test('idle timer is cleared on unmount', () => {
// Fire session.status idle
// Unmount component before 300ms
// Verify no finalization occurs after 300ms
})
})
- Cap the run tab output buffer at 500,000 characters to prevent unbounded memory growth
- Use a circular ring buffer so appends and evictions are amortized O(1), with no array copying or shifting
- Show a truncation marker in the UI when old output has been evicted
The current implementation uses [...existing.runOutput, line] on every append — a full O(n) array copy. When trimming, it additionally requires slice() — another O(n). For long-running dev servers producing thousands of chunks, this creates significant GC pressure and CPU waste.
A ring buffer solves this by:
- O(1) append: write at the current head position, advance head
- O(1) eviction: advance tail to discard oldest entries, no shifting
- No array copying: the backing array is pre-allocated and mutated in place
- O(n) read only when rendering: toArray() produces the ordered snapshot only when React needs it
Ring Buffer Visualization:
Capacity: 8 slots (in reality: 50,000)
Initial state (3 entries):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ B │ C │   │   │   │   │   │
└───┴───┴───┴───┴───┴───┴───┴───┘
tail=0, head=3
After wrapping + eviction (char limit hit):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ I │   │   │ D │ E │ F │ G │ H │ ← I overwrites slot 0
└───┴───┴───┴───┴───┴───┴───┴───┘
tail=3, head=1
toArray() reads: [D, E, F, G, H, I] (tail → head, wrapping)
A, B, C were evicted, so truncated=true
The key design decision: the ring buffer lives outside Zustand as a module-level mutable data structure. Zustand only stores a runOutputVersion counter that increments on each append, triggering React re-renders. This avoids fighting Zustand's immutability model while keeping appends truly O(1).
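The version-counter idea can be shown without Zustand or React. In this sketch, VersionedLog is a hypothetical stand-in for the ring buffer plus runOutputVersion (it is not code from the plan): appends mutate in place and bump a counter, and the ordered snapshot is rebuilt only when the counter has changed since the last read, which is the role a useMemo keyed on the version plays in the UI.

```typescript
class VersionedLog {
  private lines: string[] = []
  private version = 0
  private snapshotVersion = -1
  private snapshot: string[] = []
  // Exposed only so the example can show how often a copy is made.
  public rebuilds = 0

  // In-place mutation; the only "state change" observers see is the counter.
  append(line: string): void {
    this.lines.push(line)
    this.version++
  }

  getVersion(): number {
    return this.version
  }

  // Copy the data only if something was appended since the last read.
  read(): string[] {
    if (this.snapshotVersion !== this.version) {
      this.snapshot = this.lines.slice()
      this.snapshotVersion = this.version
      this.rebuilds++
    }
    return this.snapshot
  }
}

const log = new VersionedLog()
log.append('a')
log.append('b')
const firstRead = log.read()
const secondRead = log.read() // same version, so the same array is returned
log.append('c')
const thirdRead = log.read() // version changed, so a new snapshot is built
console.log(firstRead === secondRead, thirdRead, log.rebuilds)
```

Two appends followed by one read cost a single O(n) copy rather than two, which is the saving the design is after when output arrives faster than the UI renders.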
Create a new file src/renderer/src/lib/output-ring-buffer.ts:
const MAX_CHARS = 500_000
const BUFFER_CAPACITY = 50_000 // max entries (50K * ~10 chars avg = 500K)
const TRUNCATION_MARKER = '\x00TRUNC:[older output truncated]'
export class OutputRingBuffer {
private chunks: (string | null)[]
private head: number = 0 // next write position
private tail: number = 0 // oldest valid entry position
private _count: number = 0 // number of valid entries
private _totalChars: number = 0
private _truncated: boolean = false
constructor(private capacity: number = BUFFER_CAPACITY) {
this.chunks = new Array(capacity).fill(null)
}
append(chunk: string): void {
// If buffer is full (by entry count), evict oldest
if (this._count === this.capacity) {
this.evictOldest()
}
// Write at head
this.chunks[this.head] = chunk
this._totalChars += chunk.length
this._count++
this.head = (this.head + 1) % this.capacity
// Evict oldest entries until under character limit
while (this._totalChars > MAX_CHARS && this._count > 1) {
this.evictOldest()
}
}
private evictOldest(): void {
const evicted = this.chunks[this.tail]
if (evicted !== null) {
this._totalChars -= evicted.length
this.chunks[this.tail] = null
}
this.tail = (this.tail + 1) % this.capacity
this._count--
this._truncated = true
}
/**
* Produce an ordered array for rendering.
* Called only when React needs to render — not on every append.
*/
toArray(): string[] {
const result: string[] = []
if (this._truncated) {
result.push(TRUNCATION_MARKER)
}
for (let i = 0; i < this._count; i++) {
const chunk = this.chunks[(this.tail + i) % this.capacity]
if (chunk !== null) result.push(chunk)
}
return result
}
clear(): void {
this.chunks.fill(null)
this.head = 0
this.tail = 0
this._count = 0
this._totalChars = 0
this._truncated = false
}
get totalChars(): number {
return this._totalChars
}
get count(): number {
return this._count
}
get truncated(): boolean {
return this._truncated
}
}
// Module-level buffer registry — one per worktree
const buffers = new Map<string, OutputRingBuffer>()
export function getOrCreateBuffer(worktreeId: string): OutputRingBuffer {
let buf = buffers.get(worktreeId)
if (!buf) {
buf = new OutputRingBuffer()
buffers.set(worktreeId, buf)
}
return buf
}
export function deleteBuffer(worktreeId: string): void {
buffers.delete(worktreeId)
}
export { TRUNCATION_MARKER }
Why capacity = 50,000:
- At average chunk size of ~10 chars, 50K entries * 10 = 500K chars — matches the char limit
- For larger chunks (100+ chars), the 500K char limit is the binding constraint (evicts before capacity is reached)
- For very small chunks (1-2 chars), 50K entries is still ample history
- Memory: 50K pointers = ~400KB overhead — negligible
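The interplay between the two limits can be checked with simple arithmetic. The sketch below is illustrative only (retainedEntries is a hypothetical helper, and it assumes uniformly sized chunks, which real output will not have): it reports how many entries survive and which limit is the binding one for a given average chunk size.

```typescript
const MAX_CHARS = 500_000
const BUFFER_CAPACITY = 50_000

// For uniformly sized chunks, estimate how many entries the buffer keeps
// and whether the character cap or the entry capacity evicts first.
function retainedEntries(avgChunkChars: number): {
  entries: number
  binding: 'chars' | 'capacity'
} {
  const byChars = Math.floor(MAX_CHARS / avgChunkChars)
  return byChars < BUFFER_CAPACITY
    ? { entries: byChars, binding: 'chars' }
    : { entries: BUFFER_CAPACITY, binding: 'capacity' }
}

console.log(retainedEntries(100)) // { entries: 5000, binding: 'chars' }
console.log(retainedEntries(10)) // { entries: 50000, binding: 'capacity' }
console.log(retainedEntries(2)) // { entries: 50000, binding: 'capacity' }
```

At 2-character chunks the buffer holds only about 100,000 characters, so the entry capacity, not the character cap, determines how much history is kept for very fine-grained output.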
In src/renderer/src/stores/useScriptStore.ts:
Replace the ScriptState interface:
interface ScriptState {
setupOutput: string[]
setupRunning: boolean
setupError: string | null
runOutputVersion: number // replaces runOutput: string[]
runRunning: boolean
runPid: number | null
}
function createDefaultScriptState(): ScriptState {
return {
setupOutput: [],
setupRunning: false,
setupError: null,
runOutputVersion: 0, // replaces runOutput: []
runRunning: false,
runPid: null
}
}
Replace appendRunOutput:
import { getOrCreateBuffer } from '@/lib/output-ring-buffer'
appendRunOutput: (worktreeId, line) => {
// O(1) mutation — no array copying
const buffer = getOrCreateBuffer(worktreeId)
buffer.append(line)
// Bump version to trigger React re-render
set((state) => {
const existing = state.scriptStates[worktreeId] || createDefaultScriptState()
return {
scriptStates: {
...state.scriptStates,
[worktreeId]: {
...existing,
runOutputVersion: existing.runOutputVersion + 1
}
}
}
})
}
Replace clearRunOutput:
clearRunOutput: (worktreeId) => {
const buffer = getOrCreateBuffer(worktreeId)
buffer.clear()
set((state) => {
const existing = state.scriptStates[worktreeId] || createDefaultScriptState()
return {
scriptStates: {
...state.scriptStates,
[worktreeId]: {
...existing,
runOutputVersion: existing.runOutputVersion + 1
}
}
}
})
}
Add a new getter for consumers that need the array:
getRunOutput: (worktreeId: string): string[] => {
const buffer = getOrCreateBuffer(worktreeId)
return buffer.toArray()
}
Add getRunOutput to the ScriptStore interface as well.
In src/renderer/src/components/layout/RunTab.tsx:
Replace the runOutput subscription:
import { getOrCreateBuffer, TRUNCATION_MARKER } from '@/lib/output-ring-buffer'
// Subscribe to version counter (triggers re-render on each append)
const runOutputVersion = useScriptStore((s) =>
worktreeId ? (s.scriptStates[worktreeId]?.runOutputVersion ?? 0) : 0
)
// Produce the ordered array only when version changes
const runOutput = useMemo(() => {
if (!worktreeId) return emptyOutput
return getOrCreateBuffer(worktreeId).toArray()
}, [worktreeId, runOutputVersion])
Add truncation marker rendering in the runOutput.map() block. Place it BEFORE the existing \x00CMD: and \x00ERR: checks:
if (line.startsWith('\x00TRUNC:')) {
const msg = line.slice(7)
return (
<div
key={i}
className="text-muted-foreground text-center text-[10px] py-1 border-b border-border/50"
>
{msg}
</div>
)
}
Update the auto-scroll dependency:
// Auto-scroll to bottom on new output
useEffect(() => {
if (outputRef.current) {
outputRef.current.scrollTop = outputRef.current.scrollHeight
}
}, [runOutputVersion]) // was: [runOutput]
Update empty/length checks to read from the memoized runOutput array instead of the removed store field:
const hasOutput = runOutput.length > 0
The rest of the rendering logic (.map(), status bar) remains the same since runOutput is still a string[].
In src/renderer/src/components/layout/BottomPanel.tsx, the URL detection reads scriptState.runOutput (lines 31-32). Update to use the store getter:
// Replace:
// if (!scriptState?.runRunning || !scriptState.runOutput?.length) return null
// return extractDevServerUrl(scriptState.runOutput)
// With:
const runOutput = useScriptStore.getState().getRunOutput(worktreeId)
if (!scriptState?.runRunning || !runOutput.length) return null
return extractDevServerUrl(runOutput)
Because getState() reads do not subscribe to store updates, an alternative is to use the runOutputVersion selector to drive a useMemo, similar to RunTab.
- src/renderer/src/lib/output-ring-buffer.ts: new file with the OutputRingBuffer class and module-level registry
- src/renderer/src/stores/useScriptStore.ts: replace runOutput: string[] with runOutputVersion: number, use ring buffer
- src/renderer/src/components/layout/RunTab.tsx: read from ring buffer via version selector, render truncation marker
- src/renderer/src/components/layout/BottomPanel.tsx: update URL detection to use getRunOutput()
- OutputRingBuffer.append() is amortized O(1), with no array copying or shifting
- OutputRingBuffer.toArray() produces the correct ordered output
- Old entries are evicted when total characters exceed 500,000
- Old entries are evicted when entry count exceeds 50,000 (capacity)
- A [older output truncated] marker appears at the top when entries have been evicted
- The truncation marker renders as a centered, muted, small text line in the Run tab
- clear() resets the buffer completely (no stale data)
- "Open in Chrome" URL detection still works (reads from getRunOutput())
- Output under limits is unaffected (no premature eviction)
- pnpm lint passes
- pnpm test passes
- Start a dev server that produces continuous output (e.g., a watch mode build)
- Let it run for several minutes
- Verify the Run tab does not grow infinitely — scroll to the top and verify the truncation marker appears
- Verify recent output is still visible and correctly rendered
- Verify ANSI color codes still render correctly after eviction
- Stop and restart the dev server, then verify clear resets everything
- If "Open in Chrome" is implemented, verify URL detection still works after the buffer wraps
// test/phase-16/session-6/run-output-cap.test.ts
import { OutputRingBuffer } from '@/lib/output-ring-buffer'
import { useScriptStore } from '@/stores/useScriptStore'
describe('OutputRingBuffer', () => {
test('append and toArray preserve order', () => {
const buf = new OutputRingBuffer(8)
buf.append('A')
buf.append('B')
buf.append('C')
expect(buf.toArray()).toEqual(['A', 'B', 'C'])
})
test('evicts oldest when char limit exceeded', () => {
// High entry capacity so the character limit, not the entry count, is the binding constraint
const buf = new OutputRingBuffer(100)
const bigChunk = 'x'.repeat(200_000)
buf.append(bigChunk)
buf.append(bigChunk)
buf.append(bigChunk) // total = 600K, limit = 500K
expect(buf.totalChars).toBeLessThanOrEqual(500_000)
expect(buf.truncated).toBe(true)
const arr = buf.toArray()
// First entry should be the truncation marker
expect(arr[0]).toMatch(/truncated/)
})
test('evicts oldest when capacity exceeded', () => {
const buf = new OutputRingBuffer(4) // small capacity
buf.append('A')
buf.append('B')
buf.append('C')
buf.append('D')
buf.append('E') // capacity exceeded, A evicted
expect(buf.count).toBe(4)
expect(buf.truncated).toBe(true)
const arr = buf.toArray()
// Truncation marker + B, C, D, E
expect(arr).toContain('B')
expect(arr).toContain('E')
expect(arr).not.toContain('A')
})
test('wraps around correctly', () => {
const buf = new OutputRingBuffer(4)
buf.append('A')
buf.append('B')
buf.append('C')
buf.append('D')
buf.append('E') // wraps: head=1, tail=1, [E, B, C, D]
buf.append('F') // wraps: head=2, tail=2, [E, F, C, D]
const arr = buf.toArray()
const dataEntries = arr.filter((s) => !s.startsWith('\x00'))
expect(dataEntries).toEqual(['C', 'D', 'E', 'F'])
})
test('clear resets all state', () => {
const buf = new OutputRingBuffer(4)
buf.append('A')
buf.append('B')
buf.clear()
expect(buf.count).toBe(0)
expect(buf.totalChars).toBe(0)
expect(buf.truncated).toBe(false)
expect(buf.toArray()).toEqual([])
})
test('most recent entry is always preserved even if it alone exceeds limit', () => {
const buf = new OutputRingBuffer(100)
const hugeChunk = 'x'.repeat(600_000) // single chunk > limit
buf.append(hugeChunk)
expect(buf.count).toBe(1)
const arr = buf.toArray()
expect(arr).toContain(hugeChunk)
})
})
describe('useScriptStore with ring buffer', () => {
beforeEach(() => {
useScriptStore.setState({ scriptStates: {} })
})
test('appendRunOutput increments version', () => {
const store = useScriptStore.getState()
store.appendRunOutput('wt-1', 'hello')
// Re-read state after each update: the object returned by getState() is a snapshot
const v1 = useScriptStore.getState().scriptStates['wt-1'].runOutputVersion
store.appendRunOutput('wt-1', 'world')
const v2 = useScriptStore.getState().scriptStates['wt-1'].runOutputVersion
expect(v2).toBe(v1 + 1)
})
test('getRunOutput returns ordered array', () => {
const store = useScriptStore.getState()
store.appendRunOutput('wt-1', 'line 1')
store.appendRunOutput('wt-1', 'line 2')
const output = store.getRunOutput('wt-1')
expect(output).toEqual(['line 1', 'line 2'])
})
test('clearRunOutput resets buffer and bumps version', () => {
const store = useScriptStore.getState()
store.appendRunOutput('wt-1', 'data')
store.clearRunOutput('wt-1')
const output = store.getRunOutput('wt-1')
expect(output).toEqual([])
})
test('special markers (CMD, ERR) are preserved in recent output', () => {
const store = useScriptStore.getState()
const bigChunk = 'x'.repeat(500_000)
store.appendRunOutput('wt-1', bigChunk)
store.appendRunOutput('wt-1', '\x00CMD:pnpm dev')
store.appendRunOutput('wt-1', 'server started')
const output = store.getRunOutput('wt-1')
const lastTwo = output.slice(-2)
expect(lastTwo[0]).toBe('\x00CMD:pnpm dev')
expect(lastTwo[1]).toBe('server started')
})
})

Replace the existing appendRunOutput implementation:
appendRunOutput: (worktreeId, line) => {
set((state) => {
const existing = state.scriptStates[worktreeId] || createDefaultScriptState()
let newOutput = [...existing.runOutput, line]
// Calculate total character count
let totalChars = 0
for (const chunk of newOutput) {
totalChars += chunk.length
}
// Trim from the front if over limit
if (totalChars > MAX_RUN_OUTPUT_CHARS) {
let charsToRemove = totalChars - MAX_RUN_OUTPUT_CHARS
let startIndex = 0
// Skip the truncation marker if already present
if (newOutput[0] === TRUNCATION_MARKER) {
startIndex = 1
}
while (charsToRemove > 0 && startIndex < newOutput.length - 1) {
charsToRemove -= newOutput[startIndex].length
startIndex++
}
newOutput = [TRUNCATION_MARKER, ...newOutput.slice(startIndex)]
}
return {
scriptStates: {
...state.scriptStates,
[worktreeId]: {
...existing,
runOutput: newOutput
}
}
}
})
}

In src/renderer/src/components/layout/RunTab.tsx, add a rendering case for the \x00TRUNC: prefix in the runOutput.map() block. Place it before the existing \x00CMD: and \x00ERR: checks:
{
runOutput.map((line, i) => {
if (line.startsWith('\x00TRUNC:')) {
const msg = line.slice(7)
return (
<div
key={i}
className="text-muted-foreground text-center text-[10px] py-1 border-b border-border/50"
>
{msg}
</div>
)
}
// ... existing CMD and ERR checks ...
})
}

- src/renderer/src/stores/useScriptStore.ts: add MAX_RUN_OUTPUT_CHARS and the trimming logic in appendRunOutput
- src/renderer/src/components/layout/RunTab.tsx: render the \x00TRUNC: marker
- appendRunOutput trims output from the front when total characters exceed 500,000
- An [older output truncated] marker appears as the first entry after trimming
- The truncation marker does not get duplicated on subsequent trims
- The most recent output is always preserved (trimming only removes old entries)
- The truncation marker renders as a centered, muted, small text line in the Run tab
- Output under 500K characters is unaffected (no trimming)
- clearRunOutput still works correctly (resets to empty array)
- pnpm lint passes
- pnpm test passes
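
One way to make these criteria easy to check in isolation is to pull the trimming step into a pure function. The sketch below is an assumption-laden variant, not the store code itself: trimRunOutput is a hypothetical helper name, and it deliberately leaves the marker out of the character budget and skips adding a marker when nothing could be removed.

```typescript
// Hypothetical helper; the constant values mirror the ones named in the plan.
const MAX_RUN_OUTPUT_CHARS = 500_000
const TRUNCATION_MARKER = '\x00TRUNC:[older output truncated]'

function trimRunOutput(output: string[], maxChars: number = MAX_RUN_OUTPUT_CHARS): string[] {
  // Skip an existing marker so it is neither counted nor duplicated.
  const first = output[0] === TRUNCATION_MARKER ? 1 : 0

  let total = 0
  for (let i = first; i < output.length; i++) total += (output[i] ?? '').length
  if (total <= maxChars) return output

  // Drop the oldest entries until under budget, always keeping the newest one.
  let start = first
  while (total > maxChars && start < output.length - 1) {
    total -= (output[start] ?? '').length
    start++
  }

  // A single oversized entry cannot be trimmed further; leave the array as is.
  if (start === first) return output
  return [TRUNCATION_MARKER, ...output.slice(start)]
}
```

With a helper of this shape, appendRunOutput would reduce to calling trimRunOutput([...existing.runOutput, line]) inside the set() callback.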
- Start a dev server that produces continuous output (e.g., a watch mode build)
- Let it run for several minutes
- Verify the Run tab does not grow infinitely — scroll to the top and verify the truncation marker appears
- Verify recent output is still visible and correctly rendered
- Verify ANSI color codes still render correctly after trimming
- Stop and restart the dev server, then verify clearRunOutput resets everything
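// test/phase-16/session-6/run-output-cap.test.ts
import { useScriptStore } from '@/stores/useScriptStore'
describe('Session 6: Run Output Cap', () => {
beforeEach(() => {
// Reset shared store state so output from one test does not leak into the next
useScriptStore.setState({ scriptStates: {} })
})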
test('output under limit is not trimmed', () => {
const store = useScriptStore.getState()
store.appendRunOutput('wt-1', 'short line')
// Re-read the store: the earlier getState() snapshot predates the append
const state = useScriptStore.getState().scriptStates['wt-1']
expect(state.runOutput).toEqual(['short line'])
})
test('output over limit is trimmed from the front', () => {
const store = useScriptStore.getState()
// Append chunks that total > 500K chars
const bigChunk = 'x'.repeat(100_000)
for (let i = 0; i < 6; i++) {
store.appendRunOutput('wt-1', bigChunk)
}
const state = useScriptStore.getState().scriptStates['wt-1']
// Total should be <= 500K + truncation marker
const totalChars = state.runOutput.reduce((sum, s) => sum + s.length, 0)
expect(totalChars).toBeLessThanOrEqual(500_000 + 100) // small overhead for marker
// First entry should be the truncation marker
expect(state.runOutput[0]).toBe('\x00TRUNC:[older output truncated]')
})
test('truncation marker is not duplicated', () => {
const store = useScriptStore.getState()
const bigChunk = 'x'.repeat(100_000)
// Append enough to trigger trimming twice
for (let i = 0; i < 12; i++) {
store.appendRunOutput('wt-1', bigChunk)
}
const state = useScriptStore.getState().scriptStates['wt-1']
// Only one truncation marker at the start
const markers = state.runOutput.filter((l) => l.startsWith('\x00TRUNC:'))
expect(markers.length).toBe(1)
expect(state.runOutput[0]).toBe('\x00TRUNC:[older output truncated]')
})
test('most recent entry is always preserved', () => {
const store = useScriptStore.getState()
const bigChunk = 'x'.repeat(500_001)
store.appendRunOutput('wt-1', bigChunk) // a single chunk already over the 500,000 limit
store.appendRunOutput('wt-1', 'latest')
const state = useScriptStore.getState().scriptStates['wt-1']
expect(state.runOutput[state.runOutput.length - 1]).toBe('latest')
})
test('clearRunOutput resets to empty', () => {
const store = useScriptStore.getState()
store.appendRunOutput('wt-1', 'some output')
store.clearRunOutput('wt-1')
const state = useScriptStore.getState().scriptStates['wt-1']
expect(state.runOutput).toEqual([])
})
test('special markers (CMD, ERR) are preserved in recent output', () => {
const store = useScriptStore.getState()
const bigChunk = 'x'.repeat(500_000)
store.appendRunOutput('wt-1', bigChunk)
store.appendRunOutput('wt-1', '\x00CMD:pnpm dev')
store.appendRunOutput('wt-1', 'server started')
const state = useScriptStore.getState().scriptStates['wt-1']
const lastTwo = state.runOutput.slice(-2)
expect(lastTwo[0]).toBe('\x00CMD:pnpm dev')
expect(lastTwo[1]).toBe('server started')
})
})

- Verify all Phase 16 features work together end-to-end
- Run full test suite and lint
- Test edge cases and cross-feature interactions
pnpm test
pnpm lint

Fix any failures.
- Start a session, send a message, wait for response
- /undo: verify response removed, toast shown
- /redo: verify response restored, toast shown
- /undo when nothing to undo: verify error toast
- Verify /undo and /redo appear in popover with correct filtering
- Verify no user message is created for undo/redo
- Start a session with multi-tool-call response
- Verify no premature idle transitions (no brief "Ready" flickers)
- Start background sessions and verify they show "Working" when busy
- Verify sessions genuinely complete after 300ms debounce
- Switch tabs during active streaming — verify status is correct on return
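
The behavior these checks exercise can be pictured as a debounced idle transition: an idle report only takes effect if no busy report arrives within the 300ms window. The sketch below is a hypothetical illustration, not the SessionView code; createIdleDebouncer, reportIdle, reportBusy, and the injectable schedule parameter are all assumed names, with the scheduler made injectable so the logic can be tested without real timers.

```typescript
// Hypothetical sketch of a debounced idle transition; all names are assumptions.
type Cancel = () => void
type Schedule = (fn: () => void, ms: number) => Cancel

// Default scheduler backed by real timers.
const realSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms)
  return () => clearTimeout(id)
}

function createIdleDebouncer(
  onIdle: () => void,
  delayMs: number = 300,
  schedule: Schedule = realSchedule
) {
  let cancelPending: Cancel | null = null

  const cancel = (): void => {
    if (cancelPending) cancelPending()
    cancelPending = null
  }

  return {
    // Backend reported idle: wait before committing, in case work resumes.
    reportIdle(): void {
      cancel()
      cancelPending = schedule(() => {
        cancelPending = null
        onIdle()
      }, delayMs)
    },
    // Backend reported busy again: drop any pending idle transition.
    reportBusy(): void {
      cancel()
    },
    // Call on unmount so a late timer cannot fire into a stale view.
    dispose: cancel
  }
}
```

An idle, busy, idle sequence between tool calls therefore produces a single idle transition, after the final quiet period.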
- Start a dev server that produces continuous output
- Let it run for several minutes
- Verify truncation occurs — scroll to top to see marker
- Verify recent output is readable and ANSI colors work
- Stop and restart — verify clean state
- While a session is streaming (and debounce is active), try /undo and verify it is handled gracefully
- Run a dev server while switching tabs and verify both run output and session status behave correctly
- Verify slash command popover shows both built-in and SDK commands correctly after SDK reconnection
- All files modified in Sessions 1-6
- pnpm test passes with zero failures
- pnpm lint passes with zero errors
- Undo/redo works end-to-end with correct toast messages
- Session idle bug is resolved — no premature idle transitions during multi-tool responses
- Run output is capped correctly — truncation marker appears after sustained output
- No regressions in existing Phase 15 features
- All edge cases tested (nothing to undo, rapid tab switching, etc.)
Run the full integration test:
pnpm test

Then manually test each feature as described in the individual session testing sections above.
// test/phase-16/session-7/integration-verification.test.ts
describe('Session 7: Phase 16 Integration', () => {
test('built-in commands coexist with SDK commands in popover', () => {
// Verify allCommands = [...BUILT_IN_COMMANDS, ...sdkCommands]
// Verify filtering works for both types
})
test('/undo during idle session works correctly', () => {
// Session is idle (not streaming)
// Execute /undo
// Verify it succeeds and messages are reloaded
})
test('debounced idle does not interfere with undo', () => {
// Session finishes (debounced idle in progress)
// User types /undo before debounce completes
// Verify undo executes cleanly
})
test('run output cap does not affect other store operations', () => {
// Append run output past the limit
// Verify setup output is unaffected
// Verify run PID tracking is unaffected
})
test('global listener busy + SessionView debounce work together', () => {
// Background session goes idle → busy → idle
// Active session goes idle → busy → idle
// Verify both are handled correctly without race conditions
})
})
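
As a starting point for filling in the first placeholder test, the merge-and-filter step it refers to might look like the sketch below. Only the allCommands = [...BUILT_IN_COMMANDS, ...sdkCommands] shape comes from the plan; the SlashCommand interface, the description strings, the filterCommands name, and the prefix-matching rule are assumptions made for illustration.

```typescript
// Hypothetical command shape; the real type may carry more fields.
interface SlashCommand {
  name: string
  description: string
}

// Placeholder descriptions, not the app's actual copy.
const BUILT_IN_COMMANDS: SlashCommand[] = [
  { name: 'undo', description: 'Undo the last message' },
  { name: 'redo', description: 'Redo the last undone message' }
]

// Built-ins come first, then whatever the SDK reports; filtering is by name prefix.
function filterCommands(sdkCommands: SlashCommand[], query: string): SlashCommand[] {
  const allCommands = [...BUILT_IN_COMMANDS, ...sdkCommands]
  const needle = query.replace(/^\//, '').toLowerCase()
  return allCommands.filter((cmd) => cmd.name.toLowerCase().startsWith(needle))
}
```

Because the merged list is rebuilt on every call, SDK commands that arrive after a reconnection show up alongside the built-ins without extra bookkeeping.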