Houston Engine — Wire Protocol

Source of truth for the HTTP + WebSocket contract spoken by houston-engine and every client (desktop, mobile, CLI, third-party). Rust types live in engine/houston-engine-protocol; TS types live in ui/engine-client/src/types.ts. The Rust side wins conflicts.

Versioning

Field	Value
Protocol major	`1` (constant `PROTOCOL_VERSION`)
Engine version	crate `houston-engine-server` version
Version header	`X-Houston-Engine-Version: <semver>` on every response
Breaking changes	require protocol major bump + client version guard

Clients refuse to talk to an engine whose major v exceeds what they know.

Transport

HTTP under /v1/* — resource-oriented REST. Content-Type: application/json.
WebSocket at /v1/ws — server-push events + lightweight client requests.

Loopback deploys bind 127.0.0.1:<random>; remote deploys must opt in via HOUSTON_BIND_ALL=1.

CORS

Fully permissive: allow_origin("*"), allow_methods(Any), allow_headers(Any). This is safe because the bearer token is not a CORS credential (no cookies), and because loopback deploys aren't browser-reachable from the public internet. Browser clients from any origin can call the engine as long as they carry a valid token.

Keep it this way — the WKWebView in the desktop app is cross-origin to 127.0.0.1:<port>, and trimming the allow-list has caused PUT/PATCH preflights to fail (e.g. setPreference returning "Load failed" in Safari/WKWebView). See engine/houston-engine-server/src/lib.rs.

Auth

Bearer token. Three accepted locations (server checks all):

Authorization: Bearer <token> — required for REST, preferred for WS in native clients.
?token=<token> — convenience for CLIs and browsers that cannot set WS headers.
Sec-WebSocket-Protocol: houston-bearer.<token> — fallback for browser WS.

Token generation: the binary auto-generates a 48-char alphanumeric token on first run unless HOUSTON_ENGINE_TOKEN is set. It is written (mode 0600) to ~/.houston/engine.json. The desktop supervisor reads that file before injecting window.__HOUSTON_ENGINE__.

REST conventions

Plural nouns: /v1/workspaces, /v1/agents/{path}/sessions.
Non-CRUD actions as sub-resource POSTs: POST /v1/agents/{p}/sessions/{k}:cancel.
Path IDs always URL-encoded.

Error body

{
  "error": {
    "code": "NOT_FOUND",
    "message": "workspace 7f3e... not found",
    "details": null
  }
}

code is a fixed enum: UNAUTHORIZED, FORBIDDEN, NOT_FOUND, BAD_REQUEST, CONFLICT, INTERNAL, UNAVAILABLE, VERSION_MISMATCH. HTTP status maps 1:1 (see engine-server/src/routes/error.rs).

Current routes

Full surface live. Every mutating route emits matching HoustonEvent on broadcast bus. 16 route modules wired in houston-engine-server/src/lib.rs. Integration tests in engine/houston-engine-server/tests/ — one file per module.

Health

Method	Path	Description
GET	`/v1/health`	`{status, version, protocol}`
GET	`/v1/version`	`{engine, protocol, build}`
GET	`/v1/ws`	WebSocket upgrade

Workspaces + nested agent CRUD

Method	Path	Description
GET	`/v1/workspaces`	List
POST	`/v1/workspaces`	Create
DELETE	`/v1/workspaces/:id`	Delete
POST	`/v1/workspaces/:id/rename`	Rename
PATCH	`/v1/workspaces/:id/locale`	Set/clear the per-workspace UI-locale override (`{ locale: "es" \| null }`)
PATCH	`/v1/workspaces/:id/provider`	Set provider/model
GET	`/v1/workspaces/:id/context`	Read shared `WORKSPACE.md` + `USER.md`
PUT	`/v1/workspaces/:id/context`	Write shared `WORKSPACE.md` + `USER.md`
GET	`/v1/workspaces/:id/agents`	List agents in workspace
POST	`/v1/workspaces/:id/agents`	Create agent
DELETE	`/v1/workspaces/:id/agents/:agent_id`	Delete agent
PATCH	`/v1/workspaces/:id/agents/:agent_id`	Update agent metadata (`color`)
POST	`/v1/workspaces/:id/agents/:agent_id/rename`	Rename agent
POST	`/v1/workspaces/install-from-github`	Import workspace template

Sessions (agent_path path-segment, URL-encoded)

Method	Path	Description
POST	`/v1/agents/:agent_path/sessions`	Start turn
POST	`/v1/agents/:agent_path/sessions/onboarding`	Start onboarding turn
POST	`/v1/agents/:agent_path/sessions/:key:cancel`	Kill CLI process tree (verified, SIGKILL escalation); a tombstone catches a CLI that spawns after Stop
GET	`/v1/agents/:agent_path/sessions/:key/history`	Load chat history
POST	`/v1/sessions/summarize`	Activity title/description

POST /v1/sessions/summarize accepts { message, agentPath?, provider?, model? }. It resolves provider/model from explicit fields, then agentPath, then default Anthropic. It is best-effort: provider CLI errors, timeouts, or malformed JSON return a deterministic fallback title instead of failing the client flow. Do not hardcode Claude for this path: Codex-only users may not have Claude Code.

Chat session starts are queued per sessionKey, not per workingDir. Follow-up turns inside the same conversation wait and resume in order. The desktop app keeps mid-run follow-ups in a visible local queued-message strip, lets users remove them, then submits the remaining queued text as one combined turn when the active run finishes. The engine queue remains the protocol safety net for other clients and direct API callers. Different sessions in the same folder run in parallel. Cancelling a session invalidates any queued turns for that session key. If multiple sessions overlap in one folder, file-change attribution is skipped for those overlapping runs because the diff cannot be assigned to one model safely. On successful non-overlapping completion, the engine may emit and persist a FeedItem with feed_type: "file_changes" and data: { created: string[], modified: string[] }; clients should render this as session-owned project artifacts. Provider/tool execution failures that need user recovery UI are emitted as feed_type: "tool_runtime_error" with data: { kind: "local_tool" | "provider_process", details: string }. Clients should render a user-safe retry and report-bug surface; details is diagnostic context for reports and logs, not user-facing copy.

Agent data (?agent_path= query; writes emit event)

Method	Path	Description
GET/POST	`/v1/agents/activities`	List/create
PATCH/DELETE	`/v1/agents/activities/:id`	Update/delete
GET/PUT	`/v1/agents/config`	Read/write project config

Routine + routine-run CRUD is not here — there is one canonical surface under /v1/routines + /v1/routine-runs (below); the engine-client points all routine CRUD at it. (The old duplicate /v1/agents/routines* mirror, which silently dropped timezone, was removed.)

Agent files (typed .houston/ + project file browser)

Method	Path	Description
GET/DELETE	`/v1/agents/files`	List / delete project file
POST	`/v1/agents/files/read`	Read typed data file
POST	`/v1/agents/files/write`	Write typed data file (emits event)
POST	`/v1/agents/files/seed-schemas`	Seed `.houston/<type>/<type>.schema.json`
POST	`/v1/agents/files/migrate`	Run idempotent migrations
POST	`/v1/agents/files/read-project`	Read project file
POST	`/v1/agents/files/rename`	Rename
POST	`/v1/agents/files/folder`	Create folder
POST	`/v1/agents/files/import`	Import paths
POST	`/v1/agents/files/import-bytes`	Import base64 bytes

Routines (the single routine surface — CRUD + scheduler)

All routine + routine-run CRUD lives here (the engine-client targets it); the /v1/agents/routines* mirror was deleted. Query params are camelCase (?agentPath, ?routineId). A routine carries optional provider/model/effort overrides (absent = inherit the agent's config at dispatch); the dispatcher resolves provider+model via sessions::resolve_provider_with_overrides and effort via resolve_effort_with_override (an effort the resolved provider rejects is dropped), the same precedence a chat turn uses. Create/update/delete + run create/update emit RoutinesChanged / RoutineRunsChanged.

Method	Path	Description
GET/POST	`/v1/routines`	List/create (by `?agentPath`)
PATCH/DELETE	`/v1/routines/:id`	Update/delete
POST	`/v1/routines/:id/runs`	Create run
POST	`/v1/routines/:id/runs/:run_id:cancel`	Stop an in-flight run (kills the provider PID, marks status `cancelled`). 409 if the run is already terminal. Deleting a routine cascades to this for any `running` runs.
POST	`/v1/routines/:id/run-now`	Manual trigger. Returns once the run row is created (404 if the routine is gone, 409 if this routine already has a run in flight); the session runs on a detached task — follow it via `RoutineRunsChanged`. Different routines on one agent both run, serialized on the folder; the same routine can't double-run.
GET	`/v1/routine-runs`	List (optional `?routineId`)
PATCH	`/v1/routine-runs/:id`	Update run
POST	`/v1/routines/scheduler/start`	Start per-agent cron
POST	`/v1/routines/scheduler/stop`	Stop
POST	`/v1/routines/scheduler/sync`	Re-read routines, rebuild cron jobs

Routine schedules are standard Unix cron (0/7 = Sunday, weekdays 1-5) everywhere a human touches them — the UI builder, the stored schedule string, and the frontend nextFire preview. The backend cron crate numbers days 1-7 (1 = Sunday) and rejects 0, so routines::cron_compat::to_engine_cron translates the day-of-week field at the single spawn_cron boundary. Without it every weekly routine fired a day early and Sunday routines never scheduled (issue #389). Keep cron generation/parsing on the standard convention; never hand a raw schedule to Schedule::from_str.

Conversations (cross-agent read)

Method	Path	Description
POST	`/v1/conversations/list`	List conversations for one agent
POST	`/v1/conversations/list-all`	List across many agents

Conversation entries include the activity's stored session_key plus the card metadata the agent board needs to render the same mission card in cross-agent surfaces: agent, routine_id, and worktree_path when present.

Skills

Method	Path	Description
GET/POST	`/v1/skills`	List/create
GET/PUT/DELETE	`/v1/skills/:name`	Load/save/delete
POST	`/v1/skills/community/search`	Search community registry, cached/throttled server-side
POST	`/v1/skills/community/install`	Install community skill
POST	`/v1/skills/repo/list`	List skills in a repo
POST	`/v1/skills/repo/install`	Install from repo

Store (agent registry + GitHub import)

Method	Path	Description
GET	`/v1/store/catalog`	Curated listing. Uses release-bundled `store/catalog.json` when available; remote API fallback remains for future hosted Store.
GET	`/v1/store/search?q=`	Search catalog
POST	`/v1/store/installs`	Install by `{repo, agentId}`. `repo: "houston-store/<id>"` installs bundled package incl. skills. GitHub repo form remains supported.
DELETE	`/v1/store/installs/:agent_id`	Uninstall
POST	`/v1/agents/install-from-github`	One-off install by URL
POST	`/v1/agents/check-updates`	Which installed agents have new versions

Preferences + providers + agent-configs

Method	Path	Description
GET/PUT	`/v1/preferences/:key`	String KV (DB-backed)
GET	`/v1/providers/:name/status`	`{cliInstalled, authState, installSource, cliPath}`
POST	`/v1/providers/:name/login`	Launch CLI login. Returns `BAD_REQUEST` for providers without an OAuth flow (e.g. `gemini`); callers must use the credentials route instead. Surfaces the OAuth URL via the `ProviderLoginUrl` WS event and the outcome via `ProviderLoginComplete`. Optional `?deviceAuth=true` selects the provider's headless device-code flow (OpenAI/codex `--device-auth`) for remote clients that can't receive the CLI's `localhost` OAuth callback; ignored by providers without a device variant (Claude keeps its paste-back code), omitted by the co-located desktop app.
POST	`/v1/providers/:name/login/code`	Relay the OAuth verification code the user pasted (paste-back flow, e.g. Claude on a remote/headless engine). Body: `{ code }`. Written to the CLI's stdin. Not used by codex's device-code flow, which self-completes after the user enters the `ProviderLoginUrl.user_code` on the provider's page.
POST	`/v1/providers/:name/login/cancel`	Abort an in-flight sign-in: kills the CLI subprocess and frees the in-flight slot so a retry isn't rejected as "already pending". Idempotent (no-op when nothing pending). Emits a benign `ProviderLoginComplete` (`success: false`, `error: null`) so pending spinners clear without an error toast. Fixes the stuck-spinner-after-closing-browser case.
POST	`/v1/providers/gemini/credentials`	Write `GEMINI_API_KEY` to `~/.gemini/.env` (atomic, mode 0600). Body: `{ apiKey }`. Provider-specific because Gemini is the only provider with file-backed credentials today.
GET	`/v1/agent-configs`	List installed agent definitions

Composio (MCP integrations)

Method	Path	Description
GET	`/v1/composio/status`	Full status bundle
GET	`/v1/composio/cli-installed`	Bool
POST	`/v1/composio/cli`	Install Composio CLI (no-op when bundled — see `knowledge-base/cli-bundling.md`)
POST	`/v1/composio/login`	Start OAuth
POST	`/v1/composio/login/complete`	Finish OAuth w/ `cli_key`
GET	`/v1/composio/apps`	Catalog
GET/POST	`/v1/composio/connections`	List / start connect

Claude Code (runtime install — proprietary CLI not bundled)

Method	Path	Description
GET	`/v1/claude/cli-installed`	Bool
GET	`/v1/claude/status`	`{installed, install_path, pinned_version, installed_version}`
POST	`/v1/claude/install`	Trigger background download + sha256 verify; progress streams as `ClaudeCliInstalling` events on the WS firehose

Worktrees + shell

Method	Path	Description
POST	`/v1/worktrees`	Create git worktree
POST	`/v1/worktrees/list`	List
POST	`/v1/worktrees/remove`	Remove
POST	`/v1/shell`	Run arbitrary shell (cwd + cmd)

Attachments

Method	Path	Description
POST	`/v1/attachments/uploads`	Create per-file upload sessions for a scope
PUT	`/v1/attachments/uploads/:upload_id/content`	Stream raw file bytes for one upload
GET	`/v1/attachments/:scope_id`	List attachment manifests for a scope
DELETE	`/v1/attachments/:scope_id`	Delete all attachments for a scope

Attachment uploads are binary, one file per PUT. The create call declares scopeId, name, size, and optional mime; the content call sends raw bytes directly, not base64 JSON. The engine writes to a temp file, counts bytes, computes SHA-256, rejects size mismatches or over-limit files, then atomically commits a manifest + prompt-readable file path under <home>/cache/attachments/scopes/<scopeId>/.

There is no user-facing attachment count cap. The SDK chunks large selections into multiple create requests so a user can attach many files, such as dozens of bank statements, while the engine still bounds each pending upload reservation. Current limits: 25 upload sessions per create request, 100MB per file, 250MB per create request, and 500MB per scope.

Mobile tunnel

Method	Path	Description
GET	`/v1/tunnel/status`	Tunnel connection state
POST	`/v1/tunnel/pairing`	Return stable phone-access QR payload (`<tunnelId>-<accessSecret>`)
POST	`/v1/tunnel/reset-access`	Rotate phone-access QR secret and revoke all device tokens

See docs/mobile-architecture.md for the full flow — desktop engine opens an outbound WS to the Houston relay, which proxies mobile HTTP+WS AND serves the PWA bundle from the same origin. Phone pairing is durable: laptop sleep/shutdown keeps the same tunnel identity and phone tokens; only Settings → Disconnect all phones rotates the QR secret.

Watcher

Method	Path	Description
POST	`/v1/watcher/start`	Start `notify` watch on agent dir
POST	`/v1/watcher/stop`	Stop

WebSocket envelope

Every WS frame is an EngineEnvelope:

{
  "v": 1,
  "id": "b6e1c7d3-...",
  "kind": "event | req | res | ping | pong",
  "ts": 1712345678901,
  "payload": { ... }
}

kind: "event" → payload is a HoustonEvent (same enum the frontend already consumes) or a LagMarker ({type:"Lag", dropped: N}).
kind: "req" → client request. {op:"sub"|"unsub", topics:[...]}. Per-topic filtering is wired — subscribing to "*" gets the firehose; subscribing to specific topics limits what the forwarder sends.
kind: "res" → server response to a prior req (future use).
kind: "ping" | "pong" → keep-alive. Server emits a ping every 20s.

Backpressure

Per-connection bounded mpsc with capacity 1024. On lag the server:

Coalesces consecutive SessionStatus and low-severity FeedItem updates.
Sends a LagMarker so the client knows to refetch.
Continues streaming once drained.

Topics

Reserved topic names. Clients that want the firehose subscribe to the special * topic. Subscribing to specific topics limits what the forwarder sends — essential for remote clients where bandwidth matters.

Topic	Payload variants
`*`	Firehose. Delivers every event regardless of its event_topic. The desktop app uses this so it doesn't need to track per-agent / per-session subscriptions. Remote clients should prefer narrower topics.
`session:{key}`	`FeedItem`, `SessionStatus`, `AuthRequired`
`agent:{path}`	`ActivityChanged`, `SkillsChanged`, `FilesChanged`, `ConfigChanged`, `ContextChanged`, `LearningsChanged`, `ConversationsChanged`
`routines:{agent}`	`RoutinesChanged`, `RoutineRunsChanged`
`composio`	`ComposioCliReady`, `ComposioCliFailed`
`scheduler`	`HeartbeatFired`, `CronFired`
`toast`	`Toast`, `CompletionToast`
`events`	`EventReceived`, `EventProcessed`
`auth`	`AuthRequired`

Auditing conformance

engine/houston-engine-server/tests/ — in-process HTTP + WS assertions.
ui/engine-client/src/types.ts — mirrors the Rust DTOs by hand until a codegen tool (ts-rs or specta) is adopted. CI should fail if shapes drift.

Integration gotchas (custom frontends)

These are load-bearing things every custom frontend must do. Missing any of them doesn't break the build but will produce a frozen or silently-wrong UI at runtime.

Start the file watcher on mount

The Claude/Codex CLI writes files via its own tools — those writes bypass the engine entirely. The engine only learns about them when the filesystem watcher is running. Call POST /v1/watcher/start (SDK: client.startAgentWatcher(agentPath)) exactly once after you resolve the agent folder. Without it, FilesChanged never fires for agent-side writes and the UI looks frozen until a manual reload.

Subscribe to WS topics before firing a session

The per-connection forwarder drops events that arrive before the client has subscribed to their topic. Subscribe to session:<key> and agent:<path> first, THEN POST /v1/agents/:path/sessions. The echoed session_key in the start response is safe; early events for that key may have been dropped — refetch with /v1/agents/:path/sessions/:key/history if you need them.

System prompts are caller-supplied

POST /v1/agents/:path/sessions accepts an optional systemPrompt field. When omitted, the engine falls back to whatever the embedding app passed in via HOUSTON_APP_SYSTEM_PROMPT at subprocess spawn. The engine has no hardcoded product copy — it only assembles generic per-agent context from disk (working directory, mode overrides, skills index, integrations). Final prompt = <product_prompt>\n\n---\n\n<agent_context>. Onboarding sessions use HOUSTON_APP_ONBOARDING_PROMPT as an additional suffix.

The assembled prompt reaches the provider CLIs via scratch files, never argv (houston-terminal-manager::prompt_scratch): codex gets a per-session profile at $CODEX_HOME/houston-tmp-*.config.toml selected with -p (requires the file-based profiles in codex ≥ 0.137 — keep the cli-deps.json pin at or above that), claude gets a temp file via --system-prompt-file. Argv tokens are capped at 32,767 chars total by Windows CreateProcessW; carrying the prompt inline (-c developer_instructions=… / --system-prompt <text>) broke every spawn with os error 206 once an agent's accumulated context outgrew the limit. Growth is also bounded at the source: workspace_context caps the WORKSPACE.md/USER.md prompt share (12 KB / 4 KB, newest-first, with an explicit omission marker) the same way learnings_context caps learnings — files on disk are never trimmed.

Feed-item streaming needs a reducer

assistant_text_streaming deltas should REPLACE the in-progress assistant message in your state; assistant_text finalizes it. Don't append every streaming delta as a new message row. Same pattern for thinking_streaming / thinking. See examples/smartbooks/src/lib/feed.ts::appendFeedItem.

Context-usage lives on `final_result`

The terminal feed_type: "final_result" item carries data: { result, cost_usd, duration_ms, usage }. usage is the normalized TokenUsage { context_tokens, output_tokens, cached_tokens } (Rust houston-terminal-manager::TokenUsage, TS @houston-ai/chat TokenUsage) or null for providers that don't report it (Anthropic + Codex do; Gemini doesn't yet). context_tokens is the prompt size of the most recent model request, i.e. how much of the context window is in use.

Anthropic: the parser sums the last assistant message's three-way split (input + cache_creation + cache_read). The per-message usage IS the last request, so this is the live fill.
Codex: trickier. codex exec --json only emits turn.completed.usage, which is the CUMULATIVE sum of every model request in the turn (a turn with N tool round-trips reports ~N× the real size — this is the 94k-instead-of-19k bug). The real last-request fill + the effective window live ONLY in Codex's on-disk rollout ($CODEX_HOME/sessions/**/rollout-*-<thread_id>.jsonl, default ~/.codex), in token_count.info.last_token_usage / model_context_window. So engine codex_rollout::latest_usage(thread_id) reads the newest rollout's last token_count and session_io patches it onto the FinalResult after the stream flushes (codex only writes the rollout fully on exit, so the held-back FinalResult is emitted post-loop). The parser leaves usage None; on any rollout failure it stays None (no % beats a wrong %). Bumping the bundled codex won't help — neither 0.130 nor 0.135 exec --json exposes the per-request data in stdout.

The desktop composer's context-usage indicator (app/src/components/context- indicator.tsx) divides the latest turn's context_tokens by a window estimate to drive a ring gauge (a donut whose arc fills with the occupied fraction and turns red near the limit; the percentage, a progress bar, and rounded token counts surface on hover); it reads usage via sessionContextUsage (app/src/lib/context-usage.ts) so it works both live and after a history reload (the field is persisted in chat_feed.data_json). /context (the interactive Claude Code slash command) is unavailable here because the engine drives claude -p in non-interactive print mode — the data comes from the stream's usage blocks, not a REPL command.

The window is an estimate, by necessity. The real context window is plan/credit-gated and is NOT reported anywhere claude -p can see (verified against Claude Code 2.1.159: system init carries only model, tools, mcp_servers, ... — no window field; no flag; no env var; Codex's thread.started likewise). The gating:

Opus 4.x → 1M automatic on Max/Team/Enterprise, else 200k (1M needs /extra-usage credits on Pro).
Sonnet 4.6 → 200k on every plan; 1M only with usage credits.
Codex gpt-5.5 → 258,400 effective = raw context_window 272k × effective_context_window_percent 95% (both from Codex's models_cache.json, and the rollout's model_context_window confirms 258400). The opt-in 1M variant maxes at 1M × 95% = 950k.

So the indicator uses a self-correcting estimate (providers.ts contextWindow = default assumption, contextWindowMax = snap-up ceiling; context-usage.ts effectiveContextWindow): start from the per-model default (Opus 1M, Sonnet 200k, gpt-5.5 258.4k), then snap UP to the ceiling once the session's observed PEAK context_tokens exceeds the default — which proves the real window is larger, because both CLIs auto-compact before the limit so observed usage can never exceed the true window. This auto-fixes Sonnet-with-credits and never reads over 100%. The one case it over-estimates is Opus on Pro WITHOUT credits (shows 1M, really 200k); it can't self-correct downward, so the dialog labels the figure "estimated". If a future CLI release exposes the window in system init / thread.started, prefer that live value over the estimate.

Autocompact (`context_compacted`)

When a conversation nears the context window, Houston frees space without touching the user's visible chat. Both paths surface as one feed_type: "context_compacted" item (data: { trigger: "native" | "proactive", pre_tokens?: number }, Rust FeedItem::ContextCompacted), rendered as a subtle divider — the full history above and below stays visible.

Native — Claude Code auto-compacts its own transcript as it nears the window (~95%) and emits a top-level stream-json system event {"subtype":"compact_boundary","compact_metadata":{"trigger","pre_tokens",…}} (verified against Claude Code 2.1.160). parser.rs lifts it into ContextCompacted { trigger: Native }. Claude-only today: Codex's exec auto-compaction is unreliable, which is exactly why the forced path exists.
Proactive — the desktop client watches the context-usage % and, once it crosses the threshold (default 93%, overridable at build time via VITE_AUTOCOMPACT_THRESHOLD), sets compact: true on the next startSession. The engine (sessions::compaction) summarizes the visible chat via a one-shot provider call, abandons the current resume id with SessionIdHandle::clear_current_preserving_history() (the id stays in .history so session_ids_for_history still loads the full chat_feed), emits + persists a Proactive marker under the old id, then runs the turn on a FRESH provider session seeded with [summary + the user's message]. The persisted/displayed user message stays the original; only the agent's working context shrank. Provider-agnostic — the reliable path for Codex.

Autocompact is always on — there is no user-facing toggle. It's a non-destructive guarantee (the full chat_feed stays visible regardless), so the decision is purely client-side: lib/autocompact.ts, called from tauriChat.send so every send path gets it, reads the live feed usage synchronously. The only knob is the threshold, a build-time constant (VITE_AUTOCOMPACT_THRESHOLD, default 93), not a user setting. compact is honored only when a resume id exists (ignored on turn 1). On summary failure the engine logs and falls back to a normal resume (the CLI's own auto-compaction is the backstop), so a turn never fails because compaction couldn't run.

Provider switch (`provider_switched`)

POST .../sessions accepts an optional providerSwitch: { mode: "replay" | "summarize", fromProvider } (HOU-424). It is set by the client when the user moves a live conversation to a DIFFERENT provider mid-stream. Provider CLI sessions aren't portable, so the engine reseeds a FRESH session on the resolved (new) provider with prior context — the full transcript verbatim (replay) or an AI summary (summarize) — and clears any current resume id for the resolved provider so a switch-back never resumes a stale cross-provider session. It takes precedence over compact. The seed is built from the DB chat_feed, and the summary (for summarize) runs on the TARGET provider, so a switch away from an out-of-credits provider still works. Unlike compact, a switch seed failure is NOT swallowed: it surfaces as a SessionStatus error (beta no-silent-failure policy).

The boundary is recorded as a feed_type: "provider_switched" item (data: { provider, summarized, pre_tokens? }, Rust FeedItem::ProviderSwitched), rendered as a subtle divider like context_compacted. provider is the provider switched TO; summarized distinguishes the verbatim carry from the summarized one. The full chat_feed above and below stays visible.

Binary file downloads

The read-project route returns text only. For xlsx, pdf, images, etc., call POST /v1/shell with open "<path>" (macOS), xdg-open "<path>" (Linux), or start "" "<path>" (Windows) to hand the file to the host OS's default application. A first-class binary-read endpoint is on the roadmap — until it lands, the shell route is the escape hatch.

Bearer token placement for WebSocket

Browsers can't set Authorization on WebSocket upgrades. Use ?token=<token> on the WS URL instead. The engine accepts all three (Authorization header, ?token=, Sec-WebSocket-Protocol: houston-bearer.<token>).

Reference implementation

examples/smartbooks/ — a complete custom frontend consumer of the engine, ~400 lines of TSX, zero @houston-ai/* UI deps. Treat as a copy-paste template.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Houston Engine — Wire Protocol

Versioning

Transport

CORS

Auth

REST conventions

Error body

Current routes

WebSocket envelope

Backpressure

Topics

Auditing conformance

Integration gotchas (custom frontends)

Start the file watcher on mount

Subscribe to WS topics before firing a session

System prompts are caller-supplied

Feed-item streaming needs a reducer

Context-usage lives on `final_result`

Autocompact (`context_compacted`)

Provider switch (`provider_switched`)

Binary file downloads

Bearer token placement for WebSocket

Reference implementation

FilesExpand file tree

engine-protocol.md

Latest commit

History

engine-protocol.md

File metadata and controls

Houston Engine — Wire Protocol

Versioning

Transport

CORS

Auth

REST conventions

Error body

Current routes

WebSocket envelope

Backpressure

Topics

Auditing conformance

Integration gotchas (custom frontends)

Start the file watcher on mount

Subscribe to WS topics before firing a session

System prompts are caller-supplied

Feed-item streaming needs a reducer

Context-usage lives on final_result

Autocompact (context_compacted)

Provider switch (provider_switched)

Binary file downloads

Bearer token placement for WebSocket

Reference implementation

Context-usage lives on `final_result`

Autocompact (`context_compacted`)

Provider switch (`provider_switched`)