Source of truth for the HTTP + WebSocket contract spoken by houston-engine
and every client (desktop, mobile, CLI, third-party). Rust types live in
engine/houston-engine-protocol; TS types live in
ui/engine-client/src/types.ts. The Rust side wins conflicts.
| Field | Value |
|---|---|
| Protocol major | 1 (constant PROTOCOL_VERSION) |
| Engine version | crate houston-engine-server version |
| Version header | X-Houston-Engine-Version: <semver> on every response |
| Breaking changes | require protocol major bump + client version guard |
Clients refuse to talk to an engine whose protocol major version exceeds what they know.
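A minimal sketch of that client-side guard, assuming the client reads the protocol major from the `protocol` field of `GET /v1/health`. The field types and the helper name `isProtocolSupported` are illustrative assumptions, not part of the contract.

```typescript
// Highest protocol major this client understands. The document gives the
// current value as 1 (constant PROTOCOL_VERSION on the Rust side).
const KNOWN_PROTOCOL_MAJOR = 1;

// Shape of GET /v1/health as listed in this document; field types are assumed.
interface HealthResponse {
  status: string;
  version: string;
  protocol: number;
}

// Returns true when the engine's protocol major does not exceed what the
// client knows; a newer major means the client should refuse to connect.
function isProtocolSupported(
  health: HealthResponse,
  knownMajor: number = KNOWN_PROTOCOL_MAJOR,
): boolean {
  return Number.isInteger(health.protocol) && health.protocol <= knownMajor;
}
```

A client would typically run this check once after connecting and surface an upgrade prompt when it returns false.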
- HTTP under /v1/*: resource-oriented REST. Content-Type: application/json.
- WebSocket at /v1/ws: server-push events + lightweight client requests.
Loopback deploys bind 127.0.0.1:<random>; remote deploys must opt in via
HOUSTON_BIND_ALL=1.
Fully permissive: allow_origin("*"), allow_methods(Any),
allow_headers(Any). This is safe because the bearer token is not a
CORS credential (no cookies), and because loopback deploys aren't
browser-reachable from the public internet. Browser clients from any
origin can call the engine as long as they carry a valid token.
Keep it this way — the WKWebView in the desktop app is cross-origin to
127.0.0.1:<port>, and trimming the allow-list has caused PUT/PATCH
preflights to fail (e.g. setPreference returning "Load failed" in
Safari/WKWebView). See engine/houston-engine-server/src/lib.rs.
Bearer token. Three accepted locations (server checks all):
- Authorization: Bearer <token> (required for REST, preferred for WS in native clients).
- ?token=<token> (convenience for CLIs and browsers that cannot set WS headers).
- Sec-WebSocket-Protocol: houston-bearer.<token> (fallback for browser WS).
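The three locations can be produced by small helpers like the following. This is a sketch: the function names are hypothetical, and only the header name, the `token` query parameter, and the `houston-bearer.` prefix come from the list above.

```typescript
// Header form: required for REST, preferred for WS in native clients.
function bearerHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

// Query form: appends ?token= (or &token= when a query already exists).
function withTokenQuery(url: string, token: string): string {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return `${url}${separator}token=${encodeURIComponent(token)}`;
}

// Subprotocol form: value to offer in Sec-WebSocket-Protocol from a browser.
function bearerSubprotocol(token: string): string {
  return `houston-bearer.${token}`;
}
```

In a browser, the subprotocol value would be passed as the second argument of the `WebSocket` constructor, which is how browsers populate `Sec-WebSocket-Protocol`.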
Token generation: the binary auto-generates a 48-char alphanumeric token on
first run unless HOUSTON_ENGINE_TOKEN is set. It is written (mode 0600) to
~/.houston/engine.json. The desktop supervisor reads that file before
injecting window.__HOUSTON_ENGINE__.
- Plural nouns: /v1/workspaces, /v1/agents/{path}/sessions.
- Non-CRUD actions as sub-resource POSTs: POST /v1/agents/{p}/sessions/{k}:cancel.
- Path IDs always URL-encoded.
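As an illustration of the encoding rule, a client might build paths like this. The helper names are hypothetical; the route shapes are the ones listed above.

```typescript
// Collection path for an agent's sessions. The agent path is a filesystem
// path, so it must be percent-encoded to survive as a single URL segment.
function sessionsPath(agentPath: string): string {
  return `/v1/agents/${encodeURIComponent(agentPath)}/sessions`;
}

// Non-CRUD action expressed as a ":cancel" suffix on the session resource.
function cancelSessionPath(agentPath: string, sessionKey: string): string {
  return `${sessionsPath(agentPath)}/${encodeURIComponent(sessionKey)}:cancel`;
}
```

`encodeURIComponent` (rather than `encodeURI`) is the relevant choice here because it also escapes `/`, which would otherwise split the agent path into several segments.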
{
"error": {
"code": "NOT_FOUND",
"message": "workspace 7f3e... not found",
"details": null
}
}

code is a fixed enum: UNAUTHORIZED, FORBIDDEN, NOT_FOUND,
BAD_REQUEST, CONFLICT, INTERNAL, UNAVAILABLE, VERSION_MISMATCH.
HTTP status maps 1:1 (see engine-server/src/routes/error.rs).
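A client can validate the error envelope before trusting it. The sketch below assumes nothing beyond the JSON shape and the code list above; `parseEngineError` is a hypothetical helper name, and the HTTP status mapping is deliberately left out because it lives in error.rs.

```typescript
const ERROR_CODES = [
  "UNAUTHORIZED", "FORBIDDEN", "NOT_FOUND", "BAD_REQUEST",
  "CONFLICT", "INTERNAL", "UNAVAILABLE", "VERSION_MISMATCH",
] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

interface EngineError {
  code: ErrorCode;
  message: string;
  details: unknown;
}

// Returns the typed error when the body matches the envelope, else null.
function parseEngineError(body: unknown): EngineError | null {
  if (typeof body !== "object" || body === null) return null;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return null;
  const { code, message, details } = err as Record<string, unknown>;
  if (typeof code !== "string" || typeof message !== "string") return null;
  // Reject codes outside the fixed enum instead of passing them through.
  if ((ERROR_CODES as readonly string[]).indexOf(code) === -1) return null;
  return { code: code as ErrorCode, message, details: details ?? null };
}
```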
The full surface is live. Every mutating route emits a matching HoustonEvent on
the broadcast bus. 16 route modules are wired in
houston-engine-server/src/lib.rs.
Integration tests in engine/houston-engine-server/tests/ — one file per
module.
Health
| Method | Path | Description |
|---|---|---|
| GET | /v1/health | {status, version, protocol} |
| GET | /v1/version | {engine, protocol, build} |
| GET | /v1/ws | WebSocket upgrade |
Workspaces + nested agent CRUD
| Method | Path | Description |
|---|---|---|
| GET | /v1/workspaces | List |
| POST | /v1/workspaces | Create |
| DELETE | /v1/workspaces/:id | Delete |
| POST | /v1/workspaces/:id/rename | Rename |
| PATCH | /v1/workspaces/:id/locale | Set/clear the per-workspace UI-locale override ({ locale: "es" \| null }) |
| PATCH | /v1/workspaces/:id/provider | Set provider/model |
| GET | /v1/workspaces/:id/context | Read shared WORKSPACE.md + USER.md |
| PUT | /v1/workspaces/:id/context | Write shared WORKSPACE.md + USER.md |
| GET | /v1/workspaces/:id/agents | List agents in workspace |
| POST | /v1/workspaces/:id/agents | Create agent |
| DELETE | /v1/workspaces/:id/agents/:agent_id | Delete agent |
| PATCH | /v1/workspaces/:id/agents/:agent_id | Update agent metadata (color) |
| POST | /v1/workspaces/:id/agents/:agent_id/rename | Rename agent |
| POST | /v1/workspaces/install-from-github | Import workspace template |
Sessions (agent_path path-segment, URL-encoded)
| Method | Path | Description |
|---|---|---|
| POST | /v1/agents/:agent_path/sessions | Start turn |
| POST | /v1/agents/:agent_path/sessions/onboarding | Start onboarding turn |
| POST | /v1/agents/:agent_path/sessions/:key:cancel | Kill CLI process tree (verified, SIGKILL escalation); a tombstone catches a CLI that spawns after Stop |
| GET | /v1/agents/:agent_path/sessions/:key/history | Load chat history |
| POST | /v1/sessions/summarize | Activity title/description |
POST /v1/sessions/summarize accepts { message, agentPath?, provider?, model? }.
It resolves provider/model from explicit fields, then agentPath, then default
Anthropic. It is best-effort: provider CLI errors, timeouts, or malformed JSON
return a deterministic fallback title instead of failing the client flow. Do
not hardcode Claude for this path: Codex-only users may not have Claude Code.
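The resolution order can be pictured as follows. This is a sketch under stated assumptions: the provider id strings (`"anthropic"`, `"codex"`), the `agentConfig` lookup callback, and the choice to carry `model` only alongside an explicit `provider` are all illustrative, not the engine's actual Rust logic.

```typescript
interface ProviderChoice {
  provider: string;
  model: string | null;
}

interface SummarizeRequest {
  message: string;
  agentPath?: string;
  provider?: string;
  model?: string;
}

// Precedence: explicit request fields, then the agent's config, then the default.
function resolveSummarizeProvider(
  req: SummarizeRequest,
  agentConfig: (agentPath: string) => ProviderChoice | null, // hypothetical lookup
  fallback: ProviderChoice = { provider: "anthropic", model: null },
): ProviderChoice {
  if (req.provider) return { provider: req.provider, model: req.model ?? null };
  if (req.agentPath) {
    const fromAgent = agentConfig(req.agentPath);
    if (fromAgent !== null) return fromAgent;
  }
  return fallback;
}
```

The point of routing through the agent's config is the one made above: a Codex-only user should get a Codex-backed summary rather than a failed call to a CLI they never installed.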
Chat session starts are queued per sessionKey, not per workingDir.
Follow-up turns inside the same conversation wait and resume in order.
The desktop app keeps mid-run follow-ups in a visible local queued-message
strip, lets users remove them, then submits the remaining queued text as one
combined turn when the active run finishes. The engine queue remains the
protocol safety net for other clients and direct API callers.
Different sessions in the same folder run in parallel. Cancelling a session
invalidates any queued turns for that session key. If multiple sessions overlap
in one folder, file-change attribution is skipped for those overlapping runs
because the diff cannot be assigned to one model safely. On successful
non-overlapping completion, the engine may emit and persist a FeedItem with
feed_type: "file_changes" and data: { created: string[], modified: string[] }; clients should render this as session-owned project artifacts.
Provider/tool execution failures that need user recovery UI are emitted as
feed_type: "tool_runtime_error" with data: { kind: "local_tool" | "provider_process", details: string }. Clients should render a user-safe retry
and report-bug surface; details is diagnostic context for reports and logs,
not user-facing copy.
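One way a client might model these two feed items is a discriminated union keyed on `feed_type`. The data shapes are the ones given above; the `FeedView` type, the function name, and the user-facing strings are hypothetical.

```typescript
type RecoverableFeedItem =
  | { feed_type: "file_changes"; data: { created: string[]; modified: string[] } }
  | {
      feed_type: "tool_runtime_error";
      data: { kind: "local_tool" | "provider_process"; details: string };
    };

interface FeedView {
  userText: string;             // safe to show in the chat surface
  reportContext: string | null; // diagnostic text for bug reports and logs only
}

function viewFor(item: RecoverableFeedItem): FeedView {
  switch (item.feed_type) {
    case "file_changes": {
      const { created, modified } = item.data;
      return {
        userText: `${created.length} created, ${modified.length} modified`,
        reportContext: null,
      };
    }
    case "tool_runtime_error":
      // details never reaches userText; it is kept aside for the report flow.
      return {
        userText: "Something went wrong. Retry or report a bug.",
        reportContext: `${item.data.kind}: ${item.data.details}`,
      };
  }
}
```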
Agent data (?agent_path= query; writes emit event)
| Method | Path | Description |
|---|---|---|
| GET/POST | /v1/agents/activities | List/create |
| PATCH/DELETE | /v1/agents/activities/:id | Update/delete |
| GET/PUT | /v1/agents/config | Read/write project config |
Routine + routine-run CRUD is not here — there is one canonical surface
under /v1/routines + /v1/routine-runs (below); the engine-client points all
routine CRUD at it. (The old duplicate /v1/agents/routines* mirror, which
silently dropped timezone, was removed.)
Agent files (typed .houston/ + project file browser)
| Method | Path | Description |
|---|---|---|
| GET/DELETE | /v1/agents/files | List / delete project file |
| POST | /v1/agents/files/read | Read typed data file |
| POST | /v1/agents/files/write | Write typed data file (emits event) |
| POST | /v1/agents/files/seed-schemas | Seed .houston/<type>/<type>.schema.json |
| POST | /v1/agents/files/migrate | Run idempotent migrations |
| POST | /v1/agents/files/read-project | Read project file |
| POST | /v1/agents/files/rename | Rename |
| POST | /v1/agents/files/folder | Create folder |
| POST | /v1/agents/files/import | Import paths |
| POST | /v1/agents/files/import-bytes | Import base64 bytes |
Routines (the single routine surface — CRUD + scheduler)
All routine + routine-run CRUD lives here (the engine-client targets it); the
/v1/agents/routines* mirror was deleted. Query params are camelCase
(?agentPath, ?routineId). A routine carries optional
provider/model/effort overrides (absent = inherit the agent's config at
dispatch); the dispatcher resolves provider+model via
sessions::resolve_provider_with_overrides and effort via
resolve_effort_with_override (an effort the resolved provider rejects is
dropped), the same precedence a chat turn uses. Create/update/delete + run create/update
emit RoutinesChanged / RoutineRunsChanged.
| Method | Path | Description |
|---|---|---|
| GET/POST | /v1/routines | List/create (by ?agentPath) |
| PATCH/DELETE | /v1/routines/:id | Update/delete |
| POST | /v1/routines/:id/runs | Create run |
| POST | /v1/routines/:id/runs/:run_id:cancel | Stop an in-flight run (kills the provider PID, marks status cancelled). 409 if the run is already terminal. Deleting a routine cascades to this for any running runs. |
| POST | /v1/routines/:id/run-now | Manual trigger. Returns once the run row is created (404 if the routine is gone, 409 if this routine already has a run in flight); the session runs on a detached task, so follow it via RoutineRunsChanged. Different routines on one agent both run, serialized on the folder; the same routine can't double-run. |
| GET | /v1/routine-runs | List (optional ?routineId) |
| PATCH | /v1/routine-runs/:id | Update run |
| POST | /v1/routines/scheduler/start | Start per-agent cron |
| POST | /v1/routines/scheduler/stop | Stop |
| POST | /v1/routines/scheduler/sync | Re-read routines, rebuild cron jobs |
Routine schedules are standard Unix cron (0/7 = Sunday, weekdays 1-5)
everywhere a human touches them — the UI builder, the stored schedule string,
and the frontend nextFire preview. The backend cron crate numbers days
1-7 (1 = Sunday) and rejects 0, so routines::cron_compat::to_engine_cron
translates the day-of-week field at the single spawn_cron boundary. Without it
every weekly routine fired a day early and Sunday routines never scheduled
(issue #389). Keep cron generation/parsing on the standard convention; never
hand a raw schedule to Schedule::from_str.
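The day-of-week translation can be sketched as follows. This is not the Rust `to_engine_cron` implementation: the function names are hypothetical, only numeric fields are handled (no `MON`-style names), and emitting an explicit comma list instead of preserving ranges is a simplification that assumes the backend parser accepts lists.

```typescript
// Expand one standard-cron day-of-week field (0 or 7 = Sunday) into the
// sorted set of days it selects, normalised to 0-6.
function expandStandardDow(field: string): number[] {
  const days = new Set<number>();
  for (const part of field.split(",")) {
    const pieces = part.split("/");
    const range = pieces[0] ?? "";
    const step = pieces.length > 1 ? Number(pieces[1]) : 1;
    let lo: number;
    let hi: number;
    if (range === "*") {
      lo = 0;
      hi = 6;
    } else if (range.indexOf("-") !== -1) {
      const bounds = range.split("-");
      lo = Number(bounds[0]);
      hi = Number(bounds[1]);
    } else {
      lo = Number(range);
      hi = lo;
    }
    const valid =
      range !== "" &&
      Number.isInteger(lo) && Number.isInteger(hi) && Number.isInteger(step) &&
      lo >= 0 && hi <= 7 && lo <= hi && step >= 1;
    if (!valid) throw new Error(`unsupported day-of-week field: ${field}`);
    // 7 folds onto 0 so both spellings of Sunday land on the same day.
    for (let d = lo; d <= hi; d += step) days.add(d % 7);
  }
  return Array.from(days).sort((a, b) => a - b);
}

// Shift every day up by one to reach the 1-7 numbering where 1 = Sunday.
function toEngineDow(field: string): string {
  const days = expandStandardDow(field);
  if (days.length === 7) return "*";
  return days.map((d) => d + 1).join(",");
}
```

For example, the weekday range `1-5` becomes `2,3,4,5,6`, and both `0` and `7` become `1`. Expanding to a list sidesteps wrapped ranges such as `5-7`, which would otherwise translate to the invalid `6-1`.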
Conversations (cross-agent read)
| Method | Path | Description |
|---|---|---|
| POST | /v1/conversations/list | List conversations for one agent |
| POST | /v1/conversations/list-all | List across many agents |
Conversation entries include the activity's stored session_key plus the
card metadata the agent board needs to render the same mission card in
cross-agent surfaces: agent, routine_id, and worktree_path when present.
Skills
| Method | Path | Description |
|---|---|---|
| GET/POST | /v1/skills | List/create |
| GET/PUT/DELETE | /v1/skills/:name | Load/save/delete |
| POST | /v1/skills/community/search | Search community registry, cached/throttled server-side |
| POST | /v1/skills/community/install | Install community skill |
| POST | /v1/skills/repo/list | List skills in a repo |
| POST | /v1/skills/repo/install | Install from repo |
Store (agent registry + GitHub import)
| Method | Path | Description |
|---|---|---|
| GET | /v1/store/catalog | Curated listing. Uses release-bundled store/catalog.json when available; remote API fallback remains for future hosted Store. |
| GET | /v1/store/search?q= | Search catalog |
| POST | /v1/store/installs | Install by {repo, agentId}. repo: "houston-store/<id>" installs bundled package incl. skills. GitHub repo form remains supported. |
| DELETE | /v1/store/installs/:agent_id | Uninstall |
| POST | /v1/agents/install-from-github | One-off install by URL |
| POST | /v1/agents/check-updates | Which installed agents have new versions |
Preferences + providers + agent-configs
| Method | Path | Description |
|---|---|---|
| GET/PUT | /v1/preferences/:key | String KV (DB-backed) |
| GET | /v1/providers/:name/status | {cliInstalled, authState, installSource, cliPath} |
| POST | /v1/providers/:name/login | Launch CLI login. Returns BAD_REQUEST for providers without an OAuth flow (e.g. gemini); callers must use the credentials route instead. Surfaces the OAuth URL via the ProviderLoginUrl WS event and the outcome via ProviderLoginComplete. Optional ?deviceAuth=true selects the provider's headless device-code flow (OpenAI/codex --device-auth) for remote clients that can't receive the CLI's localhost OAuth callback; ignored by providers without a device variant (Claude keeps its paste-back code), omitted by the co-located desktop app. |
| POST | /v1/providers/:name/login/code | Relay the OAuth verification code the user pasted (paste-back flow, e.g. Claude on a remote/headless engine). Body: { code }. Written to the CLI's stdin. Not used by codex's device-code flow, which self-completes after the user enters the ProviderLoginUrl.user_code on the provider's page. |
| POST | /v1/providers/:name/login/cancel | Abort an in-flight sign-in: kills the CLI subprocess and frees the in-flight slot so a retry isn't rejected as "already pending". Idempotent (no-op when nothing pending). Emits a benign ProviderLoginComplete (success: false, error: null) so pending spinners clear without an error toast. Fixes the stuck-spinner-after-closing-browser case. |
| POST | /v1/providers/gemini/credentials | Write GEMINI_API_KEY to ~/.gemini/.env (atomic, mode 0600). Body: { apiKey }. Provider-specific because Gemini is the only provider with file-backed credentials today. |
| GET | /v1/agent-configs | List installed agent definitions |
Composio (MCP integrations)
| Method | Path | Description |
|---|---|---|
| GET | /v1/composio/status | Full status bundle |
| GET | /v1/composio/cli-installed | Bool |
| POST | /v1/composio/cli | Install Composio CLI (no-op when bundled; see knowledge-base/cli-bundling.md) |
| POST | /v1/composio/login | Start OAuth |
| POST | /v1/composio/login/complete | Finish OAuth w/ cli_key |
| GET | /v1/composio/apps | Catalog |
| GET/POST | /v1/composio/connections | List / start connect |
Claude Code (runtime install — proprietary CLI not bundled)
| Method | Path | Description |
|---|---|---|
| GET | /v1/claude/cli-installed | Bool |
| GET | /v1/claude/status | {installed, install_path, pinned_version, installed_version} |
| POST | /v1/claude/install | Trigger background download + sha256 verify; progress streams as ClaudeCliInstalling events on the WS firehose |
Worktrees + shell
| Method | Path | Description |
|---|---|---|
| POST | /v1/worktrees | Create git worktree |
| POST | /v1/worktrees/list | List |
| POST | /v1/worktrees/remove | Remove |
| POST | /v1/shell | Run arbitrary shell (cwd + cmd) |
Attachments
| Method | Path | Description |
|---|---|---|
| POST | /v1/attachments/uploads | Create per-file upload sessions for a scope |
| PUT | /v1/attachments/uploads/:upload_id/content | Stream raw file bytes for one upload |
| GET | /v1/attachments/:scope_id | List attachment manifests for a scope |
| DELETE | /v1/attachments/:scope_id | Delete all attachments for a scope |
Attachment uploads are binary, one file per PUT. The create call declares
scopeId, name, size, and optional mime; the content call sends raw bytes
directly, not base64 JSON. The engine writes to a temp file, counts bytes,
computes SHA-256, rejects size mismatches or over-limit files, then atomically
commits a manifest + prompt-readable file path under
<home>/cache/attachments/scopes/<scopeId>/.
There is no user-facing attachment count cap. The SDK chunks large selections into multiple create requests so a user can attach many files, such as dozens of bank statements, while the engine still bounds each pending upload reservation. Current limits: 25 upload sessions per create request, 100MB per file, 250MB per create request, and 500MB per scope.
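The chunking the SDK performs can be approximated with a greedy batcher like the one below. It is a sketch, not the SDK's code: the names are hypothetical, `MB` is assumed to mean 1,000,000 bytes, and the per-scope check only looks at the current selection, not at files already stored in the scope.

```typescript
const MB = 1000000; // assumption: decimal megabytes

const LIMITS = {
  sessionsPerRequest: 25,
  bytesPerFile: 100 * MB,
  bytesPerRequest: 250 * MB,
  bytesPerScope: 500 * MB,
};

interface PendingFile {
  name: string;
  size: number; // bytes
}

// Split a selection into batches, one batch per create request.
function chunkForCreate(files: PendingFile[]): PendingFile[][] {
  const total = files.reduce((sum, f) => sum + f.size, 0);
  if (total > LIMITS.bytesPerScope) {
    throw new Error("selection exceeds the per-scope limit");
  }
  const batches: PendingFile[][] = [];
  let current: PendingFile[] = [];
  let currentBytes = 0;
  for (const file of files) {
    if (file.size > LIMITS.bytesPerFile) {
      throw new Error(`${file.name} exceeds the per-file limit`);
    }
    const wouldOverflow =
      current.length === LIMITS.sessionsPerRequest ||
      currentBytes + file.size > LIMITS.bytesPerRequest;
    // Close the current batch before it breaks a per-request limit.
    if (wouldOverflow && current.length > 0) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(file);
    currentBytes += file.size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Sixty small files therefore become three create requests (25, 25, 10), and three 100MB files become two (the third would push the first request past 250MB).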
Mobile tunnel
| Method | Path | Description |
|---|---|---|
| GET | /v1/tunnel/status | Tunnel connection state |
| POST | /v1/tunnel/pairing | Return stable phone-access QR payload (<tunnelId>-<accessSecret>) |
| POST | /v1/tunnel/reset-access | Rotate phone-access QR secret and revoke all device tokens |
See docs/mobile-architecture.md for the full flow — desktop engine opens an outbound WS to the Houston relay, which proxies mobile HTTP+WS AND serves the PWA bundle from the same origin. Phone pairing is durable: laptop sleep/shutdown keeps the same tunnel identity and phone tokens; only Settings → Disconnect all phones rotates the QR secret.
Watcher
| Method | Path | Description |
|---|---|---|
| POST | /v1/watcher/start | Start notify watch on agent dir |
| POST | /v1/watcher/stop | Stop |
Every WS frame is an EngineEnvelope:
{
"v": 1,
"id": "b6e1c7d3-...",
"kind": "event | req | res | ping | pong",
"ts": 1712345678901,
"payload": { ... }
}

- kind: "event" → payload is a HoustonEvent (same enum the frontend already consumes) or a LagMarker ({type:"Lag", dropped: N}).
- kind: "req" → client request. {op:"sub"|"unsub", topics:[...]}. Per-topic filtering is wired: subscribing to "*" gets the firehose; subscribing to specific topics limits what the forwarder sends.
- kind: "res" → server response to a prior req (future use).
- kind: "ping" | "pong" → keep-alive. Server emits a ping every 20s.
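The envelope maps naturally onto a small TypeScript type plus a defensive parser. This is a sketch: the type and function names are illustrative rather than copied from ui/engine-client/src/types.ts, and the request id is passed in so the caller picks the id scheme.

```typescript
type EnvelopeKind = "event" | "req" | "res" | "ping" | "pong";

interface EngineEnvelope<P = unknown> {
  v: number;
  id: string;
  kind: EnvelopeKind;
  ts: number; // epoch milliseconds, as in the sample above
  payload: P;
}

interface TopicRequest {
  op: "sub" | "unsub";
  topics: string[];
}

// Build a subscribe/unsubscribe request frame.
function topicRequest(
  op: "sub" | "unsub",
  topics: string[],
  id: string,
  now: number = Date.now(),
): EngineEnvelope<TopicRequest> {
  return { v: 1, id, kind: "req", ts: now, payload: { op, topics } };
}

// Parse an incoming text frame; returns null for anything malformed.
function parseEnvelope(raw: string): EngineEnvelope | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const e = value as Record<string, unknown>;
  const kinds = ["event", "req", "res", "ping", "pong"];
  if (typeof e.v !== "number" || typeof e.id !== "string" || typeof e.ts !== "number") return null;
  if (typeof e.kind !== "string" || kinds.indexOf(e.kind) === -1) return null;
  return { v: e.v, id: e.id, kind: e.kind as EnvelopeKind, ts: e.ts, payload: e.payload };
}
```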
Per-connection bounded mpsc with capacity 1024. On lag the server:
- Coalesces consecutive SessionStatus and low-severity FeedItem updates.
- Sends a LagMarker so the client knows to refetch.
- Continues streaming once drained.
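On the client side, the lag marker is the cue to stop trusting incremental state. A minimal sketch, assuming only the `{type:"Lag", dropped: N}` shape given earlier; the `SyncState` type and function names are hypothetical.

```typescript
interface LagMarker {
  type: "Lag";
  dropped: number;
}

function isLagMarker(payload: unknown): payload is LagMarker {
  if (typeof payload !== "object" || payload === null) return false;
  const p = payload as Record<string, unknown>;
  return p.type === "Lag" && typeof p.dropped === "number";
}

interface SyncState {
  stale: boolean;       // true once events were dropped; cleared after a refetch
  droppedTotal: number; // running count, useful for diagnostics
}

// Fold one event payload into the sync state. Only lag markers change it.
function onEventPayload(state: SyncState, payload: unknown): SyncState {
  if (!isLagMarker(payload)) return state;
  return { stale: true, droppedTotal: state.droppedTotal + payload.dropped };
}
```

A client would check `stale` after each frame and, when set, reload the affected resources over HTTP before resetting the flag.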
Reserved topic names. Clients that want the firehose subscribe to the
special * topic. Subscribing to specific topics limits what the
forwarder sends — essential for remote clients where bandwidth matters.
| Topic | Payload variants |
|---|---|
| * | Firehose. Delivers every event regardless of its event_topic. The desktop app uses this so it doesn't need to track per-agent / per-session subscriptions. Remote clients should prefer narrower topics. |
| session:{key} | FeedItem, SessionStatus, AuthRequired |
| agent:{path} | ActivityChanged, SkillsChanged, FilesChanged, ConfigChanged, ContextChanged, LearningsChanged, ConversationsChanged |
| routines:{agent} | RoutinesChanged, RoutineRunsChanged |
| composio | ComposioCliReady, ComposioCliFailed |
| scheduler | HeartbeatFired, CronFired |
| toast | Toast, CompletionToast |
| events | EventReceived, EventProcessed |
| auth | AuthRequired |
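The filtering rule the forwarder applies can be stated in a few lines. This sketch mirrors the behaviour described above rather than the Rust implementation; the helper names are hypothetical, and whether the key or path is encoded inside a topic string is not specified here, so the builders pass them through unchanged.

```typescript
// Topic name builders for the two parameterised topics a chat UI needs.
const sessionTopic = (key: string): string => `session:${key}`;
const agentTopic = (path: string): string => `agent:${path}`;

// An event is forwarded when the connection holds the firehose subscription
// or is subscribed to the event's exact topic.
function shouldForward(subscribed: ReadonlySet<string>, eventTopic: string): boolean {
  return subscribed.has("*") || subscribed.has(eventTopic);
}
```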
- engine/houston-engine-server/tests/: in-process HTTP + WS assertions.
- ui/engine-client/src/types.ts: mirrors the Rust DTOs by hand until a codegen tool (ts-rs or specta) is adopted. CI should fail if shapes drift.
These are load-bearing things every custom frontend must do. Missing any of them doesn't break the build but will produce a frozen or silently-wrong UI at runtime.
The Claude/Codex CLI writes files via its own tools — those writes
bypass the engine entirely. The engine only learns about them when
the filesystem watcher is running. Call
POST /v1/watcher/start (SDK: client.startAgentWatcher(agentPath))
exactly once after you resolve the agent folder. Without it,
FilesChanged never fires for agent-side writes and the UI looks
frozen until a manual reload.
The per-connection forwarder drops events that arrive before the
client has subscribed to their topic. Subscribe to session:<key>
and agent:<path> first, THEN POST /v1/agents/:path/sessions.
The echoed session_key in the start response is safe; early
events for that key may have been dropped — refetch with
/v1/agents/:path/sessions/:key/history if you need them.
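The ordering can be enforced in one helper that sends the subscription frame before it issues the POST. This is a sketch: the `Transport` interface is a hypothetical stand-in for a real WebSocket and HTTP client, and the start body is left opaque because its fields are not listed in this section.

```typescript
// Hypothetical transport seam so the ordering can be shown without real I/O.
interface Transport {
  sendWs(frame: string): void;
  post(path: string, body: unknown): Promise<unknown>;
}

function startSessionSubscribed(
  t: Transport,
  agentPath: string,
  sessionKey: string,
  body: Record<string, unknown>,
  requestId: string,
): Promise<unknown> {
  const sub = {
    v: 1,
    id: requestId,
    kind: "req",
    ts: Date.now(),
    payload: { op: "sub", topics: [`session:${sessionKey}`, `agent:${agentPath}`] },
  };
  // 1. Subscribe first so the forwarder has the topics registered.
  t.sendWs(JSON.stringify(sub));
  // 2. Only then start the turn.
  return t.post(`/v1/agents/${encodeURIComponent(agentPath)}/sessions`, body);
}
```

Because `res` frames are reserved for future use, there is no acknowledgement to wait on; a client that cannot tolerate any gap should still reload history after the start call returns.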
POST /v1/agents/:path/sessions accepts an optional systemPrompt
field. When omitted, the engine falls back to whatever the embedding
app passed in via HOUSTON_APP_SYSTEM_PROMPT at subprocess spawn. The
engine has no hardcoded product copy — it only assembles generic
per-agent context from disk (working directory, mode overrides,
skills index, integrations). Final prompt =
<product_prompt>\n\n---\n\n<agent_context>. Onboarding sessions use
HOUSTON_APP_ONBOARDING_PROMPT as an additional suffix.
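The assembly rule is simple enough to state as code. The function names below are hypothetical, and the onboarding suffix is omitted because this section does not say how it is joined.

```typescript
// Pick the product prompt: the request's systemPrompt when present,
// otherwise the value the embedding app supplied at spawn time.
function productPromptFor(requestPrompt: string | undefined, appDefault: string): string {
  return requestPrompt ?? appDefault;
}

// Final prompt = <product_prompt>\n\n---\n\n<agent_context>
function finalPrompt(productPrompt: string, agentContext: string): string {
  return `${productPrompt}\n\n---\n\n${agentContext}`;
}
```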
The assembled prompt reaches the provider CLIs via scratch files,
never argv (houston-terminal-manager::prompt_scratch): codex gets a
per-session profile at $CODEX_HOME/houston-tmp-*.config.toml selected
with -p (requires the file-based profiles in codex ≥ 0.137 — keep the
cli-deps.json pin at or above that), claude gets a temp file via
--system-prompt-file. Argv tokens are capped at 32,767 chars total by
Windows CreateProcessW; carrying the prompt inline (-c developer_instructions=… / --system-prompt <text>) broke every spawn
with os error 206 once an agent's accumulated context outgrew the limit.
Growth is also bounded at the source: workspace_context caps the
WORKSPACE.md/USER.md prompt share (12 KB / 4 KB, newest-first, with
an explicit omission marker) the same way learnings_context caps
learnings — files on disk are never trimmed.
assistant_text_streaming deltas should REPLACE the in-progress
assistant message in your state; assistant_text finalizes it.
Don't append every streaming delta as a new message row. Same
pattern for thinking_streaming / thinking. See
examples/smartbooks/src/lib/feed.ts::appendFeedItem.
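A reducer with this replace-then-finalize behaviour might look like the sketch below. It is not the `appendFeedItem` from the smartbooks example; the `FeedItem` shape is reduced to the two fields needed here, and `data` is treated as opaque.

```typescript
interface FeedItem {
  feed_type: string;
  data: unknown;
}

// Map a streaming feed type to the type that finalizes it, or null.
function finalTypeFor(feedType: string): string | null {
  if (feedType === "assistant_text_streaming") return "assistant_text";
  if (feedType === "thinking_streaming") return "thinking";
  return null;
}

// Returns a new feed array; the input is not mutated.
function applyFeedItem(feed: FeedItem[], item: FeedItem): FeedItem[] {
  const last = feed.length > 0 ? feed[feed.length - 1] : undefined;
  const lastFinal = last ? finalTypeFor(last.feed_type) : null;
  // Replace the last row when it is an in-progress streaming row and the
  // incoming item is either another delta of the same kind or its finalizer.
  const replacesLast =
    last !== undefined &&
    lastFinal !== null &&
    (item.feed_type === last.feed_type || item.feed_type === lastFinal);
  return replacesLast ? feed.slice(0, -1).concat([item]) : feed.concat([item]);
}
```

Three streaming deltas followed by the final `assistant_text` therefore leave exactly one row in the feed.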
The terminal feed_type: "final_result" item carries data: { result, cost_usd, duration_ms, usage }. usage is the normalized TokenUsage
{ context_tokens, output_tokens, cached_tokens } (Rust
houston-terminal-manager::TokenUsage, TS @houston-ai/chat TokenUsage)
or null for providers that don't report it (Anthropic + Codex do; Gemini
doesn't yet). context_tokens is the prompt size of the most recent model
request, i.e. how much of the context window is in use.
- Anthropic: the parser sums the last assistant message's three-way split (input + cache_creation + cache_read). The per-message usage IS the last request, so this is the live fill.
- Codex: trickier. codex exec --json only emits turn.completed.usage, which is the CUMULATIVE sum of every model request in the turn (a turn with N tool round-trips reports ~N× the real size; this is the 94k-instead-of-19k bug). The real last-request fill + the effective window live ONLY in Codex's on-disk rollout ($CODEX_HOME/sessions/**/rollout-*-<thread_id>.jsonl, default ~/.codex), in token_count.info.last_token_usage / model_context_window. So engine codex_rollout::latest_usage(thread_id) reads the newest rollout's last token_count and session_io patches it onto the FinalResult after the stream flushes (codex only writes the rollout fully on exit, so the held-back FinalResult is emitted post-loop). The parser leaves usage None; on any rollout failure it stays None (no % beats a wrong %). Bumping the bundled codex won't help: neither 0.130 nor 0.135 exec --json exposes the per-request data in stdout.
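For the Anthropic case, the three-way sum looks like this. The input field names follow the Anthropic API's usage block; mapping `cached_tokens` to the cache-read count is an assumption of this sketch, as is the function name.

```typescript
// Usage block of one assistant message, as reported by the Anthropic API.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Normalized shape named in this document.
interface TokenUsage {
  context_tokens: number;
  output_tokens: number;
  cached_tokens: number;
}

function normalizeAnthropicUsage(u: AnthropicUsage): TokenUsage {
  const created = u.cache_creation_input_tokens ?? 0;
  const read = u.cache_read_input_tokens ?? 0;
  return {
    // Prompt size of the last request: uncached input plus both cache buckets.
    context_tokens: u.input_tokens + created + read,
    output_tokens: u.output_tokens,
    cached_tokens: read, // assumption: "cached" means tokens served from cache
  };
}
```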
The desktop composer's context-usage indicator (app/src/components/context-indicator.tsx) divides the latest turn's context_tokens by a window
estimate to drive a ring gauge (a donut whose arc fills with the occupied
fraction and turns red near the limit; the percentage, a progress bar, and
rounded token counts surface on hover); it reads usage via sessionContextUsage
(app/src/lib/context-usage.ts) so it works both live and after a history
reload (the field is persisted in chat_feed.data_json). /context (the
interactive Claude Code slash command) is unavailable here because the
engine drives claude -p in non-interactive print mode — the data comes
from the stream's usage blocks, not a REPL command.
The window is an estimate, by necessity. The real context window is
plan/credit-gated and is NOT reported anywhere claude -p can see (verified
against Claude Code 2.1.159: system init carries only model, tools,
mcp_servers, ... — no window field; no flag; no env var; Codex's
thread.started likewise). The gating:
- Opus 4.x → 1M automatic on Max/Team/Enterprise, else 200k (1M needs /extra-usage credits on Pro).
- Sonnet 4.6 → 200k on every plan; 1M only with usage credits.
- Codex gpt-5.5 → 258,400 effective = raw context_window 272k × effective_context_window_percent 95% (both from Codex's models_cache.json, and the rollout's model_context_window confirms 258400). The opt-in 1M variant maxes at 1M × 95% = 950k.
So the indicator uses a self-correcting estimate (providers.ts
contextWindow = default assumption, contextWindowMax = snap-up ceiling;
context-usage.ts effectiveContextWindow): start from the per-model
default (Opus 1M, Sonnet 200k, gpt-5.5 258.4k), then snap UP to the ceiling
once the session's observed PEAK context_tokens exceeds the default —
which proves the real window is larger, because both CLIs auto-compact
before the limit so observed usage can never exceed the true window. This
auto-fixes Sonnet-with-credits and never reads over 100%. The one case it
over-estimates is Opus on Pro WITHOUT credits (shows 1M, really 200k); it
can't self-correct downward, so the dialog labels the figure "estimated".
If a future CLI release exposes the window in system init /
thread.started, prefer that live value over the estimate.
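The snap-up rule reduces to a comparison against the observed peak. The sketch below borrows the `contextWindow` / `contextWindowMax` field names mentioned above but is not the code in context-usage.ts; the clamp in `usageFraction` and the integer arithmetic in `codexEffectiveWindow` are additions of this sketch.

```typescript
interface ModelWindow {
  contextWindow: number;    // default assumption for the model
  contextWindowMax: number; // snap-up ceiling
}

// Once the session's peak usage exceeds the default, the real window must be
// larger than the default, so jump to the ceiling.
function effectiveContextWindow(model: ModelWindow, peakContextTokens: number): number {
  return peakContextTokens > model.contextWindow
    ? model.contextWindowMax
    : model.contextWindow;
}

// Fraction of the window in use, clamped to 1 as a safety net.
function usageFraction(contextTokens: number, windowTokens: number): number {
  return Math.min(1, contextTokens / windowTokens);
}

// Codex's effective window: raw window scaled by an integer percentage.
function codexEffectiveWindow(rawWindow: number, percent: number): number {
  return Math.floor((rawWindow * percent) / 100);
}
```

With the figures above, `codexEffectiveWindow(272000, 95)` yields 258400, and a Sonnet session that has been observed at 250k tokens snaps from the 200k default to the 1M ceiling.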
When a conversation nears the context window, Houston frees space without
touching the user's visible chat. Both paths surface as one
feed_type: "context_compacted" item (data: { trigger: "native" | "proactive", pre_tokens?: number }, Rust FeedItem::ContextCompacted),
rendered as a subtle divider — the full history above and below stays visible.
- Native: Claude Code auto-compacts its own transcript as it nears the window (~95%) and emits a top-level stream-json system event {"subtype":"compact_boundary","compact_metadata":{"trigger","pre_tokens",…}} (verified against Claude Code 2.1.160). parser.rs lifts it into ContextCompacted { trigger: Native }. Claude-only today: Codex's exec auto-compaction is unreliable, which is exactly why the forced path exists.
- Proactive: the desktop client watches the context-usage % and, once it crosses the threshold (default 93%, overridable at build time via VITE_AUTOCOMPACT_THRESHOLD), sets compact: true on the next startSession. The engine (sessions::compaction) summarizes the visible chat via a one-shot provider call, abandons the current resume id with SessionIdHandle::clear_current_preserving_history() (the id stays in .history so session_ids_for_history still loads the full chat_feed), emits + persists a Proactive marker under the old id, then runs the turn on a FRESH provider session seeded with [summary + the user's message]. The persisted/displayed user message stays the original; only the agent's working context shrank. Provider-agnostic, and the reliable path for Codex.
Autocompact is always on — there is no user-facing toggle. It's a
non-destructive guarantee (the full chat_feed stays visible regardless), so
the decision is purely client-side: lib/autocompact.ts, called from
tauriChat.send so every send path gets it, reads the live feed usage
synchronously. The only knob is the threshold, a build-time constant
(VITE_AUTOCOMPACT_THRESHOLD, default 93), not a user setting.
compact is honored only when a resume id exists (ignored on turn 1). On
summary failure the engine logs and falls back to a normal resume (the CLI's own
auto-compaction is the backstop), so a turn never fails because compaction
couldn't run.
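The client-side decision can be sketched as a pure function. This is not the code in lib/autocompact.ts: the function name is hypothetical, and treating "crosses the threshold" as greater-than-or-equal is an assumption.

```typescript
// Decide whether the next send should carry compact: true.
// contextTokens is null when the provider reported no usage.
function shouldRequestCompact(
  contextTokens: number | null,
  windowTokens: number,
  thresholdPercent: number = 93,
): boolean {
  if (contextTokens === null || windowTokens <= 0) return false;
  // Cross-multiplied to stay in integer arithmetic for whole-number inputs.
  return contextTokens * 100 >= thresholdPercent * windowTokens;
}
```

Returning false when usage is unknown keeps the behaviour conservative: the CLI's own auto-compaction remains the backstop, as described above.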
POST .../sessions accepts an optional providerSwitch: { mode: "replay" | "summarize", fromProvider } (HOU-424). It is set by the client when the user
moves a live conversation to a DIFFERENT provider mid-stream. Provider CLI
sessions aren't portable, so the engine reseeds a FRESH session on the resolved
(new) provider with prior context — the full transcript verbatim (replay) or
an AI summary (summarize) — and clears any current resume id for the resolved
provider so a switch-back never resumes a stale cross-provider session. It takes
precedence over compact. The seed is built from the DB chat_feed, and the
summary (for summarize) runs on the TARGET provider, so a switch away from an
out-of-credits provider still works. Unlike compact, a switch seed failure is
NOT swallowed: it surfaces as a SessionStatus error (beta no-silent-failure
policy).
The boundary is recorded as a feed_type: "provider_switched" item (data: { provider, summarized, pre_tokens? }, Rust FeedItem::ProviderSwitched),
rendered as a subtle divider like context_compacted. provider is the
provider switched TO; summarized distinguishes the verbatim carry from the
summarized one. The full chat_feed above and below stays visible.
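The precedence between `providerSwitch`, `compact`, and a plain resume can be summarized as a small decision function. It is a sketch of the rules stated in this section, not the engine's Rust code; the `SeedPlan` labels and the function name are hypothetical.

```typescript
interface StartSessionFlags {
  compact?: boolean;
  providerSwitch?: { mode: "replay" | "summarize"; fromProvider: string };
}

type SeedPlan = "switch_replay" | "switch_summarize" | "compact" | "normal";

function seedPlan(flags: StartSessionFlags, hasResumeId: boolean): SeedPlan {
  // A provider switch takes precedence over compact.
  if (flags.providerSwitch) {
    return flags.providerSwitch.mode === "replay" ? "switch_replay" : "switch_summarize";
  }
  // compact is honored only when there is a session to resume.
  if (flags.compact === true && hasResumeId) return "compact";
  return "normal";
}
```

The failure handling differs between the branches: a failed compaction summary falls back to `normal`, while a failed switch seed surfaces as a SessionStatus error.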
The read-project route returns text only. For xlsx, pdf, images,
etc., call POST /v1/shell with open "<path>" (macOS),
xdg-open "<path>" (Linux), or start "" "<path>" (Windows) to
hand the file to the host OS's default application. A first-class
binary-read endpoint is on the roadmap — until it lands, the shell
route is the escape hatch.
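A helper that picks the right command per host OS might look like this. It is a sketch: the `HostOs` labels and function name are hypothetical, the request body for `/v1/shell` is not built here because its exact field names are not given, and the only escaping safeguard is rejecting paths that contain a double quote.

```typescript
type HostOs = "macos" | "linux" | "windows";

// Build the command string that hands a file to the OS default application.
function openCommand(os: HostOs, path: string): string {
  // The path is wrapped in double quotes, so an embedded quote would break out.
  if (path.indexOf('"') !== -1) {
    throw new Error("path must not contain a double quote");
  }
  switch (os) {
    case "macos":
      return `open "${path}"`;
    case "linux":
      return `xdg-open "${path}"`;
    case "windows":
      // The empty "" is the window title argument that start expects first.
      return `start "" "${path}"`;
  }
}
```

Production code should apply full shell escaping for the target platform, since characters such as `$` and backticks are still interpreted inside double quotes by POSIX shells.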
Browsers can't set Authorization on WebSocket upgrades. Use
?token=<token> on the WS URL instead. The engine accepts all three
(Authorization header, ?token=, Sec-WebSocket-Protocol: houston-bearer.<token>).
examples/smartbooks/ — a complete custom
frontend consumer of the engine, ~400 lines of TSX, zero @houston-ai/*
UI deps. Treat as a copy-paste template.