13 changes: 10 additions & 3 deletions README.md
@@ -42,7 +42,8 @@ Other install methods: [pip install](#alternative-install-with-pip) | [uv instal
## 🔥🔥🔥 News (Pacific Time)


- May 10, 2026 (latest): **Web Chat UI fixes — slash commands no longer reply twice; `--web --model X` actually applies the model.** Two related issues that surfaced when wiring a self-hosted vLLM endpoint into the Chat UI. (1) **Issue #111 — slash commands duplicated in Chat UI but not in terminal.** `web/api.py:handle_slash_sync` was both returning events inline in the HTTP response **and** broadcasting the same events to the WS subscribers of the same client; `chat.js` then iterated `data.events` AND fired `_handleEvent` from `ws.onmessage`, rendering every reply twice. Same bug in `handle_slash_stream` for SSE-streamed long commands (`/brainstorm`, `/worker`, `/agent`, `/plan`). Both helpers now deliver events through a single channel — HTTP/SSE only — so `_handleEvent` runs exactly once per event. Background-thread events (sentinel flows, agent runs) are unaffected: by the time the worker thread emits, `_broadcast` is already restored to the live WS broadcaster in `finally`. (2) **`--web --model X` was silently ignored.** The CLI override branch only ran in the interactive-REPL path; the `if args.web:` branch loaded config straight from disk and started the server, so `python cheetahclaws.py --web --model custom/qwen2.5-72b` would happily boot but every request handler reloaded `~/.cheetahclaws/config.json` with the previous model name (e.g. `gemma-4-31B-it`), producing a confusing `404: model does not exist` against the new endpoint. Fix: `cheetahclaws.py` now persists `args.model` to config before calling `start_web_server`, matching the documented behavior; `provider:model` → `provider/model` normalization is identical to the REPL path. User-side guide: [`docs/guides/web-ui.md`](docs/guides/web-ui.md) (Troubleshooting + Architecture notes updated).
- May 10, 2026 (latest): **Web Chat UI session organization — folders, drag-drop, batch ops, resizable sidebar, ChatGPT-style active-folder context.** Built on top of the slash-fix branch the same day. (1) **Folders.** New `folders` table (per-user, name unique), `chat_sessions.folder_id` nullable FK with `ON DELETE SET NULL` semantics enforced at the repo layer (SQLite `PRAGMA foreign_keys` is off in this engine). Light-touch migration runs in `init_db()`: a `PRAGMA table_info` probe adds the column to existing DBs in place — no Alembic, no manual steps for upgraders. New endpoints: `GET/POST /api/folders`, `PATCH/DELETE /api/folders/{id}`, `PATCH /api/sessions/{id}/folder` (body `{folder_id: int|null}`); deleting a folder preserves its sessions and reparents them to "Ungrouped". (2) **Drag-and-drop + Move-to context menu.** Session items are HTML5-draggable; folder rows and the Ungrouped header are drop targets with `drop-target` highlight. Right-click on a session shows a flat `Move to:` section listing every folder, plus `(Ungrouped)` when applicable and `+ New folder…` for create-and-move in one shot. Right-click on a folder header offers Rename / Delete (with a `confirm()` that spells out "sessions become Ungrouped — they are NOT deleted"). (3) **Active-folder context (ChatGPT-style).** Click a folder name (not the disclosure arrow) to "enter" that folder — the row gets accent highlighting and the topbar grows a `Chat · in <Folder>` breadcrumb. While a folder is active, **`+ New` and direct-typing auto-create both drop the new session into that folder**, mirroring how OpenAI Projects scope new chats. Switching to any session syncs active-folder to that session's folder so the breadcrumb stays honest. State persists across reloads via `localStorage` (`cc-active-folder`); deleted folders auto-clear. 
(4) **Batch select.** "Select" button in the sidebar header enters a checkbox mode with a footer action bar: count, Select all (respects the search filter), Delete (single confirm with total-message count), Export (single combined Markdown download with one `## Session: <title>` block per session, `chats-N-sessions.md`), Cancel. Right-click context menu is suppressed in select mode to avoid mode confusion. (5) **Resizable sidebar.** 4-px drag handle between sidebar and main pane (mouse + touch); width clamped to 200–600 px and persisted to `localStorage` (`cc-sidebar-w`). Double-click resets to default. Hidden under `@media (max-width: 768px)` so the mobile drawer keeps its swipe behavior. **Tests:** +10 new in `test_web_api.py` (folder CRUD, duplicate-409, move, delete-preserves, cross-user isolation, list includes `folder_id`, batch delete, batch delete cross-user, batch export, batch export empty 400) — full file at 31 tests, all passing, zero regressions on the existing 21. User-side guide: [`docs/guides/web-ui.md`](docs/guides/web-ui.md) (Layout / Session management / Folders / HTTP API all updated).
- May 10, 2026: **Web Chat UI fixes — slash commands no longer reply twice; `--web --model X` actually applies the model.** Two related issues that surfaced when wiring a self-hosted vLLM endpoint into the Chat UI. (1) **Issue #111 — slash commands duplicated in Chat UI but not in terminal.** `web/api.py:handle_slash_sync` was both returning events inline in the HTTP response **and** broadcasting the same events to the WS subscribers of the same client; `chat.js` then iterated `data.events` AND fired `_handleEvent` from `ws.onmessage`, rendering every reply twice. Same bug in `handle_slash_stream` for SSE-streamed long commands (`/brainstorm`, `/worker`, `/agent`, `/plan`). Both helpers now deliver events through a single channel — HTTP/SSE only — so `_handleEvent` runs exactly once per event. Background-thread events (sentinel flows, agent runs) are unaffected: by the time the worker thread emits, `_broadcast` is already restored to the live WS broadcaster in `finally`. (2) **`--web --model X` was silently ignored.** The CLI override branch only ran in the interactive-REPL path; the `if args.web:` branch loaded config straight from disk and started the server, so `python cheetahclaws.py --web --model custom/qwen2.5-72b` would happily boot but every request handler reloaded `~/.cheetahclaws/config.json` with the previous model name (e.g. `gemma-4-31B-it`), producing a confusing `404: model does not exist` against the new endpoint. Fix: `cheetahclaws.py` now persists `args.model` to config before calling `start_web_server`, matching the documented behavior; `provider:model` → `provider/model` normalization is identical to the REPL path. User-side guide: [`docs/guides/web-ui.md`](docs/guides/web-ui.md) (Troubleshooting + Architecture notes updated).
- May 10, 2026: **Small-context local models survive large workloads — 4-part fix: ctx cap, auto-fanout, stagnation-stop, output paths under `~/.cheetahclaws/`.** Repro that motivated the work: running `/agent → 1 (Research Assistant)` on a 6.6 MB PDF (`AutoRedTeamer.pdf` — ~70k tokens of extracted text) with `custom/qwen2.5-72b` (32k ctx). Old behavior: 400 BadRequest "context length 32768"; the agent_runner kept polling the template every 2 s; the model produced **1500+ identical "task complete" summaries** before anything stopped it. New behavior, four cooperating layers: (1) **Per-model context-window registry + dynamic max_tokens cap** (`providers._MODEL_CONTEXT_LIMITS` + `get_model_context_window` + `dynamic_cap_max_tokens`) — covers Qwen 2.5/3, Llama 3.x, Mistral/Mixtral, Phi, Gemma, DeepSeek local variants; `_fetch_custom_model_limit` now backfills `PROVIDERS["custom"]["context_limit"]` so compaction sees the live `/v1/models` value; per-call shrink based on actual prompt size keeps `input + output + 1024 safety ≤ ctx`. `compaction.get_context_limit` gains an optional `config` arg so custom-endpoint detection works on the very first turn. (2) **Auto-fanout for oversize tool outputs** (`multi_agent/fanout.py`) — when a single tool result (Read on a huge PDF, Grep over a giant tree, WebFetch of a long article) exceeds 0.4 × ctx_window, split into chunks at paragraph boundaries with token-overlap, dispatch parallel sub-LLM map calls (one per chunk, default cap 5 subagents), merge with a single reduce call; substitutes the merged summary in conversation history instead of letting the next API call overflow. Hooked at the tool-result append site in `agent.py`; transparent UX prints `[Auto-fanout: <Tool> returned ~N chars (>threshold) → dispatching K parallel sub-summaries]`. Configurable: `auto_fanout_enabled` / `_threshold` / `_max_subagents` / `_chunk_overlap_tokens`. 
(3) **Stagnation-stop in `agent_runner.py`** — when the model emits the same summary N iterations in a row (default 3, whitespace/case-normalized), stop the loop with a clear notification instead of burning thousands of API calls; configurable via `auto_agent_dup_summary_limit` (0 disables). (4) **Agent output paths under `~/.cheetahclaws/`** — `/agent` wizard now resolves relative output filenames (e.g. `research_notes.md`) to absolute paths under `~/.cheetahclaws/agents/<name>/output/` instead of CWD; `AgentRunner` exposes `runner.output_dir`, eagerly mkdir'd; Summary block + post-start info show the resolved path in green; absolute paths pass through unchanged. **Tests:** +47 new (fanout 23, ctx cap 18, dup-stop 13, output paths 8). **Full suite: 2139 passing, zero regressions.** User-side guide: [`docs/guides/extensions.md`](docs/guides/extensions.md).
- May 9, 2026: **`fix/agentic-on-every-model` branch — make every model produce useful work, and make `/brainstorm` an actual debate.** A single coordinated branch (9 commits, 269 new tests, zero regressions) that lands on weak / non-Claude models specifically. **Prompts:** new `prompts/overlays/qwen.md` overlay for qwen / qwq families plus an explore-first section in `default.md` so any model walks a directory before asking the user to name a file. **Runtime:** `agent.py` auto-nudge (one-shot, when user message contains an absolute path but the model replies text-only); read-only tool dedup (Read/Glob/Grep/WebFetch/WebSearch with identical args within a turn → 2nd call short-circuited, model gets a `[deduped]` reminder); KeyError-on-empty-args hardening in tool dispatch (`Write({}) → KeyError: 'file_path'` is now a friendly "missing required parameter" error the model can self-correct from). **Providers:** new `nim` provider (build.nvidia.com free tier, 10-model curated chain) invoked as `nim/<vendor>/<model>`, with 429 cascade fallback (cap 3 swaps/turn, gated to NIM only). **`/brainstorm` overhaul:** real lead moderator (`--lead <model>`) does opening (sets agenda + bans filler) → personas debate in N rounds (`--rounds N`, default 2) → lead probes after each round → lead synthesizes a structured master plan inline (no main-agent Read needed); round 2+ is **adversarial cross-examination** — every persona MUST quote another agent's claim and attack it with a falsifiable counter, "agree-and-extend" is forbidden, lead probes any dodge. New `--models a,b,c` flag distributes different models per persona for epistemic diversity. **`/monitor` + `/research` stability:** `/subscribe` no longer truncates multi-word topics ("Agent OS Benchmark" used to become "Agent"); aggregator no longer deadlocks on a hung source after `as_completed` timeout; REPL Ctrl+C during a slow slash command cancels just that command instead of killing the whole process. 
Branch: `fix/agentic-on-every-model`. User-side guide: [`docs/guides/brainstorm.md`](docs/guides/brainstorm.md).
- May 8, 2026: **Agent-OS layer (`cc_kernel/`) reaches v1.0 — 27 RFCs shipped, 1771 tests passing, zero regressions on the legacy REPL/bridges path.**
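The per-call shrink in the May 10 small-context entry above (`input + output + 1024 safety ≤ ctx`) comes down to a few lines of arithmetic. A minimal sketch, assuming token counts are already known; `cap_max_tokens` and its parameters are hypothetical names for illustration, not the actual signature of `dynamic_cap_max_tokens` in `providers`:

```python
SAFETY_MARGIN = 1024  # tokens of headroom, the figure quoted in the changelog entry


def cap_max_tokens(ctx_window: int, prompt_tokens: int, requested: int,
                   safety: int = SAFETY_MARGIN) -> int:
    """Return the largest output budget such that
    prompt_tokens + output + safety <= ctx_window.

    Hypothetical helper: the name and signature are illustrative only.
    """
    room = ctx_window - prompt_tokens - safety
    # Never exceed what the caller asked for, never go below zero.
    return max(0, min(requested, room))


# 32k-context model, 20k-token prompt, 16k requested: 32768 - 20000 - 1024 = 11744
print(cap_max_tokens(32768, 20000, 16384))  # 11744
# Prompt alone is larger than the window: no output budget fits
print(cap_max_tokens(32768, 70000, 4096))   # 0
```

A result of 0 means the prompt itself no longer fits, so shrinking the output budget cannot help; that is the case the auto-fanout layer in the same entry is described as targeting.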
@@ -932,8 +933,11 @@ On first visit to `http://localhost:<port>/chat`, the UI routes you to a **regis
|---------|---------|
| **Streaming chat** | WebSocket for live prompts + SSE for long-running slash commands |
| **Persistent history** | Every session + message lives in SQLite (`~/.cheetahclaws/web.db`). Server restart does not lose state. |
| **Sidebar session management** | Title auto-generated from first user message, relative time ("12m ago"), message count, busy dot, client-side search, right-click menu (Rename / Export Markdown / Delete) |
| **Cross-user isolation** | Each user only sees their own sessions — enforced at DB query and in-memory cache |
| **Sidebar session management** | Title auto-generated from first user message, relative time ("12m ago"), message count, busy dot, client-side search, right-click menu (Rename / Export Markdown / Move to / Delete) |
| **Folders + ChatGPT-style Projects** | `+ Folder` button creates per-user folders; drag a session onto a folder header (or right-click → Move to ▸) to file it; click a folder name to "enter" — `+ New` and direct-typing then auto-drop the new session into that folder, with a `Chat · in <Folder>` topbar breadcrumb. Deleting a folder reparents its sessions to "Ungrouped" rather than deleting them. |
| **Batch operations** | "Select" button enters multi-select mode (checkboxes, Select all respects the search filter); a footer action bar batch-deletes (single confirm + total-message count) or batch-exports as a single combined Markdown (`chats-N-sessions.md`). |
| **Resizable sidebar** | Drag the 4-px divider between the sidebar and the chat pane (200–600 px clamp); double-click resets; width persists across reloads. |
| **Cross-user isolation** | Each user only sees their own sessions and folders — enforced at DB query and in-memory cache |
| **Tool cards** | Collapsible cards show tool name, inputs, outputs, status (running / done / denied) |
| **Permission approval** | Inline Allow / Deny buttons |
| **45+ slash commands** | `/status`, `/model`, `/brainstorm`, `/ssj`, `/plan`, `/telegram`, `/wechat`, `/slack`, `/voice`, `/image`, etc. |
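
The folder rows in the table rely on two behaviours named in the changelog: an in-place `PRAGMA table_info` probe that adds `folder_id` to existing databases, and `ON DELETE SET NULL` semantics enforced in application code because SQLite foreign keys are off. A minimal sketch of both, assuming a simplified schema; the column lists and the `delete_folder` helper are hypothetical and not taken from the project's repo layer:

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create tables if missing, then add folder_id to older databases in place."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_sessions ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS folders ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, "
        "name TEXT NOT NULL, UNIQUE (user_id, name))"
    )
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(chat_sessions)")}
    if "folder_id" not in columns:
        conn.execute("ALTER TABLE chat_sessions ADD COLUMN folder_id INTEGER")
    conn.commit()


def delete_folder(conn: sqlite3.Connection, user_id: int, folder_id: int) -> bool:
    """Delete one user's folder; its sessions become Ungrouped (folder_id NULL)."""
    with conn:  # single transaction: reparent first, then delete
        conn.execute(
            "UPDATE chat_sessions SET folder_id = NULL "
            "WHERE folder_id = ? AND user_id = ?",
            (folder_id, user_id),
        )
        cur = conn.execute(
            "DELETE FROM folders WHERE id = ? AND user_id = ?",
            (folder_id, user_id),
        )
    return cur.rowcount == 1
```

Scoping both statements by `user_id` is one way to get the cross-user isolation the table describes: a request for another user's folder changes nothing and returns `False`.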
@@ -955,6 +959,9 @@ Browser ──→ /chat ──→ 9 JS modules load from /static/
──→ /api/prompt (POST) ──→ persists to SQLite, fans events out
──→ /api/events (WS) ──→ real-time text_chunk / tool_* / permission_*
──→ /api/sessions/* ──→ list / get / rename / delete / export
+ batch_delete / batch_export
+ {id}/folder (move to folder)
──→ /api/folders ──→ list / create / rename / delete folders

──→ / ──→ xterm.js PTY (password-gated)
──→ /health ──→ { ok, db, uptime_s } (unauthenticated)
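
The `{id}/folder` route in the diagram corresponds to `PATCH /api/sessions/{id}/folder` with body `{folder_id: int|null}` from the changelog. A minimal sketch of building that request with the standard library, assuming string session ids and a placeholder base URL; `move_session_request` is a hypothetical helper, and how the server authenticates the call is not shown in this diff, so no credentials are attached:

```python
import json
from typing import Optional
from urllib.request import Request


def move_session_request(base_url: str, session_id: str,
                         folder_id: Optional[int]) -> Request:
    """Build (but do not send) a PATCH that files a session into a folder.

    folder_id=None serialises to JSON null, i.e. move the session to Ungrouped.
    """
    body = json.dumps({"folder_id": folder_id}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/sessions/{session_id}/folder",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host, port, and session id, for illustration only.
req = move_session_request("http://localhost:8080", "abc123", 7)
print(req.get_method(), req.full_url)
```

Sending it would be `urllib.request.urlopen(req)` once whatever session cookie or token the server expects has been added.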