This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
OpenLess is a menu-bar/tray voice-input layer. Hold or toggle a global hotkey, speak, and the dictated text is polished and inserted at the current cursor in any app. Product principles, state machine, and module list live in docs/openless-development.md and docs/openless-overall-logic.md — read those before changing product behavior.
The active codebase lives at openless-all/app/ and is Tauri 2 + Rust backend + React/TS frontend, targeting macOS 12+ and Windows. The legacy Swift implementation (Sources/, Tests/, Package.swift, appcast.xml, Sparkle pipeline) was removed in commit 34d2823; do not resurrect it.
UI must match openless-all/design_handoff_openless/*.jsx pixel-for-pixel; the JSX is reference-only, never imported.
Adjacent docs:
- `AGENTS.md` is the parallel of this file for Codex sessions; the research-before-coding rules at the bottom of this file delegate to it.
- `README.md` / `README.zh.md` (root) are user-facing install + feature guides; `USAGE.md` covers runtime usage. Update them when shipping user-visible features, not for internal refactors.
cd "openless-all/app"
npm ci
# Dev: vite at :1420 + tauri shell
npm run tauri dev
# Build .app (+ DMG) — use this script, not `tauri build` directly,
# because it threads Apple signing env vars and validates Info.plist.
./scripts/build-mac.sh # build, sign, install to /Applications, reset TCC
INSTALL=0 ./scripts/build-mac.sh # build only
# Frontend-only TS check
npm run build # = tsc && vite build
# Rust type-check without full compile
cargo check --manifest-path src-tauri/Cargo.toml
# Preflight: verify toolchain
.\scripts\windows-preflight.ps1
# Build (requires Windows host or cross-compile target)
.\scripts\windows-build-gnu.ps1
Generated artifacts:
- `openless-all/app/src-tauri/target/release/bundle/macos/OpenLess.app`
- `openless-all/app/src-tauri/target/release/bundle/dmg/OpenLess_<version>_aarch64.dmg`
Logs: ~/Library/Logs/OpenLess/openless.log (macOS) / %LOCALAPPDATA%\OpenLess\Logs\openless.log (Windows).
There is no test runner wired in for the frontend. src/lib/providerSetup.test.ts is a hand-rolled assertion script — run with npx tsx src/lib/providerSetup.test.ts if you need it. Rust backend unit tests are run with cargo test --manifest-path src-tauri/Cargo.toml --lib; hardware / OS-integration behavior is still verified by running the app.
coordinator::Coordinator is the single owner of all session state — both the dictation phase machine (Idle → Starting → Listening → Processing → Inserting → Done) and the parallel QA phase machine (Idle → Recording → Processing). Hotkey edges drive both. Recorder, ASR, polish, insertion, selection capture, and history are wired here and nowhere else. Leaf modules never call across each other — they each depend only on types.rs.
The coordinator was split into a module: coordinator.rs is the public entry; coordinator/{dictation,qa,resources}.rs carry per-pipeline logic; coordinator_state.rs is the pure (no Tauri / audio / clipboard) state-transition layer that makes phase decisions unit-testable.
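To illustrate what a pure state-transition layer makes testable, here is a minimal sketch of a hotkey-edge decision for the dictation phases listed above. The phase names come from this document; `EdgeAction` and `on_hotkey_edge` are hypothetical names, and treating `Done` as restartable while dropping edges in the other phases is an assumption, not a description of `coordinator_state.rs`.

```rust
/// Dictation phases as named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DictationPhase {
    Idle,
    Starting,
    Listening,
    Processing,
    Inserting,
    Done,
}

/// Hypothetical: what the coordinator should do after a hotkey edge.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeAction {
    BeginSession,
    EndSession,
    Ignore,
}

/// Pure decision function: no Tauri, audio, or clipboard access,
/// so it can be unit-tested without running the app.
pub fn on_hotkey_edge(phase: DictationPhase) -> (DictationPhase, EdgeAction) {
    match phase {
        // Assumption: a finished session can be restarted directly.
        DictationPhase::Idle | DictationPhase::Done => {
            (DictationPhase::Starting, EdgeAction::BeginSession)
        }
        DictationPhase::Listening => (DictationPhase::Processing, EdgeAction::EndSession),
        // Assumption: edges that arrive mid-transition are dropped.
        other => (other, EdgeAction::Ignore),
    }
}
```

Because the function takes and returns plain values, a test can walk every phase without a microphone or a window.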
Rust (openless-all/app/src-tauri/src)                Purpose
───────────────────────────────────────────────────  ────────────────────────────────
types.rs                                             Pure value types: sessions, PolishMode, HotkeyBinding, errors, QaChatMessage
coordinator.rs                                       Public entry; owns Inner, hotkey wiring, capsule emits
coordinator/{dictation,qa,resources}.rs              Dictation pipeline / QA pipeline / shared helpers (begin/end/cancel)
coordinator_state.rs                                 Pure state transitions: Tauri-free, unit-testable
commands.rs + lib.rs + main.rs                       IPC surface (`invoke_handler!`), tray icon, window plumbing, entry
permissions.rs                                       TCC checks (Accessibility / Microphone / AppleEvents)
— Hotkeys (three parallel monitors) —
hotkey.rs                                            Modifier-only hotkey via native CGEventTap (macOS) / rdev (Win/Linux)
combo_hotkey.rs                                      Custom-combo dictation hotkey (when user picks combo over modifier-only)
qa_hotkey.rs                                         QA toggle hotkey (default Cmd/Ctrl+Shift+;) via `global-hotkey` crate
global_hotkey_runtime.rs                             Shared `global-hotkey` Carbon/Win event runtime (combo + QA share it)
shortcut_binding.rs                                  Shared parse/validate of user-configurable bindings
— Audio / ASR / LLM —
recorder.rs                                          Mic → 16 kHz mono Int16 PCM, RMS callback
audio_mute.rs                                        System-output mute guard while recording (RAII)
asr/{mod,frame,volcengine,whisper}.rs + asr/local/*  ASR providers: Volcengine streaming WS, Whisper HTTP, Bailian, local Foundry
polish.rs                                            OpenAI-compatible chat completions (Ark / DeepSeek / Codex OAuth reuse)
llm_gemini.rs                                        Native Google Gemini client, NOT OpenAI-compatible (separate auth, thinkingConfig, role:model)
correction.rs                                        User-defined correction rules (separate from vocab dictionary)
— Insertion (two paths) —
insertion.rs                                         AX focused-element write → clipboard + paste shortcut → copy-only fallback
windows_ime_{ipc,profile,protocol,session}.rs        Windows IME-side text injection over IPC (parallel insertion path; activates OpenLess TSF profile and submits text via named pipe)
selection.rs                                         Cross-platform selection capture for QA: macOS AX → Cmd/Ctrl+C simulate-copy → Linux PRIMARY (best-effort)
persistence.rs                                       history.json / preferences.json / dictionary.json + platform credential vault

Frontend (openless-all/app/src)
src/components/Capsule.tsx                           Capsule view + state enum
src/ (React)                                         Main window UI: Overview / History / Vocab / Style / Settings
src/i18n/                                            react-i18next init + zh-CN / en resources (zh-CN is source of truth)
src/pages/_atoms.tsx                                 Recoil atoms: global frontend state
src/state/HotkeySettingsContext.tsx                  HotkeySettings React context (capability + binding from backend)
hotkey edge (1st) → beginSession: Recorder.start → ASR.openSession → BufferingAudioConsumer.attach
hotkey edge (2nd) → endSession: Recorder.stop → ASR.sendLastFrame → awaitFinal → Polish → Insert → History.save
.cancelled → ASR.cancel, Recorder.stop, capsule .cancelled
Invariants:
- Polish/ASR fallbacks are silent. Missing Ark creds → insert raw transcript. Missing Volcengine creds → mock pipeline copies a placeholder. The contract is "the user's words don't get lost" — don't add hard errors here.
- `BufferingAudioConsumer` queues PCM until the WebSocket is ready, then drains. Recorder always pushes to it; ASR is attached after `openSession` resolves.
- Hotkey is toggle-only, not press-and-hold. The monitor yields one edge per modifier-key keydown; the coordinator interprets odd/even.
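The queue-then-drain behaviour described in the invariants can be sketched as follows. This is a minimal illustration, not the real `BufferingAudioConsumer`: the `PcmSink` trait, `BufferingConsumer`, and `CollectingSink` are hypothetical names, and the real consumer presumably deals with threading and the WebSocket, which this sketch leaves out.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for an open ASR session that accepts PCM frames.
pub trait PcmSink {
    fn send(&mut self, frame: Vec<i16>);
}

/// Queues frames until a sink is attached, then forwards them live.
pub struct BufferingConsumer<S: PcmSink> {
    pending: VecDeque<Vec<i16>>,
    sink: Option<S>,
}

impl<S: PcmSink> BufferingConsumer<S> {
    pub fn new() -> Self {
        Self { pending: VecDeque::new(), sink: None }
    }

    /// Called by the recorder for every frame, whether or not ASR is ready.
    pub fn push(&mut self, frame: Vec<i16>) {
        match self.sink.as_mut() {
            Some(sink) => sink.send(frame),
            None => self.pending.push_back(frame),
        }
    }

    /// Called once the session is open: drain the backlog in order,
    /// then keep the sink so later frames bypass the queue.
    pub fn attach(&mut self, mut sink: S) {
        while let Some(frame) = self.pending.pop_front() {
            sink.send(frame);
        }
        self.sink = Some(sink);
    }

    pub fn sink(&self) -> Option<&S> {
        self.sink.as_ref()
    }
}

/// Example sink that records what it receives (illustration only).
#[derive(Default)]
pub struct CollectingSink {
    pub frames: Vec<Vec<i16>>,
}

impl PcmSink for CollectingSink {
    fn send(&mut self, frame: Vec<i16>) {
        self.frames.push(frame);
    }
}
```

Frames pushed before `attach` arrive at the sink in their original order, ahead of anything pushed afterwards, so audio spoken while the session is still opening is not dropped.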
Parallel state machine, lives in coordinator/qa.rs + qa_hotkey.rs + selection.rs. Default trigger: Cmd+Shift+; (macOS) / Ctrl+Shift+; (Win/Linux).
QA hotkey edge → toggle panel:
    open  → capture front_app, clear messages, show QA window
    close → cancel session, hide window, sweep capsule
Option/dictation edge → routed by panel_visible flag (see below):
    while panel_visible & dictation Idle → handle_qa_option_edge:
        QaPhase::Idle       → begin_qa_session: capture_selection() → Recorder.start → ASR.openSession
        QaPhase::Recording  → end_qa_session: Recorder.stop → ASR final → LLM (with selection as context) → emit qa:state
        QaPhase::Processing → ignored (LLM in flight)
    otherwise → handle_pressed (normal dictation)
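The routing in the flow above reduces to a small pure function. The sketch below assumes the caller has already computed whether dictation is idle; `EdgeRoute` and `route_dictation_edge` are hypothetical names, not the coordinator's actual API.

```rust
/// QA phases as named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QaPhase {
    Idle,
    Recording,
    Processing,
}

/// Hypothetical: where a dictation-hotkey edge ends up.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeRoute {
    BeginQa,
    EndQa,
    Ignored,
    Dictation,
}

/// Routes one dictation-hotkey edge. QA only receives the edge when its
/// panel is visible AND dictation is idle, so the two pipelines never
/// compete for the microphone.
pub fn route_dictation_edge(panel_visible: bool, dictation_idle: bool, qa: QaPhase) -> EdgeRoute {
    if panel_visible && dictation_idle {
        match qa {
            QaPhase::Idle => EdgeRoute::BeginQa,
            QaPhase::Recording => EdgeRoute::EndQa,
            // LLM request in flight: drop the edge.
            QaPhase::Processing => EdgeRoute::Ignored,
        }
    } else {
        EdgeRoute::Dictation
    }
}
```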
Invariants & gotchas:
- Hotkey routing. When the QA panel is visible, the dictation hotkey edge routes to QA, unless a dictation session is already mid-flight (`Starting`/`Listening`/`Processing`/`Inserting`), in which case the edge stays with dictation. Otherwise QA's `begin_qa_session` would race for the same mic device (cpal rejects the second `build_input_stream` on macOS/Win, PipeWire opens two streams on Linux; neither is recoverable from the QA panel UI). See audit 3.3.1 in `coordinator/dictation.rs`.
- Capsule sweep on panel open. Open emits a fresh `CapsuleState::Idle` only if dictation is Idle. If dictation is Recording/Polishing/Inserting/Done, the sweep is suppressed so the user's in-flight feedback isn't wiped. See audit 3.3.4.
- Selection capture is a 3-tier fallback (`selection.rs`): (1) macOS AX `kAXSelectedTextAttribute` direct read, no clipboard touched; (2) macOS/Windows simulate Cmd/Ctrl+C → snapshot + restore original clipboard, 80 ms read window; (3) Linux PRIMARY via `wl-paste` / `xclip` / `xsel`, best-effort. Returns `None` when the user genuinely selected nothing.
- Selection truncation. Hard cap 4000 chars; over → keep first 2000 + `[…truncated…]` + last 2000. Don't raise this without checking LLM context budgeting; Gemini and Ark have different limits.
- Multi-turn memory. `QaSessionState.messages` accumulates `user` → `assistant` pairs across turns within a single panel session; closing the panel clears them.
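The selection-truncation rule is easy to get wrong at the boundary, so here is a minimal sketch of it. It assumes "chars" means Unicode scalar values (Rust `char`s) rather than bytes; the function and constant names are hypothetical and the real logic in `selection.rs` may differ.

```rust
/// Hard cap from this document.
const MAX_CHARS: usize = 4000;
/// How much is kept from each end once the cap is exceeded.
const KEEP_EACH_SIDE: usize = 2000;
const MARKER: &str = "[…truncated…]";

/// Returns the text unchanged when it fits under the cap; otherwise
/// keeps the first and last 2000 chars with a marker in between.
/// Assumption: length is counted in `char`s, not bytes.
pub fn truncate_selection(text: &str) -> String {
    let total = text.chars().count();
    if total <= MAX_CHARS {
        return text.to_string();
    }
    let head: String = text.chars().take(KEEP_EACH_SIDE).collect();
    let tail: String = text.chars().skip(total - KEEP_EACH_SIDE).collect();
    format!("{}{}{}", head, MARKER, tail)
}
```

Counting by `char` keeps multi-byte text such as Chinese from being cut in the middle of a code point, which byte slicing would risk.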
insertion.rs is the cross-platform default. On Windows there is a second insertion path in windows_ime_{ipc,profile,protocol,session}.rs that activates a TSF profile (CLSID + GUID baked in windows_ime_profile.rs) and submits text over a named-pipe IPC. The coordinator picks one based on user preference / fallback status; both routes return the same InsertStatus (Inserted / CopiedFallback). When changing insertion behavior, decide which path you're touching — they don't share code.
- Bundle ID `com.openless.app` is hard-coded in `openless-all/app/src-tauri/tauri.conf.json` and `CredentialsVault.serviceName`. Changing it breaks system credential vault lookups and every existing TCC grant.
- TCC: Microphone + Accessibility + AppleEvents. `NSMicrophoneUsageDescription` / `NSAccessibilityUsageDescription` / `NSAppleEventsUsageDescription` live in `openless-all/app/src-tauri/Info.plist`. After a fresh build that resets TCC, the app must be fully quit and relaunched after granting Accessibility before the global hotkey tap installs.
- Credentials live in the OS credential vault (macOS Keychain, Windows Credential Manager, Linux keyring) under service `com.openless.app`. The legacy plaintext JSON (`~/.openless/credentials.json` on macOS/Linux, `%APPDATA%\OpenLess\credentials.json` on Windows) is only a migration source and is removed after a successful vault write. Never hard-code keys or include legacy credential files in logs, exports, build artifacts, or bug reports.
- Per-user data:
  - macOS: `~/Library/Application Support/OpenLess/{history.json, preferences.json, dictionary.json}`, capped at 200 history entries. Do not rename `dictionary.json` to `vocab.json` (drops user data).
  - Windows: `%APPDATA%\OpenLess\`
  - Linux: `$XDG_DATA_HOME/OpenLess`
Push a v*-tauri tag → .github/workflows/release-tauri.yml builds macOS arm64 .dmg and Windows x64 .msi. macOS Developer ID signing + notarization runs only when APPLE_CERTIFICATE / APPLE_CERTIFICATE_PASSWORD / APPLE_ID / APPLE_PASSWORD / APPLE_TEAM_ID secrets are set; otherwise it falls back to ad-hoc signing with a CI warning.
When bumping versions, update every version field together: openless-all/app/package.json, openless-all/app/src-tauri/tauri.conf.json, and Cargo.toml.
Two-channel branching. Branch name = release channel.
- `beta`: Beta channel (开发版). Default branch, integration buffer. All PRs target `beta` (never `main`). Beta builds may exist but are not pushed to general users; only opt-in users on the Beta channel see them.
- `main`: Stable channel (正式版). Always-releasable. Updated only by `beta → main` merges performed by maintainers after a two-platform smoke build. Release tags `v<version>-tauri` are pushed on `main` and trigger `release-tauri.yml` (tag-driven; unaffected by branch renames).
Per-PR contract:
- Run the change locally on your target platform before opening the PR (build green + manual verification of the affected feature).
- `pr-agent.yml` runs one AI review pass per PR; treat it as advisory, do not iterate on it.
- Keep AI rework rounds tight (1–2). If a fix resists, escalate to a human or restart with fresh context.
- `ci.yml` runs on push/PR for both `main` and `beta`; no extra wiring needed when adding new branches off `beta`.
For maintainers:
- Merge `beta → main` only after the two-platform (macOS + Windows) smoke build passes. Beta work must not leak to Stable; that gate exists for a reason.
- Tag `v<version>-tauri` on `main`, not on `beta`. The release workflow keys off the tag, but tagging on `main` keeps the release commit linear with the always-releasable line.
- Avoid direct pushes to `main` outside the `beta → main` merge; it bypasses the smoke-test gate.
Channel distribution (manual-download opt-in):
- Tag convention. `v<v>-tauri` → Stable release (GitHub `prerelease=false`, manifest `latest-{tgt}-{arch}.json`). `v<v>-beta-tauri` → Beta release (GitHub `prerelease=true`, manifest `latest-{tgt}-{arch}-beta.json`). The two manifest filenames never overlap, so the in-app updater endpoint (which is fixed at compile time to the no-suffix file) cannot pick up Beta releases. This is the physical isolation that guarantees Beta does not leak to Stable users.
- Why not auto-update for Beta. `tauri-plugin-updater` 2.10's `Builder` does not expose `endpoints()`; endpoints are only readable from `tauri.conf.json` at build time and cannot be swapped at runtime. Rather than fork the plugin or write a custom updater (~500 lines, high risk), Beta opt-in is implemented as a manual-download flow: Settings → About has a "Join Beta channel" toggle that, when on, calls `fetch_latest_beta_release` (GitHub Releases API), shows the latest pre-release tag, and routes the user to the GitHub release page to download manually. No installer signing/install path needs to be re-implemented.
- Where the wiring lives. Pref field: `UserPreferences::update_channel` (`types.rs`). IPC: `get_update_channel` / `set_update_channel` / `fetch_latest_beta_release` (`commands.rs`). UI: `BetaChannelControl` inside `AboutMini` (`SettingsModal.tsx`). i18n: `settings.about.betaChannel*` keys.
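The tag convention can be expressed as a mapping from tag name to channel to manifest filename. The sketch below is illustrative only: the real detection lives in the release workflow, not in Rust, and `Channel`, `channel_for_tag`, and `manifest_name` are hypothetical names. It shows why the suffix check order matters, since every Beta tag also ends in `-tauri`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    Stable,
    Beta,
}

/// Classifies a release tag. The Beta suffix must be tested first,
/// because `v<v>-beta-tauri` also ends with `-tauri`.
pub fn channel_for_tag(tag: &str) -> Option<Channel> {
    if !tag.starts_with('v') {
        return None;
    }
    if tag.ends_with("-beta-tauri") {
        Some(Channel::Beta)
    } else if tag.ends_with("-tauri") {
        Some(Channel::Stable)
    } else {
        None
    }
}

/// Builds the updater manifest filename for a channel. Stable and Beta
/// names never collide, which is what keeps Beta away from Stable users.
pub fn manifest_name(channel: Channel, target: &str, arch: &str) -> String {
    match channel {
        Channel::Stable => format!("latest-{}-{}.json", target, arch),
        Channel::Beta => format!("latest-{}-{}-beta.json", target, arch),
    }
}
```

For example, an illustrative tag `v1.2.3-beta-tauri` classifies as Beta and yields `latest-darwin-aarch64-beta.json` for the `darwin` / `aarch64` pair.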
Run after pushing either a v*-tauri or v*-beta-tauri tag, before announcing the release:
- GitHub Release page matches expectation:
  - Stable tag: not marked `Pre-release`, and is the target of the `releases/latest` redirect.
  - Beta tag: marked `Pre-release`, not the target of `releases/latest`.
- Release assets are channel-correct:
  - Stable tag includes `latest-{darwin,windows,linux}-{aarch64,x86_64}.json` + their `-mirror.json` siblings, without `-beta` suffix.
  - Beta tag includes `latest-{tgt}-{arch}-beta.json` + `-beta-mirror.json`, without the no-suffix variant.
- Stable user flow. Install a Stable build, click `Settings → About → Check for updates`. After a Stable release: should offer the new version. After a Beta release only: should report "up to date" (Beta must not appear).
- Beta user flow. In the same Stable build, toggle on `Join Beta channel`. The latest Beta tag should appear (or "no Beta released yet"). Clicking the download button should open the corresponding GitHub release page.
- Updater endpoint sanity. `curl -fsSL https://github.com/appergb/openless/releases/latest/download/latest-darwin-aarch64.json` returns the Stable manifest (version field matches the latest Stable tag). It should never return a Beta version, regardless of which tag was pushed most recently.
If any step fails, do not announce the release; investigate release-tauri.yml channel detection (endsWith(github.ref_name, '-beta-tauri')) and the OPENLESS_RELEASE_CHANNEL env propagation in the run logs.
- Comments, log messages, user-facing strings, and most docs are in Simplified Chinese. UI strings additionally route through `react-i18next` (`src/i18n/{zh-CN,en}.ts`) so we ship English alongside; `zh-CN.ts` is source of truth.
- macOS hotkey monitor must use native `CGEventTap`, never `rdev`. `rdev` synchronously calls `TSMGetInputSourceProperty` from non-main threads, which macOS 14+ aborts via `dispatch_assert_queue_fail` → SIGTRAP. macOS uses CGEventTap; `rdev` is only used on Linux/Windows.
- Don't `NSApp.activate` on the dictation path; it steals focus and breaks insertion. Only call `set_activation_policy(Regular)` + `activateIgnoringOtherApps` from `show_main_window` / mic-permission prompts, never from `start_dictation`.
- Rust modules wrap shared mutable state with `Arc<Mutex<...>>` (`parking_lot`). Keep that locking discipline when adding fields.
- Rust modules depend only on `types.rs`. New cross-module wiring goes in `coordinator.rs`, not in the leaf modules.
- Add a `<name>.rs` (or directory) under `openless-all/app/src-tauri/src/`, importing only from `types`.
- Register it in `lib.rs` (`mod <name>;`).
- Wire it into `coordinator.rs` and expose any frontend-callable surface via `commands.rs` + `invoke_handler!`.
- Add the matching TS wrapper in `openless-all/app/src/lib/ipc.ts` (with a mock branch for browser dev).
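As a rough picture of what a new leaf module might look like under the conventions above, here is a minimal sketch. `ExampleRules` and its methods are hypothetical and do not exist in the repository. To stay dependency-free the sketch uses `std::sync::Mutex`, whereas the codebase uses `parking_lot`, whose `lock()` returns the guard directly instead of a `Result`.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical module state: literal find/replace rules.
#[derive(Default)]
struct Inner {
    rules: Vec<(String, String)>,
}

/// Cheap-to-clone handle; every clone shares the same `Inner`.
/// A leaf module like this would import only from `types`, and the
/// coordinator would own the handle and do all cross-module wiring.
#[derive(Clone, Default)]
pub struct ExampleRules {
    inner: Arc<Mutex<Inner>>,
}

impl ExampleRules {
    pub fn add_rule(&self, from: &str, to: &str) {
        // Hold the lock only for the duration of the push.
        let mut guard = self.inner.lock().expect("rules mutex poisoned");
        guard.rules.push((from.to_string(), to.to_string()));
    }

    /// Applies every rule in insertion order.
    pub fn apply(&self, text: &str) -> String {
        let guard = self.inner.lock().expect("rules mutex poisoned");
        guard
            .rules
            .iter()
            .fold(text.to_string(), |acc, (from, to)| acc.replace(from.as_str(), to.as_str()))
    }
}
```

A rule added through one clone of the handle is visible through every other clone, which is the property the coordinator relies on when it hands the same module to several pipelines.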
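The full rules are in the `AGENTS.md` section "Third-party service integrations & library / platform API research" (lines 171-191). What follows is the concrete tool mapping that applies once Claude Code is working in this repository.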
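- Third-party HTTP APIs (ASR vendors / LLM endpoints / GitHub API / Tauri plugin services, etc.)
- Unfamiliar Rust crates / npm packages, when even the signatures and feature flags are uncertain
- Platform APIs: Apple Security framework / CoreFoundation / Win32 / Carbon / AppKit
- What exactly the version pinned by the repo lock files supports; training memory and the actual versions in `Cargo.lock` / `package-lock.json` may disagree
- Any interface that has changed since the training cutoff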
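- The repo already contains a working call → find a reference with `rg` / `grep` (the repo is the documentation)
- General programming / algorithms / language features you can derive yourself
- Single-file surgical changes where the API at the change point already has a usage example
- Looking up existing modules in this repo (`types.rs` / `coordinator.rs`, etc.): just Read them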
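1. Context7 MCP (highest priority: broad coverage of mainstream libraries, version-aware)
   - mcp__context7__resolve-library-id → get the library id
   - mcp__context7__query-docs → official documentation snippets for the current version
2. documentation-lookup skill
   /skill documentation-lookup: wraps Context7, with routing + caching.
3. Agent subagent (subagent_type=general-purpose)
   Use when Context7 has no coverage (niche crate / new SDK / non-English docs),
   or when several sources must be cross-checked (official docs + GitHub README + Stack Overflow).
   The subagent combines WebFetch / WebSearch / Context7 and returns a structured summary of 200-400 words.
4. Single-point fallback: WebFetch one documentation page directly (when only the single most authoritative page is needed)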
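- Target question: one sentence stating the specific technical problem to solve (not an empty target such as "get familiar with X")
- Current repo state: the currently locked version (pull it from `Cargo.lock` / `package-lock.json`) + existing call sites as `file:line` (if any)
- Required return structure: function/endpoint signature → minimal runnable example (≤20 lines) → version compatibility range (the core check against training memory) → known pitfalls / platform differences / deprecation plans
- Prohibitions: do not modify this repo's code; do not paste documentation verbatim (distill the key parts so the context does not blow up); dispatch separate agents for independent services, one agent per service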
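- ✗ Writing third-party API calls from training memory and assuming the parameter signatures are as remembered
- ✗ Pasting whole sections of official documentation into the main context
- ✗ Writing code first and checking the documentation afterwards
- ✗ Having a single subagent research 5 unrelated libraries at once (each gets its own prompt + its own context)
- ✗ Skipping cross-verification after the subagent returns and going straight to code; step 4 of AGENTS.md requires at least one `WebFetch` directly against the official source to verify one key fact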