 - **Global push-to-talk** — press a hotkey anywhere (default `F9`), speak, release. Transcript is pasted at your cursor.
 - **Voice activation (optional)** — hands-free mode powered by [Silero VAD](https://github.com/snakers4/silero-vad). Flip it on in Settings and Echo auto-transcribes each utterance as you speak. Ignores non-speech noise.
-- **Fully local transcription** — all audio and transcription stays on-device via [whisper.cpp](https://github.com/ggml-org/whisper.cpp). Your voice never leaves your machine.
-- **Automatic translation (optional)** — point Echo at a target language and it will translate each transcript via [DeepL](https://www.deepl.com/) before pasting. Skipped automatically when you're already speaking the target language. Audio still stays local; only the transcript text is sent.
-- **Low-latency transcription** — Whisper runs as a persistent `whisper-server` process with the model kept resident in memory, so every utterance skips cold-load overhead.
-- **GPU acceleration** — CUDA on Windows (NVIDIA), Metal on Apple Silicon, CPU fallback everywhere.
-- **Five model sizes** — from 75 MB (`tiny`) to 1.6 GB (`large-v3-turbo`). Pick the accuracy/speed tradeoff you want.
+- **Fully local transcription** — Whisper runs in-process via [ONNX Runtime](https://onnxruntime.ai/) + [Transformers.js](https://huggingface.co/docs/transformers.js). Audio never leaves your machine.
+- **Automatic translation (optional)** — point Echo at a target language and it will translate each transcript via [DeepL](https://www.deepl.com/) before pasting. Audio stays local; only the transcript text is sent.
+- **Model kept resident** — loaded once at startup into the main process, reused for every utterance — no cold-load per transcription.
+- **Five model sizes** — from `tiny` to `large-v3-turbo`. Pick the accuracy/speed tradeoff you want.
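
The "model kept resident" item in this hunk amounts to loading once and sharing the result across utterances. A minimal TypeScript sketch of that pattern follows. `createResidentModel` and the injected `load` callback are hypothetical names used for illustration, not Echo's actual code. In Echo the loader would presumably build the Whisper pipeline, but that wiring is not shown in this diff, so the sketch accepts any async loader.

```typescript
// Hypothetical helper (not from Echo's source): wraps an expensive async
// loader so it runs at most once, and every caller shares the same promise.
function createResidentModel<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;

  return () => {
    // Reuse the in-flight or completed load if there is one.
    const current =
      pending ??
      load().catch((err: unknown) => {
        // Forget a failed load so a later call can retry.
        pending = undefined;
        throw err;
      });
    pending = current;
    return current;
  };
}
```

Caching the promise instead of the resolved model means that utterances arriving while the first load is still in flight share that load instead of starting a second one.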
 > **macOS note:** the build is unsigned (no paid Apple Developer ID). After dragging **Echo** to Applications, right-click the app → **Open** on first launch to bypass Gatekeeper.
 >
-> If macOS still refuses with *"Echo is damaged and can't be opened"*, it's the quarantine flag from the download. Clear it with:
+> If macOS still refuses with *"Echo is damaged and can't be opened"*, clear the quarantine flag:
 >
 > ```bash
 > xattr -cr /Applications/Echo.app && open /Applications/Echo.app
 > ```

-On first launch, Echo will auto-download the selected whisper model (`base` by default, ~142 MB) with a progress indicator. If you have an NVIDIA GPU and select the **CUDA** backend, it will also download the CUDA-enabled binary.
+On first launch, Echo will auto-download the selected Whisper model (`base` by default) into `~/.../Echo/whisper-models` with a progress indicator.

 ## Usage
@@ -77,8 +76,7 @@ Open settings from the tray icon. Everything is persisted to `electron-store` in
 | Push-to-talk hotkey | `F9` | Any key or modifier combo |
-- **[whisper.cpp](https://github.com/ggml-org/whisper.cpp)** — transcription engine, run as a long-lived `whisper-server` subprocess with the model kept resident in RAM
+- **[@huggingface/transformers](https://huggingface.co/docs/transformers.js)** + **[onnxruntime-node](https://onnxruntime.ai/)** — Whisper ASR runs in-process using ONNX models from [Xenova/whisper-*](https://huggingface.co/Xenova)
 - **[Silero VAD](https://github.com/snakers4/silero-vad)** via [@ricky0123/vad-web](https://github.com/ricky0123/vad) — neural voice activity detection for the voice-activation mode
 - **[DeepL API](https://www.deepl.com/pro-api)** — optional cloud translation layer between Whisper and paste
 - **[koffi](https://koffi.dev/)** — FFI for global key polling on Windows
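
Push-to-talk on top of key polling reduces to edge detection: recording starts when the polled key state goes from up to down and stops on the reverse transition. The sketch below assumes a poller (for example one built on koffi, not shown here) that reports one boolean per tick. `PttEvent` and `createPushToTalkTracker` are hypothetical names, not taken from Echo's source.

```typescript
// Event emitted for a single polling tick; null means "no change".
type PttEvent = "start" | "stop" | null;

// Hypothetical tracker: converts a stream of polled key-down samples into
// start/stop events by comparing each sample with the previous one.
function createPushToTalkTracker(): (isDown: boolean) => PttEvent {
  let wasDown = false;

  return (isDown: boolean): PttEvent => {
    let event: PttEvent = null;
    if (isDown && !wasDown) {
      event = "start"; // key just went down: begin capturing audio
    } else if (!isDown && wasDown) {
      event = "stop"; // key just came up: end capture and transcribe
    }
    wasDown = isDown;
    return event;
  };
}
```

Holding the key produces a single `"start"` however many ticks report it as down, so the capture logic is not restarted on every poll.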