A native macOS speech-to-text app powered by WhisperKit. Press a hotkey, speak, and the transcription is typed directly into any text field.
- macOS 14.0+
- Apple Silicon (recommended) or Intel Mac
- Microphone permission
- Accessibility permission (for global hotkey and auto-type)
```bash
xcodebuild build -scheme SimpleWhisper -configuration Debug \
  -project SimpleWhisper/SimpleWhisper.xcodeproj
```

Or open in Xcode:

```bash
open SimpleWhisper/SimpleWhisper.xcodeproj
```

No external dependency managers needed — WhisperKit is included via Swift Package Manager.
- Launch the app — The Settings window opens. Go to the Model tab and download a Whisper model (Base is recommended for most users).
- Grant permissions — Allow Microphone and Accessibility access when prompted.
- Start transcribing — Hold `Fn+Control` (or your custom hotkey), speak, then release. The transcription is typed at your cursor.
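The hold-to-record interaction above can be sketched with an AppKit global event monitor. This is an illustrative sketch, not the app's actual code: the class name and callbacks are made up, and it assumes Accessibility permission has been granted, which global monitors require.

```swift
import AppKit

// Sketch of hold-to-record hotkey handling. Watches modifier-key changes
// globally and fires begin/end callbacks while Fn+Control is held.
final class HotkeyMonitor {
    private var monitor: Any?
    private var isRecording = false

    func start(onBegin: @escaping () -> Void, onEnd: @escaping () -> Void) {
        // .flagsChanged fires whenever a modifier key (Fn, Control, …) goes up or down.
        monitor = NSEvent.addGlobalMonitorForEvents(matching: .flagsChanged) { [weak self] event in
            guard let self else { return }
            let held = event.modifierFlags.contains(.function)
                && event.modifierFlags.contains(.control)
            if held && !self.isRecording {
                self.isRecording = true
                onBegin()   // start capturing audio
            } else if !held && self.isRecording {
                self.isRecording = false
                onEnd()     // stop capture and hand audio to the transcriber
            }
        }
    }

    func stop() {
        if let monitor { NSEvent.removeMonitor(monitor) }
    }
}
```

Because the monitor only observes events (it cannot consume them), holding the hotkey never blocks typing in other apps.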
Enable AI Enhance in settings to clean up transcriptions with an LLM. Requires an OpenAI or Anthropic API key.
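A cleanup request of this kind can be sketched as follows. The JSON shape matches the standard OpenAI chat-completions API, but the function name, prompt wording, and model choice are illustrative assumptions, not the app's actual implementation.

```swift
import Foundation

// Sketch: build the request body for an LLM cleanup pass over a transcript.
// The system prompt and default model here are placeholders.
func makeEnhanceRequestBody(transcript: String, model: String = "gpt-4o-mini") throws -> Data {
    let payload: [String: Any] = [
        "model": model,
        "messages": [
            ["role": "system",
             "content": "Clean up this speech transcription: fix punctuation and obvious mis-hearings, but do not change the meaning."],
            ["role": "user", "content": transcript],
        ],
    ]
    return try JSONSerialization.data(withJSONObject: payload)
}
```

The body would then be POSTed to the provider's endpoint with the user's API key in the `Authorization` header.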
```
SimpleWhisper/
├── Models/      # AppState (Observable), enums, localization strings
├── Services/    # Audio recording, Whisper inference, LLM, hotkey, config
├── Views/       # SwiftUI settings, popover states, floating pill
└── Theme/       # Design tokens (colors, spacing, sizing)
```
Key design decisions:
- SwiftUI + AppKit hybrid — SwiftUI for views, `NSPanel` for the always-on-top floating pill.
- Observable singleton — `AppState` drives the entire UI via Swift Observation.
- Actor-isolated inference — `WhisperService` runs transcription off the main thread.
- No external assets — Sound effects are synthesized in memory; icons use SF Symbols.
- Persistent config — Settings and history stored at `~/.simple-whisper/`.
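The actor-isolation decision can be sketched in the spirit of `WhisperService`: an actor owns the expensive model, so all inference calls are serialized and run off the main thread. The names and stubbed-out inference below are illustrative, not the real WhisperKit calls.

```swift
import Foundation

// Sketch of actor-isolated inference: callers `await` onto the actor,
// so the UI thread never blocks while the model is busy.
actor TranscriptionService {
    private var loadedModel: String?

    func loadModel(_ name: String) {
        // In the real app this would load WhisperKit model weights.
        loadedModel = name
    }

    func transcribe(_ samples: [Float]) -> String {
        guard loadedModel != nil else { return "" }
        // Placeholder for actual Whisper inference over the audio samples.
        return "transcribed \(samples.count) samples"
    }
}
```

Usage is simply `let text = await service.transcribe(buffer)`; the actor guarantees that two transcriptions never race on the shared model state.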
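The persistent-config layout can be sketched as a small JSON store. Only the `~/.simple-whisper/` location comes from this README; the struct fields, file name, and type names below are illustrative assumptions.

```swift
import Foundation

// Sketch of a JSON settings store under a dot-directory. Fields are placeholders.
struct Settings: Codable, Equatable {
    var hotkey: String
    var aiEnhanceEnabled: Bool
}

struct ConfigStore {
    let directory: URL   // normally ~/.simple-whisper/, injectable for tests

    var settingsFile: URL { directory.appendingPathComponent("settings.json") }

    func save(_ settings: Settings) throws {
        try FileManager.default.createDirectory(at: directory,
                                                withIntermediateDirectories: true)
        try JSONEncoder().encode(settings).write(to: settingsFile)
    }

    func load() throws -> Settings {
        try JSONDecoder().decode(Settings.self,
                                 from: Data(contentsOf: settingsFile))
    }
}
```

Keeping the directory injectable makes the store trivially testable against a temporary path instead of the user's real home directory.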
MIT