Real-time voice-to-text dictation for Linux using OpenAI Whisper and GPU acceleration.
Press Ctrl+M anywhere to start recording, speak, and have your words automatically transcribed and pasted where your cursor is. Perfect for coding, writing, email, and any text input.
- 🎯 Global Hotkey: Press Ctrl+M to record from any application
- 🌍 Auto Language Detection: Automatically detects and transcribes any language
- ⚡ GPU Accelerated: Uses whisper.cpp with CUDA for RTX GPUs (or CPU fallback)
- 🔄 Auto-paste: Transcribed text automatically pastes at cursor position
- 📍 System Tray: Visual status indicator (idle/recording/transcribing)
- 📜 History: Access your last 10 transcriptions from the tray menu
- 🚀 Boot on Startup: Automatically starts with your system
Idle (Blue) → Press Ctrl+M → Recording (Red Mic) → Press Ctrl+M → Transcribing (Orange) → Text auto-pastes!
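The flow above is a small three-state machine. Here is a minimal sketch of it, not the code from whisper-dictation.py: the `State` and `Dictation` names are made up for illustration, and ignoring the hotkey while a transcription is running is an assumption.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"                  # blue icon
    RECORDING = "recording"        # red microphone icon
    TRANSCRIBING = "transcribing"  # orange icon


class Dictation:
    """Tracks which of the three tray states the app is in (hypothetical helper)."""

    def __init__(self):
        self.state = State.IDLE

    def on_hotkey(self):
        """Ctrl+M starts or stops recording; assumed to be ignored while transcribing."""
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.TRANSCRIBING
        return self.state

    def on_transcription_done(self):
        """Called after the text has been pasted; returns to idle."""
        if self.state is State.TRANSCRIBING:
            self.state = State.IDLE
        return self.state
```

Keeping the state in one place makes it easy to drive both the tray icon color and the hotkey behavior from the same value.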
- OS: Linux (Ubuntu 20.04+, Debian, Fedora, Arch)
- Python: 3.8 or higher
- GPU (optional): NVIDIA GPU with CUDA support for faster transcription
- RAM: 4GB minimum, 8GB recommended
- Disk: ~2GB for models and dependencies
git clone https://github.com/NicolasHuberty/whisper-dictation.git && cd whisper-dictation && chmod +x install.sh && ./install.sh
That's it! The installer will:
- ✅ Install system dependencies (build tools, CUDA if available)
- ✅ Clone and build whisper.cpp from source
- ✅ Download the Whisper model (default: base, ~150MB)
- ✅ Install Python dependencies
- ✅ Set up autostart on boot
- ✅ Launch the application
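To give an idea of how the GPU check and build step could work, here is a rough sketch. It is not taken from install.sh: the function names are hypothetical, and treating the presence of `nvidia-smi` on the PATH as "NVIDIA GPU available" is an assumption. The two build commands are the ones listed in the manual installation steps.

```python
import shutil


def has_nvidia_gpu():
    """Assumption: a visible nvidia-smi binary means an NVIDIA driver is installed."""
    return shutil.which("nvidia-smi") is not None


def build_commands(use_cuda):
    """Return the shell commands to run inside the whisper.cpp directory."""
    build = "WHISPER_CUDA=1 make" if use_cuda else "make"
    return ["make clean", build]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for command in build_commands(has_nvidia_gpu()):
        print(command)
```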
After installation, the app runs automatically with a blue circle icon in your system tray.
- Start Recording: Press Ctrl+M. The icon turns red (microphone)
- Speak your text
- Stop Recording: Press Ctrl+M again. The icon turns orange (processing)
- Auto-paste: Text appears at your cursor automatically!
Right-click the tray icon to:
- View current status
- See transcription history (last 10)
- Quit the application
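A "last 10" history like the one in the tray menu can be kept with a bounded deque. This is only a sketch under assumptions: the `History` class is hypothetical, and skipping empty transcriptions is a design choice made here, not necessarily what the app does.

```python
from collections import deque

HISTORY_SIZE = 10  # the tray menu shows the last 10 transcriptions


class History:
    """Keeps the most recent transcriptions and drops the oldest automatically."""

    def __init__(self, size=HISTORY_SIZE):
        self._items = deque(maxlen=size)

    def add(self, text):
        text = text.strip()
        if text:  # assumption: blank results are not worth keeping
            self._items.append(text)

    def newest_first(self):
        """Return entries in the order a menu would list them."""
        return list(reversed(self._items))
```

Because `deque(maxlen=...)` discards old entries on its own, no manual trimming is needed when the eleventh transcription arrives.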
cd whisper-dictation
python3 whisper-dictation.py
Edit whisper-dictation.py to customize:
# Line ~245
hotkeys = keyboard.GlobalHotKeys({
    '<ctrl>+m': toggle_recording  # Change to '<f9>' or '<ctrl>+<alt>+v'
})
# Line ~32 - Available models:
# tiny (~75MB, fastest, least accurate)
# base (~150MB, default, fast and decent accuracy)
# small (~500MB, balanced)
# medium (~1.5GB, better accuracy)
# large-v3 (~3GB, best accuracy, slower)
WHISPER_MODEL = "path/to/whisper.cpp/models/ggml-base.bin"
The app auto-detects USB microphones. If you have multiple mics, edit line ~20:
mic_device = usb_mics[0] # Change index to select different mic
cd whisper.cpp/models
./download-ggml-model.sh medium # Download medium model
The installer automatically detects NVIDIA GPUs and builds with CUDA support. For AMD GPUs or CPU-only:
# CPU-only build
cd whisper.cpp
make clean
make
# Install dependencies
sudo apt update
sudo apt install -y python3-pip git build-essential portaudio19-dev
# Clone whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
# Build with GPU support (NVIDIA)
make clean
WHISPER_CUDA=1 make
# Download model
bash ./models/download-ggml-model.sh base
# Install Python packages
pip3 install -r ../requirements.txt
# Run
cd ..
python3 whisper-dictation.py
Problem: Text is copied to clipboard but not pasted.
Solutions:
- Install xdotool: sudo apt install xdotool
- Check permissions: The app needs permission to simulate keyboard input
- Try a different hotkey (some apps block Ctrl+M)
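If you want to check the first two points quickly, a small script along these lines may help. It is a sketch, not part of the project: `paste_diagnostics` is a hypothetical helper, and the Wayland check rests on the assumption that X11 tools such as xdotool often cannot send key presses to native Wayland windows.

```python
import os
import shutil


def paste_diagnostics(env=None):
    """Return a list of likely reasons why simulated pasting fails."""
    env = os.environ if env is None else env
    problems = []
    if shutil.which("xdotool") is None:
        problems.append("xdotool is not installed (sudo apt install xdotool)")
    if env.get("XDG_SESSION_TYPE") == "wayland":
        problems.append("Wayland session: simulated key presses are often blocked")
    return problems


if __name__ == "__main__":
    for problem in paste_diagnostics() or ["no obvious problem found"]:
        print(problem)
```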
Problem: "No USB microphone detected" error.
Solutions:
# List audio devices
python3 -c "import sounddevice as sd; print(sd.query_devices())"
# If your mic isn't USB, edit line ~15 to detect all input devices:
mic_device = [i for i, d in enumerate(devices) if d['max_input_channels'] > 0][0]
Problem: Ctrl+M doesn't work in browsers (Chrome, Firefox).
Solutions:
- Browsers can claim Ctrl+M for their own shortcuts (Firefox, for example, uses it to mute the current tab)
- Change hotkey to F9, Ctrl+Alt+M, or another combination (see Configuration above)
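The hotkey in whisper-dictation.py is written in the angle-bracket form used by `keyboard.GlobalHotKeys` (for example '<ctrl>+m' or '<f9>'). If you find that form easy to mistype, a tiny converter like this one can produce it from the usual "Ctrl+Alt+M" spelling. The function name `to_hotkey_string` is hypothetical, and the rule "single characters stay bare, named keys get brackets" is inferred from the examples in this README.

```python
def to_hotkey_string(combo):
    """Convert a combination such as 'Ctrl+Alt+M' into '<ctrl>+<alt>+m'."""
    parts = [part.strip().lower() for part in combo.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"invalid hotkey: {combo!r}")
    # Single characters stay bare; named keys (ctrl, alt, f9, ...) get angle brackets.
    return "+".join(p if len(p) == 1 else f"<{p}>" for p in parts)


if __name__ == "__main__":
    for combo in ("Ctrl+M", "F9", "Ctrl+Alt+V"):
        print(combo, "->", to_hotkey_string(combo))
```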
Problem: Transcription takes >5 seconds.
Solutions:
- Use GPU: Make sure you are running a CUDA build, and run nvidia-smi to verify that the GPU is detected
- Smaller model: Switch to the tiny or base model for faster transcription
- Check CPU: Close heavy applications during transcription
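When choosing a smaller model, the approximate sizes from the Configuration section can serve as a rough proxy for speed. The sketch below picks the largest model that fits a size budget. `pick_model` and `model_filename` are hypothetical helpers, the sizes are the approximate figures listed in this README, and the ggml-<name>.bin pattern is inferred from the default ggml-base.bin path.

```python
# Approximate download sizes in MB, as listed in the Configuration section.
MODEL_SIZES_MB = {
    "tiny": 75,
    "base": 150,
    "small": 500,
    "medium": 1500,
    "large-v3": 3000,
}


def pick_model(max_size_mb):
    """Return the largest (most accurate) model that fits within max_size_mb."""
    fitting = [name for name, size in MODEL_SIZES_MB.items() if size <= max_size_mb]
    if not fitting:
        raise ValueError(f"no model fits in {max_size_mb} MB")
    return max(fitting, key=MODEL_SIZES_MB.get)


def model_filename(name):
    """Assumed naming pattern, based on the default ggml-base.bin."""
    return f"ggml-{name}.bin"
```

For example, `pick_model(200)` selects `base`, and `model_filename("base")` gives the file name to point WHISPER_MODEL at.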
Problem: No tray icon visible.
Solutions:
# For GNOME, install AppIndicator extension
sudo apt install gnome-shell-extension-appindicator
# For KDE/XFCE, restart the panel
killall plasmashell && plasmashell & # KDE
xfce4-panel -r # XFCE
- Audio Capture: Uses sounddevice to capture audio from your microphone
- Whisper.cpp: Transcribes audio using the optimized C++ implementation of OpenAI Whisper
- Clipboard: Copies transcription to clipboard via pyperclip
- Auto-paste: Simulates Ctrl+V using pyautogui to paste text
- System Tray: pystray provides the tray icon and menu
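The hand-off between the first two steps can be sketched as follows: recorded samples are written to a 16 kHz mono 16-bit WAV file (the input format whisper.cpp expects), and a command line is built for the whisper.cpp binary. This is an illustration, not the project's code. `write_wav` and `whisper_command` are hypothetical names, the binary path is left to the caller because it depends on your build, and using `-l auto` and `-nt` (no timestamps) is an assumption about how the app invokes whisper.cpp.

```python
import struct
import wave

SAMPLE_RATE = 16000  # whisper.cpp expects 16 kHz mono 16-bit PCM input


def write_wav(path, samples):
    """Write float samples in [-1.0, 1.0] as a 16 kHz mono 16-bit WAV file."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    frames = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


def whisper_command(binary, model, wav_path):
    """Build the argument list for one whisper.cpp run with language auto-detection."""
    return [binary, "-m", model, "-f", wav_path, "-l", "auto", "-nt"]
```

The returned list can be passed to `subprocess.run(..., capture_output=True, text=True)`, and the transcription read from standard output before it is copied to the clipboard.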
Contributions are welcome! Please feel free to submit a Pull Request.
git clone https://github.com/NicolasHuberty/whisper-dictation.git
cd whisper-dictation
# Install in development mode
pip3 install -r requirements.txt
# Make changes to whisper-dictation.py
# Test
python3 whisper-dictation.py
- Support for more hotkey customization via config file
- Web UI for configuration
- Windows and macOS support
- Plugin system for custom post-processing
- Voice commands (punctuation, formatting)
- Multiple language profiles
MIT License - see LICENSE file for details.
- OpenAI Whisper - The amazing speech recognition model
- whisper.cpp - High-performance C++ implementation
- All the amazing open-source libraries used in this project
If you find this useful, please consider giving it a star! ⭐
Made with ❤️ for the Linux community
Have questions? Open an issue on GitHub!