Kuiskaus 🎤

A fast, local speech-to-text application for Apple Silicon Macs using OpenAI's Whisper V3 Turbo model with MLX optimization. Hold a hotkey to record your voice, release to transcribe and automatically insert the text at your cursor position.

Features

Blazing Fast: Runs Whisper on the Apple Silicon GPU via MLX for transcription at 8-15x real time
Global Hotkey: Hold Control+Option (⌃⌥) to record from anywhere
Automatic Text Insertion: Transcribed text is automatically typed at your cursor position
Menu Bar App: Unobtrusive menu bar interface with easy access to settings
100% Local: All processing happens on your Mac - no internet required after setup
Privacy First: Your audio never leaves your device

Requirements

macOS 12.0 or later
Apple Silicon Mac (M1 or later) - Required
Python 3.8 or higher
~1.5GB disk space for the Whisper model
8GB RAM minimum (16GB recommended)

⚠️ Note: This application is optimized exclusively for Apple Silicon and does not support Intel-based Macs.

Installation

Quick Install

Clone this repository:

git clone https://github.com/randomm/kuiskaus.git
cd kuiskaus

Run the setup script:

./setup.sh

The setup script will:

Verify you're running on Apple Silicon
Install UV for ultra-fast package management
Install system dependencies (portaudio, ffmpeg)
Create a Python virtual environment
Install all Python dependencies including MLX
Download the Whisper V3 Turbo model (~1.5GB)
Create launch scripts

Grant accessibility permissions when prompted:
- Go to System Settings > Privacy & Security > Accessibility
- Add and enable Terminal (or your terminal app)
- Restart the app after granting permissions

Manual Installation

If the setup script fails, you can install manually:

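# Verify Apple Silicon (stop here if this prints an error).
# No `exit` call: pasted into an interactive shell, it would close the terminal.
if [[ "$(sysctl -n machdep.cpu.brand_string)" != *"Apple"* ]]; then
    echo "Error: This app requires Apple Silicon" >&2
fi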

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install system dependencies
brew install portaudio ffmpeg

# Create virtual environment
uv venv
source .venv/bin/activate

# Install Python packages
uv pip compile requirements.in -o requirements.txt
uv pip sync requirements.txt
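
The Apple Silicon check at the top of the manual steps can also be done from Python, which may be handy inside the app itself. The following is only a sketch, not the project's own code: the function name is_apple_silicon and its parameters are made up for illustration. It relies on platform.system() returning "Darwin" on macOS and platform.machine() returning "arm64" on a native Apple Silicon interpreter.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when running natively on an Apple Silicon Mac.

    Both arguments default to what the current interpreter reports.
    They are parameters only so the check can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # macOS reports "Darwin"; a native Apple Silicon build reports "arm64".
    return system == "Darwin" and machine == "arm64"


if __name__ == "__main__":
    if is_apple_silicon():
        print("Apple Silicon detected")
    else:
        print("Error: This app requires Apple Silicon")
```

A Python interpreter running under Rosetta reports "x86_64", so this check would reject it. That is probably the desired outcome here, since MLX needs a native arm64 environment.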

Usage

Menu Bar App (Recommended)

Launch the menu bar version:

./launch_kuiskaus.sh

The app will appear in your menu bar as a microphone icon (🎤). Click it to:

See current status
Enable/disable speech recognition
Change Whisper model size
View usage statistics
Quit the app

CLI Version

For a command-line interface:

./launch_cli.sh

How to Use

Start the app using one of the methods above
Hold Control+Option (⌃⌥) to start recording
Speak clearly into your microphone
Release the keys to stop recording and transcribe
The text will be automatically inserted at your cursor position
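
The hold-to-record flow above amounts to a two-state machine: idle until the hotkey goes down, recording until it comes back up. The sketch below shows that logic in isolation. The class name PushToTalk and its update method are hypothetical and are not taken from the project's source; a real listener would call something like this from its keyboard event callback.

```python
class PushToTalk:
    """Track hotkey state and report when recording should start or stop."""

    def __init__(self):
        self.recording = False

    def update(self, hotkey_down):
        """Feed the current hotkey state; return "start", "stop", or None."""
        if hotkey_down and not self.recording:
            self.recording = True
            return "start"
        if not hotkey_down and self.recording:
            self.recording = False
            return "stop"
        return None  # no transition: repeated event or unrelated key


if __name__ == "__main__":
    ptt = PushToTalk()
    # Simulated sequence: idle, press, still held, release.
    for down in (False, True, True, False):
        print(down, ptt.update(down))
```

Returning None for repeated events means the recorder is started once per press, even if the system delivers several "modifier changed" events while the keys are held.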

Performance

With MLX optimization on Apple Silicon:

Whisper V3 Turbo: 8-15x real-time (roughly 0.3-0.6s for 5s of audio)
Model loads in ~1-2 seconds
Runs on the GPU through MLX's Metal backend, using Apple Silicon's unified memory
Minimal CPU usage during transcription
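
To make the "x real-time" figure concrete: the speed factor is the audio duration divided by the processing time. At 8x, a 5 second clip takes 0.625 seconds; at 15x it takes about 0.33 seconds. The helper names below are illustrative only and are not part of the project.

```python
def real_time_factor(audio_seconds, processing_seconds):
    """Seconds of audio transcribed per second of processing."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds, speed_factor):
    """Expected processing time for a clip at a given real-time factor."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_seconds / speed_factor


if __name__ == "__main__":
    clip = 5.0  # seconds of recorded speech
    for factor in (8, 15):
        print(f"{factor}x real-time: {processing_time(clip, factor):.3f}s")
```

Timing your own recordings with time.perf_counter() around the transcription call and passing the result to real_time_factor is a simple way to compare these figures against your own hardware.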

Configuration

Available Models

The menu bar app allows you to switch between different Whisper models:

Turbo (default): Whisper Large V3 Turbo; close to Large accuracy at much higher speed
Base: Smallest and fastest model, lowest accuracy
Small: Good balance of speed and accuracy
Medium: Better accuracy, slower
Large: Best accuracy, slowest

Changing the Hotkey

To modify the hotkey, edit kuiskaus/hotkey_listener_cgevent.py and change the value of required_modifiers.
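
For orientation, this is how a modifier-only hotkey is typically matched against Core Graphics event flags. It is a sketch under assumptions: the exact shape of required_modifiers in the project file is not shown in this README, so the variable below is a stand-in, and hotkey_held is a hypothetical helper. The mask values are the CGEventFlags constants from Apple's headers, copied as plain integers so the example runs without PyObjC.

```python
# CGEventFlags modifier masks (values from Apple's CGEventTypes.h).
MASK_SHIFT = 1 << 17    # kCGEventFlagMaskShift
MASK_CONTROL = 1 << 18  # kCGEventFlagMaskControl
MASK_OPTION = 1 << 19   # kCGEventFlagMaskAlternate
MASK_COMMAND = 1 << 20  # kCGEventFlagMaskCommand

ALL_MODIFIERS = MASK_SHIFT | MASK_CONTROL | MASK_OPTION | MASK_COMMAND

# Hypothetical stand-in for the project's required_modifiers setting.
required_modifiers = MASK_CONTROL | MASK_OPTION


def hotkey_held(event_flags, required=required_modifiers):
    """True when exactly the required modifiers, and no others, are down."""
    # Mask first: real events carry unrelated low bits in the flags field.
    return (event_flags & ALL_MODIFIERS) == required


if __name__ == "__main__":
    print(hotkey_held(MASK_CONTROL | MASK_OPTION))               # both held
    print(hotkey_held(MASK_CONTROL))                             # only one held
    print(hotkey_held(MASK_CONTROL | MASK_OPTION | MASK_SHIFT))  # extra modifier
```

Switching to, say, Command+Shift would mean combining MASK_COMMAND | MASK_SHIFT instead. Requiring an exact match (rather than "at least these modifiers") is a design choice in this sketch; it avoids triggering on shortcuts that merely include Control and Option.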

Troubleshooting

"Accessibility permissions required"

Grant permissions in System Settings > Privacy & Security > Accessibility
Add your terminal application to the list and enable it
Restart the app after granting permissions

No audio is being recorded

Check that your microphone is working in other apps
Ensure no other app is exclusively using the microphone
Try selecting a different audio input device in System Settings

Text not being inserted

Some applications may block programmatic text input
Try using the clipboard paste method (longer text is automatically pasted)
Ensure the target application has focus when releasing the hotkey
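
The note that longer text is pasted automatically suggests a length-based switch between simulated typing and clipboard paste. A minimal sketch of such a switch follows. The threshold of 50 characters and the function name choose_insertion_method are assumptions made for illustration; the project's real cutoff and logic are not documented in this README.

```python
# Hypothetical cutoff; the project's actual threshold is not documented here.
PASTE_THRESHOLD_CHARS = 50


def choose_insertion_method(text, threshold=PASTE_THRESHOLD_CHARS):
    """Return "paste" for long text, "type" for short text, None for empty."""
    text = text.strip()
    if not text:
        return None  # nothing to insert, e.g. only silence was recorded
    return "paste" if len(text) > threshold else "type"


if __name__ == "__main__":
    print(choose_insertion_method("Hello there"))
    print(choose_insertion_method("word " * 30))
```

Pasting tends to be faster and more reliable for long passages, while per-character typing leaves the clipboard untouched, which is one plausible reason to keep both paths.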

Model loading is slow

First-time model download can take several minutes (~1.5GB)
The model is cached locally after first download
Subsequent loads take only 1-2 seconds

Known Issues

Info.plist notification error: If you see errors about Info.plist when running from a virtual environment, this is a known issue with rumps. The app will still work, but notifications may not display correctly.

Privacy & Security

100% Local: All speech processing happens on-device
No Internet Required: Works completely offline after setup
No Data Collection: Your audio and transcriptions never leave your Mac
Open Source: Full source code available for inspection

Development

Project Structure

kuiskaus/
├── kuiskaus/               # Core application package
│   ├── audio_recorder.py   # PyAudio-based recording
│   ├── whisper_transcriber.py # MLX Whisper integration
│   ├── hotkey_listener_cgevent.py # Global hotkey detection
│   ├── text_inserter.py    # Text insertion at cursor
│   ├── app.py              # CLI application
│   └── menubar.py          # Menu bar application
├── tests/                  # Test suite
├── setup.sh                # Installation script
├── launch_kuiskaus.sh      # Menu bar launcher
├── launch_cli.sh           # CLI launcher
├── run_tests.sh            # Test runner
├── requirements.in         # Direct dependencies
├── requirements.txt        # Locked dependencies
└── README.md

Testing

Run the test suite:

./run_tests.sh

Updating Dependencies

If you modify requirements.in, regenerate the locked dependencies:

uv pip compile requirements.in -o requirements.txt
uv pip sync requirements.txt

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

License

MIT License - see LICENSE file for details

Acknowledgments

OpenAI for the Whisper model
Apple for the MLX framework
The Python community for excellent macOS integration libraries

Note: "Kuiskaus" is Finnish for "whisper" 🇫🇮
