Rex Home Agent is a personal AI assistant.
Now that consumer hardware can run local LLMs fairly easily, I wanted to see if I could build a private voice assistant.
Features:
- works out-of-the-box on any M4 Mac.
- tool use (e.g. running a web search for information past the model's cutoff date).
- sound effects for wake word + thinking.
- persistent memory.
- supports barge-in (user can interrupt voice output).
This project was itself developed with local models: PyCharm integrated with Ollama via the Continue plugin, using qwen3-coder:30b as the IDE agent and qwen2.5-coder:7b for in-editor autocomplete.
Assuming your main model is ~30B parameters, this should run comfortably on a MacBook Pro, Mac Mini, Mac Studio, etc. with:
- an M4 chip
- 48GB RAM
- 14 CPU cores
- 20 GPU cores
- Install Homebrew if you don't already have it:
  ```
  /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  (echo; echo 'eval "$(/opt/homebrew/bin/brew shellenv)"') >> ~/.zprofile
  eval "$(/opt/homebrew/bin/brew shellenv)"
  ```
- Install Ollama with `brew install ollama`.
- Start Ollama with `brew services start ollama`.
- Download a model like Mistral Small with `ollama pull huihui_ai/mistral-small-abliterated:latest`.
- Set this model name as `OLLAMA_MODEL_PRIMARY` in config.py.
- Create and activate the virtual environment with `python3 -m venv .venv && source .venv/bin/activate`.
- Install Python dependencies with `pip3 install -r requirements.txt`.
- Get a Picovoice license key and set it as `PICOVOICE_LICENSE_KEY` in your environment variables.
- Start the agent with `python main.py`.
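Before starting the agent, you can sanity-check that Ollama is serving and the model has been pulled by querying its `/api/tags` endpoint, which lists locally available models. A minimal sketch (the helper names here are illustrative, not part of this project):

```python
import json
import urllib.request

def model_available(tags: dict, name: str) -> bool:
    """Return True if `name` appears in an Ollama /api/tags listing."""
    return any(m.get("name") == name for m in tags.get("models", []))

def check_ollama(model: str, host: str = "http://localhost:11434") -> bool:
    # GET /api/tags returns {"models": [{"name": ...}, ...]}.
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_available(json.load(resp), model)
```

If `check_ollama("huihui_ai/mistral-small-abliterated:latest")` returns False, the `ollama pull` step above hasn't completed.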
- Visit the Brave Search API site.
- Create an account and confirm your email.
- Login > Subscription > Select the $5.00 plan (effectively free, see below).
- API keys > Add API Key > Save.
- Copy the API key and set it as an env var named `BRAVE_SEARCH_API_KEY`.
You get $5 in credit each month, and 1000 searches per month on this plan, so unless you're averaging more than 33 searches a day this should be free.
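The web-search tool boils down to one authenticated GET against Brave's web-search endpoint, passing the key in the `X-Subscription-Token` header. A minimal request-building sketch (the function name is hypothetical; the endpoint and header come from Brave's API docs):

```python
import os
import urllib.parse
import urllib.request

BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"

def build_search_request(query: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated Brave web-search request (not yet sent)."""
    url = f"{BRAVE_ENDPOINT}?{urllib.parse.urlencode({'q': query})}"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            # Your BRAVE_SEARCH_API_KEY from the steps above:
            "X-Subscription-Token": api_key,
        },
    )
```

Sending it with `urllib.request.urlopen(build_search_request(q, os.environ["BRAVE_SEARCH_API_KEY"]))` returns JSON search results.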
In your Picovoice console, create a new Porcupine wake word model and download it.
Add the file to /models and update the path to this model as PORCUPINE_WAKE_WORD_PATH in config.py.
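Since config.py isn't shown in this README, its exact layout is an assumption, but the two settings referenced above would look something like this (the wake word filename is a hypothetical example):

```python
# config.py (assumed layout; only the settings this README mentions)
OLLAMA_MODEL_PRIMARY = "huihui_ai/mistral-small-abliterated:latest"
PORCUPINE_WAKE_WORD_PATH = "models/rex_wake_word.ppn"  # hypothetical filename
```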
The voice assistant is structured as a series of independent workers. Each worker runs in its own thread and handles a separate function.
Data is streamed between workers via queues.
```
User voice input
      ↓
  Microphone
      ↓
Audio Capture Worker
      ↓
[captured audio queue]
      ↓
Wake Word Detection Worker
      ↓
[speech audio queue]
      ↓
Speech-To-Text Worker
      ↓
[STT text queue]
      ↓
LLM Worker ←→ Agent (intent extraction, chat memory, tool selection)
      ↓
[LLM response queue]
      ↓
Text-To-Speech Worker
      ↓
[TTS audio queue]
      ↓
Speaker → Agent voice output
```
Any worker can also write directly to the audio queue (e.g. for sound effects like the wake and thinking chimes).
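The worker/queue pattern described above can be sketched as follows (the stage names and transforms are illustrative stand-ins, not the project's actual classes):

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue, transform):
    """Generic pipeline stage: read from inbox, process, write to outbox."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: shut this stage down
            outbox.put(None)      # ...and propagate shutdown downstream
            break
        outbox.put(transform(item))

# Two illustrative stages: a fake "STT" and a fake "LLM".
stt_in, llm_in, tts_in = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=worker, args=(stt_in, llm_in, str.upper)),
    threading.Thread(target=worker, args=(llm_in, tts_in, lambda t: f"reply to {t}")),
]
for t in threads:
    t.start()

stt_in.put("hello rex")
stt_in.put(None)                  # shutdown flows through the whole pipeline
print(tts_in.get())               # reply to HELLO REX
for t in threads:
    t.join()
```

Because each stage only touches its own in/out queues, stages can be added, removed, or bypassed (as with sound effects writing straight to the audio queue) without the others knowing.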
- Implement barge-in
- Sustain conversations: don't require the wake word again unless there has been no user voice input for ~5 minutes.
- Add model memory like LangChain's ConversationBufferMemory.
- Add a model call for intent extraction.
- Add tool to perform a web search.
- Add a sleep sound effect when the agent goes back to sleep.
Sound credits:
- Wake sound: kickhat on Freesound.org
- Thinking sound 1: Tissman on Freesound.org
- Thinking sound 2: DanJFilms on Freesound.org