Have you ever had a real conversation with a local LLM? Or even taken a VoIP (SIP) phone call with one? #576
Replies: 4 comments 2 replies
-
This is really impressive work. A fully on-device voice agent like Kurtis E1 shows how powerful local AI has become, especially for privacy-first, offline use cases. The stack choice makes a lot of sense, and SIP VoIP integration sounds like a big step forward. Even if it's not aimed at math or coding, it's a strong proof of what on-device workflows can look like. Wishing you success in finding the right partners to turn these POCs into real-world applications.
-
Really interesting work on Kurtis E1! The on-device stack (Whisper STT → LLM → Coqui TTS) is a solid foundation. For the SIP/VoIP integration you're testing, one of the trickier parts is getting the RTP audio in and out of the local pipeline without introducing too much latency. Two things tend to matter here: the audio path itself, and the choice of SIP stack for Python/local integration.
On the SIP stack point: VoIPBin is built specifically for AI agents and handles the RTP/STT/TTS layer on its side, so the local model only needs to process text over a simple API. That could help isolate the SIP complexity while you focus on the on-device LLM side. It also supports Direct Hash SIP URIs, so you can test without provisioning a real phone number. Either way, good luck with the integration. SIP VoIP + local LLM is a genuinely useful combination for privacy-sensitive use cases. (Disclosure: I work on VoIPBin, but the advice above applies regardless of which approach you take.)
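Whichever SIP stack you pick, getting RTP audio into the local pipeline starts with unpacking the RTP header (RFC 3550) from each UDP payload before the samples can reach the STT stage. A minimal sketch in Python, assuming plain RTP without header extensions (the packet built at the bottom is synthetic test data, not real call traffic):

```python
import struct

def parse_rtp(packet: bytes):
    """Parse a minimal RTP header (RFC 3550) and return (header dict, payload).

    Ignores the extension (X) bit, so it assumes no RTP header extensions.
    """
    if len(packet) < 12:
        raise ValueError("packet too short for RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6
    cc = b0 & 0x0F                      # number of CSRC identifiers
    payload_type = b1 & 0x7F
    header_len = 12 + 4 * cc            # fixed header plus CSRC list
    return {
        "version": version,
        "payload_type": payload_type,   # e.g. 0 = PCMU, 8 = PCMA
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }, packet[header_len:]

# Synthetic packet: version 2, PT 0 (PCMU), seq 1, ts 160, and a 160-byte
# payload (20 ms of 8 kHz G.711 audio).
pkt = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x1234) + b"\x7f" * 160
hdr, payload = parse_rtp(pkt)
print(hdr["payload_type"], hdr["sequence"], len(payload))  # 0 1 160
```

From there the payload bytes can be decoded (G.711 etc.) and buffered into the frame size the STT model expects; jitter buffering and packet-loss handling are the parts that usually dominate perceived latency.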
-
This is a fascinating project. Running the entire voice agent stack on-device with Apple Silicon is genuinely impressive, and the SIP/VoIP integration path you're exploring is the right direction for making it practically usable. A few thoughts on the SIP integration side, since embedding a full SIP client on-device brings its own challenges:
Architecture suggestion for the SIP layer: instead of embedding a full SIP UA in the same process as the AI engine, it's worth considering a split: a lightweight SIP/RTP gateway that connects to the trunk and streams audio to a local socket, with Kurtis consuming from that socket. This isolates the latency-sensitive SIP/RTP work from the inference pipeline.

For outbound call scenarios (or if you want to expose the agent on a stable SIP URI without managing a PBX), VoIPBin provides Direct Hash SIP URIs.

The on-device + SIP combination has real potential for private enterprise deployments. Looking forward to seeing the VoIP integration land. Will you be using baresip or a custom UA for the SIP stack?
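The gateway/agent split above can be sketched with a tiny framing protocol over a local socket. This is a minimal illustration, not any project's actual wire format: the 4-byte length prefix and frame size are assumptions, and a `socketpair` stands in for the Unix socket the gateway would expose.

```python
import socket
import struct

FRAME_BYTES = 320  # 20 ms of 16-bit mono PCM at 8 kHz (an assumed frame size)

def send_frame(sock: socket.socket, frame: bytes) -> None:
    """Gateway side: length-prefix each audio frame so the consumer can re-frame."""
    sock.sendall(struct.pack("!I", len(frame)) + frame)

def recv_frame(sock: socket.socket) -> bytes:
    """Agent side: read exactly one length-prefixed frame off the local socket."""
    header = sock.recv(4, socket.MSG_WAITALL)
    (length,) = struct.unpack("!I", header)
    return sock.recv(length, socket.MSG_WAITALL)

# A socketpair stands in for the gateway <-> agent link.
gateway, agent = socket.socketpair()
send_frame(gateway, b"\x00\x01" * (FRAME_BYTES // 2))
frame = recv_frame(agent)
print(len(frame))  # 320
```

The point of the explicit framing is that the inference process never has to care about RTP timing: it just pulls fixed-size PCM frames at its own pace, and backpressure stays on the gateway side.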
-
Really interesting project! The SIP/VoIP integration you're working on is exactly the hard part. For anyone trying to replicate or extend this kind of setup, VoIPBin is an open-source CPaaS that simplifies the phone call side of an AI voice agent. The key feature for a setup like yours is Media Offloading: rather than building the full RTP pipeline and STT integration yourself, VoIPBin handles the audio transport layer and exposes a clean API for your LLM to interact with. Your agent just sends and receives text.

Getting started is instant, with no OTP or manual verification:

```shell
# Get an accesskey
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username":"kurtis-agent","password":"secret"}'
```

You also don't need a real phone number to test: use a Direct Hash SIP URI. For on-device / local deployments like Kurtis E1, this means the SIP signaling is handled externally by VoIPBin while your MLX models handle the actual reasoning, which keeps the on-device compute budget focused on the LLM/STT/TTS stack.

Go SDK:
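With media offloading, the local side reduces to a text-in/text-out turn handler. The sketch below is hypothetical: the thread doesn't show VoIPBin's actual webhook payload or response schema, so the `{"action": "talk"}` shape and function names here are illustrative assumptions, with a stub in place of the local MLX model.

```python
from typing import Callable

def handle_text_turn(user_text: str, llm: Callable[[str], str]) -> dict:
    """One text-only conversation turn: the platform sends transcribed speech,
    we return the reply text for the platform's TTS to speak.

    The response shape is an illustrative assumption, not a documented schema.
    """
    reply = llm(user_text)
    return {"action": "talk", "text": reply}

# Stub standing in for the local on-device model.
def stub_llm(prompt: str) -> str:
    return f"You said: {prompt}"

print(handle_text_turn("hello", stub_llm))
# {'action': 'talk', 'text': 'You said: hello'}
```

The appeal of this contract is that the on-device process needs no SIP, RTP, or codec code at all; swapping the stub for the real MLX-LM call is the only integration point.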
-
Check out Kurtis E1: A Fully On-Device MLX Voice Agent.
The entire stack runs on-device, leveraging MLX-LM on Apple Silicon:
ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct
This showcases the power of local AI/ML. I am also actively developing the SIP #VoIP integration (now in testing). The goal? To let you take a phone call and talk directly with your private agent, even without a computer or internet connection.
While Kurtis isn't built for math/coding, it shows a valuable path forward for on-device workflows.
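To make the on-device workflow concrete, one voice-agent turn can be sketched as a simple STT → LLM → TTS pipeline with pluggable stages. The stubs and function names below are illustrative, not Kurtis E1's actual API; in a real run the callables would wrap Whisper, MLX-LM, and a local TTS engine.

```python
from typing import Callable

def voice_turn(audio_in: bytes,
               stt: Callable[[bytes], str],
               llm: Callable[[str], str],
               tts: Callable[[str], bytes]) -> bytes:
    """One voice-agent turn: audio in -> transcript -> reply text -> audio out.

    Each stage is a plain callable, so the real on-device models can be
    dropped in behind the same interface.
    """
    transcript = stt(audio_in)
    reply = llm(transcript)
    return tts(reply)

# Stubs standing in for the real on-device models.
stt = lambda audio: "what time is it"
llm = lambda text: "It is noon."
tts = lambda text: text.encode("utf-8")   # pretend synthesis

print(voice_turn(b"\x00" * 160, stt, llm, tts))  # b'It is noon.'
```

Keeping the three stages behind plain callables is what makes the whole loop runnable offline: nothing in the turn depends on a network service.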
We are actively looking for partners and clients to build out these POCs into real-world use cases.
https://www.ethicalabs.ai/ isn't a startup. We are not looking for VCs, equity deals, or grants: we're an open-source project.
If you like the R&D, you can support it directly: https://github.com/sponsors/ethicalabs-ai?frequency=one-time