OpenAI Realtime API over WebRTC Push-To-Talk Android Phone[/Mobile] + Watch[/Wear] + [Bluetooth]AudioRouting:
- https://platform.openai.com/docs/guides/realtime
- https://platform.openai.com/docs/api-reference/realtime
| Edited (cut, some think "fake") | Full/Raw (definitely not fake) |
|---|---|
| https://youtube.com/shorts/2dk9uPPfRKw https://youtu.be/2dk9uPPfRKw | https://youtu.be/KTrm58dskTk |
- Phone[/Mobile]: Android 14 (API 34) and above
- Watch[/Wear]: Android Wear OS 5.0 (Android 14, API 34) and above
- Q: Why not just write this as a webapp like everyone else?
  Example: https://youtu.be/oMKOtYQljM4
  (Almost all examples out there are Node/JavaScript/TypeScript or Python based)
  A: Because I want to:
  - Control it remotely via a watch
  - Route audio to/from different devices
  - Do something different (Native Android) from, I guess, almost everyone else :/
- Q: Why use WebRTC Android SDK and not LiveKit Android SDK or GetStream Android SDK (or even GetStream Android Compose SDK)?
  A: I was so confused by the LiveKit and GetStream offerings that I gave up and decided to just use `io.github.webrtc-sdk:android`.
  I was even confused by the WebRTC Android SDK readme saying:
  "We also offer a shadowed version that moves the org.webrtc package to livekit.org.webrtc, avoiding any collisions with other WebRTC libraries."
  Why is WebRTC.org distributing a `livekit` module? Are they related?
  I believe the LiveKit and GetStream SDKs could be amazeballz, especially for general peer-to-peer or multi-party WebRTC, but I wanted to learn how to do AI WebRTC myself and mitigate any 3rd party points of failure.
  I even looked at https://github.com/shepeliev/webrtc-kmp, but felt safer sticking with the original raw WebRTC Android SDK than with a Kotlin-Multi-Platform wrapper.
- Q: Why OpenAI and not Google (Gemini), Microsoft (Copilot), Perplexity, Anthropic (Claude), DeepSeek, etc?
  (https://firstpagesage.com/reports/top-generative-ai-chatbots/)
  A: As far as I know, as of 2025/01/30, OpenAI is the only company that has an AI WebRTC "Realtime" API.
  There is LiveKit, which OpenAI uses in their OpenAI Android App, but LiveKit does not directly provide access to an AI:
  https://docs.livekit.io/agents/quickstarts/s2s/
  https://docs.livekit.io/agents/quickstarts/voice-agent/
  https://docs.livekit.io/agents/openai/overview/
  https://playground.livekit.io/
  https://github.com/livekit-examples/realtime-playground

  I am also still trying to better understand the relationship between OpenAI and LiveKit.
  On 2024/10/03 they mentioned some partnership with each other.
  I have decompiled the OpenAI Android App with JADX and see extensive use of the LiveKit Android SDK, but I don't see how the LiveKit Android SDK helps them or me or anyone else much more than OpenAI just using the WebRTC Android SDK.
  OpenAI also shows they are hiring for a "Software Engineer, Real Time" [which I have applied for and [I think] am fully qualified for but have never heard back from them about], but that job has been listed since 2024/08, and why doesn't OpenAI just hire or buy the whole LiveKit team?
(Not necessarily in any order)
- Localize strings (I committed the sin of hard coding strings)
- Tests (another sin I committed)
- Get `Stop` working better
- Standalone `Wear` version (lower priority; requires adding tiles for settings, conversation, etc)
- Learn Tool/Function integration
- Find way to integrate with Gmail/Tasks/Keep/etc
- Find way to save conversations to https://chatgpt.com/ history
- Implement a `VoiceInteractionService`? https://developer.android.com/reference/android/service/voice/VoiceInteractionService
- The on/off switch acts a little odd
- SharedViewModel needs to search for remote device when "waking up"
(especially when disconnected/screen-off, when screen turns on or reconnecting)
- If Mobile or Wear physical device wireless debugging does not connect in Android Studio
  (from https://youtu.be/lLUYPdaf_Ow):
  - Look on the device under Developer options > Wireless debugging for the pairing IP address and port
  - Example: `adb pair 10.0.0.113:42145`
    (replace the IP address and port with yours)
  - Look on the device under Developer options > Wireless debugging for the connection IP address and port
  - Example: `adb connect 10.0.0.113:43999`
    (replace the IP address and port with yours)
- To record videos, use https://github.com/Genymobile/scrcpy
  https://github.com/Genymobile/scrcpy/blob/master/doc/recording.md#recording
  (confirmed works on both Mobile/Phone and Wear/Watch!):
  `brew install scrcpy`
  `scrcpy -s 10.0.0.113:43999 --record=wear.mp4 & scrcpy -s 10.0.0.137:46129 --record=mobile.mp4 &`
  (replace the IP addresses and ports with yours)
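
The pair/connect and dual-recording steps above can be wrapped in a few shell helpers. This is only a rough sketch, not something that ships with this project: the function names (`run`, `is_endpoint`, `adb_wireless`, `record_both`) and the `DRY_RUN` switch are made up for illustration, and the only tool invocations used are the `adb pair`, `adb connect`, and `scrcpy -s ... --record=...` forms already shown above.

```shell
# Hypothetical helpers; the names and the DRY_RUN switch are illustrative only.

# Run a command, or just print it when DRY_RUN=1 (handy for checking addresses first).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Succeeds only if the argument looks like IPv4:port, e.g. 10.0.0.113:42145.
is_endpoint() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}$'
}

# Pair with the pairing endpoint, then connect to the (different) connection endpoint.
# adb itself prompts for the pairing code shown on the device.
adb_wireless() {
  pair_ep=$1
  connect_ep=$2
  is_endpoint "$pair_ep" || { echo "bad pairing endpoint: $pair_ep" >&2; return 1; }
  is_endpoint "$connect_ep" || { echo "bad connection endpoint: $connect_ep" >&2; return 1; }
  run adb pair "$pair_ep" && run adb connect "$connect_ep"
}

# Record two connected devices at once. Each scrcpy runs in the background,
# and the function returns once both recordings have stopped.
record_both() {
  wear_ep=$1
  mobile_ep=$2
  run scrcpy -s "$wear_ep" --record=wear.mp4 &
  run scrcpy -s "$mobile_ep" --record=mobile.mp4 &
  wait
}
```

After sourcing the helpers, something like `DRY_RUN=1 adb_wireless 10.0.0.113:42145 10.0.0.113:43999` prints the two adb commands without running them; drop `DRY_RUN=1` to actually pair and connect, then call `record_both 10.0.0.113:43999 10.0.0.137:46129` (with your own addresses and ports).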