Description
Objective:
Improve the ability to align text and audio deltas for smoother playback and interruption handling.
Proposed solutions (in order of preference):
- Emit matching event_ids on text and audio delta events that cover the same span of speech, so clients can pair them.
- Alternatively, provide approximate audio frame numbers for sentence pauses or completions.
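
To illustrate the first option, here is a rough client-side sketch of what pairing could look like if both delta types carried a shared identifier. The event shape is an assumption made for this example, not an existing API: the `align_id` field and the `text.delta` / `audio.delta` type names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AlignedSegment:
    """Text and audio that belong to the same span of speech."""
    text: str = ""
    audio: bytearray = field(default_factory=bytearray)


def align_deltas(events):
    """Group text and audio deltas by a shared identifier.

    Assumes each event is a dict with a hypothetical "align_id" key,
    a "type" of "text.delta" or "audio.delta", and a "delta" payload
    (str for text, bytes for audio).
    """
    segments = defaultdict(AlignedSegment)
    for ev in events:
        seg = segments[ev["align_id"]]
        if ev["type"] == "text.delta":
            seg.text += ev["delta"]
        elif ev["type"] == "audio.delta":
            seg.audio.extend(ev["delta"])
    # Plain dict, ordered by first appearance of each align_id.
    return dict(segments)


# Hypothetical stream: two sentences, text and audio interleaved.
events = [
    {"align_id": "s1", "type": "text.delta", "delta": "Hello "},
    {"align_id": "s1", "type": "audio.delta", "delta": b"\x01\x02"},
    {"align_id": "s1", "type": "text.delta", "delta": "there."},
    {"align_id": "s2", "type": "text.delta", "delta": "How are you?"},
    {"align_id": "s1", "type": "audio.delta", "delta": b"\x03"},
    {"align_id": "s2", "type": "audio.delta", "delta": b"\x04\x05"},
]
segments = align_deltas(events)
```

With this kind of grouping, a client would know exactly which audio bytes correspond to which piece of text, even when the two streams arrive out of step.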
Benefits:
- Enables graceful sentence completion before cutting off buffered audio from the previous turn.
- Improves overall user experience with more natural speech flow and interruptions.
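
As a sketch of the sentence-completion benefit, this is roughly how a client might pick a cutoff point if it had approximate frame numbers for sentence ends. The function name, its parameters, and the idea of receiving a sorted list of sentence-end frames are all assumptions for illustration.

```python
from bisect import bisect_left


def cutoff_frame(playhead_frame, sentence_end_frames, buffered_frames):
    """Pick the frame at which to stop playback after an interruption.

    sentence_end_frames: sorted, approximate frame numbers where
        sentences end (hypothetical metadata from the server).
    buffered_frames: number of frames already received and buffered.

    Returns the first sentence end at or after the playhead if that
    audio is already buffered; otherwise stops at the playhead.
    """
    i = bisect_left(sentence_end_frames, playhead_frame)
    if i < len(sentence_end_frames) and sentence_end_frames[i] <= buffered_frames:
        return sentence_end_frames[i]
    return playhead_frame


# Interrupted at frame 1000 with 4000 frames buffered:
# playback continues to the sentence end at frame 1500, then stops.
stop_at = cutoff_frame(1000, [800, 1500, 3000], 4000)
```

Without such markers the client can only cut at the playhead (mid-word) or play out the whole buffer, which is the problem described above.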
This would make life much easier :)