Refactor chunking logic for lower-latency realtime audio streaming by mgupta-soundhound · Pull Request #22 · soundhound/houndify-sdk-go

mgupta-soundhound · 2026-03-20T19:36:55Z

Problem(s)

The realtime streaming example currently sends LPCM audio in fixed 1-second chunks. This doesn’t reflect actual realtime behaviour and adds unnecessary latency before partial transcripts and SafeToStopAudio signals are received.
The chunking logic is embedded in the example and assumes a WAV format, making it harder to reuse, validate, or modify for different LPCM formats.
Such large chunks also increase the chance that the client is still writing the final chunk after the server has determined it has enough audio, processed the query, and closed the connection. In that case the client can attempt to write to a closed socket and encounter a TCP reset.

Solution Summary

Improve realtime LPCM streaming packetization by moving chunk-size calculation into a reusable helper, tightening input validation, and adding unit test coverage for chunk sizing across common sample rates.

What changed

Reduce streamed audio packetization from 1 second to smaller realtime intervals for faster partial transcripts and SafeToStopAudio handling.
Add GetLPCMStreamInfo to centralize LPCM chunk-size and streaming-interval calculation.
Validate LPCM inputs more strictly (numChans, bitDepth, sampleRate, and targetStreamIntervalMs).
Refactor the example streamer to use the shared LPCM stream info helper.
Add isolated table-driven tests covering expected chunk sizes and streaming intervals for multiple sample rates and intervals, plus invalid-input cases.
Replace time.Sleep() with a time.Ticker in the audio streaming example to avoid drift over time. This change also helps prevent writing further chunks after SafeToStopAudio has been received.
Other upgrades:
- Update Go version to 1.26
- Remove usage of deprecated io/ioutil
- Remove usage of deprecated github.com/pkg/errors
- Replace gotest.tools/assert with github.com/stretchr/testify/assert

Why

Keeps realtime streaming pacing consistent with frame-aligned LPCM audio.
Makes the chunking logic reusable and easier to reason about outside the example.
Adds regression coverage for sample rates where chunk math is easy to get wrong.

LPCM Streaming Calculations
The helper calculates streaming info in three steps:

Compute the ideal byte count for the requested streaming interval using sample rate, channel count, and bit depth.
Align that byte count to full LPCM audio frames so chunks never split samples.
Derive the actual streaming interval represented by the aligned chunk size, so pacing stays consistent with the bytes being sent.
This keeps the example’s realtime stream frame-aligned and minimizes drift between transmitted audio data and wall-clock timing.

Testing

Added lpcm_stream_info_test.go coverage for valid and invalid cases.
Manually tested the example.go code:

% ./example -voice ../test_audio/what_is_the_weather_like_in_toronto.wav -stream 
what
what
what is
what is the
what is the
what is the weather
what is the weather
what is the weather
what is the weather like
what is the weather like
what is the weather like in
what is the weather like in
what is the weather like in tur
what is the weather like in toronto
what is the weather like in toronto
what is the weather like in toronto
what is the weather like in toronto
what is the weather like in toronto
what is the weather like in toronto
Reached end of file
what is the weather like in toronto
what is the weather like in toronto
what is the weather like in toronto
The weather is 38 °F and raining in Toronto, Canada.

Follow-up(s)

As a follow-up, we should consider removing the current io.Pipe reader/writer pattern from the example. In Go, io.Pipe behaves like an unbuffered channel: each write blocks until a reader has consumed the data. That means the writer’s pacing is coupled to the request body reader, which can distort the intended realtime streaming cadence. Replacing this pattern with a simple buffered approach would decouple writes from reads, so the example’s writes no longer block on the reader.

zhili-soundhound · 2026-03-27T14:57:49Z

Should we move this file into the example folder to indicate that it is not part of the core SDK but a set of helper functions used by the client code?

We could keep it here if we want to add a simpler API in the future that lets SDK users stream at a given interval.

I’m intentionally including this in the main SDK so clients can directly configure streaming chunking and duration for LPCM audio formats. This makes it a first-class part of the SDK, available to anyone who needs it. If it lived under github.com/soundhound/houndify-sdk-go/example, clients would have to either copy the code each time or import an example package, which isn’t ideal for production use. As it stands, they can simply import github.com/soundhound/houndify-sdk-go and use it out of the box.

zhili-soundhound · 2026-03-27T14:58:06Z

LGTM

mgupta-soundhound changed the title ~~Reduce chunk size from 1s to 20ms for audio streaming~~ [Draft] Reduce chunk size from 1s to 20ms for audio streaming Mar 20, 2026

mgupta-soundhound force-pushed the mgupta/reduce_audio_packetization branch 2 times, most recently from c84e205 to c10c6cd Compare March 20, 2026 23:30

mgupta-soundhound changed the title ~~[Draft] Reduce chunk size from 1s to 20ms for audio streaming~~ Refactor chunking logic for lower-latency realtime audio streaming Mar 20, 2026

mgupta-soundhound force-pushed the mgupta/reduce_audio_packetization branch 5 times, most recently from e70d0dc to a92c6b4 Compare March 22, 2026 14:54

mgupta-soundhound self-assigned this Mar 22, 2026

mgupta-soundhound force-pushed the mgupta/reduce_audio_packetization branch 4 times, most recently from 86b1e3d to 8d6e5fe Compare March 22, 2026 16:13

Refactor chunking logic for lower-latency realtime audio streaming

4620d42

mgupta-soundhound force-pushed the mgupta/reduce_audio_packetization branch from 8d6e5fe to 4620d42 Compare March 22, 2026 16:15

zhili-soundhound reviewed Mar 27, 2026


mgupta-soundhound wants to merge 1 commit into master from mgupta/reduce_audio_packetization
