Description
Feature Type
I cannot use LiveKit without it
Feature Description
Hi LiveKit Team,
I am currently using the livekit-plugins-nvidia plugin to leverage NVIDIA Parakeet models for transcription in my voice agents.
While the STT performance is excellent, many use cases (e.g., multi-user meetings, interviews) require identifying "who spoke when." Other LiveKit plugins, such as Speechmatics and Deepgram, already offer a native enable_diarization flag that populates the speaker_id field.
Feature Request:
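Is there a plan to add speaker diarization to the NVIDIA Riva plugin (livekit-plugins-nvidia), ideally through an enable_diarization option that populates speaker_id as the other plugins do?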
Technical Context:
NVIDIA Riva supports diarization via Sortformer for both streaming and batch modes.
Enabling this typically requires passing diarization parameters to the Riva ASR config (e.g., enable_speaker_diarization: true).
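To illustrate the kind of mapping I have in mind, here is a minimal sketch in plain Python. It assumes Riva returns a per-word speaker tag when diarization is enabled. The names DiarizedWord, SpeakerSegment, and group_by_speaker are hypothetical stand-ins, not part of the Riva client or the LiveKit plugin API; the real plugin would read the tags from the Riva response and fill in the existing speaker_id field.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DiarizedWord:
    """Hypothetical stand-in for one recognized word with its speaker tag."""
    text: str
    speaker_tag: int


@dataclass
class SpeakerSegment:
    """Hypothetical stand-in for a transcript chunk carrying a speaker_id."""
    speaker_id: str
    text: str


def group_by_speaker(words: Iterable[DiarizedWord]) -> List[SpeakerSegment]:
    """Merge consecutive words with the same speaker tag into one segment."""
    segments: List[SpeakerSegment] = []
    for word in words:
        # Assumption: the tag is a small integer; "S<tag>" is an arbitrary label format.
        speaker_id = f"S{word.speaker_tag}"
        if segments and segments[-1].speaker_id == speaker_id:
            # Same speaker as the previous word: extend the current segment.
            segments[-1].text += " " + word.text
        else:
            # Speaker changed (or first word): start a new segment.
            segments.append(SpeakerSegment(speaker_id, word.text))
    return segments
```

Each resulting segment could then be emitted as a transcript event with its speaker_id set, matching how the Speechmatics and Deepgram plugins surface speakers.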
This would bring the NVIDIA plugin to feature parity with other SOTA transcription plugins in the ecosystem. Thank you!
Workarounds / Alternatives
No response
Additional Context
No response