[Feature] Option to choose between Pyannote and NeMo for diarization #115

@Arche151

First of all, @pluja, I want to thank you again for developing whishper (soon to be anysub)!

I check out the v4 branch basically every day because I'm so excited for anysub to be ready! :) And I can't believe that my feature request, user authentication, will actually be implemented. Thanks so much for that!

My new feature request probably comes way too late, considering how deeply WhisperX will be integrated into anysub and how much work you've put into the WhisperX API, but I want to try anyway.

I suggest adding the option to choose between Pyannote and Nvidia NeMo for diarization for two reasons:

  1. Unlike Pyannote, NeMo is truly open source, with no requirement to obtain and enter an authorization token.
  2. In my personal tests, and to my surprise, NeMo was far more accurate than Pyannote at diarizing speakers.
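To make the request more concrete, here is a minimal sketch of what a selectable diarization backend could look like. Everything in it is hypothetical: the names `SpeakerTurn`, `BACKENDS`, `get_diarizer`, and the `hf_token` parameter are my own placeholders, not part of whishper, anysub, WhisperX, Pyannote, or NeMo, and the actual model calls are left as stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SpeakerTurn:
    """One diarized segment: who spoke, and when (in seconds)."""
    start: float
    end: float
    speaker: str


# A diarizer maps an audio file path to a list of speaker turns.
Diarizer = Callable[[str], List[SpeakerTurn]]


def make_pyannote_diarizer(hf_token: Optional[str] = None) -> Diarizer:
    # Assumption: the Pyannote backend needs an access token, so fail early.
    if not hf_token:
        raise ValueError("the 'pyannote' backend requires an access token")

    def diarize(audio_path: str) -> List[SpeakerTurn]:
        # Stub: the real Pyannote pipeline call would go here.
        raise NotImplementedError("Pyannote integration is not part of this sketch")

    return diarize


def make_nemo_diarizer(hf_token: Optional[str] = None) -> Diarizer:
    # The token is accepted only for a uniform signature; it is ignored here.
    def diarize(audio_path: str) -> List[SpeakerTurn]:
        # Stub: the real NeMo diarization call would go here.
        raise NotImplementedError("NeMo integration is not part of this sketch")

    return diarize


# Hypothetical registry of backend names to factory functions.
BACKENDS: Dict[str, Callable[[Optional[str]], Diarizer]] = {
    "pyannote": make_pyannote_diarizer,
    "nemo": make_nemo_diarizer,
}


def get_diarizer(backend: str, hf_token: Optional[str] = None) -> Diarizer:
    """Return the diarizer for the configured backend name (case-insensitive)."""
    try:
        factory = BACKENDS[backend.lower()]
    except KeyError:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown diarization backend {backend!r}; expected one of: {known}")
    return factory(hf_token)
```

With something like this, the backend could be picked from a single setting (for example an environment variable or a UI dropdown), and only the Pyannote path would ask the user for a token.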

@MahmoudAshraf97 created whisper-diarization, which is partly based on WhisperX but uses NeMo for diarization.

I know that I am asking a lot here, but for the two reasons I stated, I would really appreciate it if you could still consider it.
