Feat: Parler TTS Support #337

Draft
DogeDark wants to merge 1 commit into floneum:main from DogeDark:parler-support

@DogeDark DogeDark commented Feb 24, 2025

Adds basic Parler TTS support using the Candle integration. This supports both Mini and Large Parler models.

This implementation is very basic: there is no streaming, and output methods are limited. I did not (knowingly) implement any of the optimization tips.

This implementation is slow. The optimization tips link mentions that a time-to-first-speech of 500 ms is achievable once streaming is implemented. On my machine, generation with the Mini model takes a good 10 seconds on an RTX 3080 with CUDA.
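
For anyone who wants to reproduce a timing like the one above, here is a minimal sketch of a wall-clock timing helper using only the standard library. The `timed` function is a hypothetical name for illustration and is not part of this PR. It wraps a synchronous closure; for the async generation call, the same idea would apply by taking `Instant::now()` before the `.await` and reading `elapsed()` after it.

```rust
use std::time::{Duration, Instant};

/// Run a closure and return its result together with the elapsed wall-clock time.
/// Hypothetical helper for illustration; not part of this PR.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

A call such as `let (result, elapsed) = timed(|| expensive_work());` followed by `println!("{elapsed:?}")` would print the measured duration, where `expensive_work` stands in for whatever is being measured.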

Wav file output is behind the wav feature flag.

Demo

yodude1.mp4
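use std::path::PathBuf;
// ParlerBuilder, ParlerSource, and GenerationSettings come from the crate this PR adds;
// their import path is not shown here.

#[tokio::main]
async fn main() {
    // Build the Parler model using the MiniV1 source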
    let parler = ParlerBuilder::default()
        .with_source(ParlerSource::MiniV1)
        .build()
        .await
        .unwrap();

    // Get a task handle
    let task = parler.generate(
        "Yo dude! This Parler text-to-speech integration for Kalosm is sick!",
        "Will's voice is enthusiastic and slightly fast in delivery, with a very close recording that almost has no background noise.",
    );
    
    // Set settings
    let task = task.with_settings(
        GenerationSettings::new()
            .with_temperature(Some(1.0))
            .with_top_p(Some(0.2)),
    );

    // Get the decoder and output to wav file.
    let decoder = task.await.unwrap();
    decoder
        .to_wav(PathBuf::from("./parler-output.wav"))
        .unwrap();
}
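
The demo sets a temperature of 1.0 and a top-p of 0.2. As a rough illustration of what those two settings usually mean, here is a generic sketch of temperature scaling followed by top-p (nucleus) filtering over a token distribution. This is the textbook technique under assumed names (`softmax_with_temperature` and `top_p_filter` are hypothetical); it is not necessarily how GenerationSettings or the Candle sampler applies them internally.

```rust
/// Convert raw logits into probabilities after dividing by a temperature.
/// Assumes `temperature > 0`; lower values sharpen the distribution.
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    let scaled: Vec<f32> = logits.iter().map(|l| l / temperature).collect();
    // Subtract the maximum before exponentiating for numerical stability.
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Keep the smallest set of highest-probability tokens whose cumulative
/// probability reaches `top_p`, zero out the rest, and renormalize.
fn top_p_filter(probs: &[f32], top_p: f32) -> Vec<f32> {
    // Token indices sorted by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    let mut kept = vec![0.0f32; probs.len()];
    let mut cumulative = 0.0f32;
    for &i in &order {
        kept[i] = probs[i];
        cumulative += probs[i];
        if cumulative >= top_p {
            break;
        }
    }

    // Renormalize so the surviving probabilities sum to 1.
    let total: f32 = kept.iter().sum();
    kept.iter().map(|p| p / total).collect()
}
```

Under this scheme, a low top-p such as 0.2 keeps only the most likely token whenever that token alone has probability 0.2 or more, so sampling becomes close to greedy; raising top-p lets more candidates through.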

To-Do:

  • Maybe try optimizing it.
  • Kalosm-sound re-export?
