
Conversation

@rjames-0 rjames-0 commented Oct 31, 2025

This PR implements an optimization that prevents growing overhead when running faster-whisper with large batches and token suppression enabled. Fixes the issue reported in #1566.

The suppress-token list is no longer sorted and deduplicated on every add call (which scaled badly when batching); instead, this is done once at apply time, just before launching the CUDA kernel.
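The idea can be sketched as follows. This is a minimal illustration, not the PR's actual code: the class and method names (`SuppressTokens`, `add`, `apply`) are hypothetical, chosen only to mirror the add/apply distinction described above.

```python
class SuppressTokens:
    """Sketch: add() only appends (cheap, O(k) per call); the sort and
    deduplication happen once in apply(), just before the logits are
    masked (standing in for the CUDA kernel launch)."""

    def __init__(self):
        self._tokens = []

    def add(self, tokens):
        # No per-call sort or dedup; with many add() calls per batch,
        # doing that work here is what scaled badly.
        self._tokens.extend(tokens)

    def apply(self, logits):
        # Deduplicate and sort exactly once, at apply time.
        for t in sorted(set(self._tokens)):
            logits[t] = float("-inf")
        return logits


# Usage: duplicates across add() calls are handled once at apply time.
suppress = SuppressTokens()
suppress.add([3, 1])
suppress.add([1, 2])
masked = suppress.apply([0.0, 0.0, 0.0, 0.0, 0.0])
```

The cost of sorting and deduplicating thus stays constant per apply, instead of being paid on every add call.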

Results from a local machine test with and without the optimization:

(benchmark image)

