Whisper models in Foundry Local SDK: v1 models return empty text, v2 models only transcribe first 30 seconds #517

@yuvalshmaryahu

Hi,

I’m trying to use the Whisper-CPU models through the Foundry Local SDK, but I’m seeing unexpected behavior.
There appear to be two model versions, v1 and v2, and they fail in different ways.
For v1 models (e.g., base, small, medium), the API consistently returns empty text, even for audio files that whisper-tiny transcribes correctly.
For v2 models (e.g., tiny, large), a transcription is returned, but it covers only the first ~30 seconds of the audio. The rest of the audio is not transcribed.
As a result, I currently cannot get a full transcription from any of the available models.

Could you please clarify:
Is this a known issue with these models in the Foundry Local SDK?
Is there a recommended configuration or workaround for obtaining full transcriptions?

Thank you!
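
One possible stopgap for the 30-second cutoff, sketched under assumptions: Whisper models process audio in fixed 30-second windows, so if the SDK is only running a single window (not confirmed here), splitting the file into 30-second pieces and transcribing each piece separately may recover the full text. The sketch below uses only Python's standard `wave` module and works on uncompressed WAV input. `transcribe_chunk` is a hypothetical callable standing in for whatever Foundry Local SDK call currently transcribes one file; it is not a real SDK function.

```python
import wave
from typing import Callable, List

CHUNK_SECONDS = 30  # Whisper's input window length


def split_wav(path: str, out_prefix: str, chunk_seconds: int = CHUNK_SECONDS) -> List[str]:
    """Split a WAV file into consecutive chunks of at most chunk_seconds each."""
    chunk_paths: List[str] = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of audio
            chunk_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(chunk_path, "wb") as dst:
                # Copy channels, sample width, and rate; the frame count
                # in the header is corrected automatically on close.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths


def transcribe_long(path: str, out_prefix: str,
                    transcribe_chunk: Callable[[str], str]) -> str:
    """Transcribe each chunk with the supplied (hypothetical) callable and join the text."""
    texts = [transcribe_chunk(p).strip() for p in split_wav(path, out_prefix)]
    return " ".join(t for t in texts if t)
```

Cutting at fixed boundaries can split a word in half, so the joined text may contain small errors at the 30-second marks. Overlapping the chunks by a second or two would reduce that, at the cost of having to deduplicate the overlapping text.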
