I'm puzzling over an issue where the accuracy of timestamps seems to vary depending on the encoding of the input file.
We have MP3 files encoded at 48 kbps. When I transcribe these with WhisperX 3.3.0 (aligning them with the whisperx.align method through the Python API), the segments are reasonably well aligned at the beginning but drift, so that they are a second or so out by the end of the 25-minute audio.
However, if I take that same MP3 file and re-encode it to AAC at 63 kbps, the segments are aligned perfectly all the way through the file.
Can anybody provide any insight into why this might be happening? My understanding is that WhisperX uses FFmpeg to resample the audio files, so surely the original encoding shouldn't make any difference?
I have attached a pair of sample files (the M4A was re-encoded from the MP3 file) in case anybody is able to look at this.
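For anyone wanting to check whether the two files even decode to the same length, here is a rough sketch of how that could be compared. It assumes ffmpeg is on the PATH, and the flags (16 kHz, mono, 16-bit PCM) are my approximation of what the loader does rather than something copied from the WhisperX source. The helper names are made up for illustration.

```python
import subprocess

SAMPLE_RATE = 16000  # assumed target rate for the decoded mono PCM


def ffmpeg_decode_command(path: str, sample_rate: int = SAMPLE_RATE) -> list[str]:
    """Build an FFmpeg command that decodes `path` to raw 16-bit mono PCM on stdout."""
    return [
        "ffmpeg", "-nostdin", "-i", path,
        "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le",
        "-ar", str(sample_rate), "-",
    ]


def pcm_duration_seconds(num_bytes: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of 16-bit mono PCM, which uses 2 bytes per sample."""
    return num_bytes / 2 / sample_rate


def decoded_duration(path: str) -> float:
    """Decode a file through FFmpeg and return the length of the resulting PCM in seconds."""
    result = subprocess.run(
        ffmpeg_decode_command(path), capture_output=True, check=True
    )
    return pcm_duration_seconds(len(result.stdout))
```

Calling decoded_duration on the MP3 and on the M4A and subtracting the two values would show whether the decoded audio itself differs in length by roughly the amount of drift I'm seeing, or whether the difference only appears later in alignment.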
DriftingTimestampsIssue_Trimmed.zip