whisperx fails transcription with “Could not load library open shared object file: No such file or directory” #1027

sijitang opened this issue Jan 30, 2025 · 2 comments


I am using whisperx on Google Colab.
I installed the newest dev version of whisperx using:

!pip install git+

whisperx /content/htdemucs_ft/7xCxpfERepU/enhanced_vocals.wav --verbose True --print_progress True --model large-v3 --device cuda --batch_size 16 --chunk_size 30 --vad_onset 0.500 --vad_offset 0.363 --vad_method pyannote --language de --compute_type float16 --output_dir . --align_model WAV2VEC2_ASR_LARGE_LV60K_960H

/usr/local/lib/python3.11/dist-packages/torchvision/io/ UserWarning: Failed to load image Python extension: '/usr/local/lib/python3.11/dist-packages/torchvision/ undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
2025-01-29 23:13:45.762227: I tensorflow/core/util/] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-29 23:13:45.781163: E external/local_xla/xla/stream_executor/cuda/] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1738192425.804872 6091] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738192425.811846 6091] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-29 23:13:45.834760: I tensorflow/core/platform/] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/usr/local/lib/python3.11/dist-packages/torchvision/datapoints/ UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
/usr/local/lib/python3.11/dist-packages/torchvision/transforms/v2/ UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
INFO:speechbrain.utils.quirks:Applied quirks (see speechbrain.utils.quirks): [disable_jit_profiling, allow_tf32]
INFO:speechbrain.utils.quirks:Excluded quirks specified by the SB_DISABLE_QUIRKS environment (comma-separated list): []
model.bin: 0% 0.00/3.09G [00:00<?, ?B/s]
vocabulary.json: 0% 0.00/1.07M [00:00<?, ?B/s]

config.json: 100% 2.39k/2.39k [00:00<00:00, 19.6MB/s]

preprocessor_config.json: 100% 340/340 [00:00<00:00, 3.09MB/s]

tokenizer.json: 0% 0.00/2.48M [00:00<?, ?B/s]
vocabulary.json: 100% 1.07M/1.07M [00:00<00:00, 7.80MB/s]
model.bin: 0% 10.5M/3.09G [00:00<01:16, 40.3MB/s]

tokenizer.json: 100% 2.48M/2.48M [00:00<00:00, 9.26MB/s]
model.bin: 100% 3.09G/3.09G [01:12<00:00, 42.4MB/s]

Performing voice activity detection using Pyannote...
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.5.0.post0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../usr/local/lib/python3.11/dist-packages/whisperx/assets/pytorch_model.bin
Model was trained with 0.0.1, yours is 3.3.2. Bad things might happen unless you revert to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.6.0+cu124. Bad things might happen unless you revert torch to 1.x.
Performing transcription...
/usr/local/lib/python3.11/dist-packages/pyannote/audio/utils/ ReproducibilityWarning: TensorFloat-32 (TF32) has been disabled as it might lead to reproducibility issues and lower accuracy.
It can be re-enabled by calling

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
See pyannote/pyannote-audio#1370 for more details.

Could not load library Error: cannot open shared object file: No such file or directory

It seems Colab upgraded from CUDA 12.1 to CUDA 12.4.
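One way to check the CUDA-version theory is to ask PyTorch which CUDA and cuDNN versions it reports, and to test whether the dynamic loader can open a given shared library. The sketch below is only a diagnostic aid, not a fix. The helper names `probe_libraries` and `torch_build_info` are my own, and the entries in `CANDIDATES` are placeholders chosen for illustration: the log above does not name the missing library, so substitute the exact name your own error message prints.

```python
import ctypes

# Placeholder names for illustration only (an assumption, not taken from the log).
# Replace them with the library name shown in your own error message.
CANDIDATES = ["libcudnn.so.9", "libcudnn.so.8"]


def probe_libraries(names):
    """Return {name: bool} telling whether the dynamic loader can open each library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the file cannot be found or loaded
            results[name] = True
        except OSError:
            results[name] = False
    return results


def torch_build_info():
    """Return the versions PyTorch reports, or None if torch cannot be imported."""
    try:
        import torch
    except Exception:
        # torch is missing or itself failed to load; nothing to report.
        return None
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,         # CUDA version torch was built against
        "cudnn": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_build_info())
    for name, ok in probe_libraries(CANDIDATES).items():
        print(f"{name}: {'loadable' if ok else 'NOT loadable'}")
```

If a library is reported as not loadable while PyTorch itself sees the GPU, that would suggest the failing component expects a different CUDA or cuDNN build than the one the runtime now ships.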

I'm facing the same issue. Is there a quick fix for this? Our entire pipeline is stuck because of it.

