Fix: bypass torchcodec crash in _save_mp3 on PyTorch 2.10+ #1145
psale wants to merge 1 commit into ace-step:main
Replaced torchaudio.save with soundfile.write for intermediate WAV files to avoid DLL/symbol errors in Colab/Linux environments. Also updated unit tests.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🧹 Nitpick comments (2)
acestep/audio_utils.py (2)
302-308: Consider extending the torchcodec bypass to the wav/flac path as a follow-up.
The same torchcodec dispatch that motivated this PR affects `torchaudio.save(..., backend='soundfile')` on PyTorch 2.10+, so `format == "flac"`/`"wav"` here will still crash in the Colab environment described in the PR, even though MP3 is now fixed. The exception handler at lines 321-342 will likely catch it and fall back to `sf.write`, but that makes the slow/noisy path the common case on 2.10+.
Out of scope for this PR (which explicitly limits the change to MP3), but worth a follow-up to route WAV/FLAC writes directly through `sf.write` for the same reason.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/audio_utils.py` around lines 302 - 308, The torchaudio.save call for WAV/FLAC still goes through the torchcodec dispatch and can fail on PyTorch 2.10+, so update the save routine to bypass torchaudio when format is "wav" or "flac" by dispatching directly to soundfile's sf.write (the same path used in the exception fallback) instead of calling torchaudio.save; specifically modify the branch around torchaudio.save (referencing torchaudio.save, the format variable, and sf.write) to short-circuit for format == "wav" || format == "flac" and write via sf.write with the same audio_tensor/sample_rate handling to avoid triggering the exception handler and unnecessary slow fallbacks.
288-288: Nit: redundant local `import soundfile as sf`.
`soundfile` is now imported at module scope (line 22), so the local imports inside `save_audio` (line 288 for the wav32 path, line 326 in the exception fallback) are redundant. Safe to remove for consistency.
♻️ Proposed cleanup
@@ -285,8 +285,6 @@
 if format == "wav32":
     try:
-        import soundfile as sf
-
         # Use soundfile directly for 32-bit float
         audio_np = audio_tensor.transpose(0, 1).numpy()  # [channels, samples] -> [samples, channels]
@@ -323,7 +321,6 @@
         logger.error(f"[AudioSaver] MP3 export failed without fallback: {e}")
         raise
     try:
-        import soundfile as sf
         audio_np = audio_tensor.transpose(0, 1).numpy()  # -> [samples, channels]
Also applies to: 326-326
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/audio_utils.py` at line 288, The local redundant "import soundfile as sf" statements inside save_audio (present near the wav32 path branch and the exception fallback) should be removed because soundfile is already imported at module scope; update the save_audio function by deleting those local imports and relying on the module-level sf, ensuring any references in the wav32 branch and the exception handling branch continue to call sf.* without re-importing.
📒 Files selected for processing (2)
acestep/audio_utils.py
acestep/audio_utils_test.py
This fix needs merging in. I used it successfully on my Linux Mint 22 install. It fixed the torchcodec error at the mp3 creation stage.
Summary
`AudioSaver._save_mp3()` crashes on environments where torchcodec's shared libraries (`libtorchcodec_core*.so`) cannot load, notably Google Colab with PyTorch 2.10.0+cu128.
The fix replaces `torchaudio.save()` with direct `soundfile.write()` for the intermediate WAV file in `_save_mp3`. This is a minimal, surgical change: the ffmpeg-based MP3 encoding pipeline is untouched.
Root Cause
In torchaudio >= 2.10, `torchaudio.save()` unconditionally routes through `save_with_torchcodec()` for all formats, even when `backend='soundfile'` is specified explicitly. If torchcodec cannot load its native FFmpeg shared libraries (common on Colab due to mismatched `libavutil.so` versions or the `torch_dtype_float4_e2m1fn_x2` symbol), the call fails with a hard `RuntimeError`.
The error chain is:
_save_mp3
  → torchaudio.save(..., backend='soundfile')
    → save_with_torchcodec()   ← dispatched unconditionally
      → load_torchcodec_shared_libraries()
        → RuntimeError: Could not load libtorchcodec
Since `save_audio()` has a soundfile fallback for non-MP3 formats (lines 317–338), those survive. But MP3 has no fallback: it re-raises immediately, so the entire generation crashes after the audio is already computed.
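The difference between the two paths described above can be illustrated with a small, dependency-free sketch. Everything here is hypothetical: `save_with_fallback`, `broken_primary`, and `soundfile_like_fallback` are stand-ins invented for illustration, and the real `save_audio()` may catch a different exception type. The point is only that a writer with a fallback survives the torchcodec failure, while one without a fallback re-raises.

```python
from typing import Callable, Optional

Writer = Callable[[str], None]


def save_with_fallback(path: str, primary: Writer, fallback: Optional[Writer] = None) -> str:
    """Try the primary writer; on RuntimeError use the fallback, or re-raise if there is none."""
    try:
        primary(path)
        return "primary"
    except RuntimeError:
        if fallback is None:
            # Mirrors the MP3 path described above: no fallback, so the error propagates.
            raise
        fallback(path)
        return "fallback"


def broken_primary(path: str) -> None:
    # Stand-in for a torchaudio.save() call whose torchcodec libraries fail to load.
    raise RuntimeError("Could not load libtorchcodec")


def soundfile_like_fallback(path: str) -> None:
    # Stand-in for a direct soundfile write; it does nothing in this sketch.
    pass


if __name__ == "__main__":
    # Non-MP3 formats: the fallback absorbs the failure.
    print(save_with_fallback("out.wav", broken_primary, soundfile_like_fallback))
    # MP3 before the fix: no fallback, so the RuntimeError escapes.
    try:
        save_with_fallback("out.mp3", broken_primary)
    except RuntimeError as exc:
        print(f"crashed: {exc}")
```

Writing the intermediate file without going through the failing call at all, as the PR does, avoids relying on the exception path in the first place.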
The Fix
Replace the `torchaudio.save()` call in `_save_mp3` (used only to write a temporary WAV) with a direct `soundfile.write()` call. The `soundfile` library is already a declared dependency (`soundfile>=0.13.1` in `pyproject.toml`) and writes WAV files without touching torchcodec.
Risk Assessment
Tested On