Fix: bypass torchcodec crash in _save_mp3 on PyTorch 2.10+ by psale · Pull Request #1145 · ace-step/ACE-Step-1.5

psale · 2026-04-24T16:48:00Z

Summary

AudioSaver._save_mp3() crashes in environments where torchcodec's shared libraries
(libtorchcodec_core*.so) cannot load, notably Google Colab with PyTorch 2.10.0+cu128.

The fix replaces torchaudio.save() with direct soundfile.write() for the intermediate
WAV file in _save_mp3. This is a minimal, surgical change: the ffmpeg-based MP3 encoding
pipeline is untouched.

Root Cause

In torchaudio >= 2.10, torchaudio.save() unconditionally routes through
save_with_torchcodec() for all formats, even when backend='soundfile' is specified
explicitly. If torchcodec cannot load its native FFmpeg shared libraries (common on Colab
due to mismatched libavutil.so versions or an unresolved torch_dtype_float4_e2m1fn_x2 symbol),
the call fails with a hard RuntimeError.

The error chain is:
_save_mp3
  → torchaudio.save(..., backend='soundfile')
    → save_with_torchcodec()   ← dispatched unconditionally
      → load_torchcodec_shared_libraries()
        → RuntimeError: Could not load libtorchcodec

Since save_audio() has a soundfile fallback for non-MP3 formats (lines 317–338), those
formats survive the torchcodec failure. MP3 has no fallback: the exception is re-raised
immediately, so the entire generation crashes after the audio has already been computed.
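
The asymmetry between MP3 and the other formats can be sketched roughly as follows. This is a simplified illustration of the control flow described in this section, not the actual save_audio() code: the name save_with_fallback and the injected primary_write / fallback_write callables are hypothetical stand-ins for the torchaudio.save() and soundfile.write() calls.

```python
from typing import Callable


def save_with_fallback(
    fmt: str,
    primary_write: Callable[[], None],
    fallback_write: Callable[[], None],
) -> str:
    """Return which writer produced the file; re-raise for MP3 (hypothetical helper)."""
    try:
        primary_write()
        return "primary"
    except Exception:
        if fmt == "mp3":
            # MP3 has no fallback: the error propagates and generation aborts.
            raise
        # Other formats recover by writing through the fallback path.
        fallback_write()
        return "fallback"


def broken_primary() -> None:
    # Mimics the failure reported in this PR.
    raise RuntimeError("Could not load libtorchcodec")
```

With broken_primary as the primary writer, a "wav" request falls through to the fallback, while an "mp3" request surfaces the RuntimeError to the caller.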

The Fix

Replace the torchaudio.save() call in _save_mp3 (used only to write a temporary WAV)
with a direct soundfile.write() call. The soundfile library is already a declared
dependency (soundfile>=0.13.1 in pyproject.toml) and writes WAV files without touching
torchcodec.
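
A minimal sketch of the replacement write, assuming the audio arrives in channel-first layout ([channels, samples]) and that soundfile expects sample-first data ([samples, channels]). The helper names to_sample_first and write_intermediate_wav are illustrative and do not appear in the PR; only soundfile.write(file, data, samplerate) is a real library call.

```python
import numpy as np


def to_sample_first(audio_channels_first) -> np.ndarray:
    """Convert [channels, samples] audio to [samples, channels] (hypothetical helper)."""
    audio = np.asarray(audio_channels_first, dtype=np.float32)
    if audio.ndim == 1:
        # Mono audio is already a flat run of samples.
        return audio
    # Transpose and copy so the writer receives a contiguous buffer.
    return np.ascontiguousarray(audio.T)


def write_intermediate_wav(path: str, audio_channels_first, sample_rate: int) -> None:
    """Write the temporary WAV without touching torchaudio or torchcodec."""
    import soundfile as sf  # imported lazily; declared project dependency per the PR

    sf.write(path, to_sample_first(audio_channels_first), sample_rate)
```

For a torch tensor, the array would first be obtained with something like audio_tensor.detach().cpu().numpy() before being passed in. The ffmpeg step that turns the WAV into an MP3 is unaffected by this sketch.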

Risk Assessment

Area	Risk	Notes
MP3 export quality	None	WAV→MP3 conversion via ffmpeg is unchanged
Other formats	None	Untouched by this change
Non-Colab environments	None	soundfile produces identical WAV output
Platforms (CUDA/MPS/XPU/CPU)	None	soundfile is platform-agnostic

Tested On

Google Colab (Ubuntu 22.04, A100, PyTorch 2.10.0+cu128, torchaudio 2.10.0+cu128)
Generation completes and MP3 files are saved successfully after patch

Summary by CodeRabbit

Release Notes

Bug Fixes
- Enhanced MP3 export stability and reliability through an improved audio processing pipeline.
Refactor
- Modernized the internal audio file serialization mechanism in the MP3 export workflow to use a more robust audio library approach.

Replaced torchaudio.save with soundfile.write for intermediate WAV files to avoid DLL/symbol errors in Colab/Linux environments. Also updated unit tests.

coderabbitai · 2026-04-24T16:48:14Z

Walkthrough

The PR replaces torchaudio.save with soundfile.write for intermediate WAV file generation in the MP3 export path, converting torch tensors to numpy arrays and transposing from channel-first to sample-first layout before writing.

Changes

Cohort / File(s)	Summary
Audio utils implementation `acestep/audio_utils.py`	Replaced torchaudio-based WAV serialization with soundfile (sf.write) for MP3 export, including tensor-to-numpy conversion and channel dimension transposition.
Audio utils tests `acestep/audio_utils_test.py`	Updated MP3 export tests to validate soundfile.write invocation and added regression tests ensuring torchaudio.save is not called during MP3 export.
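
The kind of regression check described in the tests row might look like the sketch below. The real tests in acestep/audio_utils_test.py are not shown on this page, so everything here is an assumption: FakeAudioUtils is a hypothetical stand-in for the module under test, and both libraries are replaced by unittest.mock objects so that no audio backend is needed.

```python
from unittest import mock


class FakeAudioUtils:
    """Hypothetical stand-in for the module that owns the MP3 export path."""

    def __init__(self, sf, torchaudio):
        self.sf = sf
        self.torchaudio = torchaudio

    def write_wav_step(self, path, samples, sample_rate):
        # Post-fix behaviour: the intermediate WAV goes through soundfile only.
        self.sf.write(path, samples, sample_rate)


def check_wav_step_uses_soundfile_only() -> bool:
    fake_sf = mock.MagicMock(name="soundfile")
    fake_torchaudio = mock.MagicMock(name="torchaudio")
    utils = FakeAudioUtils(fake_sf, fake_torchaudio)

    utils.write_wav_step("tmp.wav", [[0.0, 0.0]], 48000)

    # soundfile.write must be called exactly once with the expected arguments...
    fake_sf.write.assert_called_once_with("tmp.wav", [[0.0, 0.0]], 48000)
    # ...and torchaudio.save must never be reached on this path.
    fake_torchaudio.save.assert_not_called()
    return True
```

If write_wav_step regressed to calling self.torchaudio.save, both assertions would fail, which is the property the regression tests are meant to protect.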

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

fix: torchaudio 2.10 breaks all audio I/O (torchcodec ABI incompatibility) #1073 — Directly modifies the same MP3 export codepath by replacing torchaudio.save with soundfile.write serialization.
RuntimeError: Could not load libtorchcodec. #685 — Addresses torchcodec/libtorchcodec load failures by avoiding torchaudio.save in MP3 export, which this PR implements via soundfile.

Possibly related PRs

Selectable MP3 export format #852 — Modifies the MP3 export path in acestep/audio_utils.py with changes to WAV serialization approach.

Suggested reviewers

ChuxiJ

Poem

🐰 A hop, a skip, through audio streams,
From torch to sound, with numpy dreams,
WAV files dance in new array form,
MP3s emerge, reborn! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly references the specific fix—bypassing torchcodec crash in _save_mp3 on PyTorch 2.10+—which matches the main technical change: replacing torchaudio.save with soundfile.write to avoid torchcodec library loading failures.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

🧹 Nitpick comments (2)

acestep/audio_utils.py (2)
302-308: Consider extending the torchcodec bypass to the wav/flac path as a follow-up.

The same torchcodec dispatch that motivated this PR affects torchaudio.save(..., backend='soundfile') on PyTorch 2.10+ — so format == "flac"/"wav" here will still crash in the Colab environment described in the PR, even though MP3 is now fixed. The exception handler at lines 321-342 will likely catch it and fall back to sf.write, but that makes the slow/noisy path the common case on 2.10+.

Out of scope for this PR (which explicitly limits the change to MP3), but worth a follow-up to route WAV/FLAC writes directly through sf.write for the same reason.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/audio_utils.py` around lines 302 - 308, The torchaudio.save call for
WAV/FLAC still goes through the torchcodec dispatch and can fail on PyTorch
2.10+, so update the save routine to bypass torchaudio when format is "wav" or
"flac" by dispatching directly to soundfile's sf.write (the same path used in
the exception fallback) instead of calling torchaudio.save; specifically modify
the branch around torchaudio.save (referencing torchaudio.save, the format
variable, and sf.write) to short-circuit for format == "wav" || format == "flac"
and write via sf.write with the same audio_tensor/sample_rate handling to avoid
triggering the exception handler and unnecessary slow fallbacks.
288-288: Nit: redundant local import soundfile as sf.

soundfile is now imported at module scope (line 22), so the local imports inside save_audio (line 288 for the wav32 path, line 326 in the exception fallback) are redundant. Safe to remove for consistency.
♻️ Proposed cleanup
@@ -285,8 +285,6 @@
                 if format == "wav32":
                     try:
-                        import soundfile as sf
-                        
                         # Use soundfile directly for 32-bit float
                         audio_np = audio_tensor.transpose(0, 1).numpy() # [channels, samples] -> [samples, channels]
@@ -323,7 +321,6 @@
                 logger.error(f"[AudioSaver] MP3 export failed without fallback: {e}")
                 raise
             try:
-                import soundfile as sf
                 audio_np = audio_tensor.transpose(0, 1).numpy()  # -> [samples, channels]
Also applies to: 326-326
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/audio_utils.py` at line 288, The local redundant "import soundfile as
sf" statements inside save_audio (present near the wav32 path branch and the
exception fallback) should be removed because soundfile is already imported at
module scope; update the save_audio function by deleting those local imports and
relying on the module-level sf, ensuring any references in the wav32 branch and
the exception handling branch continue to call sf.* without re-importing.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aaec91ae-0945-4722-841c-5a75d84234ac

📥 Commits

Reviewing files that changed from the base of the PR and between d5d958e and 00ad6f7.

📒 Files selected for processing (2)

acestep/audio_utils.py
acestep/audio_utils_test.py

dvc50 · 2026-05-02T04:59:46Z

This fix needs merging in. I used it successfully on my Linux Mint 22 install. It fixed the torchcodec error at the mp3 creation stage.

Fix: bypass torchcodec crash in _save_mp3 on PyTorch 2.10+

00ad6f7


Ferase mentioned this pull request Apr 30, 2026

(Hacky Fixes Included) The UI resolves incorrect directory for ACE-Step-1.5 on Linux & Torchcodec icompatibility with PyTorch 2.10 fspecii/ace-step-ui#76

Open


Fix: bypass torchcodec crash in _save_mp3 on PyTorch 2.10+ #1145
psale wants to merge 1 commit into ace-step:main from psale:main


3 participants
