fix: stream save_model to prevent OOM on large MoE models #18

Open

0xClandestine wants to merge 1 commit into Blaizzy:pc/add-deepseekv4flash-model from 0xClandestine:fix/ds4-quantize-oom

Conversation

@0xClandestine

Summary

  • When converting DeepSeek V4 Flash (256 experts × 43 layers) with -q, the process gets OOM-killed during save_model: the lazy computation graph from dequant → stack → quantize creates enormous BF16 intermediates that all materialize at once.
  • Refactors save_model to build and save shards incrementally: pop weights from the dict as each shard is constructed, explicitly mx.eval before writing, then free (see the sketch after this list). This bounds peak memory to roughly one shard (~5 GB) plus one evaluation intermediate (~4 GB), rather than the entire model's lazy graph.
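A minimal sketch of the streaming approach, assuming a flat weights dict that maps parameter names to lazily computed mx.array values; the function name save_model_streaming, the 5 GB budget constant, and the shard filename pattern are illustrative placeholders, not the PR's actual code:

```python
import mlx.core as mx

MAX_SHARD_BYTES = 5 * 2**30  # illustrative ~5 GB per-shard budget


def save_model_streaming(weights: dict, out_dir: str) -> None:
    """Save weights (name -> lazy mx.array) one shard at a time."""
    shard, shard_bytes, shard_idx = {}, 0, 0

    def flush():
        nonlocal shard, shard_bytes, shard_idx
        if not shard:
            return
        # Evaluate only this shard's slice of the lazy graph.
        mx.eval(*shard.values())
        mx.save_safetensors(f"{out_dir}/model-{shard_idx:05d}.safetensors", shard)
        # Drop our references so the evaluated buffers can be freed.
        shard, shard_bytes = {}, 0
        shard_idx += 1

    for name in list(weights):
        # Pop so the weights dict no longer pins the array in memory.
        w = weights.pop(name)
        shard[name] = w
        shard_bytes += w.nbytes  # shape/dtype are known without evaluating
        if shard_bytes >= MAX_SHARD_BYTES:
            flush()
    flush()  # write the final, possibly partial, shard
```

Popping each weight out of the dict before evaluating is the key design point: if the dict kept its references, the buffers could not be freed after a shard is written, and peak memory would grow with the number of shards.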

Test plan

  • All 6 test_utils.py tests pass
  • Run mlx_lm convert --hf-path deepseek-ai/DeepSeek-V4-Flash -q on a machine with sufficient disk space

fix: stream save_model to prevent OOM on large MoE models

When converting DeepSeek V4 Flash (256 experts × 43 layers) with -q,
the process gets OOM-killed during save. The lazy computation graph
from dequant → stack → quantize creates enormous BF16 intermediates
that all materialize at once when saving.

Build and save shards incrementally: pop weights from the dict as each
shard is constructed, explicitly mx.eval before writing, then free.
This bounds peak memory to ~one shard + one evaluation intermediate
instead of the entire model's lazy graph.
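To make the failure mode concrete, here is a toy sketch of that lazy pattern; quantized_experts, the shapes, the expert count, and the file name are hypothetical stand-ins, with sizes far smaller than the real model's:

```python
import mlx.core as mx

# Hypothetical stand-in: per-expert 4-bit weights as (w_q, scales, biases).
quantized_experts = [
    mx.quantize(mx.random.normal((1024, 1024)), group_size=64, bits=4)
    for _ in range(256)
]

# Each step below only records nodes in the lazy graph; nothing runs yet.
dequantized = [
    mx.dequantize(w_q, scales, biases, group_size=64, bits=4)
    for (w_q, scales, biases) in quantized_experts
]
stacked = mx.stack(dequantized)
w_q, scales, biases = mx.quantize(stacked, group_size=64, bits=4)

# Saving is the first point that forces evaluation: the full-precision
# dequantized tensor for every expert materializes at the same time,
# which is what triggers the OOM at real model scale.
mx.save_safetensors("experts.safetensors",
                    {"w": w_q, "scales": scales, "biases": biases})
```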
@0xClandestine
Author

PR: ml-explore#1192
Issue: ml-explore#1192 (comment)
