v0.12.0

angeloskath released this 25 Apr 21:31

82463e9

Highlights

Faster quantized matmul
- Up to 40% faster QLoRA or prompt processing (benchmark numbers are linked from the original release page)

Core

- mx.synchronize to wait for computation dispatched with mx.async_eval
- mx.radians and mx.degrees
- mx.metal.clear_cache to return to the OS the memory that MLX holds as a cache for future allocations
- Change quantization to always represent 0 exactly (relevant issue)

Bugfixes

- Fixed quantization of a block with all 0s that produced NaNs
- Fixed the len field in the buffer protocol implementation
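
To make the two quantization notes concrete, here is a minimal pure-Python sketch of block-wise affine quantization that reproduces 0 exactly and guards the all-zeros case. It is not MLX's actual kernel: the functions `quantize_block` and `dequantize_block`, the zero-point scheme, and the 4-bit default are all assumptions chosen for illustration.

```python
def quantize_block(block, bits=4):
    """Affine-quantize one block of floats (hypothetical helper, not MLX code)."""
    levels = (1 << bits) - 1
    # Extend the range to include 0 so that zero always lies inside it.
    lo = min(min(block), 0.0)
    hi = max(max(block), 0.0)
    if hi == lo:
        # All-zero block: a scale of 0 would lead to 0/0 (NaN),
        # so fall back to an arbitrary nonzero scale.
        return 1.0, 0, [0] * len(block)
    scale = (hi - lo) / levels
    # Integer code assigned to zero; rounding snaps 0.0 onto the grid.
    zero_point = round(-lo / scale)
    codes = [min(levels, max(0, round(x / scale) + zero_point)) for x in block]
    return scale, zero_point, codes


def dequantize_block(scale, zero_point, codes):
    """Map integer codes back to floats."""
    return [scale * (q - zero_point) for q in codes]


mixed = dequantize_block(*quantize_block([-0.7, 0.0, 0.31, 1.2]))
zeros = dequantize_block(*quantize_block([0.0, 0.0, 0.0, 0.0]))
print(mixed, zeros)
```

Because zero is stored as the integer `zero_point`, dequantizing it multiplies the scale by 0, so the result is exactly 0.0 regardless of the block's range.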
