How can I compile an LLM model? #1959
Replies: 1 comment
-
I have another question regarding the performance of `compile`. I ran the tutorial code below on my M4 Max:

```python
import mlx.core as mx
import mlx.nn as nn
import time


def timeit(fun, x, name=""):
    # Warm up: trigger compilation and kernel caching before timing
    for _ in range(10):
        mx.eval(fun(x))
    tic = time.perf_counter()
    for _ in range(100):
        mx.eval(fun(x))
    toc = time.perf_counter()
    tpi = 1e3 * (toc - tic) / 100
    print(f"{name} Time per iteration {tpi:.3f} (ms)")


x = mx.random.uniform(shape=(32, 1000, 4096))
timeit(nn.gelu, x, "No Compile")
timeit(mx.compile(nn.gelu), x, "Compiled")
```

According to the tutorial, a 5x speedup is achieved on an M1 Max. However, my results on the M4 Max show different performance characteristics:

```
No Compile Time per iteration 8.267 (ms)
Compiled Time per iteration 2.869 (ms)
```

I'm curious about what exactly influences the performance difference between the compiled and non-compiled versions. Could someone shed some light on this?
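As I understand it, the gap comes mostly from kernel fusion: GELU is built from several elementwise primitives, and executed eagerly each one launches its own kernel and makes a full read/write pass over the 32x1000x4096 tensor, so the op is memory-bandwidth bound; `mx.compile` fuses those primitives into a single kernel, so the speedup depends on how many passes get eliminated and on the device's memory bandwidth, which differs between an M1 Max and an M4 Max. As a rough illustration of the op count involved, here is the tanh approximation of GELU written out in NumPy (a sketch for intuition only — I believe MLX's `nn.gelu` uses the exact erf form, and NumPy does not actually fuse anything):

```python
import numpy as np


def gelu_tanh(x):
    # Tanh approximation of GELU. Eagerly, each arithmetic op below is a
    # separate elementwise pass over the array (cube, scale, add, tanh,
    # add, multiply, scale) -- roughly seven memory-bound kernels that a
    # compiler can fuse into one.
    c = np.sqrt(2.0 / np.pi)
    return 0.5 * x * (1.0 + np.tanh(c * (x + 0.044715 * x**3)))


y = gelu_tanh(np.linspace(-3.0, 3.0, 7))
```

Because each unfused pass is limited by memory bandwidth rather than compute, fusing them means the tensor is read and written once instead of many times, which is consistent with the roughly 3x gain you measured.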
-
Hi everyone,
I'm new to MLX and would like to know if MLX supports compiling a model and performing inference, similar to how it's done in PyTorch. If so, could someone please guide me on how to achieve this?
Thanks in advance for your help!