Hey!
I wanted to use MLX embedding models like mlx-community/Qwen3-Embedding-4B-4bit-DWQ locally from the llm CLI, but it did not work.
The models showed up under llm mlx and I could use them there, but they were not listed by llm embed-models.
A little research showed that this is because llm-mlx provides neither an implementation of llm.EmbeddingModel nor a register_embedding_models hook.
As a short-term fix I built https://github.com/spod/llm-mlx-embed, with some help from Claude, to solve this. It works for me; see the example usage below.
If this is something you would welcome as a PR for llm-mlx, let me know and I'll put something together when I have time.
Michael
Example usage of spod/llm-mlx-embed:
$ llm mlx download-model mlx-community/Qwen3-Embedding-4B-4bit-DWQ
...
$ llm mlx download-model mlx-community/qwen3-embedding-0.6b-8bit
...
$ llm embed-models | grep Mlx
MlxEmbeddingModel: mlx-community/qwen3-embedding-0.6b-8bit (aliases: qwen3-embedding-0.6b-8bit)
MlxEmbeddingModel: mlx-community/Qwen3-Embedding-4B-4bit-DWQ (aliases: Qwen3-Embedding-4B-4bit-DWQ)
$ llm embed -m Qwen3-Embedding-4B-4bit-DWQ -c 'hello' -f hex
0000d3b90000d2bb0000023d0000cd3c000005bb00.....
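To sanity-check the hex output above, it can be decoded back into floats. This is a minimal sketch that assumes -f hex emits the embedding as packed little-endian 32-bit floats (my understanding of how llm encodes embeddings, not something I verified against the plugin). decode_hex_embedding is a hypothetical helper name, and the input is only the first five complete values from the truncated output.

```python
import struct


def decode_hex_embedding(hex_text: str) -> list[float]:
    """Decode a hex string into floats, assuming packed little-endian float32."""
    raw = bytes.fromhex(hex_text)
    if len(raw) % 4 != 0:
        raise ValueError("expected a whole number of 4-byte float32 values")
    count = len(raw) // 4
    # "<" = little-endian, "f" = 32-bit float, repeated once per value
    return list(struct.unpack(f"<{count}f", raw))


# First 40 hex characters of the output above: five complete float32 values.
prefix = "0000d3b90000d2bb0000023d0000cd3c000005bb"
values = decode_hex_embedding(prefix)
print(len(values), values)
```

Under that assumption the decoded values come out as small numbers near zero, which is at least plausible for an embedding vector.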