Use mlx models with llm embedding functionality #26

@spod

Hey!

I wanted to use MLX embedding models such as mlx-community/Qwen3-Embedding-4B-4bit-DWQ locally from the llm CLI, but it did not work.

I could use the models from llm mlx, but they were not listed by llm embed-models.

A little research showed this is because llm-mlx provides neither an implementation of llm.EmbeddingModel nor a register_embedding_models hook.

As a short-term fix, I built https://github.com/spod/llm-mlx-embed with some help from Claude. It works for me; see below.

If this is something you would welcome as a PR for llm-mlx, let me know and I'll put something together when I have time.

Michael

Example usage of spod/llm-mlx-embed:

$ llm mlx download-model mlx-community/Qwen3-Embedding-4B-4bit-DWQ
...
$ llm mlx download-model mlx-community/qwen3-embedding-0.6b-8bit
...
$ llm embed-models | grep Mlx
MlxEmbeddingModel: mlx-community/qwen3-embedding-0.6b-8bit (aliases: qwen3-embedding-0.6b-8bit)
MlxEmbeddingModel: mlx-community/Qwen3-Embedding-4B-4bit-DWQ (aliases: Qwen3-Embedding-4B-4bit-DWQ)
$ llm embed -m Qwen3-Embedding-4B-4bit-DWQ -c 'hello' -f hex
0000d3b90000d2bb0000023d0000cd3c000005bb00.....
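
To sanity-check hex output like the line above, the values can be decoded back into floats. The sketch below assumes that llm's hex format is a packed sequence of little-endian 32-bit floats, and it decodes only the five complete values visible in the truncated output; the function name decode_hex_embedding is hypothetical.

```python
import struct


def decode_hex_embedding(hex_text: str) -> list[float]:
    # Assumption: each value is a little-endian float32, i.e. 8 hex characters.
    raw = bytes.fromhex(hex_text)
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Only the complete 8-character groups visible in the truncated output above.
visible_prefix = "0000d3b90000d2bb0000023d0000cd3c000005bb"
values = decode_hex_embedding(visible_prefix)
print(values)
```

Small magnitudes with mixed signs are what a plausible embedding vector looks like, which suggests the model is producing real output and not an empty or constant vector.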
