downloading Qwen3-8B-4bit on Ubuntu leads to error: Cannot get device info without metal backend #17

@foobacca

The terminal output is:

❯ llm mlx download-model mlx-community/Qwen3-8B-4bit                                                                                                                  
Fetching 9 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 144631.17it/s]
Traceback (most recent call last):                                                                                                                                    
  File "/home/hamish/.local/bin/llm", line 10, in <module>                                                                                                            
    sys.exit(cli())                                                                                                                                                   
             ^^^^^                                                                                                                                                    
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 1442, in __call__                                                    
    return self.main(*args, **kwargs)                                                                                                                                 
           ^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                 
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 1363, in main                                                        
    rv = self.invoke(ctx)                                                                                                                                             
         ^^^^^^^^^^^^^^^^                                                                                                                                             
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 1830, in invoke                                                  
    return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                           
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                            
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 1830, in invoke                                
    return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                           
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                            
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 1226, in invoke                                      
    return ctx.invoke(self.callback, **ctx.params)                                                                                                                    
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                    
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/click/core.py", line 794, in invoke                        
    return callback(*args, **kwargs)                                                                                                                                  
           ^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                  
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/llm_mlx.py", line 56, in download_model                  
    MlxModel(model_path).prompt("hi").text()                                                                                                                          
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/llm/models.py", line 520, in text                                                   
    self._force()                                                                                                                                                     
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/llm/models.py", line 517, in _force                                                 
    list(self)                                                                                                                                                        
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/llm/models.py", line 554, in __iter__                                                     
    for chunk in self.model.execute(                                                                                                                                  
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/llm_mlx.py", line 260, in execute                                                         
    for chunk in stream_generate(                                                  
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/mlx_lm/generate.py", line 633, in stream_generate           
    with wired_limit(model, [generation_stream]):                                                                                                                     
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/contextlib.py", line 137, in __enter__                                                              
    return next(self.gen)                                                                                                                                             
           ^^^^^^^^^^^^^^
  File "/home/hamish/.local/share/uv/tools/llm/lib/python3.11/site-packages/mlx_lm/generate.py", line 222, in wired_limit                                             
    max_rec_size = mx.metal.device_info()["max_recommended_working_set_size"]
                   ^^^^^^^^^^^^^^^^^^^^^^                                                                                                                             
RuntimeError: [metal::device_info] Cannot get device info without metal backend

From what I have read, Metal is a macOS-only technology, so I am not sure why it is being used on Ubuntu.
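To illustrate the kind of check that would avoid this, here is a minimal sketch using only the Python standard library. It assumes that a Metal backend can only exist on macOS; the function name `metal_backend_plausible` is hypothetical and is not part of llm, llm-mlx, or mlx_lm.

```python
import platform


def metal_backend_plausible(system=None):
    """Return True only on macOS, where a Metal backend could exist.

    Hypothetical helper for illustration; not part of any library.
    """
    # platform.system() returns "Darwin" on macOS and "Linux" on Ubuntu.
    system = platform.system() if system is None else system
    return system == "Darwin"


if __name__ == "__main__":
    if metal_backend_plausible():
        print("macOS detected: a Metal backend may be available.")
    else:
        print(
            f"{platform.system()} detected: Metal-specific calls such as "
            "mx.metal.device_info() are likely to fail here."
        )
```

A guard like this before the Metal-specific call in the traceback would presumably turn the crash into a clearer message, though where such a check belongs is up to the maintainers.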

I am running

  • Ubuntu 24.04
  • llm version 0.25

Other info:

❯ llm plugins
[
  {
    "name": "llm-mlx",
    "hooks": [
      "register_commands",
      "register_models"
    ],
    "version": "0.4"
  },
  {
    "name": "llm-gpt4all",
    "hooks": [
      "register_models"
    ],
    "version": "0.4"
  }
]

I am running on Python 3.11 to get past the sentencepiece bug.

/home/hamish/.local/share/uv/tools/llm/bin/python
Python 3.11.12 (main, Apr  8 2025, 14:15:29) [GCC 11.4.0] on linux
