I am using a RISC-V development board to run Exo. The board information is available at https://www.bit-brick.com/k1/.
The running environment is as follows:
Python: Python 3.12.3 (main, Apr 10 2024, 05:33:47) [GCC 13.2.0] on linux
OS: Bianbu OS based on Ubuntu 24.04
After a successful installation, I ran Exo and then downloaded a model. The following is the console log:
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Selected inference engine: None
_____ _____
/ _ \ \/ / _ \
| __/> < (_) |
\___/_/\_\___/
Detected system: Linux
Inference engine name after selection: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: HFShardDownloader
[]
Chat interface started:
- http://172.17.0.1:52415
- http://192.168.3.134:52415
- http://127.0.0.1:52415
ChatGPT API endpoint served at:
- http://172.17.0.1:52415/v1/chat/completions
- http://192.168.3.134:52415/v1/chat/completions
- http://127.0.0.1:52415/v1/chat/completions
has_read=True, has_write=True
Removing download task for Shard(model_id='llama-3.2-1b', start_layer=0, end_layer=0, n_layers=16): True
╭──────────────────────────────────────────────── Exo Cluster (1 node) ────────────────────────────────────────────────╮
│ │
Task exception was never retrieved
future: <Task finished name='Task-108' coro=<TinygradDynamicShardInferenceEngine.ensure_shard() done, defined at
/home/bitbrick/exo/exo/inference/tinygrad/inference.py:142> exception=CalledProcessError(1, ['clang', '-shared',
'-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-ffreestanding', '-nostdlib', '-', '-o',
'/tmp/tmp53tml3dh'])>
Traceback (most recent call last):
File "/home/bitbrick/exo/exo/inference/tinygrad/inference.py", line 151, in ensure_shard
model_shard = await loop.run_in_executor(self.executor, build_transformer, model_path, shard, parameters)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/exo/inference/tinygrad/inference.py", line 45, in build_transformer
model = Transformer(**MODEL_PARAMS[model_size]["args"], linear=linear, max_context=8192, jit=True, shard=shard)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/exo/inference/tinygrad/models/llama.py", line 198, in __init__
self.freqs_cis = precompute_freqs_cis(dim // n_heads, self.max_context*2, rope_theta,
rope_scaling=rope_scaling).contiguous()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/exo/inference/tinygrad/models/llama.py", line 17, in precompute_freqs_cis
freqs[:dim // 4] *= low_freq_factor
~~~~~^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/tensor.py", line 3754, in _wrapper
ret = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/tensor.py", line 1222, in __setitem__
res = self.realize()._getitem(indices, v)
^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/tensor.py", line 3729, in _wrapper
if _METADATA.get() is not None: return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/tensor.py", line 219, in realize
run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 215, in run_schedule
for ei in lower_schedule(schedule):
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 208, in lower_schedule
raise e
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 202, in lower_schedule
try: yield lower_schedule_item(si)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 187, in
lower_schedule_item
runner = get_runner(si.outputs[0].device, si.ast)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 155, in get_runner
method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, device=device))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 81, in __init__
self.lib:bytes = precompiled if precompiled is not None else Device[p.device].compiler.compile_cached(p.src)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/device.py", line 194, in compile_cached
lib = self.compile(src)
^^^^^^^^^^^^^^^^^
File "/home/bitbrick/exo/myenv/lib/python3.12/site-packages/tinygrad/runtime/ops_clang.py", line 16, in compile
subprocess.check_output(['clang', '-shared', *self.args, '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC',
'-ffreestanding', '-nostdlib',
File "/usr/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['clang', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c',
'-fPIC', '-ffreestanding', '-nostdlib', '-', '-o', '/tmp/tmp53tml3dh']' returned non-zero exit status 1.
╭──────────────────────────────────────────────── Exo Cluster (1 node) ────────────────────────────────────────────────╮
│ │
│ _____ _____ │
│ / _ \ \/ / _ \ │
│ | __/> < (_) | │
│ \___/_/\_\___/ │
│ │
│ │
│ Web Chat URL (tinychat): http://172.17.0.1:52415 │
│ ChatGPT API endpoint: http://172.17.0.1:52415/v1/chat/completions │
│ GPU poor ▼ GPU rich │
│ [🟥🟥🟥🟥🟥🟥🟥🟥🟧🟧🟧🟧🟧🟧🟧🟨🟨🟨🟨🟨🟨🟨🟨🟩🟩🟩🟩🟩🟩🟩] │
│ 0.00 TFLOPS │
│ ▲ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Following the suggestion in an earlier issue, I ran `pip install llvmlite` and started Exo again. The error no longer appears at startup, but the same error is now reported in the web interface.
Could you please tell me how to solve this problem? If you need the same environment to reproduce the issue, please contact me; I can provide a test development board. Thank you.
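For what it's worth, the traceback shows tinygrad invoking `clang -shared -march=native …`, and some clang builds do not accept `-march=native` when targeting riscv64, which would make that subprocess call fail exactly as logged. A minimal sketch of a possible workaround (this is an assumption about the root cause, not a confirmed fix; `sanitize_clang_args` is a hypothetical helper, not part of tinygrad's API) would be to drop that flag before the compiler is invoked:

```python
import platform

def sanitize_clang_args(args, machine=None):
    """Drop '-march=native' on RISC-V, where some clang versions reject it.

    `machine` defaults to the current host architecture; it is a parameter
    here only so the filtering logic can be exercised on any machine.
    """
    if machine is None:
        machine = platform.machine()  # e.g. 'riscv64' on the K1 board
    if machine.startswith("riscv"):
        return [a for a in args if a != "-march=native"]
    return list(args)

# On the board, the flag would be stripped; elsewhere it is left alone.
print(sanitize_clang_args(["-march=native", "-O2"], machine="riscv64"))
print(sanitize_clang_args(["-march=native", "-O2"], machine="x86_64"))
```

If this is indeed the cause, applying the same filtering to the argument list in tinygrad's `ops_clang.py` (the file raising the `CalledProcessError` above), or upgrading to a clang version that supports `-march=native` on RISC-V, might avoid the compile failure.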