AMD 7600XT - llama off loads to GPU but low TpS #707
I've been trying to use an AMD 7600XT with a Pi 5. `DISPLAY=:0 glxinfo -B` gives:
I get:
but performance is really low and I can't really use the LLM:
If I also open nvtop, everything goes smoothly and I can have a normal conversation, but I still get:
which seems a bit odd to me, since the conversation is really fluid. Also, if I try a beefier model (mistral-7b-instruct-v0.2.Q8_0.gguf), I can't even load it onto the GPU, even though the model is only around 7 GB:
Not sure if this is just Vulkan's fault, though.
You may need to adjust the `GGML_VK_FORCE_MAX_ALLOCATION_SIZE` value due to some bugs in the driver on arm64. See geerlingguy/ollama-benchmark#1 and have a read through my notes towards the bottom of that original issue post.
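For reference, that variable is set in the environment (in bytes) before launching llama.cpp. A minimal sketch — the 2 GiB cap and the `llama-cli` invocation below are illustrative assumptions, not values taken from the linked issue; tune the cap for your card and driver:

```shell
# Illustrative only: cap the Vulkan backend's maximum single allocation at
# 2 GiB (value is in bytes) to work around arm64 driver allocation bugs.
export GGML_VK_FORCE_MAX_ALLOCATION_SIZE=$((2 * 1024 * 1024 * 1024))
echo "$GGML_VK_FORCE_MAX_ALLOCATION_SIZE"

# Then launch llama.cpp as usual in the same shell, e.g. (path and flags
# are assumptions for this sketch):
# ./llama-cli -m mistral-7b-instruct-v0.2.Q8_0.gguf -ngl 99
```

With a cap in place, large tensors get split across several smaller allocations instead of one oversized one, which is what trips the buggy driver path.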