Does vLLM-MLX support efficient SSD caching? #99
Unanswered
TomLucidor asked this question in Q&A
Replies: 1 comment 2 replies
The README mentions that vllm-mlx supports an "SSD-tiered cache." However, it's not clear what level of efficiency or specific implementation details this entails for multi-agent designs. Could you clarify whether you are asking about specific support for multi-agent systems or more about the efficiency of SSD caching in general? Additionally, if you have any specific use cases or configurations in mind, please share those details.
I saw this being recommended, and I wonder whether vLLM-MLX has (or could have) something similar for multi-agent designs, assuming all agents use the same model to minimize memory footprint: https://github.com/jundot/omlx