Request to replace the model acceleration technique "flash attention" with the more versatile "vllm". #138

jake123456789ok started this conversation in Ideas

Replies: 0 comments

This discussion was converted from issue #19 on November 16, 2023 04:49.