Closed as not planned
Description
What behavior of the library made you think about the improvement?
It's slow as described in #617
How would you like it to behave?
Improve performance.
Based on #587, there are a few areas impacting performance:
- 1) Retrieving `RegexFSM` from cache is slow
- 2) `LALRInteractiveParser.accepts()` calls are slow
  - I've run into this issue before and have a solution here: https://github.com/lapp0/vllm/blob/344f27b84c41034067227d740967aca5254b0ade/vllm/grammar.py#L23-L68
  - addressed via Use `FastInteractiveParser` Subclass to Improve `CFGFSM` Performance #622
- 3) After addressing the two above areas, the slowest part of the second run is `TransformerTokenizer.__hash__`
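
For the third point, one possible direction is to compute the hash once and reuse it. The following is only a rough sketch of that idea and not the library's actual code: `SlowHashTokenizer` and `CachedHashTokenizer` are hypothetical stand-ins, and the sketch assumes the vocabulary is never mutated after construction (otherwise the cached value would go stale).

```python
import hashlib
import pickle


class SlowHashTokenizer:
    """Hypothetical stand-in for a tokenizer whose __hash__ is expensive."""

    def __init__(self, vocabulary):
        self.vocabulary = dict(vocabulary)
        self.hash_calls = 0  # counts how often the expensive path runs

    def _compute_hash(self):
        # Serialize the whole vocabulary on every call; this is the slow part.
        self.hash_calls += 1
        payload = pickle.dumps(sorted(self.vocabulary.items()))
        return int.from_bytes(hashlib.sha256(payload).digest()[:8], "big")

    def __hash__(self):
        return self._compute_hash()

    def __eq__(self, other):
        return (
            isinstance(other, SlowHashTokenizer)
            and self.vocabulary == other.vocabulary
        )


class CachedHashTokenizer(SlowHashTokenizer):
    """Same hash value, but computed at most once per instance.

    Assumption: the vocabulary is treated as immutable after __init__.
    """

    def __init__(self, vocabulary):
        super().__init__(vocabulary)
        self._cached_hash = None

    def __hash__(self):
        if self._cached_hash is None:
            self._cached_hash = self._compute_hash()
        return self._cached_hash

    # Defining __hash__ here keeps the inherited __eq__ usable for dict keys.
    __eq__ = SlowHashTokenizer.__eq__


if __name__ == "__main__":
    vocab = {"a": 0, "b": 1}
    slow, fast = SlowHashTokenizer(vocab), CachedHashTokenizer(vocab)
    for _ in range(5):
        hash(slow)
        hash(fast)
    print(slow.hash_calls, fast.hash_calls)
```

Every cache lookup keyed on the tokenizer calls `__hash__`, so memoizing it removes the repeated serialization cost from each lookup, at the price of requiring that the hashed state stays fixed.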