Commit fdf7397

Save the vllm tokenizer adapted state
1 parent c2599c9 commit fdf7397

2 files changed (+12 −3 lines)


docs/reference/vllm.md (1 addition & 1 deletion)

````diff
@@ -50,7 +50,7 @@ curl http://127.0.0.1:8000/generate \
 }'
 ```

-To generate a string that matches the grammar `<grammar>`:
+To generate a string that matches a given grammar `<grammar>`:

 ```bash
 curl http://127.0.0.1:8000/generate \
````

outlines/serve/vllm.py (11 additions & 2 deletions)

```diff
@@ -44,10 +44,16 @@ def _adapt_tokenizer(tokenizer):
     """Adapt vLLM's tokenizer to use to compile the FSM.

     The API of Outlines tokenizers is slightly different to that of
-    `transformers`. In addition we need to handle the missing spaces to
-    Llama's tokenizer to be able to compile FSMs for this model.
+    `transformers`. The decoder of Outlines returns a list whereas
+    vLLM's decoder returns a str. To sync the vLLM decoder with the
+    Outlines internal API, the decoder should be adapted. In addition
+    we need to handle the missing spaces in Llama's tokenizer to be
+    able to compile FSMs for this model.

     """
+    if getattr(tokenizer, "_outlines_adapted", False):
+        return tokenizer
+
     tokenizer.vocabulary = tokenizer.get_vocab()
     tokenizer.special_tokens = set(tokenizer.all_special_tokens)

@@ -65,13 +71,16 @@ def convert_token_to_string(token: str) -> str:
     def change_decoder(
         decoder: Callable[[List[int]], str]
     ) -> Callable[[List[int]], List[str]]:
+        """Sync vLLM's decoder with the Outlines expectations by returning a list."""
+
         def new_decoder(inp_tokens: List[int]) -> List[str]:
             return [decoder(inp_tokens)]

         return new_decoder

     tokenizer.convert_token_to_string = convert_token_to_string
     tokenizer.decode = change_decoder(tokenizer.decode)
+    setattr(tokenizer, "_outlines_adapted", True)

     return tokenizer
```
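The change above combines two patterns: wrapping `decode` so it returns a one-element list, and an `_outlines_adapted` flag that makes re-adaptation a no-op (without it, adapting twice would wrap `decode` twice and yield a nested list). A minimal sketch of that pattern in isolation, using a hypothetical `DummyTokenizer` stand-in rather than vLLM's real tokenizer class:

```python
from typing import Callable, List


class DummyTokenizer:
    """Stand-in for a vLLM tokenizer: `decode` maps token ids to a str."""

    def decode(self, inp_tokens: List[int]) -> str:
        return " ".join(str(t) for t in inp_tokens)


def adapt(tokenizer):
    # Guard: a second call returns the already-adapted object unchanged,
    # so `decode` is never wrapped twice.
    if getattr(tokenizer, "_outlines_adapted", False):
        return tokenizer

    def change_decoder(
        decoder: Callable[[List[int]], str]
    ) -> Callable[[List[int]], List[str]]:
        # Wrap a str-returning decoder so it returns a one-element list.
        def new_decoder(inp_tokens: List[int]) -> List[str]:
            return [decoder(inp_tokens)]

        return new_decoder

    tokenizer.decode = change_decoder(tokenizer.decode)
    tokenizer._outlines_adapted = True
    return tokenizer


tok = adapt(DummyTokenizer())
tok = adapt(tok)  # second call is a no-op thanks to the guard
print(tok.decode([1, 2]))  # ['1 2'], a list, not '1 2'
```

Note that `change_decoder` receives the bound method `tokenizer.decode`, and the wrapper is assigned back as a plain instance attribute, shadowing the class method.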
