Commit d969a1e

Change cfg key in vllm serving
1 parent af0b15b commit d969a1e

4 files changed, +24 -9 lines changed

docs/reference/vllm.md

+11 -5
@@ -25,11 +25,7 @@ You can then query the model in shell by passing a prompt and either
 1. a [JSON Schema][jsonschema]{:target="_blank"} specification or
 2. a [Regex][regex]{:target="_blank"} pattern
 
-<<<<<<< HEAD
-with the `schema` or `regex` parameters, respectively, to the `/generate` endpoint. If both are specified, the schema will be used. If neither is specified, the generated text will be unconstrained.
-=======
-with the `schema`, `regex` or `cfg` parameters, respectively, to the `/generate` endpoint. If both are specified, the schema will be used. If neither is specified, the generated text will be unconstrained.
->>>>>>> 43ff5c5 (Expect vllm.LLMEngine as processor's argument)
+with the `schema`, `regex` or `grammar` parameters, respectively, to the `/generate` endpoint. If both are specified, the schema will be used. If neither is specified, the generated text will be unconstrained.
 
 For example, to generate a string that matches the schema `{"type": "string"}` (any string):
 
@@ -51,6 +47,16 @@ curl http://127.0.0.1:8000/generate \
     }'
 ```
 
+To generate a string that matches the grammar `<grammar>`:
+
+```bash
+curl http://127.0.0.1:8000/generate \
+    -d '{
+        "prompt": "What is Pi? Give me the first 15 digits: ",
+        "grammar": "start: DECIMAL \r\nDIGIT: \"0\"..\"9\"\r\nINT: DIGIT+\r\nDECIMAL: INT \".\" INT? | \".\" INT"
+    }'
+```
+
 Instead of `curl`, you can also use the [requests][requests]{:target="_blank"} library from another python program.
 
 Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters. You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.
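
The documentation change shows the grammar request with `curl` and mentions the requests library as an alternative. A minimal Python sketch of the same call follows. It assumes the server from `outlines/serve/serve.py` is listening on `http://127.0.0.1:8000` and replies with a JSON body; the helper names `build_payload` and `generate` are hypothetical and not part of Outlines.

```python
import json

# The grammar from the curl example, written as a Python string.
# "\r\n" here is a real CR+LF pair; json.dumps turns it back into the
# escaped form seen in the curl body.
GRAMMAR = (
    'start: DECIMAL \r\n'
    'DIGIT: "0".."9"\r\n'
    'INT: DIGIT+\r\n'
    'DECIMAL: INT "." INT? | "." INT'
)


def build_payload(prompt, grammar):
    """Build the request body with the `grammar` key read by the server."""
    return {"prompt": prompt, "grammar": grammar}


def generate(payload, url="http://127.0.0.1:8000/generate"):
    """POST the payload to a running server (assumed to answer with JSON)."""
    # Third-party dependency, imported lazily so that building and
    # inspecting payloads works without it.
    import requests

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()


payload = build_payload("What is Pi? Give me the first 15 digits: ", GRAMMAR)
# The serialized body carries the same escapes as the curl example.
body = json.dumps(payload)
```

Calling `generate(payload)` sends the request once a server is running. Letting `json.dumps` (or the `json=` argument of `requests.post`) handle the escaping avoids writing `\"` and `\r\n` by hand.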

examples/vllm_integration.py

+6 -2
@@ -15,10 +15,14 @@ class User(BaseModel):
 
 llm = vllm.LLM(model="gpt2")
 logits_processor = JSONLogitsProcessor(User, llm.llm_engine)
-result = llm.generate(
+outputs = llm.generate(
     ["A prompt", "Another prompt"],
     sampling_params=vllm.SamplingParams(
         max_tokens=100, logits_processors=[logits_processor]
     ),
 )
-print(result)
+
+for output in outputs:
+    prompt = output.prompt
+    generated_text = output.outputs[0].text
+    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
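
The new loop reads `output.prompt` and `output.outputs[0].text` from each returned item instead of printing the whole result. The sketch below reproduces that loop without vLLM installed. `FakeCompletion` and `FakeRequestOutput` are hypothetical stand-ins that only mirror the two attributes used in the diff, not vLLM's actual classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCompletion:
    """Stand-in for one generated completion (only `text` is modelled)."""
    text: str


@dataclass
class FakeRequestOutput:
    """Stand-in for one item returned by `llm.generate`."""
    prompt: str
    outputs: List[FakeCompletion]


def format_outputs(outputs):
    """Return one line per request, taking the first completion of each."""
    lines = []
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        lines.append(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
    return lines


fake_outputs = [
    FakeRequestOutput("A prompt", [FakeCompletion('{"id": 1}')]),
    FakeRequestOutput("Another prompt", [FakeCompletion('{"id": 2}')]),
]
for line in format_outputs(fake_outputs):
    print(line)
```

Only the first completion of each request is read here, which matches the example in the diff; requests that ask for several samples would need to loop over `output.outputs`.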

outlines/serve/serve.py

+5
@@ -25,6 +25,7 @@
 from vllm.utils import random_uuid
 
 from .vllm import (
+    CFGLogitsProcessor,
     JSONLogitsProcessor,
     RegexLogitsProcessor,
     _patched_apply_logits_processors,
@@ -65,10 +66,14 @@ async def generate(request: Request) -> Response:
 
     json_schema = request_dict.pop("schema", None)
     regex_string = request_dict.pop("regex", None)
+    cfg_string = request_dict.pop("grammar", None)
+
     if json_schema is not None:
         logits_processors = [JSONLogitsProcessor(json_schema, engine.engine)]
     elif regex_string is not None:
         logits_processors = [RegexLogitsProcessor(regex_string, engine.engine)]
+    elif cfg_string is not None:
+        logits_processors = [CFGLogitsProcessor(cfg_string, engine.engine)]
     else:
         logits_processors = []
 
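
The `if`/`elif` chain in this hunk implies a precedence when a request carries more than one constraint key: `schema` wins over `regex`, which wins over `grammar`. The sketch below isolates that selection logic so it can be checked without a running engine. `pick_constraint` is a hypothetical helper that returns the chosen key and value instead of building a logits processor.

```python
from typing import Any, Dict, Optional, Tuple


def pick_constraint(request_dict: Dict[str, Any]) -> Tuple[Optional[str], Any]:
    """Mirror the handler's precedence: schema, then regex, then grammar.

    All three keys are popped from `request_dict`, as in the handler, so
    that only the remaining request parameters are left behind.
    """
    json_schema = request_dict.pop("schema", None)
    regex_string = request_dict.pop("regex", None)
    cfg_string = request_dict.pop("grammar", None)

    if json_schema is not None:
        return "schema", json_schema
    elif regex_string is not None:
        return "regex", regex_string
    elif cfg_string is not None:
        return "grammar", cfg_string
    # No constraint given: generation is unconstrained.
    return None, None


request = {"prompt": "What is Pi?", "regex": r"\d+\.\d+", "grammar": "start: INT"}
kind, value = pick_constraint(request)
print(kind, value)
print(request)
```

In this example the regex is selected and the grammar is dropped without an error, since the chain never reaches the `grammar` branch.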

outlines/serve/vllm.py

+2 -2
@@ -2,12 +2,12 @@
 import json
 import math
 from collections import defaultdict
-from typing import DefaultDict, List, Callable
+from typing import Callable, DefaultDict, List
 
 import torch
 from vllm import LLMEngine
 
-from outlines.fsm.fsm import RegexFSM, CFGFSM, FSM
+from outlines.fsm.fsm import CFGFSM, FSM, RegexFSM
 from outlines.fsm.json_schema import build_regex_from_object
 
 
