
Commit 9554b2c

Update the documentation to include tools and outputs
1 parent a89ab6f commit 9554b2c

25 files changed: +485 −153 lines

docs/features/advanced/backends.md

Lines changed: 2 additions & 2 deletions
@@ -25,11 +25,11 @@ model = outlines.from_transformers(
 )

 result = model("What is the capital of France?", output_type, backend="llguidance")
-print(result) # 'Paris'
+print(result.content) # 'Paris'

 generator = outlines.Generator(model, output_type)
 result = generator("What is the capital of France?", backend="xgrammar")
-print(result) # 'Paris'
+print(result.content) # 'Paris'
 ```

 If you do not provide a value for the `backend` argument, the default value is used. The default depends on the output type:

docs/features/advanced/logits_processors.md

Lines changed: 2 additions & 2 deletions
@@ -44,7 +44,7 @@ logits_processor = RegexLogitsProcessor(r"U\+[0-9A-Fa-f]{4,6}", model.tokenizer,
 generator = Generator(model, processor=logits_processor)
 response = generator("What's the unicode for the hugging face emoji")

-print(response) # U+1F917
+print(response.content) # U+1F917
 ```

 ## Creating Custom Logits Processors
@@ -95,5 +95,5 @@ formatted_prompt = tf_tokenizer.apply_chat_template(
 generator = Generator(model, processor=logits_processor)
 response = generator(formatted_prompt)

-print(response) # "101111"
+print(response.content) # "101111"
 ```

docs/features/core/generator.md

Lines changed: 4 additions & 4 deletions
@@ -47,7 +47,7 @@ generator = Generator(model)
 result = generator("Write a short poem about AI.")

 # Print the result
-print(result)
+print(result.content)
 ```

 ## Structured Generation
@@ -77,7 +77,7 @@ generator = Generator(model, BookRecommendation)
 result = generator("Recommend a science fiction book.")

 # Parse the JSON result into a Pydantic model
-book = BookRecommendation.model_validate_json(result)
+book = BookRecommendation.model_validate_json(result.content)
 print(f"{book.title} by {book.author} ({book.year})")
 ```

@@ -109,7 +109,7 @@ result = generator(

 ## Return Value

-The generator always returns a raw string containing the generated text. When generating structured outputs, you need to parse this string into the desired format.
+The generator returns an `Output` instance (or an iterator of `StreamingOutput` instances in the case of streaming). The `content` field contains the generated text as a string. When generating structured outputs, you need to parse this string into the desired format.

 Unlike in Outlines v0, where the return type could be a parsed object, in v1 you are responsible for parsing the output when needed:

@@ -126,7 +126,7 @@ generator = Generator(model, Person)
 result = generator("Generate a person:")

 # Parse the result yourself
-person = Person.model_validate_json(result)
+person = Person.model_validate_json(result.content)
 ```

 ::: outlines.generator.Generator

docs/features/core/inputs.md

Lines changed: 19 additions & 11 deletions
@@ -32,7 +32,7 @@ model = outlines.from_transformers(

 # Simple text prompt
 response = model("What's the capital of France?", max_new_tokens=20)
-print(response) # 'Paris'
+print(response.content) # 'Paris'
 ```

 ## Multimodal Inputs (Vision)
@@ -76,16 +76,22 @@ prompt = [

 # Call the model to generate a response
 response = model(prompt, max_tokens=50)
-print(response) # 'This is a picture of a black dog.'
+print(response.content) # 'This is a picture of a black dog.'
 ```

 ## Chat Inputs

 For conversational models, you can use the `Chat` class to provide a conversation history with multiple messages.

-A `Chat` instance is instantiated with an optional list of messages. Each message must be a dictionary containing two mandatory keys:
-- `role`: must be one of `system`, `assistant` or `user`
-- `content`: must be either a string or a multimodal input (if the model supports it)
+A `Chat` is instantiated with an optional list of messages. The type of each message is defined by the value of the mandatory `role` key. There are four types of messages, each with its own associated keys:
+- `system`: system instructions giving the LLM context on the task to perform. The only other key is `content` (mandatory).
+- `user`: a message from you in the conversation. The only other key is `content` (mandatory).
+- `assistant`: a response from the LLM. The other keys are `content` and `tool_calls` (a list of `ToolCall` instances). At least one of the two must be provided.
+- `tool`: a tool call response. The other keys are `content` (mandatory), `tool_name` and `tool_call_id`. Depending on the model you are using, one of the latter two is mandatory.
+
+Support for the message types and fields described above depends on the capabilities of the model you are using; tool calling, for instance, is currently limited to a few models. To learn more about tools, consult the dedicated section on [tools](./tools.md).
+
+An `Output` instance returned by a model can also be added to a `Chat`. It will automatically be turned into an assistant message. To learn more about model outputs, consult the dedicated section on [outputs](./outputs.md).

 For instance:

@@ -149,13 +155,15 @@ print(prompt)
 # {'role': 'assistant', 'content': 'Excellent, thanks!'}
 ```

-Finally, there are three convenience method to easily add a message:
+There are five convenience methods to easily add a message:

-- add_system_message
-- add_user_message
-- add_assistant_message
+- `add_system_message`
+- `add_user_message`
+- `add_assistant_message`
+- `add_tool_message`
+- `add_output`

-As the role is already set, you only need to provide the content.
+As the role is already set, you only need to provide values for the other keys of the message type; for `add_output`, you just provide the output of the model call.

 For instance:

@@ -200,5 +208,5 @@ prompts = [

 # Call it to generate text
 result = model.batch(prompts, max_new_tokens=20)
-print(result) # ['Vilnius', 'Riga', 'Tallinn']
+print([item.content for item in result]) # ['Vilnius', 'Riga', 'Tallinn']
 ```

docs/features/core/output_types.md

Lines changed: 5 additions & 5 deletions
@@ -48,9 +48,9 @@ def create_character() -> Character:
 With an Outlines model, you can generate text that respects the type hints above by providing those as the output type:

 ```python
-model("How many minutes are there in one hour", int) # "60"
-model("Pizza or burger", Literal["pizza", "burger"]) # "pizza"
-model("Create a character", Character, max_new_tokens=100) # '{"name": "James", "birth_date": "1980-05-10", "skills": ["archery", "negotiation"]}'
+model("How many minutes are there in one hour", int).content # "60"
+model("Pizza or burger", Literal["pizza", "burger"]).content # "pizza"
+model("Create a character", Character, max_new_tokens=100).content # '{"name": "James", "birth_date": "1980-05-10", "skills": ["archery", "negotiation"]}'
 ```

 An important difference with function type hints, though, is that the `content` of the output returned by an Outlines model is always a string.
@@ -61,8 +61,8 @@ For instance:
 ```python
 result = model("Create a character", Character, max_new_tokens=100)
 casted_result = Character.model_validate_json(result.content)
-print(result) # '{"name": "Aurora", "birth_date": "1990-06-15", "skills": ["Stealth", "Diplomacy"]}'
-print(casted_result) # name=Aurora birth_date=datetime.date(1990, 6, 15) skills=['Stealth', 'Diplomacy']
+print(result.content) # '{"name": "Aurora", "birth_date": "1990-06-15", "skills": ["Stealth", "Diplomacy"]}'
+print(casted_result) # name=Aurora birth_date=datetime.date(1990, 6, 15) skills=['Stealth', 'Diplomacy']
 ```

 ## Output Type Categories

docs/features/core/outputs.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
---
title: Outputs
---

# Outputs

## Overview

Outlines uses two objects to contain model responses: `Output` and `StreamingOutput`.

They both have two fields:

- `content`: the raw text response returned by the model
- `tool_calls`: a list of `ToolCallOutput` or `StreamingToolCallOutput` instances if the model decided to call a tool instead of giving a response directly. This field can only have a value if you provided a list of tools to the model in the first place.

To access the text response from the model, you would thus typically just read `response.content`. In the case of streaming, each item gives you a chunk of the response.
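
For instance, here is a minimal sketch of consuming a streaming response, assuming an OpenAI model and a `stream()` method as the streaming entry point (check the documentation of your model for actual streaming support):

```python
import openai
from outlines import from_openai

model = from_openai(openai.OpenAI(), "gpt-4o")

# Each item yielded is assumed to be a StreamingOutput whose
# `content` field holds one chunk of the full text response.
chunks = []
for chunk in model.stream("What's the capital of Latvia?"):
    chunks.append(chunk.content)

print("".join(chunks))  # 'The capital of Latvia is Riga.'
```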

## Chat

If you are using a `Chat` input to call the model, you can add the `Output` you received from the model to your `Chat` instance; this adds a new message that will be part of the conversation provided to the model the next time you call it.

For instance:

```python
import transformers
import outlines
from outlines.inputs import Chat

MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"

model = outlines.from_transformers(
    transformers.AutoModelForCausalLM.from_pretrained(MODEL_ID),
    transformers.AutoTokenizer.from_pretrained(MODEL_ID),
)

# Initialize the chat with a system message.
chat_prompt = Chat([
    {"role": "system", "content": "You are a helpful assistant."},
])

# Add a user message to the chat.
chat_prompt.add_user_message("What's the capital of Latvia?")

# Call the model with the chat input.
response = model(chat_prompt)
print(response.content) # 'The capital of Latvia is Riga.'

# Add the output to the chat.
chat_prompt.add_output(response)

# Add another user message to the chat and call the model again.
chat_prompt.add_user_message("How many inhabitants does it have?")
response = model(chat_prompt)
print(response.content) # '600,000'
```

## Tool Calls

As described above, the output you receive from the model can contain a list of `ToolCallOutput` or `StreamingToolCallOutput` instances in the `tool_calls` field if the model decided to call tools first.

A `ToolCallOutput` or `StreamingToolCallOutput` contains three fields:
- `name`: the name of the tool to call
- `id`: the id of the tool call. If provided, it should typically be included in the tool message containing the tool response you add to the `Chat`
- `args`: the arguments to provide to the tool. This is a dictionary for regular calls and a string for streaming calls (as it may contain only a chunk of the whole args)

See the section on [tools](./tools.md) for an explanation of how to use the `ToolCallOutput` to make a tool call.
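
As a quick illustrative sketch (the tool function and registry below are hypothetical, not part of the Outlines API), a `ToolCallOutput` can be dispatched to the matching Python function like this:

```python
# Hypothetical tool implementation and registry, keyed by tool name.
def get_weather(city: str, hour: int | None = None):
    return "20 degrees"

TOOLS = {"get_weather": get_weather}

# `response` is assumed to be an Output whose `tool_calls` is non-empty.
for tool_call in response.tool_calls:
    result = TOOLS[tool_call.name](**tool_call.args)  # args is a dict here
    print(tool_call.id, result)
```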

docs/features/core/tools.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
---
title: Tools
---

# Tools

## Overview

Some models support tool calling: instead of directly providing its final response, the model can request calls to tools you have defined and later use the tool responses in its final answer. Tool calling typically goes along with a `Chat` input, as it implies a multi-turn conversation with the model.

For the moment, tool calling is supported by three Outlines models:

- `Anthropic`
- `Gemini`
- `OpenAI`

## Tool Definition

Using tool calling starts with defining the tools the model can call. Three formats are currently supported, as described below.

Once defined, the tools must be provided as a list through the `tools` keyword argument of the `Generator` constructor or of a model's text generation methods. As such, the interface for `tools` is very similar to that of `output_type`.

#### ToolDef

A tool can first be defined as a dictionary. A `ToolDef` dict must contain the following keys:

- `name`: The name of the tool
- `description`: A description of the tool to help the LLM understand its use
- `parameters`: A dictionary containing the parameters of the tool, using the JSON Schema properties format. If the LLM decides to call the tool, it will provide values for the parameters
- `required`: A list of parameters that are mandatory. All of those parameters must be included in the `parameters` key described above

For instance:

```python
import openai
from outlines import from_openai
from outlines.inputs import Chat
from outlines.tools import ToolDef

client = openai.OpenAI()
model = from_openai(client, "gpt-4o")

chat = Chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
])

weather_tool = ToolDef(
    name="get_weather",
    description="Give the weather for a given city, and optionally for a specific hour of the day",
    parameters={"city": {"type": "string"}, "hour": {"type": "integer"}},
    required=["city"],
)

response = model(chat, tools=[weather_tool])
print(response.tool_calls) # [ToolCallOutput(name='get_weather', id='call_p7ToNwgrgoEk9poN7PXTELT5', args={'city': 'Tokyo'})]
```

#### Function

A Python function can also be used as a tool definition. The `description` then corresponds to the docstring, while `parameters` and `required` are deduced from the signature.

```python
import openai
from outlines import from_openai
from outlines.inputs import Chat
from typing import Optional

client = openai.OpenAI()
model = from_openai(client, "gpt-4o")

chat = Chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
])

def get_weather(city: str, hour: Optional[int] = None):
    """Give the weather for a given city, and optionally for a specific hour of the day"""
    pass

response = model(chat, tools=[get_weather])
print(response.tool_calls) # [ToolCallOutput(name='get_weather', id='call_IdsfmBss6XhiBDbchTqp3HHz', args={'city': 'Tokyo'})]
```

#### Pydantic model

Lastly, you can use a Pydantic model to define the interface of your tool.

```python
import openai
from outlines import from_openai
from outlines.inputs import Chat
from pydantic import BaseModel
from typing import Optional

client = openai.OpenAI()
model = from_openai(client, "gpt-4o")

chat = Chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
])

class GetWeather(BaseModel):
    """Give the weather for a given city, and optionally for a specific hour of the day"""
    city: str
    hour: Optional[int] = None

response = model(chat, tools=[GetWeather])
print(response.tool_calls) # [ToolCallOutput(name='GetWeather', id='call_KWfADMEr6dnDDcw1m2dllRvq', args={'city': 'Tokyo'})]
```

## Tool Calls and Responses

If the model decides to call a tool, you'll get a value for the `tool_calls` attribute of the `Output` received. This value is a list of `ToolCallOutput` instances, each containing three attributes:

- `name`: The name of the tool to call
- `id`: The id of the tool call, making it easy to link the tool call to the tool response
- `args`: A dictionary mapping each parameter required by the tool to the value provided by the LLM

You should use the `name` and the `args` to call your tool yourself and get its response. Afterward, add the `Output` you first received to your chat along with a tool message before calling the model again to continue the conversation.

For instance:

```python
import openai
from outlines import Generator, from_openai
from outlines.inputs import Chat
from typing import Optional

# Our tool
def get_weather(city: str, hour: Optional[int] = None):
    """Give the weather for a given city, and optionally for a specific hour of the day"""
    return "20 degrees"

client = openai.OpenAI()
model = from_openai(client, "gpt-4o")
generator = Generator(model, tools=[get_weather])

chat = Chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
])

response = generator(chat)
print(response.tool_calls) # [ToolCallOutput(name='get_weather', id='call_NlIGHr8HoiVgSZfOJ7Y5xz35', args={'city': 'Tokyo'})]

# Add the model response to the chat
chat.add_output(response)

# Call the tool with the parameters given by the model and add a tool message to the chat
tool_call = response.tool_calls[0]
tool_response = get_weather(**tool_call.args)
chat.add_tool_message(
    content=tool_response,
    tool_name=tool_call.name,
    tool_call_id=tool_call.id
)

response = generator(chat)
print(response.content) # The weather in Tokyo is currently 20 degrees.
```

When using streaming, the response is a `StreamingOutput` and the `tool_calls` value is a list of `StreamingToolCallOutput` instances. The only difference compared to what is described above is that the `args` field is a string, as the value is received in chunks. You need to concatenate the chunks to obtain the full `args` before calling the tool.
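
Here is a minimal sketch of that concatenation, reusing the `generator`, `chat` and `get_weather` from the example above, and assuming a `stream()` entry point that yields `StreamingOutput` chunks carrying at most one tool call each, with the concatenated `args` forming a JSON object string:

```python
import json

# Accumulate the streamed args string chunk by chunk.
args_buffer = ""
for chunk in generator.stream(chat):
    if chunk.tool_calls:
        args_buffer += chunk.tool_calls[0].args

# Once the stream is exhausted, parse the full args and call the tool.
args = json.loads(args_buffer)
tool_response = get_weather(**args)
```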
