## README.md (+12 −8)
````diff
@@ -46,14 +46,14 @@ RubyLLM fixes all that. One beautiful API for everything. One consistent format.
 chat = RubyLLM.chat
 chat.ask "What's the best way to learn Ruby?"
 
-# Analyze images
-chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }
+# Analyze images, audio, documents, and text files
+chat.ask "What's in this image?", with: "ruby_conf.jpg"
+chat.ask "Describe this meeting", with: "meeting.wav"
+chat.ask "Summarize this document", with: "contract.pdf"
+chat.ask "Explain this code", with: "app.rb"
 
-# Analyze audio recordings
-chat.ask "Describe this meeting", with: { audio: "meeting.wav" }
-
-# Analyze documents
-chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+# Multiple files at once - types automatically detected
+chat.ask "Analyze these files", with: ["diagram.png", "report.pdf", "notes.txt"]
 
 # Stream responses in real-time
 chat.ask "Tell me a story about a Ruby programmer" do |chunk|
````
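As a rough illustration of the "types automatically detected" behavior above, here is a plain-Ruby sketch of extension-based type mapping. The `TYPE_MAP` table and `detect_type` method are hypothetical stand-ins, not RubyLLM's internals (which also inspect content types):

```ruby
# Hypothetical sketch: map a file extension to an attachment category.
TYPE_MAP = {
  %w[.png .jpg .jpeg .gif .webp] => :image,
  %w[.wav .mp3 .m4a .flac]       => :audio,
  %w[.pdf]                       => :pdf,
  %w[.txt .md .rb .py .js]       => :text
}.freeze

def detect_type(path)
  ext = File.extname(path).downcase
  TYPE_MAP.each { |exts, type| return type if exts.include?(ext) }
  :unknown
end

%w[ruby_conf.jpg meeting.wav contract.pdf app.rb].map { |f| detect_type(f) }
# => [:image, :audio, :pdf, :text]
```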
````diff
@@ -90,7 +90,7 @@ chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
 * 💬 **Unified Chat:** Converse with models from OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama, or any OpenAI-compatible API using `RubyLLM.chat`.
 * 👁️ **Vision:** Analyze images within chats.
 * 🔊 **Audio:** Transcribe and understand audio content.
-* 📄 **PDF Analysis:** Extract information and summarize PDF documents.
+* 📄 **Document Analysis:** Extract information from PDFs, text files, and other documents.
 * 🖼️ **Image Generation:** Create images with `RubyLLM.paint`.
 * 📊 **Embeddings:** Generate text embeddings for vector search with `RubyLLM.embed`.
 * 🔧 **Tools (Function Calling):** Let AI models call your Ruby code using `RubyLLM::Tool`.
````
````diff
@@ -143,6 +143,10 @@ end
 # Now interacting with a Chat record persists the conversation:
````
## docs/configuration.md (+12 −20)
````diff
@@ -31,13 +31,12 @@ After reading this guide, you will know:
 
 ## Global Configuration (`RubyLLM.configure`)
 
-{: .warning }
-> Native OpenRouter and Ollama support is coming in v1.3.0
->
-> Consider using `openai_api_base` in the meantime.
-
 The primary way to configure RubyLLM is using the `RubyLLM.configure` block. This typically runs once when your application starts (e.g., in `config/initializers/ruby_llm.rb` for Rails apps, or at the top of a script).
 
+RubyLLM provides sensible defaults, so you only need to configure what you really need.
+
+Here's a reference of all the configuration options RubyLLM provides:
+
 ```ruby
 require 'ruby_llm'
````
````diff
@@ -78,7 +77,13 @@ RubyLLM.configure do |config|
   config.retry_interval = 0.1       # Initial delay in seconds (default: 0.1)
   config.retry_backoff_factor = 2   # Multiplier for subsequent retries (default: 2)
 ```
 
-You only need to set the API keys for the providers you actually plan to use. Attempting to use an unconfigured provider will result in a `RubyLLM::ConfigurationError`.
+You only need to set the configuration options you need and the API keys for the providers you actually plan to use. Attempting to use an unconfigured provider will result in a `RubyLLM::ConfigurationError`.
 
 ## Provider API Keys
````
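The retry options above compose as exponential backoff: the delay before attempt *n* is `retry_interval * retry_backoff_factor**(n - 1)`. A self-contained sketch of that arithmetic (illustrative of the documented semantics only, not RubyLLM's internal retry code; real clients typically also add jitter and cap the delay):

```ruby
# Compute the backoff schedule implied by the two retry settings.
def retry_delays(retry_interval, retry_backoff_factor, attempts)
  (1..attempts).map { |n| retry_interval * retry_backoff_factor**(n - 1) }
end

retry_delays(0.1, 2, 5)
# => [0.1, 0.2, 0.4, 0.8, 1.6]
```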
````diff
@@ -122,10 +127,6 @@ end
 This setting redirects requests made with `provider: :openai` to your specified base URL. See the [Working with Models Guide]({% link guides/models.md %}#connecting-to-custom-endpoints--using-unlisted-models) for more details on using custom models with this setting.
 
 ## Optional OpenAI Headers
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 OpenAI supports additional headers for organization and project management:
````
````diff
@@ -157,11 +158,6 @@ Fine-tune how RubyLLM handles HTTP connections and retries.
 Adjust these based on network conditions and provider reliability.
 
 ## Logging Settings
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
-
 RubyLLM provides flexible logging configuration to help you monitor and debug API interactions. You can configure both the log file location and the logging level.
 
 ```ruby
````
````diff
@@ -186,10 +182,6 @@ end
 You can also set the debug level by setting the `RUBYLLM_DEBUG` environment variable to `true`.
 
 ## Scoped Configuration with Contexts
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 While `RubyLLM.configure` sets global defaults, `RubyLLM.context` allows you to create temporary, isolated configuration scopes for specific API calls. This is ideal for situations requiring different keys, endpoints, or timeouts temporarily without affecting the rest of the application.
````
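The scoped-configuration idea can be sketched in plain Ruby: duplicate the global configuration, apply overrides to the copy, and leave the global untouched. The `Config` struct and `with_overrides` helper below are illustrative stand-ins, not RubyLLM internals:

```ruby
# Hypothetical global config object with two of the documented options.
Config = Struct.new(:request_timeout, :openai_api_key)
GLOBAL_CONFIG = Config.new(120, "global-key")

# Build an isolated copy with per-call overrides applied.
def with_overrides(base, **overrides)
  scoped = base.dup
  overrides.each { |key, value| scoped[key] = value }
  scoped
end

scoped = with_overrides(GLOBAL_CONFIG, request_timeout: 10)
scoped.request_timeout        # => 10
GLOBAL_CONFIG.request_timeout # => 120 (global default unchanged)
```

Because the scope is just a separate object, nothing leaks between concurrent calls that use different scopes.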
## docs/guides/chat.md (+52 −10)
````diff
@@ -119,7 +119,7 @@ RubyLLM manages a registry of known models and their capabilities. For detailed
 
 ## Multi-modal Conversations
 
-Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, audio, and even PDFs in your chat messages using the `with:` option in the `ask` method.
+Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, audio, text files, and PDFs in your chat messages using the `with:` option in the `ask` method.
 
 ### Working with Images
````
````diff
@@ -130,17 +130,15 @@ Provide image paths or URLs to vision-capable models (like `gpt-4o`, `claude-3-o
 chat = RubyLLM.chat(model: 'gpt-4o')
 
 # Ask about a local image file
-response = chat.ask "Describe this logo.", with: { image: "path/to/ruby_logo.png" }
+response = chat.ask "Describe this logo.", with: "path/to/ruby_logo.png"
 puts response.content
 
 # Ask about an image from a URL
-response = chat.ask "What kind of architecture is shown here?", with: { image: "https://example.com/eiffel_tower.jpg" }
+response = chat.ask "What kind of architecture is shown here?", with: "https://example.com/eiffel_tower.jpg"
 puts response.content
 
 # Send multiple images
-response = chat.ask "Compare the user interfaces in these two screenshots.", with: {
-  image: ["screenshot_v1.png", "screenshot_v2.png"]
-}
+response = chat.ask "Compare the user interfaces in these two screenshots.", with: ["screenshot_v1.png", "screenshot_v2.png"]
 puts response.content
 ```
````
````diff
@@ -154,14 +152,30 @@ Provide audio file paths to audio-capable models (like `gpt-4o-audio-preview`).
 chat = RubyLLM.chat(model: 'gpt-4o-audio-preview') # Use an audio-capable model
 response = chat.ask "Please transcribe this meeting recording.", with: "path/to/meeting.mp3"
 puts response.content
 
 # Ask follow-up questions based on the audio context
 response = chat.ask "What were the main action items discussed?"
 puts response.content
 ```
 
+### Working with Text Files
+
+Provide text file paths to models that support document analysis.
+
+```ruby
+chat = RubyLLM.chat(model: 'claude-3-5-sonnet')
+
+# Analyze a text file
+response = chat.ask "Summarize the key points in this document.", with: "path/to/document.txt"
+puts response.content
+
+# Ask questions about code files
+response = chat.ask "Explain what this Ruby file does.", with: "app/models/user.rb"
+puts response.content
+```
+
 ### Working with PDFs
 
 Provide PDF paths or URLs to models that support document analysis (currently Claude 3+ and Gemini models).
````
````diff
@@ -171,21 +185,49 @@ Provide PDF paths or URLs to models that support document analysis (currently Cl
 chat = RubyLLM.chat(model: 'claude-3-7-sonnet')
 
 # Ask about a local PDF
-response = chat.ask "Summarize the key findings in this research paper.", with: { pdf: "path/to/paper.pdf" }
+response = chat.ask "Summarize the key findings in this research paper.", with: "path/to/paper.pdf"
 puts response.content
 
 # Ask about a PDF via URL
-response = chat.ask "What are the terms and conditions outlined here?", with: { pdf: "https://example.com/terms.pdf" }
+response = chat.ask "What are the terms and conditions outlined here?", with: "https://example.com/terms.pdf"
 puts response.content
 
 # Combine text and PDF context
-response = chat.ask "Based on section 3 of this document, what is the warranty period?", with: { pdf: "manual.pdf" }
+response = chat.ask "Based on section 3 of this document, what is the warranty period?", with: "manual.pdf"
 puts response.content
 ```
 
 {: .note }
 **PDF Limitations:** Be mindful of provider-specific limits. For example, Anthropic Claude models currently have a 10MB per-file size limit, and the total size/token count of all PDFs must fit within the model's context window (e.g., 200,000 tokens for Claude 3 models).
 
+### Simplified Attachment API
+
+RubyLLM automatically detects file types based on extensions and content, so you can pass files directly without specifying the type:
+
+```ruby
+chat = RubyLLM.chat(model: 'claude-3-5-sonnet')
+
+# Single file - type automatically detected
+response = chat.ask "What's in this file?", with: "path/to/document.pdf"
+
+# Multiple files of different types
+response = chat.ask "Analyze these files", with: [
+  "diagram.png",
+  "report.pdf",
+  "meeting_notes.txt",
+  "recording.mp3"
+]
+
+# Still works with the explicit hash format if needed
+response = chat.ask "What's in this image?", with: { image: "photo.jpg" }
+```
+
+**Code:** .rb, .py, .js, .html, .css (and many others)
+
 ## Controlling Creativity: Temperature
 
 The `temperature` setting influences the randomness and creativity of the AI's responses. A higher value (e.g., 0.9) leads to more varied and potentially surprising outputs, while a lower value (e.g., 0.1) makes the responses more focused, deterministic, and predictable. The default is generally around 0.7.
````
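Conceptually, temperature divides the model's logits before a softmax, so low values sharpen the output distribution and high values flatten it. A minimal numeric illustration of that general technique (not RubyLLM code — providers apply this server-side):

```ruby
# softmax(logits / temperature): the standard temperature-scaled sampling weights.
def softmax(logits, temperature)
  scaled = logits.map { |l| Math.exp(l / temperature) }
  total  = scaled.sum
  scaled.map { |s| s / total }
end

logits = [2.0, 1.0, 0.1]
softmax(logits, 0.1) # nearly all probability mass on the top logit
softmax(logits, 1.0) # moderate spread
softmax(logits, 2.0) # flatter distribution, more "creative" sampling
```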
````diff
+# Use a model not in the registry (useful for custom endpoints)
+embedding_custom = RubyLLM.embed(
+  "Custom model test",
+  model: "my-custom-embedding-model",
+  provider: :openai,
+  assume_model_exists: true
+)
 ```
 
 You can configure the default embedding model globally:
````
````diff
@@ -93,10 +101,6 @@ end
 Refer to the [Working with Models Guide]({% link guides/models.md %}) for details on finding available embedding models and their capabilities.
 
 ## Choosing Dimensions
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 Each embedding model has its own default output dimensions. For example, OpenAI's `text-embedding-3-small` outputs 1536 dimensions by default, while `text-embedding-3-large` outputs 3072 dimensions. RubyLLM allows you to specify these dimensions per request:
````
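Embedding vectors are plain arrays of floats, and vector search typically compares them with cosine similarity — which is also why the query vector and the stored vectors must use the same dimensions. A self-contained sketch of the comparison (not part of RubyLLM's API):

```ruby
# Cosine similarity: dot product of the vectors divided by the product
# of their magnitudes. 1.0 = same direction, 0.0 = orthogonal.
def cosine_similarity(a, b)
  dot   = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  dot / (mag_a * mag_b)
end

cosine_similarity([1.0, 0.0], [1.0, 0.0]) # => 1.0
cosine_similarity([1.0, 0.0], [0.0, 1.0]) # => 0.0
```

Vectors of different lengths can't be compared this way, so pick one model and dimension setting per index and keep it consistent.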
## docs/guides/rails.md (+3 −15)
````diff
@@ -50,7 +50,7 @@ This two-phase approach (create empty → update with content) is intentional an
 
 1. **Streaming-first design**: By creating the message record before the API call, your UI can immediately show a "thinking" state and have a DOM target ready for incoming chunks.
 2. **Turbo Streams compatibility**: Works perfectly with `after_create_commit { broadcast_append_to ... }` for real-time updates.
-3. **Clean rollback on failure**: If the API call fails, the empty message is automatically removed.
+3. **Clean rollback on failure**: If the API call fails, the empty assistant message is automatically removed, preventing orphaned records that could cause issues with providers like Gemini that reject empty messages.
 
 ### Content Validation Implications
````
````diff
@@ -118,10 +118,6 @@ end
 Run the migrations: `rails db:migrate`
 
 ### ActiveStorage Setup for Attachments (Optional)
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 If you want to use attachments (images, audio, PDFs) with your AI chats, you need to set up ActiveStorage:
````
If you've set up ActiveStorage as described above, you can easily send attachments to AI models with automatic type detection:

````diff
@@ -290,22 +282,18 @@ chat_record.ask("What's in this file?", with: "app/assets/images/diagram.png")
 chat_record.ask("What are in these files?", with: [
   "app/assets/documents/report.pdf",
   "app/assets/images/chart.jpg",
+  "app/assets/text/notes.txt",
   "app/assets/audio/recording.mp3"
 ])
 
-# Still works with manually categorized hash (backward compatible)
-chat_record.ask("What's in this image?", with: {
-  image: "app/assets/images/diagram.png"
-})
-
 # Works with file uploads from forms
 chat_record.ask("Analyze this file", with: params[:uploaded_file])
 
 # Works with existing ActiveStorage attachments
 chat_record.ask("What's in this document?", with: user.profile_document)
 ```
 
-The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, audio file, or PDF - RubyLLM figures it out for you!
+The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, audio file, PDF, or text document - RubyLLM figures it out for you!
````
## docs/index.md (+12 −8)
````diff
@@ -72,14 +72,14 @@ RubyLLM fixes all that. One beautiful API for everything. One consistent format.
 chat = RubyLLM.chat
 chat.ask "What's the best way to learn Ruby?"
 
-# Analyze images
-chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }
+# Analyze images, audio, documents, and text files
+chat.ask "What's in this image?", with: "ruby_conf.jpg"
+chat.ask "Describe this meeting", with: "meeting.wav"
+chat.ask "Summarize this document", with: "contract.pdf"
+chat.ask "Explain this code", with: "app.rb"
 
-# Analyze audio recordings
-chat.ask "Describe this meeting", with: { audio: "meeting.wav" }
-
-# Analyze documents
-chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+# Multiple files at once - types automatically detected
+chat.ask "Analyze these files", with: ["diagram.png", "report.pdf", "notes.txt"]
 
 # Stream responses in real-time
 chat.ask "Tell me a story about a Ruby programmer" do |chunk|
````
````diff
@@ -116,7 +116,7 @@ chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
 * 💬 **Unified Chat:** Converse with models from OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama, or any OpenAI-compatible API using `RubyLLM.chat`.
 * 👁️ **Vision:** Analyze images within chats.
 * 🔊 **Audio:** Transcribe and understand audio content.
-* 📄 **PDF Analysis:** Extract information and summarize PDF documents.
+* 📄 **Document Analysis:** Extract information from PDFs, text files, and other documents.
 * 🖼️ **Image Generation:** Create images with `RubyLLM.paint`.
 * 📊 **Embeddings:** Generate text embeddings for vector search with `RubyLLM.embed`.
 * 🔧 **Tools (Function Calling):** Let AI models call your Ruby code using `RubyLLM::Tool`.
````
````diff
@@ -169,5 +169,9 @@ end
 # Now interacting with a Chat record persists the conversation:
````