Commit 555cf0a: Updated documentation for 1.3.0
1 parent 4d23c98
8 files changed: +127 additions, -65 deletions

README.md

Lines changed: 12 additions & 8 deletions
@@ -46,14 +46,14 @@ RubyLLM fixes all that. One beautiful API for everything. One consistent format.
 chat = RubyLLM.chat
 chat.ask "What's the best way to learn Ruby?"
 
-# Analyze images
-chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }
+# Analyze images, audio, documents, and text files
+chat.ask "What's in this image?", with: "ruby_conf.jpg"
+chat.ask "Describe this meeting", with: "meeting.wav"
+chat.ask "Summarize this document", with: "contract.pdf"
+chat.ask "Explain this code", with: "app.rb"
 
-# Analyze audio recordings
-chat.ask "Describe this meeting", with: { audio: "meeting.wav" }
-
-# Analyze documents
-chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+# Multiple files at once - types automatically detected
+chat.ask "Analyze these files", with: ["diagram.png", "report.pdf", "notes.txt"]
 
 # Stream responses in real-time
 chat.ask "Tell me a story about a Ruby programmer" do |chunk|
@@ -90,7 +90,7 @@ chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
 * 💬 **Unified Chat:** Converse with models from OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama, or any OpenAI-compatible API using `RubyLLM.chat`.
 * 👁️ **Vision:** Analyze images within chats.
 * 🔊 **Audio:** Transcribe and understand audio content.
-* 📄 **PDF Analysis:** Extract information and summarize PDF documents.
+* 📄 **Document Analysis:** Extract information from PDFs, text files, and other documents.
 * 🖼️ **Image Generation:** Create images with `RubyLLM.paint`.
 * 📊 **Embeddings:** Generate text embeddings for vector search with `RubyLLM.embed`.
 * 🔧 **Tools (Function Calling):** Let AI models call your Ruby code using `RubyLLM::Tool`.
@@ -143,6 +143,10 @@ end
 # Now interacting with a Chat record persists the conversation:
 chat_record = Chat.create!(model_id: "gpt-4.1-nano")
 chat_record.ask("Explain Active Record callbacks.") # User & Assistant messages saved
+
+# Works seamlessly with file attachments - types automatically detected
+chat_record.ask("What's in this file?", with: "report.pdf")
+chat_record.ask("Analyze these", with: ["image.jpg", "data.csv", "notes.txt"])
 ```
 Check the [Rails Integration Guide](https://rubyllm.com/guides/rails) for more.
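
The hunk header above references a `Weather` tool. As a minimal sketch, here is what such a tool could look like with the `RubyLLM::Tool` DSL named in the feature list; the coordinate parameters and the stubbed return value are illustrative assumptions, not part of the commit:

```ruby
# Hypothetical Weather tool (params and result are assumptions).
class Weather < RubyLLM::Tool
  description "Gets current weather for a set of coordinates"
  param :latitude, desc: "Latitude (e.g., 52.5200)"
  param :longitude, desc: "Longitude (e.g., 13.4050)"

  def execute(latitude:, longitude:)
    # A real tool would call a weather API here.
    { temperature_c: 21, conditions: "partly cloudy" }
  end
end

RubyLLM.chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
```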

docs/configuration.md

Lines changed: 12 additions & 20 deletions
@@ -31,13 +31,12 @@ After reading this guide, you will know:
 
 ## Global Configuration (`RubyLLM.configure`)
 
-{: .warning }
-> Native OpenRouter and Ollama support is coming in v1.3.0
->
-> Consider using `openai_api_base` in the meantime.
-
 The primary way to configure RubyLLM is using the `RubyLLM.configure` block. This typically runs once when your application starts (e.g., in `config/initializers/ruby_llm.rb` for Rails apps, or at the top of a script).
 
+RubyLLM provides sensible defaults, so you only need to configure what you really need.
+
+Here's a reference of all the configuration options RubyLLM provides:
+
 ```ruby
 require 'ruby_llm'
 
@@ -78,7 +77,13 @@ RubyLLM.configure do |config|
   config.retry_interval = 0.1 # Initial delay in seconds (default: 0.1)
   config.retry_backoff_factor = 2 # Multiplier for subsequent retries (default: 2)
   config.retry_interval_randomness = 0.5 # Jitter factor (default: 0.5)
-  config.http_proxy = ENV.fetch('HTTP_PROXY', 'http://proxy.example.com:3128') # Optional HTTP proxy
+
+  # --- HTTP Proxy Support ---
+  config.http_proxy = ENV.fetch('HTTP_PROXY', nil) # Optional HTTP proxy
+  # Examples:
+  # config.http_proxy = "http://proxy.company.com:8080" # Basic proxy
+  # config.http_proxy = "http://user:pass@proxy.company.com:8080" # Authenticated proxy
+  # config.http_proxy = "socks5://proxy.company.com:1080" # SOCKS5 proxy
 
   # --- Logging Settings ---
   config.log_file = '/logs/ruby_llm.log'
@@ -88,7 +93,7 @@ end
 ```
 
 {: .note }
-You only need to set the API keys for the providers you actually plan to use. Attempting to use an unconfigured provider will result in a `RubyLLM::ConfigurationError`.
+You only need to set configuration options you need and the API keys for the providers you actually plan to use. Attempting to use an unconfigured provider will result in a `RubyLLM::ConfigurationError`.
 
 ## Provider API Keys
 
@@ -122,10 +127,6 @@ end
 This setting redirects requests made with `provider: :openai` to your specified base URL. See the [Working with Models Guide]({% link guides/models.md %}#connecting-to-custom-endpoints--using-unlisted-models) for more details on using custom models with this setting.
 
 ## Optional OpenAI Headers
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 OpenAI supports additional headers for organization and project management:
 
@@ -157,11 +158,6 @@ Fine-tune how RubyLLM handles HTTP connections and retries.
 Adjust these based on network conditions and provider reliability.
 
 ## Logging Settings
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
-
 RubyLLM provides flexible logging configuration to help you monitor and debug API interactions. You can configure both the log file location and the logging level.
 
 ```ruby
@@ -186,10 +182,6 @@ end
 You can also set the debug level by setting the `RUBYLLM_DEBUG` environment variable to `true`.
 
 ## Scoped Configuration with Contexts
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 While `RubyLLM.configure` sets global defaults, `RubyLLM.context` allows you to create temporary, isolated configuration scopes for specific API calls. This is ideal for situations requiring different keys, endpoints, or timeouts temporarily without affecting the rest of the application.
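
A minimal sketch of the scoped-configuration pattern the last hunk describes, assuming `RubyLLM.context` yields a copy of the global config and returns a context object exposing its own `chat`:

```ruby
# Sketch only; the exact context API shape is assumed from the prose above.
context = RubyLLM.context do |config|
  config.openai_api_key = ENV['OPENAI_STAGING_KEY'] # temporary key for this scope
  config.request_timeout = 30                       # tighter timeout, this scope only
end

context.chat.ask "Quick staging check" # uses the scoped settings
RubyLLM.chat.ask "Regular request"     # global configuration is untouched
```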

docs/guides/chat.md

Lines changed: 52 additions & 10 deletions
@@ -119,7 +119,7 @@ RubyLLM manages a registry of known models and their capabilities. For detailed
 
 ## Multi-modal Conversations
 
-Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, audio, and even PDFs in your chat messages using the `with:` option in the `ask` method.
+Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, audio, text files, and PDFs in your chat messages using the `with:` option in the `ask` method.
 
 ### Working with Images
 
@@ -130,17 +130,15 @@ Provide image paths or URLs to vision-capable models (like `gpt-4o`, `claude-3-o
 chat = RubyLLM.chat(model: 'gpt-4o')
 
 # Ask about a local image file
-response = chat.ask "Describe this logo.", with: { image: "path/to/ruby_logo.png" }
+response = chat.ask "Describe this logo.", with: "path/to/ruby_logo.png"
 puts response.content
 
 # Ask about an image from a URL
-response = chat.ask "What kind of architecture is shown here?", with: { image: "https://example.com/eiffel_tower.jpg" }
+response = chat.ask "What kind of architecture is shown here?", with: "https://example.com/eiffel_tower.jpg"
 puts response.content
 
 # Send multiple images
-response = chat.ask "Compare the user interfaces in these two screenshots.", with: {
-  image: ["screenshot_v1.png", "screenshot_v2.png"]
-}
+response = chat.ask "Compare the user interfaces in these two screenshots.", with: ["screenshot_v1.png", "screenshot_v2.png"]
 puts response.content
 ```
 
@@ -154,14 +152,30 @@ Provide audio file paths to audio-capable models (like `gpt-4o-audio-preview`).
 chat = RubyLLM.chat(model: 'gpt-4o-audio-preview') # Use an audio-capable model
 
 # Transcribe or ask questions about audio content
-response = chat.ask "Please transcribe this meeting recording.", with: { audio: "path/to/meeting.mp3" }
+response = chat.ask "Please transcribe this meeting recording.", with: "path/to/meeting.mp3"
 puts response.content
 
 # Ask follow-up questions based on the audio context
 response = chat.ask "What were the main action items discussed?"
 puts response.content
 ```
 
+### Working with Text Files
+
+Provide text file paths to models that support document analysis.
+
+```ruby
+chat = RubyLLM.chat(model: 'claude-3-5-sonnet')
+
+# Analyze a text file
+response = chat.ask "Summarize the key points in this document.", with: "path/to/document.txt"
+puts response.content
+
+# Ask questions about code files
+response = chat.ask "Explain what this Ruby file does.", with: "app/models/user.rb"
+puts response.content
+```
+
 ### Working with PDFs
 
 Provide PDF paths or URLs to models that support document analysis (currently Claude 3+ and Gemini models).
@@ -171,21 +185,49 @@ Provide PDF paths or URLs to models that support document analysis (currently Cl
 chat = RubyLLM.chat(model: 'claude-3-7-sonnet')
 
 # Ask about a local PDF
-response = chat.ask "Summarize the key findings in this research paper.", with: { pdf: "path/to/paper.pdf" }
+response = chat.ask "Summarize the key findings in this research paper.", with: "path/to/paper.pdf"
 puts response.content
 
 # Ask about a PDF via URL
-response = chat.ask "What are the terms and conditions outlined here?", with: { pdf: "https://example.com/terms.pdf" }
+response = chat.ask "What are the terms and conditions outlined here?", with: "https://example.com/terms.pdf"
 puts response.content
 
 # Combine text and PDF context
-response = chat.ask "Based on section 3 of this document, what is the warranty period?", with: { pdf: "manual.pdf" }
+response = chat.ask "Based on section 3 of this document, what is the warranty period?", with: "manual.pdf"
 puts response.content
 ```
 
 {: .note }
 **PDF Limitations:** Be mindful of provider-specific limits. For example, Anthropic Claude models currently have a 10MB per-file size limit, and the total size/token count of all PDFs must fit within the model's context window (e.g., 200,000 tokens for Claude 3 models).
 
+### Simplified Attachment API
+
+RubyLLM automatically detects file types based on extensions and content, so you can pass files directly without specifying the type:
+
+```ruby
+chat = RubyLLM.chat(model: 'claude-3-5-sonnet')
+
+# Single file - type automatically detected
+response = chat.ask "What's in this file?", with: "path/to/document.pdf"
+
+# Multiple files of different types
+response = chat.ask "Analyze these files", with: [
+  "diagram.png",
+  "report.pdf",
+  "meeting_notes.txt",
+  "recording.mp3"
+]
+
+# Still works with the explicit hash format if needed
+response = chat.ask "What's in this image?", with: { image: "photo.jpg" }
+```
+
+**Supported file types:**
+- **Images:** .jpg, .jpeg, .png, .gif, .webp, .bmp
+- **Audio:** .mp3, .wav, .m4a, .ogg, .flac
+- **Documents:** .pdf, .txt, .md, .csv, .json, .xml
+- **Code:** .rb, .py, .js, .html, .css (and many others)
+
 ## Controlling Creativity: Temperature
 
 The `temperature` setting influences the randomness and creativity of the AI's responses. A higher value (e.g., 0.9) leads to more varied and potentially surprising outputs, while a lower value (e.g., 0.1) makes the responses more focused, deterministic, and predictable. The default is generally around 0.7.
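
To make the temperature trade-off concrete, a short sketch assuming the chainable `with_temperature` setter used elsewhere in these guides:

```ruby
chat = RubyLLM.chat(model: 'gpt-4o')

# Low temperature: focused, near-deterministic output (extraction, classification)
chat.with_temperature(0.1).ask "List the HTTP methods defined in RFC 7231."

# High temperature: more varied, creative output (brainstorming, storytelling)
chat.with_temperature(0.9).ask "Pitch three whimsical names for a Ruby gem."
```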

docs/guides/embeddings.md

Lines changed: 8 additions & 4 deletions
@@ -80,6 +80,14 @@ embedding_google = RubyLLM.embed(
   "This is another test sentence",
   model: "text-embedding-004" # Google's model
 )
+
+# Use a model not in the registry (useful for custom endpoints)
+embedding_custom = RubyLLM.embed(
+  "Custom model test",
+  model: "my-custom-embedding-model",
+  provider: :openai,
+  assume_model_exists: true
+)
 ```
 
 You can configure the default embedding model globally:
@@ -93,10 +101,6 @@ end
 Refer to the [Working with Models Guide]({% link guides/models.md %}) for details on finding available embedding models and their capabilities.
 
 ## Choosing Dimensions
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 Each embedding model has its own default output dimensions. For example, OpenAI's `text-embedding-3-small` outputs 1536 dimensions by default, while `text-embedding-3-large` outputs 3072 dimensions. RubyLLM allows you to specify these dimensions per request:
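
A sketch of the per-request override that section describes; the `dimensions:` keyword and the `vectors` accessor are assumptions based on the surrounding docs:

```ruby
# Request a smaller vector than the model's 1536-dimension default (sketch).
embedding = RubyLLM.embed(
  "Vector search test",
  model: "text-embedding-3-small",
  dimensions: 512
)
puts embedding.vectors.length # => 512, if the provider honors the override
```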

docs/guides/image-generation.md

Lines changed: 8 additions & 0 deletions
@@ -77,6 +77,14 @@ image_imagen = RubyLLM.paint(
   "Cyberpunk city street at night, raining, neon signs",
   model: "imagen-3.0-generate-002"
 )
+
+# Use a model not in the registry (useful for custom endpoints)
+image_custom = RubyLLM.paint(
+  "A sunset over mountains",
+  model: "my-custom-image-model",
+  provider: :openai,
+  assume_model_exists: true
+)
 ```
 
 You can configure the default model globally:
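
For completeness, a hedged sketch of consuming the result of `RubyLLM.paint`; the `url` and `save` accessors are assumptions:

```ruby
image = RubyLLM.paint("A sunset over mountains")
puts image.url           # remote URL, when the provider returns one
image.save("sunset.png") # persist the generated image locally (assumed helper)
```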

docs/guides/models.md

Lines changed: 20 additions & 0 deletions
@@ -191,6 +191,26 @@ chat.with_model(
 )
 ```
 
+The `assume_model_exists` flag also works with `RubyLLM.embed` and `RubyLLM.paint` for embedding and image generation models:
+
+```ruby
+# Custom embedding model
+embedding = RubyLLM.embed(
+  "Test text",
+  model: 'my-custom-embedder',
+  provider: :openai,
+  assume_model_exists: true
+)
+
+# Custom image model
+image = RubyLLM.paint(
+  "A beautiful landscape",
+  model: 'my-custom-dalle',
+  provider: :openai,
+  assume_model_exists: true
+)
+```
+
 **Key Points when Assuming Existence:**
 
 * **`provider:` is Mandatory:** You must tell RubyLLM which API format to use (`ArgumentError` otherwise).
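
A sketch of the failure mode that first key point describes; the exact error message is an assumption:

```ruby
# Omitting provider: while assuming the model exists raises, per the rule above.
RubyLLM.chat(model: 'my-custom-model', assume_model_exists: true)
# => ArgumentError (provider is required when assume_model_exists is true)
```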

docs/guides/rails.md

Lines changed: 3 additions & 15 deletions
@@ -50,7 +50,7 @@ This two-phase approach (create empty → update with content) is intentional an
 
 1. **Streaming-first design**: By creating the message record before the API call, your UI can immediately show a "thinking" state and have a DOM target ready for incoming chunks.
 2. **Turbo Streams compatibility**: Works perfectly with `after_create_commit { broadcast_append_to... }` for real-time updates.
-3. **Clean rollback on failure**: If the API call fails, the empty message is automatically removed.
+3. **Clean rollback on failure**: If the API call fails, the empty assistant message is automatically removed, preventing orphaned records that could cause issues with providers like Gemini that reject empty messages.
 
 ### Content Validation Implications
 
@@ -118,10 +118,6 @@ end
 Run the migrations: `rails db:migrate`
 
 ### ActiveStorage Setup for Attachments (Optional)
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 If you want to use attachments (images, audio, PDFs) with your AI chats, you need to set up ActiveStorage:
 
@@ -272,10 +268,6 @@ puts chat_record.messages.count # => 3 (user, assistant's tool call, tool result
 ```
 
 ### Working with Attachments
-{: .d-inline-block }
-
-Coming in v1.3.0
-{: .label .label-yellow }
 
 If you've set up ActiveStorage as described above, you can easily send attachments to AI models with automatic type detection:
 
@@ -290,22 +282,18 @@ chat_record.ask("What's in this file?", with: "app/assets/images/diagram.png")
 chat_record.ask("What are in these files?", with: [
   "app/assets/documents/report.pdf",
   "app/assets/images/chart.jpg",
+  "app/assets/text/notes.txt",
   "app/assets/audio/recording.mp3"
 ])
 
-# Still works with manually categorized hash (backward compatible)
-chat_record.ask("What's in this image?", with: {
-  image: "app/assets/images/diagram.png"
-})
-
 # Works with file uploads from forms
 chat_record.ask("Analyze this file", with: params[:uploaded_file])
 
 # Works with existing ActiveStorage attachments
 chat_record.ask("What's in this document?", with: user.profile_document)
 ```
 
-The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, audio file, or PDF - RubyLLM figures it out for you!
+The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, audio file, PDF, or text document - RubyLLM figures it out for you!
 
 ## Handling Persistence Edge Cases
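
The streaming-first rationale in the first rails.md hunk pairs naturally with a broadcast callback; below is a sketch with hypothetical model and stream names:

```ruby
# app/models/message.rb -- hypothetical wiring for the pattern above.
class Message < ApplicationRecord
  acts_as_message # ruby_llm's Rails integration

  # The empty assistant record is broadcast immediately, giving streamed
  # chunks a DOM target before any content arrives.
  after_create_commit -> { broadcast_append_to chat, target: "messages" }
end
```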

docs/index.md

Lines changed: 12 additions & 8 deletions
@@ -72,14 +72,14 @@ RubyLLM fixes all that. One beautiful API for everything. One consistent format.
 chat = RubyLLM.chat
 chat.ask "What's the best way to learn Ruby?"
 
-# Analyze images
-chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }
+# Analyze images, audio, documents, and text files
+chat.ask "What's in this image?", with: "ruby_conf.jpg"
+chat.ask "Describe this meeting", with: "meeting.wav"
+chat.ask "Summarize this document", with: "contract.pdf"
+chat.ask "Explain this code", with: "app.rb"
 
-# Analyze audio recordings
-chat.ask "Describe this meeting", with: { audio: "meeting.wav" }
-
-# Analyze documents
-chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+# Multiple files at once - types automatically detected
+chat.ask "Analyze these files", with: ["diagram.png", "report.pdf", "notes.txt"]
 
 # Stream responses in real-time
 chat.ask "Tell me a story about a Ruby programmer" do |chunk|
@@ -116,7 +116,7 @@ chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
 * 💬 **Unified Chat:** Converse with models from OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama, or any OpenAI-compatible API using `RubyLLM.chat`.
 * 👁️ **Vision:** Analyze images within chats.
 * 🔊 **Audio:** Transcribe and understand audio content.
-* 📄 **PDF Analysis:** Extract information and summarize PDF documents.
+* 📄 **Document Analysis:** Extract information from PDFs, text files, and other documents.
 * 🖼️ **Image Generation:** Create images with `RubyLLM.paint`.
 * 📊 **Embeddings:** Generate text embeddings for vector search with `RubyLLM.embed`.
 * 🔧 **Tools (Function Calling):** Let AI models call your Ruby code using `RubyLLM::Tool`.
@@ -169,5 +169,9 @@ end
 # Now interacting with a Chat record persists the conversation:
 chat_record = Chat.create!(model_id: "gpt-4.1-nano")
 chat_record.ask("Explain Active Record callbacks.") # User & Assistant messages saved
+
+# Works seamlessly with file attachments - types automatically detected
+chat_record.ask("What's in this file?", with: "report.pdf")
+chat_record.ask("Analyze these", with: ["image.jpg", "data.csv", "notes.txt"])
 ```
 Check the [Rails Integration Guide](https://rubyllm.com/guides/rails) for more.
