90 changes: 90 additions & 0 deletions docs/batch_requests.md
@@ -0,0 +1,90 @@
# Batch Request Feature

The batch request feature allows you to generate API request payloads without actually making API calls. This is useful for:

1. **Batch Processing**: Generate multiple request payloads and send them to provider batch endpoints
2. **Testing**: Verify request payload structure without making API calls
3. **Debugging**: Inspect the exact payload that would be sent to the provider

## Basic Usage

```ruby
# Enable batch request mode
chat = RubyLLM.chat.for_batch_request
chat.add_message(role: :user, content: "What's 2 + 2?")

# Returns the request payload instead of making an API call
payload = chat.complete
# => {:custom_id=>"...", :method=>"POST", :url=>"/v1/chat/completions", :body=>{...}}
```

## Generating Multiple Batch Requests

```ruby
requests = []

3.times do |i|
chat = RubyLLM.chat.for_batch_request
chat.add_message(role: :user, content: "Question #{i + 1}")

requests << chat.complete
end

# Now you have an array of request payloads
# You can format them as JSONL and send to provider batch endpoints
```
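
To prepare these payloads for upload, you can serialize them as JSONL (one JSON object per line). A minimal sketch, assuming each payload is a plain Hash as returned by `complete`:

```ruby
require 'json'

# Write one request per line -- the JSONL format expected by provider batch endpoints
File.open('batch_input.jsonl', 'w') do |file|
  requests.each { |request| file.puts(request.to_json) }
end
```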

## Provider Support

Currently, only OpenAI supports batch requests. Other providers will raise `NotImplementedError`:

```ruby
# OpenAI (supported)
chat = RubyLLM.chat(provider: :openai).for_batch_request
chat.add_message(role: :user, content: "Hello")
payload = chat.complete
# => {
# :custom_id=>"request-abc123",
# :method=>"POST",
# :url=>"/v1/chat/completions",
# :body=>{:model=>"gpt-4", :messages=>[...]}
# }

# Other providers (not supported)
chat = RubyLLM.chat(provider: :anthropic).for_batch_request
chat.add_message(role: :user, content: "Hello")
chat.complete # Raises NotImplementedError
```
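
If you want to detect support at runtime rather than let the error propagate, one option is to rescue it (a small sketch, not part of the library API):

```ruby
begin
  payload = chat.complete
rescue NotImplementedError => e
  warn "Batch requests are not available for this provider: #{e.message}"
  payload = nil
end
```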

## Usage with Other Methods

The `for_batch_request` method chains with other configuration methods:

```ruby
chat = RubyLLM.chat
.with_model('gpt-4')
.with_temperature(0.7)
.with_tool(MyTool)
.for_batch_request

chat.add_message(role: :user, content: "Process this")
payload = chat.complete # Returns the batch request payload
```

## Notes

- Streaming is not supported when in batch request mode
- The batch request payload includes all configured parameters (tools, schema, temperature, etc.)
- No messages are added to the chat history when generating batch request payloads
- Providers must explicitly implement `render_payload_for_batch_request` to support this feature
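
For reference, an override uses the same signature as the base method. Below is a hypothetical sketch for a custom provider; the class name, `custom_id` format, and `url` are illustrative only, and it assumes the provider already defines `render_payload` the way the built-in providers do:

```ruby
require 'securerandom'

class MyProvider < RubyLLM::Provider
  # Hypothetical override: wrap the normal chat payload in a batch-style envelope.
  # Relies on this provider's own render_payload for the request body.
  def render_payload_for_batch_request(messages, tools:, temperature:, model:, params: {}, schema: nil)
    {
      custom_id: "request-#{SecureRandom.uuid}",
      method: 'POST',
      url: '/v1/chat/completions',
      body: render_payload(messages, tools: tools, temperature: temperature,
                           model: model, stream: false, schema: schema)
    }
  end
end
```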

## Future Enhancements

This feature covers step one of the workflow. The full batch processing flow, with the remaining steps left to users, is:

1. Generate request payloads with `for_batch_request` (this feature)
2. Combine multiple request payloads into a single file (typically JSONL)
3. Submit the file to the provider's batch endpoint
4. Poll for batch completion status
5. Process the batch results

Steps 2–5 are provider-specific and can be implemented based on your needs.
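
As one illustration, here is an untested sketch of the remaining steps against OpenAI's Files and Batches endpoints (`/v1/files`, `/v1/batches`), using the `faraday` and `faraday-multipart` gems; endpoint paths and fields follow OpenAI's batch documentation, but verify them before relying on this:

```ruby
require 'faraday'
require 'faraday/multipart'
require 'json'

conn = Faraday.new(url: 'https://api.openai.com',
                   headers: { 'Authorization' => "Bearer #{ENV.fetch('OPENAI_API_KEY')}" }) do |f|
  f.request :multipart
end

# Step 3: submit -- upload the combined JSONL file, then create a batch job for it
upload = conn.post('/v1/files') do |req|
  req.body = {
    purpose: 'batch',
    file: Faraday::Multipart::FilePart.new('batch_input.jsonl', 'application/jsonl')
  }
end
file_id = JSON.parse(upload.body)['id']

batch = conn.post('/v1/batches') do |req|
  req.headers['Content-Type'] = 'application/json'
  req.body = JSON.generate(
    input_file_id: file_id,
    endpoint: '/v1/chat/completions',
    completion_window: '24h'
  )
end
batch_id = JSON.parse(batch.body)['id']

# Step 4: poll until the batch reaches a terminal state
batch_status = nil
loop do
  batch_status = JSON.parse(conn.get("/v1/batches/#{batch_id}").body)
  break if %w[completed failed expired cancelled].include?(batch_status['status'])
  sleep 30
end

# Step 5: download and process the results (one JSON object per line)
if batch_status['status'] == 'completed'
  output = conn.get("/v1/files/#{batch_status['output_file_id']}/content").body
  output.each_line { |line| puts JSON.parse(line)['custom_id'] }
end
```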
20 changes: 20 additions & 0 deletions lib/ruby_llm/chat.rb
@@ -28,6 +28,7 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n
@params = {}
@headers = {}
@schema = nil
@batch_request = false
@on = {
new_message: nil,
end_message: nil,
@@ -111,6 +112,11 @@ def with_schema(schema, force: false)
self
end

def for_batch_request
@batch_request = true
self
end

def on_new_message(&block)
@on[:new_message] = block
self
@@ -136,6 +142,20 @@ def each(&)
end

def complete(&) # rubocop:disable Metrics/PerceivedComplexity
# If batch_request mode is enabled, render and return the payload
if @batch_request
raise ArgumentError, 'Streaming is not supported for batch requests' if block_given?

return @provider.render_payload_for_batch_request(
messages,
tools: @tools,
temperature: @temperature,
model: @model.id,
params: @params,
schema: @schema
)
end

response = @provider.complete(
messages,
tools: @tools,
5 changes: 5 additions & 0 deletions lib/ruby_llm/provider.rb
@@ -62,6 +62,11 @@ def complete(messages, tools:, temperature:, model:, params: {}, headers: {}, sc
end
end

def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, "#{self.class.name} does not support batch requests. " \
'Provider must implement render_payload_for_batch_request to enable batch request generation.'
end

def list_models
response = @connection.get models_url
parse_list_models_response response, slug, capabilities
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/deepseek.rb
@@ -16,6 +16,12 @@ def headers
}
end

# DeepSeek doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'DeepSeek does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def capabilities
DeepSeek::Capabilities
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/gpustack.rb
@@ -19,6 +19,12 @@ def headers
}
end

# GPUStack doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'GPUStack does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def local?
true
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/mistral.rb
@@ -18,6 +18,12 @@ def headers
}
end

# Mistral doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'Mistral does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def capabilities
Mistral::Capabilities
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/ollama.rb
@@ -15,6 +15,12 @@ def headers
{}
end

# Ollama doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'Ollama does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def configuration_requirements
%i[ollama_api_base]
26 changes: 26 additions & 0 deletions lib/ruby_llm/providers/openai.rb
@@ -30,6 +30,32 @@ def maybe_normalize_temperature(temperature, model_id)
OpenAI::Capabilities.normalize_temperature(temperature, model_id)
end

# Override to format payload according to OpenAI's batch request API
# https://platform.openai.com/docs/guides/batch
def render_payload_for_batch_request(messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
normalized_temperature = maybe_normalize_temperature(temperature, model)

payload = Utils.deep_merge(
params,
render_payload(
messages,
tools: tools,
temperature: normalized_temperature,
model: model,
stream: false,
schema: schema
)
)

# Format according to OpenAI's batch request API
{
custom_id: "request-#{SecureRandom.uuid}",
method: 'POST',
url: '/v1/chat/completions',
body: payload
}
end

class << self
def capabilities
OpenAI::Capabilities
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/openrouter.rb
@@ -16,6 +16,12 @@ def headers
}
end

# OpenRouter doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'OpenRouter does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def configuration_requirements
%i[openrouter_api_key]
6 changes: 6 additions & 0 deletions lib/ruby_llm/providers/perplexity.rb
@@ -18,6 +18,12 @@ def headers
}
end

# Perplexity doesn't support batch requests yet
def render_payload_for_batch_request(_messages, tools:, temperature:, model:, params: {}, schema: nil) # rubocop:disable Metrics/ParameterLists
raise NotImplementedError, 'Perplexity does not support batch requests. ' \
'Batch request generation is not available for this provider.'
end

class << self
def capabilities
Perplexity::Capabilities
