
Commit f33e5d8

finbarr, claude, and crmne authored
Fix embedding return format inconsistency for single-string arrays (#267)
## Summary

- Fixes inconsistent return format when passing an array with a single string to the embed method
- Ensures `RubyLLM.embed([string])` returns `[[vector]]` instead of `[vector]`
- Maintains backward compatibility for single string inputs

## Problem

As reported in #254, the embedding API had inconsistent behavior:

```ruby
RubyLLM.embed(string)           -> [vector]             ✓
RubyLLM.embed([string, string]) -> [[vector], [vector]] ✓
RubyLLM.embed([string])         -> [vector]             ✗ (should be [[vector]])
```

## Solution

- Modified the base Provider class to pass the original `text` parameter to `parse_embedding_response`
- Updated OpenAI and Gemini providers to check if input was an array before unwrapping single embeddings
- Added comprehensive test coverage for this edge case

## Test plan

- [x] Added new test case specifically for single-string arrays
- [x] All existing tests pass
- [x] Verified fix works for both OpenAI and Gemini providers

Fixes #254

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <[email protected]>
Co-authored-by: Carmine Paolino <[email protected]>
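
The rule described in the Solution section can be sketched in isolation. This is a minimal illustration, not the library's code: `unwrap_vectors` is a hypothetical helper, and the float arrays are invented sample data standing in for real embeddings.

```ruby
# Hypothetical helper (not part of RubyLLM) mirroring the rule in this commit:
# unwrap only when exactly one embedding came back AND the caller passed a
# bare string rather than an array.
def unwrap_vectors(vectors, text)
  return vectors.first if vectors.length == 1 && !text.is_a?(Array)

  vectors
end

v1 = [0.1, 0.2] # invented sample embedding
v2 = [0.3, 0.4] # invented sample embedding

single_string = unwrap_vectors([v1], 'hello')        # one flat vector
single_array  = unwrap_vectors([v1], ['hello'])      # list holding one vector
two_strings   = unwrap_vectors([v1, v2], %w[a b])    # list holding two vectors
```

With this rule the shape of the result depends only on the shape of the input, so a caller that always passes an array can always iterate over the result.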
1 parent 78482e5 commit f33e5d8


6 files changed, with 2479 additions and 6 deletions


lib/ruby_llm/provider.rb

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ def list_models(connection:)
       def embed(text, model:, connection:, dimensions:)
         payload = render_embedding_payload(text, model:, dimensions:)
         response = connection.post(embedding_url(model:), payload)
-        parse_embedding_response(response, model:)
+        parse_embedding_response(response, model:, text:)
       end
 
       def paint(prompt, model:, size:, connection:)

lib/ruby_llm/providers/gemini/embeddings.rb

Lines changed: 4 additions & 2 deletions
@@ -15,9 +15,11 @@ def render_embedding_payload(text, model:, dimensions:)
           { requests: [text].flatten.map { |t| single_embedding_payload(t, model:, dimensions:) } }
         end
 
-        def parse_embedding_response(response, model:)
+        def parse_embedding_response(response, model:, text:)
           vectors = response.body['embeddings']&.map { |e| e['values'] }
-          vectors in [vectors]
+          # If we only got one embedding AND the input was a single string (not an array),
+          # return it as a single vector
+          vectors = vectors.first if vectors&.length == 1 && !text.is_a?(Array)
 
           Embedding.new(vectors:, model:, input_tokens: 0)
         end
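
The Gemini version guards with `vectors&.length` because `vectors` may be `nil` when the response body has no `'embeddings'` key. The sketch below shows how that guard behaves; `extract_vectors` is a hypothetical stand-in for the parsing step, and the body hashes are invented sample data rather than real API responses.

```ruby
# Hypothetical stand-in for the parsing step shown in the diff above.
def extract_vectors(body, text)
  vectors = body['embeddings']&.map { |e| e['values'] }
  # When the key is missing, vectors is nil, vectors&.length is nil,
  # and nil == 1 is false, so nothing is unwrapped and no error is raised.
  vectors = vectors.first if vectors&.length == 1 && !text.is_a?(Array)
  vectors
end

body = { 'embeddings' => [{ 'values' => [0.1, 0.2] }] } # invented sample body

from_string = extract_vectors(body, 'hello')    # flat vector
from_array  = extract_vectors(body, ['hello'])  # nested vector
from_empty  = extract_vectors({}, 'hello')      # nil, no exception
```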

lib/ruby_llm/providers/openai/embeddings.rb

Lines changed: 4 additions & 3 deletions
@@ -19,13 +19,14 @@ def render_embedding_payload(text, model:, dimensions:)
           }.compact
         end
 
-        def parse_embedding_response(response, model:)
+        def parse_embedding_response(response, model:, text:)
           data = response.body
           input_tokens = data.dig('usage', 'prompt_tokens') || 0
           vectors = data['data'].map { |d| d['embedding'] }
 
-          # If we only got one embedding, return it as a single vector
-          vectors in [vectors]
+          # If we only got one embedding AND the input was a single string (not an array),
+          # return it as a single vector
+          vectors = vectors.first if vectors.length == 1 && !text.is_a?(Array)
 
           Embedding.new(vectors:, model:, input_tokens:)
         end
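
The removed line `vectors in [vectors]` is a one-line pattern match: when `vectors` is an array with exactly one element, it rebinds `vectors` to that element. It never looks at the input, which is why a single-element array was unwrapped. A minimal sketch of that behavior, assuming Ruby 3.0 or newer and using a hypothetical `legacy_unwrap` wrapper with invented sample data:

```ruby
# Hypothetical wrapper around the removed expression (requires Ruby 3.0+).
# The pattern [vectors] matches only a one-element array and, on a match,
# rebinds the local variable to that single element.
def legacy_unwrap(vectors)
  vectors in [vectors]
  vectors
end

one_result  = legacy_unwrap([[0.1, 0.2]])     # unwrapped, whatever the input was
two_results = legacy_unwrap([[1.0], [2.0]])   # pattern does not match, left as-is
```

Because the old code had no access to the original input, the fix had to add the `text:` keyword to `parse_embedding_response` rather than change the pattern.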
