
@zhixiangxue

Problem

Different embedding API providers have different batch size limits:

  • OpenAI: no strict per-request limit documented; this PR uses 25 items per batch as a safe default
  • Bailian/DashScope: max 10 items per batch (strict limit)
  • Other providers may have different limits

Currently, MemU processes all embeddings in a single batch, which causes errors when the batch size exceeds the provider's limit.

Solution

This PR adds a configurable batch_size parameter to handle provider-specific limits:

  1. Added batch_size to EmbeddingConfig

    • Default: 25 (suitable for OpenAI)
    • Users can configure it based on their provider (e.g., 10 for Bailian)
  2. Implemented batch processing in OpenAIEmbeddingSDKClient

    • Automatically splits large input lists into smaller batches (see the sketch after this list)
    • Optimized: skips batching when input size <= batch_size
  3. Updated service initialization

    • Passes batch_size from config to embedding client
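
A minimal sketch of the batching logic, assuming the OpenAI Python SDK; the class and method names follow the PR, but the constructor signature and the _embed_batch helper are illustrative, not the exact MemU code:

from openai import OpenAI

class OpenAIEmbeddingSDKClient:  # simplified sketch, not the full MemU class
    def __init__(self, api_key: str, base_url: str, embed_model: str, batch_size: int = 25):
        self._client = OpenAI(api_key=api_key, base_url=base_url)
        self.embed_model = embed_model
        self.batch_size = batch_size

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Fast path: skip splitting when the input already fits in one batch
        if len(texts) <= self.batch_size:
            return self._embed_batch(texts)
        # Otherwise split into chunks of at most batch_size and concatenate the results
        embeddings: list[list[float]] = []
        for start in range(0, len(texts), self.batch_size):
            embeddings.extend(self._embed_batch(texts[start:start + self.batch_size]))
        return embeddings

    def _embed_batch(self, texts: list[str]) -> list[list[float]]:
        # One API call per batch; the response preserves input order
        response = self._client.embeddings.create(model=self.embed_model, input=texts)
        return [item.embedding for item in response.data]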

Changes

  • src/memu/app/settings.py: Add batch_size field to EmbeddingConfig (see the config sketch after this list)
  • src/memu/embedding/openai_sdk.py:
    • Add batch_size parameter to __init__
    • Implement batch processing in embed() method
  • src/memu/app/service.py: Pass batch_size to embedding client
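
The settings and wiring changes might look roughly like this; a sketch assuming EmbeddingConfig is a Pydantic model (the field names mirror the embedding_config keys in the usage example below, but the exact MemU definitions may differ):

from pydantic import BaseModel

class EmbeddingConfig(BaseModel):
    base_url: str
    api_key: str
    embed_model: str
    batch_size: int = 25  # default suits OpenAI; lower it for stricter providers such as Bailian

# In service initialization, the configured value is forwarded to the client, e.g.:
# client = OpenAIEmbeddingSDKClient(..., batch_size=config.batch_size)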

Example Usage

from memu.app import MemoryService

# For Bailian/DashScope (max 10 per batch)
embedding_config = {
    "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "api_key": "YOUR_KEY",
    "embed_model": "text-embedding-v3",
    "batch_size": 10  # Configure batch size
}

service = MemoryService(embedding_config=embedding_config)
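
Given the splitting described above, a call that embeds 25 texts with batch_size set to 10 would be sent as three requests (10 + 10 + 5) instead of one oversized call.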
