
Fix #2753: Handle large inputs in memory by chunking text before embedding #2754


Open · wants to merge 2 commits into main

Conversation

devin-ai-integration[bot]
Contributor

Handle large inputs in memory by chunking text before embedding

Problem

When memory=True is enabled and a large input is provided, the system crashes with a token limit error from the embedding model. This happens because large inputs aren't being chunked or truncated before being passed to the embedding model.

Solution

  • Added constants for chunk size and overlap in utilities/constants.py
  • Implemented a _chunk_text method in RAGStorage to split large texts into smaller chunks (a sketch of the approach follows this list)
  • Modified _generate_embedding to handle chunking and add each chunk to the collection
  • Added a test to verify the fix works with large inputs
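
For context, here is a minimal sketch of the chunking approach described above. The constant values and the list-comprehension form are illustrative assumptions; the actual implementation in rag_storage.py may differ.

    MEMORY_CHUNK_SIZE = 4000     # max characters per chunk (value cited in the review below)
    MEMORY_CHUNK_OVERLAP = 200   # illustrative overlap between consecutive chunks

    def _chunk_text(text: str) -> list[str]:
        """Split text into overlapping chunks that each stay under the embedding limit."""
        if len(text) <= MEMORY_CHUNK_SIZE:
            return [text]
        step = MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP
        return [text[i : i + MEMORY_CHUNK_SIZE] for i in range(0, len(text), step)]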

Testing

  • Added a new test file large_input_memory_test.py to test memory with large inputs
  • Verified that all existing tests still pass

Link to Devin run

https://app.devin.ai/sessions/472b1317d1074353b6a4dedc629755b8

Requested by: Joe Moura ([email protected])

Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@joaomdmoura
Collaborator

Disclaimer: This review was made by a crew of AI Agents.

Code Review Comment for PR #2754

Overview

This pull request effectively addresses the issue of managing large text inputs within the RAG storage system by implementing a chunking mechanism. This improvement aids in handling memory limitations and prevents token limit errors during data processing. The PR introduces changes across three significant files and includes comprehensive test coverage.

Code Quality Findings and Suggestions

1. src/crewai/memory/storage/rag_storage.py

  • Positive Aspects:

    • The introduction of the _chunk_text method allows the system to handle large text inputs effectively, enhancing overall stability.
    • The implementation employs good error handling practices, including logging, which will aid in debugging.
  • Specific Improvements:

    • Method Documentation: The documentation for _chunk_text should detail the parameters and return types more explicitly. Example:

      def _chunk_text(self, text: str) -> List[str]:
          """
          Split text into chunks to avoid token limits.

          Args:
              text: Input text to chunk.

          Returns:
              List[str]: A list of chunked text segments, adhering to defined size and overlap.
          """
        
    • Type Hints Enhancement: Consider adding explicit parameter and return type hints, particularly in _generate_embedding (see the sketch after the next item).

    • Chunk Processing Optimization: Compute chunk start offsets lazily with range rather than materializing intermediate index lists, to minimize performance overhead. Example:

      start_indices = range(0, len(text), MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP)
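
    • Combining the two suggestions above, a lazily evaluated, fully type-hinted chunk generator might look like the following (iter_chunks is an illustrative free function, not code from the PR; it assumes size > overlap):

      from typing import Iterator

      def iter_chunks(text: str, size: int, overlap: int) -> Iterator[str]:
          # Yield overlapping chunks without materializing an index list.
          for start in range(0, len(text), size - overlap):
              yield text[start : start + size]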

2. src/crewai/utilities/constants.py

  • Suggestions for Improvement:
    • Provide clear documentation for constants such as MEMORY_CHUNK_SIZE and MEMORY_CHUNK_OVERLAP. For instance:
      # Maximum size for each text chunk in characters
      MEMORY_CHUNK_SIZE = 4000
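      # Number of characters shared between consecutive chunks; the value below
      # is illustrative, not necessarily the one this PR uses
      MEMORY_CHUNK_OVERLAP = 200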

3. tests/memory/large_input_memory_test.py

  • Positive Aspects:

    • The newly created tests offer strong coverage for large input handling, validating the new chunking functionality.
  • Suggestions for Improvement:

    • Add Edge Case Tests: Implement tests for edge cases, such as handling empty strings or inputs that match the chunk size exactly. For example:
      def test_empty_input(short_term_memory):
          short_term_memory.save(value="", agent="test_agent")
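      # Hypothetical companion case: an input of exactly MEMORY_CHUNK_SIZE characters
      # should still save cleanly as a single chunk (assumes the same fixture and
      # that MEMORY_CHUNK_SIZE is importable from crewai.utilities.constants)
      def test_exact_chunk_size_input(short_term_memory):
          text = "a" * MEMORY_CHUNK_SIZE
          short_term_memory.save(value=text, agent="test_agent")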

Historical Context and Related Findings

In reviewing related pull requests, there has been a recurrent focus on improving input handling and error management. Previous discussions highlighted the need for better documentation and robust testing for newly implemented features, a trend that this PR continues.

General Recommendations

  1. Performance Monitoring:

    • Integrate logging for processing times and chunk sizes to better assess performance during heavy loads.
  2. Memory Management:

    • Implement limits on the number of processed chunks to prevent memory overflow and optimize resource allocation.
  3. Error Handling:

    • Enhance error handling throughout the chunking and embedding process to provide detailed logs for failures.
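
As a concrete illustration of the first recommendation, a timing wrapper around the existing save path could emit the suggested logs. This is a minimal sketch using the standard logging and time modules; timed_save is a hypothetical helper, and the save signature is taken from the test example above.

    import logging
    import time

    logger = logging.getLogger(__name__)

    def timed_save(memory, value: str, agent: str) -> None:
        # Log input size and elapsed time so heavy chunking workloads are visible.
        start = time.perf_counter()
        memory.save(value=value, agent=agent)
        logger.info("Saved %d chars to memory in %.3f s", len(value), time.perf_counter() - start)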

Conclusion

The changes presented in this PR establish a solid foundation for managing large text inputs while maintaining system performance. By addressing the outlined improvements, particularly enhancing documentation and testing coverage, the code can achieve greater clarity and robustness, making it better suited for future development needs.

Overall, this PR is a significant contribution to the project, effectively tackling the core issue at hand while promoting a maintainable and scalable codebase.
