
@logan-markewich
Collaborator

Adds a raptor pack, implementing the RAPTOR paper

Credits to

@dosubot (bot) added the size:XXL label ("This PR changes 1000+ lines, ignoring generated files.") on Mar 1, 2024

Collaborator

@jerryjliu left a comment

sweet! some minor comments

Contributor

@nerdai left a comment

Generally, the code lgtm!

A couple of questions tho:

  1. I thought it might make sense to implement the data processing step with clustering as a custom TransformComponent that can then be used in an IngestionPipeline. Any benefits in doing this? One that I can think of is just to continue to enforce/promote usage of IngestionPipeline (in cases that fit well with it ofc, and I do think data processing / transforming is its primary raison d'être).

  2. Would it be possible (and not too time-consuming) to reproduce the results of the RAPTOR paper's experiments using this pack implementation?
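
To make question 1 concrete, here is a minimal, dependency-free sketch of the idea. It is not the pack's code and not the LlamaIndex API: `Node`, `ClusterAndSummarize`, `run_pipeline`, and `toy_summarize` are hypothetical names invented for illustration, fixed-size consecutive groups stand in for the embedding-based clustering, and a string join stands in for the LLM summary call. The point is only the shape: a step that takes nodes and returns nodes can be chained with other steps in a pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Node:
    """Hypothetical stand-in for a text node (not the LlamaIndex class)."""

    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


class ClusterAndSummarize:
    """Hypothetical transformation: group nodes, then add one summary node per group.

    Assumption: consecutive fixed-size groups replace real clustering so the
    sketch runs without third-party packages.
    """

    def __init__(
        self,
        summarize: Callable[[Sequence[str]], str],
        group_size: int = 2,
        level: int = 1,
    ) -> None:
        self.summarize = summarize
        self.group_size = group_size
        self.level = level

    def __call__(self, nodes: List[Node]) -> List[Node]:
        # Split the input into consecutive groups of at most group_size nodes.
        groups = [
            nodes[i : i + self.group_size]
            for i in range(0, len(nodes), self.group_size)
        ]
        # Build one parent (summary) node per group.
        summaries = [
            Node(
                text=self.summarize([n.text for n in group]),
                metadata={"level": self.level, "num_children": len(group)},
            )
            for group in groups
        ]
        # Return the original nodes followed by the new summary nodes.
        return list(nodes) + summaries


def run_pipeline(
    nodes: List[Node], transformations: Sequence[Callable[[List[Node]], List[Node]]]
) -> List[Node]:
    """Apply each transformation in order, feeding its output to the next."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


def toy_summarize(texts: Sequence[str]) -> str:
    """Placeholder for an LLM summary call: joins the input texts."""
    return " | ".join(texts)


leaves = [Node(text=f"chunk {i}") for i in range(4)]
result = run_pipeline(leaves, [ClusterAndSummarize(toy_summarize, group_size=2)])
```

With four leaf nodes and groups of two, `result` holds the four leaves plus two summary nodes tagged `level=1`. Stacking a second `ClusterAndSummarize` step with `level=2` would re-group everything returned so far (leaves included), so a faithful tree build would need to pass only the previous level's summaries to the next step.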

@dosubot (bot) added the lgtm label ("This PR has been approved by a maintainer") on Mar 1, 2024
@logan-markewich merged commit d97cec8 into main on Mar 1, 2024
@logan-markewich deleted the logan/raptor branch on March 1, 2024 23:22
bdonkey added a commit to bdonkey/gpt_index that referenced this pull request Mar 5, 2024
* main: (2881 commits)
  Feature: Improve batch embedding generation throughput for Cohere in Bedrock (run-llama#11572)
  tqdm: add tdqm.gather (run-llama#11562)
  Fix URLs in Prompts documentation (run-llama#11571)
  Corrected colab links (run-llama#11577)
  add syntatic sugar to create chat prompt / chat message more easily  (run-llama#11583)
  Fix Issue 11565 - The MilvusVectorStore MetaDataFilters FilterCondition.OR is ignored (run-llama#11566)
  docs: fixes LangfuseCallbackHandler link (run-llama#11576)
  GHA: Add Check for repo source (run-llama#11575)
  add raptor (run-llama#11527)
  Logan/v0.10.15 (run-llama#11551)
  feat: adds langfuse callback handler (run-llama#11324)
  fixed storage context update & service context issue (run-llama#11475)
  Add async capability to OpensearchVectorStore (run-llama#11513)
  Logan/fix publish (run-llama#11549)
  Prevent async_response_gen from Stalling with asyncio Timeout (run-llama#11548)
  VideoDB Integration for Retrievers (run-llama#11463)
  fix import error in CLI (run-llama#11544)
  Updated the simple fusion to handle duplicate nodes (run-llama#11542)
  Add mixedbread reranker cookbook (run-llama#11536)
  Fixed some minor gramatical issues (run-llama#11530)
  ...

4 participants