Vectorize and RAG Ancestor Nodes When Exceeding Token Limit #26
VentureTactics
started this conversation in
Ideas
Replies: 1 comment 1 reply
-
An alternative to vectorization could be summarization: if the context window is exceeded, a summary of the upstream content can be inserted instead. IMO the sweet spot of Chat Stream is conveniently providing context management for chat AI experiments. As folks go deeper into their use case, it would be natural for them to graduate to more sophisticated tools; LangChain may have a visual no-code builder by now, for example.
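The summarization fallback described above could be sketched roughly as follows. This is a minimal illustration, not Chat Stream's implementation: `summarize` is a placeholder for a real LLM summarization call, and the whitespace token counter stands in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    # Crude whitespace count as a placeholder for a real tokenizer.
    return len(text.split())

def summarize(text: str, target_tokens: int) -> str:
    # Placeholder: a real implementation would call an LLM to summarize.
    # Here we simply truncate to the target token count.
    return " ".join(text.split()[:target_tokens])

def build_context(upstream_nodes: list[str], token_limit: int) -> str:
    """Join upstream node content; fall back to a summary if it overflows."""
    joined = "\n".join(upstream_nodes)
    if count_tokens(joined) <= token_limit:
        return joined  # fits: pass upstream content through unchanged
    # Overflow: replace the upstream content with a summary that fits.
    return summarize(joined, token_limit)
```

The appeal of this approach is that the downstream node still receives plain text, so no retrieval infrastructure (embeddings, vector store) is needed.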
-
Summary
Automatically vectorize upstream nodes (including embedded files) and run a RAG version of the generation over all ancestor data other than the System Prompt.
Reason
When a workflow references several very large files, the context window token limit can easily be exceeded.
Rather than aborting the run or trimming the content, I suggest vectorizing the upstream ancestor data (only the nodes actually used in that specific workflow) and having the generation retrieve as much relevant context as fits within the RAG retrieval window.
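The proposed retrieval step could look roughly like this: rank the vectorized upstream chunks by relevance to the prompt, then pack the highest-scoring ones into the available token budget. This is a hedged sketch, not the proposed implementation: the bag-of-words Jaccard similarity stands in for a real embedding model and vector store, and the whitespace token counter stands in for a real tokenizer.

```python
def embed(text: str) -> set[str]:
    # Toy "embedding": lowercase bag of words. A real system would use
    # an embedding model and store vectors in a vector database.
    return set(text.lower().split())

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap as a stand-in for cosine similarity.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve_within_budget(chunks: list[str], query: str, budget: int) -> list[str]:
    """Greedily pick the most query-relevant chunks that fit the token budget."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: similarity(embed(c), query_vec),
                    reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())  # crude token count
        if used + cost <= budget:
            picked.append(chunk)
            used += cost
    return picked
```

Only the retrieved chunks (plus the System Prompt, which is kept verbatim) would then be passed to generation, so arbitrarily large upstream files never overflow the context window.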
Features