Vectorize and RAG Ancestor Nodes When Exceeding Token Limit #26
VentureTactics
started this conversation in
Ideas
Replies: 1 comment 1 reply
-
An alternative to vectorization could be summarization: if the context window is exceeded, a summary of the upstream content can be inserted instead. IMO the sweet spot of Chat Stream is conveniently providing context management for chat AI experiments. As folks go deeper into their use case, it would be natural for them to graduate to more sophisticated tools; LangChain may have a visual no-code builder by now, for example.
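The summarization fallback described above could be sketched roughly as follows. This is a minimal illustration, not Chat Stream's implementation: `summarize` is a placeholder for a real LLM summarization call, and the whitespace token counter stands in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    # Crude whitespace count as a placeholder for a real tokenizer.
    return len(text.split())

def summarize(text: str, target_tokens: int) -> str:
    # Placeholder: a real implementation would call an LLM to summarize.
    # Here we simply truncate to the target token count.
    return " ".join(text.split()[:target_tokens])

def build_context(upstream_nodes: list[str], token_limit: int) -> str:
    """Join upstream node content; fall back to a summary if it overflows."""
    joined = "\n".join(upstream_nodes)
    if count_tokens(joined) <= token_limit:
        return joined  # fits: pass upstream content through unchanged
    # Overflow: replace the upstream content with a summary that fits.
    return summarize(joined, token_limit)
```

The appeal of this approach is that the downstream node still receives plain text, so no retrieval infrastructure (embeddings, vector store) is needed.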
-
Summary
Automatically vectorize upstream nodes (including embedded files) and run a RAG version of the generation over all ancestor data other than the System Prompt.
Reason
When a workflow references several very large files, the context window token limit can easily be exceeded.
Rather than aborting the run or trimming the content, I suggest vectorizing the upstream ancestor data (only the nodes actually used in that specific workflow) and having the generation retrieve as much relevant context as fits within the RAG retrieval window.
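The proposed retrieval step could look roughly like this: rank the vectorized upstream chunks by relevance to the prompt, then pack the highest-scoring ones into the available token budget. This is a hedged sketch, not the proposed implementation: the bag-of-words Jaccard similarity stands in for a real embedding model and vector store, and the whitespace token counter stands in for a real tokenizer.

```python
def embed(text: str) -> set[str]:
    # Toy "embedding": lowercase bag of words. A real system would use
    # an embedding model and store vectors in a vector database.
    return set(text.lower().split())

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap as a stand-in for cosine similarity.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve_within_budget(chunks: list[str], query: str, budget: int) -> list[str]:
    """Greedily pick the most query-relevant chunks that fit the token budget."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: similarity(embed(c), query_vec),
                    reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())  # crude token count
        if used + cost <= budget:
            picked.append(chunk)
            used += cost
    return picked
```

Only the retrieved chunks (plus the System Prompt, which is kept verbatim) would then be passed to generation, so arbitrarily large upstream files never overflow the context window.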
Features