Example Code

Creation of vectors:

import requests
import os
from bs4 import BeautifulSoup
import html2text
from dotenv import load_dotenv
from langchain_community.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter
from langchain_postgres.vectorstores import PGVector
from settings import Settings
PGDATABASE = '...'
CONNECTION_STRING = f"postgresql+psycopg://{Settings.PGUSER}:{Settings.PGPASSWORD}@{Settings.PGHOST}:{Settings.PGPORT}/{PGDATABASE}"
COLLECTION_NAME = "..."
...a bunch of markdown webpages read...
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=3000,
    chunk_overlap=150,
    length_function=len,
    is_separator_regex=False,
)
text_splits = text_splitter.create_documents([text_content])
docs = md_header_splits + text_splits
embedding = OpenAIEmbeddings(openai_api_key=Settings.OPENAI_API_KEY)
vector_store = PGVector(
    embeddings=embedding,
    collection_name=COLLECTION_NAME,
    connection=CONNECTION_STRING,
    pre_delete_collection=True,
    use_jsonb=True,
)
document_ids = vector_store.add_documents(documents=docs)
print(document_ids[0:5])

Retriever code for these vectors:

from langchain.tools.retriever import create_retriever_tool
from langchain_postgres.vectorstores import PGVector
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
COLLECTION_NAME = "..."
PGDATABASE = '...'
CONNECTION_STRING = f"postgresql+psycopg://{settings.PGUSER}:{settings.PGPASSWORD}@{settings.PGHOST}:{settings.PGPORT}/{PGDATABASE}"
store = PGVector(
    collection_name=COLLECTION_NAME,
    connection=CONNECTION_STRING,
    embeddings=embeddings,
)
retriever = store.as_retriever(search_kwargs={"k": 5})
MyDocs = create_retriever_tool(
    retriever,
    "...",
    """
Description of RAG.
""",
)

Description

Hi, all. I have created vector embeddings using
It seems this is related to

System Info

langchain 0.3.14
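Both snippets above interpolate the credentials directly into `CONNECTION_STRING`. If a password contains characters such as `@`, `:` or `/`, the resulting URL can be parsed incorrectly. The following is a minimal sketch, using only the standard library, of percent-encoding the credentials before building the URL. The helper name `build_connection_string` and the sample values are hypothetical and not part of the original post.

```python
from urllib.parse import quote


def build_connection_string(user, password, host, port, database,
                            driver="postgresql+psycopg"):
    """Return a SQLAlchemy-style URL with percent-encoded credentials.

    Hypothetical helper for illustration only; safe="" makes quote()
    encode every reserved character, including "/" and "@".
    """
    return (
        f"{driver}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Sample values are made up; substitute the real settings.
    print(build_connection_string("app", "p@ss:word/1", "localhost", 5432, "docs"))
```

The same string can then be passed wherever `CONNECTION_STRING` is used in the snippets above.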
Replies: 1 comment
I am going to close this discussion because I found the answer elsewhere. However, for the sake of documenting the solution, the answer came from this issue.

The bottom line is that the retriever needs an async engine. Just passing the connection string is not sufficient. So the change in the retriever tool looks like this:

from langchain.tools.retriever import create_retriever_tool
from langchain_postgres.vectorstores import PGVector
from langchain_openai import OpenAIEmbeddings
from sqlalchemy.ext.asyncio import create_async_engine
embeddings = OpenAIEmbeddings()
COLLECTION_NAME = "..."
PGDATABASE = '...'
CONNECTION_STRING = f"postgresql+psycopg://{settings.PGUSER}:{settings.PGPASSWORD}@{settings.PGHOST}:{settings.PGPORT}/{PGDATABASE}"
engine = create_async_engine(CONNECTION_STRING)
store = PGVector(
    collection_name=COLLECTION_NAME,
    connection=engine,
    embeddings=embeddings,
)
retriever = store.as_retriever(search_kwargs={"k": 5})
MyDocs = create_retriever_tool(
    retriever,
    "...",
    """
Description of RAG.
""",
)

(Note the use of the
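One way to picture why the connection string alone is not enough: a store built from a plain string may only set up a synchronous engine, so any async call made through the retriever tool has nothing to run on. The sketch below is a toy model written with the standard library only. It does not use `langchain_postgres`, and the class and function names (`ToyStore`, `SyncEngine`, `AsyncEngine`, `try_async_search`) are hypothetical; how closely it mirrors the real library's internals is an assumption based on the explanation above.

```python
import asyncio


class SyncEngine:
    """Stand-in for a blocking database engine (hypothetical)."""

    def search(self, query, k):
        return [f"{query}-{i}" for i in range(k)]


class AsyncEngine:
    """Stand-in for a non-blocking database engine (hypothetical)."""

    async def search(self, query, k):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return [f"{query}-{i}" for i in range(k)]


class ToyStore:
    """Toy vector store: a plain string yields a sync engine only."""

    def __init__(self, connection):
        if isinstance(connection, AsyncEngine):
            self.sync_engine, self.async_engine = None, connection
        else:
            # Assumption for this toy: strings always mean sync mode.
            self.sync_engine, self.async_engine = SyncEngine(), None

    async def asearch(self, query, k=5):
        if self.async_engine is None:
            raise RuntimeError("async call needs an async engine")
        return await self.async_engine.search(query, k)


def try_async_search(store):
    """Run an async search and report either the hits or the failure."""
    try:
        return asyncio.run(store.asearch("doc", k=2))
    except RuntimeError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(try_async_search(ToyStore("postgresql+psycopg://...")))  # fails
    print(try_async_search(ToyStore(AsyncEngine())))  # succeeds
```

In the toy, only the store constructed with an async engine can serve the async search, which matches the fix described in the reply: build the engine with `create_async_engine` and pass it as `connection`.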