Add SurrealDB examples #3799
@martinschaer Thanks for working on this! We're about to release #3252, which would be great to use in this example, as well as https://ai.pydantic.dev/web/ for the web UI. That'll make this a bit more differentiated/modern compared to the existing example it's based on. Note that we'll also need a doc like https://ai.pydantic.dev/examples/rag/, so that people can find it in the sidebar. I suggest creating a new dedicated RAG category (separate from the existing "Data & Analytics"), so that it's more natural to have separate examples for different vector DBs. You can then move the existing Postgres/pgvector example there, and mention "SurrealDB" in the title of your new page. Can you please make those changes (once that feature is released)?
Why do you folks use |
@Kludex good question. I think we should consider pydantic-core |
@Kludex I just updated this PR with the latest surrealdb sdk which has pydantic-core instead of cerberus |
| - examples/sql-gen.md | ||
| - examples/data-analyst.md | ||
| - RAG: | ||
| - examples/rag.md |
Can you please rename this file to rag-pgvector, update the page title, and set up a redirect (in docs-site/src/index.ts)?
| @@ -0,0 +1,243 @@ | |||
| """Simple chat app example build with FastAPI using SurrealDB embedded. | |||
I don't think we need this anymore? At least it's not mentioned in the doc, and we have the native web app feature now.
| content: str | ||
| dist: float | ||
|
|
||
| result_ta = TypeAdapter(list[RetrievalQueryResult]) |
Let's move the type and type adapter out of the tool
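A minimal sketch of what this could look like, assuming the row type is a pydantic `BaseModel`. The diff only shows the `content` and `dist` fields; `url` and `title` are inferred from how the rows are formatted later in the tool, and `parse_rows` is a hypothetical helper name. With both definitions at module scope, the adapter is built once at import time rather than on every tool call.

```python
from typing import Any

from pydantic import BaseModel, TypeAdapter


class RetrievalQueryResult(BaseModel):
    """One row returned by the retrieval query (field set partly assumed)."""

    url: str
    title: str
    content: str
    dist: float


# Built once at import time and reused by every tool call.
result_ta = TypeAdapter(list[RetrievalQueryResult])


def parse_rows(raw: Any) -> list[RetrievalQueryResult]:
    """Validate raw query output against the module-level adapter."""
    return result_ta.validate_python(raw)
```

The tool body then only needs to call `parse_rows(result)`.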
| assert len(embedding) == 1, ( | ||
| f'Expected 1 embedding, got {len(embedding)}, doc query: {search_query!r}' | ||
| ) | ||
| embedding_vector = list(embedding[0]) |
Why do we need the list here?
| raise | ||
|
|
||
| return '\n\n'.join( | ||
| f'# {row.title}\nDocumentation URL:{row.url}\n\n{row.content}\n' for row in rows |
We don't need the final \n if we're already joining these by 2 newlines!
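To illustrate the point, here is a small standalone sketch. `Row` and `format_rows` are stand-in names for this example, not the PR's actual types. With the trailing `\n` kept, each entry would be separated by three newlines instead of two; dropping it leaves exactly one blank line between entries.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Stand-in for a retrieved docs section (hypothetical type)."""

    title: str
    url: str
    content: str


def format_rows(rows: list[Row]) -> str:
    """Join sections with one blank line between them, no trailing newline per entry."""
    return '\n\n'.join(
        f'# {row.title}\nDocumentation URL:{row.url}\n\n{row.content}' for row in rows
    )


rows = [
    Row('First', 'https://example.com/a', 'Body A'),
    Row('Second', 'https://example.com/b', 'Body B'),
]
print(format_rows(rows))
```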
| embedding = result.embeddings | ||
|
|
||
| assert len(embedding) == 1, ( | ||
| f'Expected 1 embedding, got {len(embedding)}, doc query: {search_query!r}' |
This makes it seem like this is something the user needs to watch out for themselves, but the embedder methods are guaranteed to return the same number of embeddings as inputs
|
|
||
| # Process SurrealDB query result | ||
| try: | ||
| rows = result_ta.validate_python(result) |
Can we extract the SurrealDB query + result validation into a separate method with clean typing?
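One possible shape for such a helper, sketched under several assumptions: `SupportsQuery` is a hypothetical protocol standing in for the SurrealDB connection types (the real SDK's `query` signature should be checked), `retrieve_sections` and the `vector` bind variable are illustrative names, and the SurrealQL text is passed in rather than guessed at here.

```python
from collections.abc import Sequence
from typing import Any, Protocol

from pydantic import BaseModel, TypeAdapter


class RetrievalQueryResult(BaseModel):
    url: str
    title: str
    content: str
    dist: float


result_ta = TypeAdapter(list[RetrievalQueryResult])


class SupportsQuery(Protocol):
    """Hypothetical minimal interface for a connection; not the SDK's own type."""

    async def query(self, query: str, vars: dict[str, Any]) -> Any: ...


async def retrieve_sections(
    db: SupportsQuery, surql: str, embedding: Sequence[float]
) -> list[RetrievalQueryResult]:
    """Run the vector search and return validated, typed rows."""
    # 'vector' is an assumed bind-variable name used by the query text.
    raw = await db.query(surql, {'vector': list(embedding)})
    return result_ta.validate_python(raw)
```

The tool itself would then be reduced to embedding the search query, calling this helper, and formatting the rows, with validation errors surfacing from one place.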
| db: AsyncWsSurrealConnection | AsyncHttpSurrealConnection, | ||
| section: DocsSection, | ||
| ) -> None: | ||
| async with embedding_sem: |
why do we need this?
| namespace = 'pydantic_ai_examples' | ||
| database = 'rag_surrealdb' | ||
| username = 'root' | ||
| password = 'root' |
should these be constants at the top level?
| else: | ||
| q = 'How do I configure logfire to work with FastAPI?' | ||
| asyncio.run(run_agent(q)) | ||
| else: |
Should we support a web subcommand?
This PR: