Conversation

@martinschaer

@martinschaer martinschaer commented Dec 22, 2025

This PR:

  • Adds examples chat_app_surreal.py and rag_surrealdb.py, clones of chat_app.py and rag.py, but using SurrealDB embedded
  • Adds a new RAG section in the docs

@DouweM
Collaborator

DouweM commented Dec 23, 2025

@martinschaer Thanks for working on this!

We're about to release #3252, which would be great to use in this example, as well as https://ai.pydantic.dev/web/ for the web UI. That'll make this a bit more differentiated/modern compared to the existing example it's based on.

Note that we'll also need a doc like https://ai.pydantic.dev/examples/rag/, so that people can find it in the sidebar. I suggest creating a new dedicated RAG category (separate from the existing "Data & Analytics"), so that it's more natural to have separate examples for different vector DBs. You can then move the existing Postgres/pgvector example there, and mention "SurrealDB" in the title of your new page.

Can you please make those changes (once that feature is released)?

@Kludex
Member

Kludex commented Dec 30, 2025

Why do you folks use cerberus and not pydantic?

@Kludex Kludex self-requested a review December 30, 2025 12:20
@Kludex Kludex assigned Kludex and unassigned DouweM Dec 30, 2025
@martinschaer
Author

> Why do you folks use cerberus and not pydantic?

@Kludex good question. I think we should consider pydantic-core

@DouweM DouweM added the labels docs (Improvements or additions to documentation) and size: L (Large PR, 501-1500 weighted lines) Jan 6, 2026
@martinschaer
Author

@Kludex I just updated this PR with the latest surrealdb sdk which has pydantic-core instead of cerberus

@Kludex Kludex removed their assignment Jan 8, 2026
- examples/sql-gen.md
- examples/data-analyst.md
- RAG:
- examples/rag.md
Collaborator

Can you please rename this file to rag-pgvector, update the page title, and set up a redirect (in docs-site/src/index.ts)?

@@ -0,0 +1,243 @@
"""Simple chat app example build with FastAPI using SurrealDB embedded.
Collaborator

I don't think we need this anymore? At least it's not mentioned in the doc, and we have the native web app feature now.

content: str
dist: float

result_ta = TypeAdapter(list[RetrievalQueryResult])
Collaborator

Let's move the type and type adapter out of the tool

assert len(embedding) == 1, (
f'Expected 1 embedding, got {len(embedding)}, doc query: {search_query!r}'
)
embedding_vector = list(embedding[0])
Collaborator

Why do we need the list here?

raise

return '\n\n'.join(
f'# {row.title}\nDocumentation URL:{row.url}\n\n{row.content}\n' for row in rows
Collaborator

We don't need the final \n if we're already joining these by 2 newlines!
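
To illustrate the point, here is a minimal sketch of the formatting without the trailing newline. The `Row` dataclass, the `format_rows` helper name, and the sample values are hypothetical stand-ins for the PR's validated rows; only the f-string layout comes from the diff above.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for the PR's validated retrieval row.
    title: str
    url: str
    content: str


def format_rows(rows: list[Row]) -> str:
    # No trailing '\n' per item: the '\n\n' separator already leaves
    # exactly one blank line between sections.
    return '\n\n'.join(
        f'# {row.title}\nDocumentation URL:{row.url}\n\n{row.content}' for row in rows
    )


# Sample data (made up for this sketch).
rows = [
    Row('Agents', 'https://example.com/agents', 'Agent docs.'),
    Row('Tools', 'https://example.com/tools', 'Tool docs.'),
]
formatted = format_rows(rows)
```

With the extra `\n` kept in each item, consecutive sections would be separated by three newlines and the result would end with a dangling newline.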

embedding = result.embeddings

assert len(embedding) == 1, (
f'Expected 1 embedding, got {len(embedding)}, doc query: {search_query!r}'
Collaborator

This makes it seem like this is something the user needs to watch out for themselves, but the embedder methods are guaranteed to return the same number of embeddings as inputs


# Process SurrealDB query result
try:
rows = result_ta.validate_python(result)
Collaborator

Can we extract the SurrealDB query + result validation into a separate method with clean typing?
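
One possible shape for such a helper, as a rough sketch only. Everything here is an assumption except the `content` and `dist` fields, which appear in the diff: the `SupportsQuery` protocol, the `retrieve_sections` name, the `vector` variable name, and the `url`/`title` fields (inferred from the formatting code) are hypothetical. To stay dependency-free, a dataclass constructor stands in for the pydantic `TypeAdapter` validation the PR uses.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass
class RetrievalQueryResult:
    # 'content' and 'dist' appear in the PR; 'url' and 'title' are inferred.
    url: str
    title: str
    content: str
    dist: float


class SupportsQuery(Protocol):
    # Assumed connection shape: an async method taking a query string and variables.
    async def query(self, query: str, vars: dict[str, Any]) -> Any: ...


async def retrieve_sections(
    db: SupportsQuery, query: str, embedding: list[float]
) -> list[RetrievalQueryResult]:
    """Run the retrieval query and return typed rows."""
    raw = await db.query(query, {'vector': embedding})
    if not isinstance(raw, list):
        raise TypeError(f'Expected a list of rows, got {type(raw).__name__}')
    # The PR would call its TypeAdapter here; the constructor only checks field names.
    return [RetrievalQueryResult(**row) for row in raw]


class FakeDB:
    # In-memory stand-in so the sketch runs without a database.
    async def query(self, query: str, vars: dict[str, Any]) -> Any:
        return [{'url': 'u', 'title': 't', 'content': 'c', 'dist': 0.1}]


rows = asyncio.run(retrieve_sections(FakeDB(), '<retrieval query>', [0.1, 0.2]))
```

The tool body would then only call the helper and format the returned rows, which keeps the untyped query result confined to one place.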

db: AsyncWsSurrealConnection | AsyncHttpSurrealConnection,
section: DocsSection,
) -> None:
async with embedding_sem:
Collaborator

why do we need this?

namespace = 'pydantic_ai_examples'
database = 'rag_surrealdb'
username = 'root'
password = 'root'
Collaborator

should these be constants at the top level?
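
If so, a minimal sketch of what that might look like. The constant names are hypothetical; the values are the ones shown in the diff excerpt above.

```python
# Module-level connection settings, defined once near the imports.
# Names are illustrative; values are copied from the diff excerpt.
SURREALDB_NAMESPACE = 'pydantic_ai_examples'
SURREALDB_DATABASE = 'rag_surrealdb'
SURREALDB_USERNAME = 'root'
SURREALDB_PASSWORD = 'root'
```

Functions that open a connection can then reference these names instead of redefining local variables.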

else:
q = 'How do I configure logfire to work with FastAPI?'
asyncio.run(run_agent(q))
else:
Collaborator

Should we support a web subcommand?
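
A rough sketch of how a `web` subcommand could sit next to the others, using `argparse` from the standard library. The `build` and `search` subcommand names and the `build_parser` helper are assumptions (the diff only shows the default question), and what `web` would actually launch is left open.

```python
import argparse

DEFAULT_QUESTION = 'How do I configure logfire to work with FastAPI?'


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog='rag_surrealdb')
    sub = parser.add_subparsers(dest='command', required=True)
    # Assumed existing subcommands.
    sub.add_parser('build', help='build the search database')
    search = sub.add_parser('search', help='ask the agent a question')
    search.add_argument('question', nargs='?', default=DEFAULT_QUESTION)
    # Hypothetical new subcommand suggested in the review.
    sub.add_parser('web', help='serve the agent in a web UI')
    return parser


args = build_parser().parse_args(['web'])
```

The caller would dispatch on `args.command`, so adding `web` does not change how `build` and `search` are invoked.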

Labels

awaiting author revision, docs (Improvements or additions to documentation), size: L (Large PR, 501-1500 weighted lines)

3 participants