RAG Builder is a well-architected, scalable, and secure RAG (Retrieval-Augmented Generation) application built on AWS. It allows users to create a knowledge base from PDFs and websites and then ask questions about it. The project is built with Python, AWS CDK, and LangChain, and it serves as a powerful demonstration of how to build production-ready GenAI applications on AWS.
The application is built using a serverless-first architecture on AWS, designed for scalability, security, and maintainability.
```mermaid
architecture-beta
    %% External
    service user(internet)[User]

    %% AWS Cloud
    group aws(cloud)[AWS]
    service cloudfront(internet)[Cloudfront] in aws
    service chainlit_app(server)[Chainlit App] in aws
    service cognito(cloud)[Cognito] in aws

    %% Storage
    group data_plane(database)[Conversation Memory] in aws
    service dynamodb(disk)[DynamoDB] in data_plane
    service s3_chainlit(database)[S3] in data_plane

    group knowledge_base(database)[Knowledge Base] in aws
    service dynamodb_metadata(disk)[DynamoDB Metadata] in knowledge_base
    service vector_store(database)[S3 LanceDB Vector Store] in knowledge_base

    %% Bedrock
    service bedrock(cloud)[Bedrock LLM and Embeddings] in aws

    %% Backend
    service backend(server)[Lambda FastAPI Backend] in aws

    %% Data processing
    service lambda(server)[Lambda Data Processing Layer] in aws

    %% Edges
    user:R --> L:cloudfront
    cloudfront:R --> L:chainlit_app
    chainlit_app:T -- B:cognito
    dynamodb:R -- L:s3_chainlit
    chainlit_app:B -- T:dynamodb{group}
    dynamodb_metadata:R -- L:vector_store
    s3_chainlit{group}:R -- L:dynamodb_metadata{group}
    chainlit_app:R -- L:bedrock
    chainlit_app:R --> L:backend
    backend:R -- L:lambda
    lambda:B -- T:dynamodb_metadata{group}
```
- **Frontend**: A Chainlit application running on an AWS Fargate container. It is fronted by an Application Load Balancer and a CloudFront distribution to provide HTTPS and low-latency content delivery.
- **Authentication**: Amazon Cognito is used for user authentication and authorization, securing the application and its data.
- **Backend API**: A FastAPI application running on a Lambda function and exposed via API Gateway. It provides a RESTful API for managing documents and the knowledge base.
- **Document Processing**:
  - Document loading and deletion are handled asynchronously using Amazon SQS queues, which makes the application more resilient and responsive.
  - A Lambda function is triggered by the queue to download, chunk, create embeddings for, and store documents in the vector store.
  - The vector store is built with LanceDB and stored on Amazon S3, providing a serverless and scalable solution for vector search.
- **AI Models**: The application uses Amazon Bedrock for both the embeddings model (`amazon.titan-embed-text-v2:0`) and the agent's language model (`amazon.nova-pro-v1:0`).
- **Database**: Amazon DynamoDB is used to store document metadata and Chainlit conversation history.
- **Scheduled Tasks**: A weekly scheduled Lambda function, triggered by Amazon EventBridge Scheduler, optimizes the LanceDB vector store to maintain performance. Optimization covers three operations:
  - **Compaction**: Merges small files into larger ones
  - **Prune**: Removes old versions of the dataset
  - **Index**: Optimizes the indices, adding new data to existing indices (incremental indexing)
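The three maintenance operations above can be sketched as a small function run by the scheduled Lambda. This is a hedged sketch, not the project's actual handler: the method names follow LanceDB's Python table API (`compact_files`, `cleanup_old_versions`, `optimize`) and should be verified against the installed version, and the table handle is passed in so the logic can be exercised without AWS access.

```python
from datetime import timedelta

def optimize_vector_store(table, prune_older_than=timedelta(days=7)):
    """Run the weekly LanceDB maintenance operations in order.

    ``table`` is a LanceDB table handle (or any object exposing the same
    methods). Method names follow LanceDB's Python API but may differ
    across versions -- check the installed release.
    """
    # 1. Compaction: merge small data files into larger ones.
    table.compact_files()
    # 2. Prune: drop dataset versions older than the retention window.
    table.cleanup_old_versions(older_than=prune_older_than)
    # 3. Index: fold newly added rows into the existing indices.
    table.optimize()
```

Passing the table in (rather than opening it inside the function) keeps the maintenance logic trivially unit-testable with a stub.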
The RAG agent employs a hybrid search strategy to retrieve relevant context from the LanceDB vector store. This approach combines:

- **Vector Search**: Retrieves documents based on semantic similarity using embeddings generated by Amazon Bedrock.
- **Keyword Search**: Matches specific terms using full-text search.
- **Reranking**: Uses Reciprocal Rank Fusion (RRF) to combine and order the results from both search methods.

This ensures a robust retrieval process that captures both conceptually similar content and exact keyword matches.
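The Reciprocal Rank Fusion step can be sketched in a few lines of plain Python. This is a minimal sketch over lists of document ids, not the project's implementation; the constant `k = 60` is the value commonly used in the RRF literature, not necessarily what this project configures.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ordering.

    Each document's score is the sum of 1 / (k + rank) over every list
    it appears in (ranks start at 1), so documents ranked highly by both
    vector search and keyword search float to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)
```

In a hybrid retriever, the two result lists would be fused with something like `reciprocal_rank_fusion([vector_ids, keyword_ids])`.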
The evaluation module provides a comprehensive framework, packaged as a CLI tool built with Typer, for testing and analyzing RAG system performance. It uses RAGAS to generate synthetic test datasets, runs experiments with different configurations, and visualizes results through an interactive dashboard.
The evaluation workflow consists of four main steps:

1. **Create Knowledge Base** - Build an evaluation dataset from research papers
2. **Generate Test Set** - Create synthetic question-answer pairs using RAGAS
3. **Run Experiments** - Test different model configurations and measure performance
4. **Visualize Results** - Analyze experiment outcomes through an interactive dashboard
Creates a knowledge base for evaluation by downloading and processing a curated set of research papers from arXiv. The documents are stored in a LanceDB table named `evaluation_{embedding_model}` for use in subsequent evaluation steps.
Generates a synthetic test dataset using RAGAS based on the evaluation knowledge base. Creates realistic question-answer pairs with personas and different query types for comprehensive testing.
Runs evaluation experiments using the synthetic testset with specified model configurations. Measures faithfulness and answer accuracy metrics to assess RAG performance.
Generates an interactive Plotly dashboard to visualize experiment results over time. Shows trends in faithfulness and accuracy metrics across different experimental configurations.
Example output:

> [!TIP]
> Hover over data points to see the detailed experiment configuration.
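As a rough illustration of what the dashboard aggregates, per-configuration metric averages can be computed with the standard library alone. This is only a sketch: field names such as `config`, `faithfulness`, and `answer_accuracy` are illustrative rather than the project's actual schema, and the real dashboard is built with Plotly.

```python
from collections import defaultdict
from statistics import mean

def summarize_experiments(records: list[dict]) -> dict[str, dict[str, float]]:
    """Average each metric per experiment configuration.

    ``records`` is a list of experiment results, e.g.
    {"config": "nova-pro", "faithfulness": 0.91, "answer_accuracy": 0.86}
    (illustrative field names).
    """
    grouped: dict[str, dict[str, list]] = defaultdict(lambda: defaultdict(list))
    for record in records:
        for metric in ("faithfulness", "answer_accuracy"):
            grouped[record["config"]][metric].append(record[metric])
    return {
        config: {metric: mean(values) for metric, values in metrics.items()}
        for config, metrics in grouped.items()
    }
```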
This project implements a production-grade CI/CD pipeline using GitHub Actions, focusing on speed, security, and developer feedback.
**Triggered on:** Pull Requests to `main`
- **Efficient Dependency Management**: Uses `uv` to install Python dependencies at lightning speed, significantly reducing CI build times compared to `pip` or `poetry`.
- **Smart Monorepo Testing**: Implements `dorny/paths-filter` to only run tests for components that have changed (e.g., if only the Backend API is modified, only those tests run), saving compute resources.
- **Automated Feedback**: Posts detailed test coverage reports directly to Pull Requests as comments, ensuring code quality visibility before merging.
- **Isolated Environments**: Runs unit tests for each Lambda function in an isolated environment to prevent dependency conflicts.
**Triggered on:** Push to `main` (after CI passes and the PR is merged)
- **Secure Authentication**: Uses OpenID Connect (OIDC) to authenticate with AWS, eliminating the need for long-lived Access Keys in GitHub Secrets.
- **Infrastructure as Code**: Automatically deploys infrastructure changes via AWS CDK.
- **Concurrency Control**: Prevents race conditions by ensuring only one deployment pipeline runs at a time for the production environment.
- An AWS account
- AWS CLI configured with your credentials and appropriate permissions
- Python 3.12+
- `uv` installed
1. Clone the repository:

   ```shell
   git clone https://github.com/gontzalm/rag-builder.git
   cd rag-builder
   ```

2. Install dependencies:

   ```shell
   uv sync
   ```

3. Bootstrap the CDK environment (if you haven't already):

   ```shell
   cdk bootstrap
   ```

4. Deploy the stack:

   ```shell
   cdk deploy
   ```

The deployment will take several minutes. Once it's complete, the CDK will output the URL of the Chainlit application and a `.env` file for local testing.
> [!TIP]
> To save costs, speed up deployments, or if you're developing the Chainlit UI
> locally, you can disable its deployment (Fargate service, Load Balancer, and
> CloudFront distribution) by using the `deploy_chainlit` context value:
>
> ```shell
> cdk deploy -c deploy_chainlit=false
> ```

For a faster development cycle, you can run the Chainlit application locally while connecting to the deployed AWS resources.
1. After deploying the stack, copy the `.env` file content from the CDK output.

2. Create a file named `.env` in the `rag_builder/fargate/chainlit-app/` directory and paste the content into it.

3. Navigate to the Chainlit app directory:

   ```shell
   cd rag_builder/fargate/chainlit-app
   ```

4. Install the local dependencies:

   ```shell
   uv sync
   ```

5. Run the Chainlit application:

   ```shell
   uv run chainlit run main.py -w
   ```

This will start a local server, and you can access the application at `http://localhost:8000`.
| Category | Technology |
|---|---|
| Infrastructure as Code | AWS CDK |
| Frontend | Chainlit |
| Backend | FastAPI |
| GenAI | Amazon Bedrock, LangChain, LanceDB, RAGAS |
| CLI | Typer |
| Package Management | uv |