gontzalm/rag-builder

🏗️ RAG Builder


RAG Builder is a scalable, secure RAG (Retrieval-Augmented Generation) application built on AWS. It lets users create a knowledge base from PDFs and websites and then ask questions about it. The project is built with Python, AWS CDK, and LangChain, and it demonstrates how to build production-ready GenAI applications on AWS.

🎥 Demo

rag-builder-demo.mp4

🏛️ Architecture

The application is built using a serverless-first architecture on AWS, designed for scalability, security, and maintainability.

architecture-beta
    %% External
    service user(internet)[User]

    %% AWS Cloud
    group aws(cloud)[AWS]

    service cloudfront(internet)[CloudFront] in aws
    service chainlit_app(server)[Chainlit App] in aws
    service cognito(cloud)[Cognito] in aws

    %% Storage
    group data_plane(database)[Conversation Memory] in aws
    service dynamodb(disk)[DynamoDB] in data_plane
    service s3_chainlit(database)[S3] in data_plane

    group knowledge_base(database)[Knowledge Base] in aws
    service dynamodb_metadata(disk)[DynamoDB Metadata] in knowledge_base
    service vector_store(database)[S3 LanceDB Vector Store] in knowledge_base

    %% Bedrock
    service bedrock(cloud)[Bedrock LLM and Embeddings] in aws

    %% Backend
    service backend(server)[Lambda FastAPI Backend] in aws

    %% Data processing
    service lambda(server)[Lambda Data Processing Layer] in aws

    %% Edges
    user:R --> L:cloudfront
    cloudfront:R --> L:chainlit_app
    chainlit_app:T -- B:cognito

    dynamodb:R -- L:s3_chainlit
    chainlit_app:B -- T:dynamodb{group}

    dynamodb_metadata:R -- L:vector_store
    s3_chainlit{group}:R -- L:dynamodb_metadata{group}

    chainlit_app:R -- L:bedrock

    chainlit_app:R --> L:backend
    backend:R -- L:lambda

    lambda:B -- T:dynamodb_metadata{group}

Key Components

  • Frontend: A Chainlit application running on an AWS Fargate container. It is fronted by an Application Load Balancer and a CloudFront distribution to provide HTTPS and low-latency content delivery.

  • Authentication: Amazon Cognito is used for user authentication and authorization, securing the application and its data.

  • Backend API: A FastAPI application running on a Lambda function and exposed via API Gateway. It provides a RESTful API for managing documents and the knowledge base.

  • Document Processing:

    • Document loading and deletion are handled asynchronously using Amazon SQS queues, which makes the application more resilient and responsive.
    • A Lambda function is triggered by the queue to download, chunk, create embeddings for, and store documents in the vector store.
    • The vector store is built with LanceDB and stored on Amazon S3, providing a serverless and scalable solution for vector search.
  • AI Models: The application uses Amazon Bedrock for both the embeddings model (amazon.titan-embed-text-v2:0) and the agent's language model (amazon.nova-pro-v1:0).

  • Database: Amazon DynamoDB is used to store document metadata and Chainlit conversation history.

  • Scheduled Tasks: A weekly scheduled Lambda function, triggered by Amazon EventBridge Scheduler, optimizes the LanceDB vector store to maintain performance.

    Optimization covers three operations:

    • Compaction: Merges small files into larger ones
    • Prune: Removes old versions of the dataset
    • Index: Optimizes the indices, adding new data to existing indices (incremental indexing)
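The queue-driven document processing described above can be sketched as a Lambda handler that walks an SQS batch and reports failures per message. This is a minimal illustration, not the project's code: `handle_sqs_batch` and `fake_process` are hypothetical names, the message shape (a JSON body with a `url` field) is an assumption, and the partial batch response only takes effect if `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json
from typing import Any, Callable


def handle_sqs_batch(
    event: dict[str, Any],
    process: Callable[[dict[str, Any]], None],
) -> dict[str, Any]:
    """Process each SQS record and report failed messages individually."""
    failures = []
    for record in event.get("Records", []):
        try:
            # Each SQS record carries its payload as a string in "body".
            process(json.loads(record["body"]))
        except Exception:
            # Flag only this message as failed so SQS can retry it alone.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


def fake_process(message: dict[str, Any]) -> None:
    """Hypothetical stand-in for the download/chunk/embed/store step."""
    if "url" not in message:
        raise ValueError("missing url")


event = {
    "Records": [
        {"messageId": "m1", "body": json.dumps({"url": "https://example.com/a.pdf"})},
        {"messageId": "m2", "body": json.dumps({"title": "no url"})},
    ]
}
result = handle_sqs_batch(event, fake_process)
print(result)
```

Returning only the failed message IDs lets the successfully processed documents leave the queue while the rest are retried, which is one way to get the resilience the queue-based design aims for.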

RAG Search Implementation

The RAG agent employs a hybrid search strategy to retrieve relevant context from the LanceDB vector store. This approach combines:

  • Vector Search: Retrieves documents based on semantic similarity using embeddings generated by Amazon Bedrock.
  • Keyword Search: Matches specific terms using full-text search.
  • Reranking: Uses Reciprocal Rank Fusion (RRF) to combine and order the results from both search methods.

This ensures a robust retrieval process that captures both conceptually similar content and exact keyword matches.
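To make the fusion step concrete, here is a small standalone sketch of the Reciprocal Rank Fusion scoring rule. It is an illustration only and not the project's or LanceDB's implementation: the function name, the document IDs, and the constant `k = 60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into a single ranking.

    A document's fused score is the sum of 1 / (k + rank) over every list
    it appears in, with ranks starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search order
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical full-text order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print([doc_id for doc_id, _ in fused])
```

Documents that rank well in both lists (`doc_a`, `doc_b`) rise above those found by only one method, and because RRF uses ranks rather than raw scores, the vector and keyword scores never need to be normalized against each other.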

📊 Evaluation Module

The evaluation module is a CLI tool, built with Typer, for testing and analyzing RAG system performance. It uses RAGAS to generate synthetic test datasets, runs experiments with different configurations, and visualizes the results in an interactive dashboard.

The evaluation workflow consists of four main steps:

  1. Create Knowledge Base - Build an evaluation dataset from research papers
  2. Generate Test Set - Create synthetic question-answer pairs using RAGAS
  3. Run Experiments - Test different model configurations and measure performance
  4. Visualize Results - Analyze experiment outcomes through an interactive dashboard

create-kb

Creates a knowledge base for evaluation by downloading and processing a curated set of research papers from arXiv. The documents are stored in a LanceDB table named evaluation_{embedding_model} for use in subsequent evaluation steps.

generate-testset

Generates a synthetic test dataset using RAGAS based on the evaluation knowledge base. Creates realistic question-answer pairs with personas and different query types for comprehensive testing.

run-experiment

Runs evaluation experiments using the synthetic testset with specified model configurations. Measures faithfulness and answer accuracy metrics to assess RAG performance.

visualize-experiments

Generates an interactive Plotly dashboard to visualize experiment results over time. Shows trends in faithfulness and accuracy metrics across different experimental configurations.

Example output:

Experiment Results

Tip

Hover over data points to see the detailed experiment configuration.

⚙️ CI/CD

This project implements a production-grade CI/CD pipeline using GitHub Actions, focusing on speed, security, and developer feedback.

Continuous Integration (CI)

Triggered on: Pull Requests to main

  • Efficient Dependency Management: Uses uv to install Python dependencies, which shortens CI build times compared to pip or Poetry.
  • Smart Monorepo Testing: Uses dorny/paths-filter to run tests only for components that have changed (e.g., if only the Backend API is modified, only its tests run), saving compute resources.
  • Automated Feedback: Posts detailed Test Coverage Reports directly to Pull Requests as comments, ensuring code quality visibility before merging.
  • Isolated Environments: Runs unit tests for each Lambda function in isolated environments to prevent dependency conflicts.

Continuous Deployment (CD)

Triggered on: Push to main (after CI passes and PR is merged)

  • Secure Authentication: Uses OpenID Connect (OIDC) to authenticate with AWS, eliminating the need for long-lived Access Keys in GitHub Secrets.
  • Infrastructure as Code: Automatically deploys infrastructure changes via AWS CDK.
  • Concurrency Control: Prevents race conditions by ensuring only one deployment pipeline runs at a time for the production environment.

🚀 Getting Started

Prerequisites

  • An AWS account
  • AWS CLI configured with your credentials and appropriate permissions
  • Python 3.12+
  • uv installed

Deployment to AWS

  1. Clone the repository

    git clone https://github.com/gontzalm/rag-builder.git
    cd rag-builder
  2. Install dependencies

    uv sync
  3. Bootstrap the CDK environment (if you haven't already)

    cdk bootstrap
  4. Deploy the stack

    cdk deploy

    The deployment will take several minutes. Once it's complete, the CDK will output the URL of the Chainlit application and a .env file for local testing.

Tip

To save costs or speed up deployments, or when you are developing the Chainlit UI locally, you can skip deploying the Chainlit app (Fargate service, Load Balancer, and CloudFront distribution) by setting the deploy_chainlit context value to false:

cdk deploy -c deploy_chainlit=false
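Context values passed with `-c key=value` reach the CDK app as strings, so a value such as `"false"` has to be parsed explicitly instead of being tested for truthiness. The helper below is a hedged sketch of one way to do that. `context_flag` is a hypothetical name, and the set of accepted "false" spellings and the default of `True` are assumptions, not the project's actual logic.

```python
def context_flag(value: object, default: bool = True) -> bool:
    """Interpret a CDK context value as a boolean.

    Missing values fall back to `default`; real booleans pass through;
    anything else is compared as a case-insensitive string.
    """
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() not in {"false", "0", "no", "off"}


# In a CDK app this might be used as (assumption, requires aws-cdk-lib):
#   deploy_chainlit = context_flag(app.node.try_get_context("deploy_chainlit"))
print(context_flag("false"))  # value as passed by: cdk deploy -c deploy_chainlit=false
print(context_flag(None))     # context key not provided
```

Defaulting to `True` when the key is absent matches the behavior described above, where a plain `cdk deploy` includes the Chainlit frontend.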

Local Development and Testing

For a faster development cycle, you can run the Chainlit application locally while connecting to the deployed AWS resources.

  1. After deploying the stack, copy the .env file content from the CDK output.

  2. Create a file named .env in the rag_builder/fargate/chainlit-app/ directory and paste the content into it.

  3. Navigate to the Chainlit app directory:

    cd rag_builder/fargate/chainlit-app
  4. Install the local dependencies:

    uv sync
  5. Run the Chainlit application:

    uv run chainlit run main.py -w
    

    This will start a local server, and you can access the application at http://localhost:8000.

🛠️ Technology Stack

| Category | Technology |
|---|---|
| Infrastructure as Code | AWS CDK |
| Frontend | Chainlit |
| Backend | FastAPI |
| GenAI | LangChain, Amazon Bedrock, LanceDB |
| CLI | Typer |
| Package Management | uv |
