An advanced Agentic RAG system implementing a multi-agent architecture with Adaptive, Corrective, and Self-RAG patterns for efficient document retrieval and question answering.
This Proof of Concept (POC) provides an intelligent documentation assistant that leverages collaborative specialized agents to:
- Process and index AWS documentation with versioning control
- Deliver accurate, context-aware responses using advanced RAG patterns
- Provide source attribution and relevant document references
- Handle both in-vectorstore and web-based information retrieval
- Implement multi-agent verification to prevent hallucinations
The architecture diagram shows the main components and data flow of the system, including the document ingestion pipeline, vector store, and the multi-agent RAG workflow.
The graph visualization demonstrates the message passing and state transitions between specialized agents:
- Router Agent: Decides between vectorstore and web search
- Retrieval Agent: Gets relevant documents
- Document Grading Agent: Evaluates document relevance
- Web Search Agent: Augments knowledge by searching online when any of the following occurs:
  - The Router determines the question is outside the vectorstore's context and requires external data
  - Document grading finds insufficient relevant context (fewer than 3 documents)
  - The initial answer fails verification checks
- Generation Agent: Produces final response
- Verification Agents: Two-layer checking (fact and relevance)
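The decision points above can be sketched as plain routing functions. This is an illustrative simplification rather than the project's actual LangGraph wiring, and the state keys used here are invented for the example:

```python
# Illustrative routing logic for the multi-agent workflow.
# State keys are hypothetical; the real system keeps similar flags
# in a shared LangGraph state object.

def route_question(state: dict) -> str:
    """Router Agent: send the question to the vectorstore or the web."""
    return "retrieve" if state["question_in_vectorstore"] else "web_search"

def decide_after_grading(state: dict) -> str:
    """Document Grading Agent: require at least 3 relevant documents."""
    return "generate" if len(state["relevant_docs"]) >= 3 else "web_search"

def decide_after_verification(state: dict) -> str:
    """Verification Agents: regenerate or augment on failed checks."""
    if not state["is_grounded"]:
        return "generate"      # hallucination detected: regenerate
    if not state["answers_question"]:
        return "web_search"    # answer off-topic: gather more context
    return "END"

state = {
    "question_in_vectorstore": True,
    "relevant_docs": ["doc1", "doc2"],   # only 2 relevant documents
    "is_grounded": True,
    "answers_question": True,
}
print(route_question(state))        # retrieve
print(decide_after_grading(state))  # web_search
```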
- Versioning Control: Two-phase version detection using file timestamps and MD5 hashes of page content
- Smart Chunking: Token-aware text splitting with natural boundary preservation
- Metadata Enrichment: Automatic extraction of document attributes and relationships, with the option to expand to hybrid search (keyword/filter + semantic) in the future
- Change Detection: Efficient handling of document updates and modifications via upsert logic that prevents duplicates and stale chunks from older file versions
- Adaptive Retrieval: Dynamic switching between vectorstore and web search based on query context
- Corrective-RAG: Document relevance verification
  - Grades each document retrieved from the vector DB for relevance to the question
  - Requires a minimum of 3 relevant documents
  - Triggers web search (2 attempts max) if too few relevant documents are found, supplementing the context to improve answer quality
- Self-Verification: Two-layer verification system
  - Layer 1: Built-in fact-checking and hallucination prevention
  - Layer 2: Relevance verification of the generated answer against the original question
- Source Attribution: Automatic linking to source documentation
- Multi-step Processing: Question routing, retrieval, generation, and verification pipeline
- Collaborative Agents: Specialized agents working together in a coordinated workflow:
  - Document Grading Agent: Evaluates document relevance (C-RAG)
  - Web Search Agent: Augments knowledge with online information
  - Generation Agent: Produces grounded responses using hub-optimized prompts
  - Verification Agents: Two-layer verification system
- Agent Communication: LangGraph-orchestrated message passing and state management
- Agentic Decision Making:
  - Autonomous routing between vectorstore and web search
  - Dynamic verification paths based on LLM-powered judges (LLM-as-a-judge):
    - Document relevance grading by `retrieval_grader`
    - Factual accuracy checking by `hallucination_grader`
    - Answer relevance verification by `answer_grader`
- Modular Design: Clear separation of concerns with independent components
- Extensible Pipeline: LangGraph-based workflow for easy modification - add new agents or change the order of execution
- LLM Testing: Basic unit tests for core components
- Performance Optimization:
  - Smart Updates:
    - Processes only changed documents, using timestamp + hash checks
    - Deletes old chunks before adding new ones
    - Updates multiple documents at once
  - Memory Efficiency:
    - Processes documents in batches
    - Collects document IDs in lists for bulk operations
    - Performs batch deletions using ID lists
    - Groups document updates into single transactions
  - Uses in-memory storage for development
  - Optional SQLite storage for persistence in production
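The smart-update behavior can be illustrated with a toy in-memory store; the real implementation would use ChromaDB's batch delete/add calls, but the upsert pattern, deleting a document's stale chunks before inserting fresh ones, is the same. All names below are hypothetical:

```python
# Illustrative upsert over a toy in-memory store keyed by chunk ID.
# The real system performs the equivalent delete/add against ChromaDB.

def upsert_document(store: dict, doc_id: str, new_chunks: list) -> None:
    """Replace all chunks of a document in one batched operation."""
    # 1. Collect IDs of stale chunks belonging to this document
    stale_ids = [cid for cid, meta in store.items()
                 if meta["source"] == doc_id]
    # 2. Batch-delete them before inserting, preventing duplicates
    for cid in stale_ids:
        del store[cid]
    # 3. Batch-add the fresh chunks with deterministic IDs
    for i, chunk in enumerate(new_chunks):
        store[f"{doc_id}#{i}"] = {"source": doc_id, "text": chunk}

store = {
    "a.md#0": {"source": "a.md", "text": "old version"},
    "b.md#0": {"source": "b.md", "text": "untouched"},
}
upsert_document(store, "a.md", ["new v1", "new v2"])
print(sorted(store))  # ['a.md#0', 'a.md#1', 'b.md#0']
```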
- Engine: ChromaDB
- Embedding Model: OpenAI Embeddings (default: `text-embedding-3-small`, 1536 dimensions, matching Amazon Titan Text Embeddings)
- Similarity Method: Cosine similarity
- Collection Name: "public_data"
- Chunking Strategy: RecursiveCharacterTextSplitter with tiktoken
  - Chunk Size: 500 tokens
  - Overlap: 50 tokens (10% overlap)
  - Split Hierarchy: paragraphs → lines → sentences → clauses → words → chars
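The splitting hierarchy can be sketched without dependencies as a recursive fallback through boundary types. The actual pipeline uses LangChain's `RecursiveCharacterTextSplitter` with tiktoken token counts and 10% overlap; this toy version counts words and omits overlap:

```python
# Dependency-free sketch of recursive splitting with a boundary hierarchy.
# Real pipeline: tiktoken token lengths, chunk_size=500, chunk_overlap=50.

def recursive_split(text, max_words, seps=("\n\n", "\n", ". ")):
    """Split at the coarsest boundary that keeps every piece under max_words."""
    if len(text.split()) <= max_words:
        return [text]
    for i, sep in enumerate(seps):
        parts = text.split(sep)
        if len(parts) > 1:
            out = []
            for part in parts:
                # recurse with the finer separators only
                out.extend(recursive_split(part, max_words, seps[i + 1:]))
            return out
    # no natural boundary left: hard-split on words as a last resort
    words = text.split()
    return [" ".join(words[j:j + max_words])
            for j in range(0, len(words), max_words)]

doc = "First paragraph sentence one. Sentence two.\n\nSecond paragraph here."
print(recursive_split(doc, max_words=6))
```

Paragraph boundaries are tried first, so whole paragraphs survive intact whenever they fit under the limit.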
- Query Router: Context-aware routing between vectorstore and web search
- Retriever: Similarity-based document retrieval (k=7)
- Generator: Response generation with source grounding
- Generator Model: `claude-3-5-sonnet-20240620` (same model is available on AWS Bedrock)
- Verifier: Multi-step verification for factual accuracy
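The two-layer check can be sketched as follows, with `judge()` standing in for the LLM-as-a-judge graders (`hallucination_grader`, `answer_grader`); the substring heuristic is purely a placeholder for an LLM call:

```python
# Sketch of the two-layer verification step. judge() is a stand-in for
# an LLM grader returning a binary verdict; the real graders live in
# graph/chains and call an LLM.

def judge(prompt: str, context: str) -> bool:
    """Toy placeholder: every word of the prompt must appear in the context."""
    return all(word in context for word in prompt.split())

def verify(answer: str, docs: str, question: str) -> str:
    # Layer 1: is the answer grounded in the retrieved documents?
    if not judge(answer, docs):
        return "not grounded: regenerate"
    # Layer 2: does the answer actually address the question?
    if not judge(question, answer):
        return "off-topic: augment with web search"
    return "verified"

print(verify("paris", "paris is the capital of france", "paris"))
print(verify("berlin", "paris is the capital of france", "paris"))
```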
```
advanced-rag-agents/
├── graph/                              # Core RAG system components
│   ├── chains/                         # LLM chain definitions
│   │   ├── answer_grader_dev.py        # Grades final answer relevance
│   │   ├── hallucination_grader_dev.py # Checks for factual accuracy
│   │   ├── retrieval_grader_dev.py     # Grades document relevance
│   │   └── router_dev.py               # Routes questions to appropriate source
│   ├── nodes/                          # Graph node implementations
│   │   ├── generate_dev.py             # Response generation node
│   │   ├── grade_documents_dev.py      # Document grading node
│   │   ├── retrieve_dev.py             # Vector DB retrieval node
│   │   └── web_search_dev.py           # Web search augmentation node
│   ├── utils/                          # Utility functions
│   │   ├── ingestion_formatter.py      # Pretty printing for ingestion
│   │   └── output_formatter.py         # Pretty printing for RAG output
│   ├── consts_dev.py                   # Graph constants and node names
│   ├── graph_dev.py                    # Main graph definition and workflow
│   └── state_dev.py                    # Shared graph state type definitions
├── public_data/                        # Directory for markdown files to ingest
├── .chroma/                            # ChromaDB persistence directory
├── .env                                # Environment variables (use .env_template as base)
├── ingestion_dev.py                    # Document processing and vectorstore updates
├── main_dev.py                         # Retrieval and RAG pipeline for development and testing
├── main.py                             # Application entry point, run this to use the RAG system
├── pyproject.toml                      # Poetry configuration with project dependencies
├── poetry.lock                         # Poetry lock file
└── README.md                           # Project documentation
```

- ANTHROPIC_API_KEY: For Claude 3.5 Sonnet (grading and generation)
- OPENAI_API_KEY: For embeddings model
- TAVILY_API_KEY: For web search capabilities
- LANGCHAIN_API_KEY (Optional): For tracing and monitoring with LangSmith
- Clone the repository
- Install dependencies with Poetry:

  ```
  poetry install
  ```

- Set up environment variables (use `.env_template` as base):

  ```
  ANTHROPIC_API_KEY=your_key_here
  OPENAI_API_KEY=your_key_here
  TAVILY_API_KEY=your_key_here
  ANTHROPIC_MODEL=claude-3-sonnet-20240320
  LANGCHAIN_API_KEY=your_key_here  # Optional: For tracing with LangSmith
  ```

- Run the application:

  ```
  poetry run python main.py
  ```
- Document Ingestion: First checks the `public_data` directory for new or modified markdown files. If changes are detected, it prompts the user to run the ingestion pipeline. When approved, it uses a two-phase versioning system (timestamp + MD5 hash) to process only changed documents, updating the ChromaDB vector store with properly chunked and embedded content. Each document is split into 500-token chunks with 10% overlap for optimal retrieval.
- Interactive Question Answering: After ingestion, an interactive CLI starts where the user can:
  - Enter questions (type 'quit' to exit)
  - Get responses through the RAG pipeline, which:
    - Routes questions between vectorstore and web search
    - Retrieves and grades relevant documents (minimum 3 required)
    - Generates answers using Claude 3.5 Sonnet
    - Verifies responses through a two-layer fact-checking system
    - Provides source attribution for all answers
All operations are logged to `rag_system.log` for monitoring and debugging.

