An AI-powered document assistant that lets you upload any PDF and have a natural conversation with its contents β built with LangChain, LLaMA 3.3, and Streamlit.
π Live Demo
Most chatbots only know what they were trained on. This one reads your documents in real time.
Upload a PDF β a research paper, contract, textbook, report β and ask questions about it. The app retrieves the most relevant sections and feeds them to the LLM as context, so answers are grounded in your document rather than general knowledge.
User uploads PDF
β
PDF is split into overlapping text chunks (chunk_size=1000, overlap=200)
β
Chunks are embedded using sentence-transformers/all-MiniLM-L6-v2
β
Embeddings stored in ChromaDB (in-memory vector store)
β
User sends a message β top-3 relevant chunks retrieved via similarity search
β
Chunks injected into prompt β LLaMA 3.3 70B generates a grounded response
| Layer | Tool |
|---|---|
| LLM | LLaMA 3.3 70B via Groq API |
| Orchestration | LangChain |
| Embeddings | HuggingFace all-MiniLM-L6-v2 |
| Vector Store | ChromaDB |
| PDF Parsing | PyPDFLoader |
| UI | Streamlit |
- Upload any PDF and query it in natural language
- Retrieval-Augmented Generation (RAG) pipeline β answers grounded in document content
- Persistent chat history within session
- Switchable AI personas (Helpful Assistant, Engineering Tutor, Financial Advisor, Exam Coach, Creative Writing)
- Adjustable temperature for response creativity
- Clean one-click chat reset
git clone https://github.com/Wemelo1/llm-assistant.git
cd llm-assistant
pip install -r requirements.txt
streamlit run PDF_APP.pyYou'll need a free Groq API key β enter it in the sidebar when the app loads.
Standard LLM chatbots hallucinate when asked about specific documents. I wanted to understand how RAG solves this by anchoring model responses to retrieved source content. This project taught me the full pipeline: chunking strategy, embedding tradeoffs, vector similarity search, and prompt construction with injected context.
- Support for multiple PDFs simultaneously
- Source citation β show which chunk each answer came from
- Persistent vector store (currently resets on session end)
- Swap ChromaDB for FAISS for faster local retrieval
Built by Pr0_M1se β LLM Engineer