🤖 Project Goal: Let the user upload a PDF and ask questions about its content. The app extracts the PDF's text, finds the passages most relevant to each question, and answers based on that text.
🛠️ Tools You Used:

| Tool | Purpose |
| --- | --- |
| Streamlit | Builds the user interface (upload, chat box) |
| PyPDF2 | Reads and extracts text from the PDF |
| LangChain | Handles the text processing and Q&A logic |
| HuggingFace Transformers | Provides the language model for answering |
| FAISS | Searches through PDF content to find the most relevant parts |
| dotenv | Loads secrets (like API keys) from .env file |
| Pickle | Saves and loads the processed data to speed things up |

🧩 Step-by-Step Explanation:
- 🖼️ Sidebar UI: Displays the app title and info.
Built with st.sidebar and markdown.
- 📤 PDF Upload: The user uploads a PDF file.
The code extracts its text using PdfReader.
- ✂️ Split Text into Chunks: Large PDF texts are split into smaller parts (1000 characters each, with some overlap between neighboring chunks).
Smaller chunks can be embedded and searched individually, and the overlap makes it less likely that a passage is lost by being cut at a chunk boundary.
- 💾 Create or Load Vector Store: Converts the text chunks into vectors using Hugging Face embeddings.
Indexes the vectors with FAISS (for fast similarity search).
Stores this data as a .pkl file so you don’t need to redo it every time.
- ❓ User Enters a Question: You type a question like "What is this PDF about?"
- 🔍 Search Similar Chunks: The app finds the 3 most relevant chunks from the PDF.
- 🤖 Generate Answer: Uses the Hugging Face model flan-t5-base to read the selected chunks and generate an answer to your question.
- 📃 Show the Answer: The answer is displayed in the app using st.write().
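
The chunking step in the list above can be sketched in plain Python. This is a simplified stand-in, not the app's actual splitter (the walkthrough only says LangChain handles text processing): it cuts fixed-size character windows and ignores sentence boundaries. The function name `split_into_chunks` and the 200-character overlap are assumptions for illustration; only the 1000-character chunk size comes from the description.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters.

    Hypothetical helper: the overlap value of 200 is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With these defaults, each chunk begins with the last 200 characters of the previous one, so a passage that falls on a boundary is more likely to appear whole in one of the two chunks.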
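
The "create or load" caching pattern from the vector store step might look roughly like the sketch below. The helper name `load_or_build`, the cache path, and the `build_fn` callback are all hypothetical; in the app, `build_fn` would be whatever code embeds the chunks and builds the index. Whether a particular vector store object can be pickled depends on the library version, so treat this only as an illustration of the pattern.

```python
import os
import pickle


def load_or_build(cache_path, build_fn):
    """Return the object cached at cache_path, or build it and cache it on a miss.

    Hypothetical helper: build_fn stands in for the expensive embedding step.
    """
    if os.path.exists(cache_path):
        # Cache hit: skip the expensive work and reuse the saved data.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Cache miss: do the work once, then save the result for next time.
    data = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

Only load .pkl files that the app itself wrote, because unpickling data from an untrusted source can execute arbitrary code.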
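
To make the "find the 3 most relevant chunks" step concrete, here is a brute-force sketch of a top-k similarity search. FAISS does this job with optimized index structures; this version just scores every chunk. The function names are hypothetical, the vectors are assumed to come from some embedding model, and cosine similarity is an assumption, since the walkthrough does not say which distance metric the app's index uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,  # highest similarity first
    )
    return [chunk for chunk, _ in scored[:k]]
```

The returned chunks are what would then be passed to the language model, together with the question, as the context for generating the answer.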