This project implements a Natural Language Query Agent that answers questions based on the content of uploaded PDF files. It uses Google Generative AI for embeddings and conversational responses, FAISS for vector storage, and Streamlit for the web interface.
- Extracts text from PDF files
- Splits text into manageable chunks
- Generates embeddings using Google Generative AI
- Stores embeddings in a FAISS vector store
- Handles user queries and generates detailed answers based on the content
- Provides a Streamlit interface for easy interaction
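The pipeline above (extract, chunk, embed, store, retrieve) can be sketched end to end in plain Python. This is a toy stand-in, not the project's code: real embeddings come from Google Generative AI and vector storage from FAISS; here a bag-of-words vector and brute-force cosine search make the flow visible without API keys, and all names are illustrative.

```python
# Toy retrieval pipeline sketch: chunk text, "embed" chunks, store the
# vectors, and retrieve the chunk most similar to a query.
import math
import re
from collections import Counter

def split_into_chunks(text, chunk_size=30, overlap=5):
    """Split text into overlapping character chunks (LangChain-splitter style)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text):
    """Stand-in embedding: a term-frequency vector over lowercase words."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Brute-force stand-in for FAISS: store (vector, chunk) pairs, search by cosine."""
    def __init__(self):
        self.entries = []

    def add(self, chunk):
        self.entries.append((embed(chunk), chunk))

    def search(self, query):
        qv = embed(query)
        return max(self.entries, key=lambda e: cosine(e[0], qv))[1]

store = TinyVectorStore()
for chunk in split_into_chunks("FAISS stores vectors. Streamlit renders the interface."):
    store.add(chunk)
print(store.search("which library stores vectors?"))  # best-matching chunk
```

In the real app, the retrieved chunks are passed to the conversational model as context rather than returned directly.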
To run the app in Google Colab:

- Open Google Colab:
  - Go to Google Colab.
- Clone the repository and open the `.ipynb` file:
  - Clone this repository containing the `.ipynb` file.
  - Upload the `.ipynb` file to Colab and open it.
- Install the necessary packages in Colab:

  ```shell
  !pip install streamlit
  !pip install google-generativeai
  !pip install python-dotenv
  !pip install langchain
  !pip install PyPDF2
  !pip install faiss-cpu
  !pip install langchain_google_genai
  !pip install pyngrok
  !pip install -U langchain-community
  ```
- Set up environment variables:
  - Go to Google AI Studio to obtain an API key.
  - In a code cell, add your Google API key:

    ```python
    import os
    os.environ["GOOGLE_API_KEY"] = "your_api_key_here"
    ```
- Run the Streamlit app within Colab:
  - Use the code in the notebook to run the Streamlit app; it includes the Ngrok setup for tunneling.
  - Go to Ngrok to get an auth token, and replace `your_ngrok_auth_token` with your actual Ngrok auth token.
- Start the Ngrok tunnel and access the app:
  - The notebook will print a public URL provided by Ngrok. Use this URL to access the Streamlit app and interact with it.
To run the app locally instead:

- Clone the repository:

  ```shell
  git clone <repository-url>
  cd <repository-directory>
  ```
- Install the required packages:

  ```shell
  pip install -r requirements.txt
  ```
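For reference, a `requirements.txt` mirroring the Colab install list would contain something like the following (an assumption about this repo's file; `pyngrok` is left out since Ngrok is only needed in Colab):

```
streamlit
google-generativeai
python-dotenv
langchain
langchain-community
langchain-google-genai
PyPDF2
faiss-cpu
```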
- Set up environment variables:
  - Create a `.env` file in the root directory.
  - Add your Google API key to it:

    ```
    GOOGLE_API_KEY=your_api_key_here
    ```
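Inside `app.py`, this file is typically read with `python-dotenv`'s `load_dotenv()`. As a sketch of what that loading does (a simplified pure-Python equivalent, not the project's actual code):

```python
# Simplified stand-in for python-dotenv's load_dotenv(): copy KEY=VALUE
# lines from a .env file into os.environ, skipping blanks and comments.
import os

def load_env_file(path=".env"):
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: a key already exported in the shell wins
            os.environ.setdefault(key.strip(), value.strip())

# Usage inside the app (assumes a .env file next to the script):
# load_env_file()
# api_key = os.environ["GOOGLE_API_KEY"]
```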
- Run the Streamlit app:

  ```shell
  streamlit run app.py
  ```

  Ngrok is not needed in this case; access the app at the local URL that Streamlit prints.
Planned improvements:

- Implement conversational memory for context-aware interactions
- Enhance error handling and robustness
- Optimize storage and retrieval for scalability
- Include citation of references in responses
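Conversational memory, the first item above, could be prototyped as a rolling window of past question/answer turns prepended to each new prompt. A minimal sketch follows (names are illustrative; LangChain's memory classes would be the production route):

```python
# Minimal rolling-window conversational memory: keep the last N exchanges
# and prepend them to each new question so the model sees recent context.
from collections import deque

class ConversationMemory:
    def __init__(self, max_turns=5):
        self.turns = deque(maxlen=max_turns)  # oldest turns drop off automatically

    def add_turn(self, question, answer):
        self.turns.append((question, answer))

    def build_prompt(self, question):
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in self.turns)
        return f"{history}\nQ: {question}\nA:" if history else f"Q: {question}\nA:"

memory = ConversationMemory(max_turns=2)
memory.add_turn("What is FAISS?", "A vector similarity search library.")
print(memory.build_prompt("Who maintains it?"))
```

A bounded `deque` keeps the prompt from growing without limit; tuning `max_turns` trades context depth against token cost.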