VarDaan.ai 🤖

VarDaan.ai is an AI-powered platform that turns web content (blogs, articles, YouTube videos, or PDF documents) into a chatbot you can interact with. Provide a URL or upload a file, then ask natural-language questions about the content and receive contextually relevant answers in real time.

Features

Web Content as Chatbots: Turn any web page, blog, or article into an interactive chatbot.
YouTube Video Chatbot: Input a YouTube video URL and ask questions based on video content.
PDF Support: Upload PDF files and generate an AI chatbot to query the content.
Conversational AI: Powered by advanced natural language processing models that provide accurate answers based on the context.
Simple User Interface: Easy-to-use web interface that allows users to engage with different types of content seamlessly.

Installation

Prerequisites

Make sure you have the following installed:

Python 3.8+
pip (Python package installer)
Git

Steps to Install and Run VarDaan.ai Locally

Clone the Repository
Open your terminal and run the following command to clone the repository:
```
git clone https://github.com/amMistic/vardaan.ai.git
```
Navigate to the Project Directory
Move into the project folder:
```
cd vardaan.ai
```
Install Dependencies
Install the required Python packages by running:
```
pip install -r requirements.txt
```
Set Up Environment Variables
Create a .env file in the project root and add the required environment variables, such as API keys. For example:
```
PINECONE_API_KEY=<your_pinecone_api_key>
HUGGINGFACE_API_TOKEN=<your_api_token>
```
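As a hedged illustration, the sketch below shows one way to confirm that both variables are set before starting the app. The function name `missing_settings` is hypothetical and does not come from the repository, and the sketch assumes the `.env` values have already been loaded into the process environment by whatever loader the project uses.

```python
import os

# Variable names taken from the example .env file above.
REQUIRED_VARS = ("PINECONE_API_KEY", "HUGGINGFACE_API_TOKEN")


def missing_settings(env=None):
    """Return the required variable names that are absent or blank.

    `env` defaults to the process environment; passing a dict makes the
    check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running a check like this before `streamlit run app.py` surfaces a missing key early instead of at the first API call.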
Run the Application
Start the VarDaan.ai application using Streamlit:
```
streamlit run app.py
```
Access the App
Open your web browser and navigate to the local server link provided by Streamlit (usually http://localhost:8501).

Usage

Web Interface: Once the app is running, you will see a simple input field for URLs or file uploads (PDFs).
Enter Content:
- For blogs/articles: Enter the URL of the blog or article.
- For YouTube videos: Enter the YouTube video URL.
- For PDFs: Upload the PDF document directly into the app.
Interactive Chat: After processing the content, you can ask questions in the chat interface, and VarDaan.ai will respond based on the content provided.
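The three input types above imply some routing step that decides how a given input should be processed. The following is a minimal sketch of such a step, not the repository's actual code: `classify_source` and the `YOUTUBE_HOSTS` set are hypothetical names chosen for illustration, and the list of YouTube hostnames is an assumption.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube links (an assumed, non-exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_source(source: str) -> str:
    """Return 'youtube', 'web', or 'pdf' for a URL or an uploaded file name."""
    cleaned = source.strip()
    parsed = urlparse(cleaned)
    host = parsed.hostname or ""
    # Anything with an http(s) scheme and a host is treated as a URL.
    if parsed.scheme in ("http", "https") and host:
        return "youtube" if host in YOUTUBE_HOSTS else "web"
    # Otherwise fall back to the file extension of an upload.
    if cleaned.lower().endswith(".pdf"):
        return "pdf"
    raise ValueError(f"Unsupported input: {source!r}")


if __name__ == "__main__":
    for item in ("http://example.com/blog-post",
                 "https://www.youtube.com/watch?v=example-video",
                 "report.pdf"):
        print(item, "->", classify_source(item))
```

Keeping the decision in one small function means each content type can then be handed to its own extractor.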

Example Commands (Future Task)

Blog/Article: vardaan.ai.http://example.com/blog-post
YouTube Video: vardaan.ai.youtube.com/watch?v=example-video
PDF: Drag and drop a PDF document into the interface.

Project Structure

vardaan.ai/
│
├── app.py                   # Main application entry point
├── src/                     # Source code
│   ├── Online_src/           # Web content and YouTube processing
│   ├── Offline_src/          # PDF handling
│   ├── Handle_user.py        # Handles user queries and responses
│   └── embedding_model.py    # Embedding logic for vector storage
├── vecDatabase/             # Stores vectorized representations of content
├── requirements.txt         # List of Python dependencies
└── README.md                # Project documentation

How It Works

Extracting Content:
- Web Content: VarDaan.ai fetches and processes text from web pages or articles using web scraping methods.
- YouTube: The app uses the YouTube Transcript API to fetch the transcript (the spoken text) of a video.
- PDF: For PDF files, VarDaan.ai extracts the textual content and splits it into manageable chunks.
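To make the web-content case concrete, here is a rough sketch of pulling visible text out of an HTML page. It uses only Python's standard `html.parser` module as a stand-in; the scraping method VarDaan.ai actually uses is not specified here, and `VisibleTextParser` and `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


if __name__ == "__main__":
    page = "<html><head><style>p{color:red}</style></head><body><p>Hello world</p></body></html>"
    print(extract_text(page))
```

A real page would first have to be downloaded, and production scrapers usually also drop navigation and footer text, which this sketch does not attempt.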
Processing & Storage:
- The content is split into smaller text chunks.
- Each chunk is embedded using a pre-trained NLP model, converting the text into a vector format.
- These vectors are stored in a vector database (Chroma) for efficient querying.
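The chunking step can be sketched as follows. This is a simplified character-based splitter with overlap, written under the assumption that chunks are fixed-size windows; the chunk size, the overlap, and the name `split_into_chunks` are illustrative and are not taken from the project. Embedding and storage are left to the real model and vector database.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

Smaller chunks give more precise retrieval but less context per chunk, so the two parameters are worth tuning per content type.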
Conversational Queries:
- When the user asks a question, VarDaan.ai retrieves relevant information from the vector store.
- The retrieved chunks are passed as context to a language model, which generates a response grounded in that content.
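The query flow described above can be illustrated with a toy example. In this sketch a word-count vector stands in for the Hugging Face embedding model, and a plain Python list stands in for the Chroma store; both substitutions are assumptions made so the example runs with the standard library alone, and every function name is hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Return the top_k (score, chunk) pairs most similar to the question."""
    query_vec = embed(question)
    scored = [(cosine_similarity(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, retrieved: List[Tuple[float, str]]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(chunk for _, chunk in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    chunks = [
        "Streamlit provides the web interface.",
        "Chroma stores the embedding vectors.",
        "pdfplumber extracts text from PDF files.",
    ]
    question = "Where are the vectors stored?"
    print(build_prompt(question, retrieve(question, chunks, top_k=1)))
```

Real embeddings capture meaning rather than exact word overlap, so the ranking from an actual model will differ; only the shape of the retrieve-then-prompt flow carries over.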

Tech Stack

Python: Core programming language
Streamlit: Web interface framework
LangChain: Used for managing document chains and information retrieval
Chroma: For vector storage and search capabilities
Hugging Face Models: Provides the embeddings for text representation
YouTube Transcript API: Used to fetch video transcripts
pdfplumber: For handling PDF text extraction

Contributing

I welcome contributions from the community! To get started:

Fork the repository.

Create a new branch:
```
git checkout -b feature/your-feature-name
```

Make your changes and commit them:
```
git commit -m "Add a new feature"
```

Push the branch to your fork:
```
git push origin feature/your-feature-name
```

Open a pull request on GitHub.
