A 100% local, privacy-focused AI research assistant that can answer questions about your own documents using a Retrieval-Augmented Generation (RAG) pipeline.
All processing — including the database, vector search, and AI inference — happens entirely on your machine. No internet calls, no API keys, no data leaks.
- 📚 Document Ingestion — Supports PDF (.pdf), Markdown (.md), and Text (.txt) files.
- 🧠 Local Question Answering — Fully local AI inference, no cloud required.
- 🌐 Simple Web Interface — Easy-to-use browser UI.
- 🗣 Text-to-Speech — Convert answers to speech instantly.
- 🔒 Completely Private — Your data never leaves your computer.
- ⚙ Extensible — Modular architecture for adding features easily.
Tech stack:
- Backend: Python, FastAPI, Uvicorn
- AI / ML: LangChain, Sentence Transformers (Hugging Face), PyTorch
- Vector Database: ChromaDB
- Text-to-Speech: pyttsx3
- Frontend: HTML, CSS, JavaScript
personal-ai-research-assistant/
├── data/ # Your .pdf, .md, and .txt files
├── models/ # Local AI model file(s)
├── services/
│ ├── ingestion/ # Document loading and database creation
│ ├── langchain_api/ # FastAPI backend
│ ├── tts/ # Text-to-speech logic
│ └── web/ # Frontend (HTML/CSS/JS)
├── .env # Configuration file (create this)
├── requirements.txt # Python dependencies
└── run_all.py # Main startup script
git clone https://github.com/YourUsername/personal-ai-research-assistant.git
cd personal-ai-research-assistant
python -m venv .venv
# Activate (Windows Git Bash)
source .venv/Scripts/activate
# Activate (Linux/Mac)
source .venv/bin/activate
pip install -r requirements.txt
Create a .env file in the project root:
# --- ChromaDB Settings ---
CHROMA_DIR=./chroma_db
CHROMA_COLLECTION_NAME=personal_ai_research_assistant
# --- Embedding Model ---
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# --- Local AI Settings ---
LLM_MODEL_PATH=./models/
Place .pdf, .txt, or .md files into data/. Then run:
python services/ingestion/ingest.py
python run_all.py
This starts Uvicorn listening on 0.0.0.0:8000. From the same machine, the API is reachable at http://127.0.0.1:8000.
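For the ingestion step above, it can help to check which files in data/ would be picked up before running the script. The internals of ingest.py are not shown here, so the following is only a minimal sketch: `find_documents` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the recursive, case-insensitive matching is an assumption rather than the script's confirmed behavior.

```python
from pathlib import Path

# Extensions listed as supported in the feature list above.
# The constant name is hypothetical, not taken from ingest.py.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt"}


def find_documents(data_dir):
    """Return supported files under data_dir, sorted, including subfolders.

    Assumption: extensions are matched case-insensitively and subfolders
    are searched; the real ingestion script may behave differently.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_documents("data")` from the project root returns a list of `Path` objects. An empty list suggests the folder is missing or contains no supported files.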
- Open services/web/index.html in your browser.
- Type a question and click Ask.
- Click Speak Answer to hear it aloud.
(Tip: In VS Code, right-click index.html → Open with Live Server for instant reload.)
Example using curl:
curl -X POST "http://127.0.0.1:8000/ask" \
-H "Content-Type: application/json" \
-d '{"query": "Summarize the main points of the research documents."}'
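The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the URL and the `query` field come from the curl example above, while the helper names `build_ask_request` and `ask`, the 120-second timeout, and the expectation that the server replies with JSON are illustrative and not confirmed by the project.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "http://127.0.0.1:8000/ask"


def build_ask_request(query, url=API_URL):
    """Build a POST request carrying the same JSON body as the curl call."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query, url=API_URL, timeout=120):
    """Send the question and return the decoded response.

    Assumption: the backend answers with a JSON document. Its exact
    fields are not documented here, so the result is returned unchanged.
    """
    request = build_ask_request(query, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("Summarize the main points of the research documents.")` should return whatever the `/ask` endpoint sends back. Local inference can be slow, so a generous timeout is a reasonable starting point.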