The project in this repository is a fullstack LLM chat system with retrieval-augmented generation (RAG) features. After entering an arbitrary username on the login page, you are redirected to a simple chat interface, where you can create a new conversation with an LLM model in the context of a website of your choice. The model answers your questions based on the content of that website.
Once the website has been indexed, you can ask questions and keep a conversation that evolves with your interaction and follow-up questions. You can create multiple conversations about the same website without mixing their contexts, and you can log in with different usernames to get separate single-user chat sessions.
- LLM provider: OpenAI
  - Default model: gpt-4o-mini
  - Vector database: ChromaDB 0.5.23
- Backend: Python 3.12.5
  - Web server: Flask 3.0.3
  - RAG: Langchain 0.2.13
- Frontend: Node 20.16.0
  - Framework: Nuxt.js 3.12.4
  - UI components: Vuetify 3.6.14
Communication with the LLM model, as well as the history, chat and RAG features, is implemented with the Langchain library for Python. In summary, the website content is indexed into ChromaDB and a retrieval interface is then used to fetch the relevant document context. Chat sessions and history are stored in an SQLite file.
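As a rough illustration of this flow, the sketch below indexes a page and answers a question against it using Langchain's retrieval chain helpers. The function names, prompt, chunking parameters and the local `persist_directory` (the actual project talks to a separate chromadb container) are assumptions for the example, not the repository's code.

```python
# Minimal sketch of the index-then-retrieve flow, assuming Langchain 0.2.x
# with the langchain-openai and langchain-chroma packages installed.
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

embeddings = OpenAIEmbeddings()
llm = ChatOpenAI(model="gpt-4o-mini")


def index_website(url: str, collection_name: str) -> None:
    """Load a website, split it into chunks and store them in ChromaDB."""
    docs = WebBaseLoader(url).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(docs)
    Chroma.from_documents(
        chunks, embeddings,
        collection_name=collection_name,
        persist_directory="./chroma",  # the real project uses the chromadb container
    )


def answer_question(question: str, collection_name: str) -> str:
    """Retrieve relevant chunks for the question and ask the LLM."""
    store = Chroma(
        collection_name=collection_name,
        embedding_function=embeddings,
        persist_directory="./chroma",
    )
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer using only the following context:\n\n{context}"),
        ("human", "{input}"),
    ])
    chain = create_retrieval_chain(
        store.as_retriever(),
        create_stuff_documents_chain(llm, prompt),
    )
    return chain.invoke({"input": question})["answer"]
```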
To give each website its own ChromaDB collection, I use the SHA256 hash of its URL (referred to as `url_hash`) as the collection identifier. To identify unique chat sessions, I combine the `url_hash`, the `user_id` and the session's database primary key into a `session_id`, as sketched below.
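A minimal sketch of that identifier scheme; the exact separator and format used in the repository may differ:

```python
# Illustrative only: how url_hash and session_id could be derived.
import hashlib


def url_hash(url: str) -> str:
    """SHA256 hash of the URL, used as the ChromaDB collection identifier."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def session_id(url: str, user_id: int, session_pk: int) -> str:
    """Combine url_hash, user_id and the session's primary key into a unique id."""
    return f"{url_hash(url)}:{user_id}:{session_pk}"
```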
This implementation provides two major endpoints, `index_url` and `ask`.

- The `POST /api/index_url` endpoint expects a URL in the request body key `url`. This URL will be indexed and its content stored in ChromaDB.
- The `POST /api/ask` endpoint expects a URL in the request body key `url`, together with a `message` and a `session_id`.

Besides these, the `GET /api/sessions`, `POST /api/sessions`, `GET /api/login` and `GET /api/messages` endpoints were implemented to support the rest of the application.
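For illustration, the snippet below calls the two main endpoints with Python's `requests`; the host, port, payload values and response shape are assumptions, not taken from the repository.

```python
# Hypothetical client calls against the backend's two main endpoints.
import requests

BASE = "http://localhost:5000"  # assumed backend address

# Index a website so its content becomes available for retrieval.
requests.post(f"{BASE}/api/index_url", json={"url": "https://example.com"})

# Ask a question about the indexed website within an existing chat session.
resp = requests.post(
    f"{BASE}/api/ask",
    json={
        "url": "https://example.com",
        "message": "What is this website about?",
        "session_id": "abc123:1:42",  # hypothetical session_id
    },
)
print(resp.json())
```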
The project is divided into three containers: backend, frontend and chromadb. To run it, create a .env file with your environment variables (following the example.env file) and run `docker compose up`.