A set of services for prototyping a voice chatbot: the user speaks to it, and it speaks its answer back.
This repository is designed to run as an application on a personal computer with one GPU that has at least 6 GB of VRAM. Further modifications are required to serve it to multiple users in the cloud. The tutorial associated with this release is available on my blog here: https://website.vincent-roger.fr/blog/2025/03-17-speechbot-starting-point/. More tutorials will follow on my blog with the next releases.
The announcement video for the first version of this project is now live on YouTube.
A Streamlit service that interacts with the user (to record speech and play the answer). It calls the other services in the order in which they are described here.
This service is responsible for converting spoken language into text. It uses the Whisper tiny model.
A chatbot (based on Qwen 2.5.1) that generates a text answer to the user's request.
This service converts the generated text response back into spoken language. It uses the mini parler-tts model.
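The order in which the user interface calls the other services can be sketched in Python as follows. This is a minimal illustration, not the repository's code: `run_turn`, `transcribe`, `generate_answer`, and `synthesize` are hypothetical names, and the three services are represented by plain callables rather than real calls to the Whisper, Qwen, and parler-tts containers.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_answer: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one conversation turn: speech -> text -> answer -> speech."""
    # 1. Speech-to-text: convert the recorded audio into a text request.
    request_text = transcribe(audio_in)
    # 2. Chatbot: produce a text answer to the request.
    answer_text = generate_answer(request_text)
    # 3. Text-to-speech: convert the answer back into audio.
    return synthesize(answer_text)


# Hypothetical stand-ins for the three services, for illustration only.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate_answer(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    reply = run_turn(b"hello", fake_transcribe, fake_generate_answer, fake_synthesize)
    print(reply)
```

Passing the services in as callables keeps the ordering logic separate from how each service is actually reached, so the stand-ins could be swapped for real clients without changing `run_turn`.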
Before setting up the project, ensure you have the following installed:
- Docker
- Docker Compose
- Docker GPU support - Follow the NVIDIA Docker guide for installation.
Tested on Linux (Fedora 41). Feedback on other platforms is welcome.
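As a rough way to check part of these prerequisites before building, the sketch below looks for executables on the PATH and parses a list of per-GPU memory totals. It is not part of the repository: the function names and the `MIN_VRAM_MIB` threshold are assumptions made for illustration, and the check does not verify the Docker Compose plugin or the NVIDIA container runtime.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed threshold: 6 GB expressed in MiB. Reported totals can be slightly
# below the nominal card size, so treat this value as approximate.
MIN_VRAM_MIB = 6 * 1024


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def gpus_with_enough_vram(
    smi_output: str,
    minimum_mib: int = MIN_VRAM_MIB,
) -> list[int]:
    """Return the indices of GPUs whose total memory meets the minimum.

    Expects one integer per line (total memory in MiB), one line per GPU.
    """
    totals = [int(line) for line in smi_output.splitlines() if line.strip()]
    return [index for index, total in enumerate(totals) if total >= minimum_mib]
```

For example, `missing_tools(["docker", "nvidia-smi"])` lists which of the two commands are absent, and `gpus_with_enough_vram` can be fed the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`.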
To set up the project, clone the repository and navigate to the project directory:
git clone https://github.com/yourusername/voice-chatbot.git
cd voice-chatbot
- Build and launch the Docker containers using the command:
  docker compose up --build
- Open your web browser and go to http://localhost:8501/.
- The web interface waits until the other services are up and running before you can speak with the bot.
- Use the interface to record your speech and get a spoken response from the chatbot.
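The waiting behaviour described in the steps above can be illustrated with a simple polling loop. This is a sketch under assumptions, not the repository's implementation: `wait_until_ready` and `port_is_open` are hypothetical helpers, and how the real interface detects that a service is ready is not documented here.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        # Pause between attempts to avoid a busy loop.
        sleep(interval_s)
```

A caller could combine the two as `wait_until_ready(lambda: port_is_open(host, port))` for each service. An open port only shows that something is listening; a service that is still loading its model may need a dedicated health check instead.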
Contributions are welcome! Please follow these steps to contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature-branch).
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- Whisper tiny model: MIT License
- Qwen 2.5.1: Apache License 2.0
- Mini parler-tts model: Apache License 2.0