SpeechBot

A set of services for prototyping a voice chatbot: you speak to the bot, and it speaks its answer back to you.

This repository is designed to run the application on a personal computer (with one GPU that has at least 6 GB of VRAM). Further modifications are required to serve multiple users in the cloud. The tutorial associated with this release is available on my blog: https://website.vincent-roger.fr/blog/2025/03-17-speechbot-starting-point/. More tutorials will follow on my blog with future releases.

The announcement for the first version of this project is now live as a video on YouTube.

Services description

Frontend

A Streamlit service that interacts with the user (to record speech and play the answer). It calls the other services in the order in which they are described here.
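The call order described above can be sketched as a single function. This is only an illustration, not the repository's actual frontend code: the three callables stand in for requests to the services, and their names and signatures are assumptions made for this example.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> tuple[str, str, bytes]:
    """Run one conversation turn: speech -> text -> answer -> speech.

    The three callables are hypothetical stand-ins for clients of the
    speech-to-text, LLM and text-to-speech services.
    """
    question = transcribe(audio_in)   # speech to text
    answer = generate(question)       # chatbot answer as text
    audio_out = synthesize(answer)    # text back to speech
    return question, answer, audio_out


if __name__ == "__main__":
    # Stub services so the sketch runs without any container.
    question, answer, audio = run_turn(
        b"\x00\x01",
        transcribe=lambda audio: "hello",
        generate=lambda text: f"You said: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(question, "->", answer, f"({len(audio)} bytes of audio)")
```

Passing the services in as callables keeps the ordering logic independent of how each service is actually reached, which also makes it easy to try out with stubs.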

Speech To Text service (stt_service)

This service is responsible for converting spoken language into text. It uses the Whisper tiny model.

Large Language Model service (llm_service)

A chatbot (based on Qwen 2.5.1) that generates a text answer to the user's request.

Text To Speech service (tts_service)

This service converts the generated text response back into spoken language. It uses the Parler-TTS Mini model.

Prerequisites

Before setting up the project, ensure you have the following installed:

Docker
Docker Compose
Docker GPU support - Follow the NVIDIA Docker guide for installation.

Tested on Linux (Fedora 41). Feedback on other platforms is welcome.
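To check the 6 GB VRAM requirement before building, something like the following sketch may help. It is not part of the repository: the function names are made up for this example, and it assumes `nvidia-smi` is available on the host. Note that some cards sold as 6 GB report slightly less than 6144 MiB, so the threshold is an approximation you may want to lower.

```python
import subprocess

MIN_VRAM_MIB = 6 * 1024  # 6 GB minimum, expressed in MiB (approximate threshold)


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def has_enough_vram(smi_output: str, minimum_mib: int = MIN_VRAM_MIB) -> bool:
    """Return True if at least one GPU reports minimum_mib or more."""
    return any(total >= minimum_mib for total in parse_vram_mib(smi_output))


def read_nvidia_smi() -> str:
    """Ask nvidia-smi for the total memory of each GPU, one value per line."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a machine with the NVIDIA driver installed, `has_enough_vram(read_nvidia_smi())` should tell you whether at least one GPU meets the threshold.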

Setup

To set up the project, clone the repository and navigate to the project directory:

git clone https://github.com/yourusername/voice-chatbot.git
cd voice-chatbot

Usage

Build and launch the Docker containers using the command:
```
docker compose up --build
```
Open your web browser and go to http://localhost:8501/.
The web interface waits until the other services are up and running before it lets you speak with the bot.
Use the interface to record your speech and get a spoken response from the chatbot.
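The waiting behaviour mentioned above can be approximated with a simple polling loop. This is a minimal sketch under assumptions, not the frontend's actual implementation: the function names, the retry policy, and the idea that each service exposes an HTTP URL answering with a 2xx status are all hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_services(
    urls: Iterable[str],
    probe: Callable[[str], bool] = is_up,
    retries: int = 30,
    delay: float = 1.0,
) -> bool:
    """Poll every URL until all answer, or give up after `retries` rounds."""
    pending = list(urls)
    for _ in range(retries):
        # Keep only the services that still do not answer.
        pending = [url for url in pending if not probe(url)]
        if not pending:
            return True
        time.sleep(delay)
    return False
```

The `probe` parameter exists so the loop can be exercised without a network; in a real setup you would pass the URLs of whatever readiness endpoints the services actually expose.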

Contributing

Contributions are welcome! Please follow these steps to contribute:

Fork the repository.
Create a new branch (git checkout -b feature-branch).
Make your changes.
Commit your changes (git commit -m 'Add some feature').
Push to the branch (git push origin feature-branch).
Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Licenses of the Models used

Whisper tiny model: MIT License
Qwen 2.5.1: Apache License 2.0
Parler-TTS Mini model: Apache License 2.0
