Welcome to SpeechBot, an innovative voice assistant that integrates advanced speech-to-text, large language model, and text-to-speech technologies for a seamless interactive experience.

vroger11/SpeechBot

SpeechBot

A set of services for prototyping a voice chatbot: the user speaks to the bot, and the bot speaks its answer back.

This repository targets a single personal computer with one GPU that has at least 6 GB of VRAM; serving multiple users in the cloud requires further modifications. The tutorial associated with this release is available on my blog: https://website.vincent-roger.fr/blog/2025/03-17-speechbot-starting-point/. More tutorials will follow on the blog with future releases.

The announcement video for the first version of this project is now live on YouTube.

Services description

Frontend

A Streamlit service that interacts with the user (to record speech and play the answer). It calls the other services in the order in which they are described below.
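The call order described above can be sketched as a small pipeline. This is only an illustration, not the code of this repository: the class name `SpeechBotPipeline` and the three callables are hypothetical stand-ins for the real services.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SpeechBotPipeline:
    """Hypothetical wrapper chaining the three services in order."""

    speech_to_text: Callable[[bytes], str]   # stands in for the STT service
    generate_answer: Callable[[str], str]    # stands in for the LLM service
    text_to_speech: Callable[[str], bytes]   # stands in for the TTS service

    def run(self, recorded_audio: bytes) -> Tuple[str, str, bytes]:
        # 1. Transcribe what the user said.
        question = self.speech_to_text(recorded_audio)
        # 2. Ask the chatbot for a text answer.
        answer = self.generate_answer(question)
        # 3. Turn the answer back into audio.
        spoken_answer = self.text_to_speech(answer)
        return question, answer, spoken_answer


# Toy stand-ins so the sketch runs without any model or network access.
pipeline = SpeechBotPipeline(
    speech_to_text=lambda audio: audio.decode("utf-8"),
    generate_answer=lambda text: f"You said: {text}",
    text_to_speech=lambda text: text.encode("utf-8"),
)
question, answer, spoken_answer = pipeline.run(b"hello")
```

Keeping each stage behind a plain callable makes it easy to swap a stub for a real HTTP call to the corresponding service.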

Speech To Text service (stt_service)

This service is responsible for converting spoken language into text. It uses the Whisper tiny model.
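For readers who want to try the model outside the containers, here is a minimal sketch of transcribing a file with Whisper tiny. It assumes the Hugging Face `transformers` library and the `openai/whisper-tiny` checkpoint; the README does not say which library the service actually uses, and the helper name `transcribe` is hypothetical.

```python
from typing import Callable, Optional


def transcribe(audio_path: str, asr: Optional[Callable] = None) -> str:
    """Return the transcript of an audio file.

    `asr` is any callable that maps an audio path to {"text": ...}.
    When omitted, a Whisper tiny pipeline is created (assumption: the
    Hugging Face `transformers` package is installed, along with ffmpeg
    for decoding audio files).
    """
    if asr is None:
        # Imported lazily so the function can be tested without the model.
        from transformers import pipeline

        asr = pipeline(
            "automatic-speech-recognition",
            model="openai/whisper-tiny",
        )
    result = asr(audio_path)
    return result["text"].strip()
```

Calling `transcribe("question.wav")` downloads the checkpoint on first use; the file name here is only a placeholder.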

Large Language Model service (llm_service)

A chatbot (based on Qwen 2.5.1) that generates a text answer to the user's request.

Text-to-Speech (TTS) Service

This service converts the generated text response back into spoken language. It uses the Parler-TTS Mini model.

Prerequisites

Before setting up the project, ensure you have the following installed:

Tested on Linux (Fedora 41). Feedback on other platforms is welcome.

Setup

To set up the project, clone the repository and navigate to the project directory:

git clone https://github.com/vroger11/SpeechBot.git
cd SpeechBot

Usage

  1. Build and launch the Docker containers using the command:

    docker compose up --build
  2. Open your web browser and go to http://localhost:8501/.

  3. The web interface waits until the other services are up and running before you can speak with the bot.

  4. Use the interface to record your speech and get a spoken response from the chatbot.
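The waiting behaviour in step 3 can be approximated by a small polling loop. This is a sketch under assumptions, not the frontend's actual implementation: the function names, the retry settings, and the example URLs are all hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable, Iterable


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply...
        return False


def wait_for_services(
    urls: Iterable[str],
    probe: Callable[[str], bool] = is_up,
    retries: int = 30,
    delay: float = 1.0,
) -> bool:
    """Poll every URL until all respond, or give up after `retries` rounds."""
    pending = list(urls)
    for attempt in range(retries):
        # Keep only the services that are still not answering.
        pending = [url for url in pending if not probe(url)]
        if not pending:
            return True
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

A frontend could call, for example, `wait_for_services(["http://stt_service:8000/health"])` before enabling the record button; that host, port, and path are placeholders, not the endpoints of this project.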

Contributing

Contributions are welcome! Please follow these steps to contribute:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add some feature').
  5. Push to the branch (git push origin feature-branch).
  6. Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Licenses of the Models used
