A Python project for generating subtitles from audio files using Voice Activity Detection (VAD) and Speech-to-Text (STT) models.
The subtitle language matches the language of the selected Vosk model. Subtitles can be generated for any language supported by both Vosk and Silero-VAD models.
Sample results are available in the example folder.
- Voice Activity Detection (VAD) using Silero VAD
- Speech-to-Text (STT) using Vosk (English)
- Support for any Vosk model
- Subtitle file generation (SRT format)
- Translation support for other languages
- Text-to-Speech (TTS) integration
- CLI interface
- GUI interface
- Docker support
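Since the output is SRT, it may help to see what that format looks like in practice. The following is a minimal sketch of SRT rendering, not this project's implementation: the `Cue` type and both function names are hypothetical and chosen only for illustration. An SRT entry consists of a 1-based index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle entry (hypothetical type, not part of this project)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as SRT text: index, time range, text, then a blank line."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([Cue(0.0, 1.25, "Hello")])` yields an entry numbered 1 with the range `00:00:00,000 --> 00:00:01,250`.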
- Clone the repository:

    git clone https://github.com/DataSciense-py/subtitel-generator.git
    cd subtitel-generator

- Install dependencies using Poetry:

    poetry install

- Download the required Vosk model and place its folder inside the models/vosk/ directory. For example, for English, use models/vosk/vosk-en (where vosk-en is any Vosk model folder you want to use). No additional setup is required: the program automatically uses the model from the specified folder.
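Before running the program, it can be useful to confirm that a model folder is actually where it is expected. The helper below is a small sketch for that check only. The function name `list_vosk_models` is hypothetical, and it says nothing about how the program itself selects a model; it just lists the subfolders of `models/vosk`.

```python
from pathlib import Path


def list_vosk_models(models_root: str = "models/vosk") -> list[str]:
    """Return the sorted names of the folders found directly under models_root.

    Hypothetical helper for a manual sanity check; returns an empty list
    when the directory does not exist.
    """
    root = Path(models_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Running `print(list_vosk_models())` from the repository root should show the folder you placed there, for example `vosk-en`.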
See src/main.py for a runnable example. Basic usage:

    from subtitel_generator import SubtitelGenerator
    from subtitel_generator.file_generator import SrtSubtitleFileGenerator
    from subtitel_generator.speech_to_text import VoskSTT
    from subtitel_generator.voive_activation_detector import VADSilero

    # Combine the three pipeline stages: VAD, STT, and subtitle file output.
    s = SubtitelGenerator(
        vad=VADSilero(),
        stt=VoskSTT(),
        file_generater=SrtSubtitleFileGenerator(),
    )
    s.generate(audio_file_path="example/Example_audio_endlish_small.wav")  # Path to the audio file

Or you can use the CLI interface:

    cd src
    python cli.py

Then enter the full path to the audio or video file.
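The usage above passes three interchangeable components into one orchestrating object. The sketch below illustrates that composition pattern with stub components so it runs without any models installed. It is an assumption-laden illustration, not the project's code: the `Pipeline` class and the method names `detect`, `transcribe`, and `write` are all hypothetical.

```python
from typing import Protocol


class VAD(Protocol):
    # Hypothetical interface: return (start, end) speech segments in seconds.
    def detect(self, audio_path: str) -> list[tuple[float, float]]: ...


class STT(Protocol):
    # Hypothetical interface: return the text recognized in one segment.
    def transcribe(self, audio_path: str, start: float, end: float) -> str: ...


class FileGenerator(Protocol):
    # Hypothetical interface: turn (start, end, text) cues into file content.
    def write(self, cues: list[tuple[float, float, str]]) -> str: ...


class Pipeline:
    """Hypothetical stand-in for an orchestration class."""

    def __init__(self, vad: VAD, stt: STT, file_generator: FileGenerator) -> None:
        self.vad = vad
        self.stt = stt
        self.file_generator = file_generator

    def generate(self, audio_file_path: str) -> str:
        cues = []
        for start, end in self.vad.detect(audio_file_path):
            text = self.stt.transcribe(audio_file_path, start, end).strip()
            if text:  # skip segments where nothing was recognized
                cues.append((start, end, text))
        return self.file_generator.write(cues)


# Stub components standing in for real VAD, STT, and file output.
class FixedVAD:
    def detect(self, audio_path):
        return [(0.0, 1.5), (2.0, 2.4), (3.0, 4.0)]


class EchoSTT:
    def transcribe(self, audio_path, start, end):
        # Pretend that very short segments contain no recognizable speech.
        return "" if end - start < 0.5 else f"speech at {start:.1f}s"


class PlainTextGenerator:
    def write(self, cues):
        return "\n".join(f"[{s:.1f}-{e:.1f}] {t}" for s, e, t in cues)


result = Pipeline(FixedVAD(), EchoSTT(), PlainTextGenerator()).generate("audio.wav")
```

Because each stage only has to satisfy a small interface, swapping in another recognizer or output format would not require changing the orchestration logic in a design like this.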
- src/: Source code
  - subtitel_generator/: Main package
    - __init__.py: Package initialization
    - subtitel_model: Subtitle model
    - subtitel_generator.py: Main orchestration class
    - file_generator/: Subtitle file generators (e.g., SRT)
    - speech_to_text/: Speech-to-text models (Vosk)
    - voive_activation_detector/: Voice activity detection (Silero)
    - utils/: Utilities
      - logging: Logging utilities
  - cli.py: Command-line interface
  - main.py: Main script
- models/: Pretrained models (e.g., Vosk)
- example/: Example audio files
- tests/: Unit tests
Run tests with:

    poetry run pytest

This project is licensed under the terms of the Apache 2.0 License.
Contributions are welcome! Please open issues or submit pull requests.
- Test video author: shovonrdm