
Yuna TTS

Welcome to Yuna TTS, a leading text-to-speech (TTS) engine designed to provide a natural and expressive voice for your applications. Yuna TTS is a deep learning-based TTS engine that can generate high-quality speech from text. It is designed to be easy to use and integrate into your applications, with support for multiple languages and voices.

Table of Contents

  • Installation
  • Usage
      • Language Support
      • CLI
      • Python API
  • Training
      • Data Preparation
      • Training
      • Inference

Installation

This repository is developed and tested on Ubuntu 20.04 with Python 3.9. To install YunaTTS, follow the steps below:

git clone https://github.com/yukiarimo/yunatts.git
cd yunatts
pip install -e .
python -m unidic download
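
As a quick sanity check, you can try importing the Python API (the yuna.api module used in the examples below). This is only a minimal sketch that verifies the package is importable:

# Minimal install check: import the API class used later in this README.
# If this raises ImportError, the editable install did not complete.
from yuna.api import TTS

print("YunaTTS is installed; TTS class available:", TTS)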

Usage

Language Support

Languages supported by Yuna TTS:

  • English (EN)
  • Spanish (ES)
  • French (FR)
  • Chinese (ZH)
  • Japanese (JP)
  • Korean (KR)
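
Each language code above can be passed to the CLI (via --language, shown below) and to the Python API. As a small sketch using the Python API described later in this README, with French as an example:

# Sketch: synthesize French speech with the Python API from the "Python API" section below.
# 'FR' is one of the language codes listed above; the output file name is arbitrary.
from yuna.api import TTS

model = TTS(language='FR', device='auto')
model.tts_to_file("Bonjour tout le monde.", 0, 'fr.wav', speed=1.0)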

CLI

You can interact with YunaTTS through its command-line interface, which can be invoked as either yunatts or yuna. Here are some examples:

Read English text:

yuna "Text to read" output.wav

Specify a language:

yuna "Text to read" output.wav --language EN

Specify a speaker:

yuna "Text to read" output.wav --language EN --speaker EN-US
yuna "Text to read" output.wav --language EN --speaker EN-AU

Specify a speed:

yuna "Text to read" output.wav --language EN --speaker EN-US --speed 1.5
yuna "Text to read" output.wav --speed 1.5

Load from a file:

yuna file.txt out.wav --file
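
If you need to convert many text files, one option is to drive the CLI from a small script. The snippet below is only a sketch (it is not part of YunaTTS); it assumes a directory of .txt files and uses the --file flag shown above:

# Sketch: batch-convert every .txt file in a directory by shelling out to the yuna CLI.
import pathlib
import subprocess

for txt in sorted(pathlib.Path("texts").glob("*.txt")):
    wav = txt.with_suffix(".wav")
    subprocess.run(["yuna", str(txt), str(wav), "--file"], check=True)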

Python API

from yuna.api import TTS

# Speed is adjustable
speed = 1.0

# CPU is sufficient for real-time inference.
# You can set it manually to 'cpu' or 'cuda' or 'cuda:0' or 'mps'
device = 'auto' # Will automatically use GPU if available
language = 'EN'

# English
text = "Did you ever hear a folk tale about a giant turtle?"
model = TTS(language=language, device=device)
output_path = 'en.wav'
# Synthesize the text and write it to output_path; the second argument selects the speaker (index 0 here)
model.tts_to_file(text, 0, output_path, speed=speed)
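
Loading the model is the expensive step, so a single TTS instance can be reused for multiple utterances. Here is a small sketch building on the example above (the sentences and file names are arbitrary):

# Sketch: reuse one loaded model for several utterances instead of reloading it each time.
from yuna.api import TTS

model = TTS(language='EN', device='auto')
sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "Text-to-speech turns written words into audio.",
]
for i, sentence in enumerate(sentences):
    # Same call as above: text, speaker, output path, and speed.
    model.tts_to_file(sentence, 0, f'en_{i}.wav', speed=1.0)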

Training

Before training, install YunaTTS in development mode and change into the yuna folder:

pip install -e .
cd yuna

Data Preparation

To train a TTS model, prepare your audio files and a metadata file. Each line of the metadata file lists the audio path, speaker name, language code, and transcript, separated by | characters:

path/to/audio_001.wav|<speaker_name>|<language_code>|<text_001>
path/to/audio_002.wav|<speaker_name>|<language_code>|<text_002>
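
If your transcripts live alongside the audio, a small helper script can assemble this file. The snippet below is only a sketch and assumes one .txt transcript per .wav file, a single speaker, and English; adjust the paths, speaker name, and language code to your data:

# Sketch: build metadata.list from paired audio/transcript files.
# Assumes each data/example/wavs/xxx.wav has its transcript in data/example/wavs/xxx.txt.
import pathlib

speaker = "my_speaker"  # assumed speaker name
language = "EN"         # one of the supported language codes

lines = []
for wav in sorted(pathlib.Path("data/example/wavs").glob("*.wav")):
    text = wav.with_suffix(".txt").read_text(encoding="utf-8").strip()
    lines.append(f"{wav}|{speaker}|{language}|{text}")

pathlib.Path("data/example/metadata.list").write_text("\n".join(lines) + "\n", encoding="utf-8")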

We can then run the preprocessing code:

python preprocess_text.py --metadata data/example/metadata.list 

A config file data/example/config.json will be generated. Feel free to edit the hyper-parameters in that file (for example, you may decrease the batch size if you encounter CUDA out-of-memory errors).
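
If you prefer to adjust hyper-parameters programmatically rather than by hand, you can load and rewrite the JSON. The key names below (train / batch_size) are an assumption about the generated config layout, so check your own config.json first:

# Sketch: lower the batch size in the generated config.
# NOTE: the "train" and "batch_size" keys are assumed; verify them in your config.json.
import json

path = "data/example/config.json"
with open(path) as f:
    config = json.load(f)

config["train"]["batch_size"] = 4  # reduce this if you hit CUDA out-of-memory errors

with open(path, "w") as f:
    json.dump(config, f, indent=2)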

Training

The training can be launched by:

bash train.sh <path/to/config.json> <num_of_gpus>

Inference

Simply run:

python infer.py --text "<some text here>" -m /path/to/checkpoint/G_<iter>.pth -o <output_dir>
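
To generate samples for several prompts with the same checkpoint, one option is to loop over this command from a small script. This is only a sketch; the checkpoint path and output directory are placeholders you should replace with your own:

# Sketch: run infer.py for several prompts with one checkpoint.
import subprocess

checkpoint = "/path/to/checkpoint/G_<iter>.pth"  # replace with your checkpoint
output_dir = "outputs"                           # hypothetical output directory

for prompt in ["First test sentence.", "Second test sentence."]:
    subprocess.run(
        ["python", "infer.py", "--text", prompt, "-m", checkpoint, "-o", output_dir],
        check=True,
    )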
