https://aidanpine.ca
Stars
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
A simple, hackable text-to-speech system in PyTorch and MLX
Unified automatic quality assessment for speech, music, and sound.
An implementation of XLS-R automatic speech recognition as a recognizer for ELAN
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these factors with real speech and noise datasets.
Inference and training library for high-quality TTS models.
Predicts the level of noise and reverberation in your audio files
VS Code extension that allows you to preview and play audio files.
The EveryVoice TTS Toolkit - Text To Speech for your language
Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
A Python package for grapheme-aware string handling
Simple, safe way to store and distribute tensors
This repository implements SummaryMixing, a simpler, faster and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is read…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Domain-specific programming language for linguistic grammars and transducers
Open-source knowledge for Syllabics font design and development
Official implementation of "Separate Anything You Describe"