Starred repositories
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with, or surpassing, top TTS …
The open source code for SimpleSpeech series
[ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, H…
[Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Repository for the paper "Combining audio control and style transfer using latent diffusion", accepted at ISMIR 2024
Companion code for ISMIR 2017 paper "Deep Salience Representations for $F_0$ Estimation in Polyphonic Music"
YuE: Open Full-song Music Generation Foundation Model, an open alternative to Suno.ai
🗣️🇧🇷 Transcribed audio datasets in Brazilian Portuguese
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Awesome music generation model: MG²
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
A Non-Autoregressive Transformer-based Text-to-Speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, …
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
A HuggingFace compatible Small Language Model trainer.
Speech To Speech: an effort toward an open-source, modular GPT-4o
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models