
JossCamp/DiffPitcher-Colab

 
 


A simple adaptation of Diff Pitcher to run on Google Colab

Diff-Pitcher (PyTorch)

Official PyTorch implementation of Diff-Pitcher: Diffusion-based Singing Voice Pitch Correction

🤗huggingface link: https://huggingface.co/Higobeatz/Diff-Pitcher


Thank you all for your interest in this research project. I am currently optimizing the model's performance and computation efficiency. I plan to release a user-friendly version, either a GUI or a VST, in the first half of this year, and will update the open-source license.

If you are familiar with PyTorch, you can follow the code examples under Instructions to use Diff-Pitcher.


Diff-Pitcher

Demo

🎵 Listen to examples

Todo

  • Update code and demo
  • Support 🤗 Diffusers
  • Upload checkpoints
  • Pipeline tutorial
  • Merge to Your-Stable-Audio
  • Audio Plugin Support

Instructions

  • Run: Clone Repo and install dependencies

  • Run: Download Models

  • Feel free to try:

    • template-based automatic pitch correction: template_based_apc.py
      Use a separate reference voice as a template to tune the input voice.
      --i: input voice path
      --r: reference voice path
      --o: output voice path
      !python template_based_apc.py --i "/content/DiffPitcher/voice.wav" --r "/content/DiffPitcher/guide.wav" --o output_template.wav
      
      
    • score-based automatic pitch correction: score_based_apc.py
      Use a MIDI file as a reference to tune voices.
      --i: input voice path
      --r: reference MIDI path
      --o: output voice path
      !python score_based_apc.py --i "/content/DiffPitcher/voice.wav" --r "/content/DiffPitcher/guide.mid" --o output_score.wav
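
The two commands above differ only in the script name and the type of reference file, so outside a notebook cell they can be driven from one small Python helper. The following is a minimal sketch, not part of the repository: the helper names `build_apc_command` and `run_apc`, the extension-based choice of script, and the default `repo_dir` are assumptions made for illustration. Only the script names and the `--i`, `--r`, `--o` flags come from the instructions above.

```python
import subprocess
import sys
from pathlib import Path

# Script names and flags are taken from the instructions above.
# Choosing the script from the reference file's extension is an
# illustrative assumption, not something the repository does itself.
SCRIPT_BY_REFERENCE_SUFFIX = {
    ".wav": "template_based_apc.py",  # reference voice (template-based)
    ".mid": "score_based_apc.py",     # reference MIDI (score-based)
}


def build_apc_command(input_path, reference_path, output_path, python=sys.executable):
    """Return the argument list for the matching pitch-correction script."""
    suffix = Path(reference_path).suffix.lower()
    if suffix not in SCRIPT_BY_REFERENCE_SUFFIX:
        raise ValueError(f"unsupported reference type: {suffix!r}")
    return [
        python,
        SCRIPT_BY_REFERENCE_SUFFIX[suffix],
        "--i", str(input_path),
        "--r", str(reference_path),
        "--o", str(output_path),
    ]


def run_apc(input_path, reference_path, output_path, repo_dir="/content/DiffPitcher"):
    """Run the script from the cloned repo directory (hypothetical default path)."""
    cmd = build_apc_command(input_path, reference_path, output_path)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

For example, `run_apc("voice.wav", "guide.mid", "output_score.wav")` would invoke `score_based_apc.py`, while passing `guide.wav` as the reference would invoke `template_based_apc.py`.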
      

References

If you find the code useful for your research, please consider citing:

@inproceedings{hai2023diff,
  title={Diff-Pitcher: Diffusion-Based Singing Voice Pitch Correction},
  author={Hai, Jiarui and Elhilali, Mounya},
  booktitle={2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  pages={1--5},
  year={2023},
  organization={IEEE}
}

This repo is inspired by:

@article{popov2021diffusion,
  title={Diffusion-based voice conversion with fast maximum likelihood sampling scheme},
  author={Popov, Vadim and Vovk, Ivan and Gogoryan, Vladimir and Sadekova, Tasnima and Kudinov, Mikhail and Wei, Jiansheng},
  journal={arXiv preprint arXiv:2109.13821},
  year={2021}
}
@inproceedings{liu2022diffsinger,
  title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
  author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Zhao, Zhou},
  booktitle={Proceedings of the AAAI conference on artificial intelligence},
  volume={36},
  number={10},
  pages={11020--11028},
  year={2022}
}

Acknowledgement

LCAP (jhu.edu)

We borrow code from the following repos:

  • Diffusion Schedulers are based on 🤗 Diffusers
  • 2D UNet is based on DiffVC
