GAN for music genre transfer based on M. Pasini's work "MelGAN-VC", created in the context of a bachelor thesis.
This work was created for a bachelor thesis about music genre transfer. It deals with a GAN working on mel-spectrograms and is based on the following work: https://github.com/marcoppasini/MelGAN-VC (paper by M. Pasini: https://arxiv.org/abs/1910.03713).
The main dependencies are:
- tensorflow: main library
- torch: for some audio/spectrogram-related operations that did not work properly with tensorflow
- librosa: for conversion to spectrogram data
- soundfile: for reading and writing sound files
A requirements.txt is provided in the root directory of the project.
The tools/ directory and testing.py contain all of the logic needed to set up, train and test the network.
Some code has not been refactored for general use: there are still hardcoded parts such as paths, so adjustments have to be made for custom use. I did not find the time to refactor.
- MAINCONFIG.py holds global constants used in all scripts.
- convert_dataset.py is an example script showing how a dataset is prepared for training.
- validate_ds.py is an example script that loads two datasets, dstrain and dsval, and extracts a sample from the source and target space in order to write them to wave files. This was used to test proper construction and integrity of the datasets.
- training.py includes the functions used for training and shows the training process (in the __main__ section at the bottom of the file). It shows how a folder of wave files is turned into a dataset ready for training.
- testing_network.py is an example script that loads a trained network and feeds it a spectrogram for conversion. The function use_generator passes the data through the loaded network and saves the result as a wave file.
- encode_spec.py and decode_spec.py are scripts that take either a path to a wave file and a sample rate (22050 Hz recommended) or a path to a spectrogram saved as a .npy file, and convert it to the other format.
- plotloss.py gives information about the loss development, extracted from the CSV file that losses are logged into. Some lines might be commented out.
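As an illustration of the kind of processing plotloss.py performs, here is a minimal sketch that reads a loss log and smooths each column with a moving average. It is not the code from this repository: the function name summarize_losses, the column names gen_loss and disc_loss, and the one-row-per-epoch layout are all assumptions made for the example, so the column names have to be adapted to the actual CSV file.

```python
import csv
import io
from statistics import mean


def summarize_losses(csv_file, window=3):
    """Return {column name: moving averages} for every column of a loss log.

    `csv_file` is any open text file whose first row holds the column names
    and whose remaining rows hold one numeric value per column.
    """
    reader = csv.DictReader(csv_file)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            columns[name].append(float(row[name]))

    smoothed = {}
    for name, values in columns.items():
        # Average over the last `window` values (fewer at the start of the log).
        smoothed[name] = [
            mean(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))
        ]
    return smoothed


# Hypothetical log layout: one row per epoch, one column per loss term.
EXAMPLE_LOG = """gen_loss,disc_loss
4.0,1.0
2.0,1.0
3.0,4.0
1.0,1.0
"""

if __name__ == "__main__":
    for name, values in summarize_losses(io.StringIO(EXAMPLE_LOG), window=2).items():
        print(name, values)
```

For a real log file, pass an opened file instead of the in-memory example, for instance `summarize_losses(open(path, newline=""))`, where path points to the CSV file.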
The main part of the code is inside the tools/ directory:
- architecture_v2.py contains all architecture-related functions
- dataset_processing.py contains all code for converting between spectrogram and wave data
- losses.py contains the loss functions and related code
- specnorm.py contains code for spectral normalization
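To give an idea of what spectral normalization does, the following sketch estimates the largest singular value of a weight matrix by power iteration and divides the weights by it. It is a plain-Python illustration of the general technique, not the code in specnorm.py: the function names are made up for this example, and an implementation for training typically works on TensorFlow layer weights and reuses the vector u between training steps instead of iterating many times.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def l2_normalize(vector, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / (norm + eps) for x in vector]


def spectral_norm(matrix, iterations=50, seed=0):
    """Estimate the largest singular value of `matrix` by power iteration."""
    rng = random.Random(seed)
    w_t = transpose(matrix)
    # Random start vector with one entry per row of the matrix.
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in matrix])
    v = l2_normalize(matvec(w_t, u))
    for _ in range(iterations):
        u = l2_normalize(matvec(matrix, v))  # u = W v / ||W v||
        v = l2_normalize(matvec(w_t, u))     # v = W^T u / ||W^T u||
    # sigma = u^T W v
    return sum(a * b for a, b in zip(u, matvec(matrix, v)))


def spectrally_normalize(matrix, iterations=50):
    """Scale `matrix` so that its largest singular value is about 1."""
    sigma = spectral_norm(matrix, iterations)
    return [[w / sigma for w in row] for row in matrix]


if __name__ == "__main__":
    weights = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(weights))
    print(spectrally_normalize(weights))
```

Constraining the largest singular value in this way limits how strongly a layer can amplify its input, which is why the technique is commonly applied to GAN discriminators.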