Many-to-Many Voice Conversion based on Variational Autoencoder

Code repository for the paper: link

Manh Luong and Viet Anh Tran, under review at INTERSPEECH 2021.

Dataset:

We use the VCTK corpus to train and evaluate our proposed model. The VCTK dataset can be found at this link.

Pretrained model:

The pretrained model can be downloaded at this link. The WaveNet vocoder checkpoint is available at this link.

Requirements:

  • Python 3.6 or newer
  • PyTorch 1.4 or newer
  • librosa
  • tensorboardX
  • wavenet_vocoder (pip install wavenet_vocoder)
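
The dependencies can be installed in one shot; a minimal sketch, assuming a Python 3.6+ environment is already active (pick the torch build that matches your CUDA setup):

```bash
# Install the Python packages listed above.
pip install torch librosa tensorboardX wavenet_vocoder
```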

Prepare data for training

  1. Download and uncompress the VCTK dataset.
  2. Move the extracted dataset into [home directory].
  3. Run: export HOME=[home directory]
  4. Run: bash preprocessing.sh (the full sequence is sketched after this list).
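
Put together, the preparation might look like the following; the archive name and paths are assumptions, and note that exporting HOME overrides your shell's home directory for the session:

```bash
# Assumed archive name; adjust to match the file you downloaded.
unzip VCTK-Corpus.zip -d /path/to/home_dir   # steps 1-2: uncompress into [home directory]
export HOME=/path/to/home_dir                # step 3: overrides $HOME for this session
bash preprocessing.sh                        # step 4: run the repository's preprocessing script
```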

Usage

To train the model, run: bash training.sh
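
Since tensorboardX is among the requirements, training can presumably be monitored with TensorBoard; a sketch, assuming the scripts write to tensorboardX's default runs/ directory (an assumption about this repository):

```bash
bash training.sh                 # start training
tensorboard --logdir runs        # assumed log directory; adjust if the scripts log elsewhere
```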

To convert voice from a source speaker to a target speaker using the pretrained model, run the following commands (the whole sequence is sketched after this list):

  1. cd [Disentangled-VAE directory]
  2. mkdir ./results/checkpoints
  3. cp [your downloaded checkpoint] ./results/checkpoints/
  4. Download the pretrained WaveNet vocoder model.
  5. cp [downloaded Wavenet_Vocoder]/checkpoint_step001000000_ema.pth [Disentangled-VAE directory]
  6. Edit the two variables src_spk and trg_spk in conversion.sh to your source and target speakers, respectively.
  7. Run: bash conversion.sh
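
As a concrete sketch of steps 1-7 (the directory and checkpoint file names are assumptions; p225 and p226 are example VCTK speaker IDs):

```bash
cd /path/to/Disentangled-VAE                          # step 1: repository root
mkdir -p ./results/checkpoints                        # step 2
cp ~/Downloads/model.ckpt ./results/checkpoints/      # step 3: assumed checkpoint filename
# steps 4-5: place the downloaded WaveNet vocoder checkpoint at the repository root
cp ~/Downloads/checkpoint_step001000000_ema.pth .
# step 6: edit conversion.sh, e.g. src_spk=p225 and trg_spk=p226
bash conversion.sh                                    # step 7
```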
