Train YourTTS on another language #12

Closed
annaklyueva opened this issue Mar 27, 2022 · 4 comments

Comments

@annaklyueva

Good day!

I have several questions, could you please help?

Do I understand correctly that, to train the model on another language, it is better to fine-tune this model (YourTTS-EN(VCTK+LibriTTS)-PT-FR SCL): https://drive.google.com/drive/folders/15G-QS5tYQPkqiXfAdialJjmuqZV0azQV
Or is it better to use other checkpoints?

How many hours of audio are needed to reach acceptable quality?

I plan to use the Common Voice corpus to fine-tune the model on a new language. However, its audio is in mp3 format, not wav. Do I need to convert all the audio files, or can I use mp3 directly? If conversion is needed, how should I do it?

Thank you for your time in advance!

@Edresson
Owner

Hi,

Yes, the best option is to fine-tune the model you mentioned.

> How many hours of audio are needed to reach acceptable quality?

We didn't analyze the number of hours needed to learn new languages in the YourTTS article.

> I plan to use the Common Voice corpus to fine-tune the model on a new language. However, its audio is in mp3 format, not wav. Do I need to convert all the audio files, or can I use mp3 directly? If conversion is needed, how should I do it?

Yes, you need to convert the files to wav and resample them to the right sampling rate (the released model was trained at 16000 Hz). If you like, you can use the resample script available in the Coqui TTS repository.
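
For readers who want to see what the conversion step involves, here is a minimal sketch. It is not the Coqui TTS resample script; it assumes the `ffmpeg` command-line tool is installed and on the PATH. The function names `build_ffmpeg_cmd` and `convert_dir` and the directory layout are hypothetical, and downmixing to mono is an assumption that is not stated in this thread.

```python
import subprocess
from pathlib import Path
from typing import List

TARGET_SR = 16000  # sampling rate of the released model, as stated in the reply above


def build_ffmpeg_cmd(src: Path, dst: Path, sample_rate: int = TARGET_SR) -> List[str]:
    """Build the ffmpeg command that converts `src` to a WAV file at `sample_rate`."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", str(src),           # input file (e.g. a Common Voice mp3 clip)
        "-ar", str(sample_rate),  # output sampling rate in Hz
        "-ac", "1",               # assumption: downmix to a single (mono) channel
        str(dst),                 # the .wav extension selects the WAV container
    ]


def convert_dir(in_dir: Path, out_dir: Path) -> List[Path]:
    """Convert every .mp3 file in `in_dir` to a 16 kHz .wav file in `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.glob("*.mp3")):
        dst = out_dir / (src.stem + ".wav")
        # check=True raises CalledProcessError if ffmpeg fails on a file
        subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
        written.append(dst)
    return written
```

Calling `convert_dir(Path("clips"), Path("wavs"))` would process a folder of clips; both paths are placeholders. Remember to point the dataset metadata at the new .wav file names afterwards.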

Edresson reopened this Mar 28, 2022

@annaklyueva
Author

annaklyueva commented Mar 29, 2022

@Edresson Thank you very much for your help!

As far as I understand, I also need to add the characters of my language to the "characters" section of config.json, am I right? Do I need to add anything to the "phonemes" part of the .json? And do I need to change "phoneme_language" to the language I am using?

Is it also advisable to change the number of epochs or the batch size?

Could you give me any other tips on how to fine-tune the model on a new language?

@Edresson
Owner

You are welcome!

Use the latest version of Coqui TTS. Use this script to find all phonemes in your dataset and this one to find the unique characters. After updating the vocabulary and adding the new datasets, follow the training steps described here.
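
To illustrate what "finding the unique characters" means, here is a small standalone sketch. It is not the Coqui TTS script linked above; the function name `unique_chars` is hypothetical, and it assumes the transcripts have already been loaded as a list of strings.

```python
from typing import Iterable, List


def unique_chars(texts: Iterable[str]) -> List[str]:
    """Collect every distinct character that occurs in the given transcripts."""
    chars = set()
    for text in texts:
        chars.update(text)  # adds each character of the string to the set
    return sorted(chars)


if __name__ == "__main__":
    # Placeholder transcripts; in practice these would come from the dataset metadata.
    sample = ["Добрый день!", "Как дела?"]
    print("".join(unique_chars(sample)))
```

Any character printed here that is missing from the vocabulary in config.json is a candidate to add before fine-tuning.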

@annaklyueva
Author

Ok! Thank you!
