synthetic audio for nts data created by gTTS library #30

Closed
emilhovad wants to merge 6 commits from synthetic_data

emilhovad

This creates synthetic data with gTTS, an open-source, MIT-licensed library that interfaces with Google Translate's text-to-speech API.
It generates audio for the nts dataset, names each audio file after the corresponding file name in the dataset, and saves it in a train folder.
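
A minimal sketch of the step described above, assuming the dataset provides (file name, text) pairs. The helper names `output_path` and `synthesise`, the `.mp3` extension, and the `lang="da"` default are illustrative assumptions and are not taken from the PR; `gTTS(text=..., lang=...)` and `.save(...)` are the library's own calls.

```python
from pathlib import Path


def output_path(source_name: str, out_dir: Path) -> Path:
    """Map a dataset file name to the path of its synthetic counterpart.

    Hypothetical helper: keeps the stem of the original file name and
    swaps the extension for .mp3, the format gTTS writes.
    """
    return out_dir / f"{Path(source_name).stem}.mp3"


def synthesise(text: str, source_name: str, out_dir: Path, lang: str = "da") -> Path:
    """Generate speech for `text` with gTTS and save it under `out_dir`.

    Needs the third-party `gtts` package and network access, because gTTS
    calls Google Translate's text-to-speech endpoint.
    """
    # Imported lazily so the path helper can be used without gtts installed.
    from gtts import gTTS

    out_dir.mkdir(parents=True, exist_ok=True)
    target = output_path(source_name, out_dir)
    gTTS(text=text, lang=lang).save(str(target))
    return target
```

With these assumptions, `synthesise("Hej verden", "clip_001.wav", Path("train"))` would write `train/clip_001.mp3`.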

The script can also create data with several other voices, based on functions in the code. The licenses of these packages should be investigated first, e.g. eSpeak (a terminal program that needs to be installed) or the built-in macOS voices, including the Siri voices (a terminal program that requires a Mac).

Issues: the folder structure is probably wrong, and some packages, such as click, are not used in this code.
Should this code support the code that has already been written to create the nts dataset, which has not been added yet? The Hugging Face format is not implemented here.

Collaborator

@saattrupdan saattrupdan left a comment

Overall looks great! My comments are mostly some stylistic changes, as well as potentially converting some of the hardcoded values into arguments.

When this PR has been merged, we also need a PR which adds the following:

  • Saving the dataset as a Hugging Face Dataset.
  • Including a dataset config file which uses the dataset stored on Hugging Face Hub in the training script.

src/scripts/build_synthetic_nts.py
"""
subprocess.run(["say", text, "-o", filename])

def generate_speech_eSpeak(text, filename, variant="+m1"):

Missing type hints
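
One way the flagged signature could be annotated, as a sketch rather than the PR's actual fix. The `voice="da"` parameter and the separate `build_espeak_command` helper are assumptions added for illustration; `-v` (voice) and `-w` (write output to a WAV file) are standard espeak options.

```python
import subprocess


def build_espeak_command(
    text: str, filename: str, variant: str = "+m1", voice: str = "da"
) -> list[str]:
    """Build the espeak argument list without running it.

    `voice` is a hypothetical parameter; espeak accepts a voice name
    followed by a variant suffix, e.g. "da+m1".
    """
    return ["espeak", "-v", f"{voice}{variant}", "-w", filename, text]


def generate_speech_eSpeak(text: str, filename: str, variant: str = "+m1") -> None:
    """Synthesise `text` to a WAV file at `filename` using the espeak CLI."""
    # check=True raises CalledProcessError if espeak exits with an error.
    subprocess.run(build_espeak_command(text, filename, variant), check=True)
```

Keeping the command construction separate from the `subprocess.run` call makes the function testable on machines where espeak is not installed.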

@saattrupdan
Collaborator

I see that there are also linting and unit test failures that need to be fixed here. Are you using the pre-commit hooks?

@saattrupdan saattrupdan closed this Oct 4, 2023
@saattrupdan saattrupdan deleted the synthetic_data branch July 1, 2024 13:57
2 participants