2.3. Augmenting datasets
This part of Allie's skills relates to data augmentation.
Data augmentation expands the training dataset to improve a machine learning model's performance and its ability to generalize. For example, you may want to shift, flip, zoom, or change the brightness of images so that models perform better in the noisy environments typical of real-world use. Data augmentation is especially useful when you have little data, as it can greatly expand the amount of training data available for machine learning.
A typical augmentation scheme is to augment 50% of the data and leave the rest unchanged. You can read more about data augmentation here.
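The 50% scheme can be sketched in a few lines of Python. This is only an illustration of the idea, not how Allie's augment.py selects files; the function name `split_for_augmentation` and its parameters are hypothetical.

```python
import random
from pathlib import Path


def split_for_augmentation(folder, extension=".wav", fraction=0.5, seed=0):
    """Return (to_augment, untouched) lists of files found in `folder`.

    Hypothetical helper: it only picks which files would be augmented,
    it does not modify or create any files.
    """
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == extension
    )
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(files)
    cutoff = int(len(files) * fraction)
    return files[:cutoff], files[cutoff:]
```

With `fraction=0.5`, half of the matching files land in the first list and the rest are left as they are.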
To augment an entire folder of a certain file type (e.g. audio files of .WAV format), you can run:
cd /Users/jim/desktop/allie
cd augmentation/audio_augmentation
python3 augment.py /Users/jim/desktop/allie/train_dir/males/
python3 augment.py /Users/jim/desktop/allie/train_dir/females/
The code above augments all the audio files in each folder path using the default audio augmenter specified in the settings.json file (e.g. 'augment_tsaug'). In this case, it augments both the males and females folders, which are full of .WAV files.
You should now have 2x the data in each folder. Here are a sample audio file and its augmented counterpart (from the females folder), for reference:
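One way to check the 2x claim is to count the files before and after running augment.py. The sketch below assumes the augmented files are written into the same folder with the same extension; `count_files` and `growth_factor` are hypothetical helpers, not part of Allie.

```python
from pathlib import Path


def count_files(folder, extension=".wav"):
    """Count files in `folder` whose suffix matches `extension` (case-insensitive)."""
    return sum(
        1 for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == extension.lower()
    )


def growth_factor(before, after):
    """Ratio of the file count after augmentation to the count before."""
    if before == 0:
        raise ValueError("folder was empty before augmentation")
    return after / before
```

Call `count_files` on a folder, run augment.py on it, call `count_files` again, and pass both counts to `growth_factor`; a result close to 2.0 matches the behavior described above.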
- Non-augmented file sample (female speaker)
- Augmented file sample (female speaker)
Click the .GIF below to follow along with this example in video format:
Note that you can extend this to any of the augmentation types. The table below shows how to call the augmenter for each data type. For the commands to work properly, you must be in the matching folder (e.g. ./allie/augmentation/audio_augmentation for audio files, ./allie/augmentation/image_augmentation for image files, etc.).
Data type | Supported formats | Call to augment a folder | Current directory must be |
---|---|---|---|
audio files | .MP3 / .WAV | python3 augment.py [folderpath] | ./allie/augmentation/audio_augmentation |
text files | .TXT | python3 augment.py [folderpath] | ./allie/augmentation/text_augmentation |
image files | .PNG | python3 augment.py [folderpath] | ./allie/augmentation/image_augmentation |
video files | .MP4 | python3 augment.py [folderpath] | ./allie/augmentation/video_augmentation |
csv files | .CSV | python3 augment.py [folderpath] | ./allie/augmentation/csv_augmentation |
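Because the working directory matters, it can help to wrap the call in a small helper that sets it for you. The sketch below assumes only what the table states (the command and the required directory); `build_augment_call` and `augment_folder` are hypothetical names, and `allie_root` is wherever you cloned Allie.

```python
import subprocess
from pathlib import Path

# Sub-folder of allie/augmentation for each data type, taken from the table.
AUGMENTATION_DIRS = {
    "audio": "audio_augmentation",
    "text": "text_augmentation",
    "image": "image_augmentation",
    "video": "video_augmentation",
    "csv": "csv_augmentation",
}


def build_augment_call(allie_root, data_type, folderpath):
    """Return (command, working_directory) for augmenting one folder."""
    if data_type not in AUGMENTATION_DIRS:
        raise ValueError(f"unknown data type: {data_type!r}")
    cwd = Path(allie_root) / "augmentation" / AUGMENTATION_DIRS[data_type]
    return ["python3", "augment.py", str(folderpath)], cwd


def augment_folder(allie_root, data_type, folderpath):
    """Run augment.py from the directory it expects; raises if it exits non-zero."""
    command, cwd = build_augment_call(allie_root, data_type, folderpath)
    subprocess.run(command, cwd=cwd, check=True)
```

For example, `augment_folder("/Users/jim/desktop/allie", "audio", "/Users/jim/desktop/allie/train_dir/males/")` would mirror the first command of the earlier walkthrough.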
- augment_tsaug - adds noise and various shifts to audio files, adds 2x more data; see tutorial here.
- augment_addnoise - adds noise to an audio file.
- augment_noise - removes noise from audio files randomly.
- augment_pitch - shifts pitch up and down to correct for gender differences.
- augment_randomsplice - randomly splices an audio file to generate more data.
- augment_silence - adds silence to an audio file to augment a dataset.
- augment_time - changes the duration of audio files by creating new files.
- augment_volume - changes volume randomly (helps to mitigate the effects of microphone distance on a model).
- augment_textacy - uses textacy to augment text files.
- augment_imgaug - uses imgaug to augment image files (random transformations).
- augment_vidaug - uses vidaug to augment video files (random transformations).
- augment_ctgan_classification - generative adversarial examples for classification problems / class targets.
- augment_ctgan_regression - generative adversarial examples for regression problems / targets.
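To give a feel for what some of the audio augmenters in this list do, here is a minimal conceptual sketch on a plain list of samples in the range [-1.0, 1.0]. It is not the code of augment_addnoise, augment_volume, or augment_silence; the function names and parameters are assumptions made for illustration only.

```python
import random


def add_noise(samples, noise_level=0.01, seed=0):
    """Add uniform random noise in [-noise_level, noise_level] to each sample."""
    rng = random.Random(seed)
    return [s + rng.uniform(-noise_level, noise_level) for s in samples]


def change_volume(samples, gain):
    """Scale amplitude by `gain` and clip the result to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def add_silence(samples, n_silent):
    """Append `n_silent` zero-valued samples to the end of the signal."""
    return list(samples) + [0.0] * n_silent
```

Each function returns a new list, so the original signal stays available alongside its augmented copies, which is what lets augmentation grow a dataset rather than replace it.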
Here are some settings that can be customized for Allie's augmentation API. Settings can be modified in the settings.json file.
setting | description | default setting | all options |
---|---|---|---|
augment_data | whether or not to implement data augmentation policies during the model training process via default augmentation scripts. | True | True, False |
default_audio_augmenters | the default augmentation strategies used during audio modeling if augment_data == True | ["augment_tsaug"] | ["augment_tsaug", "augment_addnoise", "augment_noise", "augment_pitch", "augment_randomsplice", "augment_silence", "augment_time", "augment_volume"] |
default_csv_augmenters | the default augmentation strategies used to augment .CSV file types as part of model training if augment_data==True | ["augment_ctgan_regression"] | ["augment_ctgan_classification", "augment_ctgan_regression"] |
default_image_augmenters | the default augmentation techniques used for images if augment_data == True as a part of model training. | ["augment_imgaug"] | ["augment_imgaug"] |
default_text_augmenters | the default augmentation strategies used during model training for text data if augment_data == True | ["augment_textacy"] | ["augment_textacy", "augment_summary"] |
default_video_augmenters | the default augmentation strategies used for videos during model training if augment_data == True | ["augment_vidaug"] | ["augment_vidaug"] |
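The settings can be edited by hand, or programmatically as in the sketch below. It assumes settings.json is a flat JSON object keyed by the setting names in the table, which the table suggests but does not confirm; `set_audio_augmenters` and `update_settings_file` are hypothetical helpers.

```python
import json

# Options listed in the table above for default_audio_augmenters.
AUDIO_AUGMENTERS = {
    "augment_tsaug", "augment_addnoise", "augment_noise", "augment_pitch",
    "augment_randomsplice", "augment_silence", "augment_time", "augment_volume",
}


def set_audio_augmenters(settings, augmenters):
    """Return a copy of `settings` with augmentation enabled for `augmenters`."""
    unknown = [a for a in augmenters if a not in AUDIO_AUGMENTERS]
    if unknown:
        raise ValueError(f"unsupported audio augmenters: {unknown}")
    updated = dict(settings)
    updated["augment_data"] = True
    updated["default_audio_augmenters"] = list(augmenters)
    return updated


def update_settings_file(path, augmenters):
    """Rewrite the JSON file at `path` with the new audio augmenter list."""
    with open(path) as f:
        settings = json.load(f)
    with open(path, "w") as f:
        json.dump(set_audio_augmenters(settings, augmenters), f, indent=4)
```

Validating against the listed options before writing avoids saving a misspelled augmenter name that would only surface later during training.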