Whisper as an automatic corpus annotation tool in Lhotse #209
pzelasko started this conversation in Show and tell
In the latest release of Lhotse, we added an option to use Whisper to segment and transcribe unlabeled recordings and save the results as a Lhotse `CutSet` manifest. We also support forced alignment with torchaudio's pretrained Wav2Vec2 ASR models to get word-level timestamps.

Benefits for Whisper users: you get access to all of Lhotse's capabilities for data preparation (mixing/truncating multiple examples into one, merging multiple datasets), augmentation (noise mixing, speed perturbation, reverberation, SpecAugment, etc.), and data sampling and dataloading for PyTorch model training.
Benefits for Lhotse users: a familiar interface for using Whisper to annotate your data.
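
To make the workflow more concrete, here is a minimal sketch of how annotating a directory of recordings might look from Python. It assumes that `annotate_with_whisper` and `align_with_torchaudio` can be imported from `lhotse.workflows` and accept the arguments shown; the exact import path and signatures may differ between Lhotse versions, so check the documentation for your release. The helper name `annotate_recordings`, its parameters, and the default values are hypothetical and exist only for illustration.

```python
from pathlib import Path


def annotate_recordings(
    recordings_dir,
    output_path,
    pattern="*.wav",
    model_name="base",
    device="cpu",
    align_words=False,
):
    """Transcribe unlabeled audio with Whisper and save a CutSet manifest.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Imported lazily, so defining this helper does not require Lhotse,
    # Whisper, or torchaudio to be installed.
    from lhotse import CutSet, RecordingSet
    from lhotse.workflows import align_with_torchaudio, annotate_with_whisper

    # Build a recording manifest from every audio file matching the pattern.
    recordings = RecordingSet.from_dir(recordings_dir, pattern=pattern)

    # Whisper segments and transcribes each recording; cuts are yielded lazily.
    cuts = annotate_with_whisper(recordings, model_name=model_name, device=device)

    if align_words:
        # Optional forced alignment for word-level timestamps.
        # The cuts are materialized first so the aligner receives a CutSet.
        cuts = align_with_torchaudio(CutSet.from_cuts(list(cuts)), device=device)

    # Write cuts one at a time so partial results are kept on interruption.
    with CutSet.open_writer(output_path) as writer:
        for cut in cuts:
            writer.write(cut)
    return Path(output_path)
```

A call such as `annotate_recordings("/path/to/recordings", "cuts.jsonl.gz", align_words=True)` would then produce a manifest that can be loaded with `CutSet.from_file` and passed to the usual Lhotse augmentation and sampling tools. The paths here are placeholders.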
Want to learn more about using Lhotse? See our tutorials here.