Whisper as an automatic corpus annotation tool in Lhotse #209

pzelasko · 2022-09-30T19:17:45Z

In the latest release of Lhotse, we added an option to use Whisper to segment and transcribe unlabeled recordings and save the results as a Lhotse CutSet manifest. We also support forced alignment with torchaudio's pretrained Wav2Vec2 ASR to get word-level timestamps.
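For readers who want to try this from Python, here is a minimal sketch of the workflow described above. It assumes that `annotate_with_whisper` and `align_with_torchaudio` are importable from `lhotse.workflows`, that they accept the arguments shown, and that `openai-whisper` is installed. The function name `annotate_recordings`, its parameters, and the file paths are hypothetical names chosen for illustration, so check the Lhotse documentation for the exact API in your version.

```python
from pathlib import Path
from typing import Optional


def annotate_recordings(
    recordings_dir: str,
    output_manifest: str,
    pattern: str = "*.wav",
    model_name: str = "base",
    language: Optional[str] = None,
    device: str = "cpu",
    word_alignment: bool = False,
) -> int:
    """Transcribe recordings under `recordings_dir` and return the number of cuts written."""
    # Imported lazily so this module can be loaded without Lhotse/Whisper installed.
    from lhotse import CutSet, RecordingSet
    from lhotse.workflows import annotate_with_whisper  # assumed import path

    # Build a recording manifest from the audio files found on disk.
    recordings = RecordingSet.from_dir(Path(recordings_dir), pattern=pattern)

    # Whisper segments and transcribes each recording, yielding one cut at a time.
    cuts = annotate_with_whisper(
        recordings, language=language, model_name=model_name, device=device
    )

    if word_alignment:
        # Optional: word-level timestamps via torchaudio's pretrained Wav2Vec2 ASR.
        # Assumption: calling it with default arguments on a CutSet is sufficient.
        from lhotse.workflows import align_with_torchaudio  # assumed import path

        cuts = align_with_torchaudio(CutSet.from_cuts(cuts))

    # Write cuts incrementally to the output CutSet manifest.
    written = 0
    with CutSet.open_writer(output_manifest) as writer:
        for cut in cuts:
            writer.write(cut)
            written += 1
    return written
```

A call such as `annotate_recordings("audio/", "cuts.jsonl.gz", language="en")` would then produce a manifest that can be loaded back with `CutSet.from_file("cuts.jsonl.gz")`.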

Benefits for Whisper users: you gain access to all of Lhotse's capabilities for data preparation (mixing/truncating multiple examples into one, merging multiple datasets), augmentation (noise mixing, speed perturbation, reverberation, SpecAugment, etc.), and data sampling and loading for PyTorch model training.

Benefits for Lhotse users: a familiar interface for using Whisper to annotate your data.

Want to learn more about using Lhotse? See our tutorials here.
