Reproducing Delay-Penalized Transducer For Low-Latency Streaming ASR #710

Tomiinek · 2022-11-28T14:34:59Z

Hi guys, I am trying to reproduce the results from https://arxiv.org/pdf/2211.00490.pdf but I am not super successful.
Could you please provide or point me to some recipes that could help me?

CC: @pzelasko

yaozengwei · 2022-11-28T14:42:47Z

You could refer to:

"add delay_penalty in rnnt loss" (k2#976) in k2. It implements the delay penalty algorithm from the paper.
"Apply delay penalty on transducer" (#654) in icefall. It shows how to apply the delay penalty during training.

Tomiinek · 2022-11-28T14:56:17Z

Thanks for a prompt reply!

I noticed this PR, but what experimental setup do you suggest (egs/librispeech/ASR/pruned_transducer_stateless{1,2,3,4,5}/train.py, and similarly for LSTM)? Also, are the values in the paper, e.g. 0.0060, correct? I am not able to make the model converge with such high values; 10x lower values seem to do something, though.

yaozengwei · 2022-11-28T15:05:42Z

You could try pruned_transducer_stateless4 and lstm_transducer_stateless3.

About the convergence issue, we apply the delay penalty after training some batches (warmup >= 2.0). You could refer to https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless4/train.py#L643 and have a try.
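For readers who want to try the same gating in their own training loop, here is a minimal sketch of the idea: pass a zero penalty to the loss until the warmup value reaches 2.0, then switch to the configured value. The function name `effective_delay_penalty`, the `warmup_threshold` argument, and the sample warmup values are illustrative assumptions, not names taken from icefall; check the linked train.py for how the recipe actually does it.

```python
def effective_delay_penalty(
    warmup: float,
    delay_penalty: float,
    warmup_threshold: float = 2.0,
) -> float:
    """Return the delay penalty to use for the current batch.

    Hypothetical helper: the penalty is disabled (0.0) while the model
    is still warming up and enabled once warmup >= warmup_threshold.
    """
    return delay_penalty if warmup >= warmup_threshold else 0.0


if __name__ == "__main__":
    # Illustrative warmup values only; in a real run the warmup value
    # would come from the training script's own schedule.
    for warmup in (0.5, 1.0, 2.0, 3.0):
        penalty = effective_delay_penalty(warmup, delay_penalty=0.0060)
        print(f"warmup={warmup:.1f} -> delay_penalty={penalty}")
```

The returned value would then be forwarded to whatever loss call accepts a delay penalty, so the early batches train as a plain transducer and the penalty only kicks in once the model has stabilized.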

yaozengwei · 2022-11-28T15:11:57Z

In our experiments, we tried delay_penalty values of 0.0015, 0.0030, 0.0060, 0.0075, and 0.0100.

Tomiinek · 2022-11-28T15:16:39Z

Ok, thank you very much 🙂

Tomiinek closed this as completed Nov 28, 2022
