Empty or incomplete hypotheses when using non-trivial decoding graph (LG) in fast beam search (pruned_transducer_stateless2) #403

Closed
ahazned opened this issue Jun 7, 2022 · 2 comments

Comments

@ahazned
Contributor

ahazned commented Jun 7, 2022

Hi,

I trained a model using egs/librispeech/ASR/pruned_transducer_stateless2, and decoding works fine with all the default search strategies in pruned_transducer_stateless2/decode.py: greedy_search, modified_beam_search, and fast_beam_search (with the trivial decoding graph).

But when I use LG.pt instead of the hardcoded trivial graph in fast_beam_search, I often get empty or incomplete hypotheses (LG.pt is composed using local/compile_lg.py). Increasing the beam improves the results, as expected, but there are still too many empty or incomplete hypotheses. Maybe I need something like "allow partial results" in Kaldi's lattice generation.

Has anyone succeeded in using LG.pt with fast_beam_search, and do you have a recommendation for getting better results? I know this sounds a little vague, but I can share some files if that helps.

Thank you.

@pkufool
Collaborator

pkufool commented Jun 7, 2022

Did you decode with use-max=False? Also, if you are using the LibriSpeech dataset, LG decoding results are expected to be worse than with the trivial graph because of OOV words. See #277.
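
To get a rough sense of how much OOV words could hurt LG decoding, one could count how many reference words are missing from the word list that the lexicon is built from. The snippet below is only a minimal sketch in plain Python, not part of icefall: the helper names `load_vocab` and `oov_rate` are hypothetical, and it assumes the word list has one `<word> <id>` entry per line, as is typical for a words.txt symbol table. The toy data stands in for real transcripts.

```python
from typing import Iterable, Set, Tuple


def load_vocab(lines: Iterable[str]) -> Set[str]:
    """Collect the word column from lines shaped like '<word> <id>' (assumed format)."""
    vocab = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            vocab.add(fields[0])
    return vocab


def oov_rate(transcripts: Iterable[str], vocab: Set[str]) -> Tuple[int, int, float]:
    """Return (number of OOV tokens, total tokens, OOV fraction)."""
    total = 0
    oov = 0
    for text in transcripts:
        for word in text.split():
            total += 1
            if word not in vocab:
                oov += 1
    rate = oov / total if total else 0.0
    return oov, total, rate


# Toy stand-ins for a word list and reference transcripts (hypothetical data).
vocab = load_vocab(["<eps> 0", "HELLO 1", "WORLD 2"])
oov, total, rate = oov_rate(["HELLO WORLD", "HELLO ZYZZYVA"], vocab)
print(f"{oov}/{total} reference words are OOV ({rate:.1%})")
```

Any reference word that is not in the word list cannot appear on a path through LG, so utterances containing such words cannot be decoded exactly with the LG graph, whereas the trivial graph places no such constraint on the output.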

@ahazned
Contributor Author

ahazned commented Jun 7, 2022

Thank you very much. I didn't look at the usage of fast_beam_search in pruned_transducer_stateless/decode.py. That solved my problem.
