Add symbolic beam search #233

junrushao · 2018-07-27T00:53:57Z

Description

Symbolic beam search became possible once the control flow operators mx.sym.contrib.while_loop (apache/mxnet#11566) and mx.sym.contrib.cond (apache/mxnet#11760) were enabled. This PR adds a class, HybridBeamSearchSampler, which can be hybridized to perform beam search.
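
For readers unfamiliar with the loop operator, the sketch below is a plain-Python stand-in for the contract that a while_loop-style operator roughly follows: a condition on the loop variables, a step function that returns a per-step output plus updated loop variables, and an upper bound on the number of iterations. It is an illustration under those assumptions, not the MXNet operator and not code from this PR; the toy `cond` and `func` are made up.

```python
def while_loop(cond, func, loop_vars, max_iterations):
    """Plain-Python stand-in for a while_loop-style contract (not the MXNet op)."""
    outputs = []
    steps = 0
    # Stop when the condition fails or the iteration bound is reached.
    while steps < max_iterations and cond(*loop_vars):
        step_output, loop_vars = func(*loop_vars)
        outputs.append(step_output)
        steps += 1
    return outputs, loop_vars


# Toy usage: record a score, then double it, while the score is below 20.
outputs, final_vars = while_loop(
    cond=lambda step, score: score < 20,
    func=lambda step, score: (score, (step + 1, score * 2)),
    loop_vars=(0, 1),
    max_iterations=10,
)
```

In a beam search, the loop variables would presumably carry the current samples, scores and decoder states, while the bound plays the role of a maximum decoding length.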

Checklist

Essentials

Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

Changes

Add class model.beam_search.HybridBeamSearchSampler
Allow method model.beam_search._expand_to_beam_size to accept Symbols
Add _extract_and_flatten_nested_structure and _reconstruct_flattened_structure to flatten and unflatten structure used in decoders
I slightly modified the unit tests for BeamSearchSampler and HybridBeamSearchSampler to work around failing test cases caused by topk.
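
To illustrate the flatten/unflatten idea listed above, here is a minimal sketch of turning a nested structure of decoder states into a flat list and rebuilding it afterwards. The function names `flatten_nested` and `unflatten_nested` and the layout encoding are hypothetical; this is not the implementation of _extract_and_flatten_nested_structure or _reconstruct_flattened_structure from this PR.

```python
def flatten_nested(structure):
    """Return (flat_leaves, layout) for nested lists, tuples and dicts."""
    if isinstance(structure, (list, tuple)):
        flat, layouts = [], []
        for item in structure:
            leaves, layout = flatten_nested(item)
            flat.extend(leaves)
            layouts.append(layout)
        return flat, (type(structure), layouts)
    if isinstance(structure, dict):
        flat, layouts = [], {}
        for key in sorted(structure):  # fixed key order keeps the layout stable
            leaves, layout = flatten_nested(structure[key])
            flat.extend(leaves)
            layouts[key] = layout
        return flat, (dict, layouts)
    return [structure], None  # anything else is treated as a leaf


def unflatten_nested(flat, layout):
    """Rebuild the nested structure described by layout from flat leaves."""
    leaves = iter(flat)

    def build(node):
        if node is None:
            return next(leaves)
        kind, children = node
        if kind is dict:
            return {key: build(children[key]) for key in sorted(children)}
        return kind(build(child) for child in children)

    return build(layout)


# Toy decoder states; strings stand in for arrays or symbols.
states = [("h0", "c0"), {"attn": "a", "mem": ["m0", "m1"]}]
flat, layout = flatten_nested(states)
rebuilt = unflatten_nested(flat, layout)
```

The flat list is what a loop operator could carry between iterations, and the layout is enough to hand the decoder its states back in the original nesting.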

TODO

Add unittests for model.beam_search.HybridBeamSearchSampler
Rename vocab_num to vocab_size
Review the docstrings repeatedly to make sure nothing is wrong.

Comments

HybridBeamSearchSampler requires two extra arguments, batch_size and vocab_size, compared with BeamSearchSampler

mli · 2018-07-27T01:33:04Z

Job PR-233/7 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-233/7/index.html

leezu · 2018-07-27T01:42:43Z

Great! Can you make sure the HybridizedBeamSearchSampler is tested in test_beam_search.py too?

junrushao · 2018-07-27T18:00:20Z

@leezu Hey, thank you for the suggestions! Would you prefer that I create a new method test_hybrid_beam_search, or add test code to the existing test_beam_search?

Thanks!

leezu · 2018-07-27T18:50:28Z

Would it make sense to make BeamSearchSampler a gluon Block? In that case it may be possible to use the same test by parametrizing it with an argument that specifies the sampler implementation and whether to call .hybridize():

@pytest.mark.parametrize('hybridize', [True, False])
@pytest.mark.parametrize('sampler', [BeamSearchSampler, HybridizedBeamSearchSampler])
def test_beam_search(hybridize, sampler):
    [...]

Also, is there any advantage of BeamSearchSampler over HybridizedBeamSearchSampler when using the non-hybrid version?
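
As a small illustration of the stacked parametrize decorators in the comment above: pytest runs the test once per combination of the two argument lists, so the single test function covers four cases. The sketch below enumerates those combinations with the standard library only; the sampler names are plain strings standing in for the classes.

```python
from itertools import product

hybridize_options = [True, False]
sampler_names = ["BeamSearchSampler", "HybridizedBeamSearchSampler"]

# Every (sampler, hybridize) pair corresponds to one generated test case.
cases = list(product(sampler_names, hybridize_options))
for sampler_name, hybridize in cases:
    print(f"test_beam_search[{sampler_name}, hybridize={hybridize}]")
```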

junrushao · 2018-07-27T19:28:00Z

@leezu HybridizedBeamSearchSampler inherits from HybridBlock, but I am not sure whether I can change BeamSearchSampler into a gluon block, because it currently inherits from object. Could you help confirm?

HybridizedBeamSearchSampler requires two extra parameters, batch_size and vocab_size, which are required by static shape inference, while BeamSearchSampler takes advantage of imperative execution so it is more flexible.
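
One way to see why the sizes matter: a typical beam-search step flattens the (beam_size, vocab_size) scores of each sentence, takes the top beam_size entries, and then recovers the beam and word ids from the flat index by dividing by vocab_size. The sketch below shows that step on nested Python lists. It assumes this common formulation and is not taken from the PR; `select_candidates` is a hypothetical name.

```python
def select_candidates(scores, beam_size, vocab_size):
    """scores[b][k][v]: score of extending beam k of sentence b with word v."""
    selected = []
    for per_sentence in scores:
        # Flatten (beam_size, vocab_size) into beam_size * vocab_size entries.
        flat = [s for beam in per_sentence for s in beam]
        assert len(flat) == beam_size * vocab_size
        # Indices of the beam_size highest-scoring candidates.
        top = sorted(range(len(flat)), key=lambda i: -flat[i])[:beam_size]
        # Recover (beam id, word id) from each flat index.
        selected.append([divmod(i, vocab_size) for i in top])
    return selected


# One sentence, two beams, a vocabulary of three words.
scores = [[[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]]
picked = select_candidates(scores, beam_size=2, vocab_size=3)
```

In imperative code these sizes can be read from the array shapes at run time; in a symbolic graph without shape information they have to be supplied up front, which matches the explanation given in the comment above.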

szhengac · 2018-07-27T20:31:39Z

Good job. But for translation tasks, batch_size can change across iterations, so we cannot simply treat it as an extra parameter.

junrushao · 2018-07-27T20:36:28Z

@szhengac It is mandatory for now because static shape inference is required. This issue could be alleviated once symbolic shapes are supported.

codecov · 2018-07-29T08:22:47Z

Codecov Report

Merging #233 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #233   +/-   ##
=======================================
  Coverage   74.69%   74.69%           
=======================================
  Files          83       83           
  Lines        7682     7682           
  Branches     1315     1315           
=======================================
  Hits         5738     5738           
  Misses       1675     1675           
  Partials      269      269

Last update 0316eef...016acac.

mli · 2018-07-29T08:28:35Z

Job PR-233/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-233/8/index.html

junrushao · 2018-07-30T23:28:24Z

Question: do you guys prefer the name vocab_num or vocab_size?

szha · 2018-07-30T23:31:29Z

vocab_size, which is a term already used elsewhere in gluonnlp.

leezu · 2018-07-31T01:38:18Z

@junrushao1994 Yes, you can change it to inherit from Block. Just change the def __call__(self, inputs, states): to def forward(self, inputs, states):
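
The reason the rename is enough can be shown with a minimal stand-in: a Block-style base class owns `__call__` and dispatches to `forward`, so a subclass only needs to define `forward`. The `Block` class below is a simplified stand-in written for this note, not mxnet.gluon.Block (which also handles hooks and parameters), and `ToySampler` is hypothetical.

```python
class Block:
    """Simplified stand-in for a gluon-style base class."""

    def __call__(self, *args):
        # The base class owns __call__ and delegates to forward.
        return self.forward(*args)

    def forward(self, *args):
        raise NotImplementedError


class ToySampler(Block):
    """Hypothetical sampler: previously defined __call__, now defines forward."""

    def forward(self, inputs, states):
        return [x + 1 for x in inputs], states


# Callers keep using sampler(inputs, states) exactly as before.
result = ToySampler()([1, 2], "states")
```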

junrushao · 2018-07-31T01:53:28Z

@szha It seems there are already many occurrences of vocab_num in beam_search.py. Should I change them to vocab_size as well?

junrushao · 2018-08-07T00:49:55Z

It seems that numerical instability and nondeterminism can cause the unit tests to fail. For example, when there are close (or equal) values, topk does not seem to produce results consistent with numpy. So I slightly modified the unit tests to let them pass.
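
The tie problem can be reproduced without any framework. In the sketch below, two equally valid top-k implementations break ties differently, so comparing indices fails while comparing the selected values passes. This is only an illustration of the failure mode described above; `topk_indices` is a hypothetical helper, and it does not reproduce the behaviour of MXNet's topk or of numpy.

```python
def topk_indices(values, k, prefer_last=False):
    """Indices of the k largest values; ties go to the lowest or highest index."""
    tie = (lambda i: -i) if prefer_last else (lambda i: i)
    return sorted(range(len(values)), key=lambda i: (-values[i], tie(i)))[:k]


scores = [0.5, 0.9, 0.5, 0.1]  # positions 0 and 2 are tied
first = topk_indices(scores, 2)
last = topk_indices(scores, 2, prefer_last=True)

# The chosen indices differ, but the chosen scores are identical.
same_indices = first == last
same_values = [scores[i] for i in first] == [scores[i] for i in last]
```

A test that asserts on the selected values (or that avoids ties in its inputs) is robust to this kind of difference, whereas one that asserts on indices is not.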

szha · 2018-08-08T02:34:27Z

@junrushao1994 our test environment depends on a specific nightly version. Check under env/ and see if you need to update the date of the nightly build.

junrushao · 2018-08-08T08:11:33Z

@zheng-da found the bug, and we have just submitted the fix here: apache/mxnet#12078

junrushao · 2018-08-11T08:03:34Z

The most recent commit fails CI test for the following reason:

AttributeError: 'LSTM' object has no attribute 'h2h_weight'

Is that an incompatibility issue with the MXNet nightly build, or cuDNN, or something else? @szha

szha · 2018-08-11T19:26:38Z

@junrushao1994 yeah, I'm working on it as part of #264

mli · 2018-08-14T08:55:34Z

Job PR-233/27 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-233/27/index.html

mli · 2018-08-14T15:39:09Z

Job PR-233/28 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-233/28/index.html

junrushao · 2018-08-14T15:53:28Z

@szha This PR has passed the CI tests, so could you help review the code, especially the docstrings? (My English is pretty bad.) Thank you!

szha · 2018-08-14T17:14:23Z

tests/unittest/test_beam_search.py

@@ -196,13 +198,13 @@ def hybrid_forward(self, F, inputs, states):
            return log_probs, states

    class RNNLayerDecoder(Block):


This can be a hybrid block now

szha · 2018-08-14T17:14:45Z

tests/unittest/test_beam_search.py

-            for beam_size, bos_id, eos_id, alpha, K in [(2, 1, 3, 0, 1.0),  (4, 2, 3, 1.0, 5.0)]:
+            if sampler_cls is HybridBeamSearchSampler and decoder_fn is RNNLayerDecoder:
+                # Hybrid beam search does not work on non-hybridizable object
+                continue


no need to skip because RNNLayerDecoder can be a hybrid block.

mli · 2018-08-15T12:50:45Z

Job PR-233/32 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-233/32/index.html

junrushao · 2018-08-16T01:32:17Z

Thank you all for the help and suggestions!

* Add symbolic beam search
* Update _BeamSearchStepUpdate
* [WIP] Fix a lot of stuff

  Here is one thing I could not address. The `samples` are `taken` at each time stamp, it could not be expressed in `while_loop`. I totally have no idea how to deal with this.
* [WIP] fix typo
* [WIP] Submit fixes for debugging cut subgraph for Da
* Reduce search length to prevent numeral instability propagates
* Make linter happy
* Rename all vocab_num => vocab_size
* Change RNNLayerDecoder to HybridBlock
* Symbol[0] => Symbol.squeeze(axis=0)

junrushao requested a review from szha as a code owner July 27, 2018 00:53

junrushao force-pushed the symbolic-beam-search branch 6 times, most recently from b9b3c90 to 1e1fdd3 Compare July 27, 2018 01:12

szha requested review from sxjscience and leezu July 27, 2018 17:26

junrushao force-pushed the symbolic-beam-search branch from 50aaeb2 to 279953e Compare July 31, 2018 07:56

junrushao mentioned this pull request Jul 31, 2018

[MXNET-749] Bug fixes in control flow operators apache/mxnet#11942

Merged

6 tasks

junrushao force-pushed the symbolic-beam-search branch from 279953e to 7f00f1b Compare July 31, 2018 08:08

junrushao mentioned this pull request Aug 8, 2018

[MXNET-749] Correct usages of CutSubgraph in 3 control flow operators apache/mxnet#12078

Merged

7 tasks

junrushao force-pushed the symbolic-beam-search branch from 8dc047f to 3bd29db Compare August 11, 2018 07:26

junrushao mentioned this pull request Aug 12, 2018

GluonNLP 0.4 release tasks #271

Closed

20 tasks

junrushao force-pushed the symbolic-beam-search branch from 3bd29db to 6dec88d Compare August 13, 2018 06:48

junrushao force-pushed the symbolic-beam-search branch from c83b2fe to 9a4521b Compare August 14, 2018 15:18

szha reviewed Aug 14, 2018

junrushao added 9 commits August 14, 2018 14:01

Add symbolic beam search

e602ff3

Update _BeamSearchStepUpdate

bab7bc9

[WIP] Fix a lot of stuff

a6b5447

Here is one thing I could not address. The `samples` are `taken` at each time stamp, it could not be expressed in `while_loop`. I totally have no idea how to deal with this.

[WIP] fix typo

4a27daa

[WIP] Submit fixes for debugging cut subgraph for Da

54b7340

Reduce search length to prevent numeral instability propagates

09850de

Make linter happy

bb703fc

Rename all vocab_num => vocab_size

a69a773

Change RNNLayerDecoder to HybridBlock

a13a681

junrushao force-pushed the symbolic-beam-search branch from 9a4521b to a13a681 Compare August 14, 2018 21:03

szha approved these changes Aug 14, 2018

Symbol[0] => Symbol.squeeze(axis=0)

016acac

junrushao force-pushed the symbolic-beam-search branch from 1ae164a to 016acac Compare August 15, 2018 12:20

szha merged commit 2c6fbb9 into dmlc:master Aug 15, 2018
