This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

fix a bug in CachedOp. #11675

Merged: 5 commits, Jul 21, 2018

zheng-da
Contributor

Description

After adding kSubgraphExec, some logic in CachedOp is no longer valid. For example, an op executor that requires async execution may not contain output arrays. On the other hand, calling CreateEngineOp on op executors without output arrays can fail. This PR tries to fix this problem.
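The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the MXNet implementation: `ToyExec`, `Segment`, `create_engine_op`, and `build_segments` are hypothetical names, and the rule "check for missing output arrays before checking for async execution" is one plausible reading of the fix.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class ExecType(Enum):
    SYNC = auto()           # ordinary operator, can be grouped with neighbours
    ASYNC = auto()          # schedules its own work asynchronously
    SUBGRAPH_EXEC = auto()  # runs a subgraph itself (toy analogue of kSubgraphExec)


@dataclass
class ToyExec:
    """Hypothetical stand-in for an op executor."""
    name: str
    exec_type: ExecType
    has_outputs: bool = True


@dataclass
class Segment:
    """Half-open node range [start, end) and the engine op covering it, if any."""
    start: int
    end: int
    engine_op: Optional[str]


def create_engine_op(execs: List[ToyExec]) -> str:
    """Toy stand-in: fails on executors that have no output arrays."""
    if any(not ex.has_outputs for ex in execs):
        raise ValueError("executor without output arrays")
    return "+".join(ex.name for ex in execs)


def build_segments(execs: List[ToyExec]) -> List[Segment]:
    segments: List[Segment] = []
    pending: List[ToyExec] = []  # sync executors waiting to be grouped
    seg_start = 0
    for nid, ex in enumerate(execs):
        is_async = ex.exec_type is not ExecType.SYNC
        valid = ex.has_outputs
        # Close the current group before any node that must stand alone.
        if (is_async or not valid) and pending:
            segments.append(Segment(seg_start, nid, create_engine_op(pending)))
            pending = []
        if not valid:
            # Assumed fix: no engine op is created for this node at all.
            segments.append(Segment(nid, nid + 1, None))
            seg_start = nid + 1
        elif is_async:
            segments.append(Segment(nid, nid + 1, create_engine_op([ex])))
            seg_start = nid + 1
        else:
            pending.append(ex)
    if pending:
        segments.append(Segment(seg_start, len(execs), create_engine_op(pending)))
    return segments


example_execs = [
    ToyExec("a", ExecType.SYNC),
    ToyExec("b", ExecType.SYNC),
    ToyExec("loop", ExecType.SUBGRAPH_EXEC, has_outputs=False),
    ToyExec("c", ExecType.ASYNC),
    ToyExec("d", ExecType.SYNC),
]
for seg in build_segments(example_execs):
    print(seg)
```

In this toy model, testing `is_async` before `valid` would send the `loop` executor to `create_engine_op` and raise, which mirrors the kind of failure the description mentions.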

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@zheng-da zheng-da requested a review from anirudh2290 as a code owner July 12, 2018 20:43
    {'static_alloc': True, 'static_shape': True} ]
for config in configs:
    layer = TestRNNLayer(cell_type, hidden_size)
    layer.initialize(ctx=mx.cpu(0))
Contributor

Are we sure about hardcoding the context?
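One way to avoid a hardcoded context in a test like this is to read it from a helper that the test harness can override. The sketch below is a self-contained illustration only: `Context`, `ToyLayer`, `default_context`, `set_default_context`, and `run_configs` are hypothetical stand-ins written for this example, the call to `hybridize` is an assumption about how the configs are used, and every config except the last one is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for a device context such as mx.cpu(0)."""
    device_type: str
    device_id: int = 0


# Module-level default that a test harness could override, for example to
# rerun the same tests on another device.
_DEFAULT_CTX = Context("cpu", 0)


def set_default_context(ctx):
    global _DEFAULT_CTX
    _DEFAULT_CTX = ctx


def default_context():
    return _DEFAULT_CTX


class ToyLayer:
    """Hypothetical layer that only records how it was set up."""

    def __init__(self):
        self.ctx = None
        self.config = None

    def initialize(self, ctx):
        self.ctx = ctx

    def hybridize(self, **config):
        self.config = config


def run_configs(configs):
    """Create one layer per config, initialized on the default context."""
    layers = []
    for config in configs:
        layer = ToyLayer()
        layer.initialize(ctx=default_context())  # not hardcoded
        layer.hybridize(**config)
        layers.append(layer)
    return layers


# Only the last entry appears in the excerpt; the others are assumptions.
configs = [
    {},
    {'static_alloc': True},
    {'static_alloc': True, 'static_shape': True},
]
```

With this shape, switching devices is a single `set_default_context(...)` call in the harness rather than an edit to each test body.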

  seg = EngineOprSeg{false, nid + 1, nullptr};
} else if (is_async) {
  seg = EngineOprSeg{false, nid + 1};
  seg.opr.reset(CreateEngineOp(default_ctx, seg_execs));
  seg_execs.clear();
  seg_start = nid + 1;
Member

extract common code outside branch

Contributor Author

We can't. The code is used in the "if" and "else if" branches, but not in the "else" branch.
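The point about branch structure can be shown with a small toy. This is a sketch, not the code under review: `step`, `reset_segment`, and the `state` dictionary are hypothetical, and `no_engine_op` is a placeholder for the first condition, which is not visible in the excerpt. It shows that the shared bookkeeping belongs to two of the three paths, so moving it after the branches would also reset state on the fall-through path.

```python
def reset_segment(state, nid):
    """Hypothetical helper holding the bookkeeping shared by two branches."""
    state["seg_execs"].clear()
    state["seg_start"] = nid + 1


def step(state, nid, no_engine_op, is_async):
    """Process one node; only two of the three paths start a new segment."""
    state["seg_execs"].append(nid)
    if no_engine_op:
        state["segments"][nid] = None
        reset_segment(state, nid)
    elif is_async:
        state["segments"][nid] = tuple(state["seg_execs"])
        reset_segment(state, nid)
    # Implicit "else": keep accumulating nodes. Resetting here would drop
    # the nodes collected so far, so the shared lines cannot simply be moved
    # below the branches.


state = {"seg_execs": [], "seg_start": 0, "segments": {}}
step(state, 0, no_engine_op=False, is_async=False)
step(state, 1, no_engine_op=False, is_async=False)
accumulated = list(state["seg_execs"])
step(state, 2, no_engine_op=False, is_async=True)
```

A local helper such as `reset_segment` is one way to remove the duplication without changing which paths run it.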

@zheng-da
Contributor Author

the PR will be updated after #11566 is merged.

@zheng-da zheng-da force-pushed the fix_cachedop branch 2 times, most recently from 6f5b965 to 621a6d1 Compare July 20, 2018 07:06
@zheng-da
Contributor Author

@eric-haibin-lin @piiswrong could you please review this PR?

@eric-haibin-lin
Member

what's the new test case added for this?

@zheng-da
Contributor Author

It's here. I tested the code in more CachedOp configurations.
https://github.com/apache/incubator-mxnet/pull/11675/files#diff-52501b7b512a5434dbc54931da1f0f2cR1006

@eric-haibin-lin eric-haibin-lin merged commit 6798703 into apache:master Jul 21, 2018
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* fix a bug.

* add tests.

* use default context.

* move all tests to test_contrib_control_flow.py

* fix test.
@zheng-da zheng-da deleted the fix_cachedop branch September 29, 2018 21:32
4 participants