This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-531] CNN Examples for Scala new API #11292

Merged
merged 11 commits into from
Jul 16, 2018

@lanking520 lanking520 commented Jun 14, 2018

Description

This PR contains examples written with the new API.
@nswamy @yzhliu @andrewfayres
All examples pushed here were tested locally on my Mac. They live in the new package org.apache.mxnet.examples.
Note: the CNN example is specifically designed to run on GPU; the CPU run is disabled.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@lanking520 lanking520 requested a review from yzhliu as a code owner June 14, 2018 21:39

lanking520 commented Jun 15, 2018

Currently facing the CI problem as shown below:

- Example CI - CNN Example *** FAILED ***

  java.nio.charset.MalformedInputException: Input length = 1

  at java.nio.charset.CoderResult.throwException(CoderResult.java:281)

  at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)

  at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)

  at java.io.InputStreamReader.read(InputStreamReader.java:184)

  at java.io.BufferedReader.read1(BufferedReader.java:210)

  at java.io.BufferedReader.read(BufferedReader.java:286)

  at java.io.Reader.read(Reader.java:140)

  at scala.io.BufferedSource.mkString(BufferedSource.scala:96)

  at org.apache.mxnet.examples.cnntextclassification.DataHelper$.loadMRDataAndLabels(DataHelper.scala:53)

  at org.apache.mxnet.examples.cnntextclassification.DataHelper$.loadMSDataWithWord2vec(DataHelper.scala:166)

  ...

I added a codec fix to see whether it solves this. Here is the link I found helpful. I cannot reproduce the failure on an Ubuntu machine.
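For readers who hit the same trace: `BufferedSource.mkString` throws `MalformedInputException` when the file contains bytes that are not valid in the codec used for decoding, so the result can depend on the machine's default charset. The sketch below is not the fix from this PR (the thread does not show it). It is a minimal standard-library illustration of two common workarounds, and `CodecSketch`, `readAll`, and `demo` are hypothetical names introduced here.

```scala
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import java.nio.file.Files
import scala.io.{Codec, Source}

object CodecSketch {
  // Hypothetical helper (not the PR's DataHelper): reads a whole file as text
  // with an explicitly supplied codec instead of the platform default.
  def readAll(path: String)(codec: Codec): String = {
    val source = Source.fromFile(path)(codec)
    try source.mkString finally source.close()
  }

  // Writes a small file containing one byte that is invalid in UTF-8 and
  // decodes it in two different ways.
  def demo(): (String, String) = {
    val tmp = Files.createTempFile("codec-sketch", ".txt")
    // 0xE9 is 'é' in ISO-8859-1 but an incomplete sequence in UTF-8 here.
    val bytes = ("caf".getBytes(StandardCharsets.US_ASCII) :+ 0xE9.toByte) ++
      "!".getBytes(StandardCharsets.US_ASCII)
    Files.write(tmp, bytes)
    try {
      // Option 1: a single-byte charset that can decode every byte value.
      val latin1 = readAll(tmp.toString)(Codec.ISO8859)

      // Option 2: stay on UTF-8 but replace malformed input instead of throwing.
      // A fresh Codec is created because onMalformedInput mutates the instance.
      val lenientUtf8 = Codec("UTF-8")
        .onMalformedInput(CodingErrorAction.REPLACE)
        .onUnmappableCharacter(CodingErrorAction.REPLACE)
      val replaced = readAll(tmp.toString)(lenientUtf8)

      (latin1, replaced)
    } finally Files.delete(tmp)
  }

  def main(args: Array[String]): Unit = {
    val (latin1, replaced) = demo()
    println(latin1)
    println(replaced)
  }
}
```

Which option is appropriate depends on the real encoding of the dataset: decoding with the correct charset preserves the text, while replacement only hides the undecodable bytes.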

@lanking520

The previous problem is solved. I am now facing a memory leak issue:

- Example CI - CNN Example *** FAILED ***

  org.apache.mxnet.MXNetError: [23:56:58] src/storage/./pooled_storage_manager.h:118: cudaMalloc failed: out of memory



Stack trace returned 10 entries:

[bt] (0) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(dmlc::StackTrace[abi:cxx11]()+0x1bc) [0x7f74407f0f1c]

[bt] (1) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f74407f2138]

[bt] (2) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(mxnet::storage::GPUPooledStorageManager::Alloc(mxnet::Storage::Handle*)+0x159) [0x7f744342e7c9]

[bt] (3) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(mxnet::StorageImpl::Alloc(mxnet::Storage::Handle*)+0x5d) [0x7f744343084d]

[bt] (4) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(mxnet::NDArray::CheckAndAlloc() const+0x238) [0x7f74409a9048]

[bt] (5) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(+0x33c19f0) [0x7f7442f429f0]

[bt] (6) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x1d5) [0x7f7442f6a0c5]

[bt] (7) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(+0x3896c7b) [0x7f7443417c7b]

[bt] (8) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x8e5) [0x7f7443411575]

[bt] (9) /work/mxnet/scala-package/native/linux-x86_64-gpu/target/libmxnet-scala-linux-x86_64-gpu.so(void mxnet::engine::ThreadedEnginePerDevice::GPUWorker<(dmlc::ConcurrentQueueType)0>(mxnet::Context, bool, mxnet::engine::ThreadedEnginePerDevice::ThreadWorkerBlock<(dmlc::ConcurrentQueueType)0>*, std::shared_ptr<dmlc::ManualEvent> const&)+0xeb) [0x7f744342869b]

  at org.apache.mxnet.Base$.checkCall(Base.scala:131)

  at org.apache.mxnet.NDArray.internal(NDArray.scala:942)

  at org.apache.mxnet.NDArray.toArray(NDArray.scala:935)

  at org.apache.mxnet.NDArrayFuncReturn.toArray(NDArray.scala:1152)

  at org.apache.mxnet.examples.cnntextclassification.CNNTextClassification$$anonfun$trainCNN$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(CNNTextClassification.scala:150)

  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)

  at org.apache.mxnet.examples.cnntextclassification.CNNTextClassification$$anonfun$trainCNN$1.apply$mcVI$sp(CNNTextClassification.scala:127)

  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)

  at org.apache.mxnet.examples.cnntextclassification.CNNTextClassification$.trainCNN(CNNTextClassification.scala:121)

  at org.apache.mxnet.examples.cnntextclassification.CNNTextClassification$.test(CNNTextClassification.scala:248)

  ...

I will look into the code to see whether we are failing to dispose some NDArrays.
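As background for the leak hunt: native-backed objects such as NDArrays hold memory outside the JVM heap, so temporaries created inside a training loop need an explicit `dispose()` call or they accumulate until allocation fails. The sketch below does not use MXNet. `NativeBuffer` is a hypothetical stand-in that only counts live instances, and `using` is an assumed helper name, so this illustrates the pattern rather than the change made in this PR.

```scala
object DisposeSketch {
  // Hypothetical stand-in for a native-backed array such as an NDArray.
  // It tracks how many instances are still alive so leaks become visible.
  final class NativeBuffer(private val values: Array[Float]) {
    NativeBuffer.live += 1
    private var disposed = false

    // Copies the data out, as a device-to-host read would.
    def toArray: Array[Float] = {
      require(!disposed, "buffer already disposed")
      values.clone()
    }

    // Releases the (simulated) native memory; safe to call more than once.
    def dispose(): Unit = if (!disposed) {
      disposed = true
      NativeBuffer.live -= 1
    }
  }

  object NativeBuffer {
    var live = 0 // number of buffers not yet disposed
  }

  // Runs `body` on the buffer and always disposes it afterwards,
  // even if `body` throws.
  def using[A](buffer: NativeBuffer)(body: NativeBuffer => A): A =
    try body(buffer) finally buffer.dispose()

  // Simulated training loop: every batch creates a temporary result buffer
  // that is read once and then released.
  def run(batches: Int): Float = {
    var total = 0f
    for (i <- 0 until batches) {
      total += using(new NativeBuffer(Array(i.toFloat, 1f)))(_.toArray.sum)
    }
    total
  }

  def main(args: Array[String]): Unit = {
    println(run(3))
    println(NativeBuffer.live)
  }
}
```

Wrapping each temporary in a helper like this keeps the release next to the allocation, which makes a missed `dispose()` easier to spot in nested epoch and batch loops.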

@lanking520

I have switched from epoch=200 to 30 and set dropout to 0.0 to match the Python configuration. The training result still does not look good.

@lanking520 lanking520 changed the title [MXNET-531] CNN Examples for Scala new APi [MXNET-531] CNN Examples for Scala new API Jun 25, 2018
@lanking520 lanking520 force-pushed the example-cnn branch 2 times, most recently from acddbe1 to 147972c Compare June 26, 2018 20:12

nswamy commented Jun 29, 2018

@lanking520
Though not ideal, this example is not tuned to achieve great accuracy. I think we should call this out explicitly in the README: the example is for illustration purposes only at this time.

@@ -0,0 +1,23 @@
# CNN Text Classification Example for Scala
This is the example using Scala type-safe api doing CNN text classification.
Currently, I cannot reproduce the same result in the python example here.

This example is only for Illustration and not modeled to achieve the best accuracy.

@lanking520 lanking520 force-pushed the example-cnn branch 2 times, most recently from e4729aa to 539c091 Compare July 3, 2018 22:09
This is the example using Scala type-safe api doing CNN text classification.
This example is only for Illustration and not modeled to achieve the best accuracy.

Please contribute to improve the dev accuracy of the model.

👍

/**
* An Implementation of the paper
* Convolutional Neural Networks for Sentence Classification
*/

👍

val sm = Symbol.SoftmaxOutput()()(Map("data" -> fc, "label" -> inputY))
val fc = Symbol.api.FullyConnected(data = Some(hDrop), num_hidden = numLabel)
val sm = Symbol.api.SoftmaxOutput(data = Some(fc), label = Some(inputY))
fc.dispose()

@nswamy nswamy Jul 4, 2018

Have you checked that this does not crash? Isn't SoftmaxOutput using FC during training?

@lanking520

Yes, I have. It does not crash.

yzhliu commented Jul 11, 2018

Can we merge?

@nswamy nswamy merged commit 5495e2a into apache:master Jul 16, 2018
@lanking520 lanking520 deleted the example-cnn branch July 17, 2018 21:57
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* init commit for CNN

* Add changes to pass CPU and GPU test

* Add codec to solve the mkstring issue

* Change dropout and epoch number

* change epoch to 10 and Java download

* add README.md

* adding dispose method to avoid memory leaks

* dispose unused Symbols