Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add seed_aug parameter for ImageRecordItr to fix random seed for default augmentation #11247

Merged
merged 57 commits into from
Jun 20, 2018

Conversation

wenyangchu
Copy link
Contributor

@wenyangchu wenyangchu commented Jun 12, 2018

Description

Add random seed control to augmentations in io.ImageRecordIter.
Motivation: I am doing a medical imaging project where reproducibility of the complete training procedure is important for a legal purpose.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
    (Add seed_aug parameter for ImageRecordItr to fix random seed for default augmentation)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

Wen-yang Chu and others added 3 commits June 12, 2018 20:39
…t module (apache#11140)

* gluon import

* gluon tests

* shape issues.

* remove the dim_change list

* onnx backend tests

* changes to match onnx op set version 7

* fix

* lint fix

* add new folder

* fix

* fix

* rename test file

* tile topk clip

* more ops

* fix

* fix
@marcoabreu
Copy link
Contributor

Hi @wenyangchu ,

thanks a lot for your contribution! Would you mind adding a test to ensure that the behaviour is actually deterministic now and that everything is behaving as expected?

* Important ndarray feature

* merge generic function Invoke

* Pass the Scala Style test

* add Experimental tags

* Change with NDArgs addition

* change dir for Experimental tag

* reTrigger CI

* add Symbol Macros change

* Add some workaround on NDArray

* Simplify the base part

* add changes on ND and Symbols...

* avoid vars

* add Symbol Macros

* Trigger the CI

* Trigger CI
@wenyangchu
Copy link
Contributor Author

Hi @marcoabreu ,
sure, I will add a test for it. Do you know where the existing tests for this c++ iterators right now? Or for this default augmenter? I couldn't find it.

@wenyangchu
Copy link
Contributor Author

wenyangchu commented Jun 13, 2018

Hi @marcoabreu or any others, another question is that, I have a test failure in python3 cpu win and python3 MKL-DNN cpu. I have checked them and I do not think they have anything to do with my commit. Do you have a clue on this? Thanks!

Don't use tempdir on osx since this is not mountable by default in Docker in osx
@marcoabreu
Copy link
Contributor

Hi, our tests are in https://github.com/apache/incubator-mxnet/tree/master/tests/cpp

Yes, sorry, we are encountering quite a decent number of flaky tests. Ammending your commit should be sufficient to retrigger a new job. Sorry for the inconvenience.

ThomasDelteil and others added 7 commits June 13, 2018 10:39
* Adding examples to list

* Update links add demo

* remove onnx-mxnet as it is now part of contrib

* Trigger build

* Adding Awesome-MXNet

* Trigger build
* use nearest power of 2 for gpu memory pool sizes

* add linear

* add test
* disable scalatest on Spark

* Kill the entire build

* Disable test in test folder and add dummy label
* mkldnn support for quantization

* fix output number in graph

* update licsence

* modify Jenkinsfile

* modify Jenkinsfile

* mkldnn has no int8 fc api, excluded_sym_names includes fc for cpu

* add mkldnn uint8 pass for quantization graph

* update ut

* retrig ic

* remove no mkldnn quantization test temp

* seperate mkldnn quantization ut from gpu quantization ut

* rm dev_id check for cpu

* add mkl tests dictionary

* resolve review comments

* simplify DequantizeStorageType() logic

* simplify quantize/quantized_conv storage type logic

* Add mkldnn_OIhw4i16o4i type case (needed by int8)

* INT8 conv/pooling: share with FP32 convolution/pooling class/function

* minor indent changes

* Remove unnecessary mkldnn_quantized_pooling-inl.h

* Fix minor issue

* Fix lint

* delete duplicated data type

* fix bugs and convert requantize data to NDArray

* fix lint

* fix requantize storgetype

* fix requantize storge type

* Fix coding style comments

* Fix compile issue

* Change to use quantized_dtype option to support uint8/int8 scenarios

* fix gpu test quantization failure

* Fix indent

* fix quantized pooling param parser

* Fix imagenet_gen_qsym.py option style

* retrigger jenkins

* retrigger again

* trigger jenkins

* Resolve further comments

* share test code

* remove unnecessary test code

* add test_quantize_model for cpu

* add comments in quantize_graph_pass.cc

* jenkins

* jenkins

* improve coding style

* improve coding style

* Add naive CPU quantization test back and share quantization code between naive-CPU/MKLDNN/GPU

* rename test_quantization_cpu.py to test_quantization_mkldnn.py

* code style

* trigger

* Adjust variable naming for test quantization

* add qdtype for quantized op test case to test/bypass all cases explicitly

* change expressions to be consistent

* revert unnecessary change
* add import_ for SymbolBlock

* fix

* Update block.py

* add save_parameters

* fix

* fix lint

* fix

* fix

* fix

* fix

* fix

* Update save_load_params.md
* Adding the first set of Facebook OG tags to layout.html

* Adding the direct link to the Facebook OG image

* Re-arranging the og description tag

* Changes to fix merge conflicts

* Adding secure_url to the og:image tag

* Adding the redundant og:image tag
@wenyangchu wenyangchu changed the title add seed_aug parameter for ImageRecordItr to fix random seed for augm… [WIP] add seed_aug parameter for ImageRecordItr to fix random seed for augm… Jun 14, 2018
* add test for mkldnnsum

* add extra comment

* fix verify

* only run if mkldnn supported

* divide size by size of type

* filter out regular arrays

* fix cond

* add verify mes

* filter views

* different outputs arrays

* move print message

* add in place test

* update copy fn

* refactor copyfrom

* use arr.copy instead of tmpmemmg

* use InitMKLDNNArray helper

* fix params

* pass correct type to copyfrom

* add print message for inplace sum

* remove copyfrom refactor

* remove redundant header

* remove extra line

* fix lint

* retrigger
@wenyangchu wenyangchu closed this Jun 14, 2018
@wenyangchu wenyangchu reopened this Jun 14, 2018
piiswrong and others added 4 commits June 14, 2018 14:02
* Improve hybridblock doc

* fix
* disable testcase test_gru_bidirectional temporarily

* retrigger the build and see if permission denied /efs-ccache can pass
…rs() (apache#11210)

* flipping official and alternative

* Update block.py

* Add new line

* fix trailing whitespaces

* clarify docs

* Update block.py

* Update block.py

* fix typo

* trailing whitespace

* Trigger build
@wenyangchu wenyangchu changed the title [WIP] add seed_aug parameter for ImageRecordItr to fix random seed for augm… Add seed_aug parameter for ImageRecordItr to fix random seed for default augmentation Jun 14, 2018
@wenyangchu
Copy link
Contributor Author

wenyangchu commented Jun 14, 2018

Hi @marcoabreu or others, the c++ tests for iterators in mxnet are not there and iterator's code is entirely in cpp file which is very difficult to test in c++ without refactoring. I found other tests for c++ iterators were done in python. I follow the same way to perform tests on this augmentation seed.

Is it correct that the following document should be updated automatically after merging this PR?
https://mxnet.incubator.apache.org/api/python/io/io.html#mxnet.io.ImageRecordIter

marcoabreu and others added 2 commits June 14, 2018 14:46
* Enable shared ccache

* Increase cache size

* Don't fail if EFS ccache is not available
* improve transforms.Resize

* fix

* Trigger CI

* Trigger CI

* improve

* Trigger CI

* Trigger CI

* fix unittest

* keep_ratio is false by default, to keep consistency
@marcoabreu
Copy link
Contributor

Yes, this will automatically update after merging.

A python test is totally fine as well

haojin2 and others added 3 commits June 18, 2018 11:10
* add count_include_pad argument

* add cound_include_pad in cudnn pooling and corresponding tests

* add gluon support for the new option

* switch to built-in functions for artificial NaN

* add doc for gluon

* change isAvg/getAvg to is_avg/get_avg
…e#11316)

* MXNET_FORCE_ADDTAKEGRAD to disable AddTakeGradLargeBatchCaller

If MXNET_FORCE_ADDTAKEGRAD is set, EmbeddingOpBackward will always use
AddTakeGrad independently of gradient input and output shape

* Read MXNET_FORCE_ADDTAKEGRAD to a static variable
…am.save(). Fix trainer init_kvstore (apache#11266)

* clip sparse grad. fix _reduce for rowsparse param

* fix kvstore init for local kv

* trigger
@@ -136,6 +139,8 @@ struct DefaultImageAugmentParam : public dmlc::Parameter<DefaultImageAugmentPara
DMLC_DECLARE_FIELD(pad).set_default(0)
.describe("Change size from ``[width, height]`` into "
"``[pad + width + pad, pad + height + pad]`` by padding pixes");
DMLC_DECLARE_FIELD(seed_aug).set_default(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better use dmlc::optional() and check has_value() rather than compare with -1.
Literally we can use any number as seed.

indhub and others added 6 commits June 18, 2018 19:12
* Add first draft of profiler tutorial

* Minor changes

- Add images from web-data
- Add <!--notebook-skip-line-->

* Language corrections

* Minor changes

- Fix image URLs
- Fix formatting of output

* Minor changes

- Add download button.
- Hide profile_stats.png in notebook.

* Add tutorial to index.

* Add tutorial to tests.

* Add a note about nd.waitall()

* Remove the example build command.

Link to installation page is sufficient.

* Fix typo

* Include info about env variables related to profiling

* Add a further reading section
…11302)

* Add ccache reporting to CI

* Restructure dockcross dockerfiles to fix caching
* add resnet augmentation

* add test

* fix scope

* fix warning

* fix lint

* fix lint

* add color jitter and pca noise

* fix center crop

* merge

* fix lint

* Trigger CI

* fix

* fix augmentation implementation

* add checks for parameters

* modify training script

* fix compile error

* Trigger CI

* Trigger CI

* modify error message

* Trigger CI

* Trigger CI

* Trigger CI

* improve script in example

* fix script

* clear code

* Trigger CI

* set min_aspect_ratio to optional, move rotation and pad before random resized crop

* fix

* Trigger CI

* Trigger CI

* Trigger CI

* fix default values

* Trigger CI
@wenyangchu
Copy link
Contributor Author

I think I have flaky test again

* add softmax imporvement

* reuse CheckAxis code

* update comment

* add tests with negative axis
@zhreshold
Copy link
Member

see if rebase fix the flaky test

@zhreshold zhreshold merged commit 9ffc03c into apache:master Jun 20, 2018
@zhreshold
Copy link
Member

@wenyangchu thanks for the contribution, this is merged.
A quick tip: please use rebase instead of merge to master because otherwise it's hard for people to review only your changes in this PR.

zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…ult augmentation (apache#11247)

* add seed_aug parameter for ImageRecordItr to fix random seed for augmentation

* remove white space

* add test

* fix test

* improve according to review: using dmlc::optional and has_value()

* missing header

* change data type and way to get value
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
…ult augmentation (apache#11247)

* add seed_aug parameter for ImageRecordItr to fix random seed for augmentation

* remove white space

* add test

* fix test

* improve according to review: using dmlc::optional and has_value()

* missing header

* change data type and way to get value
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.