-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-542] Fix mkldnn performance regression + improve test logging #11262
Conversation
8f89758
to
becedcc
Compare
ee981f8
to
3f7e954
Compare
42dd59a
to
4ec8698
Compare
so this diff fails because of a recent test that was added after my initial merge that caused the regression. the test compares the output of the hybridized vs non-hybridized networks and ensures the results are roughly equal. prior to my merge that caused the regression, while it was performant, the hybridized results vs non-hybridized results varied (which causes this new test to fail). |
1cceb3c
to
1fa9e35
Compare
Benchmarks as of now:
This PR
|
Benchmark from diff 9514a1e
|
@pengzhao-intel please review |
Thanks @azai91 Could you elaborate what is the root cause of the regression and how do you fix it? |
@pengzhao-intel the root cause of the issue was that we were always creating new tmp memory for WriteTo operations in the new diff (https://github.com/apache/incubator-mxnet/pull/11026/files#diff-21f97d54b22ca65a086fe9e13c217453R169). I added a check in CreateMKLDNNMem to not create new memory unless the formats are incompatible. |
Sounds good. @TaoLv please help take a look for the code changes. |
I have a separate PR where I write unit test for the CreateMKLDNNMem/CommitOutput coming. |
@@ -327,7 +327,8 @@ typedef std::pair<OutDataOp, mkldnn::memory *> mkldnn_output_t; | |||
* If these two functions are used, we have to call CommitOutput to write | |||
* the output back to the output NDArray. | |||
*/ | |||
mkldnn_output_t CreateMKLDNNMem(const NDArray &arr, | |||
mkldnn_output_t CreateMKLDNNMem(const NDArray &out_arr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you create a separate function for this? so we don't need to modify all mkldnn operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but then technically wouldn't all shouldn't all operators be eligible to use WriteInPlace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for overriding this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@azai91 not really. WriteInplace only works in a few operators (sum, activation, batchnorm). It doesn't work on some operators such as convolution and pooling for sure because the inputs and outputs of these operators don't have the same shape.
If you prefer to use one function, please update the comment and explain why CreateMKLDNNMem
needs in_arr
. Meanwhile, please rename arr
of CreateMKLDNNWeightGrad
to out_arr
, so CreateMKLDNNMem
and CreateMKLDNNWeightGrad
are still consistent.
return mkldnn_output_t(OutDataOp::CopyBack, tmp); | ||
} else { | ||
} else if (req == kWriteInplace) { | ||
if (CanWriteTo(out_arr, in_arrs[0], desc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you pass all inputs to this function but only use the first one?
const mkldnn::memory::primitive_desc &desc) { | ||
bool add_same = in_arr.GetMKLDNNData()->get_data_handle() == | ||
out_arr.GetMKLDNNData()->get_data_handle(); | ||
bool pdesc_same = out_arr.GetMKLDNNData()->get_primitive_desc() == desc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you should compare the primitive descriptor of both input and output NDArrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can only compare with first input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you are not comparing with the first input. desc
can be different from in_arr
's descriptor, right?
src/operator/nn/mkldnn/mkldnn_sum.cc
Outdated
@@ -92,7 +92,7 @@ void MKLDNNSumForward(const nnvm::NodeAttrs& attrs, const OpContext &ctx, | |||
} else { | |||
// req == kWriteInplace but cannot be handled by mkldnn and | |||
// req == kAddTo will run into this branch | |||
auto mem = CreateMKLDNNMem(out_data, pdesc.dst_primitive_desc(), req); | |||
auto mem = CreateMKLDNNMem(out_data, inputs, pdesc.dst_primitive_desc(), req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you remove the code above that handles WriteInplace differently.
tests/cpp/operator/mkldnn.cc
Outdated
@@ -722,6 +731,27 @@ void TestBinaryOp(const OpAttrs &attrs, VerifyFunc verify_fn) { | |||
} | |||
} | |||
|
|||
TEST(MKLDNN_NDArray, CopyFrom) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any modification for this test? Can you move it back to its original place to show the difference more clearly.
6aa538f
to
7f9dac4
Compare
@pengzhao-intel for in place writes, is there any reason why we only compare the pdesc and data_handle of the first input and not all of the them? does the write in place mkldnn api only work if the vector to be written over the first one? https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_sum.cc#L76 |
027c620
to
873a2c6
Compare
Ran a test and it seems this is the case - write in place (at least for sum) requires that the tensor to be written over is at index 0. @pengzhao-intel please verify. |
bb31c0b
to
83ce4f0
Compare
27218eb
to
f4b7db8
Compare
f4b7db8
to
76b50bc
Compare
mkldnn::memory *mem = const_cast<NDArray &>(arr).CreateMKLDNNData(desc); | ||
if (mem == nullptr) { | ||
} else if (req == kWriteInplace && in_arr != nullptr && CanWriteTo(out_arr, *in_arr, desc)) { | ||
mkldnn::memory *mem = const_cast<NDArray &>(out_arr).CreateMKLDNNData(desc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if output data is a view and the required format is MKLDNN format, CreateMKLDNNData
can return null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything else looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently we have this check before calling CreateMKLDNNData, but I will add a CHECK to make sure the developer knows to reorder2default beforehand.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_act.cc#L164
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is different. CreateMKLDNNData is called on output arrays. The check you added in activation is on input arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the input and output arrays are the same for kWriteInPlace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see. you are right.
2b3bd3a
to
d34a50b
Compare
…pache#11262) * do not create tmp memory during act * fix order of alloc memory * fix conditional * fix order * do not pass nullptr to commit * fix comment * do not create tmp mem unless shapes diff * fix params * always return in CreateMKLDNNMem * add boilerplate for CreateMKLDNNMem test * refactor copyfrom * use copyfrom helper in tests * add logs * missing semi * improve print msg * target out_mem * test copy from * reuse verify copy * add inplace test / use sum for test * use assert in sum verify * lint * remove unused var * fix test messsage * out_mem can be null * Revert "refactor copyfrom" This reverts commit 4ab131e. * add back missing var * writeinplace explicitly returns same memory * refactor * only writeinplace if add and pdesc are eq * fix comparison * add second CreateMKLDNNMemory * CreateMKLDNNMem accepts input * refactor WriteTo criteria into separate method * fix lint * copyfrom test back * update mldnnsum test to have diff inputs for write in place * test in place sum with diff arrs * revert CreateMKLDNNMem extra param change * pass input arr param for act_forward * remove extra header * fix indent * add check for writeto * canwriteto uses ref instead of ptr * update comments for CreateMKLDNNData * compare input and output desc with op pdesc * check CreateMKLDNNData does not return null
…pache#11262) * do not create tmp memory during act * fix order of alloc memory * fix conditional * fix order * do not pass nullptr to commit * fix comment * do not create tmp mem unless shapes diff * fix params * always return in CreateMKLDNNMem * add boilerplate for CreateMKLDNNMem test * refactor copyfrom * use copyfrom helper in tests * add logs * missing semi * improve print msg * target out_mem * test copy from * reuse verify copy * add inplace test / use sum for test * use assert in sum verify * lint * remove unused var * fix test messsage * out_mem can be null * Revert "refactor copyfrom" This reverts commit 4ab131e. * add back missing var * writeinplace explicitly returns same memory * refactor * only writeinplace if add and pdesc are eq * fix comparison * add second CreateMKLDNNMemory * CreateMKLDNNMem accepts input * refactor WriteTo criteria into separate method * fix lint * copyfrom test back * update mldnnsum test to have diff inputs for write in place * test in place sum with diff arrs * revert CreateMKLDNNMem extra param change * pass input arr param for act_forward * remove extra header * fix indent * add check for writeto * canwriteto uses ref instead of ptr * update comments for CreateMKLDNNData * compare input and output desc with op pdesc * check CreateMKLDNNData does not return null
Description
Fix MKLDNN performance issue
CreateMKLDNNMem accepts input as param to compare if WriteInPlace valid
improve test logging
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments