Performance optimizations

Improved fp32 convolution performance for real-time inference on Intel(R) Xeon processors with Intel(R) AVX512 instruction set support
Improved int8 depthwise separable convolution performance on processors with Intel(R) AVX512 instruction set support
Improved 3D convolution performance on Intel(R) Xeon Phi(TM) processors with AVX512_4FMAPS and AVX512_4VNNIW instruction groups support
Optimized dilated convolutions for int8 and fp32 data types
Improved performance of pooling primitives for NHWC and NCHW data layouts
Improved performance of 3D pooling primitives for plain data layouts
Optimized batch normalization backpropagation for Intel(R) processors with AVX and SSE4.2 instruction groups support
Improved performance of batch normalization with 3D spatial data
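
For readers unfamiliar with the dilated convolutions mentioned in the list above, the following is a minimal reference sketch in plain Python. It is not the library's implementation and does not use its API; the function name `dilated_conv1d` and the convention that a dilation rate of 1 means an ordinary dense kernel are assumptions made for illustration. It shows how dilation spreads the kernel taps apart so that a kernel of size k covers (k - 1) * dilation + 1 input elements.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Reference 1D dilated convolution (cross-correlation, no padding, stride 1).

    Hypothetical helper for illustration only. Here dilation=1 means an
    ordinary dense kernel; other conventions exist.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1 in this sketch")

    # Number of input elements spanned by the dilated kernel.
    span = (len(kernel) - 1) * dilation + 1
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []

    # Each output sums kernel taps applied `dilation` elements apart.
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(out_len)
    ]
```

With a three-tap kernel and dilation 2, each output reads inputs at offsets 0, 2, and 4, so the receptive field grows without adding weights. An optimized kernel computes the same values while reordering loops and data for vector instructions.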

New functionality

Feature preview: Introduced training and inference support for GRU cells in recurrent neural networks (RNN)
Introduced general purpose SGEMM API
Introduced deconvolution (or transposed convolution) primitive for 3D spatial data
Introduced backward propagation for softmax primitive
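
As background for the softmax backward propagation item above, here is a minimal sketch of the underlying math in plain Python. It does not use the library's API and is not its implementation; the function names `softmax` and `softmax_backward` are hypothetical. Given the forward output y and the incoming gradient dy, the gradient with respect to the input is dx_i = y_i * (dy_i - sum_j(dy_j * y_j)).

```python
import math


def softmax(x):
    """Numerically stable softmax over a list of floats."""
    m = max(x)  # subtract the max to avoid overflow in exp
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(y, dy):
    """Gradient of the loss w.r.t. the softmax input.

    y  -- forward output of softmax
    dy -- gradient of the loss w.r.t. y
    """
    # Inner product of the output with the incoming gradient.
    dot = sum(yi * dyi for yi, dyi in zip(y, dy))
    return [yi * (dyi - dot) for yi, dyi in zip(y, dy)]
```

Note that the backward pass needs only the forward output and the incoming gradient, not the original input. The resulting gradient components sum to zero, which reflects the fact that adding a constant to every input leaves the softmax output unchanged.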

Thanks to the contributors

This release contains contributions from many Intel(R) Performance Libraries developers as well as Tuomas Kärnä @tkarna, @msakai, Can Balioglu @cbalioglu, Jacek Czaja @jczaja, Thejan Wijesinghe @ThejanW, Jesse Nicholson @TechnikEmpire, @okdshin, Crissman Loomis @Crissman. We would also like to thank everyone who asked questions and reported issues.

*Other names and brands may be claimed as the property of others.

v0.15