v3.6.1

@vpirogov vpirogov released this 06 Nov 00:05
· 37 commits to rls-v3.6 since this release

This is a patch release containing the following changes to v3.6:

  • Fixed convolution correctness issue in some scenarios involving persistent cache on Intel GPUs (e595e59)
  • Fixed potential page faults in reduction primitive implementation for Intel GPUs (7740c75, a4fcef9, 32d8660)
  • Implemented a workaround for GCC 13 bug that resulted in matmul hangs on some Intel Arc graphics SKUs (a30d526)
  • Updated execution unit (EU) count detection logic for Intel GPUs based on the Xe2 architecture to accommodate behavior changes in the Linux driver (04e7eac, 97b04bd)
  • Fixed a build issue for the static library with ONEDNN_VERBOSE=OFF (7f476cb)
  • Fixed correctness issue in SYCL deconvolution implementation with post-ops (8f600a3)
  • Fixed memory format checks in SYCL softmax implementation (6ae73e4)
  • Fixed correctness issue in SYCL resampling implementation with post-ops (9845057)
  • Aligned accessor types in SYCL kernels with SYCL specification (0d9b3bd)
  • Improved scales argument checks in generic SYCL kernels (9f73bf1, 7d85c75)
  • Fixed correctness issue in int8 convolution with sum post-op on NVIDIA GPUs (7486ed8)
  • Relaxed accuracy test threshold for bf16 softmax on NVIDIA GPUs (e9d0fdb)
  • Added support for bf16 and fp16 bias for fp8 matmul on Intel CPUs (188ae7f)
  • Fixed a bug that prevented dispatching Intel AVX-512 with Intel DL Boost implementation in int8 RNN primitive (bf58e72)
  • Fixed a runtime failure with a CL_OUT_OF_RESOURCES error in fp16 convolution on Intel Arc graphics (39a5f67, 7e1663f)
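
For reference, the static-library configuration affected by the ONEDNN_VERBOSE=OFF build fix above is typically selected through oneDNN's CMake options. A minimal sketch, assuming a checkout of the oneDNN repository and an out-of-source `build` directory:

```shell
# Configure a static oneDNN build with verbose diagnostics compiled out.
# ONEDNN_LIBRARY_TYPE and ONEDNN_VERBOSE are standard oneDNN CMake options.
cmake -S . -B build \
      -DONEDNN_LIBRARY_TYPE=STATIC \
      -DONEDNN_VERBOSE=OFF
cmake --build build
```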