generic: sycl: resampling: Avoid using the cudnn resampling to use the sycl impl
oneapi-src · Aug 27, 2024 · 40c1476
1 parent 16d6dd4
Showing 5 changed files with 2 additions and 581 deletions.
diff --git a/src/gpu/gpu_resampling_list.cpp b/src/gpu/gpu_resampling_list.cpp
@@ -21,10 +21,6 @@
 #include "gpu/intel/ocl/vectorized_resampling.hpp"
 #endif
 
-#if DNNL_GPU_VENDOR == DNNL_VENDOR_NVIDIA
-#include "gpu/nvidia/cudnn_resampling.hpp"
-#endif
-
 #ifdef GENERIC_SYCL_KERNELS_ENABLED
 #include "gpu/generic/sycl/ref_resampling.hpp"
 #endif
@@ -40,15 +36,13 @@ using namespace dnnl::impl::prop_kind;
 const std::map<pk_impl_key_t, std::vector<impl_list_item_t>>
         impl_list_map REG_RESAMPLING_P({
     {{forward}, {
-        GPU_INSTANCE_INTEL(intel::ocl::ref_resampling_fwd_t)
-        GPU_INSTANCE_NVIDIA(nvidia::cudnn_resampling_fwd_t)
+        GPU_INSTANCE_INTEL(intel::ocl::ref_resampling_fwd_t)        
         GPU_INSTANCE_GENERIC_SYCL(generic::sycl::ref_resampling_fwd_t)
         nullptr,
     }},
     {{backward}, REG_BWD_PK({
         GPU_INSTANCE_INTEL(intel::ocl::vectorized_resampling_bwd_t)
-        GPU_INSTANCE_INTEL(intel::ocl::ref_resampling_bwd_t)
-        GPU_INSTANCE_NVIDIA(nvidia::cudnn_resampling_bwd_t)
+        GPU_INSTANCE_INTEL(intel::ocl::ref_resampling_bwd_t)        
         GPU_INSTANCE_GENERIC_SYCL(generic::sycl::ref_resampling_bwd_t)
         nullptr,
     })},

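Note on the hunk above: with the `GPU_INSTANCE_NVIDIA` entries removed, the generic SYCL implementation becomes the candidate for resampling on NVIDIA builds. The sketch below illustrates the general idea of an ordered implementation list where the first usable entry wins. It is a simplified model, not the oneDNN dispatch code: `impl_entry_t`, its `available` flag, and `pick_first` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of an implementation list.
// `available` models an entry that is compiled in and accepts the problem;
// this is an assumption of the sketch, not the oneDNN data structure.
struct impl_entry_t {
    std::string name;
    bool available;
};

// Walks the list in order and returns the name of the first usable entry,
// or an empty string when nothing matches.
inline std::string pick_first(const std::vector<impl_entry_t> &list) {
    for (const auto &entry : list)
        if (entry.available) return entry.name;
    return "";
}
```

Under this model, a forward list holding an unavailable `intel::ocl::ref_resampling_fwd_t` entry followed by an available `generic::sycl::ref_resampling_fwd_t` entry resolves to the generic SYCL one.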
diff --git a/src/gpu/nvidia/README.md b/src/gpu/nvidia/README.md
@@ -350,43 +350,6 @@ GPU:
 * Forward pass supports `f32`, `f16`, `bf16` and `s8` data types.
 * Backward pass supports `f32` and `bf16` data types.
 
-### Resampling
-
-#### Using cuDNN
-
-The `cudnnSpatialTfSamplerForward` and `cudnnSpatialTfSamplerBackward` are used
-to implement the resampling primitive.
-
-The Nvidia's spatial sampling is based on
-[Spacial Transformer Network](https://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf)
-where all the data locations are normalized between `-1 <= (xi, yi) <= 1`.
-
-* cuDNN backend requires a grid of coordinates that can be sample-up/down based
-  on `theta`. The grid is generated by `cudnnSpatialTfGridGeneratorForward`.
-* The `theta` is a `MB * 2 * 3` matrix scaling factor for each coordinate and is
-  used to generate the grid.
-* The grid value must be normalized in range [-1 , 1]. cuDNN clamps the out of
-  bounds coordinate to zero. Therefore, it is needed to manually clamp the out
-  of bound coordinate to edges in order to avoid incorrect result.
-* 3D spatial sampling is not supported in cuDNN.
-* `Nearest neighbour` algorithm is not supported in cuDNN.
-* Since cuDNN computation is different from that of oneDNN, the error threshold
-  is smaller than other oneDNN implementation, so reduced testing accuracy for
-  `fp32` and `fp16` data types are required.
-* The backward pass requires an output parameter for `d_grid` which cannot be
-  `nullptr`. However, since the grid coordinates are not a tunable parameter in
-  oneDNN, a dummy memory for `d_grid` is created and is deleted when the
-  destructor of the primitive is called.
-
-##### Forward Direction
-* Supported data types: `f32`, `bf16`, `f16`, `s8`, `u8` and `s32`
-* Supported post-ops: `sum`, `eltwise`, `binary`
-* Supported algorithms: nearest neighbor, bilinear
-
-##### Backward Direction
-* Supported data types: `f32`, `bf16`
-* Supported algorithms: nearest neighbor, bilinear
-
 ### Softmax/LogSoftmax
 
 #### Using cuDNN

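Note on the README hunk above: the removed section describes sampling locations normalized to `[-1, 1]`, a per-batch `2 * 3` `theta` matrix, and the need to clamp out-of-bounds coordinates to the edges. The sketch below illustrates that arithmetic for a single point. It is not the removed cuDNN-based code: `normalize_index`, `point_t`, and `sample_location` are hypothetical helpers, and the endpoint-aligned pixel mapping is an assumption of this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Maps pixel index i in [0, n) to a normalized coordinate in [-1, 1].
// Assumption: the first and last pixels map exactly to -1 and 1.
inline float normalize_index(int i, int n) {
    return n > 1 ? -1.0f + 2.0f * i / (n - 1) : 0.0f;
}

struct point_t {
    float x, y;
};

// Applies one 2x3 affine matrix (row-major: a b c / d e f) to a normalized
// point and clamps the result to the [-1, 1] edges.
inline point_t sample_location(
        const std::array<float, 6> &theta, float x, float y) {
    const float xs = theta[0] * x + theta[1] * y + theta[2];
    const float ys = theta[3] * x + theta[4] * y + theta[5];
    return {std::min(1.0f, std::max(-1.0f, xs)),
            std::min(1.0f, std::max(-1.0f, ys))};
}
```

With an identity `theta` of `{1, 0, 0, 0, 1, 0}` the point is returned unchanged; a scaling `theta` that pushes a coordinate past the range is pulled back to the nearest edge.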
diff --git a/src/gpu/nvidia/cudnn_resampling.cpp b/src/gpu/nvidia/cudnn_resampling.cpp