From eaab94f8f1d960ff99eb9a22025d53e875e29cb0 Mon Sep 17 00:00:00 2001
From: Denis Samoilov
Date: Mon, 20 Mar 2023 14:49:18 -0700
Subject: [PATCH] doc: nvidia: update doc for prelu and shuffle

---
 src/gpu/nvidia/README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gpu/nvidia/README.md b/src/gpu/nvidia/README.md
index 828e61680df..d8a10203899 100644
--- a/src/gpu/nvidia/README.md
+++ b/src/gpu/nvidia/README.md
@@ -279,7 +279,9 @@ backward propagation respectively.
 ### PReLU
 
 The PReLU primitive (Leaky ReLU with a trainable alpha parameter) is implemented
 using SYCL kernels. The primitive supports both forward and backward
-propagations for the data types f32, s32, bf16, f16, s8 and u8.
+propagations.
+* Forward pass supports `f32`, `f16`, `bf16`, `s8` and `u8`
+* Backward pass supports `f32`, `bf16`
 
 ### Reorder
@@ -341,6 +343,13 @@ changed to `CUDNN_SOFTMAX_LOG`.
 The sum operation uses the reorder primitive to sum tensors, so the same
 limitation as reorder applies here.
 
+### Shuffle
+
+The shuffle primitive is implemented using SYCL kernels.
+This primitive supports both forward and backward propagations.
+* Forward pass supports `f32`, `f16`, `bf16` and `s8`
+* Backward pass supports `f32`, `bf16`
+
 ### Other primitives
 
 Rest primitives not listed above are not supported by Nvidia backend. This is
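
For reference, below is a minimal sketch (not part of the patch) of how the PReLU and shuffle support documented above could be exercised through the oneDNN v3.x C++ API on a GPU engine. The tensor shapes, the per-channel alpha layout, and the `f16`/`f32` choices are illustrative assumptions, and exact primitive-descriptor constructor signatures may differ between oneDNN versions.

```cpp
// Sketch only: exercises PReLU and shuffle forward on a GPU engine with the
// oneDNN v3.x C++ API. Shapes and data types are illustrative assumptions.
#include "oneapi/dnnl/dnnl.hpp"

int main() {
    using namespace dnnl;

    // GPU engine and stream; with a SYCL/CUDA build this dispatches to the
    // Nvidia backend described in this README.
    engine eng(engine::kind::gpu, 0);
    stream strm(eng);

    // PReLU forward with f16 data (listed as supported for the forward pass).
    memory::dims data_dims = {2, 16, 32, 32};
    auto data_md = memory::desc(
            data_dims, memory::data_type::f16, memory::format_tag::nchw);
    // Per-channel alpha, broadcast over N, H, W.
    auto wei_md = memory::desc(
            {1, 16, 1, 1}, memory::data_type::f16, memory::format_tag::nchw);

    auto prelu_pd = prelu_forward::primitive_desc(
            eng, prop_kind::forward_inference, data_md, wei_md, data_md);

    auto src_mem = memory(prelu_pd.src_desc(), eng);
    auto wei_mem = memory(prelu_pd.weights_desc(), eng);
    auto dst_mem = memory(prelu_pd.dst_desc(), eng);

    prelu_forward(prelu_pd).execute(strm,
            {{DNNL_ARG_SRC, src_mem}, {DNNL_ARG_WEIGHTS, wei_mem},
                    {DNNL_ARG_DST, dst_mem}});

    // Shuffle forward with f32 data: group size 4 along the channel axis.
    auto shuf_md = memory::desc(
            data_dims, memory::data_type::f32, memory::format_tag::nchw);
    auto shuffle_pd = shuffle_forward::primitive_desc(eng,
            prop_kind::forward_inference, shuf_md, shuf_md, /*axis=*/1,
            /*group_size=*/4);

    auto shuf_src = memory(shuffle_pd.src_desc(), eng);
    auto shuf_dst = memory(shuffle_pd.dst_desc(), eng);
    shuffle_forward(shuffle_pd).execute(
            strm, {{DNNL_ARG_SRC, shuf_src}, {DNNL_ARG_DST, shuf_dst}});

    strm.wait();
    return 0;
}
```

If primitive-descriptor creation fails for an unsupported data type or propagation kind, oneDNN raises an error (or returns an empty descriptor when `allow_empty` is used), which is how the support lists in this README surface to users.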