On onnxruntime-gpu with CUDAExecutionProvider: "Some nodes were not assigned to the preferred execution providers which may or may not have a negative impact on performance." #20309
Unanswered · surajrao2003 asked this question in EP Q&A
Replies: 1 comment 1 reply
-
CUDA doesn't know anything about nodes in ONNX.
-
I am trying to run inference with a dynamically quantized yolov8s ONNX model on GPU.
I took yolov8s.pt and exported it to yolov8.onnx using the ONNX export. I then quantized the ONNX model with the dynamic quantization (uint8) method provided by onnxruntime, which reduced the model size by around 4x. Although the quantized model works fine for inference on CPU (CPUExecutionProvider), it gives low fps (frames per second) for inference on GPU with CUDAExecutionProvider.
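For reference, a minimal sketch of that export-and-quantize flow, assuming the ultralytics package for the export step; the file names here are placeholders rather than the exact paths used above:

```python
# Sketch only: assumes `pip install ultralytics onnx onnxruntime`.
from ultralytics import YOLO
from onnxruntime.quantization import quantize_dynamic, QuantType

# Export the PyTorch checkpoint to ONNX (writes a .onnx file next to the .pt file).
YOLO("yolov8s.pt").export(format="onnx")

# Dynamic quantization with uint8 weights, as provided by onnxruntime.
quantize_dynamic(
    model_input="yolov8s.onnx",         # placeholder path
    model_output="yolov8s.quant.onnx",  # placeholder path
    weight_type=QuantType.QUInt8,
)
```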
Specifying both providers, CUDAExecutionProvider and CPUExecutionProvider, does make the warnings disappear, but what concerns me is why some nodes are being forced to execute on the CPU and not on CUDA. Is it because CUDA does not yet support such quantized nodes, or is there some other reason?
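For concreteness, a minimal sketch of a session created with both providers (the model path is a placeholder):

```python
import onnxruntime as ort

# With CPUExecutionProvider listed as a fallback the warning goes away,
# but unsupported nodes are still silently placed on the CPU.
sess = ort.InferenceSession(
    "yolov8s.quant.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # the providers the session is actually using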
When I checked with the log severity level turned up, the logs showed which nodes were being assigned to the CPU.
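A sketch of that check using onnxruntime's Python SessionOptions; with verbose logging and more than one provider, the runtime logs which execution provider each node was placed on (the exact output varies by version):

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.log_severity_level = 0  # 0 = verbose; per-node placement details show up at this level

sess = ort.InferenceSession(
    "yolov8s.quant.onnx",  # placeholder path
    sess_options=so,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```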
I am curious why CUDA is not able to handle these nodes, because in my opinion it is very unusual to see GPU inference giving lower fps than CPU inference.
I believe the issue is due to the limits of ONNX Runtime's support for quantized (int or uint) operations on CUDA. If this is true, is there any ongoing work to add support for these quantized operations, which should further improve GPU performance?
Is there any alternative way to run these quantized operations exclusively on CUDA?