Description
I'm converting the YOLOv9 model to ONNX for use with NVIDIA DeepStream. Inside `FastModelLoader`, the `_create_onnx_model` function appears to handle the PyTorch-to-ONNX conversion. However, when I run this function, it outputs a list of 18 tensors (indices 0 to 17) with the following shapes:
Output[0] shape: (1, 80, 80, 80)
Output[1] shape: (1, 16, 4, 80, 80)
Output[2] shape: (1, 4, 80, 80)
Output[3] shape: (1, 80, 40, 40)
Output[4] shape: (1, 16, 4, 40, 40)
Output[5] shape: (1, 4, 40, 40)
Output[6] shape: (1, 80, 20, 20)
Output[7] shape: (1, 16, 4, 20, 20)
Output[8] shape: (1, 4, 20, 20)
Output[9] shape: (1, 80, 80, 80)
Output[10] shape: (1, 16, 4, 80, 80)
Output[11] shape: (1, 4, 80, 80)
Output[12] shape: (1, 80, 40, 40)
Output[13] shape: (1, 16, 4, 40, 40)
Output[14] shape: (1, 4, 40, 40)
Output[15] shape: (1, 80, 20, 20)
Output[16] shape: (1, 16, 4, 20, 20)
Output[17] shape: (1, 4, 20, 20)
This is unexpected, as DeepStream typically expects a single output tensor or structured outputs containing bounding boxes `(batch_size, num_boxes, 4)`, class confidence scores `(batch_size, num_boxes, num_classes)`, and objectness scores `(batch_size, num_boxes, 1)`.
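To make the target layout concrete, here is a minimal sketch of one possible reshaping, under assumptions the model code has not confirmed: that the 80-channel tensors are per-class scores for each grid cell, that the 4-channel tensors hold per-cell box values (which may still need decoding into image coordinates), and that only the first nine outputs are needed. The helper `flatten_scales` is hypothetical, and the arrays are dummies standing in for the real outputs.

```python
import numpy as np


def flatten_scales(maps):
    """Merge per-scale (B, C, H, W) maps into one (B, sum(H*W), C) array.

    Hypothetical helper, not part of the model's code.
    """
    flat = []
    for m in maps:
        b, c, h, w = m.shape
        # (B, C, H, W) -> (B, C, H*W) -> (B, H*W, C): one row per grid cell
        flat.append(m.reshape(b, c, h * w).transpose(0, 2, 1))
    return np.concatenate(flat, axis=1)


# Dummy arrays with the same shapes as Output[0], [3], [6] and Output[2], [5], [8].
# Assumption: these are class-score maps and box maps, respectively.
sizes = (80, 40, 20)
cls_maps = [np.zeros((1, 80, s, s), dtype=np.float32) for s in sizes]
box_maps = [np.zeros((1, 4, s, s), dtype=np.float32) for s in sizes]

scores = flatten_scales(cls_maps)  # (batch_size, num_boxes, num_classes)
boxes = flatten_scales(box_maps)   # (batch_size, num_boxes, 4)
print(scores.shape, boxes.shape)   # (1, 8400, 80) (1, 8400, 4)
```

With these shapes, `num_boxes` is 80*80 + 40*40 + 20*20 = 8400. Whether a DeepStream parser can consume the result directly depends on what the 4-channel values actually represent, which this sketch does not settle.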
How should I interpret these tensors and correctly format them for inference in DeepStream?