
[ONNX Importer] Incorrect return type for ops from "com.microsoft" domain (!torch.none) #949

Open
pravg-amd opened this issue Mar 11, 2025 · 18 comments
@pravg-amd

Some of the ONNX zoo models have the "QLinearAdd" op, which gets the return type !torch.none when the model is imported.

  %88 = torch.operator "onnx.QLinearAdd"(%87, %43, %42, %46, %47, %48, %45, %44) : (!torch.vtensor<[1,4096],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[4096],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.none

Steps to reproduce:

Build SHARK-TestSuite by following the steps below

https://github.com/nod-ai/SHARK-TestSuite/tree/main/alt_e2eshark

Run the following command

python run.py -va -t bvlcalexnet-12-int8

The model will be available in test-run/bvlcalexnet-12-int8/

bvlcalexnet-12-int8
squeezenet1.0-12-int8
caffenet-12-int8
densenet-12-int8
vgg16-12-int8
mobilenetv2-12-int8
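
To spot the symptom across an imported model, one option is to scan the MLIR text for `torch.operator` ops whose result type is `!torch.none`. The following is a minimal Python sketch and is not part of SHARK-TestSuite or torch-mlir. The helper name `find_untyped_ops` and the shortened sample IR (including the `onnx.Relu` line) are assumptions made for illustration, and the regex assumes one op per line, as in the dumps in this issue.

```python
import re

# Captures the op name and the result type of a torch.operator line.
# Assumption: one op per line and no spaces inside the result type.
_OP_RE = re.compile(r'torch\.operator\s+"([^"]+)".*->\s*(\S+)\s*$')


def find_untyped_ops(mlir_text):
    """Return sorted names of torch.operator ops whose result is !torch.none."""
    names = set()
    for line in mlir_text.splitlines():
        match = _OP_RE.search(line)
        if match and match.group(2) == "!torch.none":
            names.add(match.group(1))
    return sorted(names)


# Shortened, hypothetical sample: operand lists are abbreviated for readability.
sample = '''
  %88 = torch.operator "onnx.QLinearAdd"(%87, %43) : (!torch.vtensor<[1,4096],ui8>, !torch.vtensor<[],f32>) -> !torch.none
  %89 = torch.operator "onnx.Relu"(%88) : (!torch.vtensor<[1,4096],f32>) -> !torch.vtensor<[1,4096],f32>
'''
print(find_untyped_ops(sample))  # ['onnx.QLinearAdd']
```

Running this over the imported .mlir file of each model should list which ops still lack an inferred result type.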

ONNX Runtime ticket to track the support -> microsoft/onnxruntime#24028

@pravg-amd pravg-amd self-assigned this Mar 24, 2025
@pravg-amd
Author

The ORT changes are merged. Working on supporting the op using the ORT shape inference.

@pravg-amd pravg-amd changed the title [ONNX Importer] QLinearAdd op issue with incorrect return type (!torch.none) [ONNX Importer] Incorrect return type for ops from "com.microsoft" domain (!torch.none) Apr 1, 2025
@pravg-amd
Author

pravg-amd commented Apr 1, 2025

QLinearLeakyRelu - Model : yolov3-12-int8

 %673 = torch.operator "onnx.QLinearLeakyRelu"(%671, %43, %42, %41, %40) {torch.onnx.alpha = 1.000000e-01 : f32} : (!torch.vtensor<[?,32,?,?],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.none

@pravg-amd
Author

QLinearConcat - Model : version-RFB-320-int8

    %358 = torch.operator "onnx.QLinearConcat"(%158, %157, %356, %149, %148, %355, %143, %142, %357, %155, %154) {torch.onnx.axis = 1 : si64} : (!torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[1,16,30,40],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[1,16,30,40],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[1,16,30,40],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.none

@pravg-amd
Author

onnx.QLinearGlobalAveragePool - Model: squeezenet1.0-12-int8

%238 = torch.operator "onnx.QLinearGlobalAveragePool"(%237, %173, %172, %175, %174) {torch.onnx.channels_last = 0 : si64} : (!torch.vtensor<[1,1000,13,13],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.none

@pravg-amd
Author

@vivekkhandelwal1 IR with ort shape inference for QLinearAdd

func.func @test_qlinearadd(%arg0: !torch.vtensor<[1,4096],ui8>, %arg1: !torch.vtensor<[],f32>, %arg2: !torch.vtensor<[],ui8>, %arg3: !torch.vtensor<[4096],ui8>, %arg4: !torch.vtensor<[],f32>, %arg5: !torch.vtensor<[],ui8>, %arg6: !torch.vtensor<[],f32>, %arg7: !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,4096],ui8> attributes {torch.onnx_meta.ir_version = 5 : si64, torch.onnx_meta.opset_version = 10 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
  %0 = torch.operator "onnx.QLinearAdd"(%arg0, %arg1, %arg2, %arg3, %arg4, %arg5, %arg6, %arg7) : (!torch.vtensor<[1,4096],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[4096],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,4096],ui8>
  return %0 : !torch.vtensor<[1,4096],ui8>
}
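
The result type `[1,4096]` above is what NumPy-style broadcasting of the two data operands `[1,4096]` and `[4096]` gives. As a rough illustration of that rule (not the ORT shape-inference code itself), here is a small Python sketch. The function name `broadcast_shape` and the use of `None` for an unknown `?` dimension are assumptions of this example.

```python
def broadcast_shape(a, b):
    """NumPy-style broadcast of two shapes; None marks an unknown ('?') dim."""
    result = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da == 1:
            result.append(db)
        elif db == 1 or da == db:
            result.append(da)
        elif da is None or db is None:
            # One side is unknown and the other is a known size > 1:
            # the known size is the only valid result.
            result.append(da if db is None else db)
        else:
            raise ValueError(f"incompatible dims {da} and {db}")
    return result[::-1]


print(broadcast_shape([1, 4096], [4096]))  # [1, 4096]
```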

@vivekkhandelwal1
Contributor

The lowering for the QLinearAdd op is added here: llvm/torch-mlir#4113
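
For reference, the ONNX Runtime contrib-op documentation describes QLinearAdd as C = (A_scale * (A - A_zero_point) + B_scale * (B - B_zero_point)) / C_scale + C_zero_point. A rough Python reading of that formula follows; it is not the code in the PR above. The function name, the flat equal-length list inputs (no broadcasting), the uint8 saturation bounds, and the use of Python's round-half-to-even are assumptions of this sketch.

```python
def qlinear_add(a, a_scale, a_zp, b, b_scale, b_zp, c_scale, c_zp):
    """Elementwise quantized add over two equal-length lists of uint8 values."""
    out = []
    for qa, qb in zip(a, b):
        # Dequantize both operands and add them in real (float) space.
        real = a_scale * (qa - a_zp) + b_scale * (qb - b_zp)
        # Requantize with the output scale and zero point.
        q = round(real / c_scale) + c_zp
        # Saturate to the uint8 range (assumed, since the repro uses ui8).
        out.append(max(0, min(255, q)))
    return out


# 0.5*10 + 0.5*20 = 15.0  -> 15.0 / 0.5 = 30
# 0.5*200 + 0.5*250 = 225.0 -> 225.0 / 0.5 = 450, saturated to 255
print(qlinear_add([10, 200], 0.5, 0, [20, 250], 0.5, 0, 0.5, 0))  # [30, 255]
```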

@vivekkhandelwal1
Contributor

Hi @pravg-amd, can you please add the repro IRs for other ops?

@pravg-amd
Author

QLinearLeakyRelu

func.func @test_qlinearleakyrelu(%arg0: !torch.vtensor<[?,32,?,?],ui8>, %arg1: !torch.vtensor<[],f32>, %arg2: !torch.vtensor<[],ui8>, %arg3: !torch.vtensor<[],f32>, %arg4: !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,32,?,?],ui8> attributes {torch.onnx_meta.ir_version = 5 : si64, torch.onnx_meta.opset_version = 10 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
  %0 = torch.operator "onnx.QLinearLeakyRelu"(%arg0, %arg1, %arg2, %arg3, %arg4) {torch.onnx.alpha = 1.000000e-01 : f32} : (!torch.vtensor<[?,32,?,?],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,32,?,?],ui8>
  return %0 : !torch.vtensor<[?,32,?,?],ui8>
}

@pravg-amd
Author

QLinearConcat

func.func @test_qlinearconcat(%arg0: !torch.vtensor<[],f32>, %arg1: !torch.vtensor<[],ui8>, %arg2: !torch.vtensor<[?,?,?,?],ui8>, %arg3: !torch.vtensor<[],f32>, %arg4: !torch.vtensor<[],ui8>, %arg5: !torch.vtensor<[?,?,?,?],ui8>, %arg6: !torch.vtensor<[],f32>, %arg7: !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,?,?,?],ui8> attributes {torch.onnx_meta.ir_version = 5 : si64, torch.onnx_meta.opset_version = 10 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
  %0 = torch.operator "onnx.QLinearConcat"(%arg0, %arg1, %arg2, %arg3, %arg4, %arg5, %arg6, %arg7) {torch.onnx.axis = 1 : si64} : (!torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[?,?,?,?],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[?,?,?,?],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,?,?,?],ui8>
  return %0 : !torch.vtensor<[?,?,?,?],ui8>
}
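
The expected result type of a concat follows from its data inputs: sizes add up along `axis` and must agree elsewhere. The sketch below illustrates that rule in Python; it is an assumption-laden example, not the ORT shape-inference code. `concat_shape` is a hypothetical helper that takes only the data-tensor shapes (the scale and zero-point operands are scalars and do not affect the shape), with `None` standing in for `?`.

```python
def concat_shape(shapes, axis):
    """Shape of concatenating tensors along `axis`; None marks an unknown dim."""
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError("all inputs must have the same rank")
    axis = axis % rank  # allow negative axes
    out = list(shapes[0])
    for s in shapes[1:]:
        for d in range(rank):
            if d == axis:
                # Sizes add up along the concat axis; an unknown size
                # makes the sum unknown.
                unknown = out[d] is None or s[d] is None
                out[d] = None if unknown else out[d] + s[d]
            elif out[d] is None:
                out[d] = s[d]  # adopt a known size from a later input
            elif s[d] is not None and s[d] != out[d]:
                raise ValueError(f"dim {d} mismatch: {out[d]} vs {s[d]}")
    return out


# Three [1,16,30,40] inputs on axis 1, as in the version-RFB-320-int8 op.
print(concat_shape([[1, 16, 30, 40]] * 3, 1))  # [1, 48, 30, 40]
```

With fully dynamic inputs, as in the repro above, the same rule yields `[?,?,?,?]`.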

@pravg-amd
Author

QLinearGlobalAveragePool

func.func @test_qlinearglobalavgpool(%arg0: !torch.vtensor<[1,1000,13,13],ui8>, %arg1: !torch.vtensor<[],f32>, %arg2: !torch.vtensor<[],ui8>, %arg3: !torch.vtensor<[],f32>, %arg4: !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,1000,1,1],ui8> attributes {torch.onnx_meta.ir_version = 5 : si64, torch.onnx_meta.opset_version = 10 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %0 = torch.operator "onnx.QLinearGlobalAveragePool"(%arg0, %arg1, %arg2, %arg3, %arg4) {torch.onnx.channels_last = 0 : si64} : (!torch.vtensor<[1,1000,13,13],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,1000,1,1],ui8>
    return %0 : !torch.vtensor<[1,1000,1,1],ui8>
}

@vivekkhandelwal1
Contributor

The lowering for the QLinearLeakyRelu op is added here: llvm/torch-mlir#4115.
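
As a reference reading of the op's semantics (not the PR's implementation): the ONNX Runtime contrib-op documentation describes QLinearLeakyRelu as quantize(alpha * dequantize(x)) for negative dequantized values and quantize(dequantize(x)) otherwise. A minimal Python sketch under those assumptions, with a hypothetical function name, flat list input, uint8 saturation, and Python's round-half-to-even:

```python
def qlinear_leaky_relu(x, x_scale, x_zp, y_scale, y_zp, alpha=0.01):
    """Quantized LeakyRelu over a flat list of uint8 values."""
    out = []
    for q in x:
        real = x_scale * (q - x_zp)      # dequantize
        if real < 0:
            real = alpha * real          # leaky slope for negative values
        y = round(real / y_scale) + y_zp  # requantize
        out.append(max(0, min(255, y)))   # saturate to uint8 (assumed)
    return out


# alpha = 0.1 matches torch.onnx.alpha in the yolov3-12-int8 op.
print(qlinear_leaky_relu([0, 128, 255], 0.5, 128, 0.5, 128, alpha=0.1))
# [115, 128, 255]
```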

@vivekkhandelwal1
Contributor

The lowerings for the remaining two ops are added here:
llvm/torch-mlir#4116
llvm/torch-mlir#4120

@vivekkhandelwal1
Contributor

All the PRs related to this issue are merged.

@pravg-amd
Author

QLinearSigmoid

func.func @test_qlinear_sigmoid(%arg0: !torch.vtensor<[?,?],ui8>, %arg1: !torch.vtensor<[],f32>, %arg2: !torch.vtensor<[],ui8>, %arg3: !torch.vtensor<[],f32>, %arg4: !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,?],ui8> attributes {torch.onnx_meta.ir_version = 7 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {com.microsoft = 1 : si64}, torch.onnx_meta.producer_name = "onnx.quantize", torch.onnx_meta.producer_version = "0.1.0"} {
  %0 = torch.operator "onnx.QLinearSigmoid"(%arg0, %arg1, %arg2, %arg3, %arg4) : (!torch.vtensor<[?,?],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[?,?],ui8> 
  return %0 : !torch.vtensor<[?,?],ui8>
}

@pravg-amd
Author

FusedMatMul

func.func @test_fusedMatmul(%arg0: !torch.vtensor<[?,12,256,64],f32>, %arg1: !torch.vtensor<[?,12,256,64],f32>) -> !torch.vtensor<[?,12,256,256],f32> attributes {torch.onnx_meta.ir_version = 7 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {com.microsoft = 1 : si64}, torch.onnx_meta.producer_name = "onnx.quantize", torch.onnx_meta.producer_version = "0.1.0"} {
    %0 = torch.operator "onnx.FusedMatMul"(%arg0, %arg1) {torch.onnx.alpha = 1.250000e-01 : f32, torch.onnx.transA = 0 : si64, torch.onnx.transB = 1 : si64} : (!torch.vtensor<[?,12,256,64],f32>, !torch.vtensor<[?,12,256,64],f32>) -> !torch.vtensor<[?,12,256,256],f32>
    return %0 : !torch.vtensor<[?,12,256,256],f32>
}
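
The result type `[?,12,256,256]` follows from transposing the last two dims of the second operand (`transB = 1`) before a batched matmul. A small Python sketch of that shape rule is below. It is only an illustration: `fused_matmul_shape` is a hypothetical helper, `None` stands in for `?`, and it assumes both operands have rank 2 or more with identical batch dims (no batch broadcasting, no `transBatchA`/`transBatchB`).

```python
def fused_matmul_shape(a, b, trans_a=0, trans_b=0):
    """Result shape of a batched matmul with optional last-two-dim transposes."""
    a, b = list(a), list(b)
    if trans_a:
        a[-2], a[-1] = a[-1], a[-2]
    if trans_b:
        b[-2], b[-1] = b[-1], b[-2]
    # The contracted dims must agree when both are known.
    if a[-1] is not None and b[-2] is not None and a[-1] != b[-2]:
        raise ValueError(f"contracted dims differ: {a[-1]} vs {b[-2]}")
    # Batch dims are taken from the first operand (assumed identical).
    return a[:-2] + [a[-2], b[-1]]


print(fused_matmul_shape([None, 12, 256, 64], [None, 12, 256, 64], trans_b=1))
# [None, 12, 256, 256]
```

The `alpha = 1.250000e-01` attribute equals 1/sqrt(64), which would be consistent with attention-style scaling over a head size of 64, though the issue does not state which model the op comes from.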

@pravg-amd
Author

QLinearAveragePool

func.func @test_qlinearAveragePool(%arg0: !torch.vtensor<[1,128,56,56],ui8>, %arg1: !torch.vtensor<[],f32>, %arg2: !torch.vtensor<[],ui8>, %arg3: !torch.vtensor<[],f32>, %arg4: !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,128,28,28],ui8> attributes {torch.onnx_meta.ir_version = 5 : si64, torch.onnx_meta.opset_version = 10 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %0 = torch.operator "onnx.QLinearAveragePool"(%arg0, %arg1, %arg2, %arg3, %arg4) {torch.onnx.auto_pad = "NOTSET", torch.onnx.ceil_mode = 0 : si64, torch.onnx.count_include_pad = 0 : si64, torch.onnx.kernel_shape = [2 : si64, 2 : si64], torch.onnx.pads = [0 : si64, 0 : si64, 0 : si64, 0 : si64], torch.onnx.strides = [2 : si64, 2 : si64]} : (!torch.vtensor<[1,128,56,56],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>, !torch.vtensor<[],f32>, !torch.vtensor<[],ui8>) -> !torch.vtensor<[1,128,28,28],ui8>
    return %0 : !torch.vtensor<[1,128,28,28],ui8>
}
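
The `[1,128,28,28]` result type matches the usual pooling output-size formula, floor((in + pad_begin + pad_end - kernel) / stride) + 1 per spatial dim. A minimal Python sketch of that formula follows, as an illustration rather than the ORT implementation. `pool_output_shape` is a hypothetical helper; it assumes NCHW-style layout, `auto_pad = "NOTSET"`, dilation 1, ONNX pad ordering (all begins, then all ends), and it ignores any special handling ONNX applies to `ceil_mode` windows that start inside the padding.

```python
def pool_output_shape(in_shape, kernel, strides, pads, ceil_mode=0):
    """Output shape of a pooling op; None marks an unknown ('?') dim."""
    n_spatial = len(kernel)
    out = list(in_shape[:2])  # batch and channel dims pass through
    for i in range(n_spatial):
        size = in_shape[2 + i]
        if size is None:
            out.append(None)
            continue
        span = size + pads[i] + pads[i + n_spatial] - kernel[i]
        if ceil_mode:
            steps = -(-span // strides[i])  # integer ceiling division
        else:
            steps = span // strides[i]      # integer floor division
        out.append(steps + 1)
    return out


# QLinearAveragePool repro: kernel 2x2, stride 2, no padding.
print(pool_output_shape([1, 128, 56, 56], [2, 2], [2, 2], [0, 0, 0, 0]))
# [1, 128, 28, 28]
```

Treating QLinearGlobalAveragePool as a pool whose kernel equals the spatial size gives `pool_output_shape([1, 1000, 13, 13], [13, 13], [1, 1], [0, 0, 0, 0])`, which returns `[1, 1000, 1, 1]`, the result type in that op's repro.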
