
Casting error in simple model with a constant tensor<f32> during --EmitLib #335

Open

agostini01 opened this issue Oct 5, 2020 · 10 comments

I got an error compiling a (Tensor + Tensor) + Constant graph.

This is the onnx.mlir code:

module {
  func @main_graph(%arg0: tensor<4x5xf32>, %arg1: tensor<4x5xf32>) -> tensor<4x5xf32> attributes {input_names = ["input_y:0", "input_x:0"], output_names = ["output:0"]} {
    %0 = "onnx.Add"(%arg1, %arg0) {onnx_node_name = "added"} : (tensor<4x5xf32>, tensor<4x5xf32>) -> tensor<4x5xf32>
    %1 = "onnx.Constant"() {value = dense<4.200000e+01> : tensor<f32>} : () -> tensor<f32>
    %2 = "onnx.Add"(%0, %1) {onnx_node_name = "add"} : (tensor<4x5xf32>, tensor<f32>) -> tensor<4x5xf32>
    return %2 : tensor<4x5xf32>
  }
  "onnx.EntryPoint"() {func = @main_graph, numInputs = 2 : i32, numOutputs = 1 : i32} : () -> ()
}

And this is the command executed and the resulting error:

$ /working_dir/onnx-mlir/build/bin/onnx-mlir --EmitLib /working_dir/examples/model/custom_add_plus_cte/onnx_mlir_generated/model.onnx.mlir

onnx-mlir: /working_dir/llvm-project/llvm/include/llvm/Support/Casting.h:269: typename cast_retty<X, Y *>::ret_type llvm::cast(Y *) [X = llvm::FixedVectorType, Y = llvm::Type]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

Note that the same example works when the internal constant is removed and the value is passed instead as a tensor<1xf32> argument:

module {
  func @main_graph(%arg0: tensor<4x5xf32>, %arg1: tensor<4x5xf32>, %arg2: tensor<1xf32>) -> tensor<4x5xf32> attributes {input_names = ["input_y:0", "input_x:0", "cte:0"], output_names = ["output:0"]} {
    %0 = "onnx.Add"(%arg1, %arg0) {onnx_node_name = "added"} : (tensor<4x5xf32>, tensor<4x5xf32>) -> tensor<4x5xf32>
    %1 = "onnx.Add"(%0, %arg2) {onnx_node_name = "add"} : (tensor<4x5xf32>, tensor<1xf32>) -> tensor<4x5xf32>
    return %1 : tensor<4x5xf32>
  }
  "onnx.EntryPoint"() {func = @main_graph, numInputs = 3 : i32, numOutputs = 1 : i32} : () -> ()
}
agostini01 changed the title from "Casting error when using a model with constants during --EmitLib" to "Casting error in simple model with a constant during --EmitLib" on Oct 5, 2020

agostini01 commented Oct 5, 2020

Looking further into the problem, I found that replacing the type declaration tensor<f32> with tensor<1xf32> makes the compilation succeed.

Compilation works if the MLIR generated from the ONNX model is the following:

module {
  func @main_graph(%arg0: tensor<4x5xf32>, %arg1: tensor<4x5xf32>) -> tensor<4x5xf32> attributes {input_names = ["input_y:0", "input_x:0"], output_names = ["output:0"]} {
    %0 = "onnx.Add"(%arg1, %arg0) {onnx_node_name = "added"} : (tensor<4x5xf32>, tensor<4x5xf32>) -> tensor<4x5xf32>
    %1 = "onnx.Constant"() {value = dense<4.200000e+01> : tensor<1xf32>} : () -> tensor<1xf32>
    %2 = "onnx.Add"(%0, %1) {onnx_node_name = "add"} : (tensor<4x5xf32>, tensor<1xf32>) -> tensor<4x5xf32>
    return %2 : tensor<4x5xf32>
  }
  "onnx.EntryPoint"() {func = @main_graph, numInputs = 2 : i32, numOutputs = 1 : i32} : () -> ()
}
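For reference, the manual fix amounts to a one-line text substitution on the emitted .mlir file. A minimal sketch in Python (the file path is illustrative; a plain string replacement is safe here because "tensor<f32>" does not occur inside shaped types such as tensor<4x5xf32>):

from pathlib import Path

# Rewrite the scalar constant type so the lowering no longer hits the cast assertion.
path = Path("model.onnx.mlir")  # illustrative path
text = path.read_text()
path.write_text(text.replace("tensor<f32>", "tensor<1xf32>"))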

However,

/working_dir/onnx-mlir/build/bin/onnx-mlir --EmitONNXIR /working_dir/examples/model/custom_add_plus_cte/model.onnx -o /working_dir/examples/model/custom_add_plus_cte/onnx_mlir_generated/model

emits the incorrect IR shown at the top of this issue. I will try to attach the model in the next comment.

agostini01 commented Oct 5, 2020

I could not attach the model.onnx file, but this is how it was generated:

#!/usr/bin/env python

# To execute this, you need to have the following installed
# pip install tensorflow onnx tf2onnx

import tensorflow as tf
import tf2onnx


# Declare a custom function that represents the graph with all its inputs
def add_plus_cte(x, y, cte):
    added = tf.math.add(
        x, y, name='added'
    )
    added_plus_cte = added + cte
    return added_plus_cte

# Create a `Function` object that contains a graph
fun_obj = tf.function(add_plus_cte)

# Make some tensors to test it
x1 = tf.constant([[1.0, 2.0]])
y1 = tf.constant([[2.0, 3.0]])
b1 = tf.constant(4.0)

# It works!
print(fun_obj(x1, y1, b1).numpy())

# Wrap everything in a session to be read by tf2onnx tool
with tf.compat.v1.Session() as sess:

    # Declare graph input arguments (these have larger dimensions than the test tensors)
    x = tf.compat.v1.placeholder(tf.float32, [4, 5], name="input_x")
    y = tf.compat.v1.placeholder(tf.float32, [4, 5], name="input_y")
    cte = tf.constant(42.0)  # The constant is not a variable; it is embedded in the graph.
    result = add_plus_cte(x, y, cte)
    _ = tf.identity(result, name="output")

    # Create the onnx graph
    onnx_graph = tf2onnx.tfonnx.process_tf_graph(sess.graph, input_names=["input_x:0","input_y:0"], output_names=["output:0"])
    model_proto = onnx_graph.make_model("custom_add_plus_cte")

    # Save to file
    with open("model.onnx", "wb") as f:
        f.write(model_proto.SerializeToString())
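
Note that tf.constant(42.0) above produces a rank-0 (scalar) tensor, whereas tf.constant([42.0]) would produce a rank-1 tensor with one element; this distinction is presumably where the tensor<f32> versus tensor<1xf32> difference originates. A quick shape check:

import tensorflow as tf

print(tf.constant(42.0).shape)    # () -- rank 0, a scalar
print(tf.constant([42.0]).shape)  # (1,) -- rank 1, one element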

agostini01 changed the title from "Casting error in simple model with a constant during --EmitLib" to "Casting error in simple model with a constant tensor<f32> during --EmitLib" on Oct 5, 2020

chentong319 (Collaborator) commented

Thanks @agostini01. Could you send the model (.onnx) file to me by email ([email protected])?

agostini01 commented Oct 5, 2020

@chentong319, I just sent you an email.

The manual fix allows LLVM IR to be generated, but the pipeline breaks downstream when compiling the .so binary.

chentong319 (Collaborator) commented

@agostini01 I have not seen your email yet. Is the model file too large?

agostini01 commented Oct 6, 2020

@chentong319 my email must have landed in a spam folder.

I have created a github repo and included the example: https://github.com/agostini01/failing-onnx-models/tree/main/custom_tensor_add_plus_cte

chentong319 (Collaborator) commented

I downloaded the model and tried it. So far I have found that the onnx::TensorProto behind this DenseElementsAttr has dims().size() == 0, while it should be 1. That is why the importer generated tensor<f32> rather than tensor<1xf32>. I need to investigate further to identify the source.
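
One way to confirm the rank-0 tensor from the Python side is to inspect the TensorProto dims in the serialized model. A minimal sketch using the onnx package (the file name is illustrative; whether the constant shows up in graph.initializer or as the value attribute of a Constant node depends on how tf2onnx exported it):

import onnx

model = onnx.load("model.onnx")  # illustrative file name

# Constants may be stored as graph initializers...
for init in model.graph.initializer:
    print("initializer:", init.name, "dims:", list(init.dims))

# ...or as the `value` attribute of a Constant node.
for node in model.graph.node:
    if node.op_type == "Constant":
        for attr in node.attribute:
            if attr.name == "value":
                print("Constant node:", node.name, "dims:", list(attr.t.dims))

A rank-0 tensor prints dims: [], matching the dims().size() == 0 seen in the importer.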

agostini01 commented

It is good that you found one of the sources of the problem.

I was playing with the onnx-mlir source code but could not figure out how to debug this error.
Do you set breakpoints in specific files?
How/where did you determine that tensor<f32> was generated instead of tensor<1xf32>?

chentong319 (Collaborator) commented

I set a breakpoint in onnx-mlir/src/Builder/FrontendDialectHelper.cpp, in mlir::DenseElementsAttr onnxTensorProtoToDenseElmAttr. This is the procedure that constructs the attribute for ConstantOp. I dumped the type and also printed out initializer.dims().size().
By the way, I tried to use your model-generation Python code and got an error (the same error as with other example code). I think it is caused by my TensorFlow version being newer than what the tf2onnx tool (installed from the package, not from source) requires. Did you install tf2onnx from a package or from source?
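
A rough sketch of that debugging session with gdb (rbreak sets breakpoints on every function matching the regex, which avoids spelling out the mangled C++ symbol; paths follow those used earlier in this thread):

$ gdb --args /working_dir/onnx-mlir/build/bin/onnx-mlir --EmitLib model.onnx.mlir
(gdb) rbreak onnxTensorProtoToDenseElmAttr
(gdb) run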

doru1004 (Collaborator) commented Nov 6, 2020

@chentong319 any updates on whether this error has been fixed?
