Properly convert a Detectron2 model to ONNX for Deployment #4414
Comments
Hi @vitorbds See if this example helps you It exports |
Hi @thiagocrepaldi Can the function in the link handle |
Just replaced |
@thiagocrepaldi Sorry to bother you- I don't really understand the link to the function you guys were talking about- all I see is a pull request. Is it possible that you share with me a working script that will export the model properly to onnx? Thanks |
Hi Frank, the link was to a pull request whose unit test does the export you want. All you need is to copy the unit test from the PR and remove/replace the helper functions we use to export and process the results. Which model are you specifically trying to export? I think I need to work on the deployment documentation a little bit :) |
Yeah, so my end goal is to export a custom detectron2 model (currently using the balloon example) to ONNX using the export_model.py script in the tools directory, and then export it again to TensorRT. Do you know if that's possible? The conversion on your end is fine, but the second I get to the TensorRT conversion it fails, and they tell me that your script is broken and that they can't help me. They gave me a base model that they exported (using your script from a while back, I'm assuming) and I've been using that, but then I ran into issues with the custom model part because your script is necessary for it to work. I can explain more if necessary but I think this should suffice. |
Would it help if I were to check out a specific commit? Right now the errors I'm getting with TensorRT (which I have reported, only to be ignored) are these: when I use their export script, I get "list assignment index is out of range", and when I skip this step and just try to create their engine, I get an unrecognized op. Now I know this isn't really your domain, but they told me that the problem was in Detectron2's export script, and I really need that to work in order for this whole project of mine to work |
Here's the official output from their commands: create_onnx.py and then build_engine.py |
Hi. I am actually trying to achieve the same type of export for a mask_rcnn (detectron2 -> onnx) or (detectron2 -> pytorch). Did you figure it out ? |
Yeah, so I managed to get the export working for detectron2 Mask R-CNN R50-FPN to TensorRT, however I have yet to be able to run it in a Docker container on the Jetson TX2. I am currently working on that. As of right now though, I have a working system that runs on the Jetson TX2, which I would be happy to share, and a fully working converter on Google Colab |
Nice, well done. I would love to see the Google Colab converter, to see if that solves it for my case, if that works for you ! |
https://colab.research.google.com/drive/1ZFdkdIAjD0ldhJ9TEhzTL1bndLyqm2Rd?usp=sharing That's the Colab file. Note that I used the balloon dataset, so you might need to change it to suit your needs. Read through the code as well before you use it |
@frankvp11 great news that you managed to convert the model to ONNX. Looking at your notebook, it seems no changes to the original ONNX export code in detectron2 were needed. Is that right? Did you try |
Could you file an issue and send it my way? This issue is meant for something else, and we can probably close it as resolved, as @vitorbds already confirmed the model he asked about is working as expected |
I could absolutely try that. However, it's my understanding that TensorRT anticipated that there would be nodes that TensorRT doesn't understand, which is why they created a second converter called create_onnx.py |
It is possible that TensorRT does not implement all ONNX operators, but my PR allows the export of the model only using standard ONNX operators, so in theory, detectron2 exporter is correct |
What was the missing op you mentioned? That is probably the root cause: TensorRT not implementing an operator. My guess is GridSampler, as it is a new addition to ONNX/PyTorch, so probably to TensorRT too |
When I tried to do it with the latest detectron2 and export-method tracing I get an unknown scalar error. |
Intriguing. I would expect that from |
Give me one second to get that for you |
was the output and this was the command !python /content/detectron2/tools/deploy/export_model.py --config-file /content/output.yaml --output /content/model.onnx --format onnx --sample-image /content/new.jpg --export-method tracing MODEL.DEVICE cuda MODEL.WEIGHTS /content/output/model_final.pth Same notebook btw |
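For anyone reproducing this outside a notebook, the command above can also be assembled from Python, which avoids shell-quoting problems with the trailing KEY VALUE overrides. This is only a minimal sketch: the flags and overrides are copied from the command in this comment, while the helper names `build_export_command` and `run_export` are made up for illustration.

```python
import subprocess
import sys


def build_export_command(script, config_file, output, sample_image, weights,
                         export_method="tracing", device="cuda"):
    """Return the argv list for the export_model.py call quoted above."""
    return [
        sys.executable, script,
        "--config-file", config_file,
        "--output", output,
        "--format", "onnx",
        "--sample-image", sample_image,
        "--export-method", export_method,
        # Trailing KEY VALUE pairs are config overrides, as in the command above.
        "MODEL.DEVICE", device,
        "MODEL.WEIGHTS", weights,
    ]


def run_export(cmd):
    """Run the export and raise CalledProcessError if the script fails."""
    subprocess.run(cmd, check=True)


cmd = build_export_command(
    script="/content/detectron2/tools/deploy/export_model.py",
    config_file="/content/output.yaml",
    output="/content/model.onnx",
    sample_image="/content/new.jpg",
    weights="/content/output/model_final.pth",
)
print(" ".join(cmd[1:]))
```

Calling `run_export(cmd)` would then start the export; it is left out here so the snippet runs without detectron2 installed.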
The output looks weird, sorry about that. I can try editing it if you want @thiagocrepaldi |
Alright no problem. You can close |
@frankvp11 I have a fix/workaround for #4354 and could look into your issue. Did you create a new issue with a straightforward repro? Preferably a Python file that I can just run to see the problem. If you are using an old torch, I suggest trying the master branch or maybe 1.12.1. Also, change your detectron2 code to use the latest ONNX opset (aka 16) in this file https://github.com/facebookresearch/detectron2/blob/main/detectron2/export/__init__.py#L19 STABLE_ONNX_OPSET_VERSION = 16 # Default was 11 |
Hi. Thanks for your reply. I actually figured it out. Works well with both tracing and scripting with a pytorch inference. |
Thank you very much for sharing. Got me on the right track !! |
Hi, I have a model that was trained with faster_rcnn_R_101_FPN_3x, and I want to convert it to ONNX. The conversion succeeds, but when I run inference with onnxruntime, I get the following output: `['output', 'value.11', 'value.7', 'onnx::Split_1073'] outputs is [array([], shape=(0, 4), dtype=float32), array([], dtype=int64), array([], dtype=float32), array([ 800, 1067], dtype=int64)] ` Can you help me? |
I think I can try. Let's first establish some things. First, make sure that the picture you are using is indeed a valid picture, i.e. it has the right size and it contains objects that you can detect. Then let's talk about the output you are receiving. It's been a while since I've used numpy, and I haven't used onnxruntime, so I might be totally useless. However, from a first look, the numpy array sizes/shapes seem correct, and potentially the contents aren't. My suggestion would be to convert a new model. Start from scratch, essentially, and check whether the error is reproducible. Then test again with the newly converted model. You can also use a tool like Netron to make sure that the model "shape" is correct, i.e. the correct layers in the correct positions. If it isn't, then your conversion process is most likely at fault. Otherwise, I'd look into other ways to make predictions using onnxruntime (maybe you are doing it wrong, I don't know). You can also check out my article on Medium, it might help: https://medium.com/@frankvanpaassen3/how-to-optimize-custom-detectron2-models-with-tensorrt-2cd710954ad3 If all this fails, I'd suggest opening a new issue and getting help from actual Detectron2 people. |
Thanks frankvp11, I'll do as you said. I'll read your article. |
Did you end up getting it to work? |
Hi @frankvp11, I'm trying to export a RetinaNet model trained using detectron2 to ONNX format. It would be great if you could help me with this export. Should I use caffe2_tracing or just tracing as the export method? |
In my article I said caffe2 tracing, but I just checked /samples/python/detectron2 and it says caffe2 is deprecated, so my guess would be regular tracing. I suppose if you have the time and willpower you could try both; just make sure you have the right versions of everything if you do (specifically for the caffe2 side) |
Hi Githubers, has anyone successfully figured this out? I have been trying to convert a Detectron2 model to ONNX for deployment for a long time, but still haven't succeeded. I followed this workflow: TensorRT/samples/python/detectron2 at main · NVIDIA/TensorRT · GitHub. But I always get an error: `KeyError: 'UNKNOWN_SCALAR'` It would be great if someone could share some helpful information ^^ |
Hi @htlbayytq, I was able to figure out how to convert the R-CNN R50 as outlined in the demo from TensorRT. From what I last remember, the panoptic_fpn_R_50_1x is not yet supported. It seems that this page has some relevance, so we'll go from here. I don't know if you read my article, but there I outline how I managed to get it to work, and also it might be worthwhile checking to make sure all the installations are correct and that you have tried everything suggested by actual Detectron2 staff in this issue. If all this still fails, you might have to wait for a facebookresearch rep to come help |
hi frank, |
Also, why do we need onnx to detect the situation? |
As for being unable to get them to run with onnxruntime, perhaps you could try other optimization services (TensorRT). I agree with you that the support for ONNX conversion is severely lacking, but there is nothing I can do about that. As for why we use ONNX: for me personally, it was because it was required to make a TensorRT engine. It basically converts the model to a different language (think C++ to Python) so that engine-building services like onnxruntime and TensorRT can do what you did (play around with size, etc). I'm not an expert though, so don't quote me on that. As far as I remember, the things that I did in my article worked ~6 months ago; perhaps you could try going back to those versions if you need. |
I would like to export detectron2 model to OpenVINO because I have Intel Iris Xe graphics and NCS 2. How can I do it ? |
Hmm. I think if you follow the instructions as posted in the export directory it should work. However, I'm assuming that you're here because you got an error while trying to do so? I'm not familiar with either OpenVINO or NCS 2, so if that's the issue you might need to go to their forums (if they exist). As far as converting to OpenVINO goes, if detectron2 doesn't have a file for it (or OpenVINO doesn't provide a "translator" file), I don't think it's possible. It would only be possible if OpenVINO used a pre-established model format provided by detectron2 (caffe2, onnx, torchscript) or created their own |
You think it isn't possible. For detectron2, I need model with .pt/.yaml/.pth extension and weights with .pt/.pkl/.pth but for OpenVINO it's model.xml and weights.bin. Now, I can export PyTorch model to OpenVINO IR and use my Intel GPU but I can't with detectron2 yet. |
If this helps: I found something weird in detectron2 with the onnx::Split operator. Despite fixing split for variable sizes, I hit a problem when the size is 1: I had to avoid calling split in this case and just return the original structure (from before the split) wrapped in []. for ex
I did not understand why, but it fixed my problem. So now I can run inference with onnxruntime using the CPU and CUDA EPs, but not the TensorRT EP. With the TensorRT EP I am now having an issue with the operator onnx::ReduceMax, see pytorch/pytorch#97344 |
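The example from this comment did not survive, so the following is only a sketch of the guard as described (skip the split when there is a single chunk and return the input wrapped in a list), not the commenter's actual patch. The names `split_with_guard` and `split_list` are hypothetical, and a plain list stands in for the tensor; in detectron2 the `split_fn` argument would be whatever tensor split call is being exported.

```python
def split_list(value, sizes):
    """Plain-Python stand-in for a tensor split: cut value into chunks of the given sizes."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(value[start:start + size])
        start += size
    return chunks


def split_with_guard(value, sizes, split_fn=split_list):
    """Split value into len(sizes) chunks, bypassing split_fn for a single chunk."""
    if len(sizes) == 1:
        # Single chunk: return the original structure wrapped in a list,
        # so no Split op is emitted for this case.
        return [value]
    return list(split_fn(value, sizes))


print(split_with_guard([1, 2, 3, 4, 5], [2, 3]))
print(split_with_guard([1, 2, 3], [3]))
```

Callers still receive a list in both cases, so the code after the split does not need to change.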
Do you get the same output with detectron2 and ort? And can you run inference on CPU/GPU? |
Yes, the results are the same (10e-3 or better) on CPU, which validates the model. I don't know yet for TensorRT, as the TensorRT EP on GPU does not work. |
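A check like the one described here could look roughly as follows. It is a sketch under assumptions: outputs are taken as flat lists of floats, the 10e-3 tolerance is the figure quoted above, and `max_abs_diff` and `outputs_match` are illustrative names rather than anything from detectron2 or onnxruntime.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol=10e-3):
    """True if both outputs have the same length and agree within atol."""
    if len(reference) != len(candidate):
        return False
    return max_abs_diff(reference, candidate) <= atol


pytorch_scores = [0.98, 0.75, 0.31]   # made-up values for illustration
onnx_scores = [0.981, 0.749, 0.312]   # made-up values for illustration
print(outputs_match(pytorch_scores, onnx_scores))
```

A length mismatch is reported as a failed match instead of an exception, since a different number of detections is itself a sign that the two models disagree.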
Can you share your code, I would like to test it. |
@thiagocrepaldi thank you for your work on the detectron2 to ONNX export. I was able to run this locally and create an ONNX file for the |
Ouch, now you got me. One way to identify what is what is the output order. The first output in PyTorch will be the first output in ONNX, so if the mask is the third output in PyTorch, then value.7 is indeed your mask.
Detectron2 models, when exported to ONNX, can however have more outputs than in PyTorch. One reason is that detectron2 serializes some types, such as boxes, into a format which includes not only the original data, but also some metadata used to deserialize the data before returning it to the user. This metadata "leaks" into the ONNX representation because, in the process of ONNX export, we export exactly what we see (which is the serialized data with metadata). It is safe to ignore any extra output, though.
Regarding the difference in shape, there is no 1:1 mapping between PyTorch operators and ONNX operators. PyTorch ops are defined as Meta wants, and ONNX tries to do a more generic version that will fit not only PyTorch, but also TensorFlow, MXNet, Caffe2, CNTK, etc. So it is possible that for an operator in torch that outputs
Does it help? PS: IIRC, the detectron2 config files specify input/output for the models, so they might be used as a reference for which input/output is what. Just don't quote me on that lol |
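The positional rule described above can be sketched in a few lines. The ONNX names are the ones from the onnxruntime output posted earlier in this thread; the field names `pred_boxes`, `pred_classes`, and `scores` and their order are an assumption based on the shapes and dtypes in that output, and `label_outputs` is a hypothetical helper.

```python
def label_outputs(onnx_names, values, pytorch_fields):
    """Pair ONNX outputs with PyTorch field names by position.

    Returns (labeled, extras): the first len(pytorch_fields) outputs keyed by
    field name, and any remaining outputs keyed by their ONNX name.
    """
    if len(onnx_names) != len(values):
        raise ValueError("expected one value per ONNX output name")
    n = len(pytorch_fields)
    if len(values) < n:
        raise ValueError("fewer ONNX outputs than PyTorch fields")
    labeled = dict(zip(pytorch_fields, values[:n]))
    extras = dict(zip(onnx_names[n:], values[n:]))
    return labeled, extras


onnx_names = ["output", "value.11", "value.7", "onnx::Split_1073"]
values = [[], [], [], [800, 1067]]                  # shapes as in the post above
fields = ["pred_boxes", "pred_classes", "scores"]   # assumed order

labeled, extras = label_outputs(onnx_names, values, fields)
print(sorted(labeled))
print(extras)
```

With onnxruntime, `onnx_names` would come from `[o.name for o in session.get_outputs()]` and `values` from `session.run(None, feeds)`; anything left in `extras` is the leaked metadata, which can be ignored.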
@thiagocrepaldi thank you so much for your speedy and thorough response. I think I figured it out. The output detectron2/detectron2/modeling/postprocessing.py Lines 60 to 68 in d779ea6
So I just need to implement the post processor here on my side! |
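As a rough idea of what such a post-processing step involves, here is a dependency-free sketch that pastes one low-resolution soft mask (e.g. 28*28) into its box on the full image and thresholds it. It is not detectron2's implementation: it uses nearest-neighbor sampling where a real implementation would typically interpolate, and the function name `paste_mask`, the box format `(x0, y0, x1, y1)`, and the 0.5 threshold are assumptions.

```python
import math


def paste_mask(mask, box, height, width, threshold=0.5):
    """Paste a square soft mask into a height x width binary image.

    mask: M x M nested list of floats in [0, 1]
    box:  (x0, y0, x1, y1) in output-image pixel coordinates
    """
    m = len(mask)
    x0, y0, x1, y1 = box
    out = [[0] * width for _ in range(height)]
    box_w, box_h = x1 - x0, y1 - y0
    if box_w <= 0 or box_h <= 0:
        return out
    for y in range(max(0, int(y0)), min(height, math.ceil(y1))):
        for x in range(max(0, int(x0)), min(width, math.ceil(x1))):
            # Map the pixel center back into mask coordinates (nearest neighbor).
            col = min(m - 1, max(0, int((x + 0.5 - x0) / box_w * m)))
            row = min(m - 1, max(0, int((y + 0.5 - y0) / box_h * m)))
            if mask[row][col] >= threshold:
                out[y][x] = 1
    return out


tiny_mask = [[1.0, 0.0],
             [0.0, 1.0]]
for line in paste_mask(tiny_mask, (1, 1, 3, 3), height=4, width=4):
    print(line)
```

For real models the same loop would be vectorized with numpy or torch; the pure-Python version is only meant to show the coordinate mapping.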
Hi @njaouen ! I have problems changing the output from 28*28. It would help me a lot to know how you did it, thank you in advance. |
Hi @Aaronponceuv, you can modify the postprocessing function and use the sem_seg_postprocess function (https://github.com/facebookresearch/detectron2/blob/main/detectron2/modeling/postprocessing.py#L77). |
Hello,
I am trying to convert a Detectron2 model to ONNX format and run inference without depending on detectron2 at the inference stage.
Some information about this can be found here:
https://detectron2.readthedocs.io/en/latest/tutorials/deployment.html
However, the implementation is constantly being updated, and the documentation is not clear enough to carry out this task.
Can someone help me with a demo or tutorial on how to do it?
@thiagocrepaldi
Some information:
My model was trained using pre-trained weight from:
'faster_rcnn_50': {
'model_path': 'COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml',
'weights_path': 'model_final_280758.pkl'
},
I have 4 classes.
Of course, I now have my own weights.
My model was saved in .pth format.
I used my own dataset, with .png images.
Code in Python