forked from pytorch/ort
Commit
Changing MAX_LEN to 64 during preprocessing
Fixing issues identified during validation
More fixes for issues identified during review
Changing LF line sequence
Adding back license for resnet sample
fixing ReadMe instructions
removing version file generated from build
fix to close image file
Update Readme.md
Fix avg inf time
Minor fixes
- modified preprocessing in BERT sample to throw a warning message when truncation happens.
- modified ortinferencemodule to reuse _inputs_info
Print message change
Reuse _input_info
Changing exceptions to valueerrors, fixing default options
Updating Readme and usage docs
removing test labels file
Update usage.md
1 parent c318f37, commit 67ea747

Showing 6 changed files with 333 additions and 221 deletions.
@@ -147,32 +147,42 @@ To see torch-ort in action, see https://github.com/microsoft/onnxruntime-trainin

# Accelerate inference for PyTorch models with ONNX Runtime (Preview)

-ONNX Runtime for PyTorch accelerates PyTorch model inference using ONNX Runtime.
+ONNX Runtime for PyTorch is now extended to support PyTorch model inference using ONNX Runtime.

-It is available via the torch-ort-inference python package. This preview package enables OpenVINO™ Execution Provider for ONNX Runtime by default for accelerating inference on various Intel CPUs and integrated GPUs.
+It is available via the torch-ort-inference python package. This preview package enables OpenVINO™ Execution Provider for ONNX Runtime by default for accelerating inference on various Intel® CPUs, Intel® integrated GPUs, and Intel® Movidius™ Vision Processing Units - referred to as VPU.

This repository contains the source code for the package, as well as instructions for running the package.

## Prerequisites

- Ubuntu 18.04, 20.04
- Python* 3.7, 3.8 or 3.9

## Install in a local Python environment

By default, torch-ort-inference depends on PyTorch 1.12 and ONNX Runtime OpenVINO EP 1.12.

-Install torch-ort-inference with OpenVINO dependencies
-
-- `pip install torch-ort-inference[openvino]`
+1. Install torch-ort-inference with OpenVINO dependencies.
+   - `pip install torch-ort-inference[openvino]`
+2. Run post-installation script
+   - `python -m torch_ort.configure`

## Verify your installation

-Once you have created your environment, using Python, execute the following steps to validate that your installation is correct.
+Once you have created your environment, execute the following steps to validate that your installation is correct.

-1. Download a inference script
-   - `wget https://raw.githubusercontent.com/pytorch/ort/main/torch_ort_inference/tests/bert_for_sequence_classification.py`
+1. Clone this repo
+   - `git clone git@github.com:pytorch/ort.git`

2. Install extra dependencies

- `pip install wget pandas transformers`

3. Run the inference script

- `python ./ort/torch_ort_inference/tests/bert_for_sequence_classification.py`
@@ -204,6 +214,11 @@ If no provider options are specified by user, OpenVINO™ Execution Provider is

```
backend = "CPU"
precision = "FP32"
```

+For more details on APIs, see [usage.md](/torch_ort_inference/docs/usage.md).
+
+### Note
+
+Currently, Vision models are supported on Intel® VPUs. Support for NLP models may be added in future releases.

## License
torch_ort_inference/docs/usage.md (new file)
@@ -0,0 +1,42 @@
# APIs for OpenVINO™ integration with TorchORT

This document describes available Python APIs for OpenVINO™ integration with TorchORT to accelerate inference for PyTorch models on various Intel hardware.

## Essential APIs

To add the OpenVINO™ integration with TorchORT package to your PyTorch application, add the following 2 lines of code:

```python
from torch_ort import ORTInferenceModule
model = ORTInferenceModule(model)
```
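For orientation, here is a minimal end-to-end sketch of how those two lines slot into an application. The torchvision model, random weights, and input shape are illustrative assumptions rather than part of this document:

```python
import torch
import torchvision
from torch_ort import ORTInferenceModule

# Any eager-mode PyTorch model can be wrapped; resnet50 with default
# (random) weights is used here only to keep the sketch self-contained.
model = torchvision.models.resnet50()
model.eval()

# Wrap the model so inference is dispatched through ONNX Runtime
# (OpenVINO Execution Provider with CPU/FP32 defaults, per the text above).
model = ORTInferenceModule(model)

# Run a forward pass on a dummy 224x224 RGB image.
dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(dummy_input)
print(logits.shape)
```

The first forward pass typically triggers the ONNX export and provider setup, so later calls are usually faster than the first one.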
By default, the CPU backend with FP32 precision is enabled. You can set a different backend and supported precision using OpenVINOProviderOptions as below:

```python
provider_options = OpenVINOProviderOptions(backend = "GPU", precision = "FP16")
model = ORTInferenceModule(model, provider_options = provider_options)
```

Supported backend-precision combinations:

| Backend | Precision |
| ------- | --------- |
| CPU     | FP32      |
| GPU     | FP32      |
| GPU     | FP16      |
| MYRIAD  | FP16      |
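As a further illustration of the table, targeting a VPU should follow the same pattern as the GPU example above. This is an unverified sketch, and the import path for OpenVINOProviderOptions is assumed to match ORTInferenceModule:

```python
from torch_ort import ORTInferenceModule, OpenVINOProviderOptions  # import path assumed

# MYRIAD (VPU) devices appear with FP16 only in the table above.
provider_options = OpenVINOProviderOptions(backend="MYRIAD", precision="FP16")
model = ORTInferenceModule(model, provider_options=provider_options)
```

Per the commit message above ("Changing exceptions to valueerrors"), an unsupported combination such as MYRIAD with FP32 is presumably rejected with a ValueError.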
## Additional APIs

To save the inline exported onnx model, use DebugOptions as below:

```python
debug_options = DebugOptions(save_onnx=True, onnx_prefix='<model_name>')
model = ORTInferenceModule(model, debug_options=debug_options)
```

To enable verbose logging of the execution of the TorchORT pipeline, use DebugOptions as below:

```python
debug_options = DebugOptions(log_level=LogLevel.VERBOSE)
model = ORTInferenceModule(model, debug_options=debug_options)
```
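The provider and debug options are shown separately above. The sketch below assumes, without confirmation from this document, that ORTInferenceModule accepts both keyword arguments in a single call and that all of these names are importable from torch_ort:

```python
# Assumed imports: only ORTInferenceModule's import is shown explicitly above.
from torch_ort import (
    DebugOptions,
    LogLevel,
    ORTInferenceModule,
    OpenVINOProviderOptions,
)

# Assumption: provider_options and debug_options can be combined in one call,
# and DebugOptions accepts log_level, save_onnx, and onnx_prefix together.
provider_options = OpenVINOProviderOptions(backend="GPU", precision="FP16")
debug_options = DebugOptions(log_level=LogLevel.VERBOSE, save_onnx=True, onnx_prefix="my_model")

model = ORTInferenceModule(
    model,
    provider_options=provider_options,
    debug_options=debug_options,
)
```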