Implement new Python APIs #25999
Also: AttributeError: 'InferenceSession' object has no attribute 'inputs_meminfo'
copy_tensors fails because no data transfer is available to copy from CPU to CPU. lintrunner also complains that OrtSyncStream is undefined.
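A quick way to check which of the names from this PR an installed build exposes is an attribute probe. The sketch below is hedged: the helper names `probe_names` and `probe_installed` are hypothetical, and whether each listed name is exported at the top level of the `onnxruntime` package is an assumption to verify, not something this thread confirms.

```python
import importlib

# Names taken from this PR's description. Whether each one is exported at the
# top level of the installed package is an assumption to be checked.
EXPECTED_NAMES = (
    "copy_tensors",
    "OrtSyncStream",
    "OrtEpDevice",
    "OrtMemoryInfo",
    "OrtMemoryInfoDeviceType",
    "OrtDeviceMemoryType",
)


def probe_names(module, names=EXPECTED_NAMES):
    """Map each name to True if `module` exposes it as an attribute."""
    return {name: hasattr(module, name) for name in names}


def probe_installed(package="onnxruntime"):
    """Probe an installed package, or return None if it cannot be imported."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    return probe_names(module)


if __name__ == "__main__":
    report = probe_installed()
    if report is None:
        print("onnxruntime is not installed")
    else:
        for name, present in report.items():
            print(f"{name}: {'present' if present else 'missing'}")
```

The same `hasattr` check can be applied to an `InferenceSession` instance to see which memory-info accessors exist before calling them.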
Pull Request Overview
This PR introduces significant enhancements to ONNX Runtime's Python and C++ APIs, focusing on device and memory management, synchronization streams, and tensor operations. The changes provide better introspection capabilities and make advanced features previously only available in C++ accessible from Python.
- Adds comprehensive Python bindings for device/memory types and execution provider device handling
- Introduces synchronization stream support with Python-accessible APIs
- Implements a new tensor copy functionality for efficient data transfer between OrtValue objects
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
onnxruntime/test/python/onnxruntime_test_python_autoep.py | Adds tests for new EP device memory info, sync stream creation, and tensor copy functionality |
onnxruntime/test/python/onnxruntime_test_python.py | Adds tests for new session memory info properties and device/memory info APIs |
onnxruntime/python/onnxruntime_pybind_state.cc | Core implementation of new Python bindings for device types, memory info, sync streams, and tensor copy |
onnxruntime/python/onnxruntime_pybind_ortvalue.cc | Changes OrtValue data_ptr return type from int64_t to uintptr_t for better cross-platform compatibility |
onnxruntime/python/onnxruntime_inference_collection.py | Adds Python wrapper methods for accessing session memory info and EP devices, plus tensor copy function |
onnxruntime/__init__.py | Exports new public APIs for device/memory types and tensor copy functionality |
include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Refactors sync stream implementation to support templated approach |
include/onnxruntime/core/session/onnxruntime_cxx_api.h | Generalizes SyncStream handling with template-based implementation for better type safety |
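The `data_ptr` change from `int64_t` to `uintptr_t` can be illustrated with `ctypes`. This is a sketch of one likely motivation, not a statement of the author's reasoning: a signed 64-bit view turns an address with the top bit set into a negative number, while an unsigned view preserves it. The address is hypothetical, and `c_uint64` stands in for `uintptr_t` on the assumption of a 64-bit platform.

```python
import ctypes


def as_int64(address: int) -> int:
    """Reinterpret a 64-bit address as a signed integer (the old return type)."""
    return ctypes.c_int64(address).value


def as_uint64(address: int) -> int:
    """Reinterpret a 64-bit address as an unsigned integer.

    Stand-in for uintptr_t, assuming a 64-bit platform.
    """
    return ctypes.c_uint64(address).value


# Hypothetical address with the most significant bit set.
HIGH_ADDRESS = 0xFFFF_8000_0000_1000

signed_view = as_int64(HIGH_ADDRESS)
unsigned_view = as_uint64(HIGH_ADDRESS)

if __name__ == "__main__":
    print("signed view is negative:", signed_view < 0)
    print("unsigned view round-trips:", unsigned_view == HIGH_ADDRESS)
```

For addresses below 2**63 the two views agree, which is why the difference only shows up on some platforms and allocators.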
Description
This pull request introduces several enhancements to ONNX Runtime's Python and C++ APIs, focusing on improved device and memory information handling, synchronization stream support, and tensor copy functionality. It adds new Python bindings for device/memory types, exposes more detailed session input/output metadata, and provides a Python-accessible tensor copy API. The changes also refactor and extend the C++ API for better stream and memory info management.
Key changes include:
Device and Memory Information Enhancements
- Added Python bindings for `OrtMemoryInfoDeviceType` and `OrtDeviceMemoryType`, and expanded `OrtDevice` to expose the memory type via a new `mem_type` method. The `OrtMemoryInfo` Python class now supports both legacy and new V2 constructors and exposes additional properties such as device memory type and vendor ID.
- Extended the `InferenceSession` object to provide access to input/output `OrtMemoryInfo` and `OrtEpDevice` objects through new properties and methods.
Synchronization Stream and Execution Provider Device Support
- Added Python bindings for `OrtSyncStream`, including creation via `OrtEpDevice.create_sync_stream()` and retrieval of device-specific `OrtMemoryInfo` via `OrtEpDevice.memory_info()`.
- Refactored the C++ `SyncStream` handling, allowing for unowned streams and improved type safety.
Tensor Copy Functionality
- Added a `copy_tensors` function and corresponding C++ binding, enabling efficient copying of tensor data between `OrtValue` objects, optionally using a synchronization stream.
Miscellaneous Improvements and Fixes
- Changed the return type of the `OrtValue.data_ptr` method in the Python binding from `int64_t` to `uintptr_t` for better cross-platform compatibility.
- [...] (`OrtDevice`).
These changes collectively improve the flexibility and introspection capabilities of ONNX Runtime's device, memory, and execution provider interfaces, and make advanced features available to Python users.
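The pairwise copy contract that the description attributes to `copy_tensors` (matching source and destination sequences, with an optional stream) can be modeled in plain Python. The sketch below is a toy stand-in, not the ONNX Runtime implementation: `copy_buffers` and `RecordingStream` are hypothetical names, byte buffers stand in for `OrtValue` tensors, and the validation rules are assumptions.

```python
from typing import Optional, Sequence


class RecordingStream:
    """Hypothetical stand-in for a sync stream: it only records copy sizes."""

    def __init__(self) -> None:
        self.copied_sizes = []


def copy_buffers(
    src: Sequence[bytes],
    dst: Sequence[bytearray],
    stream: Optional[RecordingStream] = None,
) -> None:
    """Copy each source buffer into the destination at the same index."""
    if len(src) != len(dst):
        raise ValueError("src and dst must contain the same number of buffers")
    # Validate every pair first so a failure leaves all destinations untouched.
    for index, (source, target) in enumerate(zip(src, dst)):
        if len(source) != len(target):
            raise ValueError(f"size mismatch at index {index}")
    for source, target in zip(src, dst):
        target[:] = source  # in-place overwrite of the destination bytes
        if stream is not None:
            stream.copied_sizes.append(len(source))


if __name__ == "__main__":
    sources = [b"abcd", b"xy"]
    targets = [bytearray(4), bytearray(2)]
    copy_buffers(sources, targets)
    print(targets)
```

Validating all pairs before copying is a design choice of this sketch. Whether the real binding validates up front or fails partway through is not stated in the PR text.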
Motivation and Context
Depends on: #26021