[Feature Request] Global Threadpool in Python API #23523

Describe the feature request

Expose the ability to use a global thread pool for inference sessions in the Python API.

Describe scenario use case

My current use case requires instantiating many (thousands of) small ONNX models in memory at once. Doing so spawns too many threads, halting the program. The functionality for a global thread pool exists in the C++ source but is not exposed to the Python bindings.
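To illustrate the scenario, a minimal sketch of the current default behavior (`models` here is a hypothetical list of serialized model bytes, not from the original report):

```python
import onnxruntime as ort

# Each InferenceSession creates its own intra-op and inter-op thread
# pools by default, so loading thousands of models spawns thousands
# of mostly idle threads.
sessions = [ort.InferenceSession(model_bytes) for model_bytes in models]
```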
Comments
I have linked a fork that I have tested to be working on my MacBook, with the following implementation as an example use case:

```python
import onnxruntime as ort

ort.set_global_thread_pool_sizes(64, 64)  # new functionality

class OnnxRunner:
    sess_options = ort.SessionOptions()
    sess_options.use_per_session_threads = False  # newly exposed to the Python API

    def __init__(self, model: bytes):
        self.session = ort.InferenceSession(model, sess_options=self.sess_options)

    def predict(self, x: Array):  # Array: the author's array type (e.g. numpy.ndarray)
        x = x.astype("float32")
        y = self.session.run(["output"], {"input": x})
        return y[0]
```
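For context, a hypothetical usage sketch of the class above (the model path and input shape are assumptions, not from the original; the input and output names come from the snippet):

```python
import numpy as np

# Assumed path to a serialized ONNX model for illustration.
with open("model.onnx", "rb") as f:
    runner = OnnxRunner(f.read())

x = np.random.rand(1, 3)  # assumed input shape; predict() casts to float32
print(runner.predict(x))
```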
Would you be willing to submit a PR?
Yes, I have an open PR here.
+1 to this; we're in a similar situation of having thousands of small models resident simultaneously, and we don't want that many thread pools spun up and sitting idle.
My PR is still open, but unfortunately I don't really have the knowledge or bandwidth to push it further at the moment. I threw it together as a POC for my team, but we ended up just going with the simpler per-session workaround sketched below (the original snippet wasn't preserved here):
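A plausible reconstruction of that workaround, assuming the standard `SessionOptions` thread-count settings (`model_bytes` is a hypothetical placeholder):

```python
import onnxruntime as ort

# Limit each session to a single thread so thousands of resident
# sessions don't each spin up their own pool of worker threads.
sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = 1
sess_options.inter_op_num_threads = 1

session = ort.InferenceSession(model_bytes, sess_options=sess_options)
```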
In local testing this worked to pin the inference sessions to one globally shared thread, which was sufficient for our use case.