BLS scripting: executing / submitting multiple requests at once in a single shot? #7928
Unanswered
vadimkantorov asked this question in Q&A
Replies: 0 comments
Does BLS provide an API for submitting multiple InferenceRequest objects at once (perhaps to save some gRPC round trips and avoid blocking threads on networking calls)? Or is there no point in this?
Also, does InferenceClient support submitting multiple requests to the same model in a single gRPC round trip, packing them into one gRPC request to reduce overhead?
At least from a DX/UX standpoint, this would be a useful mode to support, at least in the frontend methods, IMO. It could also be useful for text models, e.g. BERT-based classification of multiple short strings.
Thanks!
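For concreteness, here is a minimal sketch of the "many requests in flight at once" pattern being asked about. It assumes the `async_exec()` coroutine that the Python backend documents for BLS requests, awaited together with `asyncio.gather`. `FakeInferenceRequest` and the model name `bert_classifier` are hypothetical stand-ins so the snippet runs outside Triton; inside a real model, `pb_utils.InferenceRequest` would take its place. It illustrates concurrent submission only, not a confirmed single-call batch API.

```python
import asyncio


class FakeInferenceRequest:
    """Hypothetical stand-in for pb_utils.InferenceRequest (assumption, not Triton code)."""

    def __init__(self, model_name, text):
        self.model_name = model_name
        self.text = text

    async def async_exec(self):
        # Simulate a non-blocking model call; a real BLS request would run inference here.
        await asyncio.sleep(0.01)
        return f"{self.model_name}:{self.text.upper()}"


async def run_all(texts, model_name="bert_classifier"):
    requests = [FakeInferenceRequest(model_name, t) for t in texts]
    # Create every coroutine first, then await them together so they overlap in time.
    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(r.async_exec() for r in requests))


responses = asyncio.run(run_all(["good", "bad", "meh"]))
print(responses)
```

This keeps one request per string. Whether that beats stacking the strings along the batch dimension of a single request likely depends on the model configuration.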