Provide a way for kernels to request an optional scratch buffer size #2345
A more optimized implementation may require a larger scratch buffer. This lets such an operator fall back to a less demanding implementation if the tensor arena is too small.
Change-Id: I6ccf72f3caaf2d6ff2d6cdddbf3bb62a26f31915
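To make the fallback idea concrete, here is a minimal sketch. It assumes a toy arena rather than the real allocator: `ToyArena`, `PlanScratch`, `ScratchPlan`, and `KernelPath` are hypothetical names invented for illustration and are not part of this project's API. The kernel asks for the larger, optional scratch size first and only settles for the smaller one when the arena cannot fit it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tensor arena; NOT the project's allocator.
class ToyArena {
 public:
  explicit ToyArena(std::size_t capacity) : capacity_(capacity) {}

  // Reserves `bytes` and returns true, or returns false and leaves the
  // arena unchanged when the request does not fit.
  bool TryReserve(std::size_t bytes) {
    if (bytes > capacity_ - used_) return false;
    used_ += bytes;
    return true;
  }

  std::size_t used() const { return used_; }

 private:
  std::size_t capacity_;
  std::size_t used_ = 0;
};

enum class KernelPath { kOptimized, kFallback };

struct ScratchPlan {
  KernelPath path;
  std::size_t scratch_bytes;
};

// Tries the optional (larger) scratch request first, then the smaller one.
// Returns false only when even the fallback request does not fit.
bool PlanScratch(ToyArena& arena, std::size_t optimized_bytes,
                 std::size_t fallback_bytes, ScratchPlan* plan) {
  assert(plan != nullptr);
  if (arena.TryReserve(optimized_bytes)) {
    *plan = {KernelPath::kOptimized, optimized_bytes};
    return true;
  }
  if (arena.TryReserve(fallback_bytes)) {
    *plan = {KernelPath::kFallback, fallback_bytes};
    return true;
  }
  return false;
}
```

With a 1024-byte arena, a 4096-byte optimized request, and a 256-byte fallback request, `PlanScratch` would pick the fallback path; with an 8192-byte arena it would pick the optimized one.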
Just a couple initial thoughts:
Thanks @rascani
@mansnils - I think I like the idea of a build flag, but that raises the question: how useful is the "both" option? The code logic becomes very simple with a build flag: if optimizing for size, the scratch buffer is X, else Y. The "both" option brings more complex logic, and I'm not sure I see when it would be useful to the user. Maybe it makes sense for multi-tenant situations where multiple models share the same arena space (thus the arena size is the max over the model set)?
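For illustration, the "if optimizing for size, X, else Y" logic could look roughly like the sketch below. The macro `TOY_OPTIMIZE_FOR_SIZE`, the helper names, and both byte counts are assumptions made up for this example; they are not the flag or sizes used in the actual change.

```cpp
#include <cstddef>

// Hypothetical build flag: pass -DTOY_OPTIMIZE_FOR_SIZE to the compiler
// to select the small-arena configuration.
#if defined(TOY_OPTIMIZE_FOR_SIZE)
constexpr bool kOptimizeForSize = true;
#else
constexpr bool kOptimizeForSize = false;
#endif

// Placeholder sizes ("X" and "Y" in the comment above); not measured values.
constexpr std::size_t kSmallScratchBytes = 256;
constexpr std::size_t kLargeScratchBytes = 4096;

// Maps the build mode to the scratch size a kernel would request.
constexpr std::size_t ScratchBytesFor(bool optimize_for_size) {
  return optimize_for_size ? kSmallScratchBytes : kLargeScratchBytes;
}

// The single size this build requests, fixed at compile time.
constexpr std::size_t ScratchBytes() {
  return ScratchBytesFor(kOptimizeForSize);
}
```

Because the choice is made at compile time, the kernel needs no runtime fallback path, which is the simplification being discussed.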
@rascani - perhaps the added complexity that comes with the "both" option cannot be justified. Since the simpler build option proposal more or less solves the original problem, the "both" option is more of a nice-to-have. I am happy to update the draft PR with the new proposal.
To give an example of a multi-tenant use case, it could be running a test suite with multiple models for a given target.
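As a rough sketch of the arena sizing mentioned earlier for the multi-tenant case: if the models run one at a time in the same arena, the shared arena only has to fit the largest of them. `SharedArenaBytes` is a hypothetical helper, and the sizes in the usage note are placeholders.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>

// Arena size needed when several models take turns in one arena:
// the largest individual requirement. Returns 0 for an empty set.
inline std::size_t SharedArenaBytes(
    std::initializer_list<std::size_t> per_model_bytes) {
  std::size_t result = 0;
  for (std::size_t bytes : per_model_bytes) {
    result = std::max(result, bytes);
  }
  return result;
}
```

For example, `SharedArenaBytes({10240, 20480, 4096})` evaluates to 20480. A model that fits the shared arena only with its smaller scratch request is the situation where a runtime fallback would matter.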
"This PR is being marked as stale due to inactivity. Remove label or comment to prevent closure in 5 days."
Closing in favor of #2408
Adds a build flag that can be used by any kernel to provide a different implementation depending on use case. Adds a first use case for cmsis-nn transpose conv. The background for this PR is in #2345.
BUG=none