In your benchmarks, you use batch sizes (BS) of 1, 100, 300, 1000, etc. It's clear that BS 1 is slower than BS 100 for the GPU backend, but larger batch sizes then show performance degradation.
Two issues with this:
1. For consumer GPUs, even BS 100 will be hard to fit into memory.
2. Since the results deteriorate from BS 100 upward, what is the performance for BS 16, 32, and 64?
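For what it's worth, a sweep over the smaller sizes could be sketched roughly as below. This is only an illustration, not the repository's benchmark code: `sweep_batch_sizes`, `run_batch`, and `dummy_run_batch` are hypothetical names, and the dummy workload is a CPU stand-in for whatever the real backend call is.

```python
import time


def sweep_batch_sizes(run_batch, batch_sizes=(16, 32, 64, 100), repeats=3):
    """Time run_batch(bs) for each batch size and return {bs: items_per_second}.

    run_batch is a hypothetical callable standing in for the backend under test.
    The best (shortest) of `repeats` runs is kept to reduce timing noise.
    """
    results = {}
    for bs in batch_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_batch(bs)
            best = min(best, time.perf_counter() - start)
        # Guard against a zero elapsed time on very coarse timers.
        results[bs] = bs / max(best, 1e-12)
    return results


def dummy_run_batch(batch_size):
    # Placeholder workload (assumption): replace with the real inference call.
    sum(i * i for i in range(batch_size * 1000))


if __name__ == "__main__":
    for bs, rate in sweep_batch_sizes(dummy_run_batch).items():
        print(f"BS {bs:>4}: {rate:,.0f} items/s")
```

On a GPU backend the real `run_batch` would also need to wait for the device to finish before the timer stops, since kernel launches are typically asynchronous; otherwise the numbers measure launch overhead rather than throughput.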
Enquiring minds need to know!
Thanks!
That’s a great point, and it would indeed be very interesting to test! Thanks for bringing it up. I’d love to look into this when I have a bit more time, but unfortunately, I don’t have the capacity to do so at the moment.