add generic syrk/herk #1249
Cool, better performance is always welcome. I've got some comments to explore the opportunity to make it even more generic.
Up to a small suggestion for the test, this LGTM.
Perhaps we should have a Quaternion test case?
Ok, I've added a Quaternion test. I also added a pseudo docstring since the non-commutativity is not obvious. I assume a pseudo docstring is more appropriate than a real one since this function is not going to be exported.
Btw there's no reason to not have docstrings anymore for internal functions, now that we have
Ok, added a docstring then.
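To make the non-commutativity point above concrete, here is a minimal sketch. `Quat` is a hypothetical stand-in type defined only for this illustration; it is not the quaternion type used in the PR's test. It checks that `conj` reverses the order of a product while the factors themselves do not commute.

```julia
# Hypothetical minimal quaternion type, for illustration only.
struct Quat{T<:Real} <: Number
    s::T; i::T; j::T; k::T
end

# Hamilton product; note that it is not commutative.
Base.:*(a::Quat, b::Quat) = Quat(
    a.s*b.s - a.i*b.i - a.j*b.j - a.k*b.k,
    a.s*b.i + a.i*b.s + a.j*b.k - a.k*b.j,
    a.s*b.j - a.i*b.k + a.j*b.s + a.k*b.i,
    a.s*b.k + a.i*b.j - a.j*b.i + a.k*b.s,
)
Base.conj(a::Quat) = Quat(a.s, -a.i, -a.j, -a.k)
Base.:(==)(a::Quat, b::Quat) = (a.s, a.i, a.j, a.k) == (b.s, b.i, b.j, b.k)

a = Quat(1, 2, 3, 4)
b = Quat(0, 1, -1, 2)

@assert a * b != b * a                      # factors do not commute
@assert conj(a * b) == conj(b) * conj(a)    # conj reverses the order
@assert conj(a * b) != conj(a) * conj(b)    # keeping the order is wrong here
```

This is why a generic kernel has to keep the order of the two factors fixed when it forms each product, and why that is worth spelling out in a docstring.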
Co-authored-by: Daniel Karrasch <[email protected]>
cfe7620 to 9ec3005
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1249 +/- ##
==========================================
- Coverage 92.03% 92.01% -0.02%
==========================================
Files 34 34
Lines 15459 15526 +67
==========================================
+ Hits 14227 14287 +60
- Misses 1232 1239 +7
This is nice! I only wonder if we could "hang the generic code lower" in the dispatch hierarchy, follow the existing dispatch for longer and thereby avoid some code duplication. But this could be explored in another PR once this is in.
It can be done. I'd remove the restriction I'd also like to remove the line
So what do you think, should we merge and then you pick it up as time permits, or do you feel like you wanna work on it now?
I'd rather do it now.
This line dispatches to the optimized methods for 2x2 and 3x3 matrices. Removing this would lead to dispatching to BLAS calls, which are suboptimal for small matrices.
That's what I had in mind.
If we move the generic code further down the call path, this might be handled by methods up in the call path?
What about calling my generic code instead of BLAS for small matrices?
Done. Moved
I've added an implementation of syrk/herk for generic types, in order to avoid falling back to `_generic_matmatmul!`, as it's rather slow. I didn't do anything fancy, no multithreading or anything, but this gives a 1.5x to 2x speedup for `Int` and `BigFloat`, for example.
The generic version of syrk only works for real and complex numbers, but funnily enough herk works for anything that respects `conj(a*b) == conj(b)*conj(a)`, which as far as I can tell is any subtype of `Number`, including quaternions and octonions.
I've run the tests locally by reverting to the commit before the lazy JLLs one, and they pass.
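For readers who want to see the idea in code, here is a rough sketch of a generic herk-style kernel. It is not the implementation from this PR: `naive_herk!` is a hypothetical name, and the loop order, accumulation, and lack of blocking are simplifying assumptions. It computes only the upper triangle of `A * A'` and fills the lower triangle by conjugation, which is valid whenever `conj(a*b) == conj(b)*conj(a)` holds for the element type.

```julia
using LinearAlgebra

# Hypothetical helper, not the PR's code: fills C with A * A' by computing
# the upper triangle with a plain triple loop and mirroring it.
function naive_herk!(C::AbstractMatrix, A::AbstractMatrix)
    n, k = size(A)
    size(C) == (n, n) || throw(DimensionMismatch("C must be $(n)×$(n)"))
    for j in 1:n, i in 1:j
        acc = zero(eltype(C))
        for l in 1:k
            # Keep the factor order fixed; it matters for non-commutative
            # element types.
            acc += A[i, l] * conj(A[j, l])
        end
        C[i, j] = acc
        # The mirrored entry relies on conj(a*b) == conj(b)*conj(a).
        i == j || (C[j, i] = conj(acc))
    end
    return C
end

A = Complex{Int}[1+2im 3-1im; 0+1im 2+0im; 4-3im 1+1im]
C = naive_herk!(zeros(Complex{Int}, 3, 3), A)
@assert C == A * A'
@assert ishermitian(C)

B = BigFloat[1 2; 3 4]
@assert naive_herk!(zeros(BigFloat, 2, 2), B) == B * B'
```

Only about half of the products of a full matrix multiply are formed here, which is consistent with the kind of speedup reported above, although this sketch makes no performance claims of its own.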