Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add generic syrk/herk #1249

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open

Conversation

araujoms
Copy link
Contributor

I've added an implementation of syrk/herk for generic types, in order to avoid falling back to _generic_matmatmul!, as it's rather slow. I didn't do anything fancy, no multithreading or anything, but this gives a 1.5x to 2x speedup for Int and BigFloat, for example.

The generic version of syrk only works for real and complex numbers, but funnily enough herk works for anything that respects conj(a*b) == conj(b)*conj(a), which as far as I can tell is any subtype of Number, including quaternions and octonions.

I've ran the tests locally by reverting to the commit before the lazy JLLs one, and they pass.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, better performance is always welcome. I've got some comments to explore the opportunity to make it even more generic.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to a small suggestion for the test, this LGTM.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should have a Quaternion test case?

@araujoms
Copy link
Contributor Author

Ok, I've added a Quaternion test. I also added a pseudo docstring since the non-commutativity is not obvious. I assume a pseudo docstring is more appropriate than a real one since this function is not going to be exported.

@jishnub
Copy link
Member

jishnub commented Mar 31, 2025

Btw there's no reason to not have docstrings anymore for internal functions, now that we have public.

@araujoms
Copy link
Contributor Author

Ok, added a docstring then.

@araujoms araujoms changed the title add generic sykr/herk add generic syrk/herk Mar 31, 2025
Copy link

codecov bot commented Apr 1, 2025

Codecov Report

Attention: Patch coverage is 98.52941% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.01%. Comparing base (e53b50c) to head (b979b01).

Files with missing lines Patch % Lines
src/matmul.jl 98.48% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1249      +/-   ##
==========================================
- Coverage   92.03%   92.01%   -0.02%     
==========================================
  Files          34       34              
  Lines       15459    15526      +67     
==========================================
+ Hits        14227    14287      +60     
- Misses       1232     1239       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice! I only wonder if we could "hang the generic code lower" in the dispatch hierarchy, follow the existing dispatch for longer and thereby avoid some code duplication. But this could be explored in another PR once this is in.

@araujoms
Copy link
Contributor Author

araujoms commented Apr 2, 2025

It can be done. I'd remove the restriction T<:BlasFloat from generic_matmatmul_wrapper!, and add methods to syrk_wrapper!, herk_wrapper!, and gemm_wrapper! for generic types.

I'd also like to remove the line matmul2x2or3x3_nonzeroalpha!(C, tA, tB, A, B, α, β) && return C from generic_matmatmul_wrapper!, because that breaks the feature of syrk/herk that the result is always symmetric/Hermitian.

@dkarrasch
Copy link
Member

So what do you think, should we merge and then you pick it up as time permits, or do you feel like you wanna work on it now?

@araujoms
Copy link
Contributor Author

araujoms commented Apr 2, 2025

I'd rather do it now.

@jishnub
Copy link
Member

jishnub commented Apr 2, 2025

I'd also like to remove the line matmul2x2or3x3_nonzeroalpha!(C, tA, tB, A, B, α, β) && return C from generic_matmatmul_wrapper!

This line dispatches to the optimized methods for 2x2 and 3x3 matrices. Removing this would lead to dispatching to BLAS calls, which are suboptimal for small matrices.

@dkarrasch
Copy link
Member

It can be done. I'd remove the restriction T<:BlasFloat from generic_matmatmul_wrapper!, and add methods to syrk_wrapper!, herk_wrapper!, and gemm_wrapper! for generic types.

That's what I had in mind.

This line dispatches to the optimized methods for 2x2 and 3x3 matrices. Removing this would lead to dispatching to BLAS calls, which are suboptimal for small matrices.

If we move the generic code further down the call path, this might be handled by methods up in the call path?

@araujoms
Copy link
Contributor Author

araujoms commented Apr 2, 2025

What about calling my generic code instead of BLAS for small matrices?

@araujoms
Copy link
Contributor Author

araujoms commented Apr 2, 2025

Done. Moved matmul2x2or3x3_nonzeroalpha! down for gemm_wrapper!, and replaced it with my generic code for syrk_wrapper! and herk_wrapper!.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants