Skip to content

Sparse mapping refactor #63

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 17, 2025
Merged

Sparse mapping refactor #63

merged 14 commits into from
Jul 17, 2025

Conversation

lkdvos
Copy link
Contributor

@lkdvos lkdvos commented Jul 16, 2025

This PR is a slightly more careful implementation of the sparse mapping functionality.

The core logic consists of using a trait ZeroPreserving for determining whether or not a function preserves zeros.

There are 3 modes:

  • NonPreserving : map over eachindex
  • WeakPreserving : map over each index where not all input values are 0.
  • StrongPreserving : map over each index where not any of the input values are 0.

The logic seems to work as intended, and is decently flexible, but there are some questions left to tackle.

Currently, I still didn't handle dealing with cases where zero(eltype(T)) does not exist. In principle I am okay with getting rid of this fallback definition, but that leads to a bunch of required specializations on top of the simple + - and * due to these functions appearing in a wrapped form from the way broadcasting is passed to map.

Also, there is a very interesting problem associated to map!(f, C, As...) where any or all of the As have overlapping memory with C. Since we have to zero out the memory in general cases, as we might not be mapping over all indices, we also have to make a copy.
This is handled by unalias, but might give unwanted copies:

For example: map!(identity, C1, C2) with C1 = C2 would now lead to first copying C2, than zeroing out C1, and finally moving the copied C2 into C1. This happens for example with scalar multiplications of the form C .*= beta, in sparse_mul.
I can manually correct this behavior there, but just wanted to mention this.

I'll finish with the cleanup of the other functions etc after a first round of review?

@lkdvos lkdvos requested a review from mtfishman July 16, 2025 14:59
Copy link

codecov bot commented Jul 16, 2025

Codecov Report

Attention: Patch coverage is 63.51351% with 27 lines in your changes missing coverage. Please review.

Project coverage is 74.38%. Comparing base (00cac74) to head (8214b8f).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/map.jl 75.43% 14 Missing ⚠️
src/indexing.jl 0.00% 13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
- Coverage   76.70%   74.38%   -2.33%     
==========================================
  Files           8        9       +1     
  Lines         601      648      +47     
==========================================
+ Hits          461      482      +21     
- Misses        140      166      +26     
Flag Coverage Δ
docs 34.36% <47.29%> (-1.37%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mtfishman
Copy link
Member

Can you bump the version?

@lkdvos
Copy link
Contributor Author

lkdvos commented Jul 16, 2025

I'm not entirely sure how I managed to locally get the BlockSparseArray tests to pass, as it seems like the downstream tests here are failing. Having a closer look here, it does seem to be the case that this is a bit of a breaking change, and I'm not sure if it is worth it to try to make it non-breaking. Let me know if you disagree though, I don't think it is impossible, but it is definitely cleaner to depend on eachstoredindex(indexstyle, ...) to exist.

@mtfishman
Copy link
Member

So is the idea that this PR now expects eachstoredindex(::IndexStyle, A) to be defined, but some dowstream packages don't define it? Could we define a fallback eachstoredindex(A) = eachstoredindex(IndexStyle(A), A)?

@lkdvos
Copy link
Contributor Author

lkdvos commented Jul 17, 2025

The problem is actually the other way around: the downstream packages define eachstoredindex(A) but not eachstoredindex(::IndexStyle, A).
In order to fix this, I could try to make eachstoredindex(::IndexStyle, A) call eachstoredindex(A) as a fallback, and convert if necessary, but I think this might then lead to loops in our default implementations (eachstoredindex(A) = eachstoredindex(IndexStyle(A), A) = CartesianIndices(A)[eachstoredindex(A)], or something similar), which I might be able to intercept, but I'm not sure this amount of complexity is worth it, and we should just have the downstream packages define the correct functions instead.

@mtfishman
Copy link
Member

Got it. So in this design, if an array type X has IndexStyle(A::X) = IndexCartesian(), it should overload eachstoredindex(style::IndexCartesian, A::X) = ... and then eachstoredindex(A::X) will work? Will eachstoredindex(style::IndexLinear, A::X) be defined by default?

@lkdvos
Copy link
Contributor Author

lkdvos commented Jul 17, 2025

Yes, this is part of the changes I have here. I've implemented eachstoredindex(::IndexStyle, A::X) where the IndexStyle is "non-canonical" to have a default in terms of the "canonical" style, and eachstoredindex(A) falls back to the canonical style as well, but the thing to overload would have to be eachstoredindex(::IndexStyle, A::X)

@mtfishman
Copy link
Member

Yes, this is part of the changes I have here. I've implemented eachstoredindex(::IndexStyle, A::X) where the IndexStyle is "non-canonical" to have a default in terms of the "canonical" style, and eachstoredindex(A) falls back to the canonical style as well, but the thing to overload would have to be eachstoredindex(::IndexStyle, A::X)

I think the part that is confusing me is that I don't see the logic where if only eachstoredindex(::IndexCartesian, A::X), eachstoredindex(::IndexLinear, A::X) calls out to that.

@mtfishman
Copy link
Member

Looks good to me, ready to merge?

@lkdvos
Copy link
Contributor Author

lkdvos commented Jul 17, 2025

Good to go for me!

@mtfishman mtfishman merged commit 1dd8127 into main Jul 17, 2025
14 of 16 checks passed
@mtfishman mtfishman deleted the sparsemap branch July 17, 2025 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants