Exploring SciPy sparse array migration from sparse matrices #785
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #785 +/- ##
=======================================
- Coverage 85.4% 85.4% -0.0%
=======================================
Files 150 150
Lines 15989 16036 +47
=======================================
+ Hits 13655 13688 +33
- Misses 2334 2348 +14
This looks good to me. Regarding the decision to make, we mostly need to figure out how to migrate cc @pedrovma
I agree that the property The As for providing user control of the property, one proposal is to provide a temporary keyword arg to the If there isn't an ecosystem problem/discussion about
I didn't even know we have
Two comments on specific places in the code (details below):
- The PR's handling using matrix_power only works with scipy 1.12 or later, but there is a workaround shown below.
- The line flagged as untested could be tested by running the same test with row_st set.
Should I put (either of) these changes into this PR? I should also remove the conftest tool. I'll put it into the scipy migration to sparray document and link to it here. Actually I could put it into a PR/issue comment too. Maybe an issue on the pysal/pysal repo?
libpysal/weights/util.py
Outdated
-    wk = sum(w**x for x in range(1, k + 1))
+    wk = sum(sparse.linalg.matrix_power(w, k) for k in range(1, k+1))
     shortest_path = False
 else:
-    wk = w**k
+    wk = sparse.linalg.matrix_power(w, k)
These changes to the matrix power computation cause errors to be raised in the "oldest" CI run. That makes sense because scipy.sparse.linalg.matrix_power is a scipy 1.12 feature. We can work around using this matrix_power function if you want to keep the minimum scipy requirement at 1.8.
if lower_order:
    # accumulate w + w^2 + ... + w^k using only `@`
    wk = w.copy()
    wpow = w.copy()
    for _ in range(k - 1):
        wpow = wpow @ w
        wk = wk + wpow
    shortest_path = False
else:
    # work around for scipy.sparse.linalg.matrix_power needed until scipy1.12
    wk = w.copy()
    x = 1
    while 2 * x < k:
        wk = wk @ wk
        x *= 2
    while x < k:
        wk = wk @ w
        x += 1
BTW the graph interface uses scipy.sparse.linalg.matrix_power, so that interface will not run for scipy<1.12. Another reason to shift the min scipy version I guess.
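One way to sanity-check a workaround like this is to compare it against dense NumPy results. The sketch below is only an illustration: `sparse_power` and `sparse_power_sum` are hypothetical helper names (not libpysal functions), and the 3x3 matrix is made-up test data. It uses only `@`, so it should behave the same for spmatrix and sparray inputs.

```python
import numpy as np
from scipy import sparse


def sparse_power(w, k):
    """Return w multiplied by itself k times (k >= 1), using only `@`."""
    wk = w.copy()
    x = 1  # wk currently holds w^x
    # Square while that does not overshoot k ...
    while 2 * x < k:
        wk = wk @ wk
        x *= 2
    # ... then multiply by w for the remaining k - x factors.
    while x < k:
        wk = wk @ w
        x += 1
    return wk


def sparse_power_sum(w, k):
    """Return w + w^2 + ... + w^k (k >= 1), using only `@` and `+`."""
    total = w.copy()
    term = w.copy()  # term holds the most recent power of w
    for _ in range(k - 1):
        term = term @ w
        total = total + term
    return total


# Made-up adjacency matrix of a 3-node path graph.
dense = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
w = sparse.csr_matrix(dense)

for k in range(1, 7):
    expected = np.linalg.matrix_power(dense, k)
    assert (sparse_power(w, k).toarray() == expected).all()
```

Once the minimum SciPy version is 1.12 or later, `sparse_power(w, k)` could presumably be replaced by `scipy.sparse.linalg.matrix_power(w, k)`.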
@@ -1225,7 +1225,7 @@ def lat2SW(nrows=3, ncols=5, criterion="rook", row_st=False):
     m = sparse.dia_matrix((data, offsets), shape=(n, n), dtype=np.int8)
     m = m + m.T
     if row_st:
-        m = sparse.spdiags(1.0 / m.sum(1).T, 0, *m.shape) * m
+        m = sparse.dia_matrix(((1.0 / m.sum(1).T), [0]), shape=m.shape) @ m
I can add [have added] a test that calls lat2SW with row_st=True to make sure this is working/tested. I tried it locally and the line is correct.
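For readers who want to see what such a check could look like, here is a minimal sketch of the row-standardization step in isolation. It does not call lat2SW; `row_standardize` is a hypothetical helper and the small matrix is made-up data. It assumes every row has a nonzero sum.

```python
import numpy as np
from scipy import sparse


def row_standardize(m):
    """Scale each row of a sparse matrix so it sums to 1.

    Hypothetical helper mirroring the diagonal-scaling idiom in the diff;
    assumes no row sums to zero.
    """
    # np.asarray(...).ravel() handles both the 2-D np.matrix returned by
    # spmatrix.sum and the 1-D array returned by sparray.sum.
    row_sums = np.asarray(m.sum(axis=1)).ravel()
    scale = sparse.dia_matrix((1.0 / row_sums, [0]), shape=m.shape)
    # `@` is matrix multiplication for both spmatrix and sparray.
    return scale @ m


# Made-up adjacency matrix of a 3-node path graph.
adjacency = sparse.csr_matrix(np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]))
standardized = row_standardize(adjacency)

# Every row of the result should sum to 1.
row_totals = np.asarray(standardized.sum(axis=1)).ravel()
assert np.allclose(row_totals, 1.0)
```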
switch from isspmatrix_csr to issparse and format='csr'
febaf09 to 8881bf4
I have removed Draft Status for this PR -- it is ready for review as a "pass 1" part of migration to sparray. ("Pass 1", as discussed in the SciPy migration to sparray guide, means making sure the code can work with either spmatrix or sparray.) This PR is constructed to work with the current minimum SciPy version of 1.8. But a couple of places will be simplified by bumping the minimum version of SciPy to 1.12. These places are commented as such with the one-line replacement code and the version of scipy that it needs. This mostly involves replacing
This PR adds a test of the
The remaining failure in the "Oldest" CI seems to be unrelated to the changes here.
This PR explores changes needed for migrating to SciPy sparse arrays instead of sparse matrices. The migration guide suggests that the process be done in 2 "passes", and so probably in 2 PRs. This PR implements "Pass 1", where we find uses of * and ** and other spmatrix idioms and replace them with e.g. @ and other idioms that work for both spmatrix and sparray.
Specifically, this PR:
- adds a conftest.py file that monkey-patches SciPy to raise whenever spmatrix objects use * and other idioms.
- replaces those idioms with @ and other idioms that work for both spmatrix and sparray.
The additional conftest.py file should not be merged to your codebase -- but I wanted to show what that file would look like because you have many repos to be converted -- hence the "Draft" nature of this PR. We should revert that commit before merging this. Running pytest locally with this file in place raises exceptions until the fixes are in place. If there's a better place to store this conftest.py file let me know.
SciPy version: While sparray is implemented in SciPy 1.8, the constructor functions like random_array() are not available until 1.12. There are workarounds for the constructor functions. But you should think about bumping the minimum version to SciPy 1.12.
More about the "harder" decisions needed for this migration:
Note that "pass 1" doesn't actually switch anything to sparray. The goal is code that will work with sparray sparse objects. The next step is "pass 2", where we change to using sparray internally. It will require deciding what to do about functions that currently return spmatrix objects.
For example:
as_spmatrix=True -- or find some other way to determine whether to return sparray or spmatrix.
If the library uses any sparse libraries that require int32 index arrays for CSR index arrays, we will need to pay attention to potential changes. The spmatrix and sparray APIs choose index array dtypes slightly differently. So let me know if you use any 32-bit sparse libraries. E.g. some scipy.sparse.csgraph functions, pyamg and others.
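To make the motivation for "pass 1" concrete, the following small sketch (with made-up data) shows why `*` has to be replaced: it means matrix multiplication for spmatrix but elementwise multiplication for sparray, whereas `@` means matrix multiplication for both.

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 2], [3, 4]])
as_matrix = sparse.csr_matrix(dense)  # spmatrix interface
as_array = sparse.csr_array(dense)    # sparray interface

# `*` is matrix multiplication for spmatrix but elementwise for sparray.
star_matrix = (as_matrix * as_matrix).toarray()
star_array = (as_array * as_array).toarray()

# `@` is matrix multiplication for both interfaces.
at_matrix = (as_matrix @ as_matrix).toarray()
at_array = (as_array @ as_array).toarray()

assert (star_matrix == dense @ dense).all()  # matrix product
assert (star_array == dense * dense).all()   # elementwise product
assert (at_matrix == at_array).all()         # same result either way
```

Code written with `@` therefore gives the same answer regardless of which kind of sparse object it receives, which is the property "pass 1" is after.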