-
-
Notifications
You must be signed in to change notification settings - Fork 326
Fix specifying memory order in v2 arrays #2951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
this looks good! and my explanation from #2950 was wrong -- this has nothing to do with the global config, it's just buggy code (that i think I wrote). |
So I updated the test, and it now passes a bit more. But the check that the actual array data is F-order now fails, so it looks like the data still isn't (and never was) actually being written in F-order. More digging needed... |
@pytest.mark.parametrize("array_order", ["C", "F"]) | ||
@pytest.mark.parametrize("data_order", ["C", "F"]) | ||
@pytest.mark.parametrize("memory_order", ["C", "F"]) | ||
def test_v2_non_contiguous( | ||
array_order: Literal["C", "F"], data_order: Literal["C", "F"], memory_order: Literal["C", "F"] | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@normanrz you originally wrote this test in #2679, but I don't understand why there are three different types of order, and there's no documentation, or even comments on this test explaining.
I've therefore chosen to be a little bit bold, and for at least v2 data this PR now assumes there is only one order. If we need to deal with different types of orders, I'm certainly open to that, but that increased complexity probably needs to come with some very good user docs, given in zarr-python v2 there was only one type of memory order.
Since you originally wrote this test, I wanted to see what you thought about this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array_order = order in the metadata
data_order = order of the numpy array that is used to write
memory_order = order in the runtime config which is used for reading data, ie the order of the output numpy array
At least the first two are necessary for this test. If you want to change the behavior to always read in array_order, then you can drop the memory_order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, but don't we just need to parametrize with one order? We pass that one order parameter to the array creation routine, and then check it's correct in both the metadata and the order the data is stored in memory, right (which is how the test works currently in this PR)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider this: You create a Zarr array with F order. Next you create a numpy array with C order. You write the numpy array in the Zarr array. This test makes sure that the data on-disk is in F order. That's why you need at least 2 orders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, that makes sense - thanks!
Fixes #2950 - still needs some tests, but opening to see if this breaks any other existing tests first.
TODO:
docs/user-guide/*.rst
changes/