-
-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
misc Zarr V3 bug fixes: open_group, open_consolidated and MemoryStoreV3 #1006
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1006 +/- ##
==========================================
- Coverage 99.95% 99.94% -0.01%
==========================================
Files 33 33
Lines 13638 13663 +25
==========================================
+ Hits 13632 13656 +24
- Misses 6 7 +1
|
remove redundant test_consolidated_with_chunk_store test open_group with store and chunk_store specified as string paths
Green! 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, @grlee77. Thanks! I'm merging so that we have this to validate other PRs but I've kicked off a few conversations that are likely of interest as V3 gets finalized.
@@ -1277,7 +1277,9 @@ def open_consolidated(store: StoreLike, metadata_key=".zmetadata", mode="r+", ** | |||
""" | |||
|
|||
# normalize parameters | |||
store = normalize_store_arg(store, storage_options=kwargs.get("storage_options"), mode=mode) | |||
zarr_version = kwargs.get('zarr_version', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps more for a follow-up, but if this is getting extracted does it make sense to list as a parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that's not happening with storage_options
either, so it probably is more of a higher-level aesthetic decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine doing it either way.
path = 'dataset' | ||
|
||
@pytest.mark.parametrize('stores_from_path', [False, True]) | ||
def test_consolidate_metadata(with_chunk_store, zarr_version, stores_from_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outwith this PR, but this reminds me if v3 would make standardizing consolidated metadata easier. See zarr-developers/zarr-specs#113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also /zarr-developers/zarr-specs#136 related to where this metadata would be stored
@@ -229,64 +236,81 @@ def test_consolidate_metadata(with_chunk_store, zarr_version): | |||
arr[:] = 1.0 | |||
assert 16 == arr.nchunks_initialized | |||
|
|||
if stores_from_path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine to have in the test, but this is an interesting block, @grlee77. Can you explain it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are tests below that do del store[key]
, but when the store is just a path string instead of a BaseStore
class this will not work. So, the test uses z._store
directly. I think I had originally just tried deleting the keys directly from the consolidated store, z
, instead of accessing the private z._store
, but this was not possible because ConsolidatedMetadataStore
classes are read-only.
'g2/arr/.zattrs'] | ||
else: | ||
assert isinstance(out._store, ConsolidatedMetadataStoreV3) | ||
assert 'meta/root/consolidated/.zmetadata' in store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, outwith this PR but back to zarr-developers/zarr-specs#113, could the v3 spec here simply replace all the individual JSON files solely with the consolidated metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be done, but haven't tried it. All store types are already somewhat consolidated for v3 since the entire meta
folder tree is independent of the data
one. I suppose if there are many arrays/groups it may still be preferred to further consolidate into one file, though? I would defer to those using the feature on what is the desired behavior.
# the expected key format gives a match | ||
assert g2 == g1['foo/bar'] | ||
|
||
# TODO: Should presence of a trailing slash raise KeyError? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely get captured as an issue for the V3 final roadmap.
assert_array_equal(data, g["bar"]) | ||
else: | ||
# TODO: How to access element created outside of group.path in v3? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely get captured as an issue for the V3 final roadmap.
The first commit here fixes a couple of bugs in the v3 convenience functions that were revealed when working on a branch of Xarray that adds tests with Zarr v3.
The
rename
method on MemoryStoreV3 was also fixed here (GroupV3 tests with this store type had inadvertently been left commented out in the previously merged PR).Some outdated commments/TODOs were also cleaned up and comments were added explaining some cases where desired v3 behavior is not fully clear (e.g.
test_path
).TODO: