-
-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Zarr V3 protocol #898
Conversation
Define the StoreV3 class and create v3 versions of most existing stores Add a test_storage_v3.py with test classes inheriting from their v2 counterparts. Only a subset of methods involving differences in v3 behavior were overridden.
Codecov Report
@@ Coverage Diff @@
## master #898 +/- ##
=========================================
Coverage 99.94% 99.95%
=========================================
Files 32 33 +1
Lines 11256 13622 +2366
=========================================
+ Hits 11250 13616 +2366
Misses 6 6
|
49163e0
to
0540124
Compare
0540124
to
b13a6b3
Compare
de243b2
to
266ab58
Compare
zarr_version should not be in the array metadata, only the base store metadata compressor should be absent when there is no compression
266ab58
to
99d9632
Compare
classmethods adapted from zarrita code
99d9632
to
37858e1
Compare
37858e1
to
49a42b5
Compare
avoid pytest error about missing fixture fix flake8 error related to zarr_version fixture
update hexdigests
49a42b5
to
60e3560
Compare
No need for this class as DirectoryStoreV3 with / chunk separator can be used instead
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking nice, @grlee77! A number of mostly aesthetic points below after an initial read. Don't think any of them are MUSTs before merging this into a dev branch. (See zarr-developers/governance#15)
The only other high-level comment, also aesthetic, is whether zarr._storage.v3
(or similar) would be appropriate earlier rather than later.
zarr/_storage/absstore.py
Outdated
@@ -209,3 +205,63 @@ def getsize(self, path=None): | |||
|
|||
def clear(self): | |||
self.rmdir() | |||
|
|||
|
|||
def rmdir_abs(store: ABSStore, path=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not a method on the store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why I did it that way. I can make this ABSStore.rmdir
and just call it from there in ABSStoreV3.rmdir
# assume path already normalized | ||
prefix = _path_to_prefix(path) | ||
for key in list(store.keys()): | ||
if key.startswith(prefix): | ||
del store[key] | ||
|
||
|
||
def _rmdir_from_keys_v3(store: StoreV3, path: str = "") -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Knowing how this module is growing, is there any place relatively convenient that we could "hide" v3-specific methods?
key == 'zarr.json') | ||
|
||
# cannot create a group without a path in v3 | ||
# so create /meta/root/consolidated group to store the metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of "/meta/root/consolidated" likely needs review at the zarr-specs level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened zarr-developers/zarr-specs#136 so we don't forget
zarr/hierarchy.py
Outdated
store = _normalize_store_arg(store) | ||
store = _normalize_store_arg(store, zarr_version=zarr_version) | ||
if zarr_version is None: | ||
zarr_version = getattr(store, '_store_version', 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth planning now for when the default swaps from 2 to 3? Either by using a constant or by documenting the locations while you have it in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I added DEFAULT_ZARR_VERSION
(currently set to 2) in place of 2 in these checks.
It is intended as a constant used by developers, not something a user would set in their code to select the default version.
These define the root path for metadata and data, respectively
Still use a default set of metadata in __init__ method of Group or Array classes. Add a _get_metadata_suffix helper that defaults to '.json' if metadata is not present.
default metadata already gets added by Metadata3.encode_hierarchy_metadata when meta=None
Anything else planned from your side, @grlee77 ? |
Before release we will need to decide on where we want the import path for v3 Stores to be (currently in the same file as the V2 ones). We can think about that and a possible environment variable to enable/disable v3 visibility in a followup, though. |
👍 for handling the refactoring and v3-switch in follow ups. As discussed during the past two community calls, getting this in to the mainline. 🎉 (And as promised for @grlee77: 🍪) Clear warning for @zarr-developers/core-devs : the mainline is now in an unreleasable See zarr-developers/governance#15 for some background. |
This PR builds on #897, adding zarr v3 support
to most of the high-level convenience functions (I haven't looked at the consolidated metadata yet).EDIT: I closed the prior v3 PRs leading up to this one as their commits are also included here along with additional bug fixes and tests.
TODO: