Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc Zarr V3 bug fixes: open_group, open_consolidated and MemoryStoreV3 #1006

Merged
merged 8 commits into from
Apr 20, 2022

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Apr 11, 2022

The first commit here fixes a couple of bugs in the v3 convenience functions that were revealed when working on a branch of Xarray that adds tests with Zarr v3.

The rename method on MemoryStoreV3 was also fixed here (GroupV3 tests with this store type had inadvertently been left commented out in the previously merged PR).

Some outdated commments/TODOs were also cleaned up and comments were added explaining some cases where desired v3 behavior is not fully clear (e.g. test_path).

TODO:

  • Add unit tests and/or doctests in docstrings
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@pep8speaks
Copy link

pep8speaks commented Apr 11, 2022

Hello @grlee77! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-04-18 15:05:55 UTC

@codecov
Copy link

codecov bot commented Apr 11, 2022

Codecov Report

Merging #1006 (4b61acc) into master (c7494a9) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1006      +/-   ##
==========================================
- Coverage   99.95%   99.94%   -0.01%     
==========================================
  Files          33       33              
  Lines       13638    13663      +25     
==========================================
+ Hits        13632    13656      +24     
- Misses          6        7       +1     
Impacted Files Coverage Δ
zarr/__init__.py 100.00% <ø> (ø)
zarr/hierarchy.py 99.78% <ø> (-0.22%) ⬇️
zarr/convenience.py 100.00% <100.00%> (ø)
zarr/storage.py 100.00% <100.00%> (ø)
zarr/tests/test_convenience.py 100.00% <100.00%> (ø)
zarr/tests/test_core.py 100.00% <100.00%> (ø)
zarr/tests/test_hierarchy.py 100.00% <100.00%> (ø)

@joshmoore
Copy link
Member

joshmoore commented Apr 12, 2022

Any thoughts on:
Screen Shot 2022-04-12 at 12 57 12

Otherwise, at a glance, 👍 for the fixes. Need to get my head around some of them though.

grlee77 added 2 commits April 18, 2022 11:04
remove redundant test_consolidated_with_chunk_store

test open_group with store and chunk_store specified as string paths
@joshmoore
Copy link
Member

Green! 😄

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, @grlee77. Thanks! I'm merging so that we have this to validate other PRs but I've kicked off a few conversations that are likely of interest as V3 gets finalized.

@@ -1277,7 +1277,9 @@ def open_consolidated(store: StoreLike, metadata_key=".zmetadata", mode="r+", **
"""

# normalize parameters
store = normalize_store_arg(store, storage_options=kwargs.get("storage_options"), mode=mode)
zarr_version = kwargs.get('zarr_version', None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps more for a follow-up, but if this is getting extracted does it make sense to list as a parameter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that's not happening with storage_options either, so it probably is more of a higher-level aesthetic decision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine doing it either way.

path = 'dataset'

@pytest.mark.parametrize('stores_from_path', [False, True])
def test_consolidate_metadata(with_chunk_store, zarr_version, stores_from_path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outwith this PR, but this reminds me if v3 would make standardizing consolidated metadata easier. See zarr-developers/zarr-specs#113

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also /zarr-developers/zarr-specs#136 related to where this metadata would be stored

@@ -229,64 +236,81 @@ def test_consolidate_metadata(with_chunk_store, zarr_version):
arr[:] = 1.0
assert 16 == arr.nchunks_initialized

if stores_from_path:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to have in the test, but this is an interesting block, @grlee77. Can you explain it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tests below that do del store[key], but when the store is just a path string instead of a BaseStore class this will not work. So, the test uses z._store directly. I think I had originally just tried deleting the keys directly from the consolidated store, z, instead of accessing the private z._store, but this was not possible because ConsolidatedMetadataStore classes are read-only.

'g2/arr/.zattrs']
else:
assert isinstance(out._store, ConsolidatedMetadataStoreV3)
assert 'meta/root/consolidated/.zmetadata' in store
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, outwith this PR but back to zarr-developers/zarr-specs#113, could the v3 spec here simply replace all the individual JSON files solely with the consolidated metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be done, but haven't tried it. All store types are already somewhat consolidated for v3 since the entire meta folder tree is independent of the data one. I suppose if there are many arrays/groups it may still be preferred to further consolidate into one file, though? I would defer to those using the feature on what is the desired behavior.

# the expected key format gives a match
assert g2 == g1['foo/bar']

# TODO: Should presence of a trailing slash raise KeyError?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should likely get captured as an issue for the V3 final roadmap.

assert_array_equal(data, g["bar"])
else:
# TODO: How to access element created outside of group.path in v3?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should likely get captured as an issue for the V3 final roadmap.

@joshmoore joshmoore merged commit 0aaf6d5 into zarr-developers:master Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants