Skip to content

open() fails for path to a v2 group in a RemoteStore #2444

Closed
@agoodm

Description

@agoodm

Zarr version

v3.0.0b

Numcodecs version

n/a

Python Version

3.11

Operating System

Mac

Installation

pip install -e .

Description

Using the standard open() to open a v2 group inside a RemoteStore fails.

Steps to reproduce

In [1]: import zarr

In [2]: zarr.open("s3://mur-sst/zarr/", zarr_version=2, storage_options=dict(anon=True))
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[2], line 1
----> 1 zarr.open("s3://mur-sst/zarr/", zarr_version=2, storage_options=dict(anon=True))

File ~/dev/zarr-python/src/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/dev/zarr-python/src/zarr/api/synchronous.py:77, in open(store, mode, zarr_version, zarr_format, path, **kwargs)
     67 @_deprecate_positional_args
     68 def open(
     69     store: StoreLike | None = None,
   (...)
     75     **kwargs: Any,  # TODO: type kwargs as valid args to async_api.open
     76 ) -> Array | Group:
---> 77     obj = sync(
     78         async_api.open(
     79             store=store,
     80             mode=mode,
     81             zarr_version=zarr_version,
     82             zarr_format=zarr_format,
     83             path=path,
     84             **kwargs,
     85         )
     86     )
     87     if isinstance(obj, AsyncArray):
     88         return Array(obj)

File ~/dev/zarr-python/src/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/dev/zarr-python/src/zarr/core/sync.py:100, in _runner(coro)
     95 """
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 """
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/dev/zarr-python/src/zarr/api/asynchronous.py:308, in open(store, mode, zarr_version, zarr_format, path, storage_options, **kwargs)
    305     return await open_group(store=store_path, zarr_format=zarr_format, mode=mode, **kwargs)
    307 try:
--> 308     return await open_array(store=store_path, zarr_format=zarr_format, **kwargs)
    309 except (KeyError, NodeTypeValidationError):
    310     # KeyError for a missing key
    311     # NodeTypeValidationError for failing to parse node metadata as an array when it's
    312     # actually a group
    313     return await open_group(store=store_path, zarr_format=zarr_format, **kwargs)

File ~/dev/zarr-python/src/zarr/api/asynchronous.py:1066, in open_array(store, zarr_version, zarr_format, path, storage_options, **kwargs)
   1063 zarr_format = _handle_zarr_version_or_format(zarr_version=zarr_version, zarr_format=zarr_format)
   1065 try:
-> 1066     return await AsyncArray.open(store_path, zarr_format=zarr_format)
   1067 except FileNotFoundError:
   1068     if store_path.store.mode.create:

File ~/dev/zarr-python/src/zarr/core/array.py:670, in AsyncArray.open(cls, store, zarr_format)
    647 """
    648 Async method to open an existing Zarr array from a given store.
    649 
   (...)
    667 <AsyncArray memory://... shape=(100, 100) dtype=int32>
    668 """
    669 store_path = await make_store_path(store)
--> 670 metadata_dict = await get_array_metadata(store_path, zarr_format=zarr_format)
    671 # TODO: remove this cast when we have better type hints
    672 _metadata_dict = cast(ArrayV3MetadataDict, metadata_dict)

File ~/dev/zarr-python/src/zarr/core/array.py:135, in get_array_metadata(store_path, zarr_format)
    131     zarray_bytes, zattrs_bytes = await gather(
    132         (store_path / ZARRAY_JSON).get(), (store_path / ZATTRS_JSON).get()
    133     )
    134     if zarray_bytes is None:
--> 135         raise FileNotFoundError(store_path)
    136 elif zarr_format == 3:
    137     zarr_json_bytes = await (store_path / ZARR_JSON).get()

FileNotFoundError: <RemoteStore(S3FileSystem, mur-sst/zarr)>

So the issue here is the open() generally defaults to open via open_array() and then switches to open_group() if the correct exception is handled. Sure enough this works if using open_group() or open_consolidated():

In [3]: zarr.open_consolidated("s3://mur-sst/zarr/", zarr_version=2, storage_options=dict(anon=True))
Out[3]: <Group <RemoteStore(S3FileSystem, mur-sst/zarr)>>

In [4]: zarr.open_group("s3://mur-sst/zarr/", zarr_version=2, storage_options=dict(anon=True))
Out[4]: <Group <RemoteStore(S3FileSystem, mur-sst/zarr)>>

Additional output

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugPotential issues with the zarr-python library

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions