Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking that the store is an instance of dict seem incorrect. #636

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions docs/tutorial.rst
Original file line number Diff line number Diff line change
Expand Up @@ -802,7 +802,7 @@ Here is an example using S3Map to read an array created previously::
array([b'H', b'e', b'l', b'l', b'o', b' ', b'f', b'r', b'o', b'm', b' ',
b't', b'h', b'e', b' ', b'c', b'l', b'o', b'u', b'd', b'!'],
dtype='|S1')
>>> z[:].tostring()
>>> z[:].tobytes()
b'Hello from the cloud!'

Zarr now also has a builtin storage backend for Azure Blob Storage.
Expand Down Expand Up @@ -841,11 +841,11 @@ store. E.g.::
>>> z = root['foo/bar/baz']
>>> from timeit import timeit
>>> # first data access is relatively slow, retrieved from store
... timeit('print(z[:].tostring())', number=1, globals=globals()) # doctest: +SKIP
... timeit('print(z[:].tobytes())', number=1, globals=globals()) # doctest: +SKIP
b'Hello from the cloud!'
0.1081731989979744
>>> # second data access is faster, uses cache
... timeit('print(z[:].tostring())', number=1, globals=globals()) # doctest: +SKIP
... timeit('print(z[:].tobytes())', number=1, globals=globals()) # doctest: +SKIP
b'Hello from the cloud!'
0.0009490990014455747

Expand Down
3 changes: 2 additions & 1 deletion zarr/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
import operator
import re
from functools import reduce
from collections.abc import MutableMapping

import numpy as np
from numcodecs.compat import ensure_bytes, ensure_ndarray
Expand Down Expand Up @@ -1955,7 +1956,7 @@ def _encode_chunk(self, chunk):
cdata = chunk

# ensure in-memory data is immutable and easy to compare
if isinstance(self.chunk_store, dict):
if isinstance(self.chunk_store, MutableMapping):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned upthread, this change is too broad and may force unnecessary copies. The goal of this check was to make sure there would not be unexpected mutation of in-memory data so would copy to bytes before storing. If we are writing to disk or sending to a cloud store, this is unneeded. Perhaps the check needs to be updated, but maybe that can be handled as part of Greg's work?

cdata = ensure_bytes(cdata)
Carreau marked this conversation as resolved.
Show resolved Hide resolved

return cdata
Expand Down
4 changes: 2 additions & 2 deletions zarr/storage.py
Original file line number Diff line number Diff line change
Expand Up @@ -1973,11 +1973,11 @@ class LRUStoreCache(MutableMapping):
>>> z = root['foo/bar/baz'] # doctest: +REMOTE_DATA
>>> from timeit import timeit
>>> # first data access is relatively slow, retrieved from store
... timeit('print(z[:].tostring())', number=1, globals=globals()) # doctest: +SKIP
... timeit('print(z[:].tobytes())', number=1, globals=globals()) # doctest: +SKIP
b'Hello from the cloud!'
0.1081731989979744
>>> # second data access is faster, uses cache
... timeit('print(z[:].tostring())', number=1, globals=globals()) # doctest: +SKIP
... timeit('print(z[:].tobytes())', number=1, globals=globals()) # doctest: +SKIP
b'Hello from the cloud!'
0.0009490990014455747

Expand Down