I discovered zarr a few days ago, just after v3 was published, and I'm trying to use it in a multiprocessing context: one process writes numeric data as well as variable-length string data into a persistent file, and a reader process reads the newly arrived data from that file.
The aim is to exchange data and store it persistently at the same time.
I tried to build a minimal working example (see Steps to reproduce), but more often than not, reading from the zarr files fails with the following exception:
Traceback (most recent call last):
  File "C:\tools\Python\3.12\3.12.2-win64\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "...\scratch_3.py", line 26, in run
    text_dset = root['text_data']
                ~~~~^^^^^^^^^^^^^
  File "...\site-packages\zarr\core\group.py", line 1783, in __getitem__
    obj = self._sync(self._async_group.getitem(path))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\site-packages\zarr\core\sync.py", line 187, in _sync
    return sync(
           ^^^^^
  File "...\site-packages\zarr\core\sync.py", line 142, in sync
    raise return_result
  File "...\site-packages\zarr\core\sync.py", line 98, in _runner
    return await coro
           ^^^^^^^^^^
  File "...\site-packages\zarr\core\group.py", line 681, in getitem
    zarr_json = json.loads(zarr_json_bytes.to_bytes())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\tools\Python\3.12\3.12.2-win64\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\tools\Python\3.12\3.12.2-win64\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\tools\Python\3.12\3.12.2-win64\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Is this a bug in v3, is it not yet ready for multiprocessing, or am I making a mistake here?
Sadly, the v3 docs don't really describe how to use zarr in a multiprocessing context, so it's possible I'm missing something.
We haven't explicitly tested zarr-python 3 with multiprocessing, but I don't see any reason why there should be any particular problems, because at least with the LocalStore zarr-python doesn't rely on holding any file handles open.
That being said, I don't really understand the architecture of your program. From the error, it looks like the reader is trying to access a zarr.json document that is empty. Since the run method on your ZarrWriter class opens root with overwrite=True, but the run method on your ZarrReader class opens the same group with mode = r, it's possible that you have a race condition here. You may need to poll the state of the zarr.json document before trying to open it.
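As a rough illustration of the polling idea, here is a minimal sketch that uses only the standard library. It assumes the store is a plain directory (as with LocalStore) in which each node's metadata lives at `<store>/<node>/zarr.json`. The helper name `wait_for_metadata` and its `timeout` and `interval` parameters are made up for this example and are not part of zarr.

```python
import json
import time
from pathlib import Path


def wait_for_metadata(store_path, node="", timeout=5.0, interval=0.05):
    """Wait until <store_path>/<node>/zarr.json contains a JSON object.

    Hypothetical helper, not a zarr API. Returns the parsed document,
    or raises TimeoutError if no valid document appears in time.
    """
    meta_path = Path(store_path) / node / "zarr.json"
    deadline = time.monotonic() + timeout
    while True:
        try:
            doc = json.loads(meta_path.read_text(encoding="utf-8"))
            if isinstance(doc, dict):
                return doc
        except (OSError, ValueError):
            # Missing, locked, empty, or half-written file: try again.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no valid metadata at {meta_path}")
        time.sleep(interval)
```

A reader process could call `wait_for_metadata(store_path, "text_data")` before opening the group. This only narrows the window: if the writer rewrites the metadata again after the check (for example because it reopens the group with overwrite=True), the reader can still see an empty file.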
To avoid these kinds of issues, I would create your zarr hierarchy in synchronous code as much as possible, before starting the worker processes (writing a few JSON documents doesn't benefit from multiprocessing anyway).
Zarr version
3.0.1
Numcodecs version
0.15.0
Python Version
3.12.2
Operating System
Windows 11 22H2
Installation
using pip into virtual environment
Steps to reproduce
Additional output
No response