-
Notifications
You must be signed in to change notification settings - Fork 40
Fill value lost with Icechunk #343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I was able to work around this in the following way: from xarray.backends.zarr import FillValueCoder
import numpy as np
import zarr
coder = FillValueCoder()
group = zarr.open_group(store)
group['foo'].attrs['_FillValue'] = coder.encode(ds.foo.encoding['_FillValue'], np.dtype('f4'))
store.commit("fixed fill value") |
@rabernat I am trying to understand the expected behavior for what the VirtualiZarr readers like dmrpp should return for Looking at https://github.com/pydata/xarray/blob/49502fcde4db6ea3da1f60ead589580cfdad5c98/xarray/backends/zarr.py#L787-L796 it seems like Testing in this notebook: https://gist.github.com/abarciauskas-bgse/7b0cad2dc2808a5ee7fc0676b59cf1a5 cc @ayushnag @sharkinsspatial @TomNicholas in case you can see what I'm doing wrong. |
Closed by #420 |
This issue is mostly likely related to the different treatment of missing data in Xarray with Zarr V3 vs. V2 (see pydata/xarray#5475). I don't think it's specific to Icechunk, but rather is related to Zarr V3. However, it can only be reproduced with IC afaik because only IC uses V3 in VirtualiZarr.
Note: this requires
pip install git+https://github.com/mpiannucci/kerchunk@v3
To reproduce, first create test data
Next, create an Icechunk virtual dataset
Now read it back
As you can see, there is no NaN in the first value.
I'm not really sure how encoding works in VirtualiZarr, but I'd be happy to explain the relevant changes with Xarray's handling of fill value in Zarr V3 format (see e.g. https://github.com/pydata/xarray/blob/49502fcde4db6ea3da1f60ead589580cfdad5c98/xarray/backends/zarr.py#L787-L796).
cc @mpiannucci
The text was updated successfully, but these errors were encountered: