What happened?
Upgrading numpy from 1.26.4 to 2.1.2 breaks my code. I went through several pages of issues looking for "concat", but none seemed to fit.
The xr.concat method applied to a list of DataArrays that are to be concatenated along a scalar coordinate no longer seems to work.
When the DataArrays are created, xarray used to convert a scalar coord of np.str_ type to a numpy array with dtype <U.... This conversion seems to be gone, and without it my code no longer works.
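Concretely, the scalar coordinate used to end up stored as a 0-d `<U36` array rather than a bare string. A minimal numpy-only sketch of the two representations (the UUID value is only illustrative):

```python
import uuid
import numpy as np

s = np.str_(uuid.uuid4())   # a str subclass, not an ndarray
arr = np.asarray(s)         # what the coord used to be converted to

print(isinstance(s, str), isinstance(s, np.ndarray))  # True False
print(arr.dtype)   # <U36 (a UUID string is 36 characters)
print(arr.ndim)    # 0
```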
Instead a rather cryptic error message appears (full traceback below, here the last bit):
File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/variable.py:1387, in Variable.set_dims(self, dim, shape)
1385 else:
1386 indexer = (None,) * (len(expanded_dims) - self.ndim) + (...,)
-> 1387 expanded_data = self.data[indexer]
1389 expanded_var = Variable(
1390 expanded_dims, expanded_data, self._attrs, self._encoding, fastpath=True
1391 )
1392 return expanded_var.transpose(*dim)
TypeError: string indices must be integers, not 'tuple'
self.data with latest numpy is just a string version of a UUID (was formerly converted to a numpy array) and the indexer is (None, Ellipsis).
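The failing line can be reproduced with numpy alone: the `(None, Ellipsis)` indexer that `set_dims` builds works on a 0-d array but not on a plain string scalar (a sketch; the string value is illustrative):

```python
import numpy as np

indexer = (None, Ellipsis)  # what set_dims builds for a 0-d variable

# On a 0-d <U array, None adds the new axis and Ellipsis covers the rest:
arr = np.asarray(np.str_("some-uuid"))
print(arr[indexer].shape)  # (1,)

# On a bare np.str_ (a str subclass), str.__getitem__ receives the tuple
# and raises the same error as the traceback:
try:
    np.str_("some-uuid")[indexer]
except TypeError as e:
    print(e)
```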
What did you expect to happen?
In contrast to the output posted below in the "Minimal Complete Verifiable Example" and "Relevant log output", I expected this output that I get with numpy version 1.26.4:
# Python 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)]
# Type 'copyright', 'credits' or 'license' for more information
# IPython 8.28.0 -- An enhanced Interactive Python. Type '?' for help.
import xarray as xr
import numpy as np
import uuid

xr.__version__  # '2024.9.0'
np.__version__  # '2.1.2'

xarr = xr.DataArray(
    np.array([1.0] * 3, dtype=np.float64),
    dims=("abc"),
    coords=dict(
        abc=np.array(list("abc"), dtype="<U1"),
        scalar_coord=np.str_(uuid.uuid4()),
    ),
)

xarr.coords["scalar_coord"]
# <xarray.DataArray 'scalar_coord' ()> Size: 144B
# np.str_('56382178-7f7d-4ec8-a4c1-8ebee96ec8df')
# Coordinates:
#     scalar_coord  <U36 144B ...

xarr.coords["scalar_coord"].data
# np.str_('56382178-7f7d-4ec8-a4c1-8ebee96ec8df')

xarr2 = xr.DataArray(
    np.array([1.0] * 3, dtype=np.float64),
    dims=("abc"),
    coords=dict(
        abc=np.array(list("abc"), dtype="<U1"),
        scalar_coord=np.str_(uuid.uuid4()),
    ),
)

xr.concat([xarr, xarr2], dim=("scalar_coord"))
# see error in "Relevant log output"
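Until a fixed release is out, one possible workaround (a sketch, not an official fix) is to coerce the scalar coordinate to a 0-d array yourself, restoring the representation the old numpy path produced:

```python
import uuid
import numpy as np
import xarray as xr

def make_da():
    # Same construction as above, but the scalar coord is passed as a
    # 0-d <U36 ndarray instead of an np.str_ scalar:
    return xr.DataArray(
        np.ones(3, dtype=np.float64),
        dims=("abc",),
        coords=dict(
            abc=np.array(list("abc"), dtype="<U1"),
            scalar_coord=np.asarray(str(uuid.uuid4())),
        ),
    )

# With the coord already an ndarray, set_dims indexes an array, not a
# string, so concat along the scalar coordinate should succeed:
combined = xr.concat([make_da(), make_da()], dim="scalar_coord")
print(combined.sizes["scalar_coord"])  # 2
```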
MVCE confirmation
Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
Complete example — the example is self-contained, including all data and the text of any traceback.
Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
New issue — a search of GitHub Issues suggests this is not a duplicate.
Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
In [10]: xr.concat([xarr, xarr2], dim=("scalar_coord"))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 xr.concat([xarr, xarr2], dim=("scalar_coord"))

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/concat.py:264, in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
    259     raise ValueError(
    260         f"compat={compat!r} invalid: must be 'broadcast_equals', 'equals', 'identical', 'no_conflicts' or 'override'"
    261     )
    263 if isinstance(first_obj, DataArray):
--> 264     return _dataarray_concat(
    265         objs,
    266         dim=dim,
    267         data_vars=data_vars,
    268         coords=coords,
    269         compat=compat,
    270         positions=positions,
    271         fill_value=fill_value,
    272         join=join,
    273         combine_attrs=combine_attrs,
    274         create_index_for_new_dim=create_index_for_new_dim,
    275     )
    276 elif isinstance(first_obj, Dataset):
    277     return _dataset_concat(
    278         objs,
    279         dim=dim,
    (...)
    287         create_index_for_new_dim=create_index_for_new_dim,
    288     )

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/concat.py:755, in _dataarray_concat(arrays, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
    752     arr = arr.rename(name)
    753 datasets.append(arr._to_temp_dataset())
--> 755 ds = _dataset_concat(
    756     datasets,
    757     dim,
    758     data_vars,
    759     coords,
    760     compat,
    761     positions,
    762     fill_value=fill_value,
    763     join=join,
    764     combine_attrs=combine_attrs,
    765     create_index_for_new_dim=create_index_for_new_dim,
    766 )
    768 merged_attrs = merge_attrs([da.attrs for da in arrays], combine_attrs)
    770 result = arrays[0]._from_temp_dataset(ds, name)

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/concat.py:540, in _dataset_concat(datasets, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
    535 # case where concat dimension is a coordinate or data_var but not a dimension
    536 if (
    537     dim_name in coord_names or dim_name in data_names
    538 ) and dim_name not in dim_names:
    539     datasets = [
--> 540         ds.expand_dims(dim_name, create_index_for_new_dim=create_index_for_new_dim)
    541         for ds in datasets
    542     ]
    544 # determine which variables to concatenate
    545 concat_over, equals, concat_dim_lengths = _calc_concat_over(
    546     datasets, dim_name, dim_names, data_vars, coords, compat
    547 )

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/dataset.py:4797, in Dataset.expand_dims(self, dim, axis, create_index_for_new_dim, **dim_kwargs)
   4793 if k not in variables:
   4794     if k in coord_names and create_index_for_new_dim:
   4795         # If dims includes a label of a non-dimension coordinate,
   4796         # it will be promoted to a 1D coordinate with a single value.
-> 4797         index, index_vars = create_default_index_implicit(v.set_dims(k))
   4798         indexes[k] = index
   4799         variables.update(index_vars)

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/util/deprecation_helpers.py:143, in deprecate_dims.<locals>.wrapper(*args, **kwargs)
    135     emit_user_level_warning(
    136         f"The `{old_name}` argument has been renamed to `dim`, and will be removed "
    137         "in the future. This renaming is taking place throughout xarray over the "
    (...)
    140         PendingDeprecationWarning,
    141     )
    142     kwargs["dim"] = kwargs.pop(old_name)
--> 143 return func(*args, **kwargs)

File ~/Projects/temp/xarray_numpy_bug/.env/lib64/python3.12/site-packages/xarray/core/variable.py:1387, in Variable.set_dims(self, dim, shape)
   1385 else:
   1386     indexer = (None,) * (len(expanded_dims) - self.ndim) + (...,)
-> 1387 expanded_data = self.data[indexer]
   1389 expanded_var = Variable(
   1390     expanded_dims, expanded_data, self._attrs, self._encoding, fastpath=True
   1391 )
   1392 return expanded_var.transpose(*dim)

TypeError: string indices must be integers, not 'tuple'
Anything else we need to know?
I created a fresh Fedora container with two new virtual environments and executed the exact same code in each, to make sure this really comes down to the xarray and numpy versions.
I went through all 3 pages of open issues on "concat" and read those that appeared to be relevant, but none seemed to match my case. Truly sorry if I overlooked something!
$ toolbox create -i fedora-toolbox:40 xarray_fedora40
$ toolbox enter xarray_fedora40
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!
I believe this is the same as #9399, which was fixed by #9403 (and I can't reproduce on main with numpy>=2.1). We're only waiting on a release now, which should happen soon(-ish).
Environment
INSTALLED VERSIONS
commit: None
python: 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)]
python-bits: 64
OS: Linux
OS-release: 6.10.12-200.fc40.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2024.9.0
pandas: 2.2.3
numpy: 2.1.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.1.0
pip: 24.2
conda: None
pytest: None
mypy: None
IPython: 8.28.0
sphinx: None