(fix): allow all extension array data types in pandas adapters #1

Conversation

ilan-gold

@ilan-gold ilan-gold commented Oct 23, 2024

This probably needs some work, but since pandas datetime handling often goes through extension arrays, I think these two issues are completely linked.

cc: @shoyer @kmuehlbauer
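
For context, here is a minimal sketch (not taken from the PR) of how pandas datetimes relate to extension arrays. It assumes a recent pandas release; the variable names `naive` and `aware` are illustrative only.

```python
import numpy as np
import pandas as pd

# A timezone-naive series keeps a plain NumPy dtype ...
naive = pd.Series(pd.to_datetime(["2024-10-23", "2024-10-24"]))
# ... while a timezone-aware one carries a pandas extension dtype.
aware = naive.dt.tz_localize("UTC")

print(type(naive.dtype), type(aware.dtype))

# In both cases the backing `.array` is a pandas ExtensionArray,
# which is why datetime handling and extension-array handling overlap.
print(type(naive.array).__name__, type(aware.array).__name__)
```

Code that special-cases extension arrays therefore also decides, intentionally or not, what happens to datetime data.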

@ilan-gold
Author

It's tough to do this without CI. I will push a few examples of the datetime-related behavior that will change so you get a sense, but generally things are even cleaner (more datetimes preserved).

@ilan-gold
Author

Not sure we want to keep allowing pandas in-memory datetime objects together with dask (see 59b03f2). Maybe we should start blanket-converting extension arrays before they reach dask?
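
As a rough illustration of what blanket conversion could look like, the sketch below turns any pandas extension array into a plain NumPy array before it would be handed to a chunked backend. The helper name `to_numpy_if_extension` is hypothetical and is not the PR's code or an xarray API; dask itself is deliberately left out.

```python
import numpy as np
import pandas as pd


def to_numpy_if_extension(values):
    """Hypothetical helper: convert pandas extension arrays to NumPy arrays."""
    if isinstance(values, pd.api.extensions.ExtensionArray):
        # ExtensionArray.to_numpy() materializes the data as an ndarray.
        return values.to_numpy()
    # Anything else is passed through NumPy's normal coercion.
    return np.asarray(values)


categorical = pd.Categorical(["a", "b", "a"])
converted = to_numpy_if_extension(categorical)
print(type(converted).__name__, converted.tolist())
```

The trade-off is that extension-specific information (categories, time zones, missing-value masks) may be lost in the conversion.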

@ilan-gold ilan-gold force-pushed the ig/fix_extension_indexer branch from a9c9386 to 7c32bd0 Compare October 24, 2024 07:25
@kmuehlbauer kmuehlbauer reopened this Oct 24, 2024
@kmuehlbauer
Owner

@ilan-gold I tried to activate CI, but it doesn't seem to work. I'm back at my desk next week and won't have time to check this before then.

@shoyer
shoyer commented Oct 24, 2024

Can you open this up as a pull request against pydata/xarray?

ilan-gold and others added 16 commits October 25, 2024 08:40
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix scalar handling for timedelta based indexer

* remove stale error message and "ignore:Converting non-default" in testsuite

* add per review suggestions

* add/remove todo

* rename timeunit -> format

* return "ns" resolution per default for timedeltas, if not specified

* Be specific on types/dtypes

* add comment

* add suggestions from code review

* fix docs

* fix test which isn't run for numpy2 atm

* add notes on to_datetime section, update examples showing usage of 'as_unit'

* use np.timedelta64 for to_timedelta example, update as_unit example, update note

* remove note

* Apply suggestions from code review

Co-authored-by: Deepak Cherian <[email protected]>

* refactor timedelta decoding to _numbers_to_timedelta and re-use it within decode_cf_timedelta

* fix conventions test, add todo

* run times through pd.Timestamp to catch possible overflows

* fix tests for cftime_to_nptime

* fix cftime_to_nptime in cftimeindex

* introduce pd.Timestamp instance check

* warn if out-of-bound datetimes are encoded with standard calendar, fall back to cftime encoding, add fix for cftime issue where python datetimes are not encoded correctly with date2num.

* fix time-coding.rst, add reference to time-series.rst.

* try to fix typing, ignore one

* try to fix docs

* revert doc-changes

* Add a non-ns test for polyval, polyfit

* more doc cosmetics

* add whats-new.rst entry

* add/fix coder docstring

* add xr.date_range example as suggested per review

* Apply suggestions from code review

Co-authored-by: Spencer Clark <[email protected]>

* Implement `time_unit` option for `decode_cf_timedelta` (pydata#3)

* Fix timedelta encoding overflow issue; always decode to ns resolution

* Implement time_unit for decode_cf_timedelta

* Reduce diff

* fix typing

* use nanmin/nanmax, catch numpy RuntimeWarnings

* Apply suggestions from code review

Co-authored-by: Kai Mühlbauer <[email protected]>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
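
Several commits above deal with out-of-bound datetimes and overflow checks. As background, here is a standard-library sketch of why nanosecond resolution overflows: a signed 64-bit count of nanoseconds since 1970-01-01 only spans roughly 1677 to 2262. The function `fits_in_ns` is a hypothetical illustration, not code from the commits.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)  # NumPy reserves this value for NaT


def fits_in_ns(dt: datetime) -> bool:
    """Return True if `dt` is representable as int64 nanoseconds since the epoch."""
    delta = dt - EPOCH
    # timedelta normalizes to (days, seconds, microseconds); only days can be negative.
    ns = (delta.days * 86_400 + delta.seconds) * 1_000_000_000
    ns += delta.microseconds * 1_000
    return INT64_MIN < ns <= INT64_MAX


for candidate in (datetime(2000, 1, 1), datetime(2262, 4, 12), datetime(1677, 9, 21)):
    print(candidate.isoformat(), fits_in_ns(candidate))
```

Dates outside this window need a coarser unit (for example seconds) or a fallback such as cftime, which is the situation the overflow-related commits address.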
max-sixty and others added 28 commits March 25, 2025 09:27
* Move fit computation code to dedicated new file

Part of pydata#10089

* .

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix GroupBy first, last with flox

Closes pydata#10169

* fix test

* parallelize upstream tests
* Allow setting `fill_value` on Zarr format 3 arrays

Closes pydata#10064

* fix

* fix format detection

* fix

* Set use_zarr_fill_value_as_mask=False
* DataTree: sel & isel add error context

* add test

* changelog

---------

Co-authored-by: Deepak Cherian <[email protected]>
The name of the repo has changed from `zarr` to `zarr-python`. It was still working due to the GitHub redirect, but it is better to be explicit about which repo this is aiming at.

* Add test to check units appear in FacetGrid plot

- appended test to `TestFacetGrid` class inside test_plot.py
- checks that units are added to the plot axis labelling

* fix: ensure axis labels include units in FacetGrid plots

- Fixed an issue where axis labels for FacetGrid plots did not display
units when provided.
- Now, both the dimension name and its corresponding unit (if available)
are shown on the axis label.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added whats-new documentation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* main:
  Vendor pandas to xarray conversion tests (pydata#10187)
  Fix: Correct axis labelling with units for FacetGrid plots (pydata#10185)
  Use explicit repo name in upstream wheels (pydata#10181)
  DOC: Update docstring to reflect renamed section (pydata#10180)

* bug: fix write_empty_chunks for zarr v3

* future proof write_empty_chunks in append flow

* test: fix write_empty_test for zarr 2

* typing: fix typing for write_empty_chunks

* small edits

---------

Co-authored-by: Deepak Cherian <[email protected]>
…data#10192)

Bumps the actions group with 1 update: [scientific-python/upload-nightly-action](https://github.com/scientific-python/upload-nightly-action).


Updates `scientific-python/upload-nightly-action` from 0.6.1 to 0.6.2
- [Release notes](https://github.com/scientific-python/upload-nightly-action/releases)
- [Commits](scientific-python/upload-nightly-action@82396a2...b36e8c0)

---
updated-dependencies:
- dependency-name: scientific-python/upload-nightly-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* main:
  Bump scientific-python/upload-nightly-action in the actions group (pydata#10192)
  Add new whats-new section (pydata#10190)
  release 2025.03.1 (pydata#10188)
  Support zarr `write_empty_chunks` for zarr-python 3 and up (pydata#10177)
@kmuehlbauer
Copy link
Owner

Closing this as pydata#9671 tracks this.

@kmuehlbauer kmuehlbauer closed this Apr 1, 2025