feat: datetime selector #1822
    """
    return Selector(lambda plx: plx.selectors.all())


def datetime(
Finally, the datetime selector 😂
narwhals/utils.py
Outdated
if "*" in time_zones:
    import zoneinfo

    time_zones.extend(list(zoneinfo.available_timezones()))
    time_zones.remove("*")
Since we don't allow "*" to be passed to the Datetime constructor as a time zone, I am explicitly creating all the timezones.
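The expansion described here can be sketched in isolation. The helper name `expand_time_zones` and its `available` parameter are hypothetical and not part of this PR; by default the sketch reads from the standard-library `zoneinfo.available_timezones()`, whose result depends on the time zone data installed on the machine.

```python
from __future__ import annotations

import zoneinfo
from typing import Iterable


def expand_time_zones(
    time_zones: Iterable[str | None],
    available: Iterable[str] | None = None,
) -> list[str | None]:
    """Replace the "*" wildcard with every known time zone key.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    zones = list(time_zones)
    if "*" not in zones:
        return zones
    if available is None:
        # Depends on the system (or tzdata package) time zone database.
        available = zoneinfo.available_timezones()
    # Keep the explicitly requested entries (including None) first.
    expanded = [tz for tz in zones if tz != "*"]
    seen = set(expanded)
    # Sort so the result is deterministic, and skip duplicates.
    for tz in sorted(available):
        if tz not in seen:
            expanded.append(tz)
            seen.add(tz)
    return expanded
```

Passing `available` explicitly keeps the behaviour testable on machines with little or no time zone data.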
    self: Self,
    time_unit: TimeUnit | Collection[TimeUnit] | None,
    time_zone: str | timezone | Collection[str | timezone | None] | None,
) -> DaskSelector:  # pragma: no cover
For dask, the selector works, but the cast fails with:
TypeError: Cannot use .astype to convert from timezone-aware dtype to timezone-naive dtype. Use obj.tz_localize(None) or obj.tz_convert('UTC').tz_localize(None) instead.
    "pyspark" in str(constructor)
    or "duckdb" in str(constructor)
    or "dask" in str(constructor)
    or ("pyarrow_table" in str(constructor) and PYARROW_VERSION < (12,))
    or ("pyarrow" in str(constructor) and is_windows())
    or ("pandas" in str(constructor) and PANDAS_VERSION < (2,))
That is a lot of xfails, but let me go through them:
- pyspark and duckdb: do not implement selectors
- dask: see comment
- pyarrow < 12: does not implement replace_time_zone
- pyarrow on Windows: fails to find the UTC time zone
- pandas < 2: does not support time_units != "ns"
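For the dask entry, the two alternatives named in the quoted TypeError are not equivalent: dropping the zone keeps the local wall-clock time, while converting to UTC first keeps the instant. A minimal sketch of that difference, using the standard-library `datetime` as an analogue rather than calling dask or pandas (the function names are made up for illustration):

```python
from datetime import datetime, timedelta, timezone


def drop_zone_keep_wall_time(dt: datetime) -> datetime:
    # Analogue of obj.tz_localize(None): discard the zone, keep the clock reading.
    return dt.replace(tzinfo=None)


def convert_to_utc_then_drop_zone(dt: datetime) -> datetime:
    # Analogue of obj.tz_convert('UTC').tz_localize(None):
    # shift to UTC first, then discard the zone.
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# A fixed UTC+02:00 offset avoids any dependency on system time zone data.
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
local_naive = drop_zone_keep_wall_time(aware)
utc_naive = convert_to_utc_then_drop_zone(aware)
```

Here `local_naive` still reads 12:00, while `utc_naive` reads 10:00, so a test that casts to a naive dtype has to pick one of the two deliberately.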
if "*" in time_zones:
    import sys

    if sys.version_info >= (3, 9):
        import zoneinfo
    else:  # pragma: no cover
        # This code block is due to a typing issue with backports.zoneinfo package:
        # https://github.com/pganssle/zoneinfo/issues/125
        from backports import zoneinfo

    time_zones.extend(list(zoneinfo.available_timezones()))
    time_zones.remove("*")
is this taken from Polars? because chrono-tz's time zone identifiers are a bit different in some cases, i'm not sure we should be doing this
From polars docs:
"""
...
time_zone:
One or more timezone strings, as defined in zoneinfo (to see valid options run `import zoneinfo;
zoneinfo.available_timezones()` for a full list).
"""
sure but it's not in the source code, right?
Correct, it is not in the source code. Polars handles "*" out of the box to match all time zones, and None to match time-zone-naive datetimes (the default argument is time_zone=("*", None)).
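The matching rule described above, where "*" matches any time-zone-aware column and None matches naive ones, can be sketched as a small predicate. This illustrates the semantics as stated in this thread only; it is not Polars' or Narwhals' implementation, and `matches_time_zone` is a hypothetical name.

```python
from __future__ import annotations

from typing import Collection


def matches_time_zone(
    column_time_zone: str | None,
    time_zone: Collection[str | None] = ("*", None),
) -> bool:
    """Return True if a datetime column's time zone is selected.

    Hypothetical predicate: None stands for a time-zone-naive column,
    and "*" stands for any time-zone-aware column.
    """
    if column_time_zone is None:
        # Naive columns are selected only when None is listed explicitly.
        return None in time_zone
    # Aware columns match the wildcard or an exact time zone name.
    return "*" in time_zone or column_time_zone in time_zone
```

With the default `("*", None)` every datetime column matches, so no list of all time zone names needs to be built.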
If you have comments or can explain your changes, please do so below
@MarcoGorelli you might have a better idea for how to test this. I checked how polars does it and it is definitely brute forcing 😂
I took the opportunity to add a TimeUnit alias to use around the codebase. The actual changes are not too large.