[Bug]: Different behaviour on polars and pandas with dates: AttributeError: 'datetime.datetime' object has no attribute 'floor'
#1660
Comments
Hey @sergiocalde94 thanks for reporting the issue.
As we aim to replicate polars behaviour, you should not expect to be able to use For a temporary workaround, you can use
Thanks for the report! I'd say this is expected, but that we need a documentation page about scalars. In this case, as Francesco noted:
However,

import polars as pl
import pandas as pd
import narwhals.stable.v1 as nw
from datetime import datetime

start_train_date = "2024-01-01"
end_train_date = "2024-03-05"
df_polars = pl.DataFrame(
    {
        "application_started_at": [
            datetime.strptime("2024-01-01", "%Y-%m-%d"),
            datetime.strptime("2024-02-01", "%Y-%m-%d"),
            datetime.strptime("2024-03-01", "%Y-%m-%d"),
            datetime.strptime("2024-04-01", "%Y-%m-%d"),
            datetime.strptime("2024-04-06", "%Y-%m-%d"),
        ]
    }
)
df = nw.from_native(df_polars)
print(df["application_started_at"].max().date())
df_pandas = df_polars.to_pandas()
df = nw.from_native(df_pandas)
print(df["application_started_at"].max().date())

This outputs

It also works for PyArrow:

df_pyarrow = df_polars.to_arrow()
df = nw.from_native(df_pyarrow)
print(df["application_started_at"].max().date())
Excellent solutions, thanks 🙏 I didn't notice about the Also, if it's not part of the narwhals API, for me it is a little bit strange to have the possibility of using it (the It's a little bit confusing that I can use I don't know if I am saying stupid things, but that was my first thought when using this library 😃.
Hey @sergiocalde94 - I think the main idea is that the output of narwhals is a type compatible with the polars return type, but not necessarily with, let's say, the pandas one in all cases. In practice, whenever the return types differ, we could cast the result to a Python scalar ourselves (as we actually do for some pyarrow cases). The difference between pandas and pyarrow is that pandas scalars can be treated as Python scalars in the majority of cases, while for pyarrow this is not the case. I am going to close this for now, but we might come back to standardize the return type across all backends in the future.
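To illustrate the point about return types: a plain `datetime.datetime` (what the polars backend returns from `max()`, per the error in this issue) has no `floor` method, while a pandas `Timestamp` does. Here is a minimal sketch of one way to floor such a scalar to the day using only the standard library. The helper name `floor_to_day` is hypothetical and not part of narwhals. It relies on `datetime.replace`, which pandas `Timestamp` also provides, although a `Timestamp` carrying nanoseconds would keep them unless they are reset separately.

```python
from datetime import datetime


def floor_to_day(value: datetime) -> datetime:
    """Hypothetical helper (not a narwhals API): truncate a datetime scalar to midnight."""
    # Zero out the time-of-day fields; the date and tzinfo are left untouched.
    return value.replace(hour=0, minute=0, second=0, microsecond=0)


# Stand-in for the scalar returned by df["application_started_at"].max()
latest = datetime(2024, 4, 6, 13, 45, 30)
floored = floor_to_day(latest)
print(floored)  # 2024-04-06 00:00:00
```

If only the calendar date is needed, calling `.date()` on the scalar, as in the snippets earlier in this thread, is simpler.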
Describe the bug
There is an inconsistency in behavior when using narwhals.stable.v1 with Polars and Pandas DataFrames for the max() operation followed by floor("D") on a datetime column.
As far as I understood it should be agnostic but maybe I'm doing something wrong since it's my first day using narwhals 😬
Steps or code to reproduce the bug
Setup:
Error:
No error if using pandas:
Expected results
Timestamp('2024-04-06 00:00:00')?
Actual results
AttributeError: 'datetime.datetime' object has no attribute 'floor'
Please run narwhals.show_version() and enter the output below.
Relevant log output
No response