BUG: assignment via loc silently fails with differing dtypes #61346
Comments
Confirmed on main. Still silently works when assigning differing dtypes to columns via It seems to me that this should be raising an error, consistent with the behavior introduced for other dtype mismatches (e.g., int64 ← str, which now raises a |
This may be well known, but just in case,
This isn't so clear to me, e.g.
df = pd.DataFrame({"a": [1.0, 2.5, 3.0]})
df.loc[:, "a"] = 5
print(df)
#      a
# 0  5.0
# 1  5.0
# 2  5.0
Should this raise? I personally think the answer there is no. But I'm not sure we ever made any decisions on which implicit conversions should and should not be allowed. This is somewhat related to PDEP-6. |
cc @pandas-dev/pandas-core |
I don't think that is what is going on here. It's not about incompatible types not being recognized. It's about the automatic conversion that is done with strings that are formatted datetime objects being assigned to a series that has
df['bar'] = pd.to_datetime(df['foo'], format='%Y-%m-%d')
df.loc[:, 'bar'] = df.loc[:, 'bar'].dt.strftime('%Y%m%d')
the first statement sets the dtype of
>>> df.loc[:, "bar"] = ["290102", "300304"]
>>> df
          foo        bar
0  2025-04-23 2002-01-29
1  2025-04-22 2004-03-30
I'm not sure if we want to change the behavior in this case. If On the other hand, as shown in the example, if a user did something like that, it is unclear whether they wanted the dates parsed as YYMMDD or DDMMYY. So maybe we should be warning if things are ambiguous? |
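To make the YYMMDD vs. DDMMYY ambiguity concrete, here is a minimal sketch that parses the same two strings with each explicit format. It only uses `pd.to_datetime` with a `format` argument. The variable names are made up for illustration, and this is not a claim about which path `loc` takes internally.

```python
import pandas as pd

values = ["290102", "300304"]

# Read the strings as two-digit year, month, day (YYMMDD).
as_yymmdd = pd.to_datetime(values, format="%y%m%d")

# Read the same strings as day, month, two-digit year (DDMMYY).
as_ddmmyy = pd.to_datetime(values, format="%d%m%y")

print(as_yymmdd.strftime("%Y-%m-%d").tolist())
# ['2029-01-02', '2030-03-04']
print(as_ddmmyy.strftime("%Y-%m-%d").tolist())
# ['2002-01-29', '2004-03-30']
```

The second reading matches the values shown in the output above, so the same input gives two very different sets of dates depending on an assumption the user never stated.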
Yup, looks like it's going down the mixed formats path (🙀)
In [8]: df = pd.DataFrame({'foo': ['2025-04-23', '2025-04-22']}); df['bar'] = pd.to_datetime(df['foo'], format='%Y-%m-%d')
In [9]: df.loc[:, 'bar'] = ['12/01/2020', '13/01/2020']
In [10]: df
Out[10]:
          foo        bar
0  2025-04-23 2020-12-01
1  2025-04-22 2020-01-13 |
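One way a user could sidestep the per-element guessing, sketched here under the assumption that the strings are meant as day/month/year, is to parse them with an explicit format first and only then assign the result via `loc`. This is a possible user-side workaround, not a proposed fix for the issue. The names `raw` and `parsed` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"foo": ["2025-04-23", "2025-04-22"]})
df["bar"] = pd.to_datetime(df["foo"], format="%Y-%m-%d")

raw = ["12/01/2020", "13/01/2020"]

# Parse with one explicit format so both rows are read the same way
# (assumption: the strings are day/month/year).
parsed = pd.to_datetime(raw, format="%d/%m/%Y")

# Assign already-parsed datetimes instead of raw strings.
df.loc[:, "bar"] = parsed

print(df["bar"].dt.strftime("%Y-%m-%d").tolist())
# ['2020-01-12', '2020-01-13']
```

With the format pinned, both rows land in January 2020, rather than one in December and one in January as in the output above.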
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
I expect bar to look like
instead of
Expected Behavior
bar should look like
Installed Versions