ENH: Better error message for trying to convert to "double," type #60685
Comments
I can't reproduce under pandas 2.2.3; it just converts to float64. Can you specify under what versions this occurs? This could also be a numpy upstream issue, as ultimately this gets delegated to numpy for numpy-typed data blocks (below is with numpy 1.26.4).
In [4]: df = pd.DataFrame({"a": [1, 2, 3]})
In [5]: df.dtypes
Out[5]:
a int64
dtype: object
In [6]: df.astype("double,")
Out[6]:
a
0 1.0
1 2.0
2 3.0
In [7]: df.astype("double,").dtypes
Out[7]:
a float64
dtype: object
I was able to reproduce the error, and here’s what I found: when you create a DataFrame with dtype="double,", pandas delegates the dtype handling to NumPy. In NumPy, this is interpreted as a structured data type rather than a standard floating-point type.
According to NumPy’s documentation, a structured dtype requires:
To investigate, I checked how the dtype is printed:
About this:
So here is the thing: 'double,' is not rejected as an invalid dtype because NumPy interprets it as a structured dtype. This is not about an existing datatype with some extra character; it's specifically about the comma.
This is similar to the "hurdy-gurdy" case. I believe this is not entirely a bug, but I'm not sure how to work with structured data types in general. Additionally, I'd like to ask if you have an aarch64 processor instead of an x86_64 processor. My laptop is aarch64, and I was able to reproduce the same error message. However, when I tried on an x86_64 machine, I couldn't reproduce it as @Liam3851 mentioned. I wonder if the difference is due to the processor type or some other reason.
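The structured-dtype interpretation described in this comment can be checked directly against NumPy, without going through pandas. The snippet below is only a sketch: `describe_dtype_string` is a hypothetical helper written for this thread, and which branch "double," falls into depends on the installed NumPy version, so no output is shown.

```python
import numpy as np


def describe_dtype_string(spec):
    """Report how the installed NumPy parses a dtype string.

    Hypothetical helper for illustration; not part of NumPy or pandas.
    """
    try:
        dt = np.dtype(spec)
    except (TypeError, ValueError) as exc:
        # Unknown names such as "hurdy-gurdy" end up here.
        return f"rejected: {exc}"
    # Structured dtypes expose their field names; plain dtypes have names=None.
    kind = "structured" if dt.names is not None else "plain"
    return f"{kind}: {dt!r}"


for spec in ("double", "double,", "hurdy-gurdy"):
    print(repr(spec), "->", describe_dtype_string(spec))
```

Running this on both machines, together with `np.__version__`, would show whether the difference comes from the NumPy version rather than the processor type.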
@parkine that's so interesting! Thanks for digging into it. That makes it sound like raising a more helpful error would either need to happen within numpy or pandas would have to have some kind of extra "is this a valid datatype" check before handing over to numpy or pyarrow, which probably would add additional complexity that isn't worth it just for a more helpful error message. I'm on x86_64, so I guess the message might be related to something else? Potentially numpy version (numpy version 2.1.3). Either way, happy to close this issue off since it sounds like handling it within pandas is probably not worth it?
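For reference, the kind of extra "is this a valid datatype" check mentioned here could look roughly like the sketch below. This is an assumption about what such a check might do, not existing pandas or NumPy behaviour: `require_plain_dtype` is a hypothetical name, and the sketch only covers NumPy dtype strings, not pyarrow ones such as "double[pyarrow]".

```python
import numpy as np


def require_plain_dtype(spec: str) -> np.dtype:
    """Hypothetical pre-check: accept only dtype strings that NumPy parses
    as a plain (non-structured) dtype. Not part of pandas or NumPy."""
    # np.dtype already raises TypeError for unknown names like "hurdy-gurdy".
    dt = np.dtype(spec)
    if dt.names is not None:
        # The string was parsed as a structured dtype, e.g. because of a comma.
        raise TypeError(
            f"data type {spec!r} not understood "
            f"(NumPy parsed it as the structured dtype {dt})"
        )
    return dt
```

Whether a check like this is worth the added complexity is exactly the open question in this comment; the sketch only shows that the structured case is detectable through `dtype.names`.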
Happy to help! I don’t think they could’ve fixed it, since this seems more like a numpy “feature” than a bug. The datatype you used wasn’t technically wrong, just a different interpretation from numpy’s perspective. What’s interesting is how numpy decides to treat "double," either as a typo or a structured datatype. Thank you for checking. Like you said, it’s probably a version issue. The one that didn’t throw an error was NumPy 1.26.4, but version 2.2.2 did. I was thinking it'd be a pandas version issue and didn’t even think to check numpy. Maybe the newer version enforces structured dtypes more strictly? Not entirely sure, but if I figure it out, I’ll drop a comment!
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I (don't ask how 😅) came across the fact that this:
Raises a helpful 'TypeError: data type "hurdy-gurdy" not understood' message.
But this doesn't happen if the incorrect type is "double," (I'm not sure why but I assume something around the fact that pandas needs to check for types such as "double[pyarrow]").
This code:
Raises the error: "TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''".
Feature Description
Raise the same kind of "not understood" error for "double,", i.e. a message saying that "double," is not a valid type.
Alternative Solutions
Probably involves digging slightly into the part of the code that's throwing an error - I'd love to put in a PR if this is something that'd be accepted?
Additional Context
No response