[Enh]: cast expr in SparkLike
#1743
Comments
Hey @lucas-nelson-uiuc, thanks for taking the time on this. I would be very happy to see this land! Coming to your considerations:
- Amazing 🚀👌
- I think it is fine to raise an error when casting to these types for now and evaluate later on.
- Polars should default to nullable, so we can keep that flag on; not sure how to go about handling metadata.
- Yes, I would say so, also with the nullable flag!
- Not sure - we can either raise for now or validate that each element has the same length.
Hey @lucas-nelson-uiuc, thank you for making the issue. I pushed some old commits today about casting on basic types and I didn't realise you may also be working on this, sorry for that 😕 The PR only contains basic types, therefore your work on the remaining types is still very much needed.
No problem ^ thanks for getting the PR started!
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
No response
Please describe the purpose of the new feature or describe the problem to solve.
Currently working on implementing the `cast` expression in `SparkLike` - wanted to take this issue to discuss and list out all the key considerations with Spark's data types.
Suggest a solution if possible.
So far, I'm able to implement a handful of data types. However, I noticed that some types cannot (yet) be implemented, or I'm uncertain how they'd be implemented (a rough mapping sketch follows the lists below).
Able to implement (currently testing)
- `Float64`, `Float32`, `Int64`, `Int32`, `Int16`, `Decimal`
- `String`
- `Boolean`
- `ArrayType` (see comment below)
- `Struct`, `Field`
Cannot (yet) implement
- `UInt8`, `UInt16`, `UInt32`, `UInt64`
- `Enum`, `Categorical`
- `Object`, `Unknown`
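To make this concrete, here is a rough sketch of what the translation could look like: a small helper that maps a Narwhals dtype onto `pyspark.sql.types` and raises for the types Spark has no equivalent for. The helper name and structure are illustrative, not the actual Narwhals internals, and it assumes Narwhals dtype instances compare equal to their dtype classes.

```python
# Illustrative sketch only - not the actual Narwhals implementation.
import narwhals as nw
import pyspark.sql.types as T


def narwhals_to_spark_dtype(dtype):  # hypothetical helper name
    """Translate a Narwhals dtype into a PySpark data type."""
    if dtype == nw.Float64:
        return T.DoubleType()
    if dtype == nw.Float32:
        return T.FloatType()
    if dtype == nw.Int64:
        return T.LongType()
    if dtype == nw.Int32:
        return T.IntegerType()
    if dtype == nw.Int16:
        return T.ShortType()
    if dtype == nw.String:
        return T.StringType()
    if dtype == nw.Boolean:
        return T.BooleanType()
    # Spark has no unsigned integers and no Enum/Categorical/Object/Unknown
    # equivalents, so (per the discussion above) raising seems reasonable for now.
    raise NotImplementedError(f"Casting to {dtype} is not supported by the PySpark backend")
```

A `cast` on the Spark side would then boil down to something like `F.col(name).cast(narwhals_to_spark_dtype(dtype))`, with `pyspark.sql.functions` imported as `F`.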
Unsure how to implement
`pyspark.types.StructField` contains more than just `name` and `dtype` - is it worth updating `narwhals.dtypes.Field` to have these additional (optional) parameters to accommodate PySpark?
- `nullable`: whether the field can be null (None) or not
- `metadata`: additional information about the field
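For reference, PySpark's `StructField` already carries both of these as optional parameters; a small example (the field names and metadata values here are made up for illustration):

```python
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# A PySpark field carries nullability and a free-form metadata dict
# alongside the name and data type.
schema = StructType(
    [
        StructField("id", IntegerType(), nullable=False),
        StructField(
            "comment",
            StringType(),
            nullable=True,
            metadata={"description": "free-text user comment"},
        ),
    ]
)
```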
`ArrayType` functions like Polars' `List` type (at least it doesn't have a width constraint) - should we map `pyspark.types.ArrayType` to `narwhals.dtypes.List`, or to an `Array` type that aligns with Polars? (Small comparison sketch below.)

Let me know what anyone thinks about the above. Feel free to add onto this with other types we could add in. Thanks!
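For comparison, a minimal illustration of the distinction (assuming current Polars and PySpark APIs): PySpark's `ArrayType` only fixes the element type, like Polars' `List`, whereas Polars' `Array` also fixes the width.

```python
import polars as pl
from pyspark.sql.types import ArrayType, IntegerType

# PySpark: element type only, no length constraint (closer to Polars' List).
spark_arr = ArrayType(IntegerType())

# Polars: List is variable-length, Array has a fixed width per value.
pl_list = pl.List(pl.Int64)
pl_array = pl.Array(pl.Int64, 3)  # every value must have exactly 3 elements
```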
If you have tried alternatives, please describe them below.
No response
Additional information that may help us understand your needs.
No response