[SPARK-53553][CONNECT] Fix handling of null values in LiteralValueProtoConverter #52310
+287 −9
### What changes were proposed in this pull request?

This PR fixes the handling of null literal values in `LiteralValueProtoConverter` for Spark Connect. The main changes are:

- **Added proper null value handling**: introduced a new `setNullValue` method that sets null values in proto literals with the appropriate data type information.
- **Reordered pattern matching**: moved null and `Option` handling to the top of the pattern match in `toLiteralProtoBuilderInternal`, so null values are processed before any type-specific logic.
- **Fixed converter logic**: updated the `getScalaConverter` method to check `hasNull` before applying type-specific conversion logic.

### Why are the changes needed?
The previous implementation had several issues with null value handling:

- **Incorrect null processing order**: null values were processed after type-specific logic, which could lead to exceptions.
- **Missing null checks in converters**: the converter functions did not check for null values before applying type-specific conversion logic.
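To illustrate why the ordering matters, here is a minimal, self-contained Scala sketch. It is not the actual Spark Connect code: the `Lit`, `NullLit`, `IntLit`, and `ArrayLit` types and the `toLit` function are hypothetical stand-ins that mirror the shape of `toLiteralProtoBuilderInternal`'s pattern match, with null and `Option` cases placed first:

```scala
// Hypothetical literal model, NOT the real Spark Connect proto builder.
sealed trait Lit
case class NullLit(dataType: String) extends Lit // carries type info, like setNullValue
case class IntLit(v: Int) extends Lit
case class ArrayLit(elems: Seq[Lit]) extends Lit

def toLit(value: Any, dataType: String): Lit = value match {
  // Null and Option handling first, mirroring the reordered pattern match.
  case null    => NullLit(dataType)
  case None    => NullLit(dataType)
  case Some(v) => toLit(v, dataType)
  // Type-specific cases only run once null is ruled out.
  case i: Int    => IntLit(i)
  case s: Seq[_] => ArrayLit(s.map(toLit(_, dataType)))
  case other     => throw new IllegalArgumentException(s"unsupported: $other")
}

// A null element inside an array is converted cleanly instead of
// falling through to a type-specific case that cannot handle it.
println(toLit(Seq(1, null, 3), "int"))
```

If the `null` case came after the type-specific cases (or was missing), a null array element would reach element-conversion logic that assumes a non-null value and fail.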
### Does this PR introduce _any_ user-facing change?

Yes. This PR fixes a bug where null values in literals (especially inside arrays and maps) were not properly handled in Spark Connect. Users who were experiencing issues with null value serialization in complex types should now see correct behavior.
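On the deserialization side, the fix to `getScalaConverter` checks for null before running type-specific conversion. The following Scala sketch uses a hypothetical `ProtoLit` stand-in (not the real generated protobuf class; only the `hasNull` accessor name is taken from the PR description) to show the wrapping pattern:

```scala
// Hypothetical stand-in for a proto literal; NOT the real generated class.
case class ProtoLit(nullType: Option[String] = None, intValue: Option[Int] = None) {
  def hasNull: Boolean = nullType.isDefined
}

// Builds a converter for a given data type. The null check wraps the
// type-specific function, mirroring the fix to getScalaConverter:
// `typed` never sees a null literal.
def scalaConverter(dataType: String): ProtoLit => Any = {
  val typed: ProtoLit => Any = dataType match {
    case "int" => lit => lit.intValue.get
    case other => throw new IllegalArgumentException(s"unsupported type: $other")
  }
  lit => if (lit.hasNull) null else typed(lit)
}

println(scalaConverter("int")(ProtoLit(nullType = Some("int")))) // null-safe
println(scalaConverter("int")(ProtoLit(intValue = Some(7))))
```

Without the `hasNull` check, the `"int"` branch would call `.get` on an empty `intValue` for a null literal and throw.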
### How was this patch tested?

```shell
build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite -- -z SPARK-53553"
build/sbt "connect/testOnly *LiteralExpressionProtoConverterSuite"
build/sbt "connect-client-jvm/testOnly org.apache.spark.sql.PlanGenerationTestSuite"
build/sbt "connect/testOnly org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite"
```
### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.5.11