[SPARK-49828][SQL] Expose ColumnNode to/from Expression utils as developer APIs #48306
Conversation
-  private[sql] object ColumnNodeToExpressionConverter extends ColumnNodeToExpressionConverter {
+  @DeveloperApi
+  object ColumnNodeToExpressionConverter extends ColumnNodeToExpressionConverter {
Quick question, we have an implicit class in SparkSession called RichColumn. That will 'restore' the expr functionality. Is that something that would work for you? Alternatively, if the use of SparkSession is cumbersome, we could also add another implicit class to org.apache.spark.sql.classic.ClassicConversions.
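For illustration, a minimal sketch of that route, assuming RichColumn is a member implicit class on the classic SparkSession that adds an expr accessor to Column (the exact import path is an assumption here):

import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.classic.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[1]").getOrCreate()
import spark.RichColumn // assumption: brings the implicit wrapper into scope

val e: Expression = col("a").expr // expr provided by the implicit class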
So for Java users?
Sure, I see your point. I would love to meet the Java developer that works with Spark internals :)...
They can still use these objects.
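For example, a rough sketch of calling the converter object directly, which needs no implicits and is reachable from Java as well (assuming the object is a ColumnNode => Expression function, as the trait it extends suggests):

import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.internal.{ColumnNode, ColumnNodeToExpressionConverter}

def toExpr(node: ColumnNode): Expression =
  ColumnNodeToExpressionConverter(node) // from Java: ColumnNodeToExpressionConverter$.MODULE$.apply(node)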
   */
-  private[sql] case class ExpressionColumnNode private(
+  @DeveloperApi
+  case class ExpressionColumnNode private(
You could use org.apache.spark.sql.classic.ClassicConversions for this. I am not saying that you should; some people don't like implicits.
So it doesn't handle both directions of the conversion and, as you point out, it's also an implicit conversion, which (in my opinion) is some pretty unnecessary magic for this.
In this case implicits provide a way for a developer to make a minimal amount of changes; that is why they are there. I am generally not a big fan, but in this case they do make migration a lot easier.
You could call org.apache.spark.sql.classic.ClassicConversions.column directly if you want to avoid implicits.
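A minimal sketch of the explicit route, assuming the column helper mentioned above takes a catalyst Expression and returns a Column:

import org.apache.spark.sql.Column
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.classic.ClassicConversions

// Explicit call; no implicit conversions need to be in scope.
val one: Column = ClassicConversions.column(Literal(1))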
hvanhovell left a comment:
LGTM.
It's a little weird to have DeveloperApi in the org.apache.spark.sql.internal package.
For example, the Apache Spark codebase has no DeveloperApi in an internal package so far:
$ git grep DeveloperApi | grep internal | wc -l
0
If we need to open this, shall we move the package location from internal to a more proper package location, @holdenk?
Seems reasonable to me.
Force-pushed from f6e7c5a to 54e990d.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
Make ColumnNodeToExpressionConverter and ExpressionColumnNode into DeveloperApis instead of internal so that external advanced developers can write columns with codegen that take columns as input.
Why are the changes needed?
After the introduction of the ColumnNode interface, external advanced developers building codegen expressions who wish to provide a Scala API have to hack around and access internal Spark APIs. This is illustrated in my PR upgrading spark-testing-base's codegen example to Spark 4 preview 2 -- see holdenk/spark-testing-base#423
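For context, a rough sketch of the round trip such a library needs once these are DeveloperApis; the Column.node accessor, the Column constructor taking a ColumnNode, and the ExpressionColumnNode.apply factory are assumptions about the Spark 4 Column API:

import org.apache.spark.sql.Column
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.internal.{ColumnNodeToExpressionConverter, ExpressionColumnNode}

// Column -> catalyst Expression: unwrap the ColumnNode and run it through the converter.
def toExpression(c: Column): Expression =
  ColumnNodeToExpressionConverter(c.node)

// catalyst Expression (e.g. a custom codegen expression) -> Column: wrap it back into a ColumnNode.
def toColumn(e: Expression): Column =
  new Column(ExpressionColumnNode(e))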
Does this PR introduce any user-facing change?
No -- developer only
How was this patch tested?
Annotation/visibility-only change; verified (manually) that these APIs would be sufficient in holdenk/spark-testing-base#423
Was this patch authored or co-authored using generative AI tooling?
No