Scala build infra #1211
Conversation
Adds build-infra related work for Scala projects, like apache#1208
Force-pushed from e398895 to 922961f
Introducing Scala into a Java codebase can increase complexity and reduce maintainability. It requires the team to manage interoperability concerns, handle differences in toolchains, and maintain expertise in two languages simultaneously, potentially causing confusion, increased onboarding time, and decreased productivity.
Do we really want to introduce Scala code in Polaris?
cc @RussellSpitzer @jbonofre @collado-mike @dennishuo @eric-maynard
It seems like the intent in #1190 is to use Scala too. Is that right?
So, let me summarize: I'm not a Scala fan, but for Spark + Gatling there's just no way around that.
For the benchmark, it depends where it will land. If it lands in the Spark plugin, it's a good point to have the Scala infra there.
@eric-maynard The intent in #1190 is not to use Scala; the Scala version in the PR is used to pull the right Spark and Iceberg dependencies, for example spark_sql_2.12 and iceberg_spark_runtime_3.5-2.12 etc. I do not intend to introduce any Scala classes for the plugin either.
@snazy @jbonofre Could you explain some more details to me about the claim that "spark plugin requires scala build infra"? My understanding is that a Scala build is only required if we introduce our own Scala classes, but we can use Java for the plugin code, like SparkCatalog in Iceberg. For dependencies like spark_sql and iceberg_spark_runtime, as long as we pull the correct library for the Scala version, no re-compile will be needed, right?
@gh-yzou you mentioned the reason yourself in "spark_sql_2.12 and iceberg_spark_runtime_3.5-2.12" - "2.12" (and 2.13) are Scala versions, right?
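To make the version-suffix point above concrete, here is a minimal Gradle sketch. The coordinates and version numbers are illustrative only, not taken from Polaris's actual build files:

```kotlin
// build.gradle.kts -- illustrative sketch, not Polaris's real build.
// Scala is not binary compatible across minor versions (2.12 vs. 2.13),
// so Scala-compiled artifacts encode the Scala binary version in the
// artifact name. All such dependencies must agree on that suffix.
val scalaVersion = "2.12" // would normally come from a build property

dependencies {
    // The "_2.12" suffix is the Scala binary version, not the library version.
    implementation("org.apache.spark:spark-sql_$scalaVersion:3.5.1")
    implementation("org.apache.iceberg:iceberg-spark-runtime-3.5_$scalaVersion:1.5.2")
}
```

This is why the Scala version matters even for a pure-Java consumer: it selects which pre-compiled artifact is pulled, without requiring any Scala sources in the consuming project.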
@snazy but those are libraries that are already compiled; I do not need to recompile those libraries anymore, right? I am not sure if
You need to have the right dependencies in place - and that is what this change + #1203 do
Change #1203 assumes there will be Scala classes there, like ScalaStuff.scala, and therefore it requires the Scala build. However, there is no strict requirement that we develop the Spark plugin using Scala code. If we do not introduce any new Scala code, I am not sure I understand why we need the Scala build infra?
If you don't have Scala code, then it won't change anything, except give you the right dependencies. But #1208 does have Scala code.
I do like Scala, but I do not think we should make the decision of introducing a new language [dependency] lightly. Maybe it's better to discuss this on #1208, but if it's possible to avoid a new language [dependency] at this time I think we should avoid it.
Right, I am just trying to clarify that a Scala build is not strictly required by the Spark plugin. And as @jbonofre pointed out, #1208 might fit better in the polaris-tools repository. As @eric-maynard suggested, we could probably move the discussion to PR #1208, and it would be great if we could avoid introducing a new language in Polaris if not necessary.
Well, the Scala language implications are implicitly introduced with the Spark plugin. |
By dependency stuff, do you mean we need to use the correct implementation(org.apache.spark:spark-sql_:) library in our plugin development? If that is what you mean, I think we are doing that: we configure the Spark Scala version in a spark-scala.properties file like Nessie does, and add all targets as a default build.
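As a rough sketch of what such a properties-driven setup might look like, loosely modeled on Nessie's cross-build approach (the file name, keys, and values below are assumptions for illustration, not the actual Polaris or Nessie configuration):

```properties
# spark-scala.properties -- illustrative only; keys and values are assumptions.
# Declares which Spark/Scala combinations the build produces artifacts for.
sparkVersions=3.4,3.5
sparkVersion-3.4-scalaVersions=2.12,2.13
sparkVersion-3.5-scalaVersions=2.12,2.13
```

The build can read this file and wire up one dependency configuration (and, if cross-publishing, one artifact) per Spark/Scala pair, so the plugin code itself can stay pure Java while still resolving the correctly suffixed Scala-compiled dependencies.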
This PR is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.