added fallback using reflection for backward-compatibility #1573
base: main
Conversation
@@ -45,7 +45,7 @@ jobs:
   matrix:
     os: [ubuntu-24.04]
     java-version: [11]
-    spark-version: [{short: '3.4', full: '3.4.3'}, {short: '3.5', full: '3.5.5'}]
+    spark-version: [{short: '3.4', full: '3.4.3'}, {short: '3.5', full: '3.5.4'}, {short: '3.5', full: '3.5.5'}]
Since we advanced the minimum version to 3.5.5, I am not sure if we should add back 3.5.4 tests.
SQL tests can be done here, but tests in Comet are not done with 3.5.4 anymore.
If we decide to support 3.5.4, I would vote for doing it by adding a maven profile.
Adding a Maven profile using Spark 3.5.4 would break compilation for the 3.5 shim.
So to support both 3.5.4 and 3.5.5 in compilation, we would need either separate shims for each patch version,
or a 3.5 shim that supports all versions through reflection.
I would say we should create separate shims for Spark 3.5.4 where necessary. I know that right now we only define profiles at the 3.4, 3.5, and 4.0 level, but I have observed a couple of API changes between patch versions.
A Comet release for 3.5 does not currently mean that it works with all 3.5.x versions. The minimum version will be 3.5.5 in the next release. Either we drop 3.5.4 support, or create a new profile.
I went for the reflection approach after my post-merge discussion with @wForget in #1565,
since creating different shims for 3.5.4 would also mean a different release for 3.5.4, which might cause confusion for users.
Since I did the upgrade to 3.5.5 separately, dropping support for 3.5.4 simply means discarding this PR.
We don't need to add 3.5.4 CI, just make sure it compiles successfully. Or how about deleting it after verification in this PR?
But it can't be compiled successfully, for the same reason 3.5.5 couldn't be compiled when the 3.5 shims were compiled against 3.5.4...
try {
  PartitionedFileUtil.getPartitionedFile(f, f.getPath, p.values)
} catch {
  case _: NoSuchMethodError =>
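The fallback pattern in the snippet above can be sketched in isolation. Note that `newApi` and `legacyApi` below are hypothetical stand-ins for the two `PartitionedFileUtil.getPartitionedFile` signatures, and the `NoSuchMethodError` is thrown manually to simulate the linkage failure that the JVM would raise when the compiled-against signature is absent at runtime; this is not Comet code.

```scala
// Minimal sketch of the try/catch fallback pattern, with a simulated
// linkage failure. NoSuchMethodError is what the JVM throws when code
// compiled against one method signature runs on a classpath without it.
object FallbackSketch {
  // Hypothetical stand-in for the newer Spark 3.5.5+ signature.
  def newApi(path: String): String =
    throw new NoSuchMethodError("simulating a Spark build without this signature")

  // Hypothetical stand-in for the older Spark <= 3.5.4 signature.
  def legacyApi(path: String): String = s"legacy:$path"

  def getPartitionedFile(path: String): String =
    try {
      newApi(path) // compiled against the newer signature
    } catch {
      case _: NoSuchMethodError => legacyApi(path) // older patch release
    }
}
```

Because `NoSuchMethodError` is only thrown on the first mismatched call, the cost of the catch is paid per invocation site, which is one argument for resolving the method once via reflection instead.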
How about checking the Spark version directly? Like:
if (org.apache.spark.SPARK_VERSION_SHORT >= "3.5.5") {
... for spark 3.5.5+
} else {
... for spark 3.5.4
}
And keeping the current code paths? (i.e. compiled code for 3.5.5, else reflection)
That is more explicit, but I'm not sure it brings any benefit.
The advantage of the try-catch idea is that if this param goes away in 3.5.6 (like it did in 4.0.0-preview1/2, while it's back in main...), this will remain both backward- and forward-compatible.
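One caveat for the version-check idea above: comparing `SPARK_VERSION_SHORT` as a string happens to work for "3.5.4" vs "3.5.5", but lexicographic ordering would mis-rank a hypothetical "3.5.10" below "3.5.5". A component-wise numeric comparison avoids that; this is a generic sketch, not code from this PR.

```scala
// Sketch: comparing version strings numerically instead of lexicographically.
// As strings, "3.5.10" < "3.5.5" (because '1' < '5'), which is wrong for versions.
object VersionCompare {
  private def parts(v: String): Seq[Int] = v.split('.').toSeq.map(_.toInt)

  // true if version `a` >= version `b`, compared component by component
  def atLeast(a: String, b: String): Boolean = {
    val (pa, pb) = (parts(a), parts(b))
    val cmp = pa.zipAll(pb, 0, 0).iterator     // pad shorter version with zeros
      .map { case (x, y) => x.compare(y) }
      .find(_ != 0)                            // first differing component decides
      .getOrElse(0)
    cmp >= 0
  }
}
```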
I prefer to define reflection for each version. Some similar implementations:
@kazuyukitanimura WDYT?
Reflection is ok with me. I just wanted to make sure we dropped 3.5.4 support
If we decide to support 3.5.4 as a community, I think we should test with 3.5.4 (downgrade in pom) all the way
I prefer maintaining compatibility with more Spark patch versions to increase adoption. If it is known not to work with Spark <= 3.5.4, we'd better explicitly document that in Supported Spark Versions.
I agree that we should support multiple 3.5.x versions. Some users do not have control over the minor Spark version they use. There isn't any technical reason why we cannot maintain diff files for multiple Spark versions and test multiple versions in CI.
I would prefer to see reflection used for both 3.5.0-3.5.4 and 3.5.5 rather than catching a runtime exception, as @wForget suggests.
Added a commit to always invoke the two methods differing between 3.5.x versions using reflection.
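The approach described in the comment above, resolving the right overload once via reflection rather than catching a runtime error, can be sketched generically. The class and method names are illustrative, not the actual Comet shim code; the test uses `String.substring` only to demonstrate the lookup.

```scala
import java.lang.reflect.Method

// Sketch: try each candidate parameter list until one resolves, so a single
// code path works across Spark patch releases whose signatures differ.
object ReflectiveCall {
  def findMethod(cls: Class[_], name: String,
                 candidates: Seq[Seq[Class[_]]]): Method =
    candidates.view
      .flatMap { params =>
        try Some(cls.getMethod(name, params: _*))       // matching signature found
        catch { case _: NoSuchMethodException => None }  // try the next candidate
      }
      .headOption
      .getOrElse(
        throw new NoSuchMethodException(s"$name not found on ${cls.getName}"))
}
```

Resolving the `Method` once (e.g. in a lazy val) keeps the per-call overhead to a single `invoke`, instead of relying on a thrown `NoSuchMethodError` at every call site.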
Codecov Report
All modified and coverable lines are covered by tests ✅

@@             Coverage Diff              @@
##               main    #1573      +/-   ##
============================================
+ Coverage     56.12%   58.49%    +2.36%
- Complexity      976      977        +1
============================================
  Files           119      122        +3
  Lines         11743    12231      +488
  Branches       2251     2278       +27
============================================
+ Hits           6591     7154      +563
+ Misses         4012     3945       -67
+ Partials       1140     1132        -8
Which issue does this PR close?

Closes #1572

Rationale for this change

Users on older 3.5.x versions before Spark 3.5.5 should be able to use the latest Comet version.

What changes are included in this PR?

Modifying the 3.5 ShimCometScanExec shim to fall back to reflection with the previous method signatures if the latest is not found.

How are these changes tested?

Ran mvn test with the spark-3.5 profile. Added 3.5.4 to the matrix of tested Spark versions in spark_sql_test.yml.