-
Notifications
You must be signed in to change notification settings - Fork 205
chore: Upgrade to datafusion 47.0.0-rc1 and arrow-rs 55.0.0 #1563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/// Unlike `map_batch` this method also preserves the columns that | ||
/// may not appear in the final output (`projected_table_schema`) but may | ||
/// appear in push down predicates | ||
fn map_partial_batch(&self, batch: RecordBatch) -> datafusion::common::Result<RecordBatch> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
map_partial_batch
has been removed in DataFusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mbutrovich @parthchandra fyi, I am not sure of the impact, but I figured I would first see if any tests fail in CI
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1563 +/- ##
============================================
+ Coverage 56.12% 58.75% +2.62%
- Complexity 976 1081 +105
============================================
Files 119 125 +6
Lines 11743 12581 +838
Branches 2251 2360 +109
============================================
+ Hits 6591 7392 +801
- Misses 4012 4015 +3
- Partials 1140 1174 +34 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
I ran TPC-H benchmarks locally and do not see any noticeable difference in performance |
Some aggregate tests started to fail. I filed an issue in DataFusion apache/datafusion#15676 |
fn fmt_sql(&self, _: &mut Formatter<'_>) -> std::fmt::Result { | ||
unimplemented!() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataFusion now supports a new EXPLAIN output, but this is not exposed in Comet, so we do not need to implement these new methods.
.with_table_partition_cols(partition_fields), | ||
_ => get_file_config( | ||
(Some(data_schema), Some(projection_vector), Some(partition_fields)) => { | ||
get_file_config_builder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this now uses a builder
@comphead Could you review the HDFS changes? There was a change in objectstore to use |
Thanks @andygrove I think these HDFS changes LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. I don't think the schema adapter change is likely to affect our readers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @andygrove are you planning to refer to rc1 or wait for the official release?
I'd like to merge this one and then switch to the official release next week |
@andygrove here's the switch: |
Which issue does this PR close?
Closes #1634
Rationale for this change
Use latest DataFusion. I will create a follow-on PR next week to change to the official 47.0.0 release once it is available.
What changes are included in this PR?
How are these changes tested?