-
Notifications
You must be signed in to change notification settings - Fork 1.5k
fix query results for predicates referencing partition columns and data columns #15935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fn projected_columns_prevent_pushdown() { | ||
let table_schema = get_basic_table_schema(); | ||
|
||
fn projected_or_partition_columns_prevent_pushdown() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test actually tests for this... except that since both schemas were being passed in it hit the bug as well.
3b52297
to
854c264
Compare
cc @alamb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix @adriangb. It's LGTM, but I have one question
@@ -464,16 +464,16 @@ impl FileFormat for ParquetFormat { | |||
fn supports_filters_pushdown( | |||
&self, | |||
file_schema: &Schema, | |||
table_schema: &Schema, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused, as we talked like
"keep supports_filters_pushdown so that TableProviders can do Exact pruning of filters, e.g. using partition columns.",
and that is making sense to me now. So, instead, should we pass only table_schema's to these supports_filters_pushdown()
API's at ListingTable level? Theory and practice conflict in my mind now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead, should we pass only table_schema's to these supports_filters_pushdown() API's at ListingTable level
We could! But we'd do that in #15769 right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't change any API?
I mean can we remove file_schema: &Schema
or table_schema: &Schema
from these supports_filters_pushdown()
API's? If we can, which one should we remove? In this PR, the fix shows that we utilize file_schema: &Schema
, however, what I understand from that conversation is we should depend on table_schema: &Schema
, not file_schema: &Schema
. That's what confuses me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well keep only the table_schema. This fix (this PR) becomes irrelevant once we merge the other one. But since that may take longer I thought it was best to make this PR as a simpler fix and not mix together a bug fix and larger change. Helps others who have forks copy the change, etc.
Thanks @adriangb. Let's keep going |
…ta columns (apache#15935) * fix query results for predicates referencing partition columns and data columns * fmt * add e2e test * newline
* feat: ORDER BY ALL * refactor: orderyby all * refactor: order_by_to_sort_expr * refactor: TODO comment * fix query results for predicates referencing partition columns and data columns (#15935) * fix query results for predicates referencing partition columns and data columns * fmt * add e2e test * newline * chore(deps): bump substrait from 0.55.0 to 0.55.1 (#15941) Bumps [substrait](https://github.com/substrait-io/substrait-rs) from 0.55.0 to 0.55.1. - [Release notes](https://github.com/substrait-io/substrait-rs/releases) - [Changelog](https://github.com/substrait-io/substrait-rs/blob/main/CHANGELOG.md) - [Commits](substrait-io/substrait-rs@v0.55.0...v0.55.1) --- updated-dependencies: - dependency-name: substrait dependency-version: 0.55.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: create helpers to set the max_temp_directory_size (#15919) * feat: create helpers to set the max_temp_directory_size Signed-off-by: Jérémie Drouet <[email protected]> * refactor: use helper in cli Signed-off-by: Jérémie Drouet <[email protected]> * refactor: update error message Signed-off-by: Jérémie Drouet <[email protected]> * refactor: use setter in tests Signed-off-by: Jérémie Drouet <[email protected]> --------- Signed-off-by: Jérémie Drouet <[email protected]> Co-authored-by: Andrew Lamb <[email protected]> * Fix main CI (#15942) * Improve sqllogictest error reporting (#15905) * refactor filter pushdown apis (#15801) * refactor filter pushdown apis * remove commented out code * fix tests * fail to fix bug * fix * add/fix docs * lint * add some docstrings, some minimal cleaup * review suggestions * add more comments * fix doc links * fmt * add comments * make test deterministic * add bench * fix bench * register bench * fix bench * cargo fmt --------- Co-authored-by: berkaysynnada <[email protected]> Co-authored-by: Berkay Şahin <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Jérémie Drouet <[email protected]> Co-authored-by: silezhou <[email protected]> Co-authored-by: Adrian Garcia Badaracco <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jérémie Drouet <[email protected]> Co-authored-by: Andrew Lamb <[email protected]> Co-authored-by: xudong.w <[email protected]> Co-authored-by: Gabriel <[email protected]> Co-authored-by: berkaysynnada <[email protected]> Co-authored-by: Berkay Şahin <[email protected]>
Fixes #15912