Closed
Description
Describe the bug
Filters such as partition_col = col_from_file
are never applied if datafusion.execution.parquet.pushdown_filters = true
To Reproduce
With datafusion-cli
:
COPY (
SELECT arrow_cast('a', 'Utf8') AS val
) TO 'test_files/scratch/test/part=a/123.parquet'
STORED AS PARQUET;
COPY (
SELECT arrow_cast('b', 'Utf8') AS val
) TO 'test_files/scratch/test/part=b/123.parquet'
STORED AS PARQUET;
COPY (
SELECT arrow_cast('xyz', 'Utf8') AS val
) TO 'test_files/scratch/test/part=c/123.parquet'
STORED AS PARQUET;
set datafusion.execution.parquet.pushdown_filters = true;
CREATE EXTERNAL TABLE test(part text, val text)
STORED AS PARQUET
PARTITIONED BY (part)
LOCATION 'test_files/scratch/test/';
SELECT * FROM test;
explain analyze
select * from test
where part != val;
> select * from test
where part != val;
+-----+------+
| val | part |
+-----+------+
| a | a |
| xyz | c |
| b | b |
+-----+------+
3 row(s) fetched.
Which is clearly wrong.
Expected behavior
> select * from test
where part != val;
+-----+------+
| val | part |
+-----+------+
| xyz | c |
+-----+------+
Additional context
No response