Update ClickBench benchmarks with DataFusion 43.0.0 #13099

Closed
@alamb

Description

Is your feature request related to a problem or challenge?

Like #11567

Requires

Once DataFusion 43.0.0 is released, it would be great to update ClickBench (https://benchmark.clickhouse.com/) with runs from the latest version. It looks like we are still reporting numbers for DataFusion 40, and there have been significant improvements since then. See for more details:

Describe the solution you'd like

Perhaps we can follow the model of ClickHouse/ClickBench#210 (thanks @pmcgleenon).

We will also need to update the DataFusion ClickBench scripts to apply the new binary_as_string option added by @goldmedal in #12816. TL;DR: the CREATE EXTERNAL TABLE statements need the OPTIONS ('binary_as_string' 'true') clause.

https://github.com/ClickHouse/ClickBench/blob/main/datafusion/create_partitioned.sql

CREATE EXTERNAL TABLE hits
STORED AS PARQUET
LOCATION 'partitioned'
OPTIONS ('binary_as_string' 'true');

Note that this is the same behavior as the DuckDB runner, as explained in #12788.
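
A rough way to check that the option takes effect is sketched below. It assumes the statements are run in datafusion-cli from the directory that contains the 'partitioned' data, and it picks the "URL" column of the hits dataset purely as an illustration. The session-level setting in the trailing comment is an assumption about the configuration name and should be checked against the released version before relying on it.

```sql
-- Sketch only: create the table with the option from this issue.
CREATE EXTERNAL TABLE hits
STORED AS PARQUET
LOCATION 'partitioned'
OPTIONS ('binary_as_string' 'true');

-- Inspect the Arrow type DataFusion reports for a text column.
-- With the option applied, this should be a string type rather than Binary.
-- The column name is quoted because unquoted identifiers are lowercased.
SELECT arrow_typeof("URL") FROM hits LIMIT 1;

-- Possible alternative (assumed setting name, verify before use):
-- SET datafusion.execution.parquet.binary_as_string = true;
```

If the reported type is still a binary type, the option was most likely not picked up by the CREATE EXTERNAL TABLE statement.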

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement
