Benchmark: databend-common-ast vs other Rust SQL parsers on PostgreSQL workloads #19485

LucaCappelletti94 · 2026-02-25T19:22:27Z


I have published an open-source benchmark comparing Rust SQL parsers on real-world PostgreSQL statements, including databend-common-ast. Sharing the results here as they may be useful to the team.

Benchmark repo: https://github.com/LucaCappelletti94/sql_ast_benchmark

What the benchmark measures

Performance: parse throughput on the Spider + Gretel PostgreSQL datasets (4,505 statements: SELECT, INSERT, UPDATE, DELETE), measured at batch sizes 1–1000.

Correctness: evaluated against the sqlparser-rs test suite using pg_query.rs (libpg_query - the actual PostgreSQL parser) as ground truth. Four metrics:

| Metric | Definition | Direction |
|---|---|---|
| Recall | Of SQL pg_query accepts, how many does the parser also accept? | ↑ higher is better |
| False-positive rate | Of SQL pg_query rejects, how many does the parser wrongly accept? | ↓ lower is better |
| Round-trip stability | `parse → print → re-parse → re-print` produces identical output? | ↑ higher is better |
| Fidelity | `pg_query_canonical(parser_output) == pg_query_canonical(original)` - semantically correct output, not just self-consistent? | ↑ higher is better |
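As a rough illustration of how three of these metrics can be computed, here is a minimal sketch. The names `Outcome`, `recall_and_fpr`, and `round_trip_stable` are hypothetical and exist only for this example; they are not taken from the benchmark repo, and the parser and printer are passed in as closures rather than tied to any real crate API.

```rust
/// One benchmark observation (hypothetical type, for illustration only).
#[derive(Clone, Copy)]
pub struct Outcome {
    /// Did pg_query (the ground truth) accept the statement?
    pub reference_accepts: bool,
    /// Did the parser under test accept the statement?
    pub parser_accepts: bool,
}

/// Returns (recall, false_positive_rate) as fractions in [0, 1].
/// A rate is `None` when its denominator is zero.
pub fn recall_and_fpr(outcomes: &[Outcome]) -> (Option<f64>, Option<f64>) {
    let (mut valid, mut valid_accepted) = (0u32, 0u32);
    let (mut invalid, mut invalid_accepted) = (0u32, 0u32);
    for o in outcomes {
        if o.reference_accepts {
            valid += 1;
            if o.parser_accepts {
                valid_accepted += 1;
            }
        } else {
            invalid += 1;
            if o.parser_accepts {
                invalid_accepted += 1;
            }
        }
    }
    let ratio = |num: u32, den: u32| (den > 0).then(|| f64::from(num) / f64::from(den));
    (ratio(valid_accepted, valid), ratio(invalid_accepted, invalid))
}

/// parse -> print -> re-parse -> re-print, comparing the two printed forms.
/// Returns `None` if the parser rejects the input in the first place,
/// and `Some(false)` if the printed form fails to re-parse.
pub fn round_trip_stable<A, E>(
    sql: &str,
    parse: impl Fn(&str) -> Result<A, E>,
    print: impl Fn(&A) -> String,
) -> Option<bool> {
    let first = print(&parse(sql).ok()?);
    match parse(&first) {
        Ok(ast) => Some(print(&ast) == first),
        Err(_) => Some(false),
    }
}
```

Fidelity would follow the same shape as `round_trip_stable`, with pg_query's canonical form applied to both the original and the printed output before comparing; that step is left out here because it depends on the pg_query API.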

Performance results

Methodology note

databend-common-ast parses one statement per call, so the benchmark splits the concatenated input on ; and invokes the parser once per statement. This matches the natural API contract but adds a small string-splitting overhead compared to parsers that accept a multi-statement string natively.
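The per-statement driver described above might look roughly like the following. This is a sketch under assumptions, not the benchmark's actual code: `parse_each` is a hypothetical helper, and `parse_one` stands in for whatever databend-common-ast entry point is used to parse a single statement.

```rust
/// Splits `input` on `;` and calls `parse_one` once per non-empty statement.
///
/// NOTE: a plain split is only safe when no statement contains `;` inside a
/// string literal or comment. That is an assumption of this sketch.
pub fn parse_each<T, E>(
    input: &str,
    parse_one: impl Fn(&str) -> Result<T, E>,
) -> Vec<Result<T, E>> {
    input
        .split(';')
        .map(str::trim)
        .filter(|stmt| !stmt.is_empty())
        .map(parse_one)
        .collect()
}
```

The extra work relative to a multi-statement parser is one pass over the input to find separators plus one call per statement, which is the splitting overhead mentioned above.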

SELECT throughput (batch of N statements)

| Batch size | sqlparser-rs | databend-common-ast | pg_query.rs |
|---|---|---|---|
| 1 | 6.6 µs | 11.2 µs | 11.0 µs |
| 10 | 124.9 µs | 184.7 µs | 198.5 µs |
| 100 | 800.6 µs | 1.37 ms | 1.44 ms |
| 1000 | 10.09 ms | 16.16 ms | 17.67 ms |

Other statement types (full corpus batch)

| Statement type | Corpus size | sqlparser-rs | databend-common-ast |
|---|---|---|---|
| INSERT | 993 stmts | 8.37 ms | 9.07 ms |
| UPDATE | 984 stmts | 6.23 ms | 6.65 ms |
| DELETE | 934 stmts | 5.53 ms | 6.09 ms |

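For INSERT/UPDATE/DELETE, databend-common-ast is within roughly 7–10% of sqlparser-rs wall-clock time (for example 9.07 ms vs 8.37 ms on INSERT). On SELECT the gap is wider: the figures above put it at about 48–71% slower than sqlparser-rs (11.2 µs vs 6.6 µs at batch size 1), while slightly faster than pg_query.rs at batch sizes of 10 and above.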
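For readers who want to recompute the relative gap from the tables, a small helper is enough. `slowdown_pct` is a hypothetical name used only for this sketch.

```rust
/// Relative slowdown of `candidate` versus `baseline`, in percent.
/// Both timings must be expressed in the same unit.
pub fn slowdown_pct(baseline: f64, candidate: f64) -> f64 {
    (candidate / baseline - 1.0) * 100.0
}
```

With the INSERT row, `slowdown_pct(8.37, 9.07)` comes out at about 8.4; with the batch-size-1 SELECT row, `slowdown_pct(6.6, 11.2)` comes out at about 69.7.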

Parse success rate on real-world corpus (Spider + Gretel, PostgreSQL-validated)

| Statement type | Parse success rate |
|---|---|
| SELECT | 99.2% |
| INSERT | 94.3% |
| UPDATE | 98.2% |
| DELETE | 97.3% |

The 1–6% failures reflect PostgreSQL-specific constructs (RETURNING, certain type casts, PG-specific syntax) that fall outside the Databend/ClickHouse dialect focus.

Correctness results

Tested against the sqlparser-rs test suite on three corpora, using pg_query as PostgreSQL ground truth. Each cell shows the absolute count followed by the percentage.

PostgreSQL-specific tests (312 valid / 129 invalid)

| Metric | databend-common-ast | sqlparser-rs (reference) |
|---|---|---|
| Recall | 40/312 - 13% | 310/312 - 99% |
| False-positive rate | 2/129 - 1.6% | 37/129 - 28.7% |
| Round-trip | 40/40 - 100% | 310/310 - 100% |
| Fidelity | 31/40 - 77.5% | 306/310 - 98.7% |

Common (all-dialect) tests (323 valid / 469 invalid)

| Metric | databend-common-ast | sqlparser-rs (reference) |
|---|---|---|
| Recall | 177/323 - 55% | 318/323 - 98% |
| False-positive rate | 36/469 - 7.7% | 141/469 - 30.1% |
| Round-trip | 177/177 - 100% | 318/318 - 100% |
| Fidelity | 150/177 - 84.7% | 318/318 - 100% |

TPC-H / regression tests (21 valid / 1 invalid)

| Metric | databend-common-ast | sqlparser-rs (reference) |
|---|---|---|
| Recall | 20/21 - 95% | 21/21 - 100% |
| False-positive rate | 0/1 - 0% | 1/1 - 100%* |
| Round-trip | 20/20 - 100% | 21/21 - 100% |
| Fidelity | 19/20 - 95% | 21/21 - 100% |

* This corpus contains only 1 invalid statement; sqlparser-rs accepts it (false positive), databend does not.

Key correctness observations

Strengths:

Lowest false-positive rate of any parser tested: 1.6–7.7% on the two larger corpora (and 0/1 on TPC-H) vs sqlparser-rs's 28.7–30.1%. databend-common-ast is the most conservative parser benchmarked - it rarely accepts SQL that PostgreSQL rejects. This is a strong property for applications that need strict dialect enforcement.
Perfect round-trip stability: every statement that databend-common-ast accepts round-trips cleanly through parse → print → re-parse → re-print. Zero regressions observed across all corpora.
Strong TPC-H / analytical SQL recall (95%): for complex analytical queries (multi-join, aggregations, subqueries, window functions), databend-common-ast handles nearly everything sqlparser-rs does. This suggests the core SQL expression and query grammar is solid.

Areas for improvement:

Low recall on PostgreSQL-specific syntax (13%): the PG-specific corpus covers DDL extensions and PostgreSQL type system features (ENUM, DOMAIN, SEQUENCE, PG-specific operators, array syntax, etc.). Most failures are expected given the Databend/ClickHouse dialect focus, but these are the concrete gaps if PostgreSQL compatibility is a goal.
Moderate recall on common SQL (55%): even on standard SQL constructs shared across dialects, databend-common-ast accepts 55% of what pg_query accepts (vs 98% for sqlparser-rs). This suggests several common SQL constructs are not yet supported beyond the Databend dialect.
Fidelity gap on accepted statements: among statements the parser accepts, 77.5–84.7% produce semantically equivalent output under pg_query's canonical form (vs 98.7–100% for sqlparser-rs). Round-trip is perfect (the parser is self-consistent), but the pretty-printer normalizes some constructs in ways that change semantics - for example stripping parenthesized join grouping (which also causes two of the three panics described below) or altering implicit cast representation.

Reliability concern - panics from internal assert_reparse:

Three statements in the test corpus cause databend-common-ast to panic. The panics do not come from malformed input crashing the parser - the parser succeeds on all three. They originate from assert_reparse in parser.rs (lines 178/182), a round-trip consistency check called unconditionally after every successful parse. It re-parses the pretty-printed AST and asserts the two ASTs are equal. When the pretty-printer produces inequivalent output, the assertion fires.

The three reproduction cases and their root causes:

```sql
-- Pretty-printer emits a spurious leading comma in the column list,
-- causing the re-parse to fail with a syntax error (Err turned into panic via map_err).
CREATE TABLE t (CONSTRAINT positive CHECK (2 > 1))

-- Pretty-printer loses parenthesized join grouping, flattening nested joins
-- into a left-associative chain. Re-parsed AST has a different tree structure -> assert_eq! fires.
SELECT * FROM a NATURAL JOIN (b NATURAL JOIN (c NATURAL JOIN d NATURAL JOIN e)) NATURAL JOIN (f NATURAL JOIN (g NATURAL JOIN h))

-- Same class of bug, simpler case: triple parentheses stripped, associativity changes.
SELECT * FROM a NATURAL JOIN (((b NATURAL JOIN c)))
```

Two distinct pretty-printer bugs are involved: a spurious comma in CONSTRAINT ... CHECK output, and loss of parenthesized join grouping. The assert_reparse check is called unconditionally (not behind #[cfg(debug_assertions)]), so these panics occur in release builds. The benchmark wraps all databend calls in std::panic::catch_unwind to handle this; without that guard, a panic would unwind and terminate the calling thread (or the whole process, if it happens on the main thread). Note that catch_unwind only helps under the default panic = "unwind" strategy; with panic = "abort" the process exits immediately.
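A guard of this kind can be sketched as follows. This is an illustration under assumptions rather than the benchmark's code: `Guarded` and `parse_guarded` are hypothetical names, and `parse` stands in for the databend-common-ast call being protected. Only `std::panic::catch_unwind` and `AssertUnwindSafe` are real standard-library items.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Outcome of one guarded parser call (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Guarded<T, E> {
    /// The parser returned `Ok`.
    Parsed(T),
    /// The parser returned `Err`, i.e. an ordinary rejection.
    Rejected(E),
    /// The parser panicked; the payload message is kept for reporting.
    Panicked(String),
}

/// Runs `parse(sql)` and converts a panic into a value instead of letting
/// it unwind into the caller. Requires the default `panic = "unwind"`.
pub fn parse_guarded<T, E>(
    sql: &str,
    parse: impl Fn(&str) -> Result<T, E>,
) -> Guarded<T, E> {
    match panic::catch_unwind(AssertUnwindSafe(|| parse(sql))) {
        Ok(Ok(ast)) => Guarded::Parsed(ast),
        Ok(Err(err)) => Guarded::Rejected(err),
        Err(payload) => {
            // Panic payloads are usually `&'static str` or `String`.
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "non-string panic payload".to_string());
            Guarded::Panicked(msg)
        }
    }
}
```

Keeping `Panicked` separate from `Rejected` lets a harness count panics as a reliability issue rather than folding them into ordinary parse failures. The default panic hook still prints the panic message to stderr unless it is replaced.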

Summary

databend-common-ast has genuinely impressive properties: the lowest false-positive rate of any pure-Rust parser tested, perfect round-trip stability, strong TPC-H coverage, and wall-clock performance within about 10% of sqlparser-rs on INSERT/UPDATE/DELETE and comparable to pg_query.rs on SELECT (though roughly 50–70% behind sqlparser-rs there). The primary gap for general-purpose use is recall - particularly on PG-specific and common SQL - which reflects the Databend dialect focus rather than a fundamental design limitation.

If the team is interested in expanding PostgreSQL dialect coverage, the test corpora from this benchmark could be useful as a regression suite. I am happy to open a PR with the raw SQL test files, or to discuss specific failure patterns further.

Thanks for building and maintaining databend-common-ast - it is a well-engineered codebase.
