
bench: bit-packed compare-constant baseline #8012

Merged: joseph-isaacs merged 1 commit into develop from claude/bitpack-compare-bench-KGPS3 on May 18, 2026

@joseph-isaacs

Contributor

Summary

  • Adds a divan benchmark, bitpack_compare, in vortex-fastlanes. It runs Operator::Eq and Operator::Lt against an out-of-range constant on a BitPackedData array and compares that with an explicit "decompress, then Arrow compare" baseline that first materialises the unpacked PrimitiveArray.
  • The constant is 1 << BW, the first value past the packable range, so a future compare-constant kernel can recognise it and short-circuit. Today both arms decompress; this PR establishes the baseline that follow-up will be measured against.
  • Grid sized for fast runs: len ∈ {1024, 65536}, bit_width ∈ {4, 16}, Eq + Lt.
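As a rough illustration of what the baseline arm measures, the sketch below packs unsigned values, unpacks them in full, and only then compares against `1 << BW` over the same grid. It is a minimal stand-in, not the benchmark itself: `pack`, `unpack`, `baseline_eq` and `baseline_lt` are hypothetical helpers, the layout is a simple contiguous LSB-first one rather than the FastLanes layout, and nulls and any out-of-band values are ignored.

```rust
/// Pack `values` LSB-first using `bw` bits per value.
/// Simplified contiguous layout (an assumption), not the FastLanes layout.
fn pack(values: &[u32], bw: u32) -> Vec<u8> {
    assert!((1..=32).contains(&bw));
    let mut out = vec![0u8; (values.len() * bw as usize + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        assert!(bw == 32 || v < (1u32 << bw), "value does not fit in bw bits");
        for b in 0..bw as usize {
            if (v >> b) & 1 == 1 {
                let bit = i * bw as usize + b;
                out[bit / 8] |= 1u8 << (bit % 8);
            }
        }
    }
    out
}

/// Materialise all `len` values again: the "decompress" step.
fn unpack(packed: &[u8], bw: u32, len: usize) -> Vec<u32> {
    (0..len)
        .map(|i| {
            (0..bw as usize).fold(0u32, |acc, b| {
                let bit = i * bw as usize + b;
                acc | ((((packed[bit / 8] >> (bit % 8)) & 1) as u32) << b)
            })
        })
        .collect()
}

/// Baseline: decompress everything, then compare element-wise.
fn baseline_eq(packed: &[u8], bw: u32, len: usize, c: u32) -> Vec<bool> {
    unpack(packed, bw, len).into_iter().map(|v| v == c).collect()
}

fn baseline_lt(packed: &[u8], bw: u32, len: usize, c: u32) -> Vec<bool> {
    unpack(packed, bw, len).into_iter().map(|v| v < c).collect()
}

fn main() {
    // Same grid as the benchmark: len in {1024, 65536}, bit_width in {4, 16}.
    for &len in &[1024usize, 65536] {
        for &bw in &[4u32, 16] {
            let values: Vec<u32> = (0..len as u32).map(|i| i % (1u32 << bw)).collect();
            let packed = pack(&values, bw);
            let c = 1u32 << bw; // first value that cannot be packed in `bw` bits
            // Every element is unpacked even though the answer is uniform.
            assert!(baseline_eq(&packed, bw, len, c).iter().all(|&b| !b));
            assert!(baseline_lt(&packed, bw, len, c).iter().all(|&b| b));
        }
    }
    println!("ok");
}
```

The result is the same for every element, yet the work still scales with `len`; that cost is what the baseline numbers capture.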

The follow-up optimization (out-of-range fast path on BitPacked, plus the bitpack_constant analytical encoder) is in #PR2-PLACEHOLDER, stacked on this branch. Splitting the bench out lets the speedup PR show concrete numbers against this measured baseline.
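The idea behind an out-of-range fast path can be sketched as below. This is not the follow-up PR's code: the local `Operator` enum and `out_of_range_shortcut` are hypothetical stand-ins, and the sketch assumes unsigned values that all fit in `bit_width` bits, with null handling left out.

```rust
/// Local stand-in for the two operators the benchmark exercises.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operator {
    Eq,
    Lt,
}

/// Returns `Some(result)` when the comparison has the same answer for every
/// element and can be decided from the bit width alone, `None` otherwise.
/// Assumes unsigned values that all fit in `bit_width` bits.
fn out_of_range_shortcut(op: Operator, bit_width: u32, constant: u64) -> Option<bool> {
    assert!(bit_width < 64);
    let max_packable = (1u64 << bit_width) - 1;
    if constant <= max_packable {
        return None; // in range: the values must actually be inspected
    }
    Some(match op {
        Operator::Eq => false, // no packed value can equal the constant
        Operator::Lt => true,  // every packed value is below the constant
    })
}

fn main() {
    for &bw in &[4u32, 16] {
        let c = 1u64 << bw; // the constant used by the benchmark
        assert_eq!(out_of_range_shortcut(Operator::Eq, bw, c), Some(false));
        assert_eq!(out_of_range_shortcut(Operator::Lt, bw, c), Some(true));
        // The largest packable value is still in range, so no shortcut applies.
        assert_eq!(out_of_range_shortcut(Operator::Eq, bw, c - 1), None);
    }
    println!("ok");
}
```

A check like this costs the same regardless of array length, which is why a separate, measured baseline makes the expected speedup easy to see.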

Test plan

  • cargo check -p vortex-fastlanes --benches
  • cargo bench -p vortex-fastlanes --bench bitpack_compare records the slow baseline numbers prior to the follow-up landing

🤖 Generated with Claude Code

Add `bitpack_compare` divan bench in vortex-fastlanes that pits a binary
`Operator::Eq` / `Operator::Lt` comparison of a `BitPackedData` array with an
out-of-range constant against an explicit "decompress, then Arrow compare"
baseline that materialises the unpacked `PrimitiveArray` first.

The constant is chosen as `1 << BW`, i.e. just past the packable range, so a
future kernel that recognises out-of-range constants can short-circuit it.
Today both arms decompress; the benchmark establishes a baseline for that
upcoming optimization to land against. Sized small (`len ∈ {1024, 65536}`,
`bit_width ∈ {4, 16}`, Eq + Lt) so it finishes quickly.

Run with `cargo bench -p vortex-fastlanes --bench bitpack_compare`.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@codspeed-hq

codspeed-hq Bot commented May 18, 2026


Merging this PR will improve performance by 18%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 4 improved benchmarks
✅ 1217 untouched benchmarks
🆕 16 new benchmarks

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| | Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 197.9 µs | 162 µs | +22.19% |
| | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 358.4 µs | 323.5 µs | +10.78% |
| | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 211.2 µs | 175.8 µs | +20.11% |
| | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 224.8 µs | 188.6 µs | +19.23% |
| 🆕 | Simulation | baseline_eq[16, 1024] | N/A | 64.1 µs | N/A |
| 🆕 | Simulation | baseline_lt[16, 1024] | N/A | 64.4 µs | N/A |
| 🆕 | Simulation | baseline_lt[16, 65536] | N/A | 274.5 µs | N/A |
| 🆕 | Simulation | baseline_lt[4, 1024] | N/A | 64.1 µs | N/A |
| 🆕 | Simulation | baseline_eq[16, 65536] | N/A | 259.4 µs | N/A |
| 🆕 | Simulation | baseline_lt[4, 65536] | N/A | 251.9 µs | N/A |
| 🆕 | Simulation | fast_eq_out_of_range[16, 65536] | N/A | 291.1 µs | N/A |
| 🆕 | Simulation | fast_eq_out_of_range[4, 1024] | N/A | 67 µs | N/A |
| 🆕 | Simulation | baseline_eq[4, 1024] | N/A | 63.1 µs | N/A |
| 🆕 | Simulation | baseline_eq[4, 65536] | N/A | 237.9 µs | N/A |
| 🆕 | Simulation | fast_eq_out_of_range[16, 1024] | N/A | 67.7 µs | N/A |
| 🆕 | Simulation | fast_lt_out_of_range[16, 65536] | N/A | 306.3 µs | N/A |
| 🆕 | Simulation | fast_lt_out_of_range[4, 65536] | N/A | 262.1 µs | N/A |
| 🆕 | Simulation | fast_eq_out_of_range[4, 65536] | N/A | 246 µs | N/A |
| 🆕 | Simulation | fast_lt_out_of_range[16, 1024] | N/A | 67.9 µs | N/A |
| 🆕 | Simulation | fast_lt_out_of_range[4, 1024] | N/A | 87.8 µs | N/A |


Comparing claude/bitpack-compare-bench-KGPS3 (5f53991) with develop (faf7e42)


@joseph-isaacs joseph-isaacs enabled auto-merge (squash) May 18, 2026 17:24
@joseph-isaacs joseph-isaacs merged commit 7b47788 into develop May 18, 2026
66 of 67 checks passed
@joseph-isaacs joseph-isaacs deleted the claude/bitpack-compare-bench-KGPS3 branch May 18, 2026 17:26

Labels

changelog/skip Do not list PR in the changelog

2 participants