
fix(vortex-bench): map gs:// scheme to gcs storage label #8630

Merged
robert3005 merged 1 commit into vortex-data:develop from polarsignals:fix/vortex-bench-gcs-storage-scheme
Jun 30, 2026

@brancz brancz commented Jun 30, 2026

Contributor

Rationale for this change

Running the benchmark harness (datafusion-bench / query_bench) against a remote
dataset on Google Cloud Storage (--opt remote-data-dir=gs://…) fails immediately
during benchmark setup with:

Error: unknown URL scheme: gs

vortex-bench's url_scheme_to_storage helper — which maps a data-dir URL scheme to a
storage label used for result reporting — only handled s3 and file, so any gs://
run bailed before a single query executed. S3 remote runs work because s3 is handled;
GCS was simply never covered. make_object_store already supports gs:// for the actual
reads, so the only gap was this reporting helper.

What changes are included in this PR?

  • Add a STORAGE_GCS = "gcs" constant.
  • Add a "gs" arm to url_scheme_to_storage returning that label.
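The two changes above can be sketched roughly as follows. This is a minimal illustration rather than the actual vortex-bench source: the function signature, the `String` error type, and the `STORAGE_S3` and `STORAGE_LOCAL` names and values are assumptions made for the example. Only `STORAGE_GCS = "gcs"`, the `"gs"` arm, and the error text come from the description.

```rust
// Hypothetical sketch of the storage-label helper after this PR.
// Assumed for illustration: the signature, the error type, and the
// names/values of the two pre-existing labels.
const STORAGE_S3: &str = "s3"; // assumed label for s3://
const STORAGE_LOCAL: &str = "local"; // assumed label for file://
const STORAGE_GCS: &str = "gcs"; // constant added by this PR

/// Maps a data-dir URL scheme to the storage label used in result reporting.
fn url_scheme_to_storage(scheme: &str) -> Result<&'static str, String> {
    match scheme {
        "s3" => Ok(STORAGE_S3),
        "gs" => Ok(STORAGE_GCS), // arm added by this PR
        "file" => Ok(STORAGE_LOCAL),
        // Before the fix, "gs" fell through to this arm and aborted setup.
        other => Err(format!("unknown URL scheme: {other}")),
    }
}
```

Under these assumptions, `url_scheme_to_storage("gs")` yields `Ok("gcs")`, while an unhandled scheme still produces the `unknown URL scheme: …` error.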

Verified by running TPC-H SF1 from a GCS bucket end-to-end (DataFusion + Vortex, 22/22
queries executing against gs://…, results tagged storage=gcs).

What APIs are changed? Are there any user-facing changes?

None. This only affects the benchmark harness's storage-label reporting; no public API,
format, or behavior change outside vortex-bench.

🤖 Generated with Claude Code

url_scheme_to_storage only handled s3 and file, so benchmark runs against GCS (gs://) failed during setup with "unknown URL scheme: gs" before any query ran. Add a STORAGE_GCS constant and a gs arm. make_object_store already handles gs:// for the actual reads.

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@brancz brancz requested a review from a team June 30, 2026 16:23
@brancz brancz added the changelog/fix A bug fix label Jun 30, 2026
@robert3005 robert3005 enabled auto-merge (squash) June 30, 2026 16:27
@robert3005 robert3005 merged commit 797b650 into vortex-data:develop Jun 30, 2026
77 of 79 checks passed
codspeed-hq Bot commented Jun 30, 2026


Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 2 regressed benchmarks
✅ 1591 untouched benchmarks
⏩ 4 skipped benchmarks (see footnote 1)

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 168.9 µs | 205.7 µs | -17.87% |
| Simulation | slice_empty_vortex | 339.4 ns | 397.8 ns | -14.66% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 259.5 µs | 224.4 µs | +15.64% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 306.6 µs | 271.5 µs | +12.95% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing polarsignals:fix/vortex-bench-gcs-storage-scheme (e72acdc) with develop (5d3be01)

Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.
