Add support for position and equality deletes in vortex #17

Merged: robert3005 merged 9 commits into main from rk/deletes on Jun 23, 2026
@robert3005

No description provided.

@robert3005 robert3005 force-pushed the rk/deletes branch 2 times, most recently from 0319b57 to a314dc0 Compare June 15, 2026 13:57
@github-actions github-actions Bot added the SPARK label Jun 15, 2026
@robert3005 robert3005 changed the title Add support for position deletes in vortex Add support for position and equality deletes in vortex Jun 15, 2026
robert3005 and others added 6 commits June 16, 2026 13:16
Pushes position deletes into the Vortex scan so deleted rows are excluded natively instead of being read and filtered out afterwards.

DeleteFilter exposes the deleted positions for pushdown (skipped when the _is_deleted column is projected, since those rows must be marked rather than removed). GenericReader forwards them only when the reader advertises support via the new ReadBuilder.supportsPositionDeletes(), so Parquet/ORC/Avro keep applying deletes post-scan. VortexIterable serializes the positions as a portable 64-bit Roaring bitmap and applies EXCLUDE_ROARING row selection.

Also adds a Vortex position-delete writer (PositionDeleteVortexWriter, VortexFormatModel.forPositionDeletes) for writing path/pos delete files, plus TestVortexPositionDeletes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
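
The gating described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Reader` interface, both reader classes, and the helper methods are hypothetical, and a plain `TreeSet<Long>` stands in for the 64-bit Roaring bitmap so the example needs no external library. Only the idea of checking `supportsPositionDeletes()` and skipping pushdown when `_is_deleted` is projected comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class PositionDeletePushdownSketch {
  /** Hypothetical reader; only the capability check mirrors the commit message. */
  interface Reader {
    boolean supportsPositionDeletes();

    /** Returns row positions in [0, rowCount), skipping positions in excluded if supported. */
    List<Long> scan(long rowCount, Set<Long> excluded);
  }

  /** Reader that applies the exclusion set during the scan (the Vortex-like case). */
  static final class NativeExcludeReader implements Reader {
    public boolean supportsPositionDeletes() { return true; }

    public List<Long> scan(long rowCount, Set<Long> excluded) {
      List<Long> rows = new ArrayList<>();
      for (long pos = 0; pos < rowCount; pos++) {
        if (!excluded.contains(pos)) rows.add(pos);
      }
      return rows;
    }
  }

  /** Reader that ignores exclusions (the Parquet/ORC/Avro-like case). */
  static final class PlainReader implements Reader {
    public boolean supportsPositionDeletes() { return false; }

    public List<Long> scan(long rowCount, Set<Long> excluded) {
      List<Long> rows = new ArrayList<>();
      for (long pos = 0; pos < rowCount; pos++) rows.add(pos);
      return rows;
    }
  }

  /** Deleted positions are forwarded only if the reader supports them and _is_deleted is not projected. */
  static Set<Long> positionsToPushDown(Reader reader, Set<Long> deleted, boolean isDeletedProjected) {
    boolean pushDown = reader.supportsPositionDeletes() && !isDeletedProjected;
    return pushDown ? deleted : Set.of();
  }

  /** Live rows when _is_deleted is not projected: pushed down if possible, filtered post-scan otherwise. */
  static List<Long> liveRows(Reader reader, long rowCount, Set<Long> deleted) {
    Set<Long> pushed = positionsToPushDown(reader, deleted, false);
    List<Long> rows = reader.scan(rowCount, pushed);
    if (pushed.isEmpty()) {
      rows.removeIf(deleted::contains); // post-scan fallback
    }
    return rows;
  }

  public static void main(String[] args) {
    Set<Long> deleted = new TreeSet<>(List.of(1L, 3L));
    // Both paths print [0, 2, 4]; only where the filtering happens differs.
    System.out.println(liveRows(new NativeExcludeReader(), 5, deleted));
    System.out.println(liveRows(new PlainReader(), 5, deleted));
  }
}
```

Both readers yield the same live rows. The difference is that the first never materializes the deleted rows, which is the saving the commit message describes.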
VortexSchemas.convert (used to bind scan filters in VortexIterable) threw on struct columns because toIcebergType had no Struct branch. Recurse into struct children, assigning unique field ids via a shared counter (also fixes the latent duplicate-id bug where list elements were hardcoded to id 0). Map stays unsupported.

This unblocks reading Vortex tables whose schema contains structs through the generic reader (which always binds the residual filter), so the full DeleteReadTests suite now runs for Vortex via TestVortexReaderDeletes (data + v2 position-delete files + v3 DVs, written through the standard GenericFileWriterFactory/registry path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Robert Kruszewski <github@robertk.io>
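
A rough sketch of the shared-counter idea from this commit message: recurse into struct children and list elements while drawing every field id from one counter, so no two fields collide. The type model (`SrcType`, `Primitive`, `ListOf`, `MapOf`, `Struct`), the dotted-path naming, the starting id of 1, and the depth-first assignment order are all assumptions made for illustration. They are not taken from `VortexSchemas` or `toIcebergType`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class FieldIdAssignmentSketch {
  // Hypothetical minimal source type model.
  interface SrcType {}

  static final class Primitive implements SrcType {}

  static final class MapOf implements SrcType {}

  static final class ListOf implements SrcType {
    final SrcType element;

    ListOf(SrcType element) { this.element = element; }
  }

  static final class Struct implements SrcType {
    final List<String> names;
    final List<SrcType> types;

    Struct(List<String> names, List<SrcType> types) {
      this.names = names;
      this.types = types;
    }
  }

  /** Assigns an id to every nested field, keyed by dotted path, from a single counter. */
  static Map<String, Integer> assignIds(Struct root) {
    Map<String, Integer> ids = new LinkedHashMap<>();
    assign("", root, new AtomicInteger(1), ids);
    return ids;
  }

  private static void assign(String path, SrcType type, AtomicInteger nextId, Map<String, Integer> ids) {
    if (type instanceof Struct) {
      Struct struct = (Struct) type;
      for (int i = 0; i < struct.names.size(); i++) {
        String child = path.isEmpty() ? struct.names.get(i) : path + "." + struct.names.get(i);
        ids.put(child, nextId.getAndIncrement());
        assign(child, struct.types.get(i), nextId, ids); // recurse into struct children
      }
    } else if (type instanceof ListOf) {
      String element = path + ".element";
      ids.put(element, nextId.getAndIncrement()); // element id comes from the counter, not a constant
      assign(element, ((ListOf) type).element, nextId, ids);
    } else if (type instanceof MapOf) {
      throw new UnsupportedOperationException("Map is not supported: " + path);
    }
    // Primitives have no children, so there is nothing further to assign.
  }

  /** Example schema: struct<id, tags: list, loc: struct<x, ys: list>>. */
  static Struct example() {
    Struct loc = new Struct(
        Arrays.asList("x", "ys"),
        Arrays.<SrcType>asList(new Primitive(), new ListOf(new Primitive())));
    return new Struct(
        Arrays.asList("id", "tags", "loc"),
        Arrays.<SrcType>asList(new Primitive(), new ListOf(new Primitive()), loc));
  }

  public static void main(String[] args) {
    // Prints {id=1, tags=2, tags.element=3, loc=4, loc.x=5, loc.ys=6, loc.ys.element=7}
    System.out.println(assignIds(example()));
  }
}
```

Had the list branch used a fixed id, `tags.element` and `loc.ys.element` would share one id, which is the kind of duplicate the commit message says it fixes.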
@robert3005 robert3005 merged commit a6b8409 into main Jun 23, 2026
53 of 58 checks passed
@robert3005 robert3005 deleted the rk/deletes branch June 23, 2026 13:49