Add custom MultiFileReader for reading delete files #641
Conversation
Force-pushed from 8dfed69 to c45e07f
pdet left a comment
Hi @mchataigner, thanks for the PR!
Could you add a MinIO test that demonstrates fewer requests are made?
Could you also retarget it to v1.4?
@pdet you're welcome. I will make the changes.
Force-pushed from c45e07f to b625d74
Add custom MultiFileReader for reading delete files
**Context**: We experience slow read performance when a table has many delete files.
**TL;DR**: We can leverage the metadata already available in DuckLake to improve load time of delete files.
**Problem & Motivation:**
DuckLake stores `file_size` metadata for both data and delete files. For data files, there is already a mechanism to forward this metadata to the MultiFileReader and the underlying filesystem. The Parquet reader requires this `file_size` to access the footer metadata. When using an `HTTPFileSystem` instance (e.g., for S3, Azure), it performs a HEAD request on the file if metadata fields (`file_size`, `etag`, `last_modified`) are not present. Since all files in DuckLake are immutable, we can apply the same optimization logic for delete files to avoid these unnecessary HEAD requests.
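To make the idea concrete, here is a minimal sketch (not the PR's code) of what pre-populating the open-file metadata looks like, assuming DuckDB's `OpenFileInfo`/`ExtendedOpenFileInfo` API from recent versions. The helper name `MakeOpenFileInfo`, the header paths, and the exact option keys consulted by `HTTPFileSystem` are assumptions for illustration:

```cpp
// Hedged sketch; header paths, option keys, and the helper name are approximations.
#include "duckdb/common/file_system.hpp"   // brings in OpenFileInfo / ExtendedOpenFileInfo
#include "duckdb/common/types/value.hpp"

namespace duckdb {

// Hypothetical helper: attach the metadata DuckLake already stores so that
// HTTPFileSystem finds file_size/etag/last_modified and can skip the HEAD request.
static OpenFileInfo MakeOpenFileInfo(const string &path, idx_t file_size_bytes) {
	OpenFileInfo info(path);
	auto extended = make_shared_ptr<ExtendedOpenFileInfo>();
	extended->options["file_size"] = Value::UBIGINT(file_size_bytes);      // known from the catalog
	extended->options["etag"] = Value("");                                 // placeholder: files are immutable
	extended->options["last_modified"] = Value::TIMESTAMP(timestamp_t(0)); // epoch placeholder
	info.extended_info = std::move(extended);
	return info;
}

} // namespace duckdb
```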
**Solution:**
Implements a custom multi-file reading solution that pre-populates file metadata to eliminate redundant storage HEAD requests when scanning delete files:
**Key Changes** (a rough sketch of how these fit together follows the list):
1. **New `DeleteFileFunctionInfo` struct**: Extends `TableFunctionInfo` to carry `DuckLakeFileData` metadata through the table function binding process.
2. **Custom `DeleteFileMultiFileReader` class**:
- Extends DuckDB's `MultiFileReader` to intercept file list creation
- Pre-populates `ExtendedOpenFileInfo` with metadata already available from DuckLake:
- File size (`file_size_bytes`)
- ETag (empty string as placeholder)
- Last modified timestamp (set to epoch)
- Encryption key (if present)
- Creates a `SimpleMultiFileList` with this extended info upfront
- Overrides `CreateFileList()` to return the pre-built list, bypassing DuckDB's default file discovery
3. **Modified `ScanDeleteFile()` method**:
- Changed `parquet_scan` from const reference to mutable copy to allow modification
- Attaches `DeleteFileFunctionInfo` and custom reader factory to the table function
- Passes the actual `parquet_scan` function to `TableFunctionBindInput` instead of a dummy function, ensuring proper function context
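A rough, hedged sketch of how the pieces above could fit together (not the actual PR code): the class and method names come from the list above, while the `CreateFileList` override signature, the header paths, and the `DuckLakeFileData` field names are assumptions that may differ by DuckDB version:

```cpp
// Hedged sketch only; signatures and field names are assumptions.
#include "duckdb/common/multi_file/multi_file_reader.hpp" // header path varies across DuckDB versions
#include "duckdb/function/table_function.hpp"

namespace duckdb {

// (1) Carries DuckLake's delete-file metadata through table function binding.
struct DeleteFileFunctionInfo : public TableFunctionInfo {
	explicit DeleteFileFunctionInfo(vector<DuckLakeFileData> files_p) : files(std::move(files_p)) {
	}
	vector<DuckLakeFileData> files; // assumed fields: path, file_size_bytes, optional encryption key
};

// (2) Hands DuckDB a pre-built file list instead of letting it glob/stat the paths.
class DeleteFileMultiFileReader : public MultiFileReader {
public:
	explicit DeleteFileMultiFileReader(vector<OpenFileInfo> files_p) : files(std::move(files_p)) {
	}

	// Build one pre-populated OpenFileInfo per delete file, using the same idea as
	// the earlier sketch: forward size/etag/last_modified metadata up front.
	static vector<OpenFileInfo> FromDuckLakeMetadata(const vector<DuckLakeFileData> &delete_files) {
		vector<OpenFileInfo> result;
		for (auto &file : delete_files) {
			OpenFileInfo info(file.path);
			auto extended = make_shared_ptr<ExtendedOpenFileInfo>();
			extended->options["file_size"] = Value::UBIGINT(file.file_size_bytes);
			extended->options["etag"] = Value("");                                 // placeholder
			extended->options["last_modified"] = Value::TIMESTAMP(timestamp_t(0)); // epoch
			// encryption key, if present, would be forwarded here as well
			info.extended_info = std::move(extended);
			result.push_back(std::move(info));
		}
		return result;
	}

	// Signature approximated; the point is to return the pre-built SimpleMultiFileList
	// so no globbing or HEAD requests happen during file discovery.
	shared_ptr<MultiFileList> CreateFileList(ClientContext &context, const vector<OpenFileInfo> &paths,
	                                         FileGlobOptions options) override {
		return make_shared_ptr<SimpleMultiFileList>(files);
	}

private:
	vector<OpenFileInfo> files;
};

// (3) In ScanDeleteFile() the pieces are attached to a *copy* of parquet_scan
//     (pseudocode; member names assumed):
//
//   auto parquet_scan = original_parquet_scan;                        // mutable copy
//   parquet_scan.function_info = make_shared_ptr<DeleteFileFunctionInfo>(delete_files);
//   // plus a factory hook so the scan uses DeleteFileMultiFileReader
//   TableFunctionBindInput bind_input(/*...,*/ parquet_scan /*, ...*/); // real function, not a dummy

} // namespace duckdb
```

Because the file list is built entirely from catalog metadata, the reader never has to stat or glob the delete files, which is where the HEAD requests were coming from.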
**Performance Impact**: Eliminates HEAD requests to object storage when opening Parquet delete files. This is particularly beneficial when working with remote storage (S3, Azure, etc.) and tables with many delete files, where HEAD requests were causing significant performance bottlenecks.
Force-pushed from b625d74 to db066de
@pdet sorry for the delay, I updated the PR with a format fix and added a test with MinIO.
pdet left a comment
Hi, thanks again for the changes, just a small comment wrt the test:
statement ok
SET autoinstall_known_extensions=1;

statement ok
SET autoload_known_extensions=1;
I think we can ensure the test only runs in the MinIO CI by having
require httpfs
require-env S3_TEST_SERVER_AVAILABLE 1
and S3_TEST_SERVER_AVAILABLE: 1 in the .github/workflows/MinIO.yml env.
I've done it @pdet 🫡
Then I think you don't need this:
statement ok
SET autoinstall_known_extensions=1;
statement ok
SET autoload_known_extensions=1;
Or am I wrong?
of course, done!
@pdet I believe I should re-open a new PR because it seems broken. The branch got wrongly deleted on our end, then I added a new commit to it and repushed it - I think GH is now lost. I cannot close this PR myself, but feel free to close it so I can reopen it.
Sure!
