Feature request
Which Delta project/connector is this regarding?
Overview
VACUUM ... USING INVENTORY <table_identifier> currently requires the inventory source to be a Delta table.
This is stricter than necessary. The inventory is ultimately consumed as a DataFrame and validated against the required inventory schema (path, length, isDir, modificationTime).
Motivation
The feature announcement describes inventory-based vacuum as a way to pass inventory "as a delta table or as a spark sql query":
https://delta.io/blog/efficient-delta-vacuum/
This request aims to close the gap between that documented usage and the current behavior of the table identifier form.
The command has two inventory paths with inconsistent behavior:
- USING INVENTORY <table_identifier> fails unless the source is Delta.
- USING INVENTORY (<subquery>) works with any SQL source.
Both paths end up in the same VACUUM flow where inventory schema is validated.
Because of this, users are forced to either:
- Materialize inventory data as Delta just to satisfy the identifier path, or
- Rewrite to subquery syntax as a workaround.
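A minimal sketch of the inconsistency described above. The names are hypothetical: my_schema.my_table stands for the Delta table being vacuumed, and inventory_parquet for a non-Delta table that already holds the file listing with the required columns.

```sql
-- Hypothetical names, for illustration only.

-- Identifier form: as described in this report, rejected today
-- because inventory_parquet is not a Delta table.
VACUUM my_schema.my_table USING INVENTORY inventory_parquet;

-- Subquery form: accepted today, reading the same underlying data.
VACUUM my_schema.my_table
USING INVENTORY (
  SELECT path, length, isDir, modificationTime
  FROM inventory_parquet
);
```

Both statements hand VACUUM the same four columns; only the way the source is named differs.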
Further details
Scope clarification:
- This request only changes how the USING INVENTORY <table_identifier> source is resolved.
- It does not change any controls for the VACUUM target table.
- The target remains Delta-only and still goes through existing Delta VACUUM safety and protocol checks.
Requested behavior:
- Allow <table_identifier> inventory sources to resolve to any analyzable relation (table/view/temp view), not only Delta.
- Keep the existing inventory schema validation as the safety check.
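As an illustration of the requested behavior, the following sketch passes a temp view by identifier. It assumes the change is in place; table_inventory, inventory_parquet, and the path prefix are hypothetical names, not taken from any existing setup.

```sql
-- Hypothetical temp view over non-Delta inventory data,
-- narrowed to the files under one table's location.
CREATE OR REPLACE TEMPORARY VIEW table_inventory AS
SELECT path, length, isDir, modificationTime
FROM inventory_parquet
WHERE path LIKE 's3://my-bucket/my_schema/my_table/%';

-- With the requested behavior, the view resolves like any other relation
-- and is then checked against the inventory schema.
VACUUM my_schema.my_table USING INVENTORY table_inventory;
```

A view that lacks one of the four required columns would still be rejected by the existing schema validation, so the safety check is unchanged.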
Suggested implementation:
- For the identifier branch, resolve to a DataFrame directly from the analyzed plan.
- Do not require getDeltaTable() for the inventory source.
- Continue using the existing strict schema validation to reject invalid inventory input.
This keeps safety unchanged while removing an unnecessary restriction and aligning identifier behavior with subquery behavior.
Willingness to contribute