Closed
Description
Feature Request / Improvement
When engines such as Daft read from the `Table` object (see `scan_iceberg`), it would be great if PyIceberg handled time travel transparently.
For example, to query an Iceberg table at a specific commit or timestamp, we could use PyIceberg to time travel to the particular snapshot ID or timestamp and then pass the resulting object into the engine.
There are several options to achieve this:
- Construct a `Table` object with the metadata of a specific `Snapshot`, perhaps through a function like `Table.as_of(snapshot_id/timestamp) -> Table`. This would make time travel transparent to the engine.
- Pass the `Snapshot` object to the engine. The function `Table.snapshot_by_id -> Snapshot` already exists, and a `Snapshot` represents a specific Iceberg commit. The engine would then need to be able to read from both `Snapshot` and `Table`.
Happy to explore other options as well.
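To make the first option more concrete, here is a minimal sketch of how an `as_of` lookup might resolve a snapshot ID or a timestamp to a pinned table. The `Snapshot` and `Table` classes below are simplified, hypothetical stand-ins and not PyIceberg's real classes. `as_of` is the function proposed above and does not exist yet, and the rule "latest snapshot committed at or before the timestamp" is an assumption about the desired semantics.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for one Iceberg commit."""
    snapshot_id: int
    timestamp_ms: int


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table: a snapshot list plus a pointer."""
    snapshots: Tuple[Snapshot, ...]
    current_snapshot_id: Optional[int]

    def snapshot_by_id(self, snapshot_id: int) -> Optional[Snapshot]:
        # Linear search over the known snapshots; None if the ID is unknown.
        return next(
            (s for s in self.snapshots if s.snapshot_id == snapshot_id), None
        )

    def as_of(
        self,
        *,
        snapshot_id: Optional[int] = None,
        timestamp_ms: Optional[int] = None,
    ) -> Table:
        # Proposed API (assumption): accept exactly one of the two selectors.
        if (snapshot_id is None) == (timestamp_ms is None):
            raise ValueError("pass exactly one of snapshot_id or timestamp_ms")

        if snapshot_id is not None:
            snap = self.snapshot_by_id(snapshot_id)
        else:
            # Assumed semantics: latest snapshot at or before the timestamp.
            candidates = [
                s for s in self.snapshots if s.timestamp_ms <= timestamp_ms
            ]
            snap = max(candidates, key=lambda s: s.timestamp_ms, default=None)

        if snap is None:
            raise ValueError("no snapshot matches the requested point in time")

        # Return a new table pinned to that snapshot; the original is untouched.
        return replace(self, current_snapshot_id=snap.snapshot_id)


table = Table(
    snapshots=(Snapshot(1, 1_000), Snapshot(2, 2_000), Snapshot(3, 3_000)),
    current_snapshot_id=3,
)
pinned_by_id = table.as_of(snapshot_id=2)
pinned_by_time = table.as_of(timestamp_ms=2_500)
```

Because `as_of` returns another `Table`, an engine that already reads from `Table` would need no changes, which is the main appeal of this option over passing a `Snapshot` directly.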