Delta on Iceberg V4 RFC Followup Improvements #6953

@anoopj

As a follow-up to #6640, we want to incorporate the following Delta protocol improvements for tables with Iceberg v4 tree enabled:

  • partitionValues and stats could both be actual JSON objects instead of a string-to-string map and a string holding a JSON object literal, respectively. That way, we could parse them as strongly typed values straight from JSON (the stats schema is usually known) or as variant.
  • Represent timestamps in metadata with full precision. This avoids the current truncation down to milliseconds.
  • Timestamps currently support multiple formats, including ISO 8601. Use a single int64 representation instead.
  • Empty strings should not be used to represent null partition values.
  • Remove actions must carry the stats and partition values of the corresponding Add action.
  • Avoid URI encoding in the path. Standardize on v4 relative/absolute paths.
  • Handle binary partition values correctly.
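
To make a few of these items concrete, here is a minimal sketch of how a current-style add action might map to the proposed shape. It only covers the JSON-object, null-partition, and path-encoding bullets. The helper `to_proposed_shape`, the `partition_types` map, and the sample values are hypothetical and only for illustration; the output is one possible reading of the list above, not a finalized format.

```python
import json
from urllib.parse import unquote

# Current-style add action (sample values are made up): stats is a JSON string,
# partition values are all strings, "" stands for null, and the path is URI-encoded.
legacy_add = {
    "path": "year=2024/part-0000%20a.parquet",
    "partitionValues": {"year": "2024", "region": ""},
    "size": 1024,
    "stats": json.dumps(
        {"numRecords": 3, "minValues": {"id": 1}, "maxValues": {"id": 9}}
    ),
}


def to_proposed_shape(add: dict, partition_types: dict) -> dict:
    """Hypothetical helper: rewrite an add action into the shape sketched in this issue.

    `partition_types` maps a partition column name to "integer" or "string";
    it is an assumption of this sketch, not part of the protocol.
    """
    typed = {}
    for name, raw in add["partitionValues"].items():
        if raw is None or raw == "":
            # Current convention: empty string means null. Emit a real JSON null.
            typed[name] = None
        elif partition_types.get(name) == "integer":
            typed[name] = int(raw)
        else:
            typed[name] = raw

    out = dict(add)
    out["path"] = unquote(add["path"])  # store the path without URI encoding
    out["partitionValues"] = typed
    if isinstance(add.get("stats"), str):
        out["stats"] = json.loads(add["stats"])  # nested object, not a string
    return out


proposed = to_proposed_shape(legacy_add, {"year": "integer", "region": "string"})
print(json.dumps(proposed, indent=2))
```

With nested objects, a reader that knows the stats schema can deserialize the whole action in one pass instead of parsing a string inside a string. A real null also removes the ambiguity between a null partition value and an empty-string one.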
