[BUG][Spark] UniForm Iceberg incremental conversion diagnostics report the latest snapshot version instead of the offending commit version
Bug
Which Delta project/connector is this regarding?
Describe the problem
UniForm performs incremental Iceberg conversion by translating Delta commits one
at a time. IcebergConverter.runIcebergConversionForActions processes a single
commit, identified by its deltaVersion parameter.
However, the diagnostics emitted from this path report targetSnapshot.version
— the latest (head) snapshot being synced — rather than deltaVersion, the
commit actually being processed. This affects:
- the delta.iceberg.conversion.unsupportedActions event and its accompanying
logError
- the delta.iceberg.conversion.convertActions success event
As a result, when conversion fails on a specific commit, the error and telemetry
attribute the failure to the table head (frequently an unrelated operation such
as a MERGE) rather than to the commit that actually failed. The reported
version does not correspond to the commit being converted, which makes
conversion failures misleading and difficult to triage.
Steps to reproduce
- Create a Delta table with UniForm Iceberg enabled
(delta.universalFormat.enabledFormats = 'iceberg') so incremental Iceberg
conversion runs on each commit.
- Produce a sequence of commits such that an earlier commit (e.g. a stats/
metadata commit) is converted while the table head is a later, unrelated
commit (e.g. a MERGE).
- Trigger a conversion failure on the earlier commit (an unsupported
combination of actions), or simply inspect the convertActions success
event for a converted commit.
- Inspect the driver ERROR log line / the emitted Delta event.
Observed results
The diagnostics report the head snapshot version, not the commit being
processed.
For example, the version in the following log line is the head snapshot version:
26/06/13 02:00:10 ERROR IcebergConverterEdge: Unsupported combination of actions for incremental conversion. Context:
version -> 62111,
commitInfo -> COMPUTE STATS,
hasAdd -> true,
hasRemove -> false,
dataChange -> Some,
hasDv -> true
Further details
The relevant code is in
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala,
in runIcebergConversionForActions. Both the unsupportedActions /
convertActions recordDeltaEvent calls and the logError string interpolate
targetSnapshot.version where they should report the per-commit deltaVersion.
This is a diagnostics/logging-only issue; conversion behavior itself is correct
and unchanged. The fix is to report both the head version and the per-commit
version, and to rename the fields so the two are unambiguous.
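A minimal sketch of what the proposed diagnostic context could look like, assuming simplified stand-in types. SnapshotStub, ConversionDiagnostics, and the field names targetSnapshotVersion and deltaVersion are hypothetical and chosen only for illustration; they are not the actual Delta Lake types or the final field names, and the sketch does not call recordDeltaEvent or logError.

```scala
// Hypothetical stand-in for the snapshot being synced; not the real Delta Snapshot class.
final case class SnapshotStub(version: Long)

object ConversionDiagnostics {
  // Builds the diagnostic context for a single commit.
  // Both versions are reported under distinct (hypothetical) names so that
  // a reader cannot confuse the table head with the commit being converted.
  def context(
      targetSnapshot: SnapshotStub,
      deltaVersion: Long,
      commitInfo: String): Seq[(String, Any)] = Seq(
    "targetSnapshotVersion" -> targetSnapshot.version, // head snapshot being synced
    "deltaVersion" -> deltaVersion,                     // commit actually being processed
    "commitInfo" -> commitInfo
  )

  // Renders the context in the "key -> value" style used by the log line above.
  def render(ctx: Seq[(String, Any)]): String =
    ctx.map { case (k, v) => s"$k -> $v" }.mkString(",\n")
}
```

With illustrative values such as ConversionDiagnostics.context(SnapshotStub(12L), 10L, "COMPUTE STATS"), the rendered context carries both the head version (12) and the per-commit version (10), so a failure can be attributed to the commit that caused it.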
Environment information
- Delta Lake version: master (also affects released versions where incremental
UniForm Iceberg conversion is present)
- Spark version: 3.5.x / 4.x
- Scala version: 2.13
Willingness to contribute