[Bug]: Iceberg REST GET view drops versions[].representations for metadata-location-imported views #12504

@defun621

What happened

When a view is registered in Nessie by reference to an existing Iceberg view
metadata file (the ICEBERG_VIEW "metadata-location" import path, e.g. a view
created by Dremio), Nessie's Iceberg REST endpoint

GET /iceberg/v1/{prefix}/namespaces/{ns}/views/{view}

returns metadata.versions[].representations: [], even though the referenced
metadata file on object storage contains a valid SQL representation.

The metadata-import conversion NessieModelIceberg.icebergViewSnapshotToNessie(...)
copies the current version's id, default catalog, namespace and summary, but
not currentVersion.representations(). The normal REST view update path
(addViewVersion) does copy them, so the model supports representations — this
is a lossy branch in the import path only.

Consequence: Iceberg REST clients that rely on versions[].representations
cannot analyze/query the view.

How to reproduce it

Stack: MinIO (S3) + Nessie + Dremio, with two warehouses — Nessie default
s3://warehouse/ and a Dremio source storage root s3://lakehouse/.

  1. Run Nessie with MinIO-backed warehouses (IN_MEMORY version store is fine).

  2. Add a Nessie source in Dremio (http://nessie:19120/api/v2, storage root
    bucket lakehouse) and create a table + view:

    CREATE TABLE nessie_lakehouse.modern.v_person AS
      SELECT 'v1' AS id, 'marko' AS name, CAST(29 AS INTEGER) AS age
      UNION ALL SELECT 'v2','vadas',27
      UNION ALL SELECT 'v4','josh',32
      UNION ALL SELECT 'v6','peter',35;
    
    CREATE OR REPLACE VIEW nessie_lakehouse.modern.v_person_view AS
      SELECT id, name, age FROM nessie_lakehouse.modern.v_person WHERE age >= 27;

  3. Query Nessie's Iceberg REST view endpoint:

    curl -s http://localhost:19120/iceberg/v1/main/namespaces/modern/views/v_person_view \
      | jq '.metadata.versions[0].representations'

    Actual: []

  4. Read the referenced metadata file directly from storage — it does contain
    the representation:

    mc cat local/lakehouse/modern/v_person_view/metadata/0000*-*.gz.metadata.json \
      | gunzip | jq '.versions[0].representations'
    # [ { "type":"sql", "sql":"SELECT id, name, age FROM ... WHERE age >= 27",
    #     "dialect":"DremioSQL" } ]

Expected: the REST GET view response preserves the SQL representation from
the referenced metadata file. Actual: representations: [].

Nessie server type (docker/uber-jar/built from source) and version

Type: Docker — ghcr.io/projectnessie/nessie:0.107.6

Client type (Ex: UI/Spark/pynessie ...) and version

Writer (creates the view via metadata-location import): Dremio, dremio/dremio-oss:latest. Reader (observes the empty representations): any Iceberg REST client; reproduced with raw curl.

Additional information

Suspected root cause: in
catalog/format/iceberg/src/main/java/org/projectnessie/catalog/formats/iceberg/nessie/NessieModelIceberg.java,
icebergViewSnapshotToNessie(...) sets icebergCurrentVersionId,
icebergDefaultCatalog, icebergNamespace and icebergVersionSummary but omits
currentVersion.representations(). The reverse conversion expects
NessieViewSnapshot.representations() to be populated. This path is reached
from ImportSnapshotWorker (icebergMetadata(...) -> icebergViewSnapshotToNessie(...)).
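
To make the suspected gap concrete, here is a minimal, self-contained sketch of the two behaviors. It does not use the real Nessie classes: ViewImportSketch, its records, and the importLossy / importPreserving methods are all hypothetical stand-ins that only borrow the field names mentioned above (the version summary is left out for brevity). The actual fix in NessieModelIceberg may well look different.

```java
import java.util.List;

// Hypothetical, simplified model of the import conversion described in this
// report. These are NOT the real Nessie or Iceberg types.
public class ViewImportSketch {

    // One entry of versions[].representations in the view metadata.
    public record Representation(String type, String sql, String dialect) {}

    // Stand-in for the current view version read from the metadata file.
    public record IcebergViewVersion(
            long versionId,
            String defaultCatalog,
            List<String> defaultNamespace,
            List<Representation> representations) {}

    // Stand-in for the Nessie-side snapshot produced by the import.
    public record NessieViewSnapshot(
            long icebergCurrentVersionId,
            String icebergDefaultCatalog,
            List<String> icebergNamespace,
            List<Representation> representations) {}

    // Models the reported behavior: id, catalog and namespace are copied,
    // but the representations are never carried over.
    public static NessieViewSnapshot importLossy(IcebergViewVersion current) {
        return new NessieViewSnapshot(
                current.versionId(),
                current.defaultCatalog(),
                List.copyOf(current.defaultNamespace()),
                List.of());
    }

    // Models the expected behavior: same copy, plus the representations.
    public static NessieViewSnapshot importPreserving(IcebergViewVersion current) {
        return new NessieViewSnapshot(
                current.versionId(),
                current.defaultCatalog(),
                List.copyOf(current.defaultNamespace()),
                List.copyOf(current.representations()));
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        IcebergViewVersion current = new IcebergViewVersion(
                1L,
                "nessie_lakehouse",
                List.of("modern"),
                List.of(new Representation(
                        "sql",
                        "SELECT id, name, age FROM v_person WHERE age >= 27",
                        "DremioSQL")));

        System.out.println("lossy representations:      "
                + importLossy(current).representations().size());
        System.out.println("preserving representations: "
                + importPreserving(current).representations().size());
    }
}
```

A regression test for the real code could follow the same shape: import a view metadata file that carries one SQL representation, then assert that the resulting snapshot (and the REST GET view response built from it) still contains that representation.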
