Remove redundant locations when constructing access policies #2149


Open · wants to merge 17 commits into base: main
Conversation

@eric-maynard (Contributor) commented Jul 21, 2025

Iceberg tables can technically store data across any number of paths, but Polaris currently uses 3 different locations for credential vending:

  1. The table's base location
  2. The table's write.data.path, if set
  3. The table's write.metadata.path, if set

This was intended to capture scenarios where, e.g., (2) is not a child path of (1), so that the vended credentials remain valid for reading the entire table. However, some systems appear to always set (2) and (3), for example:

  1. s3:/my-bucket/base/iceberg
  2. s3:/my-bucket/base/iceberg/data
  3. s3:/my-bucket/base/iceberg/metadata

In such cases the extra paths (e.g. extra resources in the AWS policy) are redundant. In one such case, these redundant paths caused the policy to exceed the maximum allowed 2048 characters.

This PR removes redundant paths -- those that are the child of another path -- from the list of accessible locations tracked for a given table and does some slight refactoring to consolidate the logic for extracting these paths from a TableMetadata.
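The pruning described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: it assumes locations are already normalized strings where a child path starts with its parent (e.g. all paths end with '/'), and uses plain string prefix checks instead of Polaris's StorageLocation type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RedundantLocations {

  /**
   * Removes locations that are children of another location in the set.
   * Sketch only; assumes normalized, '/'-terminated location strings.
   */
  static Set<String> removeRedundantLocations(Set<String> locations) {
    Set<String> result = new HashSet<>();
    for (String candidate : locations) {
      boolean redundant = false;
      for (String other : locations) {
        // candidate is redundant if some *other* location is a prefix of it
        if (!other.equals(candidate) && candidate.startsWith(other)) {
          redundant = true;
          break;
        }
      }
      if (!redundant) {
        result.add(candidate);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> locations = new HashSet<>(Arrays.asList(
        "s3:/my-bucket/base/iceberg/",
        "s3:/my-bucket/base/iceberg/data/",
        "s3:/my-bucket/base/iceberg/metadata/"));
    // Only the base location survives; data/ and metadata/ are children of it.
    System.out.println(removeRedundantLocations(locations));
  }
}
```

With the example paths from the description, only the base location (1) remains, which is exactly the set of resources the AWS policy needs.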


/** Removes "redundant" locations, so {/a/b/, /a/b/c, /a/b/d} will be reduced to just {/a/b/} */
private static @Nonnull Set<String> removeRedundantLocations(Set<String> locationStrings) {
  HashSet<String> result = new HashSet<>(locationStrings);
Member:
Since this is a new collection, it can be produced by

locationStrings.stream()
  .filter(Objects::nonNull)
  .map(StorageLocation::of)
  .collect(Collectors.toCollection(HashSet::new));

@eric-maynard (Author):

That would remove duplicate locations, but not redundant locations like we want to. We'd still need to loop over the collection.

Member:

Yes, but you'd save the exponential instantiation of StorageLocation objects.

@snazy (Member), Aug 1, 2025:

Actually you could save the inner loop with a sorted collection, if the locations end with a /.
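The sorted-collection idea can be sketched like this. A hedged illustration, not proposed code for the PR: under the stated assumption that every location ends with '/', a child sorts immediately inside its parent's lexicographic range, so one pass over a TreeSet only needs to compare each location against the last one that was kept.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

public class SortedLocationPruning {

  /**
   * One-pass pruning over a lexicographically sorted set.
   * Assumes every location ends with '/', so all descendants of a kept
   * location form a contiguous run right after it in sorted order.
   */
  static Set<String> removeRedundantLocations(Set<String> locations) {
    Set<String> result = new LinkedHashSet<>();
    String lastKept = null;
    for (String loc : new TreeSet<>(locations)) {
      // loc is redundant iff it is a child of the most recently kept location
      if (lastKept == null || !loc.startsWith(lastKept)) {
        result.add(loc);
        lastKept = loc;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(removeRedundantLocations(new LinkedHashSet<>(Arrays.asList(
        "s3:/my-bucket/base/iceberg/data/",
        "s3:/my-bucket/base/iceberg/",
        "s3:/my-bucket/base/iceberg/metadata/",
        "s3:/other-bucket/tables/"))));
  }
}
```

The trailing '/' requirement matters: without it, "/a/bc" would falsely match as a child of "/a/b" under a plain prefix check.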

@@ -2612,16 +2572,6 @@ protected FileIO loadFileIO(String ioImpl, Map<String, String> properties) {
callContext, ioImpl, properties, identifier, locations, storageActions, resolvedPath);
}

private void blockedUserSpecifiedWriteLocation(Map<String, String> properties) {
Member:

loadFileIO is also unused and should be removed.

@eric-maynard (Author):

Agreed that it can be removed, but unlike blockedUserSpecifiedWriteLocation it's not related to this PR, so I think we should do that separately.

@eric-maynard eric-maynard requested a review from snazy July 30, 2025 12:02