Feature: Rollback compaction on conflict #1285
I didn't look into the actual code change, but I saw that there's a global config option to opt in. Have you thought about making it configurable per table?
@snazy It's even configurable at the snapshot level: writers need to add a property to a snapshot for the feature to apply to it, so IMHO it's even more granular than per table. I initially started with a table-level setting, but based on the feedback I moved to a combination of a catalog config (global) and a snapshot property.
Thank you @eric-maynard @dimas-b @RussellSpitzer @snazy for the reviews! This feature is gonna be really helpful for Polaris users!
About The Change
The intention is to make the catalog smarter: when commits conflict, it reverts compaction commits so that writers who are actually adding or removing data in the table can succeed. In a sense, compaction is always treated as a lower-priority process.
Presently, the REST catalog client creates the snapshot and asks the REST server to apply it, sending the change as a combination of requirements and updates.
Polaris could apply some basic inference and generate additional metadata updates, given that a property is enabled at the table level: it reverts the commit that was created by compaction and lets the write succeed.
I had this PR in OSS, which did essentially the same thing on the client side, but we think it's best to do this on the server side so that more such clients are supported.
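To make the server-side inference above a bit more concrete, here is a minimal sketch of how a rollback plan could be derived. It is not the code in this PR: RollbackPlanner, Snap, and plan are hypothetical names, and the model assumes the writer's requirement boils down to "the branch still points at the snapshot I started from". The planner walks back from the current head to that expected snapshot and only produces a plan when every snapshot in between is a compaction (replace) snapshot that opted in to rollback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model for illustration; these are not Polaris or Iceberg classes.
final class RollbackPlanner {

  /** Minimal stand-in for a table snapshot. */
  static final class Snap {
    final long id;
    final Long parentId; // null for the first snapshot of the table
    final String operation; // e.g. "append" or "replace"
    final boolean rollbackAllowed; // assumed to come from the snapshot summary

    Snap(long id, Long parentId, String operation, boolean rollbackAllowed) {
      this.id = id;
      this.parentId = parentId;
      this.operation = operation;
      this.rollbackAllowed = rollbackAllowed;
    }
  }

  private RollbackPlanner() {}

  /**
   * Walks from the current head back to the snapshot the writer expected.
   * Returns the snapshot ids to roll back (newest first), or empty when a
   * snapshot on the way is not a revertible compaction, or when the expected
   * snapshot is not an ancestor of the current head.
   */
  static Optional<List<Long>> plan(Map<Long, Snap> snapshots, long currentId, long expectedId) {
    List<Long> toRollBack = new ArrayList<>();
    Long cursor = currentId;
    while (cursor != null && cursor.longValue() != expectedId) {
      Snap snap = snapshots.get(cursor);
      if (snap == null || !"replace".equals(snap.operation) || !snap.rollbackAllowed) {
        return Optional.empty(); // a real data change is in the way: keep the conflict
      }
      toRollBack.add(snap.id);
      cursor = snap.parentId;
    }
    return cursor == null ? Optional.empty() : Optional.of(toRollBack);
  }
}
```

In this sketch an empty result means the conflict is reported to the writer as usual. A non-empty plan lists the compaction snapshots the server would revert before applying the writer's updates.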
How to use this
Enable the catalog-level configuration polaris.config.rollback.compaction.on-conflicts.enabled. When this is enabled, Polaris resolves conflicts on the server side by rolling back those REPLACE operation snapshots that have the property polaris.internal.rollback.compaction.on-conflict in their snapshot summary.
A sample use case: a Polaris deployment has this config enabled, and an auto-compaction (maintenance) job is updating the table state. The job sets polaris.internal.rollback.compaction.on-conflict to true in the snapshot summary. A backfill process that has been running for 8 hours now wants to commit but can't, because the compaction job committed first. In this case the writer reaches out to Polaris, and Polaris checks whether the compaction (REPLACE) snapshot has this property; if it does, Polaris rolls it back and lets the writer succeed.
Devlist: https://lists.apache.org/thread/8k8t77dgk1vc124fnb61932bdp9kf1lc
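The two switches described above can be summarized as a single eligibility check. The following is only a sketch under assumptions: RollbackEligibility and isRollbackCandidate are hypothetical names, the catalog config and snapshot summary are modeled as plain string maps, and both flags are assumed to default to false when absent. The two property keys are the ones quoted in this description.

```java
import java.util.Map;

// Illustrative sketch only; class and method names are hypothetical, not Polaris code.
final class RollbackEligibility {
  // Catalog-level switch and snapshot summary key, as quoted in the description.
  static final String CATALOG_CONFIG =
      "polaris.config.rollback.compaction.on-conflicts.enabled";
  static final String SNAPSHOT_PROPERTY =
      "polaris.internal.rollback.compaction.on-conflict";

  private RollbackEligibility() {}

  /**
   * Returns true only when the catalog switch is on, the snapshot is a REPLACE
   * operation, and its writer marked it with the rollback property.
   * Missing keys are treated as "false" (an assumption of this sketch).
   */
  static boolean isRollbackCandidate(
      Map<String, String> catalogConfig, String operation, Map<String, String> summary) {
    boolean enabled =
        Boolean.parseBoolean(catalogConfig.getOrDefault(CATALOG_CONFIG, "false"));
    boolean isReplace = "replace".equalsIgnoreCase(operation);
    boolean optedIn =
        Boolean.parseBoolean(summary.getOrDefault(SNAPSHOT_PROPERTY, "false"));
    return enabled && isReplace && optedIn;
  }
}
```

In the use case above, the auto-compaction snapshot would pass this check, while the backfill writer's own snapshot (which adds data and does not carry the property) would never be treated as a rollback candidate.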
New Scenario :