feat: support write tombstone to a compactTopic. #25248
zhaizhibo wants to merge 2 commits into apache:master
FieldUtils.writeField(compactor, "topicCompactionRetainNullValue", false, true);
FieldUtils.writeField(compactor, "topicCompactionRetainNullKey", true, true);
It would be better to override restartBroker in this class so that it reinitializes the compactor. Then there would be no need to set the fields directly.
lhotari left a comment
I'm just wondering if there's really a problem currently.
Null values should get removed in compaction.
This is the location where that should be happening currently:
If it's not working, I think that there's a bug.
I looked into the change history, and it looks like there used to be a related bug, which was fixed a long time ago with #18877.

There's also #24523 since Pulsar 4.1.0, implementing PIP-429: Optimize Handling of Compacted Last Entry by Skipping Payload Buffer Parsing.
lhotari left a comment
In Pulsar, this is the default behavior:
"If any given message has an empty payload, it will be skipped and considered deleted (akin to the concept of tombstones in key-value databases)."
source: https://pulsar.apache.org/docs/next/concepts-topic-compaction/#how-topic-compaction-works
Therefore, the current description of this PR seems to conflict with the documented behavior. Please add a test case which reproduces the problem that you are facing.
There is a gap in Pulsar's compaction: it doesn't support retention. It seems that you have filed #24791 about that issue, and that is directly related to #19665.
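The documented default quoted above can be modeled with a small, self-contained sketch. This is only an illustration of the latest-value-per-key rule with empty payloads acting as tombstones; it is not Pulsar's compactor code, and the names CompactionSketch, Message, and compact are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; this is NOT Pulsar's compactor implementation.
class CompactionSketch {
    // Hypothetical stand-in for a keyed message on a topic.
    static final class Message {
        final String key;
        final byte[] payload;

        Message(String key, byte[] payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    // Keep only the latest payload per key. An empty payload is treated
    // as a tombstone: the key is dropped from the compacted view.
    static Map<String, byte[]> compact(List<Message> log) {
        Map<String, byte[]> latest = new LinkedHashMap<>();
        for (Message m : log) {
            if (m.payload.length == 0) {
                latest.remove(m.key);
            } else {
                latest.put(m.key, m.payload);
            }
        }
        return latest;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Message> log = List.of(
                new Message("a", bytes("1")),
                new Message("b", bytes("2")),
                new Message("a", bytes("3")),
                new Message("b", new byte[0])); // tombstone for "b"
        System.out.println(compact(log).keySet()); // prints [a]
    }
}
```

In this model a key that is written again after its tombstone reappears in the compacted view, because only the most recent message per key decides the outcome.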
Motivation
Currently, Pulsar's topic compaction process retains all keys, including those with null values. This creates a limitation where deleted keys (tombstones) are never removed from compacted ledgers, causing them to grow indefinitely over time. This behavior consumes unnecessary storage and impacts system performance for long-running topics with frequent key deletions.
Modifications
Added a new optional broker configuration topicCompactionRetainNullValue to control whether null values should be retained during compaction.
- Default behavior: true (retain null values) - maintains backward compatibility
- New option: When set to false, keys with null values (tombstones) will be removed during compaction
- Backward compatibility: If the configuration is not set, the system defaults to the original behavior (retain null values)
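The intended effect of the flag, as this description states it, might look like the sketch below. It assumes a null value models a tombstone and uses a boolean parameter in place of the proposed topicCompactionRetainNullValue setting; the class and method names are hypothetical and do not come from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model of the proposed option; not code from this PR.
class RetainNullValueSketch {
    // Each log entry is a {key, value} pair; a null value models a tombstone.
    // Returns the keys that remain after compaction.
    static List<String> compactedKeys(List<String[]> log, boolean retainNullValue) {
        // Latest value per key wins; LinkedHashMap permits null values.
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] kv : log) {
            latest.put(kv[0], kv[1]);
        }
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            // With retainNullValue=false, keys whose latest value is null are dropped.
            if (e.getValue() != null || retainNullValue) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String[]> log = List.of(
                new String[] {"a", "1"},
                new String[] {"b", "2"},
                new String[] {"b", null}); // tombstone for "b"
        System.out.println(compactedKeys(log, true));  // prints [a, b]
        System.out.println(compactedKeys(log, false)); // prints [a]
    }
}
```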
Verifying this change
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete
Matching PR in forked repository
PR in forked repository: