Skip to content

Comments

feat: support write tombstone to a compactTopic.#25248

Open
zhaizhibo wants to merge 2 commits intoapache:masterfrom
zhaizhibo:feat_add_tombstone_for_compact
Open

feat: support write tombstone to a compactTopic.#25248
zhaizhibo wants to merge 2 commits intoapache:masterfrom
zhaizhibo:feat_add_tombstone_for_compact

Conversation

@zhaizhibo
Copy link
Contributor

Motivation

Currently, Pulsar's topic compaction process retains all keys, including those with null values. This creates a limitation where deleted keys (tombstones) are never removed from compacted ledgers, causing them to grow indefinitely over time. This behavior consumes unnecessary storage and impacts system performance for long-running topics with frequent key deletions.

Modifications

Added a new optional broker configuration topicCompactionRetainNullValue to control whether null values should be retained during compaction.

Default behavior: true (retain null values) - maintains backward compatibility
New option: When set to false, keys with null values (tombstones) will be removed during compaction
Backward compatibility: If the configuration is not set, the system defaults to the original behavior (retain null values)

Verifying this change

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Feb 16, 2026
Comment on lines +756 to +757
FieldUtils.writeField(compactor, "topicCompactionRetainNullValue", false, true);
FieldUtils.writeField(compactor, "topicCompactionRetainNullKey", true, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to override restartBroker in this class so that it reinitializes compactor so that there wouldn't be a need to set the fields directly.

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just wondering if there's really a problem currently.
Null values should get removed in compaction.

This is the location where that should be happening currently:

Pair<String, Integer> keyAndSize = extractKeyAndSize(m, metadata);
if (keyAndSize != null) {
if (keyAndSize.getRight() > 0) {
MessageId old = latestForKey.put(keyAndSize.getLeft(), id);
replaceMessage = old != null;
} else {
deletedMessage = true;
latestForKey.remove(keyAndSize.getLeft());
}

If it's not working, I think that there's a bug.

@lhotari
Copy link
Member

lhotari commented Feb 16, 2026

I looked into the change history and it looks like there used to be a related bug which was fixed a long time with #18877.

@lhotari
Copy link
Member

lhotari commented Feb 16, 2026

There's also #24523 since Pulsar 4.1.0, implementing PIP-429: Optimize Handling of Compacted Last Entry by Skipping Payload Buffer Parsing

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Pulsar, this is the default

If any given message has an empty payload, it will be skipped and considered deleted (akin to the concept of tombstones in key-value databases).

source: https://pulsar.apache.org/docs/next/concepts-topic-compaction/#how-topic-compaction-works

Therefore there seems to be a conflict in the current description of this PR. Please add a test case which reproduces the problem that you are facing.

There is a gap in Pulsar's compaction that it doesn't support retention. It seems that you have filed #24791 about that issue and that is directly related to #19665.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants