Commit cd5ce52

KAFKA-20664: Clarify docs on max compaction lag, segment.ms, and segment.bytes for active segment rolling (#22489)
Improves the documentation for `segment.bytes`, `segment.ms`, and `max.compaction.lag.ms` with respect to active segment rolling.

Reviewers: Lucy Liu <lucliu@confluent.io>, Matthias J. Sax <matthias@confluent.io>
1 parent 6110d3e commit cd5ce52

2 files changed

Lines changed: 19 additions & 4 deletions

clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java

Lines changed: 14 additions & 3 deletions
@@ -31,12 +31,17 @@ public class TopicConfig {
     public static final String SEGMENT_BYTES_CONFIG = "segment.bytes";
     public static final String SEGMENT_BYTES_DOC = "This configuration controls the segment file size for " +
         "the log. Retention and cleaning is always done a file at a time so a larger segment size means " +
-        "fewer files but less granular control over retention.";
+        "fewer files but less granular control over retention. " +
+        "The active segment is rolled once it reaches this size.";

     public static final String SEGMENT_MS_CONFIG = "segment.ms";
     public static final String SEGMENT_MS_DOC = "This configuration controls the period of time after " +
         "which Kafka will force the log to roll even if the segment file isn't full to ensure that retention " +
-        "can delete or compact old data.";
+        "can delete or compact old data. " +
+        "This forces active segment rolling by time, even if the active segment has not reached " +
+        "<code>segment.bytes</code>. For compacted topics, <code>max.compaction.lag.ms</code> can trigger " +
+        "active segment rolling sooner: the effective time-based roll threshold is the smaller of " +
+        "<code>segment.ms</code> and <code>max.compaction.lag.ms</code>.";

     public static final String SEGMENT_JITTER_MS_CONFIG = "segment.jitter.ms";
     public static final String SEGMENT_JITTER_MS_DOC = "The maximum random jitter subtracted from the scheduled " +
@@ -151,7 +156,13 @@ public class TopicConfig {

     public static final String MAX_COMPACTION_LAG_MS_CONFIG = "max.compaction.lag.ms";
     public static final String MAX_COMPACTION_LAG_MS_DOC = "The maximum time a message will remain " +
-        "ineligible for compaction in the log. Only applicable for logs that are being compacted.";
+        "ineligible for compaction in the log. Only applicable for logs that are being compacted. " +
+        "Because the active segment is never compacted, for compacted topics this value also drives " +
+        "active segment rolling: the effective time-based roll threshold is the smaller of " +
+        "<code>segment.ms</code> and <code>max.compaction.lag.ms</code>. Active segment rolling moves " +
+        "records out of the active segment, after which <code>max.compaction.lag.ms</code> makes them " +
+        "eligible for compaction even if <code>min.cleanable.dirty.ratio</code> is not met. See " +
+        "<a href=\"https://kafka.apache.org/documentation/#compaction\">log compaction</a>.";

     public static final String MIN_CLEANABLE_DIRTY_RATIO_CONFIG = "min.cleanable.dirty.ratio";
     public static final String MIN_CLEANABLE_DIRTY_RATIO_DOC = "This configuration controls how frequently " +

docs/design/design.md

Lines changed: 5 additions & 1 deletion
@@ -454,7 +454,11 @@ This can be used to prevent messages newer than a minimum message age from being
     log.cleaner.max.compaction.lag.ms
 ```

-This can be used to prevent log with low produce rate from remaining ineligible for compaction for an unbounded duration. If not set, logs that do not exceed min.cleanable.dirty.ratio are not compacted. Note that this compaction deadline is not a hard guarantee since it is still subjected to the availability of log cleaner threads and the actual compaction time. You will want to monitor the uncleanable-partitions-count, max-clean-time-secs and max-compaction-delay-secs metrics.
+This can be used to prevent log with low produce rate from remaining ineligible for compaction for an unbounded duration. If not set, logs that do not exceed min.cleanable.dirty.ratio are not compacted.
+
+Because the active segment is never compacted (as noted above), records become eligible for compaction only through active segment rolling. For a compacted topic the active segment is rolled when the first of these is reached: it grows to segment.bytes, or its age reaches the smaller of segment.ms and max.compaction.lag.ms. So max.compaction.lag.ms governs two distinct things. First, it triggers active segment rolling by lowering the effective time-based roll threshold to the smaller of segment.ms and max.compaction.lag.ms, moving older records out of the active segment. Second, max.compaction.lag.ms then makes the rolled records eligible for compaction even when the log's dirty ratio is below min.cleanable.dirty.ratio.
+
+Note that this compaction deadline is not a hard guarantee since it is still subjected to the availability of log cleaner threads and the actual compaction time. You will want to monitor the uncleanable-partitions-count, max-clean-time-secs and max-compaction-delay-secs metrics.

 Further cleaner configurations are described [here](/documentation.html#brokerconfigs).
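
Reviewer note: the two roles of max.compaction.lag.ms described in this change can be modelled in a few lines of Java. This is only an illustrative sketch, not the broker's code: the class `ActiveSegmentRollSketch` and its method names are hypothetical, and the model assumes away segment.jitter.ms, index-full rolls, the exact moment the broker evaluates the roll condition, and log cleaner thread availability.

```java
// Hypothetical, simplified model of the rules documented in this commit.
// None of these names exist in Kafka; they are for illustration only.
final class ActiveSegmentRollSketch {
    private ActiveSegmentRollSketch() {}

    // Effective time-based roll threshold: for compacted topics it is the
    // smaller of segment.ms and max.compaction.lag.ms; otherwise segment.ms.
    public static long effectiveRollMs(long segmentMs, long maxCompactionLagMs, boolean compacted) {
        return compacted ? Math.min(segmentMs, maxCompactionLagMs) : segmentMs;
    }

    // The active segment rolls when the first of these is reached:
    // its size reaches segment.bytes, or its age reaches the effective threshold.
    public static boolean shouldRoll(long sizeBytes, long ageMs, long segmentBytes,
                                     long segmentMs, long maxCompactionLagMs, boolean compacted) {
        return sizeBytes >= segmentBytes
            || ageMs >= effectiveRollMs(segmentMs, maxCompactionLagMs, compacted);
    }

    // Rolled records become eligible for compaction when the dirty ratio
    // exceeds min.cleanable.dirty.ratio, or (second role of the lag setting)
    // when the oldest uncompacted record has reached max.compaction.lag.ms.
    public static boolean eligibleForCompaction(double dirtyRatio, double minCleanableDirtyRatio,
                                                long oldestDirtyAgeMs, long maxCompactionLagMs) {
        return dirtyRatio > minCleanableDirtyRatio || oldestDirtyAgeMs >= maxCompactionLagMs;
    }

    public static void main(String[] args) {
        long segmentBytes = 1_073_741_824L;   // illustrative: 1 GiB
        long segmentMs = 604_800_000L;        // illustrative: 7 days
        long maxCompactionLagMs = 3_600_000L; // illustrative: 1 hour

        // A small, two-hour-old active segment on a low-traffic topic.
        long sizeBytes = 10_485_760L;         // 10 MiB
        long ageMs = 7_200_000L;              // 2 hours

        System.out.println("compacted topic rolls: "
            + shouldRoll(sizeBytes, ageMs, segmentBytes, segmentMs, maxCompactionLagMs, true));
        System.out.println("non-compacted topic rolls: "
            + shouldRoll(sizeBytes, ageMs, segmentBytes, segmentMs, maxCompactionLagMs, false));
        System.out.println("eligible despite low dirty ratio: "
            + eligibleForCompaction(0.1, 0.5, ageMs, maxCompactionLagMs));
    }
}
```

With these sample values only the compacted topic rolls the segment on time, because the one-hour lag bound undercuts the seven-day segment.ms. Once rolled, the records qualify for compaction even though the dirty ratio is below the configured minimum.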
