KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch #17870

Merged

Conversation

adixitconfluent
Contributor

@adixitconfluent adixitconfluent commented Nov 20, 2024

About

This PR aims to remove partitionMaxBytes from share fetch requests (SFR). In the current flow, a share fetch request cannot fetch more than 1 MB per topic partition by default, even though the overall SFR max bytes is 50 MB. This PR removes that limit by delegating the per-partition limit to a strategy class, PartitionMaxBytesStrategy.
PartitionMaxBytesStrategy contains a UNIFORM strategy that divides the overall SFR max bytes equally among the acquired topic partitions. For example, if SFR max bytes is 50 MB and there are 5 acquired topic partitions, up to 10 MB can be fetched from the replica manager for each topic partition.
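
The arithmetic in the example above can be sketched as follows. This is a minimal illustration rather than the code in this PR: the class name `UniformSplitSketch`, the `split` method, and the use of plain strings in place of Kafka's `TopicIdPartition` are assumptions made so the snippet compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the UNIFORM strategy described above; partition
// names are plain strings here instead of Kafka's TopicIdPartition.
final class UniformSplitSketch {

    // Gives every listed partition an equal share of the request budget.
    static Map<String, Integer> split(int requestMaxBytes, List<String> partitions, int acquiredCount) {
        if (requestMaxBytes <= 0 || acquiredCount <= 0) {
            throw new IllegalArgumentException("requestMaxBytes and acquiredCount must be positive");
        }
        int share = requestMaxBytes / acquiredCount; // integer division, any remainder is dropped
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String partition : partitions) {
            result.put(partition, share);
        }
        return result;
    }

    public static void main(String[] args) {
        int fiftyMb = 50 * 1024 * 1024;
        List<String> acquired = List.of("t-0", "t-1", "t-2", "t-3", "t-4");
        // Each of the 5 partitions is assigned 10 * 1024 * 1024 bytes.
        System.out.println(split(fiftyMb, acquired, acquired.size()));
    }
}
```

Because the split uses integer division, the per-partition values can add up to slightly less than the request budget when it does not divide evenly.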

Testing

Added unit tests for PartitionMaxBytesStrategy, and tested with the existing broker unit tests and the integration tests for share consumers.

@github-actions github-actions bot added core Kafka Broker KIP-932 Queues for Kafka clients small Small PRs labels Nov 20, 2024
@adixitconfluent adixitconfluent force-pushed the partition_max_bytes_removal branch from 78f38d4 to 106411f Compare January 3, 2025 15:41
@adixitconfluent adixitconfluent changed the title WIP: Remove partitionMaxBytes from share fetch requests WIP: Remove partitionMaxBytes from DelayedShareFetch Jan 3, 2025
@adixitconfluent adixitconfluent changed the title WIP: Remove partitionMaxBytes from DelayedShareFetch WIP: KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch Jan 6, 2025
@adixitconfluent adixitconfluent marked this pull request as ready for review January 7, 2025 09:06
@adixitconfluent adixitconfluent changed the title WIP: KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch Jan 7, 2025
@adixitconfluent adixitconfluent changed the title KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch WIP: KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch Jan 7, 2025
@adixitconfluent adixitconfluent changed the title WIP: KAFKA-18404: Remove partitionMaxBytes from DelayedShareFetch WIP: KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch Jan 7, 2025
@adixitconfluent adixitconfluent changed the title WIP: KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch Jan 7, 2025
@github-actions github-actions bot removed the small Small PRs label Jan 8, 2025
@adixitconfluent
Contributor Author

Test failures are unrelated to my changes.

Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

Thanks for updating the PR. Can you please help me understand how we will extend this code in the future with minimal changes?

import java.util.Locale;
import java.util.Set;

public class PartitionMaxBytesDivisionStrategy {
Collaborator

How is this class extensible in the future to add more strategies? Shouldn't there be an interface, with an enum that provides the right implementing strategy class?

Contributor Author

@apoorvmittal10, you can add a strategy to the StrategyType enum and add the corresponding code to the partitionMaxBytes function to handle the new strategy type. If you do not like this approach, I can create an interface:

public interface PartitionMaxBytesDivisionStrategy {
    LinkedHashMap<TopicIdPartition, Integer> partitionMaxBytes(int requestMaxBytes, Set<TopicIdPartition> partitions, int acquiredPartitionsSize);
}

Implementing classes would then provide this function. Wdyt?

Collaborator

I don't think that's good, as you would need to change a couple of additional lines in DelayedShareFetch. Rather, we should have a single class/interface that provides the implementation based on StrategyType.

Collaborator

What do you think about the implementation and usage below?

this.partitionMaxBytesStrategy = PartitionMaxBytesStrategy.type(StrategyType.UNIFORM);
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.kafka.server.share.fetch;

import org.apache.kafka.common.TopicIdPartition;

import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Set;

public interface PartitionMaxBytesStrategy {

    enum StrategyType {
        UNIFORM;

        @Override
        public String toString() {
            return super.toString().toLowerCase(Locale.ROOT);
        }
    }

    LinkedHashMap<TopicIdPartition, Integer> maxBytes(int requestMaxBytes, Set<TopicIdPartition> partitions, int acquiredPartitionsSize);

    static PartitionMaxBytesStrategy type(StrategyType type) {
        return switch (type) {
            case UNIFORM -> PartitionMaxBytesStrategy::uniformPartitionMaxBytes;
        };
    }

    /**
     * Returns the partition max bytes for a given partition based on the strategy type.
     *
     * @param requestMaxBytes - The total max bytes available for the share fetch request
     * @param partitions - The topic partitions in the order for which we compute the partition max bytes.
     * @param acquiredPartitionsSize - The total partitions that have been acquired.
     * @return the partition max bytes for the topic partitions
     */
    static LinkedHashMap<TopicIdPartition, Integer> uniformPartitionMaxBytes(int requestMaxBytes, Set<TopicIdPartition> partitions, int acquiredPartitionsSize) {
        if (partitions == null || partitions.isEmpty()) {
            throw new IllegalArgumentException("Partitions to generate max bytes is null or empty");
        }
        if (requestMaxBytes <= 0) {
            throw new IllegalArgumentException("Requested max bytes must be greater than 0");
        }
        if (acquiredPartitionsSize <= 0) {
            throw new IllegalArgumentException("Acquired partitions size must be greater than 0");
        }

        LinkedHashMap<TopicIdPartition, Integer> partitionMaxBytes = new LinkedHashMap<>();
        partitions.forEach(partition -> partitionMaxBytes.put(
            partition, requestMaxBytes / acquiredPartitionsSize));
        return partitionMaxBytes;
    }
}
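
To make the proposed usage concrete, here is a simplified, self-contained variant of the interface above together with a caller. It is a sketch under stated assumptions, not the merged code: `MaxBytesStrategySketch`, `MaxBytesStrategyDemo`, and the `uniform` method are hypothetical names, the argument validation is omitted, and `String` keys stand in for `TopicIdPartition` so the snippet compiles without Kafka on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified variant of the interface proposed above.
// Assumption: String keys stand in for Kafka's TopicIdPartition.
interface MaxBytesStrategySketch {

    enum StrategyType { UNIFORM }

    // The only abstract method, so a method reference can implement the interface.
    LinkedHashMap<String, Integer> maxBytes(int requestMaxBytes, Set<String> partitions, int acquiredPartitionsSize);

    // Factory: callers select a strategy by enum constant only.
    static MaxBytesStrategySketch type(StrategyType type) {
        switch (type) {
            case UNIFORM:
                return MaxBytesStrategySketch::uniform;
            default:
                throw new IllegalArgumentException("Unknown strategy: " + type);
        }
    }

    // Equal share per partition, keeping the iteration order of the input set.
    static LinkedHashMap<String, Integer> uniform(int requestMaxBytes, Set<String> partitions, int acquiredPartitionsSize) {
        LinkedHashMap<String, Integer> result = new LinkedHashMap<>();
        partitions.forEach(p -> result.put(p, requestMaxBytes / acquiredPartitionsSize));
        return result;
    }
}

final class MaxBytesStrategyDemo {
    public static void main(String[] args) {
        MaxBytesStrategySketch strategy =
            MaxBytesStrategySketch.type(MaxBytesStrategySketch.StrategyType.UNIFORM);
        Set<String> toFetch = new LinkedHashSet<>(List.of("orders-0", "orders-1"));
        // The divisor is acquiredPartitionsSize, not the size of the set: 1000 / 4 = 250 each.
        System.out.println(strategy.maxBytes(1000, toFetch, 4));
    }
}
```

With this shape, the caller only names a `StrategyType`. Adding another strategy would mean a new enum constant, a new branch in `type`, and a new static method, without touching the call site.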

Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

Thanks for the changes, some comments and questions.

Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

Overall LGTM, some comments to address.

Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

Thanks for the changes. I have a comment around tests, I think we should add tests for the functionality in DelayedShareFetch.

Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

LGTM, thanks for the PR and fixes.

Member

@AndrewJSchofield AndrewJSchofield left a comment

lgtm

@AndrewJSchofield AndrewJSchofield merged commit 4e24c50 into apache:trunk Jan 13, 2025
9 checks passed
tedyyan pushed a commit to tedyyan/kafka that referenced this pull request Jan 13, 2025
100, partitions, 0));
// empty partitions set.
assertThrows(IllegalArgumentException.class, () -> PartitionMaxBytesStrategy.checkValidArguments(
100, Collections.EMPTY_SET, 20));
Member

I apologize for the delayed review. Could you please use Set.of() to address the following warnings?

> Task :share:compileTestJava
Note: /home/jenkins/kafka/share/src/test/java/org/apache/kafka/server/share/fetch/PartitionMaxBytesStrategyTest.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
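
For context on the warning, the sketch below contrasts the raw `Collections.EMPTY_SET` constant with the type-safe alternatives. The class `EmptySetSketch` and its `count` helper are hypothetical and only stand in for a method that takes a typed `Set`, as the method under test does.

```java
import java.util.Collections;
import java.util.Set;

final class EmptySetSketch {

    // Hypothetical helper that takes a typed Set, like the method under test.
    static int count(Set<String> partitions) {
        return partitions.size();
    }

    public static void main(String[] args) {
        // Raw type: this compiles, but javac reports an unchecked conversion
        // because a raw Set is passed where Set<String> is expected.
        // int a = count(Collections.EMPTY_SET);

        // Type-safe alternatives: the element type is inferred from the parameter.
        int b = count(Set.of());
        int c = count(Collections.emptySet());
        System.out.println(b + " " + c);
    }
}
```

Both alternatives return an immutable empty set, so either one removes the unchecked note without changing what the test exercises.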

Collaborator

@chia7712 I missed it in review, thanks for pointing it out. @adixitconfluent is busy, so I just raised this minor PR: #18541

@ijuma
Member

ijuma commented Jan 17, 2025

I cherry-picked this to 4.0 as it made another cherry-pick simpler (#18524).

@apoorvmittal10
Collaborator

apoorvmittal10 commented Jan 17, 2025

> I cherry-picked this to 4.0 as it made another cherry-pick simpler (#18524).

@AndrewJSchofield Do you think it's sensible to include in your doc the config for limiting bytes in a single share fetch, now that the partition max bytes is dynamic? I am thinking in terms of clients: if they use the default value in 4.0, they might see bigger chunks of data flowing to the client.

ijuma pushed a commit that referenced this pull request Jan 17, 2025
apoorvmittal10 added a commit to apoorvmittal10/kafka that referenced this pull request Jan 20, 2025
AndrewJSchofield pushed a commit that referenced this pull request Jan 20, 2025
…Fetch (#17870)" (#18643)

This reverts commit b021b51.

Reviewers: Ismael Juma <[email protected]>, Andrew Schofield <[email protected]>
pranavt84 pushed a commit to pranavt84/kafka that referenced this pull request Jan 27, 2025
airlock-confluentinc bot pushed a commit to confluentinc/kafka that referenced this pull request Jan 27, 2025
manoj-mathivanan pushed a commit to manoj-mathivanan/kafka that referenced this pull request Feb 19, 2025
AndrewJSchofield pushed a commit that referenced this pull request Mar 12, 2025
…quests (#19148)

This PR aims to remove the usage of partition max bytes from share fetch
requests. Partition Max Bytes is being defined by
`PartitionMaxBytesStrategy` which was added to the broker as part of PR
#17870

Reviewers: Andrew Schofield <[email protected]>, Apoorv Mittal <[email protected]>
Labels
ci-approved clients core Kafka Broker KIP-932 Queues for Kafka
6 participants