feat(admin): add KIP-396 list/alter offsets APIs #3419

Open
DCjanus wants to merge 45 commits into IBM:main from DCjanus:pr/kip-396-admin-offsets

Conversation

Contributor

@DCjanus DCjanus commented Jan 5, 2026

Summary

  • Add Admin APIs for KIP-396 ListOffsets and AlterConsumerGroupOffsets to support bulk offset queries and group offset commits.
  • Provide Go types aligned with KIP-396 semantics (OffsetAndMetadata).

Key changes

  • Add ListOffsets and AlterConsumerGroupOffsets to ClusterAdmin and wire protocol requests.
  • Add result structs with leader epoch support and commit metadata for admin offset commits.
  • Implement broker fan-out for list offsets and coordinator commit path for alter offsets.
  • Add functional tests covering timestamp list offsets and admin offset commits.

Constraints / tradeoffs

  • ListOffsets currently accepts int64 queries (earliest/latest/timestamp) to stay consistent with existing offset conventions; we can introduce an explicit spec type in a follow-up if needed.

Notes

Signed-off-by: DCjanus <DCjanus@dcjanus.com>
@DCjanus DCjanus force-pushed the pr/kip-396-admin-offsets branch from edad338 to 11e34c1 Compare January 5, 2026 16:07
@DCjanus DCjanus closed this Jan 5, 2026
@DCjanus DCjanus reopened this Jan 6, 2026
Comment thread admin_offsets.go Outdated

// OffsetSpec specifies which offset to look up for a partition.
type OffsetSpec struct {
	timestamp int64
Collaborator

I’m not sure why we need a ton of constructors, when we can simply export this value.

After all, given the functionality provided, we can already arbitrarily mutate any given OffsetSpec and access the field at will:

offspec := OffsetSpecLatest()
offspec = OffsetSpecForTimestamp(arbitraryTimestamp)
go func() {
	offspec = OffsetSpecForTimestamp(OffsetNewest)
}()
ts := offspec.Timestamp() // write-read race condition

Contributor Author

Agree that we should not mix two different input styles for the same concept (as noted in #3419 (comment)).

To stay consistent with existing Sarama APIs (e.g. Client.GetOffset(topic, partition, time int64) which uses OffsetOldest/OffsetNewest), I removed OffsetSpec and made ListOffsets take int64 directly (pass OffsetOldest/OffsetNewest or a millisecond timestamp).

Done in commit f3e40c7.
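
The int64 convention described above can be sketched as follows. This is a minimal illustration rather than the PR's code: the two constants are local stand-ins that mirror the values of Sarama's OffsetNewest (-1) and OffsetOldest (-2), and describeQuery is a hypothetical helper introduced only for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Local stand-ins mirroring sarama.OffsetNewest and sarama.OffsetOldest.
const (
	offsetNewest int64 = -1
	offsetOldest int64 = -2
)

// describeQuery is a hypothetical helper showing how one int64 can encode
// "latest", "earliest", or a millisecond-timestamp lookup.
func describeQuery(q int64) string {
	switch {
	case q == offsetNewest:
		return "latest"
	case q == offsetOldest:
		return "earliest"
	case q >= 0:
		// Non-negative values are treated as Unix milliseconds.
		return "timestamp " + time.UnixMilli(q).UTC().Format(time.RFC3339)
	default:
		return "invalid"
	}
}

func main() {
	fmt.Println(describeQuery(offsetNewest))
	fmt.Println(describeQuery(offsetOldest))
	fmt.Println(describeQuery(0))
}
```

The trade-off is the one raised in the thread: a bare int64 matches the existing Client.GetOffset convention, at the cost of relying on sentinel values instead of a dedicated type.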

Comment thread admin_offsets.go Outdated
Comment thread admin_offsets.go Outdated
Comment thread admin_offsets.go Outdated
close(results)

allResults := make(map[TopicPartitionID]*ListOffsetsResult, len(partitions))
var firstErr error
Collaborator

var errs []error
for res := range results {
	if res.err != nil {
		errs = append(errs, res.err)
	}
	...
}

return allResults, errors.Join(errs...)

Contributor Author

Addressed in commit bce2339 by aggregating all errors with errors.Join.

Comment thread admin_offsets.go Outdated

for _, req := range requests {
	wg.Add(1)
	go func(req *brokerOffsetRequest) {
Collaborator

How many requests are we likely to be spinning off here?

It can happen that spinning off too many goroutines will actually be performance-degrading rather than performance-enhancing.

Contributor Author

Thanks for the note. The Java AdminClient doesn’t appear to impose an explicit concurrency cap either — it groups by broker and issues one in‑flight request per broker via the admin I/O loop.

Key entry point (no concurrency limiting logic):

In practice, the Java I/O loop is equivalent to spawning goroutines in Go: both drive concurrent in‑flight requests without a per‑broker cap.

Given typical Kafka clusters are well below 1,000 brokers, I believe even 10,000 concurrent in‑flight requests should be relatively easy for modern hardware to handle.
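
If a cap on in-flight requests were ever wanted, one common Go pattern is a buffered channel used as a semaphore. The sketch below is not part of this PR: fanOut, its parameters, and the limit are hypothetical, and the work function stands in for a per-broker request.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut runs fn once per item with at most limit calls in flight.
func fanOut(items []string, limit int, fn func(string) int) map[string]int {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string]int, len(items))
		sem = make(chan struct{}, limit)
	)
	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit goroutines are running
		// item is passed as an argument so the sketch behaves the same
		// on any Go version.
		go func(item string) {
			defer wg.Done()
			defer func() { <-sem }()
			v := fn(item)
			mu.Lock()
			out[item] = v
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	return out
}

func main() {
	got := fanOut([]string{"a", "bb", "ccc"}, 2, func(s string) int { return len(s) })
	fmt.Println(len(got), got["ccc"])
}
```

With one request per broker, as described above, the uncapped version is likely fine; the semaphore only matters if the fan-out count can grow much larger.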

Comment thread admin_offsets.go Outdated
Comment thread admin.go Outdated
Comment on lines +125 to +130
ListOffsets(partitions map[TopicPartitionID]OffsetSpec, options *ListOffsetsOptions) (map[TopicPartitionID]*ListOffsetsResult, error)

// AlterConsumerGroupOffsets alters offsets for the specified group by committing the provided offsets and metadata.
// The request targets the group's coordinator and returns per-partition results in the response.
// This operation is not transactional so it may succeed for some partitions while fail for others.
AlterConsumerGroupOffsets(group string, offsets map[TopicPartitionID]OffsetAndMetadata, options *AlterConsumerGroupOffsetsOptions) (*OffsetCommitResponse, error)
Collaborator

Our current design uses map[string]map[int32]V. Adding this TopicPartitionID might have been nice if we had started with it, but we haven’t. https://pkg.go.dev/github.com/IBM/sarama#AlterPartitionReassignmentsResponse

Providing two different ways to do the same thing is likely to add confusion that outweighs any benefit of flattening the maps.

It also reduces the ability to iterate over topics individually, without needing to then iterate over all Topic × Partition combinations and select for Topic.

Contributor Author

Thanks for the point about keeping the API shape consistent. I reverted to the existing map[string]map[int32]V style so callers can iterate by topic without scanning all partitions.

This removes TopicPartitionID from ListOffsets and aligns the input/return maps with the rest of Sarama’s admin APIs.

Done in commit be65797.
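
A small sketch of the per-topic iteration that the nested shape allows. partitionsOf is a hypothetical helper and the int64 value type stands in for V; none of it is taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionsOf returns the sorted partition IDs recorded for one topic.
// With the nested map this touches only that topic's entries.
func partitionsOf(offsets map[string]map[int32]int64, topic string) []int32 {
	ids := make([]int32, 0, len(offsets[topic]))
	for p := range offsets[topic] {
		ids = append(ids, p)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	offsets := map[string]map[int32]int64{
		"orders":   {1: 200, 0: 100},
		"payments": {0: 7},
	}
	fmt.Println(partitionsOf(offsets, "orders"))
}
```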

Comment thread admin_offsets.go Outdated
}

// ListOffsetsResult contains the response for a single topic partition.
type ListOffsetsResult struct {
Collaborator

I’m not sure this is an “Offsets” result, since it seems to be a single-offset result?

Contributor Author

Renamed ListOffsetsResult to OffsetResult to reflect the single-partition result; done in commit f9150fe.

Comment thread admin_offsets.go Outdated
Comment on lines +66 to +75
type brokerOffsetRequest struct {
	broker     *Broker
	request    *OffsetRequest
	partitions []TopicPartitionID
}

type brokerOffsetResult struct {
	result map[TopicPartitionID]*ListOffsetsResult
	err    error
}
Collaborator

We could also isolate/scope these types into the ListOffsets right?

Contributor Author

Updated in commit 243445e (scoped the helper types inside ListOffsets).

Comment thread admin_offsets.go Outdated
Comment on lines +95 to +96
req = &brokerOffsetRequest{
	broker: broker,
Collaborator

I’m not sure why we double track the broker? We’re using it as the key, and as a field in the value?

We could just for broker, req := range requests { … } below, right?

Contributor Author

Addressed in commit ea32b6e by removing the redundant broker field and passing the broker via the map key in the loop.

@DCjanus DCjanus force-pushed the pr/kip-396-admin-offsets branch from 2e502be to ad782bd Compare January 9, 2026 17:42
DCjanus and others added 10 commits January 10, 2026 02:44
Co-authored-by: Cassondra Foesch <puellanivis@users.noreply.github.com>
Signed-off-by: DCjanus <DCjanus@dcjanus.com>
Collaborator

@puellanivis puellanivis left a comment

Looks pretty good, just two things that are perhaps more style than substance.

Comment thread admin_offsets.go Outdated

for broker, req := range requests {
	wg.Add(1)
	go func(broker *Broker, req *brokerOffsetRequest) {
Collaborator

[nitpick] We’re compiling this under go.mod go version v1.24.0, which should be well past the loopvar semantics change that removed the need to pass these variables through arguments rather than accessing them by simple closure. 🤔 Though, I’m not sure how we would want to approach changing this common pattern, even if it is now unnecessary.

Contributor Author

Thanks for the style note.

Personally, I prefer passing loop variables into goroutines so it is clear what is shared.

I also want to match the project style. If you think it is better to rely on the Go 1.24 behavior, I can change it — or you can push a commit since maintainer edits are enabled.

Collaborator

@puellanivis puellanivis Jan 12, 2026

The usual problem is that, to know for sure which values are being passed into these parameters, one has to drop all the way to the bottom of the go func() { … }() to check. (Though, good here: you’re also shadowing the originals, so one cannot accidentally access the wrong one; whether one likes to be explicit or not, the implicit closure access is always there.) It also requires repeating the type information of the parameters, which is unnecessary when the variables are accessed through closure.

So, if we refactor a type name of one of the variables, this is then Yet Another Place where the type-name must be refactored.

PS: To be more clear, because I forgot to mention it, this is fine. 🤷‍♀️

Contributor Author

Thanks for the clarification — you convinced me. I switched this to simple closure capture in 3d11625.

In a language without a “no implicit capture” guarantee, explicit parameters don’t really reduce review burden, since you still have to check for other implicit captures.

Comment thread admin_offsets.go Outdated
}
broker.handleThrottledResponse(resp)

partitionResults := make(map[string]map[int32]*OffsetResult)
Collaborator

As a seeming reversal of my prior guidance; since this type is function internal, we could also use the topicPartition type as a key here, and avoid needing more than the one map.

(One of those cases where blackbox internals can choose different paradigms from the public API.)

Contributor Author

Thanks! Updated in 494f7c3.
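
The idea of a flat key internally and a nested map at the API boundary might look roughly like this. The topicPartition struct here is a local stand-in with assumed field names, and toNested is a hypothetical helper; the PR's actual internals may differ.

```go
package main

import "fmt"

// topicPartition is a local stand-in for a comparable composite map key.
type topicPartition struct {
	topic     string
	partition int32
}

// toNested converts the flat internal map into the nested
// map[string]map[int32]V shape used by the public API (V is int64 here).
func toNested(flat map[topicPartition]int64) map[string]map[int32]int64 {
	nested := make(map[string]map[int32]int64)
	for tp, offset := range flat {
		if nested[tp.topic] == nil {
			nested[tp.topic] = make(map[int32]int64)
		}
		nested[tp.topic][tp.partition] = offset
	}
	return nested
}

func main() {
	flat := map[topicPartition]int64{
		{topic: "orders", partition: 0}: 100,
		{topic: "orders", partition: 1}: 200,
		{topic: "payments", partition: 0}: 7,
	}
	nested := toNested(flat)
	fmt.Println(len(nested), nested["orders"][1])
}
```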

@dnwe dnwe added the feat label Jan 11, 2026
Collaborator

@dnwe dnwe left a comment

I did a quick first pass, one stylistic comment below, but also one main question here: is there a reason why we have the ca.retryOnError wrapper on AlterConsumerGroupOffsets, but not on any of the calls in ListOffsets?

Comment thread admin_offsets.go Outdated
Comment on lines +184 to +200
if conf.Version.IsAtLeast(V0_9_0_0) {
	request.Version = 2
} else {
	request.Version = 1
}
if conf.Version.IsAtLeast(V0_11_0_0) {
	request.Version = 3
}
if conf.Version.IsAtLeast(V2_0_0_0) {
	request.Version = 4
}
if conf.Version.IsAtLeast(V2_1_0_0) {
	request.Version = 6
}
if conf.Version.IsAtLeast(V2_3_0_0) {
	request.Version = 7
}
Collaborator

@dnwe dnwe Jan 15, 2026

Can we structure these in descending order like we do elsewhere in Sarama? e.g.,

if version.IsAtLeast(V2_5_0_0) {
	// Version 7 is adding the require stable flag.
	request.Version = 7
} else if version.IsAtLeast(V2_4_0_0) {
	// Version 6 is the first flexible version.
	request.Version = 6
} else if version.IsAtLeast(V2_1_0_0) {
	// Version 3, 4, and 5 are the same as version 2.
	request.Version = 5
} else if version.IsAtLeast(V2_0_0_0) {
	request.Version = 4
} else if version.IsAtLeast(V0_11_0_0) {
	request.Version = 3
} else if version.IsAtLeast(V0_10_2_0) {
	// Starting in version 2, the request can contain a null topics array to indicate that offsets
	// for all topics should be fetched. It also returns a top level error code
	// for group or coordinator level errors.
	request.Version = 2
} else if version.IsAtLeast(V0_8_2_0) {
	// In version 0, the request read offsets from ZK.
	//
	// Starting in version 1, the broker supports fetching offsets from the internal __consumer_offsets topic.
	request.Version = 1
}

It is somewhat stylistic (but also avoids multiple assignment). It is also useful to include the version-to-version additions (taken from Kafka's protocol JSON files) to show what changed between versions.

You may also want to move this into offset_commit_request.go as a NewOffsetCommitRequest func that sets up the correct version etc.

Contributor Author

  • Switched to descending version checks with explicit version-diff comments.
  • Moved the admin OffsetCommitRequest initializer to offset_commit_request.go as NewOffsetCommitRequest and used it from admin_offsets.go.

Done in commit c525eb0
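
As an illustration of the descending style, the sketch below restructures the ascending checks from the snippet originally under review. It is self-contained and makes several assumptions: kafkaVersion and isAtLeast are local stand-ins for Sarama's version type, offsetCommitVersion is a hypothetical name, and the thresholds are copied from that original snippet rather than checked against Kafka's protocol files or the merged NewOffsetCommitRequest.

```go
package main

import "fmt"

// kafkaVersion is a hypothetical stand-in for sarama.KafkaVersion.
type kafkaVersion [4]uint

// isAtLeast reports whether v >= other, comparing components left to right.
func (v kafkaVersion) isAtLeast(other kafkaVersion) bool {
	for i := range v {
		if v[i] != other[i] {
			return v[i] > other[i]
		}
	}
	return true
}

var (
	v0_9_0_0  = kafkaVersion{0, 9, 0, 0}
	v0_11_0_0 = kafkaVersion{0, 11, 0, 0}
	v2_0_0_0  = kafkaVersion{2, 0, 0, 0}
	v2_1_0_0  = kafkaVersion{2, 1, 0, 0}
	v2_3_0_0  = kafkaVersion{2, 3, 0, 0}
)

// offsetCommitVersion checks the newest threshold first, so each branch
// assigns exactly once and the first match wins.
func offsetCommitVersion(v kafkaVersion) int16 {
	if v.isAtLeast(v2_3_0_0) {
		return 7
	}
	if v.isAtLeast(v2_1_0_0) {
		return 6
	}
	if v.isAtLeast(v2_0_0_0) {
		return 4
	}
	if v.isAtLeast(v0_11_0_0) {
		return 3
	}
	if v.isAtLeast(v0_9_0_0) {
		return 2
	}
	return 1
}

func main() {
	fmt.Println(offsetCommitVersion(kafkaVersion{2, 2, 0, 0}))
}
```

Compared with the ascending form, no version is assigned and then overwritten, which is the "avoids multiple assignment" point made in the review.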

Collaborator

@puellanivis puellanivis Jan 15, 2026

I wanna add as an aside, but we can also switch to a more go-style:

switch {
case version.IsAtLeast(V2_5_0_0):
case version.IsAtLeast(V2_4_0_0):
...
}

But I’ve not mentioned this before, because the style already here is if … else if … else … cascades.

Contributor Author

Thanks for the suggestion! I agree it’s more Go‑style.

To keep this PR focused and clean, I’d prefer not to change this here.

@dnwe if you also prefer the switch {} style, I can follow up with a small PR to convert similar cascades across the project.

Contributor Author

DCjanus commented Jan 15, 2026

> I did a quick first pass, one stylistic comment below, but also one main question here: is there a reason why we have the ca.retryOnError wrapper on AlterConsumerGroupOffsets, but not on any of the calls in ListOffsets?

I missed the retry logic earlier; added it in commit 4446246.

@DCjanus DCjanus requested a review from dnwe February 24, 2026 01:48
Contributor Author

DCjanus commented Feb 26, 2026

Hi @dnwe,

I saw the new AI-assistance guidance in #3452, including the note about not putting AI tools/models in Co-authored-by.

Some commits in this PR include: Co-authored-by: OpenAI Codex <codex@openai.com>

I add that via a global AGENTS.md rule across repos to keep AI disclosure explicit and consistent. This also makes it easy for maintainers who prefer to avoid AI-assisted contributions to identify them quickly (I may have a different view, but I respect that preference).

I also understand why Signed-off-by should remain human-only, since AI agents do not have legal identity for attestation.

Before opening PRs, I still review and validate all changes myself.

Given the new policy, what would you prefer for this PR?
Should I rewrite history (force-push) to remove those Co-authored-by trailers, or keep existing commits as-is and follow the new rule from now on?

Happy to follow your preference.

(Also, this comment was lightly polished with AI for clarity because English is not my native language, and I want to avoid ambiguity in technical communication.)

Contributor Author

DCjanus commented Feb 26, 2026

@dnwe Quick correction: for this PR, I had not enabled my AGENTS.md co-author rule yet, so these commits do not include an AI Co-authored-by trailer.

I still want to discuss the policy in general, but this specific PR is not an example of that case.

@puellanivis
Collaborator

So, it’s inappropriate to attribute co-authorship to anything that is not a person, and thus not able to hold any copyrights. Having random public domain uncopyrightable code mixed into your PR muddies the protections and/or requirements that may be necessary for license compliance.

Comment thread admin.go
Comment on lines +248 to +249
var netErr net.Error
return errors.As(err, &netErr) && netErr.Timeout()
Collaborator

Ah, errors.AsType is in v1.26 so not available yet for us.

Development

Successfully merging this pull request may close these issues.

KIP-396: Add Reset/List Offsets Operations to AdminClient

3 participants