
Conversation

@agavra (Contributor) commented Feb 17, 2025

This patch has a few changes:

  • It introduces the mechanism for counting origin events via OriginEventRecorder, which inserts headers as records are consumed if (and only if) they do not already contain the origin-event marker.
  • It adds a UsageBasedV1 LicenseInfo that acts as a placeholder for our usage-based license when we introduce it. This is necessary so that existing licenses don't suddenly start reporting origin events / adding the headers.
  • It adds reporting of origin events to a server, a TestLicenseServer that imitates a license server, and integration tests to make sure it all works with complicated subtopologies.

These changes are split across three separate commits if you want to review them one at a time.


public class ResponsiveKafkaStreams extends KafkaStreams {

  private static final String SIGNING_KEYS_PATH = "/responsive-license-keys/license-keys.json";
@agavra (Contributor, Author):

All of this license stuff was moved to LicenseUtils.

Comment on lines 95 to 100
  final var header = record.headers().lastHeader(ORIGIN_EVENT_HEADER_KEY);
  if (header == null) {
    record.headers().add(ORIGIN_EVENT_HEADER_KEY, ORIGIN_EVENT_MARK);
    inc(new TopicPartition(record.topic(), record.partition()));
  }
}
@agavra (Contributor, Author):

This is the main part of the PR: it records a new origin event and marks the record as an origin event.

for (final ConsumerRecord<K, V> record : records) {
  final var header = record.headers().lastHeader(ORIGIN_EVENT_HEADER_KEY);
  if (header == null) {
    record.headers().add(ORIGIN_EVENT_HEADER_KEY, ORIGIN_EVENT_MARK);
Contributor:

Do these headers actually get propagated down to the producer when it writes? I wonder if it would make more sense to just add the headers from the producer itself, to prevent the header from being removed or messed with by Streams.

Contributor:

On a related note, don't we need to filter out changelog records? Though I suppose they are only read by the restore consumer.

@agavra (Contributor, Author) commented Feb 21, 2025:

They do get propagated (I have a test for most of the DSL operators to confirm), but I think you're right that we might as well just add the headers from the producers anyway.

That would also fix the changelog-record problem, since we'll then produce changelog records with the header already attached.
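As a rough illustration, producer-side marking could look something like the sketch below. This is not the PR's code; it reuses the ORIGIN_EVENT_HEADER_KEY and ORIGIN_EVENT_MARK constants from the diff, and markOnSend is a hypothetical hook on the produce path.

import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch only: stamp every outgoing record with the origin-event marker
// (if absent) before it is written, so the header survives regardless of
// what Streams does with the consumed records' headers.
static <K, V> ProducerRecord<K, V> markOnSend(final ProducerRecord<K, V> record) {
  if (record.headers().lastHeader(ORIGIN_EVENT_HEADER_KEY) == null) {
    record.headers().add(ORIGIN_EVENT_HEADER_KEY, ORIGIN_EVENT_MARK);
  }
  return record;
}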

final var header = record.headers().lastHeader(ORIGIN_EVENT_HEADER_KEY);
if (header == null) {
  record.headers().add(ORIGIN_EVENT_HEADER_KEY, ORIGIN_EVENT_MARK);
  inc(new TopicPartition(record.topic(), record.partition()));
Contributor:

I think if a task gets restarted within a single stream thread without being unassigned, then we'll double-count origin events. I'm not sure of the exact circumstances where that happens, but I'm pretty sure there are a few paths that close/suspend a task and then revive it; one I know of is when an EOS task is corrupted. One option here would be to track the last observed offset for each partition, and only bump the count if the polled record is at a new offset.

@agavra (Contributor, Author):

This is fixed with the new offset tracker.


@Override
public void onCommit(final Map<TopicPartition, OffsetAndMetadata> offsets) {
  final var now = System.currentTimeMillis();
Contributor:

The offsets being committed aren't necessarily what's been polled so far. Streams puts the polled records into its own buffer and can commit at any time, even if the internal buffer isn't drained. So technically we should only be reporting origin events up to the offsets being committed. How to do that is tricky; need to think about it some more.

@agavra (Contributor, Author):

OK, I fixed this (and I believe the previous issue as well) by tracking origin events in a bitset instead of just a long, and then counting that bitset on commit. It's a bit heavyweight, but I think it should be fine: if we are tracking 100K offsets per commit, we only need ~1.5K longs, which is hardly any memory overhead, and ~1.5K bitwise ops for counting the number of 1s shouldn't take much time either. PTAL
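A rough sketch of the bitset idea (assumed shape only: the mark/countAndShift names mirror the class discussed further down, but the real implementation may differ):

import java.util.BitSet;

class OriginEventTracker {
  private long baseOffset;             // the offset that maps to bit 0
  private BitSet marks = new BitSet();

  OriginEventTracker(final long baseOffset) {
    this.baseOffset = baseOffset;
  }

  // record that the event at this offset was an origin event; re-marking
  // the same offset after a task restart is a no-op, so no double-counting
  void mark(final long offset) {
    if (offset < baseOffset) {
      throw new IllegalArgumentException("offset " + offset + " is below base " + baseOffset);
    }
    marks.set(Math.toIntExact(offset - baseOffset));
  }

  // count origin events below the committed offset, then slide the window
  // forward so only records at or above the commit point remain tracked
  long countAndShift(final long committedOffset) {
    final int upTo = Math.toIntExact(committedOffset - baseOffset);
    final long count = marks.get(0, upTo).cardinality();
    marks = marks.get(upTo, Math.max(upTo, marks.length()));
    baseOffset = committedOffset;
    return count;
  }
}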

@ableegoldman (Contributor) commented Feb 26, 2025:

Yeah, this is a good catch; what's been polled != what's being committed. I think we could find other ways to work around this (for example by wrapping the ConsumerRecord class), but your bitwise solution seems efficient and simple enough.

Just make sure we explain all this in the javadocs, because without the context it's pretty hard to understand what it's there for lol

  verifyTimedTrialV1((TimedTrialV1) licenseInfo);
  LOG.info("Checked and confirmed valid Time Trial license");
} else if (licenseInfo instanceof UsageBasedV1) {
  LOG.info("Checked and confirmed valid Usage Based license");
Contributor:

Is this where we will eventually check that the license is valid?

@agavra (Contributor, Author):

yeah, this is where we could issue the remote check if we wanted to

log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c:%L)%n

log4j.logger.dev.responsive.kafka.api.async=DEBUG
#log4j.logger.dev.responsive.kafka.internal.clients.OriginEventRecorder=DEBUG
Contributor:

did you mean to include this?

@agavra (Contributor, Author):

Yeah, I did; I was using it earlier and figured I'd probably be turning it off and on. I can remove it though if it bothers our perfectionism 😆

Contributor:

nope I'm good with it, just wanted to make sure!


public void mark(final long offset) {
  if (offset < baseOffset) {
    throw new IllegalArgumentException(
@ableegoldman (Contributor) commented Feb 22, 2025:

nit: can we log an error before throwing that also includes the offset values? (here and below)
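For illustration, the nit applied might look like this (hypothetical; the merged code may differ):

public void mark(final long offset) {
  if (offset < baseOffset) {
    // log before throwing so the offending offset values show up in the logs
    LOG.error("Cannot mark offset {} below the tracker's base offset {}", offset, baseOffset);
    throw new IllegalArgumentException(
        "cannot mark offset " + offset + " below base offset " + baseOffset);
  }
  // ... mark the offset as before
}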

Comment on lines +17 to +21
/**
* This class allows us to efficiently count the number of events
* between two offsets that match a certain condition. This is somewhat
* memory efficient in that we can track 100K offsets with ~1.5K longs
* (64 bits per long), or roughly 12KB.
Contributor:

I'm a little confused by the point of this class. This is just counting the number of origin events since the last commit, right? It doesn't seem like we ever expose what the actual origin-event offsets are, so why do we need to save them in this BitSet scheme; that is, why not just use an int that increments in #mark and resets/returns in #countAndShift?

Lmk if you'd rather sync online because I'm sure I'm missing something here 🙂

@agavra (Contributor, Author):

This class should definitely have javadoc explaining it. Originally I had implemented what you're suggesting, but Rohan pointed out these two concerns: #424 (comment) and #424 (comment), both of which are fixed by this bitset tracker.

Contributor:

ah, I missed those discussions, but yeah he's right. thanks for filling in the javadocs

@rodesai (Contributor) left a comment:

Looks good! Would still be good to get @ableegoldman's feedback on whether the problems I raised that require the bitset make sense.

@rodesai (Contributor) left a comment:

LGTM!

@ableegoldman (Contributor) left a comment:

LGTM!


@agavra merged commit 0b754aa into main on Feb 26, 2025 (1 check passed).
@agavra deleted the origin_events branch on February 26, 2025.