-
Notifications
You must be signed in to change notification settings - Fork 2
Merge latest confluentinc/librdkafka into master #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…4794) and add build checks with different configurations
…ide during send push telemetry call (confluentinc#4826)
fallback happens when any topic to fetch has a zero topic id, that can happen if the cluster has a Fetch version that is greater or equal to 13 but an inter broker protocol version that is less than 2.8
add missing consumer metrics described in the KIP: * consumer.coordinator.rebalance.latency.avg * consumer.coordinator.rebalance.latency.max * consumer.coordinator.rebalance.latency.total * consumer.fetch.manager.fetch.latency.avg * consumer.fetch.manager.fetch.latency.max * consumer.poll.idle.ratio.avg * consumer.coordinator.commit.latency.avg * consumer.coordinator.commit.latency.max additionally: * add unit tests for all the metrics * add integrations tests with the producer or consumer while they're active * configurable group initial rebalance delay ms to make integration tests reusable with both producer and consumer --------- Co-authored-by: mahajanadhitya <[email protected]> Co-authored-by: Anchit Jain <[email protected]> Co-authored-by: Emanuele Sabellico <[email protected]>
also fix macOS build issue --------- Co-authored-by: Emanuele Sabellico <[email protected]>
…inc#4860) Co-authored-by: mahajanadhitya <[email protected]>
…tions (confluentinc#4845) * electLeaders Api 1st draft * Minor change * Elect Leaders API final * minor change * Added a function * Minor Change * removed binary file from pr * Style fix * Resolve documentation errors * resolved build errors * first round of comments * name fixes * second round of comments * memory deallocation fix * Changelog changes addition * changes * variable naming change * third round comments * grammar change * latest changes * additional formatting * documentation changes * additional comments * one more round comments * election type count * name change election type cnt * new comments * removing election type check * removing merge conflicts * changelog.md changes * new changes * Fixing failing tests * Moved top level error code from response to event * changed topic partition result structure * style fix * remove null partitions check * requested changes * introduction md formatting * requested changes * partitions cnt print --------- Co-authored-by: Pranav Rathi <[email protected]>
Co-authored-by: John "Preston" Mille <[email protected]>
from external API and avoid exposing internal destructor
…fluentinc#4871) AK 2.7 is the first version implementing Fetch 12, before that it shouldn't fallback to v12, neither check if topic IDs are supported.
…#4800) when potential topic partitions are less than number of members.
…nfluentinc#4754) as replicas reply with "NOT_LEADER_OR_FOLLOWER" when using the replica id, Java clients sends requests to the leader too. Co-authored-by: Kyle Phelps <[email protected]>
…a instances (confluentinc#4724) Circular dependencies from a partition fetch queue message to the same partition blocked the destroy of an instance, that happened in case the partition was removed from the cluster while it was being consumed. Solved by purging internal partition queue, after being stopped and removed, to allow reference count to reach zero and trigger a destroy. Purging internal fetch queue on removing the partition only for the consumer.
* Security upgrade for OpenSSL and Curl, CVEs fixed: OpenSSL - CVE-2024-2511 - CVE-2024-4603 - CVE-2024-4741 - CVE-2024-5535 - CVE-2024-6119 CURL - CVE-2024-8096 - CVE-2024-7264 - CVE-2024-6874 - CVE-2024-6197 * Fix for curl configure failure caused by curl/curl#14373
) must be equal to the server sent nonce, that already contains the client side nonce. librdkafka was incorrectly concatenating the client side nonce again, leading to this fix being made on AK side, released in 3.8.1, with endsWith instead of equals. apache/kafka@0a00456
except the Style check job because it needs clang format 10 Fix to inherit javac path, needed by test 0098
Mock handler implementation Rename current consumer protocol from generic to classic Mock handler with automatic or manual assignment More consumer group metadata getters Test helpers Configurable session timeout and HB interval Fix mock handler ListOffsets response LeaderEpoch instead of CurrentLeaderEpoch Integration tests passing with AK trunk Improve documentation and KIP 848 specific mock tests Add mock tests for unknown topic id in metadata request and partial reconciliation Make test 0147 more reliable Fix test 0106 after HB timeout change Exclude test case with AK trunk Rename rd_kafka_buf_write_tags to rd_kafka_buf_write_tags_empty Trivup 0.12.5 can run a KafkaCluster directly with KRaft and AK trunk Trivup 0.12.6 build with a specific commit Trivup 0.12.7 with fixes for AK 3.8.0 and Py 3.12 New version of trivup 0.12.7 to fix an issue with apache/kafka#16464 on AK > 3.8.0 Static group membership mock tests Move test 0147 to a different PR Disable interactive "needsrestart" prompt
* test_read_file can read binary files too * Trivup 0.12.8 * Read certificate CA chain when set using a configuration setter with PEM format. Test that CA with untrusted chain fails authentication. * Test untrusted certificate signed with an intermediate CA * Remove private key and duplicate certs from pem client certificate * Print logs sent as events * Trivup now already inheriths the environment in interactive mode * Use namespace to avoid conflicts on TestEventCb
…h with client certificate chain (confluentinc#4900) Failing test: expect the error code that is received when no certificate is sent instead of the one received when it's sent but not trusted. Client cert callback to check if trusted certificate authorities match with client certificate chain. Log a warning when client certificate isn't sent --------- Co-authored-by: trnguyencflt <[email protected]>
… leader epoch (confluentinc#4901) Failing tests including for confluentinc#4796 and confluentinc#4804 Closes confluentinc#4796 and confluentinc#4804 CHANGELOG Fix for the correct expected RPC code in test 0139 Apply same fix to metadata update operation too Don't change rktp state to active when there's no leader but wait it's available to validate it Comment about excluded -1 value
An incorrect assumption is made that libssl is built with support for the (now-deprecated) ENGINE API if it is provided by OpenSSL >= 1.1.0 or LibreSSL. OPENSSL_NO_ENGINE is defined by OpenSSL and all of its forks if the ENGINE API was disabled at compile-time - ensure that the definition of OPENSSL_NO_ENGINE is taken into account when using ENGINE features.
…and client is using SASL authentication only (confluentinc#4936) without any client certificate set
…e with big-endian architectures (confluentinc#5183) * Fix compression types read issue in GetTelemetrySubscriptions response for big-endian architectures * Decrease allocated buffer size in `rd_kafka_PushTelemetryRequest` and explicitly cast the enum
…roupHeartbeat not updating member epoch in a case (confluentinc#4672) [KIP-848] Fixed a condition where error was being raised in commit due to old error in the topic partition [KIP-848] Fix discarding heartbeat response without epoch update when leaving during inflight HB
Re-bootstrap is now triggered only after metadata.recovery.rebootstrap.trigger.ms have passed since first metadata refresh request after last successful metadata response. The calculation was since last successful metadata response so it's possible it did overlap with the periodic topic.metadata.refresh.interval.ms and cause a re-bootstrap even if not needed.
…m them (confluentinc#4931) * Fetched committed offsets should be validated before starting to consume from it. Failing test and mock handler implementation for returning the committed offset leader epoch instead of current leader epoch. * Validate the offsets before starting to fetch assigned partitions * Add more test cases for partition assignment offset validation * Fix for test 0139 subtest `do_test_store_offset_without_leader_epoch` . When fetching an offset it returns the leader epoch used when committing, not the current leader epoch. Given the mock cluster fix the test needs to be changed. * Fix test `0139` subtest `do_test_list_offsets_leader_change`: use cloned partition list for listing offsets, to avoid the fake leader epoch is then used for validation when assigning. Fix ListOffsets mock handler for logging the correct returned leader epoch. * Changelog entry * Reduce number of tests in quick mode * Add a new fetch state when finishing validating and starting to seek after a truncation, to avoid a second repeated validation and possibly duplicated messages. * Increase single test timeout * Fix to leave the group in `rd_kafka_cgrp_incr_unassign_done` if terminate was requested, as done in `rd_kafka_cgrp_unassign_done` and `rd_kafka_cgrp_consumer_incr_unassign_done` * Mock cluster, set the group as empty when last member leaves instead of triggering a rebalance * Test 0139 with mock cluster marked as local. Doesn't delete topic if tests are local only as it's possible there's no cluster to connect to and it speeds up completing the test * Resume the partition before fetch start or before validation
--------- Co-authored-by: Arthur O'Dwyer <[email protected]>
* Revert setting timeout to infinity * style fix * Changelog change * Changelog changes * Changelog change
* Fix flakyness test 0085 * Errors that cause a refresh coordinator like NOT_COORDINATOR during an offset fetch should not be propagated to the application.
…al promotions (confluentinc#5191) * Pipeline improvements about machine types and auto-cancel * Use cached docker image for integration tests, style checks, docs build * vcpkg cache * msys2 cache * Upgrade macOS agents
…fluentinc#5155) * Implementation of OAUTHBEARER/OIDC metadata based authentication, initially supporting the Azure UAMI method. * Tests with trivup 0.14.0 supporting metadata based authentications * Add documentation and changelog entry * Rename `azure` value to `azure_imds` and replace UAMI that is the identity with IMDS that is the authentication service * Extract authentication URL and rename internal function and enums * Changes to name the configuration property "query" instead of "params" as in other implementations and to make it optional if the default endpoint is overridden.
…sume from them (confluentinc#4931)" (confluentinc#5207) This reverts commit 13a2bba.
…odes (confluentinc#5194) * Add test cases for new OffsetCommit and OffsetFetch Error Codes * Testcase for discarding the member epoch in a consumer group heartbeat response when leaving with an inflight HB
confluentinc#5214) * Changelog changes and some modification to the KIP-848 migration guide * Add that KIP-848 is not enabled by default and other PR comments
* Downgrade min supported OSX version to 13 * Version upgrade to v2.12.1
teskje
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing this, can we reset master to the upstream master and Cherry-Pick our commits on top? I fear otherwise we will hopelessly lose track of the changes we made.
|
Unfortunately, this also doesn't build. :/ It looks like one of the commits we've reverted in our fork has been built upon so that future changes will also need to be unpicked. It's probably worthwhile to go through the small number of extra commits we have and triage individually, but that will take longer... |
|
I discussed that revert with @petrosagg some time ago and it's worth trying to upgrade without it. For context, this was added to fix unexplained memory errors. Two relevant Slack threads I could find:
Since the reason for the memory issues is unexplained, there is a chance that they have been fixed in current librdkafka versions, or have been fixed by other random changes we made. |
|
In that case I think all changes we have (confluentinc/librdkafka@master...MaterializeInc:librdkafka:master) are already included in upstream librdkafka or supplanted by a better change, so we could just switch our rust-rdkafka to use the upstream librdkafka. Edit: MaterializeInc/rust-rdkafka#35 |
No description provided.