Valkey release 9.0.0-rc2 #2628
Conversation
…y for RDMA benchmark. (valkey-io#2430)

(1) The old logic may result in the RDMA event being acknowledged unexpectedly in the following two scenarios:
* ibv_get_cq_event returns an EAGAIN error.
* ibv_get_cq_event returns one event, but it may be acknowledged multiple times in the poll-cq loop.

(2) In the benchmark results of Valkey over RDMA, the tail latency is as high as 177 milliseconds (almost 80x that of TCP). This results from an incorrect benchmark client setup, which included the connection setup time in the benchmark latency recording. This patch fixes the tail latency issue by modifying valkey-benchmark.c. The change only affects benchmarks over RDMA, as the updates are gated by the USE_RDMA macro. There are further updates for Valkey over RDMA, but I will create separate pull requests for those.

---------

Signed-off-by: Ruihong Wang <[email protected]>
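A rough sketch of the completion-event handling pattern described in point (1), assuming only libibverbs; the function name and structure are illustrative, not the actual Valkey RDMA code. Events are counted while draining the channel and acknowledged exactly once with the exact count, and an EAGAIN from `ibv_get_cq_event` acknowledges nothing.

```c
/* Illustrative sketch (not the Valkey RDMA code): drain the completion
 * channel, counting events, and acknowledge them once with the exact count.
 * An EAGAIN from ibv_get_cq_event() must not lead to any acknowledgement. */
#include <infiniband/verbs.h>
#include <errno.h>

int drain_cq_events(struct ibv_comp_channel *channel) {
    struct ibv_cq *cq = NULL;
    void *cq_ctx;
    unsigned int nevents = 0;

    /* Non-blocking channel: ibv_get_cq_event() fails with EAGAIN when empty. */
    while (ibv_get_cq_event(channel, &cq, &cq_ctx) == 0) {
        nevents++;                                /* count, don't ack inside the loop */
        if (ibv_req_notify_cq(cq, 0)) return -1;  /* re-arm notifications */
    }
    if (errno != EAGAIN) return -1;               /* real error: nothing to ack */

    if (nevents) ibv_ack_cq_events(cq, nevents);  /* ack exactly what was received */
    return 0;
}
```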
…y-io#2474) fixes: valkey-io#2461 --------- Signed-off-by: Ran Shidlansik <[email protected]>
…alkey-io#2475) Currently HSETEX always generates an `hset` notification. In order to align with the generic `set` command, it should only generate `hset` if the provided time-to-live is a valid future time. --------- Signed-off-by: Ran Shidlansik <[email protected]>
In valkey-io#2431 we changed the assert to an if condition, and the test caused some trouble. Now we just remove the assert (if condition) and disable the test for now due to valkey-io#2441. Signed-off-by: Binbin <[email protected]>
…ns into a single log entry (valkey-io#2481) Previously, each slot migration was logged individually, which could lead to log spam in scenarios where many slots are migrated at once. This commit enhances the logging mechanism to group consecutive slot migrations into a single log entry, improving log readability and reducing noise.

Log snippets:

```
1661951:S 13 Aug 2025 15:47:10.132 * Slot range [16383, 16383] is migrated from node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609 to node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646.
1661951:S 13 Aug 2025 15:47:10.289 * Slot range [10924, 16383] is migrated from node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646 to node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609.
1661951:S 13 Aug 2025 15:47:10.524 * Slot range [10924, 16383] is migrated from node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609 to node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646.
```

---------

Signed-off-by: Ping Xie <[email protected]>
Make sure slot migration has finished before moving on to the next test. Resolves valkey-io#2479 Signed-off-by: Sarthak Aggarwal <[email protected]>
…io#2273) Automatically attach the respective label to newly filed issues. --------- Signed-off-by: Sarthak Aggarwal <[email protected]> Signed-off-by: Sarthak Aggarwal <[email protected]> Co-authored-by: Harkrishn Patro <[email protected]>
) The change ensures that the slot is present on the node before the slot is populated. This avoids errors while populating the slot. Resolves valkey-io#2480 --------- Signed-off-by: Sarthak Aggarwal <[email protected]>
Similar to dicts, we disallow resizing while the hashtable is rehashing. In the previous code, if a resize was triggered during rehashing, for example if the rehashing wasn't fast enough, we would spin in a while loop until the rehashing was complete, which could be a potential issue when resizing. --------- Signed-off-by: Binbin <[email protected]>
* Use pipelines of length 1000 instead of up to 200000. * Use CLIENT REPLY OFF instead of reading and discarding the replies. Fixes valkey-io#2205 Signed-off-by: Viktor Söderqvist <[email protected]>
https://github.com/valkey-io/valkey/blob/aefed3d363d1c7d7cb391d3f605484c78c9a88f2/src/networking.c#L2279-L2293

From the above code, we can see that `c->repl_data->ref_block_pos` can be equal to `o->used`. When `o->used == o->size`, we may call SSL_write() with num=0, which does not comply with the OpenSSL specification (ref: https://docs.openssl.org/master/man3/SSL_write/#warnings). What's worse is that it is still the case after reconnection. See https://github.com/valkey-io/valkey/blob/aefed3d363d1c7d7cb391d3f605484c78c9a88f2/src/replication.c#L756-L769. So in this case the replica will keep reconnecting again and again until it no longer meets the requirements for partial synchronization.

Resolves valkey-io#2119

---------

Signed-off-by: yzc-yzc <[email protected]>
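A minimal sketch of the kind of guard this implies, with a hypothetical wrapper name (`tls_write_chunk`) rather than the actual Valkey fix: never hand OpenSSL a zero-length buffer, since SSL_write() with num=0 is explicitly warned against.

```c
/* Minimal sketch, not the actual Valkey fix: skip zero-length writes so we
 * never call SSL_write() with num == 0. */
#include <openssl/ssl.h>
#include <sys/types.h>

ssize_t tls_write_chunk(SSL *ssl, const char *buf, size_t pos, size_t used) {
    size_t remaining = used - pos;      /* e.g. o->used - ref_block_pos */
    if (remaining == 0) return 0;       /* nothing to send; avoid SSL_write(ssl, p, 0) */
    int n = SSL_write(ssl, buf + pos, (int)remaining);
    if (n <= 0) return -1;              /* caller inspects SSL_get_error() */
    return n;
}
```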
…rMessage (valkey-io#2484) Fix valkey-io#2438. Modified the `DingReceiver` function in `tests/modules/cluster.c` by adding null-termination logic for cross-version compatibility. --------- Signed-off-by: Hanxi Zhang <[email protected]>
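The general shape of such a fix, as a hedged sketch (the helper name is hypothetical and this is not the actual `DingReceiver` change): copy the received payload into a buffer that is explicitly null-terminated before it is treated as a C string, since senders on other versions may not include the terminator.

```c
/* Hypothetical sketch: defensively null-terminate a received message payload
 * before using it as a C string. */
#include <stdlib.h>
#include <string.h>

char *copy_payload_cstring(const char *payload, size_t len) {
    char *buf = malloc(len + 1);
    if (buf == NULL) return NULL;
    memcpy(buf, payload, len);
    buf[len] = '\0';   /* guarantee termination regardless of the sender's version */
    return buf;
}
```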
Simplifies a test case seen to be flaky. fixes: valkey-io#2482 Signed-off-by: Ran Shidlansik <[email protected]>
We recently introduced a new template for creating `test failures` issues. This change makes the template visible in the `CONTRIBUTING.md` file. Also added a tip to paste the stack trace, since the output behind CI links can expire. --------- Signed-off-by: Sarthak Aggarwal <[email protected]>
…2502) There are memory leaks when we return NULL. Signed-off-by: Binbin <[email protected]>
Previously, the config names and values were stored in a temporary dict. This ensures that no duplicates are returned, but it also makes the order random. In this PR, the config names and values are still stored in the temporary dict, but they are then copied to an array, which is sorted before the reply is sent. Resolves valkey-io#2042 --------- Signed-off-by: yzc-yzc <[email protected]>
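A small sketch of the copy-then-sort step, with illustrative names rather than the actual config.c helpers: the de-duplicated names come out of the dict, go into an array, and are sorted with `qsort` so the reply order is deterministic.

```c
/* Illustrative sketch of sorting config names before building the reply;
 * the names array and the reply step are placeholders, not real config.c code. */
#include <stdlib.h>
#include <string.h>

static int cmpConfigName(const void *a, const void *b) {
    return strcmp(*(char *const *)a, *(char *const *)b);
}

void replyConfigsSorted(char **names, size_t count) {
    qsort(names, count, sizeof(char *), cmpConfigName);
    for (size_t i = 0; i < count; i++) {
        /* addReplyBulkCString(c, names[i]); then the corresponding value */
    }
}
```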
We now pass rdbSnapshotOptions options into this function, and options.conns is now malloc'ed on the caller side, so we need to zfree it when returning early due to an error. Previously, conns was malloc'ed after the error handling, so we didn't have this problem. Introduced in valkey-io#1949. --------- Signed-off-by: Binbin <[email protected]>
In clusters with a very short node timeout, such as 2-3 seconds, the extra failover delay of 500-1000 milliseconds (500 plus a random value of 0-500; 750 on average) before initiating a failover is significant extra downtime for the cluster. This PR makes this delay relative to the node timeout, using a shorter failover delay for a smaller configured node timeout. The formula is `fixed_delay = min(500, node_timeout / 30)`.

| Node timeout | Fixed failover delay |
|---------------|----------------------|
| 15000 or more | 500 (same as before) |
| 7500 | 250 |
| 3000 | 100 |
| 1500 | 50 |

Additional change: add an extra 500ms delay to new replicas that may not yet know about the other replicas. This avoids the scenario where a new replica with no data wins the failover. This change turned out to be needed for the stability of some test cases.

The purposes of the failover delay are:
1. Allow FAIL to propagate to the voting primaries in the cluster.
2. Allow replicas to exchange their offsets, so they have a correct view of their own rank.

A third (undocumented) purpose of this delay is to allow newly added replicas to discover the other replicas in the cluster via gossip and compute their rank, so they can realize they are not the best replica. This case is mitigated by adding another 500ms delay to new replicas, i.e. replicas with replication offset 0. A low node timeout only makes sense in fast networks, so we can assume that the above needs less time than in a cluster with a higher node timeout.

These delays don't affect the correctness of the algorithm. They are there to increase the probability that a failover will succeed, by making sure the FAIL message has enough time to propagate in the cluster; the random part reduces the probability that two replicas initiate the failover at the same time.

The typical use case is when data consistency matters and writes can't be skipped. For example, in some applications, writes are buffered in the application during node failures so they can be applied when the failover is completed. The application can't buffer them for very long, so we need the cluster to be up again within e.g. 5 seconds from the time a node starts to fail.

I hope this PR can be considered safer than valkey-io#2227, although the two changes are orthogonal. Part of issue valkey-io#2023.

---------

Signed-off-by: Viktor Söderqvist <[email protected]>
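Here is the formula above written out as code; this is a hedged sketch with illustrative names and millisecond units, not the actual cluster.c implementation.

```c
/* Sketch of the delay described above (milliseconds); names are illustrative. */
static long long failover_fixed_delay(long long node_timeout_ms) {
    long long delay = node_timeout_ms / 30;
    if (delay > 500) delay = 500;        /* fixed_delay = min(500, node_timeout / 30) */
    return delay;
}

static long long failover_delay(long long node_timeout_ms, long long repl_offset) {
    long long delay = failover_fixed_delay(node_timeout_ms);
    if (repl_offset == 0) delay += 500;  /* brand-new replica: wait to learn its rank */
    /* The caller would also add the random component described above. */
    return delay;
}
```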
…-io#2500) Minor cleanup. Signed-off-by: Binbin <[email protected]>
Several fixes:
1. Fix not using a bool input type for hashTypeIgnoreTTL (probably lost during the 3 HFE PR merges).
2. Remove the vset change that switches the hashtable encoding to single. The current code is a bug: the entry is probably expired (or about to expire soon), so we can leave it as a hashtable until it does.
3. Enable incremental rehashing for the volatile item keys kvstore. This is the central issue of this PR: without it, active expiry might not scan the kvstore, which is very fragmented with lots of empty buckets.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
valkey-io#2466) If we want kvstoreHashtableExpand to actually expand, we need to make sure the hashtable exists. Currently, when processing RDB slot-info, our expand has no effect because the hashtable does not exist (we initialize it only when we need it). We also update kvstoreExpand to use kvstoreHashtableExpand, to make sure there is only one code path. Also see valkey-io#1199 for more details. Signed-off-by: Binbin <[email protected]>
Fixes valkey-io#2372

## Description

This PR adds an automated workflow that assigns PR authors to their own pull requests when opened or reopened.

## Changes

- Added `.github/workflows/auto-author-assign.yml` workflow
- Uses `toshimaru/[email protected]` action
- Triggers on `pull_request_target` events (opened, reopened)
- Requires `pull-requests: write` permission

## Benefits

- Improves PR tracking and organization
- Automatically assigns responsibility to PR authors
- Reduces manual assignment overhead for maintainers
- Helps contributors track their own PRs more easily

## Testing

✅ **Tested on my fork before submission:**

1. Merged the workflow into my fork's unstable branch
2. Created a test PR within my fork
3. Verified that I was automatically assigned as the assignee
4. Screenshot showing successful auto-assignment: <img width="1278" height="684" alt="Screenshot 2025-08-01 at 3 39 05 PM" src="https://github.com/user-attachments/assets/9ad643be-5eac-4ad6-bec7-184cf22e9cbd" />

The workflow executed successfully and assigned me to the test PR as expected.

---------

Signed-off-by: Hanxi Zhang <[email protected]>
This change avoids creating additional failure reports if the node is already marked as failed. The failure report count has never been used after a node has been marked as failed, so there is no value in maintaining it further. This reduces the work of both adding and deleting failure reports, hence the performance benefit. We observed an average 10% reduction in p99 CPU utilization (in a 2000-node cluster (1000 primaries / 1000 replicas) with 330 nodes in a failed state) with this change. --------- Signed-off-by: Seungmin Lee <[email protected]>
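A schematic version of the optimization, using made-up names rather than the real cluster.c symbols: once the FAIL flag is set on a node, new failure reports are simply not recorded, since the report count is never consulted after that point.

```c
/* Schematic sketch with illustrative names: skip failure-report bookkeeping
 * for nodes that are already marked FAIL. */
#define NODE_FLAG_FAIL (1 << 0)

struct node {
    int flags;
    /* ... failure report list, other fields ... */
};

void maybe_add_failure_report(struct node *failing, struct node *sender) {
    if (failing->flags & NODE_FLAG_FAIL) return; /* already failed: a report adds no value */
    /* add_failure_report(failing, sender); */
    (void)sender;
}
```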
…de as primary (valkey-io#2510) Fixes: valkey-io#2460 With this change we avoid divergence between the cluster and replication layer views. I've observed that a node can be marked as primary in the cluster while it is marked as a replica in the replication layer view and has an active replication link. Without this change, we used to end up with an invalid replica chain in the replication layer. --------- Signed-off-by: Harkrishn Patro <[email protected]>
…alkey-io#2523) Signed-off-by: Ran Shidlansik <[email protected]>
Currently HSETEX verifies that the number of provided fields matches the declared number of fields by using division. Instead, it can compare against the multiplication, in order to prevent rounding the verified value down. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Madelyn Olson <[email protected]>
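To make the rounding point concrete, here is a toy comparison (names and the `step` parameter are illustrative, not the actual HSETEX parser): dividing the remaining argument count can round down and accept a malformed command, while multiplying the declared field count cannot.

```c
/* Toy example of the arithmetic point above, not the real parser. */
#include <stdbool.h>

/* Each field consumes `step` arguments (e.g. field + value). */
bool numfields_matches_div(long numfields, long remaining_args, long step) {
    return remaining_args / step == numfields;   /* 7 / 2 == 3: a bogus arg slips through */
}

bool numfields_matches_mul(long numfields, long remaining_args, long step) {
    return numfields * step == remaining_args;   /* 3 * 2 != 7: rejected */
}
```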
Command: `./runtest --single unit/bitops --loops 3`

Unstable

```
[ignore]: large memory flag not provided
[-1/1 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[0/1 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[1/1 done]: unit/bitops (4 seconds)
The End

Execution time of different units:
4 seconds - unit/bitops
4 seconds - unit/bitops
4 seconds - unit/bitops
```

After fix

```
[1/3 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[2/3 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[3/3 done]: unit/bitops (4 seconds)
The End

Execution time of different units:
4 seconds - unit/bitops
4 seconds - unit/bitops
4 seconds - unit/bitops
```

Signed-off-by: Sarthak Aggarwal <[email protected]>
pthread functions return the error instead of setting errno. Fixes valkey-io#2525 Signed-off-by: Ted Lyngmo <[email protected]>
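A minimal standalone example of the convention being fixed: pthread functions report failure through their return value, so the return code (not errno) is what should be passed to strerror(). This is a generic illustration, not the Valkey patch itself.

```c
/* Generic illustration: pthread_* returns the error code and does not set errno. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg) { (void)arg; return NULL; }

int main(void) {
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, NULL);
    if (err != 0) {
        /* Wrong: perror("pthread_create") would report a stale errno. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(tid, NULL);
    return 0;
}
```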
…key-io#2595) When we introduced the new hash field expiration functionality, we decided to combine the active expiration job for generic keys and for hash fields. During that job we run a tight loop. In each loop iteration we scan over a maximum of 20 keys (with the default expire effort) and try to expire them. For the hash field expiration job, expiring a hash key means expiring a maximum of 80 fields (with the default expire effort). The problem is that we might do much more work per iteration of the hash field expiration job. The code is shared between the two jobs, and currently we only perform a time check every 16 iterations. As a result, the CPU of field active expiration can spike and consume a much higher CPU% than the current 25% bound allows.

Example:

Before this PR

| Scenario | AVG CPU | Time to expire all fields |
|----------------------------------------------------|---------|---------------------------|
| Expiring 10M volatile fields from a single hash | 20.18% | 26 seconds |
| Expiring 10M volatile fields from 10K hash objects | 32.72% | 7 seconds |

After this PR

| Scenario | AVG CPU | Time to expire all fields |
|----------------------------------------------------|---------|---------------------------|
| Expiring 10M volatile fields from a single hash | 20.23% | 26 seconds |
| Expiring 10M volatile fields from 10K hash objects | 20.76% | 11 seconds |

*NOTE* The change introduced here makes the field job check the time every iteration. We offer a compile-time option to use an efficient time check via TSC (x86) or VCR (ARM) on most modern CPUs, so the impact is expected to be low. Still, to avoid degradation for existing workloads, the code change was made so that it does not impact the existing generic-key active expiration job.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
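A schematic loop showing the behavioural change described above, with hypothetical names and callbacks; it is not the actual activeExpireCycle code. The point is simply that the time budget is checked on every iteration of the field-expiration loop instead of every 16th.

```c
/* Schematic sketch with hypothetical names: check the time budget on every
 * iteration so one heavy iteration cannot blow past the CPU bound. */
#include <stdbool.h>
#include <stdint.h>

bool fields_expire_cycle(uint64_t (*now_us)(void), uint64_t budget_us,
                         bool (*expire_some_fields)(void)) {
    uint64_t start = now_us();
    for (;;) {
        bool more_work = expire_some_fields();          /* up to ~80 fields per call */
        if (!more_work) return false;                   /* nothing left to expire */
        if (now_us() - start >= budget_us) return true; /* time check every iteration */
    }
}
```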
I believe we should evict clients when the client eviction limit is breached, instead of _at_ the breach. I came across this function in the failed [daily test](https://github.com/valkey-io/valkey/actions/runs/17521272806/job/49765359298#step:6:7770), which could possibly be related. Signed-off-by: Sarthak Aggarwal <[email protected]>
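As a tiny illustration of that distinction (with a made-up helper, not the actual eviction code): eviction triggers only once usage has gone past the limit, not when it is exactly equal to it.

```c
/* Made-up helper illustrating "breached" vs. "at the breach". */
#include <stdbool.h>
#include <stddef.h>

bool should_evict_clients(size_t client_mem_usage, size_t limit) {
    return limit != 0 && client_mem_usage > limit;  /* '>' rather than '>=' */
}
```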
…alkey-io#2593) As discussed in valkey-io#2579. Notably, I am exposing this feature as "Atomic Slot Migration" to modules. If we want to call it something else, we should consider that now (e.g. "Replication-Based Slot Migration"?). Also note: I am exposing both the target and source node in the event. It is not guaranteed that either the target or the source would be the node the event fires on (e.g. replicas will fire the event after replica containment is introduced). Even though it could potentially be inferred from CLUSTER SLOTS, it should help modules to have it parsed this way. Modules should be able to infer whether it is occurring on a primary/replica from the `ctx` flags, so not duplicating that here. Closes valkey-io#2579 --------- Signed-off-by: Jacob Murphy <[email protected]>
## Summary

- extend replication wait time in `slave-selection` test

```
*** [err]: Node valkey-io#10 should eventually replicate node valkey-io#5 in tests/unit/cluster/slave-selection.tcl
valkey-io#10 didn't became slave of valkey-io#5
```

## Testing

- `./runtest --single unit/cluster/slave-selection`
- `./runtest --single unit/cluster/slave-selection --valgrind`

Signed-off-by: Vitali Arbuzov <[email protected]>
Signed-off-by: Binbin <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: Harkrishn Patro <[email protected]>
…y-io#2602) Resolves valkey-io#2545

Followed the steps to reproduce the issue, and was able to get a non-zero `total_net_repl_output_bytes`.

```
(base) ~/workspace/valkey git:[fix-bug-2545] src/valkey-cli INFO | grep total_net_repl_output_bytes
total_net_repl_output_bytes:1788
```

---------

Signed-off-by: Sarthak Aggarwal <[email protected]>
…alkey-io#2612) With valkey-io#2604 merged, the `Node valkey-io#10 should eventually replicate node valkey-io#5` test started passing with valgrind, but I guess we are now seeing a new daily failure from the `New Master down consecutively` test that runs shortly after. Signed-off-by: Sarthak Aggarwal <[email protected]>
- Adds io-thread-enabled perf tests for PRs.
- Changes the server and benchmark client CPU ranges so they are on separate NUMA nodes of the metal machine.
- Also kills any servers that are still active on the metal machine if anything fails.
- Adds a benchmark workflow to benchmark versions and publish the results on a provided issue id: <img width="340" height="449" alt="Screenshot 2025-09-11 at 12 14 28 PM" src="https://github.com/user-attachments/assets/04f6a781-e163-4d6b-9b70-deedad15c9ef" />
- Comments on the issue with the full comparison, like this: <img width="936" height="1152" alt="Screenshot 2025-09-11 at 12 15 35 PM" src="https://github.com/user-attachments/assets/e1584c8e-25dc-433f-a4d4-5b08d7548ddf" /> roshkhatri#3 (comment)

---------

Signed-off-by: Roshan Khatri <[email protected]>
Set the free method for the deferred_reply list to properly clean up ClientReplyValue objects when the list is destroyed. Signed-off-by: Uri Yagelnik <[email protected]>
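For context, here is a self-contained sketch of the general pattern involved (this is not Valkey's adlist API; types and names are invented for illustration): a list that owns heap-allocated values only frees them at release time if a free callback has been registered, so forgetting to set one leaks every stored value.

```c
/* Generic sketch of a list with a registered value-free method; illustrative
 * names only, not the Valkey adlist API. */
#include <stdlib.h>

typedef struct node { struct node *next; void *value; } node;
typedef struct list { node *head; void (*free_value)(void *); } list;

void list_set_free_method(list *l, void (*fn)(void *)) { l->free_value = fn; }

void list_release(list *l) {
    node *n = l->head;
    while (n) {
        node *next = n->next;
        if (l->free_value) l->free_value(n->value); /* without this, values leak */
        free(n);
        n = next;
    }
    l->head = NULL;
}
```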
When we added atomic slot migration in valkey-io#1949, we reused a lot of the RDB save code. It was an easier way to implement ASM the first time, but it comes with some side effects. For example, we are using CHILD_TYPE_RDB to do the fork, and we use rdb.c/rdb.h functions to save the snapshot; this messes up the logs (we print logs saying we are doing RDB work) and the info fields (we report rdb_bgsave_in_progress while actually doing slot migration). In addition, it makes the code difficult to maintain. The rdb_save method uses a lot of rdb_* variables, but we are actually doing slot migration. If we want to support one fork with multiple target nodes, we need to rewrite this code for a better cleanup.

Note that the changes to rdb.c/rdb.h revert previous changes from when we were reusing this code for slot migration. The slot migration snapshot logic is similar to the previous diskless replication: we use a pipe to transfer the snapshot data from the child process to the parent process.

Interface changes:
- New slot_migration_fork_in_progress info field.
- New cow_size field in the CLUSTER GETSLOTMIGRATIONS command.
- Also add the slot migration fork to the cluster class latency tracking.

Signed-off-by: Binbin <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
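Below is a bare-bones, standalone sketch of the child-to-parent pipe pattern mentioned above, in the spirit of diskless replication; it is not the slot-migration code itself, and the payload and forwarding step are stand-ins.

```c
/* Standalone sketch: the forked child writes "snapshot" bytes into a pipe and
 * the parent reads and forwards them. Not the actual slot-migration code. */
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

int main(void) {
    int fds[2];
    if (pipe(fds) == -1) return 1;

    pid_t pid = fork();
    if (pid == 0) {                          /* child: produce the "snapshot" */
        close(fds[0]);
        const char *payload = "serialized slot data";
        if (write(fds[1], payload, strlen(payload)) < 0) _exit(1);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                           /* parent: consume and forward */
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);   /* stand-in for sending to the target node */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```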
Reduce the requests and warmup time so the run finishes within 6 hours, as the GitHub workflow times out after 6 hours. --------- Signed-off-by: Roshan Khatri <[email protected]>
The reason is that the replication stream may not yet have reached the replica for execution. We could add a wait_for_condition, but we decided to replace those assert calls with assert_replication_stream to verify the contents of the replication stream rather than the commandstats.

```
*** [err]: Flush slot command propagated to replica in tests/unit/cluster/cluster-flush-slot.tcl
```

Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Binbin <[email protected]>
Co-authored-by: Binbin <[email protected]>
@valkey-io/core-team please review
FYI, we're still waiting for #2329 to be merged and included in this PR.
Release notes look great to me. The "Attention Valkey Module maintainers" looks very good.
…n rules (valkey-io#2629) Following the decision in valkey-io#2189, we need to fix this test because the `extended-redis-compatibility` config option is not going to be deprecated in 9.0. This commit changes the test to postpone the deprecation of `extended-redis-compatibility` until the 10.0 release. Signed-off-by: Ricardo Dias <[email protected]>
Codecov Report ❌ Patch coverage is
Additional details and impacted files

```
@@ Coverage Diff @@
## 9.0 #2628 +/- ##
======================================
Coverage ? 72.19%
======================================
Files ? 127
Lines ? 70936
Branches ? 0
======================================
Hits ? 51215
Misses ? 19721
Partials ? 0
```
Pick this one to fix the valgrind errors: |
We probably should close the correct `slot_migration_pipe_read`. It should resolve the valgrind errors. Signed-off-by: Sarthak Aggarwal <[email protected]>
Functionally all looks ok, just some misc wording suggestions.
Signed-off-by: Ricardo Dias <[email protected]>
This PR includes all commits from the unstable branch since the RC1 cut, plus a commit updating version.h and the release notes.