
Conversation

rjd15372
Member

This PR includes all commits from the unstable branch since the RC1 cut, plus a commit updating version.h and the release notes.

ruihong123 and others added 30 commits September 18, 2025 14:54
…y for RDMA benchmark. (valkey-io#2430)

(1) The old logic could result in the RDMA event being acknowledged
unexpectedly in the following two scenarios (a sketch of the corrected
pattern follows the list).
* ibv_get_cq_event returns an EAGAIN error.
* ibv_get_cq_event returns one event, but the event may be acknowledged
multiple times in the poll-CQ loop.
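
A minimal sketch of the corrected pattern (hypothetical handler, not the
actual patch): acknowledge exactly one event per successful
ibv_get_cq_event() call, and never acknowledge from inside the poll-CQ
loop.

```c
#include <errno.h>
#include <infiniband/verbs.h>

/* Sketch: ack exactly one event per successful ibv_get_cq_event(),
 * outside the poll-CQ loop. */
static int rdmaHandleCompletionEvent(struct ibv_comp_channel *channel) {
    struct ibv_cq *cq;
    void *cq_context;

    if (ibv_get_cq_event(channel, &cq, &cq_context) < 0) {
        if (errno == EAGAIN) return 0; /* no event arrived: nothing to ack */
        return -1;
    }
    ibv_ack_cq_events(cq, 1); /* ack the single event we actually got */
    ibv_req_notify_cq(cq, 0); /* re-arm completion notifications */

    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) > 0) {
        /* process the work completion; no ibv_ack_cq_events() here */
    }
    return 0;
}
```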

(2) In benchmarks of Valkey over RDMA, the tail latency was as high as
177 milliseconds (almost 80x that of TCP). This resulted from an
incorrect benchmark client setup that included the connection setup time
in the benchmark latency recording. This patch fixes the tail latency
issue by modifying valkey-benchmark.c. The change only affects
benchmarks over RDMA, as the updates are guarded by the USE_RDMA macro.

There are further updates coming for Valkey over RDMA, but I am willing
to create separate pull requests for them.

---------

Signed-off-by: Ruihong Wang <[email protected]>
…alkey-io#2475)

Currently HSETEX always generates an `hset` notification. To align with
the generic `set` command, it should only generate `hset` if the
provided time-to-live is a valid future time.
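
A minimal sketch of the intended behavior (hypothetical helper names,
using the server's notifyKeyspaceEvent()):

```c
#include "server.h"

/* Sketch: fire "hset" only when the field is set with a live TTL; an
 * already-elapsed TTL deletes the field instead, mirroring SET. */
static void hsetexMaybeNotify(client *c, robj *key, long long expire_at_ms) {
    if (expire_at_ms > mstime()) {
        notifyKeyspaceEvent(NOTIFY_HASH, "hset", key, c->db->id);
    }
    /* else: the field is removed immediately and no "hset" is emitted */
}
```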

---------

Signed-off-by: Ran Shidlansik <[email protected]>
In valkey-io#2431 we changed the assert to an if condition, and the test
caused some trouble. Now we just remove the assert (if condition) and
disable the test for now due to valkey-io#2441.

Signed-off-by: Binbin <[email protected]>
…ns into a single log entry (valkey-io#2481)

Previously, each slot migration was logged individually, which could
lead to log spam in scenarios where many slots are migrated at once.

This commit enhances the logging mechanism to group consecutive slot
migrations into a single log entry, improving log readability and
reducing noise.

Log snippets

```
1661951:S 13 Aug 2025 15:47:10.132 * Slot range [16383, 16383] is migrated from node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609 to node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646.
1661951:S 13 Aug 2025 15:47:10.289 * Slot range [10924, 16383] is migrated from node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646 to node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609.
1661951:S 13 Aug 2025 15:47:10.524 * Slot range [10924, 16383] is migrated from node c3926da75f7c3a0a1bcd07e088b0bde09d48024c () in shard 7746b693330c0814178b90b757e2711ebb8c6609 to node 2465c29c8afb9231525e281e5825684d0bb79f7b () in shard 39342c039d2a6c7ef0ff96314b230dfd7737d646.
```
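
A minimal sketch of the grouping idea (hypothetical names, not the
actual patch): walk the migrated slots in order and emit one log line
per consecutive run.

```c
#include <stdio.h>

/* Sketch: collapse a sorted slot array into [start, end] ranges,
 * logging once per range instead of once per slot. */
static void logMigratedSlotRanges(const int *slots, int count) {
    int i = 0;
    while (i < count) {
        int start = slots[i], end = start;
        while (i + 1 < count && slots[i + 1] == end + 1) end = slots[++i];
        printf("Slot range [%d, %d] is migrated\n", start, end);
        i++;
    }
}
```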

---------

Signed-off-by: Ping Xie <[email protected]>
Make sure slot migration has finished before moving on to the next test.

Resolves valkey-io#2479

Signed-off-by: Sarthak Aggarwal <[email protected]>
…io#2273)

Automatically attach the respective label to newly filed issues.
---------

Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Sarthak Aggarwal <[email protected]>
Co-authored-by: Harkrishn Patro <[email protected]>
)

The change ensures that the slot is present on the node before the slot
is populated. This avoids errors while populating the slot.

Resolves valkey-io#2480

---------

Signed-off-by: Sarthak Aggarwal <[email protected]>
Similar to dicts, we disallow resizing while the hashtable is
rehashing. In the previous code, if a resize was triggered during
rehashing, for example if the rehashing wasn't fast enough, we would
spin in a while loop until the rehashing was complete, which could be a
potential issue when resizing.
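
A minimal sketch of the guard (hypothetical helper names modeled on the
hashtable API):

```c
/* Sketch: refuse to resize mid-rehash instead of spinning until the
 * rehash completes. */
static int hashtableTryResize(hashtable *ht, size_t new_size) {
    if (hashtableIsRehashing(ht)) return 0; /* disallow resize for now */
    return hashtableResizeTo(ht, new_size); /* hypothetical resize helper */
}
```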

---------

Signed-off-by: Binbin <[email protected]>
* Use pipelines of length 1000 instead of up to 200000.
* Use CLIENT REPLY OFF instead of reading and discarding the replies.

Fixes valkey-io#2205

Signed-off-by: Viktor Söderqvist <[email protected]>
https://github.com/valkey-io/valkey/blob/aefed3d363d1c7d7cb391d3f605484c78c9a88f2/src/networking.c#L2279-L2293
From the above code, we can see that `c->repl_data->ref_block_pos` could
be equal to `o->used`.
When `o->used == o->size`, we may call SSL_write() with num=0, which
does not comply with the OpenSSL specification.
(ref: https://docs.openssl.org/master/man3/SSL_write/#warnings)

What's worse, this is still the case after reconnection. See
https://github.com/valkey-io/valkey/blob/aefed3d363d1c7d7cb391d3f605484c78c9a88f2/src/replication.c#L756-L769.
So in this case the replica will keep reconnecting again and again until
it no longer meets the requirements for partial synchronization.
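
A minimal sketch of the fix around the write path quoted above
(hypothetical surrounding code): never hand OpenSSL a zero-length
write.

```c
/* Sketch: skip the write when the current block has no unwritten
 * bytes, so SSL_write() is never reached with num=0 (undefined per
 * the OpenSSL warnings linked above). */
size_t remaining = o->used - c->repl_data->ref_block_pos;
if (remaining > 0) {
    ssize_t nwritten = connWrite(conn, o->buf + c->repl_data->ref_block_pos,
                                 remaining);
    /* ... handle nwritten and errors as before ... */
}
```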

Resolves valkey-io#2119

---------

Signed-off-by: yzc-yzc <[email protected]>
…rMessage (valkey-io#2484)

Fix valkey-io#2438

Modified the `DingReceiver` function in `tests/modules/cluster.c` by
adding null-termination logic for cross-version compatibility.
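
A minimal sketch of the added logic, assuming the module cluster
message receiver callback signature; the payload is copied into a
bounded buffer and NUL-terminated before being used as a C string.

```c
#include <string.h>
#include "valkeymodule.h"

/* Sketch: the payload is not guaranteed to be NUL-terminated across
 * versions, so terminate a bounded copy before logging it. */
void DingReceiver(ValkeyModuleCtx *ctx, const char *sender_id, uint8_t type,
                  const unsigned char *payload, uint32_t len) {
    char buf[256];
    uint32_t n = len < sizeof(buf) - 1 ? len : sizeof(buf) - 1;
    memcpy(buf, payload, n);
    buf[n] = '\0'; /* explicit termination for cross-version safety */
    ValkeyModule_Log(ctx, "notice", "DING (type %d) from %.40s: '%s'",
                     (int)type, sender_id, buf);
}
```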

---------

Signed-off-by: Hanxi Zhang <[email protected]>
Simplifies a test case seen to be flaky.

fixes: valkey-io#2482

Signed-off-by: Ran Shidlansik <[email protected]>
We recently introduced a template for creating `test failures` issues.
This change makes the template visible in the `CONTRIBUTING.md` file.
Also added a tip to paste the stack trace, since the output behind CI
links can expire.

---------

Signed-off-by: Sarthak Aggarwal <[email protected]>
…2502)

There are memory leaks when we return NULL.

Signed-off-by: Binbin <[email protected]>
Previously, the config names and values were stored in a temporary dict.
This ensures that no duplicates are returned, but it also makes the
order random.

In this PR, the config names and values are still stored in the
temporary dict, but they are then copied to an array, which is sorted
before the reply is sent.
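
A minimal sketch of the sort step (hypothetical pair type; the dict
already removed duplicates):

```c
#include <stdlib.h>
#include <strings.h>

typedef struct { char *name; char *value; } configPair; /* hypothetical */

static int configPairCompare(const void *a, const void *b) {
    return strcasecmp(((const configPair *)a)->name,
                      ((const configPair *)b)->name);
}

static void sortConfigGetReply(configPair *pairs, size_t npairs) {
    qsort(pairs, npairs, sizeof(configPair), configPairCompare);
    /* reply is then emitted from the sorted array */
}
```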

Resolves valkey-io#2042

---------

Signed-off-by: yzc-yzc <[email protected]>
We now pass rdbSnapshotOptions into this function, and options.conns is
now malloc'ed on the caller side, so we need to zfree it when returning
early due to an error. Previously, conns was malloc'ed after the error
handling, so we didn't have this problem.

Introduced in valkey-io#1949.

---------

Signed-off-by: Binbin <[email protected]>
In clusters with a very short node timeout such as 2-3 seconds, the
extra failover delay of 500-1000 milliseconds (500 + a random value of
0-500; 750 on average) before initiating a failover adds significant
extra downtime to the cluster. This PR makes the delay relative to the
node timeout, using a shorter failover delay for a smaller configured
node timeout. The formula is `fixed_delay = min(500, node_timeout / 30)`.

| Node timeout  | Fixed failover delay |
|---------------|----------------------|
| 15000 or more | 500 (same as before) |
| 7500          | 250                  |
| 3000          | 100                  |
| 1500          | 50                   |
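
A minimal sketch of the delay computation (hypothetical names; the
random and new-replica components are described above and below):

```c
#include <stdlib.h>

/* Sketch: the fixed part scales with node timeout, capped at 500 ms; a
 * random 0-500 ms component spreads out concurrent candidates; new
 * replicas (offset 0) wait an extra 500 ms to let gossip settle. */
static long long failoverDelayMs(long long node_timeout_ms,
                                 long long repl_offset) {
    long long fixed = node_timeout_ms / 30;
    if (fixed > 500) fixed = 500;
    long long delay = fixed + rand() % 501;
    if (repl_offset == 0) delay += 500;
    return delay;
}
```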

Additional change: Add an extra 500ms delay for new replicas that may
not yet know about the other replicas. This avoids the scenario where a
new replica with no data wins the failover. This change turned out to be
needed for the stability of some test cases.

The purposes of the failover delay are

1. Allow FAIL to propagate to the voting primaries in the cluster
2. Allow replicas to exchange their offsets, so they will have a correct
view of their own rank.

A third (undocumented) purpose of this delay is to allow newly added
replicas to discover other replicas in the cluster via gossip and to
compute their rank, so each can realize it is not the best replica. This
case is mitigated by adding another 500ms delay for new replicas, i.e.
those with replication offset 0.

A low node timeout only makes sense in fast networks, so we can assume
that the above needs less time than in a cluster with a higher node
timeout.

These delays don't affect the correctness of the algorithm. They are
just there to increase the probability that a failover will succeed, by
making sure that the FAIL message has enough time to propagate in the
cluster; the random part reduces the probability that two replicas
initiate a failover at the same time.

The typical use case is when data consistency matters and writes can't
be skipped. For example, in some applications, writes are buffered in
the application during node failures so they can be applied when the
failover is completed. The application can't buffer them for very long,
so we need the cluster to be up again within e.g. 5 seconds from the
time a node starts to fail.

I hope this PR can be considered safer than valkey-io#2227, although the two
changes are orthogonal.

Part of issue valkey-io#2023.

---------

Signed-off-by: Viktor Söderqvist <[email protected]>
Several fixes:
1. Fix not using a bool input type for hashTypeIgnoreTTL - probably lost
during the 3 HFE PR merges.
2. Remove the vset change of hashtable encoding to single-entry - the
current code is a bug. The entry is probably expired (or about to expire
soon), so we can leave it as a hashtable until it does.
3. Enable incremental rehashing for the volatile item keys kvstore -
this is the central issue of this PR. Without it, active expiry might
not scan the kvstore, which is very fragmented with lots of empty
buckets.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
valkey-io#2466)

If we want kvstoreHashtableExpand to expand the hashtable, we need to
make sure the hashtable exists. Currently, when processing RDB
slot-info, our expand has no effect because the hashtable does not exist
(we initialize it only when we need it).

We also update kvstoreExpand to use the kvstoreHashtableExpand to make
sure there is only one code path. Also see valkey-io#1199 for more details.

Signed-off-by: Binbin <[email protected]>
Fixes valkey-io#2372

## Description

This PR adds an automated workflow that assigns PR authors to their own
pull requests when opened or reopened.

## Changes

- Added `.github/workflows/auto-author-assign.yml` workflow
- Uses `toshimaru/[email protected]` action
- Triggers on `pull_request_target` events (opened, reopened)
- Requires `pull-requests: write` permission

## Benefits

- Improves PR tracking and organization
- Automatically assigns responsibility to PR authors
- Reduces manual assignment overhead for maintainers
- Helps contributors track their own PRs more easily

## Testing

✅ **Tested on my fork before submission:**

1. Merged the workflow into my fork's unstable branch
2. Created a test PR within my fork
3. Verified that I was automatically assigned as the assignee
4. Screenshot showing successful auto-assignment:
<img width="1278" height="684" alt="Screenshot 2025-08-01 at 3 39 05 PM"
src="https://github.com/user-attachments/assets/9ad643be-5eac-4ad6-bec7-184cf22e9cbd"
/>

The workflow executed successfully and assigned me to the test PR as
expected.

---------

Signed-off-by: Hanxi Zhang <[email protected]>
This change avoids additional failure report creation if the node is
already marked as failed. The failure report count has never been used
after a node has been marked as failed, so there is no value in
maintaining it further. This reduces both the add and delete
failure-report operations, hence the performance benefit.

With this change, we observe an average 10% reduction in p99 CPU
utilization in a 2000-node cluster (1000 primaries / 1000 replicas) with
330 nodes in a failed state.
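
A minimal sketch of the short-circuit (hypothetical wrapper around the
existing failure-report helper):

```c
/* Sketch: skip failure-report bookkeeping once the node is already
 * flagged FAIL; the report count is never read after that point. */
static void addFailureReportIfNeeded(clusterNode *failing,
                                     clusterNode *sender) {
    if (failing->flags & CLUSTER_NODE_FAIL) return; /* already failed */
    clusterNodeAddFailureReport(failing, sender);
}
```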

---------

Signed-off-by: Seungmin Lee <[email protected]>
…de as primary (valkey-io#2510)

Fixes: valkey-io#2460

With this change we avoid divergence between the cluster and replication
layer views. I've observed that a node can be marked as primary in the
cluster while it is marked as a replica in the replication layer view
and has an active replication link. Without this change, we used to end
up with an invalid replica chain in the replication layer.

---------

Signed-off-by: Harkrishn Patro <[email protected]>
Currently HSETEX verifies that the number of fields matches the provided
field count by using division. Instead, it can compare against the
multiplication, in order to prevent the verified value from being
rounded down.
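
A minimal sketch of the check (hypothetical first_field_pos/numfields
names): with 5 trailing arguments and a declared count of 2, the
division 5/2 == 2 would pass, while the multiplication correctly fails.

```c
/* Sketch: compare by multiplication so a stray trailing argument
 * cannot be rounded away by integer division. */
if (first_field_pos + numfields * 2 != c->argc) {
    addReplyError(c, "wrong number of arguments");
    return;
}
```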

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Command: `./runtest --single unit/bitops --loops 3`

Unstable

```
[ignore]: large memory flag not provided
[-1/1 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[0/1 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[1/1 done]: unit/bitops (4 seconds)

                   The End

Execution time of different units:
  4 seconds - unit/bitops
  4 seconds - unit/bitops
  4 seconds - unit/bitops
```

After fix

```
[1/3 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[2/3 done]: unit/bitops (4 seconds)
[ignore]: large memory flag not provided
[3/3 done]: unit/bitops (4 seconds)

                   The End

Execution time of different units:
  4 seconds - unit/bitops
  4 seconds - unit/bitops
  4 seconds - unit/bitops
```

Signed-off-by: Sarthak Aggarwal <[email protected]>
pthread functions return the error instead of setting errno.
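
A minimal sketch of the corrected pattern:

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Sketch: pthread_create() returns the error code directly; reading
 * errno after a failure would inspect an unrelated value. */
static void startThread(pthread_t *tid, void *(*fn)(void *), void *arg) {
    int err = pthread_create(tid, NULL, fn, arg);
    if (err != 0)
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
}
```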

Fixes valkey-io#2525

Signed-off-by: Ted Lyngmo <[email protected]>
ranshid and others added 11 commits September 18, 2025 14:54
…key-io#2595)

When we introduced the new hash field expiration functionality, we
decided to combine the active expiration job for generic keys and hash
fields.
During that job we run a tight loop. In each iteration we scan a
maximum of 20 keys (with the default expire effort) and try to "expire"
them.
For the hash field expiration job, "expiring" a hash key means expiring
a maximum of 80 fields (with the default expire effort).
The problem is that we might do much more work per iteration of the
hash field expiration job.
The current code is shared between the 2 jobs, and we currently only
perform a time check every 16 iterations.
As a result, the CPU usage of field active expiration can spike and
consume a much higher CPU% than the current 25% bound allows.

Example:

Before this PR

| Scenario                                           | AVG CPU | Time to expire all fields |
|----------------------------------------------------|---------|---------------------------|
| Expiring 10M volatile fields from a single hash    | 20.18%  | 26 seconds                |
| Expiring 10M volatile fields from 10K hash objects | 32.72%  | 7 seconds                 |

After this PR

| Scenario                                           | AVG CPU | Time to expire all fields |
|----------------------------------------------------|---------|---------------------------|
| Expiring 10M volatile fields from a single hash    | 20.23%  | 26 seconds                |
| Expiring 10M volatile fields from 10K hash objects | 20.76%  | 11 seconds                |

*NOTE*
The change introduced here makes the field job check the time every
iteration. We offer a compile-time option for an efficient time check
using TSC (x86) or VCR (ARM) on most modern CPUs, so the impact is
expected to be low. Still, in order to avoid degradation for existing
workloads, the code change was made so that it does not impact the
existing generic keys active expiration job.
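
A minimal sketch of the field-job loop (hypothetical helpers; the
generic key job keeps its every-16-iterations check):

```c
#include "monotonic.h" /* getMonotonicUs(), TSC/ARM-counter backed */

/* Sketch: the field-expiration job now checks the clock on every
 * iteration, keeping it under the CPU time budget. */
static void activeExpireHashFieldsCycle(monotime start_us,
                                        monotime timelimit_us) {
    while (moreFieldsToExpire()) {   /* hypothetical */
        expireSomeHashFields();      /* up to 80 fields per hash by default */
        if (getMonotonicUs() - start_us > timelimit_us) break;
    }
}
```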

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
I believe we should evict clients when the client eviction limit is
breached rather than _at_ the breach. I came across this function in a
failed [daily
test](https://github.com/valkey-io/valkey/actions/runs/17521272806/job/49765359298#step:6:7770),
which could possibly be related.

Signed-off-by: Sarthak Aggarwal <[email protected]>
…alkey-io#2593)

As discussed in valkey-io#2579

Notably, I am exposing this feature as "Atomic Slot Migration" to
modules. If we want to call it something else, we should consider that
now (e.g. "Replication-Based Slot Migration"?).

Also note: I am exposing both target and source node in the event. It is
not guaranteed that either target or source would be the node the event
fires on (e.g. replicas will fire the event after replica containment is
introduced). Even though it could be potentially inferred from CLUSTER
SLOTS, it should help modules parse it this way. Modules should be able
to infer whether it is occurring on primary/replica from `ctx` flags, so
not duplicating that here.

Closes valkey-io#2579

---------

Signed-off-by: Jacob Murphy <[email protected]>
## Summary
- extend replication wait time in `slave-selection` test

```
*** [err]: Node valkey-io#10 should eventually replicate node valkey-io#5 in tests/unit/cluster/slave-selection.tcl
valkey-io#10 didn't became slave of valkey-io#5
```

## Testing
- `./runtest --single unit/cluster/slave-selection`
- `./runtest --single unit/cluster/slave-selection --valgrind`

Signed-off-by: Vitali Arbuzov <[email protected]>
Signed-off-by: Binbin <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: Harkrishn Patro <[email protected]>
…y-io#2602)

Resolves valkey-io#2545 

Followed the steps to reproduce the issue, and was able to get non-zero
`total_net_repl_output_bytes`.

```
(base) ~/workspace/valkey git:[fix-bug-2545]
src/valkey-cli INFO | grep total_net_repl_output_bytes
total_net_repl_output_bytes:1788
```

---------

Signed-off-by: Sarthak Aggarwal <[email protected]>
…alkey-io#2612)

With valkey-io#2604 merged, the `Node valkey-io#10 should eventually replicate node valkey-io#5`
started passing successfully with valgrind, but I guess we are seeing a
new daily failure from a `New Master down consecutively` test that runs
shortly after.

Signed-off-by: Sarthak Aggarwal <[email protected]>
- Adds io-thread enabled perf-tests for PRs
- Changes the server and benchmark client CPU ranges so they are on
separate NUMA nodes of the metal machine.
- Also kills any servers that are active on the metal machine if
anything fails.
- Adds a benchmark workflow to benchmark versions and publish the
results on a provided issue ID:
<img width="340" height="449" alt="Screenshot 2025-09-11 at 12 14 28 PM"
src="https://github.com/user-attachments/assets/04f6a781-e163-4d6b-9b70-deedad15c9ef"
/>

- Comments on the issue with the full comparison like this:
 
<img width="936" height="1152" alt="Screenshot 2025-09-11 at 12 15
35 PM"
src="https://github.com/user-attachments/assets/e1584c8e-25dc-433f-a4d4-5b08d7548ddf"
/>

roshkhatri#3 (comment)

---------

Signed-off-by: Roshan Khatri <[email protected]>
Set a free method for the deferred_reply list to properly clean up
ClientReplyValue objects when the list is destroyed.
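
A minimal sketch of the change (hypothetical destructor name):

```c
#include "adlist.h"

/* Sketch: attach a free method so listRelease() also frees each
 * stored ClientReplyValue. */
static list *createDeferredReplyList(void) {
    list *l = listCreate();
    listSetFreeMethod(l, freeClientReplyValue); /* hypothetical callback */
    return l;
}
```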

Signed-off-by: Uri Yagelnik <[email protected]>
When we added atomic slot migration in valkey-io#1949, we reused a lot of the
RDB save code. That was the easier way to implement ASM at first, but it
came with side effects. For example, we use CHILD_TYPE_RDB to do the fork
and rdb.c/rdb.h functions to save the snapshot; these mess up the logs
(we print logs saying we are doing RDB work) and mess up the info fields
(we report rdb_bgsave_in_progress when we are actually doing slot
migration).

In addition, it makes the code difficult to maintain. The rdb_save
method uses a lot of rdb_* variables, but we are actually doing slot
migration. If we want to support one fork with multiple target nodes, we
need to rewrite this code for a cleaner design.

Note that the changes to rdb.c/rdb.h revert previous changes from when
we were reusing this code for slot migration. The slot migration
snapshot logic is similar to the previous diskless replication: we use a
pipe to transfer the snapshot data from the child process to the parent
process.
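
A minimal sketch of the pipe-based transfer (hypothetical wrapper, not
the actual patch):

```c
#include <unistd.h>

/* Sketch: the forked child serializes the slot snapshot into a pipe;
 * the parent reads the other end and relays it to the target node. */
static int forkSlotMigrationSnapshot(void) {
    int pipefds[2];
    if (pipe(pipefds) == -1) return -1;
    pid_t pid = fork();
    if (pid == 0) {            /* child: write snapshot into the pipe */
        close(pipefds[0]);
        /* ... serialize slot data to pipefds[1] ... */
        close(pipefds[1]);
        _exit(0);
    }
    close(pipefds[1]);         /* parent: read pipefds[0], forward it */
    return pipefds[0];
}
```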

Interface changes:
- New slot_migration_fork_in_progress info field.
- New cow_size field in CLUSTER GETSLOTMIGRATIONS command.
- Also add slot migration fork to the cluster class trace latency.

Signed-off-by: Binbin <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
Reduce the request count and warmup time to finish within 6 hours, as
the GitHub workflow times out after 6 hours.

---------

Signed-off-by: Roshan Khatri <[email protected]>
The reason is that the replication stream may not yet have reached the
replica for execution. We could add a wait_for_condition, but we decided
to replace those assert calls with assert_replication_stream to verify
the contents of the replication stream rather than the commandstats.
```
*** [err]: Flush slot command propagated to replica in tests/unit/cluster/cluster-flush-slot.tcl
```

Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Binbin <[email protected]>
Co-authored-by: Binbin <[email protected]>
@rjd15372
Member Author

@valkey-io/core-team please review

@rjd15372
Member Author

FYI, we're still waiting for #2329 to be merged and included in this PR.

@zuiderkwast
Contributor

zuiderkwast commented Sep 19, 2025

Regarding the failing test case about valkey vs redis: This test case was added in #306 and designed to fail if the compatibility feature isn't removed in 9.0, to make sure we don't forget what we decided in 8.0. #2192 (to close #2189) should have removed this, but apparently it was missed.

Contributor

@zuiderkwast zuiderkwast left a comment


Release notes look great to me. The "Attention Valkey Module maintainers" section looks very good.

…n rules (valkey-io#2629)

Following the decision in valkey-io#2189, we need to fix this test because the
`extended-redis-compatibility` config option is not going to be
deprecated in 9.0.

This commit changes the test to postpone the deprecation of
`extended-redis-compatibility` until the 10.0 release.

Signed-off-by: Ricardo Dias <[email protected]>

codecov bot commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 78.33895% with 193 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (9.0@70a7ce0). Learn more about missing BASE report.

| Files with missing lines   | Patch % | Lines         |
|----------------------------|---------|---------------|
| src/replication.c          | 44.26%  | 34 Missing ⚠️ |
| src/module.c               | 34.14%  | 27 Missing ⚠️ |
| src/cluster_migrateslots.c | 85.09%  | 24 Missing ⚠️ |
| src/rdb.c                  | 74.15%  | 23 Missing ⚠️ |
| src/memory_prefetch.c      | 0.00%   | 18 Missing ⚠️ |
| src/cluster_legacy.c       | 83.65%  | 17 Missing ⚠️ |
| src/bio.c                  | 82.22%  | 8 Missing ⚠️  |
| src/networking.c           | 94.96%  | 8 Missing ⚠️  |
| src/scripting_engine.c     | 70.37%  | 8 Missing ⚠️  |
| src/server.c               | 83.78%  | 6 Missing ⚠️  |

... and 8 more
Additional details and impacted files
```
@@          Coverage Diff           @@
##             9.0    #2628   +/-   ##
======================================
  Coverage       ?   72.19%
======================================
  Files          ?      127
  Lines          ?    70936
  Branches       ?        0
======================================
  Hits           ?    51215
  Misses         ?    19721
  Partials       ?        0
```
| Files with missing lines | Coverage Δ           |
|--------------------------|----------------------|
| src/acl.c                | 90.61% <100.00%> (ø) |
| src/aof.c                | 81.17% <100.00%> (ø) |
| src/bitops.c             | 90.04% <100.00%> (ø) |
| src/childinfo.c          | 97.43% <100.00%> (ø) |
| src/commandlog.c         | 96.66% <100.00%> (ø) |
| src/commands.def         | 100.00% <ø> (ø)      |
| src/config.c             | 78.66% <100.00%> (ø) |
| src/db.c                 | 93.25% <100.00%> (ø) |
| src/dict.c               | 79.46% <ø> (ø)       |
| src/hyperloglog.c        | 92.30% <100.00%> (ø) |

... and 27 more

@zuiderkwast
Contributor

Pick this one to fix the valgrind errors:

We probably should close the correct `slot_migration_pipe_read`. It
should resolve the valgrind errors.

Signed-off-by: Sarthak Aggarwal <[email protected]>
@rjd15372 rjd15372 force-pushed the 9.0-rc2 branch 2 times, most recently from f11bede to 1d7dfa9 Compare September 22, 2025 10:09
@madolson madolson requested review from a team and madolson September 22, 2025 14:03
Member

@madolson madolson left a comment


Functionally all looks ok, just some misc wording suggestions.

Signed-off-by: Ricardo Dias <[email protected]>
@rjd15372 rjd15372 merged commit ba9a104 into valkey-io:9.0 Sep 23, 2025
65 of 67 checks passed
@enjoy-binbin enjoy-binbin mentioned this pull request Sep 25, 2025