Save cluster file in bio to avoid the stuck io latency #2555
base: unstable
Conversation
When the cluster changes, we need to persist the cluster configuration. If I/O is delayed or blocked, possibly by disk contention, this can cause large latencies on the main thread. We should avoid synchronous I/O from the main thread, so in this commit we use bio to save the config file: we add a bio job and hand it an sds copy of the config file, and the bio thread performs the synchronous save, so an eventually consistent version of the config is still stored on disk. This may break our previous assumption that nodes.conf is always in sync and strongly consistent. For shutdown and CLUSTER SAVECONFIG, we wait for the bio jobs to drain and then trigger a new save synchronously.

Closes valkey-io#2424.

Signed-off-by: Binbin <[email protected]>
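For illustration, here is a minimal standalone sketch of the pattern described above. This is not the PR's code: the file names, the config_save_job struct, and the worker function are invented for the example; in the PR itself the snapshot is an sds and the worker is a bio thread. The idea is that the main thread snapshots the config into a heap buffer and hands it to a background thread, so the blocking write()/fsync() happens off the main thread.

```c
/* Standalone sketch (not Valkey code): snapshot the config on the main
 * thread, write it to disk on a background thread. Compile with -lpthread. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef struct {
    char *content; /* heap snapshot; ownership moves to the worker */
    size_t len;
    int do_fsync;
} config_save_job;

static void *config_save_worker(void *arg) {
    config_save_job *job = arg;
    int fd = open("nodes.conf.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd != -1) {
        size_t off = 0;
        while (off < job->len) {
            ssize_t n = write(fd, job->content + off, job->len - off);
            if (n <= 0) break; /* real code would retry on EINTR and log errors */
            off += (size_t)n;
        }
        if (job->do_fsync) fsync(fd);
        close(fd);
        rename("nodes.conf.tmp", "nodes.conf"); /* one way to avoid torn reads */
    }
    free(job->content); /* single free site: the worker owns the snapshot */
    free(job);
    return NULL;
}

int main(void) {
    const char *current_config = "node-id 127.0.0.1:6379 myself,master\n";
    config_save_job *job = malloc(sizeof(*job));
    job->content = strdup(current_config); /* point-in-time snapshot */
    job->len = strlen(current_config);
    job->do_fsync = 1;

    pthread_t tid;
    pthread_create(&tid, NULL, config_save_worker, job);
    /* ...the main thread keeps serving commands instead of blocking on disk... */
    pthread_join(tid, NULL); /* stands in for "drain bio jobs" on shutdown */
    return 0;
}
```

The temp-file-plus-rename step here is just one way to avoid readers observing a partially written file; the save path in the actual code may differ.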
Codecov Report ❌ Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2555      +/-   ##
============================================
+ Coverage     72.19%   72.21%   +0.02%
============================================
  Files           126      126
  Lines         70615    70653      +38
============================================
+ Hits          50977    51020      +43
+ Misses        19638    19633       -5
/* Save the config, possibly using fsync. */
if (flags & CLUSTER_TODO_SAVE_CONFIG) {
    int fsync = flags & CLUSTER_TODO_FSYNC_CONFIG;
    clusterSaveConfigOrDie(fsync);
This may require some adjustments. Previously we saved synchronously and exited if the save failed. Now that we switch to bio, do we still need to retain the exit logic?
Also see #1032
Left a comment on #1032
High level thought:
We should make clusterSaveConfig async-only; otherwise we have no protection mechanism to stop a stale bio job from overwriting a cluster config file freshly saved from the main thread.
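One way to get that protection, sketched as a standalone example (the counter and function names are made up, nothing here is from the PR): stamp every save request with a monotonically increasing sequence number on the main thread, and have the bio worker drop any job whose stamp is older than the newest one already persisted.

```c
/* Standalone sketch of a sequence-number guard against stale saves. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint64_t save_seq_issued;    /* bumped by the main thread per enqueue */
static _Atomic uint64_t save_seq_persisted; /* last sequence actually written to disk */

/* Main thread: stamp the job before handing it to bio. */
static uint64_t stamp_save_job(void) {
    return atomic_fetch_add(&save_seq_issued, 1) + 1;
}

/* Bio thread: skip jobs that are older than what is already on disk. */
static int should_write(uint64_t job_seq) {
    return job_seq > atomic_load(&save_seq_persisted);
}

static void mark_persisted(uint64_t job_seq) {
    uint64_t seen = atomic_load(&save_seq_persisted);
    while (job_seq > seen &&
           !atomic_compare_exchange_weak(&save_seq_persisted, &seen, job_seq));
}

int main(void) {
    uint64_t s1 = stamp_save_job();
    uint64_t s2 = stamp_save_job();
    /* Suppose the newer job happens to be written first... */
    if (should_write(s2)) mark_persisted(s2);
    /* ...then the older job arrives and is dropped instead of clobbering it. */
    printf("write job %llu? %d\n", (unsigned long long)s1, should_write(s1)); /* 0 */
    return 0;
}
```

If the synchronous path (shutdown, CLUSTER SAVECONFIG) also stamps its save, the same guard would keep a lingering bio job from overwriting it.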
}

/* Save the cluster file, if save fails, the process will exit. */
void clusterSaveConfigOrDie(int do_fsync) {
Just realized clusterSaveConfigOrDie is always called with fsync; maybe we can get rid of the method param altogether.
Yes, we can remove it once we make a decision on this one: #2555 (comment)
Co-authored-by: Harkrishn Patro <[email protected]>
Signed-off-by: Binbin <[email protected]>
    bioSubmitJob(BIO_RDB_SAVE, job);
}

void bioCreateClusterConfigSaveJob(sds content, int do_fsync) {
Who owns/frees content? bioCreateClusterConfigSaveJob(sds content, int do_fsync) enqueues an sds into job->cluster_save_args.content, and then the BIO worker calls clusterSaveConfigFromBio(...). Is there a clear free path for content afterwards? Could you please document the ownership and ensure a single, deterministic free site, to avoid leaks or double frees?
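For concreteness, one possible contract, sketched against the worker branch in this diff (not a drop-in patch; the serverLog call and message are illustrative): the caller transfers ownership of content when it enqueues the job, and the worker is the single free site on both the success and the error path.

```c
} else if (job_type == BIO_CLUSTER_SAVE) {
    /* Ownership: `content` was handed off by bioCreateClusterConfigSaveJob();
     * the caller must not touch it after enqueueing. */
    if (clusterSaveConfigFromBio(job->cluster_save_args.content,
                                 job->cluster_save_args.do_fsync) == C_ERR) {
        serverLog(LL_WARNING, "Background save of the cluster config failed");
    }
    sdsfree(job->cluster_save_args.content); /* single, deterministic free site */
}
```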
    latencyStartMonitor(latency);
    while (offset < content_size) {
-       written_bytes = write(fd, ci + offset, content_size - offset);
+       written_bytes = write(fd, content + offset, content_size - offset);
If multiple saves are enqueued, can older content overwrite newer content and amplify I/O? Can we coalesce jobs so only the latest content is written (dropping older items)? Do we need some sort of versioning/epoch so that older jobs are dropped?
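A standalone sketch of the "coalesce to latest" idea (all names invented for the example): the main thread just replaces a single pending snapshot under a lock, so the worker only ever writes the newest content and older snapshots are freed without touching the disk.

```c
/* Standalone sketch: coalesce queued saves so only the latest snapshot is written.
 * Compile with -lpthread. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static char *pending_content = NULL; /* newest snapshot not yet written */

/* Main thread: requesting a save just replaces the pending snapshot. */
static void request_config_save(const char *config) {
    char *snapshot = strdup(config);
    pthread_mutex_lock(&pending_lock);
    free(pending_content); /* the older, never-written snapshot is simply dropped */
    pending_content = snapshot;
    pthread_mutex_unlock(&pending_lock);
}

/* Worker: grab whatever is newest; NULL means nothing to do. */
static char *take_pending(void) {
    pthread_mutex_lock(&pending_lock);
    char *c = pending_content;
    pending_content = NULL;
    pthread_mutex_unlock(&pending_lock);
    return c;
}

int main(void) {
    request_config_save("configEpoch 1\n");
    request_config_save("configEpoch 2\n"); /* coalesces with the previous request */
    char *latest = take_pending();
    if (latest) {
        printf("worker writes: %s", latest); /* only the second snapshot hits the disk */
        free(latest);
    }
    return 0;
}
```

This also removes the stale-overwrite hazard by construction, since there is at most one pending snapshot and it is always the newest.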
    } else if (job_type == BIO_RDB_SAVE) {
        replicaReceiveRDBFromPrimaryToDisk(job->save_to_disk_args.conn, job->save_to_disk_args.is_dual_channel);
    } else if (job_type == BIO_CLUSTER_SAVE) {
        if (clusterSaveConfigFromBio(job->cluster_save_args.content, job->cluster_save_args.do_fsync) == C_ERR) {
Do we need to introduce exponential backoff and retries to mitigate transient errors? Can we also add observability/metrics around the type of error we see while trying to save the file/directory?
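A standalone sketch of bounded retry with exponential backoff (the attempt count, delays, and the flaky_save stub are made up; real code would also bump an error-type counter for observability):

```c
/* Standalone sketch: retry a failing save a few times with exponential backoff. */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

static int save_with_retry(int (*save_fn)(void), int max_attempts) {
    useconds_t delay_us = 1000; /* 1ms, then 2ms, 4ms, ... */
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (save_fn() == 0) return 0;
        fprintf(stderr, "save attempt %d failed: errno=%d\n", attempt, errno);
        if (attempt < max_attempts) {
            usleep(delay_us);
            delay_us *= 2;
        }
    }
    return -1; /* give up; the caller logs and/or increments an error metric */
}

/* Stand-in for a transient disk hiccup: fails twice, then succeeds. */
static int attempts_seen = 0;
static int flaky_save(void) {
    return (++attempts_seen < 3) ? (errno = EIO, -1) : 0;
}

int main(void) {
    int rc = save_with_retry(flaky_save, 5);
    printf("final result: %s after %d attempts\n", rc == 0 ? "ok" : "error", attempts_seen);
    return 0;
}
```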