
Conversation

@EricHayter
Contributor

This commit adds chunking functionality for the load/save operations of bloom filters. Additional information is added to the serialization of each filter: when saving a filter, its total size is written first, followed by the filter data in chunks (max 64 MB per chunk).

Addresses #5314.
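
A minimal sketch of that framing, assuming a toy length encoding and a plain string sink (AppendLen, SaveFilterChunked, and kMaxChunkSize are illustrative names, not Dragonfly's serializer API):

#include <algorithm>
#include <cstdint>
#include <string>
#include <string_view>

// Sketch of the chunked save framing: total filter size first, then the
// payload in chunks of at most 64 MB. AppendLen's fixed-width encoding is
// a toy stand-in for the real RDB length encoding.
constexpr size_t kMaxChunkSize = 64ull * 1024 * 1024;  // 64 MB per chunk

void AppendLen(std::string* out, uint64_t len) {
  out->append(reinterpret_cast<const char*>(&len), sizeof(len));
}

void SaveFilterChunked(std::string_view filter, std::string* out) {
  AppendLen(out, filter.size());  // total size of the filter
  for (size_t pos = 0; pos < filter.size(); pos += kMaxChunkSize) {
    size_t n = std::min(kMaxChunkSize, filter.size() - pos);
    AppendLen(out, n);                   // length of this chunk
    out->append(filter.substr(pos, n));  // chunk payload
  }
}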

@EricHayter EricHayter force-pushed the large-sbf-serialization branch from 40e6b5b to 0fbd8dd Compare October 23, 2025 01:04
@EricHayter
Contributor Author

I've done some manual testing to determine what capacity and error rate would induce chunking on filters. However, when I run the same sequence of commands in a unit test, specifically BF.RESERVE for a filter that large (> 64 MB), the RUN call fails:
E20251022 20:57:24.968197 8972 rdb_load.cc:2152] Error while calling HandleAux(): Out of memory, or used memory is too high

What would be a good way of testing the functionality in the unit tests?
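
(For scale: the textbook sizing formula m = -n * ln(p) / (ln 2)^2 bits gives a rough idea of when a classic bloom filter crosses the 64 MB chunk boundary; Dragonfly's SBF layout may differ, so the numbers this prints are only illustrative.)

#include <cmath>
#include <cstdio>

// Rough check of which capacity/error-rate combinations push a classic
// bloom filter past one 64 MB chunk, using m = -n * ln(p) / (ln 2)^2 bits.
int main() {
  const double kChunkBytes = 64.0 * 1024 * 1024;
  const double p = 0.001;  // error rate
  for (double n = 1e7; n <= 8e7; n *= 2) {  // capacity in elements
    double bits = -n * std::log(p) / (std::log(2) * std::log(2));
    std::printf("capacity %.0f -> %.1f MB%s\n", n, bits / 8 / 1024 / 1024,
                bits / 8 > kChunkBytes ? " (chunked)" : "");
  }
}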

@romange
Collaborator

romange commented Oct 23, 2025

I see the tests pass; can you please push the test that fails so I can advise?
I think explicitly overriding max_memory_limit might help...
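
A hedged sketch of what that could look like; BaseFamilyTest, Run(), and max_memory_limit being an assignable global are assumptions about the test scaffolding rather than confirmed API:

// Raise the memory limit before reserving a filter big enough (> 64 MB)
// to force the chunked save path.
class SbfChunkingTest : public BaseFamilyTest {  // assumed fixture name
 protected:
  void SetUp() override {
    max_memory_limit = 2ull << 30;  // 2 GB; assumed global memory cap
    BaseFamilyTest::SetUp();
  }
};

TEST_F(SbfChunkingTest, SaveLoadLargeFilter) {
  // ~171 MB of filter bits at this capacity/error rate, well past one
  // 64 MB chunk.
  Run({"BF.RESERVE", "big", "0.001", "100000000"});
  // ... snapshot, reload, and verify BF.EXISTS results survive the round trip.
}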

@EricHayter
Contributor Author

Got the test working. max_memory_limit fixed it. Thanks.

@EricHayter EricHayter marked this pull request as ready for review October 23, 2025 14:45
@romange
Collaborator

romange commented Nov 4, 2025

Hi @EricHayter, thanks for doing this. The problem with this PR is that it breaks compatibility with previous versions: newer versions of Dragonfly won't be able to load the "old" format during replication or snapshotting.

A possible way to address this:

  1. Introduce RDB_TYPE_SBF2.
  2. On the loading side, support both the previous format and the new one.
  3. On the saving side, add an rdb_sbf_chunked flag that defaults to false (see the sketch after this comment). In the unit test that you added, set it to true. Above the flag, add a comment: "flip its value to 'true' in March 2026", so that enough released versions can already load the new format before we switch. This prevents broken rollbacks: if someone updates Dragonfly and then wants to roll back, they couldn't if we flipped to the new format now.

This way, production will keep using the old encoding for 5-6 months while we accumulate released versions that support the new format; then we can flip the flag and slowly retire the old format.

Yes, wire formats are complicated to change.
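
A minimal sketch of the save-side switch from step 3; the flag name and the absl::GetFlag usage match the diff later in this thread, while SaveSbfV1/SaveSbfV2 and the RdbSerializer wiring are illustrative stand-ins:

#include "absl/flags/flag.h"

// Flip its value to 'true' in March 2026, once enough released versions
// can load the new format (prevents broken rollbacks).
ABSL_FLAG(bool, rdb_sbf_chunked, false,
          "Use the chunked SBF encoding (RDB_TYPE_SBF2) when saving.");

void SaveSbf(const SBF& sbf, RdbSerializer* ser) {
  if (absl::GetFlag(FLAGS_rdb_sbf_chunked)) {
    ser->WriteOpcode(RDB_TYPE_SBF2);  // new chunked format
    SaveSbfV2(sbf, ser);
  } else {
    ser->WriteOpcode(RDB_TYPE_SBF);   // legacy single-blob format
    SaveSbfV1(sbf, ser);
  }
}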

@romange romange self-requested a review November 4, 2025 08:27
@EricHayter EricHayter force-pushed the large-sbf-serialization branch 2 times, most recently from 61939aa to 5fef132 Compare November 6, 2025 01:46
Added a new flag `rdb_sbf_chunked` which determines the save format of
SBFs. Also, separate functions for saving SBFs were added.

Signed-off-by: Eric <[email protected]>
@EricHayter EricHayter force-pushed the large-sbf-serialization branch from 5fef132 to e83b84f Compare November 6, 2025 01:55
SET_OR_UNEXPECT(LoadLen(nullptr), hash_cnt);
SET_OR_UNEXPECT(FetchGenericString(), filter_data);

if (absl::GetFlag(FLAGS_rdb_sbf_chunked)) {
@romange
Collaborator

I do not think you need to use FLAGS_rdb_sbf_chunked in the loader.
The load part should only rely on RDB_TYPE_SBF2 and RDB_TYPE_SBF to decide how to parse; it should be stateless in this sense.
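
A minimal sketch of that stateless load path, with illustrative function and type names (ReadSbfChunked/ReadSbfLegacy are stand-ins, not the actual loader interface):

// Pick the parser from the RDB opcode alone, never from
// FLAGS_rdb_sbf_chunked.
io::Result<std::string> ReadSbfData(RdbLoaderBase* loader, int rdb_type) {
  if (rdb_type == RDB_TYPE_SBF2)
    return loader->ReadSbfChunked();  // total size, then <= 64 MB chunks
  return loader->ReadSbfLegacy();     // single FetchGenericString() blob
}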
