ct: End to end cluster recovery #29553

Lazin · 2026-02-05T23:11:02Z

This PR enables cloud topics cluster recovery to bootstrap partitions with the correct start offset and term from the L1 metastore. When a cloud topics cluster is recovered, partitions now start at their known offsets rather than offset 0, ensuring data consistency without requiring cloud access during partition creation.

Changes

Core Infrastructure
New Controller Command: set_partition_bootstrap_params_cmd

Adds a new controller command to set bootstrap parameters (start_offset, initial_term) for partitions before topic creation
Parameters are stored in topic_table._pending_bootstrap_params map keyed by NTP
topics_frontend::set_bootstrap_params() provides the API for setting these parameters

Partition Bootstrap Flow

controller_backend fetches bootstrap params from topic_table when creating partitions
partition_manager::manage() uses bootstrap params to initialize partition state via bootstrap_partition_state()
New raft::bootstrap_partition_state() function creates initial raft state with known offset/term

Cluster Recovery Integration

Modified cluster_recovery_backend.cc to query L1 metastore for each cloud topic partition's start_offset and term
Calls set_bootstrap_params() before creating recovered cloud topics
Partitions are created with correct offsets from metastore, avoiding cloud access during partition creation
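
The set-then-consume flow above can be sketched as a small standalone model. This is only an illustration under assumptions: `ntp`, `bootstrap_params`, `pending_bootstrap_table`, and `initial_start_offset` are hypothetical stand-ins, not the actual `model::ntp`, `partition_bootstrap_params`, or `topic_table` code from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical stand-in for model::ntp (namespace, topic, partition).
struct ntp {
    std::string ns;
    std::string topic;
    int32_t partition;

    bool operator<(const ntp& o) const {
        return std::tie(ns, topic, partition)
               < std::tie(o.ns, o.topic, o.partition);
    }
};

// Hypothetical stand-in for partition_bootstrap_params.
struct bootstrap_params {
    int64_t start_offset;
    int64_t initial_term;
};

// Toy model of a pending-params map keyed by NTP.
class pending_bootstrap_table {
public:
    // Register params before the topic is created.
    void set(const ntp& n, bootstrap_params p) {
        _pending.insert_or_assign(n, p);
    }

    // Look up params when a partition is about to be created.
    std::optional<bootstrap_params> get(const ntp& n) const {
        auto it = _pending.find(n);
        if (it == _pending.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Drop all pending params.
    void clear() { _pending.clear(); }

private:
    std::map<ntp, bootstrap_params> _pending;
};

// A partition with registered params starts at the known offset;
// any other partition starts at offset 0.
inline int64_t
initial_start_offset(const pending_bootstrap_table& t, const ntp& n) {
    auto p = t.get(n);
    return p ? p->start_offset : 0;
}
```

In this model the lookup is the only coupling between recovery and partition creation, which is why no cloud access is needed at creation time.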

Backports Required

Release Notes

none

Copilot

Pull request overview

This PR implements L0 recovery functionality by adding the ability to bootstrap partitions with custom initial offsets and terms. This enables programmatic partition creation for cluster recovery scenarios where partitions need to start at specific known offsets.

Changes:

Added bootstrap_partition_state function to create partition state with custom offset/term
Introduced partition_bootstrap_params to store and propagate bootstrap parameters through the system
Added commands and handlers to set bootstrap parameters on existing topics before partition materialization

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

File	Description
src/v/raft/consensus_utils.h	Declares new bootstrap_partition_state function
src/v/raft/consensus_utils.cc	Implements bootstrap_partition_state to create Raft snapshot and set storage metadata
src/v/raft/tests/bootstrap_partition_state_test.cc	Tests bootstrap_partition_state functionality
src/v/raft/tests/BUILD	Adds build configuration for bootstrap test
src/v/cluster/types.h	Defines partition_bootstrap_params and related command data structures
src/v/cluster/types.cc	Implements output operator for partition_bootstrap_params
src/v/cluster/topics_frontend.h	Declares set_bootstrap_params frontend method
src/v/cluster/topics_frontend.cc	Implements set_bootstrap_params to replicate bootstrap command
src/v/cluster/topic_updates_dispatcher.h	Declares apply method for bootstrap params command
src/v/cluster/topic_updates_dispatcher.cc	Implements command dispatch for bootstrap params
src/v/cluster/topic_table.h	Adds bootstrap_params field to partition metadata
src/v/cluster/topic_table.cc	Implements storage and retrieval of bootstrap params
src/v/cluster/tests/topic_table_test.cc	Tests bootstrap params command application
src/v/cluster/partition_manager.h	Adds bootstrap_params parameter to manage method
src/v/cluster/partition_manager.cc	Implements bootstrap logic using bootstrap_partition_state
src/v/cluster/controller_snapshot.h	Updates snapshot schema to include bootstrap_params
src/v/cluster/controller_backend.h	Adds bootstrap_params parameter to create_partition
src/v/cluster/controller_backend.cc	Passes bootstrap params through partition creation pipeline
src/v/cluster/commands.h	Defines set_partition_bootstrap_params_cmd

Add infrastructure to support bootstrapping partitions with custom initial offset and term. This is used for programmatic partition creation during cluster recovery.
- Add partition_bootstrap_params struct in types.h
- Add get_partition_bootstrap_params() API to topic_table
- Pass bootstrap_params through controller_backend to partition_manager
- Add bootstrap_existing_log() helpers in raft/consensus_utils
Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Add a controller command to set bootstrap parameters for partitions in an existing topic. This enables cluster recovery to:
1. Create topics without remote_topic_properties
2. Set bootstrap params via this command
3. Let controller_backend create partitions with known offsets
- Add set_partition_bootstrap_params_cmd_data struct
- Add set_partition_bootstrap_params_cmd command type
- Add apply() method in topic_table and topic_updates_dispatcher
- Add set_bootstrap_params() method in topics_frontend
- Add unit tests
Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

parameters propagation and use. The test registers partition bootstrap parameters (start offset and term id) and then creates the topic. The partition for which the bootstrap parameters were registered is then validated to have the right starting offset. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

vbotbuildovich · 2026-02-06T22:29:34Z

Retry command for Build#80344

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/quota_management_test.py::QuotaManagementUpgradeTest.test_upgrade

vbotbuildovich · 2026-02-06T23:07:24Z

CI test results

test results on build#80344

test_class	test_method	test_arguments	test_kind	job_url	test_status	passed	reason	test_history
invalid_describe_configs_test	bad_describe_config_response		unit	https://buildkite.com/redpanda/redpanda/builds/80344#019c34db-bcef-4c46-b1cf-c07b47eca920	FAIL	0/1
QuotaManagementUpgradeTest	test_upgrade	null	integration	https://buildkite.com/redpanda/redpanda/builds/80344#019c34ef-3174-45ef-b9bd-74f11cb38d80	FLAKY	9/11	Test FAILS after retries.Significant increase in flaky rate(baseline=0.0000, p0=0.0000, reject_threshold=0.0100)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=QuotaManagementUpgradeTest&test_method=test_upgrade

test results on build#80380

test_class	test_method	test_arguments	test_kind	job_url	test_status	passed	reason	test_history
NodesDecommissioningTest	test_decommission_status	null	integration	https://buildkite.com/redpanda/redpanda/builds/80380#019c39cd-077a-4078-b4e2-efc976366484	FLAKY	10/11	Test PASSES after retries.No significant increase in flaky rate(baseline=0.0491, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1402, p1=0.2208, trust_threshold=0.5000)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_decommission_status

test results on build#80435

test_class	test_method	test_arguments	test_kind	job_url	test_status	passed	reason	test_history
DataMigrationsApiTest	test_higher_level_migration_api	null	integration	https://buildkite.com/redpanda/redpanda/builds/80435#019c4616-42e4-449a-98a6-d3d0c4fcad3e	FLAKY	10/11	Test PASSES after retries.No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DataMigrationsApiTest&test_method=test_higher_level_migration_api
WriteCachingFailureInjectionE2ETest	test_crash_all	{"use_transactions": false}	integration	https://buildkite.com/redpanda/redpanda/builds/80435#019c4619-ef05-46a8-9e34-f7c9a4701ce0	FLAKY	5/11	Test FAILS after retries.Significant increase in flaky rate(baseline=0.1085, p0=0.0024, reject_threshold=0.0100)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all

test results on build#80448

test_class	test_method	test_arguments	test_kind	job_url	test_status	passed	reason	test_history
ScalingUpTest	test_moves_with_local_retention	{"use_topic_property": true}	integration	https://buildkite.com/redpanda/redpanda/builds/80448#019c489a-3c3f-4e14-b91a-547c0091d87a	FLAKY	10/11	Test PASSES after retries.No significant increase in flaky rate(baseline=0.0229, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ScalingUpTest&test_method=test_moves_with_local_retention

andrwng · 2026-02-07T02:15:09Z

src/v/cluster/tests/topic_table_test.cc

    add_random_topic(); // invalidates iterator
    BOOST_REQUIRE_THROW((void)it->first, iterator_stability_violation);
 }
+


Create topics without remote_topic_properties

I get that we can avoid downloading them on autocreate, but I think we still want to preserve the initial revision id of the topic. Topic manifests (used for read replicas) paths will include the initial revisions, and if we aren't preserving it, every time we do recovery we'll start uploading manifests to a different path.

but why do we need them in L0?

This isn't just for L0, it's for recovery of the entire topic. L1 state is discoverable (e.g. through read replicas) by topic manifests

I see. L0 doesn't need it to function but polluting bucket with manifests is a no go. Do we store initial revision in the metastore?

I'd love to avoid pulling in all topic manifests if possible. Given that we already have the metastore it should be a main way to pull metadata. Using a mix of data from the metastore and the manifests feels a bit odd. The metastore is clearly superior for the recovery.

Yeah I don't think we need to download the manifests. But the caller of recovery (wcr) should have enough information from the controller to give the revision id (if a remote revision is set, use it, and if not, use the topic revision)
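
The rule suggested in the last comment (prefer the remote revision when one is set, otherwise fall back to the topic revision) might look roughly like this. It is a sketch only: `topic_revisions` and `initial_revision_for_recovery` are hypothetical names, and plain integers stand in for the real revision id type.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the revision info the controller holds
// about a topic that is being recovered.
struct topic_revisions {
    // Revision assigned to the topic in the controller.
    int64_t topic_revision;
    // Remote revision, present only if one was set for the topic.
    std::optional<int64_t> remote_revision;
};

// Pick the revision to preserve across recovery: the remote revision
// if it is set, otherwise the topic revision.
inline int64_t initial_revision_for_recovery(const topic_revisions& r) {
    return r.remote_revision.value_or(r.topic_revision);
}
```

Keeping the choice in one place would let repeated recoveries resolve to the same revision, so manifests keep landing under the same path.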

andrwng · 2026-02-07T02:21:48Z

src/v/cluster/topics_frontend.h

+      model::topic_namespace,
+      absl::btree_map<model::partition_id, partition_bootstrap_params>,


How do we guarantee that the topic creation that consumes these is actually the topic that we care about? Should we be including a revision id somewhere in the key?

We don't know the revision id of the topic because the topic is created after the bootstrap params are set. This is only used during the full cluster recovery. I guess we want to clean this up after partitions are created to avoid a situation when the topic is re-created but after that it's picking up the old bootstrap params.

I added the cleanup for that. The cluster recovery backend now sets the bootstrap parameters, and once recovery is completed and all partitions are reconciled and actually created on the replicas, the bootstrap parameters are removed.

andrwng · 2026-02-07T02:24:42Z

src/v/cluster/tests/topic_table_test.cc

+    // Params should still be available after topic creation
+    params0 = table.local().get_partition_bootstrap_params(ntp0);
+    BOOST_REQUIRE(params0.has_value());
+    BOOST_REQUIRE_EQUAL(params0->start_offset, model::offset(1000));
+
+    params1 = table.local().get_partition_bootstrap_params(ntp1);
+    BOOST_REQUIRE(params1.has_value());
+    BOOST_REQUIRE_EQUAL(params1->start_offset, model::offset(2000));


Is this an important property to maintain? I would have thought that once we create the topic, we might want the topic table to remove the bootstrap params.

yes, but at the moment there is no command that removes them
but the intention is to eventually remove them

I added the command that removes the bootstrap parameters, there is a test for that below.

andrwng · 2026-02-07T02:27:12Z

src/v/cluster/types.h

+/// Data for setting bootstrap parameters on existing topic partitions.
+/// Used by cluster recovery to specify known offsets for partitions.
+struct set_partition_bootstrap_params_cmd_data


If we're leaving the bootstrap params in the topics table anyway, what's the rationale for having another command for this, vs including it as an optional parameter for topic creation?

Having it be separate feels a bit odd, because e.g. cluster recovery could fail midway and we could be left with the bootstrap params in the topic table (and maybe they'd be unintentionally used by some other topic creation?)

It's just a separation of concerns. The plan is to add a clean-up command that will remove bootstrap state.

Any thoughts about the case where WCR fails and we end up leaving state in the topics table?

WDYT about storing the bootstrap params in the cluster_recovery_state (also replicated on every node on every shard through the cluster_recovery_table). That way it makes it clear that these bootstrap params are only for recovery, and it becomes easy to clear state atomically with respect to WCR status (e.g. once we complete or fail WCR, the state transition could clear this map).

But that should be fine. If recovery fails the user will retry the recovery. The same set of bootstrap params will be used.
I can try to do this, but I'm a bit concerned about new dependencies.

If the user doesn't retry recovery though, this seems like a surprise waiting to happen. It doesn't seem like an unreasonable sequence of events that a user tries to recover, it fails midway, but the user inspects the cluster, thinks it's good enough to continue their jobs, and doesn't retry.

Re: dependencies, yea it's a fair concern. I'm hoping there aren't any surprises there. I do appreciate that the cluster recovery table state updates are deterministic, but it's worth thinking about if there are races between its updates and topic creation...

I don't think that there are races because we're waiting until the changes are applied to the topic table and the reconciliation loop is doing the same thing. Essentially, both commands are replicated with the same log.

I think the easiest solution is to clear this state on recovery failure. I'll try this next.

Also, when the new recovery starts.
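
The cleanup points discussed in this thread (on completion, on failure, and when a new recovery starts) can be modelled as below. This is a toy sketch under assumptions: `recovery_bootstrap_tracker` and its methods are hypothetical and do not mirror the real `cluster_recovery_backend` or `topic_table` interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>

// Toy model: pending bootstrap params live only for the duration of
// one recovery attempt.
class recovery_bootstrap_tracker {
public:
    // A new recovery drops anything left over from an earlier attempt.
    void begin_recovery() {
        _pending.clear();
        _active = true;
    }

    // Params are accepted only while a recovery is in progress.
    void set_params(int32_t partition_id, int64_t start_offset) {
        if (_active) {
            _pending.insert_or_assign(partition_id, start_offset);
        }
    }

    // Both outcomes clear the pending state, so a later, unrelated
    // topic creation cannot pick up stale params.
    void finish_recovery(bool /*succeeded*/) {
        _pending.clear();
        _active = false;
    }

    std::optional<int64_t> start_offset_for(int32_t partition_id) const {
        auto it = _pending.find(partition_id);
        if (it == _pending.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    std::size_t pending_count() const { return _pending.size(); }

private:
    std::map<int32_t, int64_t> _pending;
    bool _active{false};
};
```

With this shape, a user who abandons a failed recovery is left with no pending params, which is the surprise the thread is trying to avoid.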

andrwng · 2026-02-07T02:34:18Z

src/v/cluster/partition_manager.cc

-    // hasn't been constructed yet.
-    // TODO: implement a recovery primitive for cloud topics
-    if (!ntp_cfg.cloud_topic_enabled()) {
+    if (bootstrap_params.has_value()) {


Should we also condition this on having cloud_topics_enabled set? It feels risky to be permissive here, unless you have some other user of bootstrap params in mind?

The mechanism is generic, it's not tied to the cloud topics and will work in any case.

andrwng · 2026-02-07T03:00:49Z

tests/rptest/tests/cluster_recovery_test.py

+        assert result[0] is not None
+        return result[0]
+
+    def _wait_for_metastore_start_offset(


tests/rptest/tests/cluster_recovery_test.py

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

to the topics_frontend. The method replicates the command that clears pending bootstrap state from the topic_table. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

how the clear_partition_bootstrap_params_cmd command is handled Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

When all topics/partitions are created and reconciled invoke the method to remove bootstrap state. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

When the cloud topic is recovered the revision id has to be populated. This is done in the cluster recovery reconciler. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

vbotbuildovich · 2026-02-10T06:44:31Z

Retry command for Build#80435

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/write_caching_fi_e2e_test.py::WriteCachingFailureInjectionE2ETest.test_crash_all@{"use_transactions":false}

Add two new commands to the offline_log_viewer tool. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Lazin · 2026-02-10T16:02:13Z

upd:

the cleanup is now called if recovery fails or a new recovery starts
the offline_log_viewer is updated to support the two new controller commands

Copilot AI review requested due to automatic review settings February 5, 2026 23:11

github-actions bot added area/build area/redpanda labels Feb 5, 2026

Copilot AI reviewed Feb 5, 2026

Lazin force-pushed the ct/ctp-recovery branch 2 times, most recently from 2c63bce to b85c5ce Compare February 6, 2026 20:07

redpanda-data deleted a comment from Copilot AI Feb 6, 2026

Lazin force-pushed the ct/ctp-recovery branch from b85c5ce to 1fa4866 Compare February 6, 2026 20:48

Lazin force-pushed the ct/ctp-recovery branch from 1fa4866 to 2c8cdfc Compare February 6, 2026 21:04

Lazin added 4 commits February 6, 2026 16:28

raft: Add partition bootstrap test

efc6486

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

ct: Set up L0 recovery with cluster_recovery_backend

f4683d2

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Lazin force-pushed the ct/ctp-recovery branch from 2c8cdfc to f4683d2 Compare February 6, 2026 21:28

Lazin changed the title ~~ct: L0 recovery (WIP)~~ ct: Cluster recovery Feb 6, 2026

Lazin changed the title ~~ct: Cluster recovery~~ ct: E2e cluster recovery Feb 7, 2026

Lazin changed the title ~~ct: E2e cluster recovery~~ ct: End to end cluster recovery Feb 7, 2026

Lazin requested review from andrwng and dotnwat February 7, 2026 00:09

andrwng reviewed Feb 7, 2026

Lazin added 3 commits February 7, 2026 05:37

rptest: Add cloud topics recovery test

6bb1d79

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

cluster: Add 'clear_bootstrap_params' method

0a21283

to the topics_frontend. The method replicates the command that clears pending bootstrap state from the topic_table. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

cluster: Add topic_table test that checks

429feb3

how the clear_partition_bootstrap_params_cmd command is handled Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Lazin force-pushed the ct/ctp-recovery branch from 68bd273 to 429feb3 Compare February 7, 2026 11:23

Lazin requested a review from andrwng February 9, 2026 16:42

Lazin added 2 commits February 9, 2026 16:54

cluster: Invoke clear_bootstrap_params during recovery

bfc2449

When all topics/partitions are created and reconciled invoke the method to remove bootstrap state. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

cluster: Populate initial revision id

de21311

When the cloud topic is recovered the revision id has to be populated. This is done in the cluster recovery reconciler. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>

Lazin force-pushed the ct/ctp-recovery branch from e149526 to de21311 Compare February 9, 2026 21:54

fixup! cluster: Invoke clear_bootstrap_params during recovery

d3a7a70

tools: Update offline_log_viewer

a8f19cb

Add two new commands to the offline_log_viewer tool. Signed-off-by: Evgeny Lazin <4lazin@gmail.com>
