ct: hook cloud topics recovery into whole cluster restore #29515
andrwng merged 8 commits into redpanda-data:dev from
Conversation
Pull request overview
This PR hooks cloud topics recovery into the whole cluster restore process. It introduces new recovery stages for cloud topics metastore and cloud topic data, allowing cloud topics to be restored from controller snapshots during cluster recovery.
Changes:
- Added two new recovery stages (recovered_cloud_topics_metastore and recovered_cloud_topic_data) for restoring cloud topics
- Modified the topic creation flow to use the metastore's remote label for cloud topics
- Implemented recovery actions for cloud topics metastore and cloud topic data
- Added comprehensive tests for cloud topics cluster recovery scenarios
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/v/redpanda/application_start.cc | Passes cloud topics state to controller start for recovery |
| src/v/kafka/server/handlers/alter_configs.cc | Prevents remote_label from being modified via AlterConfigs |
| src/v/cluster/types.h | Adds remote_label property and new recovery stages |
| src/v/cluster/types.cc | Adds string representation for new recovery stages |
| src/v/cluster/topics_frontend.cc | Updates topic creation to use metastore's remote label for cloud topics |
| src/v/cluster/topic_table.cc | Adds incremental update handling for remote_label |
| src/v/cluster/partition_manager.cc | Skips log download for cloud topics during recovery |
| src/v/cluster/controller.h | Adds cloud topics state parameter to controller start |
| src/v/cluster/controller.cc | Passes cloud topics state to recovery backend |
| src/v/cluster/cluster_recovery_reconciler.h | Adds cloud topics fields to recovery actions |
| src/v/cluster/cluster_recovery_reconciler.cc | Implements recovery action generation for cloud topics |
| src/v/cluster/cloud_metadata/tests/controller_snapshot_reconciliation_test.cc | Tests for metastore and cloud topic reconciliation |
| src/v/cluster/cloud_metadata/tests/cluster_metadata_utils.h | Helper functions for cloud topic properties |
| src/v/cluster/cloud_metadata/tests/BUILD | Adds dependencies for test helpers |
| src/v/cluster/cloud_metadata/cluster_recovery_backend.h | Adds cloud topics state member |
| src/v/cluster/cloud_metadata/cluster_recovery_backend.cc | Implements metastore and cloud topic recovery actions |
| src/v/cluster/BUILD | Adds cloud topics metastore dependencies |
| src/v/cloud_topics/tests/cluster_recovery_test.cc | End-to-end test for cloud topics cluster recovery |
| src/v/cloud_topics/tests/BUILD | Adds cluster recovery test target |
| src/v/cloud_topics/level_one/metastore/tests/replicated_metastore_test.cc | Tests restore with correct partition count |
| src/v/cloud_topics/level_one/metastore/tests/BUILD | Adds test dependencies |
| src/v/cloud_topics/level_one/metastore/replicated_metastore.cc | Updates restore to handle missing manifest and partition count |
| src/v/cloud_topics/level_one/metastore/leader_router.h | Adds num_partitions parameter to ensure_topic_exists |
| src/v/cloud_topics/level_one/metastore/leader_router.cc | Passes num_partitions through to domain supervisor |
| src/v/cloud_topics/level_one/domain/domain_supervisor.h | Adds num_partitions parameter |
| src/v/cloud_topics/level_one/domain/domain_supervisor.cc | Creates metastore topic with specified partition count |
Comments suppressed due to low confidence (1)
src/v/cloud_topics/level_one/metastore/replicated_metastore.cc:1
- Corrected spelling of 'uplaod' to 'upload'.
```cpp
// Restore on the metastore with our desired remote label. This will
// create the underlying topic if it doesn't already exist. Retry on
// transport errors since leadership may be unstable during recovery.
auto* metastore = _ct_state->local().get_l1_metastore();
```
Potential null pointer dereference. While there's a check for !_ct_state earlier, there's no validation that get_l1_metastore() returns a non-null pointer before using it.
Suggested change:
```cpp
auto* metastore = _ct_state->local().get_l1_metastore();
if (metastore == nullptr) {
    vlog(
      clusterlog.error,
      "Cloud topics metastore not available for recovery");
    co_return cluster::errc::invalid_configuration_update;
}
```
Force-pushed from 4de461f to eef4196
```cpp
auto last_included_term = last_segment->archiver_term;

// TODO: implement a recovery primitive for cloud topics
if (!ntp_cfg.cloud_topic_enabled()) {
```
Can you remove this change? It interferes with the implementation of the CTP recovery.
Is it tricky to rebase over the conflict? Would different formatting make the conflict easier to deal with?
This is required for my tests, as without it, WCR will fail in trying to autocreate the topics.
Just rebased
Lazin left a comment:
LGTM except the last mile (partition manager recovery) but I'll revert that later
```cpp
case recovered_acls:
    return 5;
case recovered_remote_topic_data:
case recovered_cloud_topics_metastore:
```
the order has to be changed because subsequent steps need all topics to be recovered already?
Are you suggesting that I change the order, or are you asking why it's important that it's ordered that way?
The order is important for a few reasons:
- it's generally going to be less error-prone to do metastore recovery before we've restored all the other (e.g. tiered storage) topics
- we want to restore the cloud topics before e.g. getting to the stage where we truncate consumer group offsets
```cpp
          tp_ns);
        continue;
    }
    if (tp_config.is_cloud_topic()) {
```
For cloud topics the redpanda.remote.recovery topic property is not needed.
The flow of the L0 recovery:
- the reconciler recovers the metastore
- then it reads starting points for all CTPs
- then it invokes a method of the topics_frontend that replicates the starting points
- the starting points are added to the topics table
- CTPs are created normally (no recovery prop), and the partitions will be seeded with the right starting points from the metastore
I think here we just need to add a config to the list without adding the recovery property, or I guess I can change it in my PR later.
Yea, that makes sense. We can either do that in the cluster_recovery_backend, which has access to the metastore when getting into the recovered_cloud_topic_data stage, or, as you suggest, by plumbing that into the topics_frontend. I suspect doing it in the recovery backend will be more straightforward.
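A rough sketch of the flow described in this thread; the helper names (read_starting_points, replicate_starting_points), the l1_metastore type name, and the snapshot-topics container are hypothetical and not part of this PR, while only is_cloud_topic() and the overall sequencing come from the discussion and diff above:

```cpp
// Hypothetical sketch of the proposed L0 recovery flow: read per-partition
// starting points from the restored metastore, replicate them through the
// topics_frontend, then create the cloud topics without the recovery prop.
ss::future<cluster::errc> recover_cloud_topic_data_sketch(
  l1_metastore& metastore, // placeholder type name
  cluster::topics_frontend& frontend,
  const std::vector<cluster::topic_configuration>& snapshot_topics) {
    for (const auto& cfg : snapshot_topics) {
        if (!cfg.is_cloud_topic()) {
            continue;
        }
        // Read the per-partition starting offsets/terms from the restored
        // metastore (placeholder call).
        auto starting_points = co_await read_starting_points(
          metastore, cfg.tp_ns);
        // Replicate them through the topics_frontend so they land in the
        // topics table (placeholder call).
        auto ec = co_await replicate_starting_points(
          frontend, cfg.tp_ns, std::move(starting_points));
        if (ec != cluster::errc::success) {
            co_return ec;
        }
        // The topic itself is then created normally, without the recovery
        // property; its partitions are seeded from the replicated starting
        // points.
    }
    co_return cluster::errc::success;
}
```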
Force push:
There were some uncaught exceptions in opening the replicated database. Wraps these in an appropriate error type. Along the way, I updated some other exception handling to use the new wrapper.
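As an illustration of the wrapping pattern this commit describes, here is a minimal sketch; open_replicated_db, db_handle, and the chosen errc value are placeholders, not the PR's actual wrapper type:

```cpp
// Hypothetical sketch: convert exceptions thrown while opening the
// replicated database into an error the caller can handle, rather than
// letting them escape the coroutine.
ss::future<cluster::errc> open_db_wrapped_sketch(db_handle& db) {
    try {
        co_await open_replicated_db(db);
    } catch (const std::exception&) {
        // Placeholder error value; the real change wraps the exception in
        // an appropriate error type for the metastore code path.
        co_return cluster::errc::replication_error;
    }
    co_return cluster::errc::success;
}
```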
In restoring the metastore topic during WCR, one of the steps to complete recovery is to change the metastore topic's remote label to be that of the original metastore topic. In this way, the location of the metastore manifest will be consistent between the original cluster and the restored cluster (which is a useful property for read replicas, which will need to locate the metastore).
It'll generally be more robust if the restored topic is created with the number of partitions from the metastore manifest, vs. the current behavior of rejecting the restore if there's a partition count mismatch. To that end, allow ensure_topic_exists() to take in a parameter for the number of partitions.
- We previously rejected the restore if the number of domains didn't match the default number of partitions. Instead, we just create the same number of partitions as there are expected domains.
- We previously rejected the restore if there is no metastore manifest in the cloud. Instead, treat this as success, and opt to create the metastore topic if needed.
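A minimal sketch of the adjusted restore behavior under these assumptions; only ensure_topic_exists() and its new num_partitions parameter come from this change, while metastore_manifest, its domains field, and default_metastore_partitions are placeholders:

```cpp
// Hypothetical sketch of the restore path after these two changes.
ss::future<cluster::errc> restore_sketch(
  std::optional<metastore_manifest> manifest) {
    if (!manifest.has_value()) {
        // No metastore manifest in the cloud: treat this as success, and
        // just make sure the metastore topic exists with the default layout.
        co_await ensure_topic_exists(default_metastore_partitions);
        co_return cluster::errc::success;
    }
    // Create the topic with one partition per expected domain instead of
    // rejecting a mismatch with the default partition count.
    const auto num_partitions = static_cast<int32_t>(
      manifest->domains.size());
    co_await ensure_topic_exists(num_partitions);
    // Each domain is then restored into its corresponding partition.
    co_return cluster::errc::success;
}
```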
This allows us to rely on the invariant that a topic's remote_label points to the same remote label that the metastore's manifest will be. This is a nice property to have for read replicas: given a topic manifest, we'll be able to find the appropriate metastore manifest in the object storage.
The current ordering of actions is somewhat brittle since action order is tied to stage enum value. This makes it difficult to extend with new stages because stages are serialized in the controller log, and thus it's unsafe to add new stage enums with values that aren't at the end of the enum space. This commit takes a stab at making this a bit better by defining a non-persisted ordering of stages that is decoupled from enum value. My intent is to add more stages for cloud topics restore that will not be the last stage actioned.
This adds a couple of stages that will be used for cloud topics recovery:
- metastore recovery: the metastore will be restored such that the new cluster's metastore topic will host the appropriate domain databases
- topic recovery: each cloud topic will query the metastore to learn about the offsets/terms at which to start each partition, and then initialize their CTP STMs with those starting points
These stages are only performed if there are cloud topics that need to be restored.
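A hedged sketch of what the decoupled ordering could look like, assuming the stage enum in cluster/types.h is named recovery_stage; the function name and the specific indices are illustrative only:

```cpp
// Illustrative sketch only: the ordinal used to sequence recovery actions is
// computed here rather than derived from the enum's serialized value, so new
// stages can be appended to the end of the enum space (keeping controller
// log serialization stable) yet ordered wherever the recovery flow needs.
int stage_order_sketch(cluster::recovery_stage stage) {
    switch (stage) {
    case cluster::recovery_stage::recovered_cloud_topics_metastore:
        return 6; // hypothetical: before other topic data is restored
    case cluster::recovery_stage::recovered_cloud_topic_data:
        return 7; // hypothetical: once the metastore is available
    // remaining stages elided from this sketch
    default:
        return 0;
    }
}
```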
This implements the body of the cluster restore stages for cloud topics, by restoring the metastore and then calling create topics on all the desired cloud topics. One of the invariants we're trying to uphold is that the metastore topic will have the same remote_label after restoring as the original (this has the nice property of being able to point at the metastore from a topic manifest, e.g. for read replicas). To that end, restoring the metastore looks like:
1. creating the metastore topic
2. calling restore
3. upon success, updating the metastore topic's remote_label to be what it was in the dead cluster
After this happens, we go ahead and create all the cloud topics, which is only partially implemented in this commit: we create the topics, but additional logic is needed to get the restored topics to bootstrap their ctp_stms from the metastore. That is left as separate work.
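A rough sketch of that three-step flow; create_metastore_topic, restore_all_domains, update_metastore_remote_label, and the remote_label parameter type are placeholders, while _ct_state->local().get_l1_metastore() appears in the PR diff itself:

```cpp
// Hypothetical sketch of the recovered_cloud_topics_metastore stage.
ss::future<cluster::errc> restore_metastore_stage_sketch(
  remote_label original_label) {
    auto* metastore = _ct_state->local().get_l1_metastore();
    if (metastore == nullptr) {
        co_return cluster::errc::invalid_configuration_update;
    }
    // 1. Create the metastore topic if it doesn't already exist.
    if (auto ec = co_await create_metastore_topic();
        ec != cluster::errc::success) {
        co_return ec;
    }
    // 2. Restore each domain database from the metastore manifest.
    if (auto ec = co_await restore_all_domains(*metastore);
        ec != cluster::errc::success) {
        co_return ec;
    }
    // 3. On success, point the metastore topic's remote_label back at the
    //    original cluster's label so the manifest location stays the same,
    //    and so that a repeated restore attempt becomes a no-op.
    co_return co_await update_metastore_remote_label(original_label);
}
```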
Retry command for Build #80268. Please wait until all jobs are finished before running the slash command.
This hooks cloud topics restore into the Whole Cluster Restore machinery. At a high level, cloud topics restore entails two steps: restoring the metastore, and then restoring the cloud topics themselves.
In restoring the metastore, we restore each domain, and then update the metastore topic config so its remote_label will be that of the original metastore. This sets the metastore up to continue uploading its metastore manifest to the original cluster's location, which will be a nice property to have for discoverability of the metastore (e.g. for read replicas). This updated remote label also serves as an indicator that restore is complete and that subsequent attempts to restore this metastore will be no-ops.
In restoring the cloud topics, all the original cloud topics' properties are preserved, but with the recovery property set. There is some additional work that needs to be done to make the restored cloud topics operational (e.g. on restore, they should query the metastore to bootstrap their CTP STMs), but that is left for later work.

Backports Required
Release Notes