From 03913990a12d14b789c64f5b12f6455c23336451 Mon Sep 17 00:00:00 2001
From: lidezhu
Date: Fri, 15 Aug 2025 12:05:41 +0800
Subject: [PATCH 1/7] ticdc: improve description about ticdc compatibility with lightning & BR

---
 ticdc/ticdc-faq.md                            | 22 ++++++++++++++-----
 ticdc/ticdc-overview.md                       |  4 ++--
 .../tidb-lightning-physical-import-mode.md    |  4 +---
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index b9eb34417a09b..b13846a33b069 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -369,22 +369,34 @@ In v6.1.3 and later versions, the default value of `safe-mode` changes to `false
 
 When upstream write traffic is at peak hours, the downstream may fail to consume all data in a timely manner, resulting in data pile-up. TiCDC uses disks to process the data that is piled up. TiCDC needs to write data to disks during normal operation. However, this is not usually the bottleneck for replication throughput and replication latency, given that writing to disks only results in latency within a hundred milliseconds. TiCDC also uses memory to accelerate reading data from disks to improve replication performance.
 
-## Why does replication using TiCDC stall or even stop after data restore using TiDB Lightning physical import mode and BR from upstream?
+## What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?
 
-Currently, TiCDC is not yet fully compatible with [TiDB Lightning physical import mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) and BR. Therefore, avoid using TiDB Lightning physical import mode and BR on tables that are replicated by TiCDC. Otherwise, unknown errors might occur, such as TiCDC replication getting stuck, a significant spike in replication latency, or data loss.
+TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Since this import method does not involve the regular data writing process, it does not generate change log records. In most cases, changefeed cannot observe this kinds of data changes. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kinds of data changes be observed. Therefore, changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
 
-If you need to use TiDB Lightning physical import mode or BR to restore data for some tables replicated by TiCDC, take these steps:
+If the tables operated by TiDB Lightning Physical Import Mode overlap with the tables monitored by the changefeed, various unknown errors may occur due to incomplete data capture, such as changefeed synchronization stalling or data inconsistency between upstream and downstream. If you need to use TiDB Lightning Physical Import Mode to import tables synchronized by TiCDC, follow these steps:
 
 1. Remove the TiCDC replication task related to these tables.
 
-2. Use TiDB Lightning physical import mode or BR to restore data separately in the upstream and downstream clusters of TiCDC.
+2. Use TiDB Lightning physical import mode to restore data separately in the upstream and downstream clusters of TiCDC.
 
-3. After the restoration is complete and data consistency between the upstream and downstream clusters is verified, create a new TiCDC replication task for incremental replication, with the timestamp (TSO) from the upstream backup as the `start-ts` for the task. For example, assuming the snapshot timestamp of the BR backup in the upstream cluster is `431434047157698561`, you can create a new TiCDC replication task using the following command:
+3. After the restoration is complete and the data consistency of the corresponding tables in the upstream and downstream clusters has been verified, use the timestamp (TSO) after the completion of TiDB Lightning Physical Import Mode as the start-ts of the TiCDC replication task to create a new TiCDC replication task for incremental replication.
 
     ```shell
     cdc cli changefeed create -c "upstream-to-downstream-some-tables" --start-ts=431434047157698561 --sink-uri="mysql://root@127.0.0.1:4000?time-zone="
     ```
 
+If you can confirm that the tables operated by TiDB Lightning Physical Import Mode do not overlap with the tables monitored by the changefeed, you can set the `check-requirements` in the TiDB Lightning configuration file to `false` to forcibly execute the data import operation.
+
+## What are the compatibility limitations between BR (Backup & Restore) and TiCDC?
+
+BR (Backup & Restore) also directly generates SST files and imports them into the TiKV cluster. Changefeeds do not guarantee the ability to fully capture data imported in this manner. For details, refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
+
+Different versions of BR handle this differently:
+
+- For versions before v8.2.0, if a changefeed task already exists on the cluster, BR will refuse to create a restore task.
+
+- Starting from v8.2.0, a restore task is only allowed to be created if the backupTs of the data restored by BR is earlier than the checkpointTs of all changefeeds on the cluster.
+
 ## After a changefeed resumes from pause, its replication latency gets higher and higher and returns to normal only after a few minutes. Why?
 
 When a changefeed is resumed, TiCDC needs to scan the historical versions of data in TiKV to catch up with the incremental data logs generated during the pause. The replication process proceeds only after the scan is completed. The scan process might take several to tens of minutes.
diff --git a/ticdc/ticdc-overview.md b/ticdc/ticdc-overview.md
index 94b7e854c3942..5c34a5d2fc468 100644
--- a/ticdc/ticdc-overview.md
+++ b/ticdc/ticdc-overview.md
@@ -156,8 +156,8 @@ Currently, the following scenarios are not supported:
 
 - A TiKV cluster that uses RawKV alone.
 - The [`CREATE SEQUENCE` DDL operation](/sql-statements/sql-statement-create-sequence.md) and the [`SEQUENCE` function](/sql-statements/sql-statement-create-sequence.md#sequence-function) in TiDB. When the upstream TiDB uses `SEQUENCE`, TiCDC ignores `SEQUENCE` DDL operations/functions performed upstream. However, DML operations using `SEQUENCE` functions can be correctly replicated.
-- Currently, performing [TiDB Lightning physical import](/tidb-lightning/tidb-lightning-physical-import-mode.md) on tables and databases that are being replicated by TiCDC is not supported. For more information, see [Why does replication using TiCDC stall or even stop after data restore using TiDB Lightning and BR from upstream](/ticdc/ticdc-faq.md#why-does-replication-using-ticdc-stall-or-even-stop-after-data-restore-using-tidb-lightning-physical-import-mode-and-br-from-upstream).
-- Before v8.2.0, BR does not support [restoring data](/br/backup-and-restore-overview.md) for a cluster with TiCDC replication tasks. For more information, see [Why does replication using TiCDC stall or even stop after data restore using TiDB Lightning and BR from upstream](/ticdc/ticdc-faq.md#why-does-replication-using-ticdc-stall-or-even-stop-after-data-restore-using-tidb-lightning-physical-import-mode-and-br-from-upstream).
+- Currently, performing [TiDB Lightning physical import](/tidb-lightning/tidb-lightning-physical-import-mode.md) on tables and databases that are being replicated by TiCDC is not supported. For more information, see [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
+- Before v8.2.0, BR does not support [restoring data](/br/backup-and-restore-overview.md) for a cluster with TiCDC replication tasks. For more information, see [What are the compatibility limitations between BR (Backup & Restore) and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-br-backup--restore-and-ticdc).
 - Starting from v8.2.0, BR relaxes the restrictions on data restoration for TiCDC: if the `BackupTS` (the backup time) of the data to be restored is earlier than the changefeed [`CheckpointTS`](/ticdc/ticdc-architecture.md#checkpointts) (the timestamp that indicates the current replication progress), BR can proceed with the data restoration normally. Considering that the `BackupTS` is usually much earlier, it can be assumed that in most scenarios, BR supports restoring data for a cluster with TiCDC replication tasks.
 
 TiCDC only partially supports scenarios involving large transactions in the upstream. For details, refer to the [TiCDC FAQ](/ticdc/ticdc-faq.md#does-ticdc-support-replicating-large-transactions-is-there-any-risk), where you can find details on whether TiCDC supports replicating large transactions and any associated risks.
diff --git a/tidb-lightning/tidb-lightning-physical-import-mode.md b/tidb-lightning/tidb-lightning-physical-import-mode.md
index 78a0e26d450c3..96b1da0e22d06 100644
--- a/tidb-lightning/tidb-lightning-physical-import-mode.md
+++ b/tidb-lightning/tidb-lightning-physical-import-mode.md
@@ -89,9 +89,7 @@ It is recommended that you allocate CPU more than 32 cores and memory greater th
 
 - TiDB Lightning earlier than v5.4.0 cannot import tables of `charset=GBK`.
 
-- When you use TiDB Lightning with TiCDC, note the following:
-
-    - TiCDC cannot capture the data inserted in the physical import mode.
+- For considerations when using TiDB Lightning together with TiCDC, please refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
 
 - When you use TiDB Lightning with BR, note the following:
 

From 8cabf03aa0491320b70a12294c8eb2282356854d Mon Sep 17 00:00:00 2001
From: lidezhu <47731263+lidezhu@users.noreply.github.com>
Date: Fri, 15 Aug 2025 12:13:09 +0800
Subject: [PATCH 2/7] Update ticdc/ticdc-faq.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 ticdc/ticdc-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index b13846a33b069..f8326a2b02897 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -371,7 +371,7 @@ When upstream write traffic is at peak hours, the downstream may fail to consume
 
 ## What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?
 
-TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Since this import method does not involve the regular data writing process, it does not generate change log records. In most cases, changefeed cannot observe this kinds of data changes. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kinds of data changes be observed. Therefore, changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
+TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Since this import method does not involve the regular data writing process, it does not generate change log records. In most cases, a changefeed cannot observe this kind of data change. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kind of data change be observed. Therefore, a changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
 
 If the tables operated by TiDB Lightning Physical Import Mode overlap with the tables monitored by the changefeed, various unknown errors may occur due to incomplete data capture, such as changefeed synchronization stalling or data inconsistency between upstream and downstream. If you need to use TiDB Lightning Physical Import Mode to import tables synchronized by TiCDC, follow these steps:
 

From d7c9724bc92cdd5af6f08c7ae4d5850b23b8f0c9 Mon Sep 17 00:00:00 2001
From: lidezhu <47731263+lidezhu@users.noreply.github.com>
Date: Fri, 15 Aug 2025 12:13:36 +0800
Subject: [PATCH 3/7] Update ticdc/ticdc-faq.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 ticdc/ticdc-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index f8326a2b02897..5a5eda78bd85a 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -373,7 +373,7 @@ When upstream write traffic is at peak hours, the downstream may fail to consume
 
 TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Since this import method does not involve the regular data writing process, it does not generate change log records. In most cases, a changefeed cannot observe this kind of data change. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kind of data change be observed. Therefore, a changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
 
-If the tables operated by TiDB Lightning Physical Import Mode overlap with the tables monitored by the changefeed, various unknown errors may occur due to incomplete data capture, such as changefeed synchronization stalling or data inconsistency between upstream and downstream. If you need to use TiDB Lightning Physical Import Mode to import tables synchronized by TiCDC, follow these steps:
+If the tables operated by TiDB Lightning Physical Import Mode overlap with the tables monitored by the changefeed, various unknown errors may occur due to incomplete data capture, such as a changefeed stalling or data inconsistency between upstream and downstream. If you need to use TiDB Lightning Physical Import Mode to import tables replicated by TiCDC, follow these steps:
 
 1. Remove the TiCDC replication task related to these tables.
 

From 6cc903bd28fadc18bd74b96547fcd4044d73431f Mon Sep 17 00:00:00 2001
From: lidezhu <47731263+lidezhu@users.noreply.github.com>
Date: Fri, 15 Aug 2025 12:17:56 +0800
Subject: [PATCH 4/7] Update ticdc/ticdc-faq.md

---
 ticdc/ticdc-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index 5a5eda78bd85a..cbcfc5fb2f18d 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -389,7 +389,7 @@ If you can confirm that the tables operated by TiDB Lightning Physical Import Mo
 
 ## What are the compatibility limitations between BR (Backup & Restore) and TiCDC?
 
-BR (Backup & Restore) also directly generates SST files and imports them into the TiKV cluster. Changefeeds do not guarantee the ability to fully capture data imported in this manner. For details, refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
+BR (Backup & Restore) also directly generates SST files and imports them into the TiKV cluster. A changefeed cannot fully capture this kind of data change. For details, refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
 
 Different versions of BR handle this differently:
 

From a1e3eb66046f5af99745f3baca0d8ef32a259658 Mon Sep 17 00:00:00 2001
From: lidezhu <47731263+lidezhu@users.noreply.github.com>
Date: Fri, 15 Aug 2025 12:19:40 +0800
Subject: [PATCH 5/7] Update ticdc/ticdc-faq.md

---
 ticdc/ticdc-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index cbcfc5fb2f18d..4fc3541e30233 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -389,7 +389,7 @@ If you can confirm that the tables operated by TiDB Lightning Physical Import Mo
 
 ## What are the compatibility limitations between BR (Backup & Restore) and TiCDC?
 
-BR (Backup & Restore) also directly generates SST files and imports them into the TiKV cluster. A changefeed cannot fully capture this kind of data change. For details, refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
+BR (Backup & Restore) also directly generates SST files and imports them into the TiKV cluster. A changefeed cannot fully capture data imported in this manner. For details, refer to [What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?](/ticdc/ticdc-faq.md#what-are-the-compatibility-limitations-between-tidb-lightning-physical-import-mode-and-ticdc).
 
 Different versions of BR handle this differently:
 

From 67937d210c99a36956765cfa8e8ff07fc81f6f26 Mon Sep 17 00:00:00 2001
From: lidezhu <47731263+lidezhu@users.noreply.github.com>
Date: Fri, 15 Aug 2025 12:20:04 +0800
Subject: [PATCH 6/7] Update ticdc/ticdc-faq.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 ticdc/ticdc-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index 4fc3541e30233..bafd31f91f609 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -395,7 +395,7 @@ Different versions of BR handle this differently:
 
 - For versions before v8.2.0, if a changefeed task already exists on the cluster, BR will refuse to create a restore task.
 
-- Starting from v8.2.0, a restore task is only allowed to be created if the backupTs of the data restored by BR is earlier than the checkpointTs of all changefeeds on the cluster.
+- Starting from v8.2.0, a restore task is only allowed to be created if the `backupTs` of the data restored by BR is earlier than the `checkpointTs` of all changefeeds on the cluster.
 
 ## After a changefeed resumes from pause, its replication latency gets higher and higher and returns to normal only after a few minutes. Why?
 

From 662d9e1044cc9883ad08d77b0aa73efec0f1d05d Mon Sep 17 00:00:00 2001
From: xixirangrang
Date: Mon, 18 Aug 2025 11:14:20 +0800
Subject: [PATCH 7/7] Apply suggestions from code review

---
 ticdc/ticdc-faq.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md
index bafd31f91f609..667987174e1f5 100644
--- a/ticdc/ticdc-faq.md
+++ b/ticdc/ticdc-faq.md
@@ -371,7 +371,7 @@ When upstream write traffic is at peak hours, the downstream may fail to consume
 
 ## What are the compatibility limitations between TiDB Lightning Physical Import Mode and TiCDC?
 
-TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Since this import method does not involve the regular data writing process, it does not generate change log records. In most cases, a changefeed cannot observe this kind of data change. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kind of data change be observed. Therefore, a changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
+TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) directly generates SST files and imports them into the TiKV cluster. Because this import method does not involve the regular data writing process, it does not generate change log records. In most cases, a changefeed cannot observe this kind of data change. Only during the initialization phase of the changefeed or when Region changes (such as split/merge/leader transfer) trigger incremental scans might this kind of data change be observed. Therefore, a changefeed cannot fully capture data imported via TiDB Lightning Physical Import Mode.
 
 If the tables operated by TiDB Lightning Physical Import Mode overlap with the tables monitored by the changefeed, various unknown errors may occur due to incomplete data capture, such as a changefeed stalling or data inconsistency between upstream and downstream. If you need to use TiDB Lightning Physical Import Mode to import tables replicated by TiCDC, follow these steps:
 
@@ -379,7 +379,7 @@ If the tables operated by TiDB Lightning Physical Import Mode overlap with the t
 
 2. Use TiDB Lightning physical import mode to restore data separately in the upstream and downstream clusters of TiCDC.
 
-3. After the restoration is complete and the data consistency of the corresponding tables in the upstream and downstream clusters has been verified, use the timestamp (TSO) after the completion of TiDB Lightning Physical Import Mode as the start-ts of the TiCDC replication task to create a new TiCDC replication task for incremental replication.
+3. After the restoration is complete and the data consistency of the corresponding tables in the upstream and downstream clusters has been verified, use the timestamp (TSO) after the completion of TiDB Lightning Physical Import Mode as the `start-ts` of the TiCDC replication task to create a new TiCDC replication task for incremental replication.
 
     ```shell
     cdc cli changefeed create -c "upstream-to-downstream-some-tables" --start-ts=431434047157698561 --sink-uri="mysql://root@127.0.0.1:4000?time-zone="
@@ -393,7 +393,7 @@ BR (Backup & Restore) also directly generates SST files and imports them into th
 
 Different versions of BR handle this differently:
 
-- For versions before v8.2.0, if a changefeed task already exists on the cluster, BR will refuse to create a restore task.
+- Before v8.2.0, if a changefeed task already exists on the cluster, BR refuses to create a restore task.
 
 - Starting from v8.2.0, a restore task is only allowed to be created if the `backupTs` of the data restored by BR is earlier than the `checkpointTs` of all changefeeds on the cluster.
 
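
The v8.2.0 rule in the final FAQ text reduces to one comparison per changefeed, which may be easier to review as code. The following is a minimal Python sketch of that rule as the patch words it, not BR's actual implementation. `restore_allowed` and `tso_to_physical_ms` are hypothetical helper names, and treating a cluster with no changefeeds as "allowed" is an assumption. The only outside fact used is that a TiDB TSO keeps its physical millisecond timestamp in the bits above the low 18 logical bits.

```python
from typing import Iterable

# A TiDB TSO packs a physical timestamp (ms since the Unix epoch) in the high
# bits and an 18-bit logical counter in the low bits.
TSO_LOGICAL_BITS = 18


def tso_to_physical_ms(tso: int) -> int:
    """Return the physical millisecond part of a TSO (hypothetical helper)."""
    return tso >> TSO_LOGICAL_BITS


def restore_allowed(backup_ts: int, checkpoint_tss: Iterable[int]) -> bool:
    """Sketch of the v8.2.0 rule described in the FAQ text.

    A restore is allowed only if backup_ts is strictly earlier than the
    checkpointTs of every changefeed. With no changefeeds, all() over an
    empty sequence is True, so the restore is allowed (an assumption).
    """
    return all(backup_ts < checkpoint_ts for checkpoint_ts in checkpoint_tss)


if __name__ == "__main__":
    # Illustrative values only; real TSOs come from the cluster.
    print(restore_allowed(100, [200, 300]))
    print(restore_allowed(250, [200, 300]))
```

Because the comparison is strict, a backup taken exactly at a changefeed's checkpoint is rejected in this sketch; the patch text says "earlier than", so that boundary follows the wording rather than any verified BR behavior.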