DOC-12667 Server XDCR and mobile coexistence for 7_6_6 #3801

Draft
wants to merge 21 commits into
base: release/7.6
2 changes: 2 additions & 0 deletions modules/ROOT/nav.adoc
@@ -82,6 +82,8 @@ include::third-party:partial$nav.adoc[]
**** xref:learn:clusters-and-availability/xdcr-filtering.adoc[XDCR Advanced Filtering]
**** xref:learn:clusters-and-availability/xdcr-conflict-resolution.adoc[XDCR Conflict Resolution]
**** xref:learn:clusters-and-availability/xdcr-with-scopes-and-collections.adoc[XDCR with Scopes and Collections]
**** xref:learn:clusters-and-availability/xdcr-enable-crossclusterversioning.adoc[XDCR enableCrossClusterVersioning]
**** xref:learn:clusters-and-availability/xdcr-active-active-sgw.adoc[XDCR Active-Active with Sync Gateway]
*** xref:learn:clusters-and-availability/groups.adoc[Server Group Awareness]
* xref:learn:security/security-overview.adoc[Security]
** xref:learn:security/authentication.adoc[Authentication]
@@ -0,0 +1,77 @@
= XDCR Active-Active with Sync Gateway
:description: pass:q[With XDCR Active-Active with Sync Gateway, you can run Sync Gateway (SGW) and mobile applications on both the source and target clusters of an active-active XDCR setup.]

[abstract]
{description}

[#xdcr-active-active-sgw-intro]
== Introduction

In versions earlier than Server 7.6.6 and Sync Gateway (SGW) 4.0.0, only an active-passive setup was supported when XDCR and SGW were used together. Server 7.6.6 introduces XDCR Active-Active replication with Sync Gateway for XDCR-Mobile interoperability: you can configure an active-active XDCR setup with SGW and mobile applications on both the XDCR source and target clusters. This setup requires Server 7.6.6 or later and SGW 4.0.0 or later.
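
The version requirement above can be expressed as a simple pre-flight check. The following is a minimal sketch, not part of any Couchbase tool: the function names are hypothetical, and it assumes version strings of the form `major.minor.patch`, optionally followed by a hyphenated build suffix.

```python
MIN_SERVER = (7, 6, 6)   # minimum Server version stated in this section
MIN_SGW = (4, 0, 0)      # minimum Sync Gateway version stated in this section


def parse_version(text):
    """Turn '7.6.6' or '7.6.6-1234' into (7, 6, 6); any suffix after '-' is ignored."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def supports_active_active(server_version, sgw_version):
    """True when both versions meet the minimums for XDCR Active-Active with SGW."""
    return (parse_version(server_version) >= MIN_SERVER
            and parse_version(sgw_version) >= MIN_SGW)


if __name__ == "__main__":
    print(supports_active_active("7.6.6", "4.0.0"))
    print(supports_active_active("7.6.5", "4.0.0"))
```

Tuples compare element by element, so the check also holds for later releases without any special casing.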

[IMPORTANT]
====
The XDCR Active-Active with Sync Gateway feature has the following limitations:

* If a document contains more than 10 user-created extended attributes (user xattrs), you cannot use _XDCR Active-Active with Sync Gateway_ with that document. This is due to an internal limitation in how extended attributes are managed in a document. If you use the feature anyway, XDCR **silently skips** replicating any document that has more than 10 user xattrs. As a result, the skipped documents are not consistent between the source and target clusters. The only indication that a skip occurred is that the Prometheus stat **subdoc_cmd_docs_skipped** is incremented.
* The Eventing Service cannot yet be used with Sync Gateway 4.0+ or in a bidirectional XDCR environment, because a metadata update occurs that causes XDCR to ping-pong and never stop replicating in an Active-Active XDCR environment. However, one-way replication is possible.
====
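
Because the skip is silent, it can help to check documents for the 10 user xattr limit before enabling the feature. The following is a minimal sketch that works on xattrs already loaded into a dictionary. The function names are hypothetical, and the code assumes that system xattr keys start with `_` and virtual xattr keys start with `$`, so every other key counts as a user xattr.

```python
MAX_USER_XATTRS = 10  # limit stated in this section


def user_xattr_keys(xattrs):
    """Return the keys counted as user xattrs.

    Assumption: keys starting with "_" are system xattrs and keys
    starting with "$" are virtual xattrs; neither counts towards the limit.
    """
    return [key for key in xattrs if not key.startswith(("_", "$"))]


def would_be_skipped(xattrs):
    """True when the document has more than 10 user xattrs."""
    return len(user_xattr_keys(xattrs)) > MAX_USER_XATTRS


def find_at_risk(docs):
    """docs maps document ID -> xattr dict; returns the IDs over the limit, sorted."""
    return sorted(doc_id for doc_id, xattrs in docs.items()
                  if would_be_skipped(xattrs))


if __name__ == "__main__":
    sample = {
        "over": {f"attr{i}": i for i in range(11)},
        "at_limit": {f"attr{i}": i for i in range(10)},
    }
    print(find_at_risk(sample))
```

A document with exactly 10 user xattrs is not flagged, because the limitation applies only to documents with more than 10.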

You can configure XDCR Active-Active with Sync Gateway for XDCR-Mobile interoperability using one of the following methods:

* Greenfield deployment: Set up a new active-active configuration with both XDCR and SGW.
* Upgrading an existing setup: Convert an existing active-passive XDCR-SGW configuration to an active-active XDCR-SGW setup.

NOTE: When using XDCR Active-Active with Sync Gateway 4.0+ and Server 7.6.6, the XDCR inbound user on the replication target must have both the XDCR Inbound role and the Data Writer role.

[#xdcr-active-active-sgw-prerequisites]
== Prerequisites

Before you can use the replication setting `mobile=Active`, you must enable the bucket property `enableCrossClusterVersioning`. To enable `enableCrossClusterVersioning` using the REST API or from the UI, see xref:learn:clusters-and-availability/xdcr-enable-crossclusterversioning.adoc[XDCR enableCrossClusterVersioning].

[#xdcr-active-active-sgw-greenfield-deployment]
== Greenfield Deployment

To configure a new active-active XDCR with SGW setup, do the following:

. Create two clusters, for example, cluster A and cluster B, with _all_ the nodes of both clusters running Server 7.6.6 or a higher version (or upgrade existing Server clusters to 7.6.6 or a higher version).
. Create the buckets between which XDCR will be set up, for example, B1 in cluster A and B2 in cluster B. Then do the following:
.. Enable the ECCV setting on B1. From this point on, all mutations in B1 carry new metadata called HLV.
.. Enable the ECCV setting on B2. From this point on, all mutations in B2 carry new metadata called HLV.
+
NOTE: ECCV refers to the bucket property `enableCrossClusterVersioning`.
+
. Create an XDCR replication from B1 to B2 with `mobile=Active`, and another from B2 to B1 with `mobile=Active`. For information about creating an XDCR replication, see xref:manage:manage-xdcr/create-xdcr-replication.adoc[Create a Replication].
. Configure SGW 4.0.0 on each cluster, cluster A and cluster B.

This setup can handle application traffic on both buckets B1 and B2 of the respective clusters along with SGW import into both the buckets simultaneously.
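
The greenfield steps above can be written down as an ordered plan before anything is executed. The following is a minimal sketch that only builds plain data and sends nothing to a cluster. The function name, the `action` labels, and the dictionary keys are hypothetical; only `enableCrossClusterVersioning`, `mobile=Active`, the example names, and the SGW version come from this page.

```python
def greenfield_plan(cluster_a="A", bucket_a="B1", cluster_b="B", bucket_b="B2"):
    """Return the greenfield steps as ordered plain data; nothing is sent anywhere."""
    ends = [(cluster_a, bucket_a), (cluster_b, bucket_b)]
    plan = []

    # Enable ECCV on both buckets before any replication is created.
    for cluster, bucket in ends:
        plan.append({
            "cluster": cluster,
            "action": "enable-eccv",            # hypothetical label
            "bucket": bucket,
            "settings": {"enableCrossClusterVersioning": "true"},
        })

    # One replication per direction, each with mobile=Active.
    for (src_cluster, src_bucket), (dst_cluster, dst_bucket) in (ends, ends[::-1]):
        plan.append({
            "cluster": src_cluster,
            "action": "create-replication",     # hypothetical label
            "from_bucket": src_bucket,
            "to_cluster": dst_cluster,
            "to_bucket": dst_bucket,
            "settings": {"mobile": "Active"},
        })

    # Sync Gateway 4.0.0 on each cluster.
    for cluster, _bucket in ends:
        plan.append({
            "cluster": cluster,
            "action": "configure-sgw",          # hypothetical label
            "settings": {"version": "4.0.0"},
        })
    return plan


if __name__ == "__main__":
    for number, step in enumerate(greenfield_plan(), start=1):
        print(number, step["cluster"], step["action"], step["settings"])
```

Keeping the plan as data makes the ordering easy to review: both ECCV steps always precede both replications.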

[#xdcr-active-active-sgw-upgrade]
== Upgrading an Existing Setup

Convert an existing active-passive XDCR-SGW setup into an active-active XDCR-SGW setup.

For illustration, assume two clusters, A and B. An SGW is connected to cluster A, which is the active cluster. Cluster B is passive, with XDCR set up from bucket B1 in cluster A to bucket B2 in cluster B. At this point, application traffic should go only to bucket B1 of cluster A.

.Replication before upgrade: XDCR Active-Passive with SGW
image::clusters-and-availability/xdcr-active-sgw-before-upgrade.png[,720,align=left]

. Upgrade _all_ the nodes of both clusters, A and B, to Server 7.6.6 or a higher version.
. Enable ECCV on bucket B1. From this point on, all mutations in B1 carry new metadata called HLV.
+
NOTE: ECCV refers to the bucket property `enableCrossClusterVersioning`.
+
. Enable ECCV on bucket B2. From this point on, all mutations in B2 carry new metadata called HLV.
. Update the replication settings of the existing XDCR from B1 to B2 to `mobile=Active`.
. Create an XDCR from B2 to B1 with the replication setting `mobile=Active`.
. Upgrade SGW on cluster A to version 4.0.0.
. Connect SGW version 4.0.0 to cluster B.
. Enable active application traffic on cluster B.

This setup can handle application traffic on both buckets B1 and B2 of the respective clusters along with SGW import into both the buckets simultaneously.
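
The order of the upgrade steps matters, because `mobile=Active` depends on ECCV being enabled first. The following is a minimal sketch of a check for a planned sequence of steps. The step tuples and the function name are hypothetical, and the sketch assumes, as the step order above suggests, that ECCV must be enabled on both the source and the target bucket before a replication between them is set to `mobile=Active`.

```python
def check_upgrade_order(steps):
    """Check that ECCV is enabled before mobile=Active is used.

    steps is a list of tuples such as ("eccv", "B1") or
    ("mobile_active", "B1", "B2"); other step kinds are ignored.
    Returns a list of problems; an empty list means the order is consistent.
    """
    eccv_enabled = set()
    problems = []
    for index, step in enumerate(steps, start=1):
        kind = step[0]
        if kind == "eccv":
            eccv_enabled.add(step[1])
        elif kind == "mobile_active":
            _, source, target = step
            missing = [b for b in (source, target) if b not in eccv_enabled]
            if missing:
                problems.append(
                    f"step {index}: ECCV not yet enabled on {', '.join(missing)}"
                )
    return problems


if __name__ == "__main__":
    planned = [
        ("eccv", "B1"),
        ("eccv", "B2"),
        ("mobile_active", "B1", "B2"),
        ("mobile_active", "B2", "B1"),
        ("upgrade_sgw", "A"),
    ]
    print(check_upgrade_order(planned))
```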

This is an illustration of the final configuration:

.Replication after upgrade: XDCR Active-Active with SGW
image::clusters-and-availability/xdcr-active-sgw-after-upgrade.png[,720,align=left]

@@ -1,49 +1,68 @@
= XDCR Conflict Resolution
:description: pass:q[_XDCR Conflict Resolution_ automatically synchronizes document-copies that have been modified in different ways at different locations.]
:description: pass:q[In an XDCR process, modified documents are copied or replicated from the source bucket (or collection) to the target bucket (or collection).]
:page-aliases: xdcr:xdcr-conflict-resolution,xdcr:xdcr-timestamp-based-conflict-resolution

[abstract]
{description}
If a target document with the same document ID as the source document already exists, the XDCR conflict resolution process determines whether the source document replaces the target document or not.

[#conflicts_and_their_resolution]
== Conflicts and Their Resolution

A _conflict_ is caused when the source and target copies of an XDCR-replicated document are updated independently of and dissimilarly to one another, each by a local application.
The conflict must be _resolved_, by determining which of the variants should prevail; and then correspondingly saving both documents in identical form.
XDCR provides an automated _conflict resolution_ process.
A conflict occurs when, during an XDCR process, a document in the target bucket (or collection) has the same document ID as a document in the source bucket (or collection). XDCR provides an automated conflict resolution process. XDCR supports the following two alternative conflict resolution policies:

Two, alternative conflict resolution policies are supported: _sequence-number-based_ (which is the default), and _timestamp-based_.
Note that _timestamp-based_ conflict resolution is only available in the Enterprise Edition of Couchbase Server.
* Sequence number-based conflict resolution (the default policy).
* Timestamp-based conflict resolution.

[#the_conflict_resolution_process]
== The Conflict Resolution Process

When a source document is modified, XDCR determines whether this revision of the document should be applied to the target.
For documents above 256 bytes in size, XDCR fetches metadata from the target cluster before replicating.
The target metadata for the document is compared with the source metadata for the document, in order to choose which document should prevail (the exact subset of metadata used in this comparison depends on the source bucket's _conflict resolution policy_).
If the source document prevails, it is replicated to the target; if the target document prevails, the source document is not replicated.
The conflict resolution process differs depending on whether the `enableCrossClusterVersioning` property (ECCV property) is enabled or disabled.

Once a replicated document reaches the target, the target cluster also performs a metadata comparison as described, in order to confirm that the document from the source cluster should indeed prevail. If this is confirmed, the document from the source cluster is applied to the target cluster, and the target cluster's previous version of the document is discarded.
[#eccv-false-for-conflict-resolution-process]
=== When enableCrossClusterVersioning is false for a Conflict Resolution Process

As a performance optimization, XDCR makes no metadata comparison on the source for documents of 256 bytes or less, thus making unnecessary a metadata fetch from the target cluster: instead, the document is replicated immediately to the target, and metadata comparison is performed there.
When a source document is modified, XDCR determines whether this revision of the document must be applied to the target. For documents larger than 256 bytes, XDCR fetches metadata from the target cluster before replicating. The target metadata of the document is then compared with the source metadata of the document to choose the document that must prevail (the exact subset of metadata used in this comparison depends on the conflict resolution policy of the source bucket). If the source document prevails, it is replicated to the target cluster; if the target document prevails, the source document is not replicated.

If a document is deleted on the source, XDCR makes no metadata comparison on the source before replication.
After the replicated document reaches the target, the target cluster also performs a metadata comparison, as described earlier, to confirm whether the document from the source cluster must prevail. If so, the document from the source cluster is applied to the target cluster, and the target cluster's previous version of the document is discarded.

Once configured, conflict resolution is a fully automated process, requiring no manual intervention.
As a performance optimization, XDCR makes no metadata comparison on the source cluster for documents of 256 bytes or less. As a result, the following occurs:

* A metadata fetch from the target cluster is made unnecessary.
* The source document is replicated immediately to the target cluster.
* A metadata comparison is performed in the target cluster.

This behavior is called XDCR optimistic replication. The default optimistic replication threshold is 256 bytes.

NOTE: XDCR optimistic replication behavior applies only when the `enableCrossClusterVersioning` property is not in use.

Once configured, conflict resolution is a fully automated process, which requires no manual intervention.

[#eccv-true-for-conflict-resolution-process]
=== When enableCrossClusterVersioning is true for a Conflict Resolution Process

XDCR optimistic replication behavior, as described earlier, applies only when the `enableCrossClusterVersioning` property is not in use.

When the `enableCrossClusterVersioning` property is set to `true` (enabled), the hybrid logical vector (HLV) metadata is created and maintained for each document in the document xattrs (system created extended attributes). This HLV metadata needs to be checked and updated for both the source and the target documents. As a result, the XDCR processing retrieves metadata from the target cluster for every document of any size that undergoes replication. For more information on the `enableCrossClusterVersioning` property and the HLV metadata, see xref:clusters-and-availability/xdcr-enable-crossclusterversioning.adoc[XDCR enableCrossClusterVersioning].
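
The two cases described above reduce to one decision: whether the source side fetches target metadata before replicating a document. The following is a minimal sketch of that decision only, with a hypothetical function name. It models the rule as stated on this page and is not the XDCR implementation.

```python
OPTIMISTIC_THRESHOLD_BYTES = 256  # default optimistic replication threshold


def source_fetches_target_metadata(doc_size_bytes, eccv_enabled,
                                   threshold=OPTIMISTIC_THRESHOLD_BYTES):
    """Decide whether target metadata is fetched before replication.

    - ECCV enabled: metadata is fetched for every document, whatever its size.
    - ECCV not in use: documents at or below the threshold are replicated
      optimistically, and the comparison happens on the target only.
    """
    if eccv_enabled:
        return True
    return doc_size_bytes > threshold


if __name__ == "__main__":
    for size, eccv in [(100, False), (256, False), (257, False), (100, True)]:
        print(size, eccv, source_fetches_target_metadata(size, eccv))
```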

[#revision-id-based-conflict-resolution]
== Conflict Resolution Based on Sequence Number

Conflicts can be resolved by referring to documents' _sequence numbers_.
Sequence numbers are maintained per document, and are incremented on every document-update.
A document's sequence number is stored as part of its _metadata_: specifically, as the value of the `rev` key (see xref:manage:manage-ui/manage-ui.adoc#console-documents[Documents], for details on how to inspect metadata).
A document's sequence number is stored as a part of its _metadata_: specifically, as the value of the `rev` key (see xref:manage:manage-ui/manage-ui.adoc#console-documents[Documents], for details on how to inspect metadata).
The sequence numbers of source and target documents are compared; and the document with the higher sequence number prevails.
If both documents have the same sequence number, the conflict is resolved by comparing the following metadata-elements, in the order shown:

. CAS value
. Expiration (TTL) value
. Document flags
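
The comparison order described above maps naturally onto tuple comparison. The following is a minimal sketch with hypothetical names (`DocMeta`, `source_prevails`). It ignores HLV metadata and assumes that the target document is kept when all four values are identical.

```python
from typing import NamedTuple


class DocMeta(NamedTuple):
    """Metadata fields in the order in which they are compared."""
    rev: int         # sequence number (the `rev` metadata key)
    cas: int         # CAS value
    expiration: int  # expiration (TTL) value
    flags: int       # document flags


def source_prevails(source: DocMeta, target: DocMeta) -> bool:
    """True when the source document wins.

    Tuples compare field by field, so a later field is consulted only when
    all earlier fields are equal. Assumption: on a complete tie the target
    is kept, so the function returns False.
    """
    return tuple(source) > tuple(target)


if __name__ == "__main__":
    source = DocMeta(rev=5, cas=100, expiration=0, flags=0)
    target = DocMeta(rev=5, cas=90, expiration=0, flags=0)
    print(source_prevails(source, target))
```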

[#eccv-true-in-sequence-number-based-conflict-resolution]
=== When enableCrossClusterVersioning is true for a Sequence Number-Based Conflict Resolution Process

When the `enableCrossClusterVersioning` property is set to `true`, the hybrid logical vector (HLV) metadata, located in the document xattrs (system-created extended attributes) of the source and the target, is also used in sequence number-based conflict resolution. How the HLV metadata is used depends on the features or options used in the XDCR process. For more information on the `enableCrossClusterVersioning` property and the HLV metadata, see xref:clusters-and-availability/xdcr-enable-crossclusterversioning.adoc[XDCR enableCrossClusterVersioning].

[#timestamp-based-conflict-resolution]
== Timestamp-Based Conflict Resolution

@@ -57,6 +76,11 @@ If both document-versions have the same timestamp-value, the conflict is resolve
. Expiration (TTL) value
. Document flags

[#eccv-true-in-timestamp-based-conflict-resolution]
=== When enableCrossClusterVersioning is true for a Timestamp-Based Conflict Resolution Process

When the `enableCrossClusterVersioning` property is set to `true`, the hybrid logical vector (HLV) metadata, located in the document xattrs (system-created extended attributes) of the source and the target, is also used in timestamp-based conflict resolution. How the HLV metadata is used depends on the features or options used in the XDCR process. For more information on the `enableCrossClusterVersioning` property and the HLV metadata, see xref:clusters-and-availability/xdcr-enable-crossclusterversioning.adoc[XDCR enableCrossClusterVersioning].

[#time-synchronization]
=== Time Synchronization

@@ -84,7 +108,7 @@ Each mutation has its own HLC timestamp.
[#ensuring_safe_failover]
=== Ensuring Safe Failover

When failover (say, from data center A to data center B) is required, timestamp-based conflict resolution requires that applications redirect traffic to data center B only after the greater of the following two time-periods has elapsed:
When failover of an application is required (say, from data center A to data center B), timestamp-based conflict resolution requires that applications redirect traffic to data center B only after the greater of the following two time-periods has elapsed:

* The replication latency between data centers A and B.
This provides sufficient time for any _in-flight_ mutations to be received by data center B prior to traffic redirection.
@@ -98,8 +122,15 @@ When availability is restored to data center A, applications must wait for the s

Conflict resolution policy is configured on a per-bucket basis at bucket creation time; it cannot be changed later.
For more information, see xref:manage:manage-buckets/create-bucket.adoc[Create a Bucket].
Choosing a conflict resolution method requires consideration of the logic of the applications that require the data.
This is illustrated by the following examples:

[IMPORTANT]
====
* You must select the same conflict resolution policy for all the buckets in the replication topology, because a replication can be created only between buckets that have the same conflict resolution policy.
* When creating a bucket, choose the conflict resolution policy explicitly. If you do not choose a policy, the sequence number-based conflict resolution policy is set by default.
* After the bucket is created, you cannot change the conflict resolution policy for that bucket. In general, the timestamp-based conflict resolution policy is preferred because its logic is easier to understand, it suits general use cases, and it is preferred for working with the latest Server features.
====
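
The first rule above can be checked across a whole replication topology before any replication is created. The following is a minimal sketch with hypothetical function names. The policy labels `seqno` and `lww` are used here only as illustrative values for the sequence number-based and timestamp-based policies.

```python
def replication_allowed(source_policy, target_policy):
    """A replication can be created only between buckets with the same policy."""
    return source_policy == target_policy


def mismatched_links(bucket_policies, links):
    """Return the (source, target) pairs whose buckets use different policies.

    bucket_policies maps bucket name -> policy label.
    links is a list of (source_bucket, target_bucket) pairs.
    """
    return [
        (source, target)
        for source, target in links
        if not replication_allowed(bucket_policies[source], bucket_policies[target])
    ]


if __name__ == "__main__":
    policies = {"B1": "seqno", "B2": "seqno", "B3": "lww"}
    planned = [("B1", "B2"), ("B2", "B1"), ("B2", "B3")]
    print(mismatched_links(policies, planned))
```

Because the policy cannot be changed after bucket creation, a mismatch found here means that one of the buckets has to be recreated with the other policy.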

The following examples illustrate how the two different conflict resolution policies apply:

* _Sequence-Number-based_, whereby the document with the higher number of updates wins.
A hit-counter, for a website, is stored as a document within Couchbase Server: a value within the document is incremented each time the website is accessed.
@@ -114,9 +145,9 @@ Therefore, in this instance, timestamp-based conflict resolution should be used,
[#aligning_source_and_target_policies]
== Aligning Source and Target Policies

XDCR replications cannot be created between buckets with different conflict resolution policies: source and target buckets must always be configured with the same policy.
XDCR replications cannot be created between buckets with different conflict resolution policies. The source and target buckets must always be configured with the same conflict resolution policy.

When using XDCR with a source cluster running a pre-4.6 version of Couchbase Server, only conflict resolution based on _sequence numbers_ can be used.
When creating a bucket, choose the conflict resolution policy explicitly. If you do not choose a policy, the sequence number-based conflict resolution policy is set by default. After the bucket is created, you cannot change the conflict resolution policy for that bucket. In general, the timestamp-based conflict resolution policy is preferred because its logic is easier to understand.

[#monitoring-conflict-resolution]
== Monitoring Conflict Resolution on the Target Cluster