23 changes: 8 additions & 15 deletions modules/ROOT/nav.adoc
@@ -8,30 +8,23 @@

.{product}
* xref:ROOT:introduction.adoc[]
* Planning
* Plan your migration
** xref:ROOT:feasibility-checklists.adoc[]
** xref:ROOT:deployment-infrastructure.adoc[]
** xref:ROOT:create-target.adoc[]
** xref:ROOT:rollback.adoc[]
* Phase 1
* Phase 1: Deploy {product-proxy}
** xref:ROOT:phase1.adoc[]
** xref:ROOT:setup-ansible-playbooks.adoc[]
** xref:ROOT:deploy-proxy-monitoring.adoc[]
** xref:ROOT:tls.adoc[]
** xref:ROOT:connect-clients-to-proxy.adoc[]
** xref:ROOT:metrics.adoc[]
** xref:ROOT:manage-proxy-instances.adoc[]
* Phase 2
** xref:ROOT:migrate-and-validate-data.adoc[]
** xref:sideloader:sideloader-zdm.adoc[]
** xref:ROOT:cassandra-data-migrator.adoc[]
** xref:ROOT:dsbulk-migrator.adoc[]
* Phase 3
** xref:ROOT:enable-async-dual-reads.adoc[]
* Phase 4
** xref:ROOT:change-read-routing.adoc[]
* Phase 5
** xref:ROOT:connect-clients-to-target.adoc[]
* xref:ROOT:migrate-and-validate-data.adoc[]
* xref:ROOT:enable-async-dual-reads.adoc[]
* xref:ROOT:change-read-routing.adoc[]
* xref:ROOT:connect-clients-to-target.adoc[]
* xref:ROOT:troubleshooting-tips.adoc[]
* xref:ROOT:faqs.adoc[]
* Release notes
@@ -47,8 +40,8 @@
* xref:sideloader:troubleshoot-sideloader.adoc[]

.{cass-migrator}
* xref:ROOT:cdm-overview.adoc[]
* xref:ROOT:cassandra-data-migrator.adoc[]
* {cass-migrator-repo}/releases[{cass-migrator-short} release notes]

.{dsbulk-migrator}
* xref:ROOT:dsbulk-migrator-overview.adoc[]
* xref:ROOT:dsbulk-migrator.adoc[]
349 changes: 346 additions & 3 deletions modules/ROOT/pages/cassandra-data-migrator.adoc

Large diffs are not rendered by default.

4 changes: 0 additions & 4 deletions modules/ROOT/pages/cdm-overview.adoc

This file was deleted.

194 changes: 140 additions & 54 deletions modules/ROOT/pages/change-read-routing.adoc
@@ -1,96 +1,182 @@
= Route reads to the target
= Phase 4: Route reads to the target

This topic explains how you can configure {product-proxy} to route all reads to the target cluster instead of the origin cluster.
After you migrate and validate your data in xref:ROOT:migrate-and-validate-data.adoc[Phase 2], and then test your target cluster's production readiness in xref:ROOT:enable-async-dual-reads.adoc[Phase 3], you can configure {product-proxy} to route _all_ read requests to the target cluster instead of the origin cluster.

image::migration-phase4ra9.png["Phase 4 diagram shows read routing on {product-proxy} was switched to the target."]

For illustrations of all the migration phases, see the xref:introduction.adoc#_migration_phases[Introduction].

== Steps

You would typically perform these steps once you have migrated all the existing data from the origin cluster, and completed all validation checks and reconciliation if necessary.
[IMPORTANT]
====
This phase routes production read requests to the target cluster exclusively.
Make sure all data is present on the target cluster and that the cluster is prepared to handle full-scale production workloads.
====

This operation is a configuration change that can be carried out as explained xref:manage-proxy-instances.adoc#change-mutable-config-variable[here].
image::migration-phase4ra9.png[In migration Phase 4, {product-proxy}'s read routing switches to the target cluster]

[TIP]
====
If you xref:enable-async-dual-reads.adoc[enabled asynchronous dual reads] to test your target cluster's performance, make sure that you disable asynchronous dual reads when you're done testing.
== Prerequisites

To do this, edit the `vars/zdm_proxy_core_config.yml` file, and then set the `read_mode` variable to `PRIMARY_ONLY`.
* Complete xref:ROOT:migrate-and-validate-data.adoc[Phase 2], including thorough data validation and reconciliation of any discrepancies.
+
The success of Phase 4 depends on the target cluster having all the data from the origin cluster.
+
If your migration was idle for some time after completing Phase 2, or you skipped Phase 3, {company} recommends re-validating the data on the target cluster before proceeding.

If you don't disable asynchronous dual reads, {product-proxy} instances send asynchronous, duplicate read requests to your origin cluster.
* Complete xref:ROOT:enable-async-dual-reads.adoc[Phase 3], and then disable asynchronous dual reads by setting `read_mode` to `PRIMARY_ONLY`.
+
If you don't disable asynchronous dual reads, {product-proxy} sends asynchronous, duplicate read requests to your origin cluster.
This is harmless but unnecessary.
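+
For example, a minimal sketch of the relevant line in `vars/zdm_proxy_core_config.yml` (the exact layout of your file might differ, and all other variables stay unchanged):
+
[source,yaml]
----
# vars/zdm_proxy_core_config.yml (sketch)
read_mode: PRIMARY_ONLY
----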
====

== Changing the read routing configuration
[#change-the-read-routing-configuration]
== Change the read routing configuration

If you're not there already, `ssh` back into the jumphost:
Read routing is controlled by a mutable configuration variable.
For more information, see xref:manage-proxy-instances.adoc#change-mutable-config-variable[Change a mutable configuration variable].

. Connect to your Ansible Control Host container.
+
For example, `ssh` into the jumphost:
+
[source,bash]
----
ssh -F ~/.ssh/zdm_ssh_config jumphost
----

On the jumphost, connect to the Ansible Control Host container:
+
Then, connect to the Ansible Control Host container:
+
[source,bash]
----
docker exec -it zdm-ansible-container bash
----

You will see a prompt like:
+
.Result
[%collapsible]
====
[source,bash]
----
ubuntu@52772568517c:~$
----
====

Now open the configuration file `vars/zdm_proxy_core_config.yml` for editing.

Change the variable `primary_cluster` to `TARGET`.
. Edit the {product-proxy} core configuration file: `vars/zdm_proxy_core_config.yml`.

Run the playbook that changes the configuration of the existing {product-proxy} deployment:
. Change the `primary_cluster` variable to `TARGET`.
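+
For example, a minimal sketch of the edited setting (the rest of the file stays unchanged; your file's layout might differ):
+
[source,yaml]
----
# vars/zdm_proxy_core_config.yml (sketch)
primary_cluster: TARGET
----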

. Run the rolling restart playbook to apply the configuration change to your entire {product-proxy} deployment:
+
[source,bash]
----
ansible-playbook rolling_update_zdm_proxy.yml -i zdm_ansible_inventory
----

Wait for the {product-proxy} instances to be restarted by Ansible, one by one.
All instances will now send all reads to the target cluster instead of the origin cluster.
. Wait while Ansible restarts the {product-proxy} instances, one by one.

At this point, the target cluster becomes the primary cluster, but {product-proxy} still keeps the origin cluster up-to-date through dual writes.
Once the instances are restarted, all reads are routed to the target cluster instead of the origin cluster.

== Verifying the read routing change
At this point, the target cluster is considered the primary cluster, but {product-proxy} still keeps the origin cluster synchronized through dual writes.

Once the read routing configuration change has been rolled out, you may want to verify that reads are correctly sent to the target cluster, as expected.
This is not a required step, but you may wish to do it for peace of mind.
== Verify the read routing change

[TIP]
====
Issuing a `DESCRIBE` or a read to any system table through {product-proxy} isn't a valid verification.
Once the read routing configuration change has been rolled out, you might want to verify that reads are being sent to the target cluster as expected.
This isn't required, but it can provide confirmation that the change was applied successfully.

{product-proxy} handles reads to system tables differently, by intercepting them and always routing them to the origin, in some cases partly populating them at the proxy level.
However, it is difficult to assess read routing because the purpose of {product-short} is to align the clusters and provide an invisible proxy layer between your client application and the database clusters.
By design, the data is expected to be identical on both clusters, and your client application has no awareness of which cluster is servicing its requests.

This means that system reads don't represent how {product-proxy} routes regular user reads.
Even after you switched the configuration to read the target cluster as the primary cluster, all system reads still go to the origin.
For this reason, the only way to manually test read routing is to intentionally write mismatched test data to the clusters.
Then, you can send a read request to {product-proxy} and see which cluster-specific data is returned, which indicates the cluster that received the read request.
There are two ways to do this.

Although `DESCRIBE` requests are not system requests, they are also generally resolved in a different way to regular requests, and should not be used as a means to verify the read routing behavior.
[tabs]
======
Manually create mismatched tables::
+
--
To manually create mismatched data, you can create a test table on each cluster, and then write different data to each table.

[IMPORTANT]
====
When you write the mismatched data to the tables, make sure you connect to each cluster directly.
Don't connect to {product-proxy}, because {product-proxy} will, by design, write the same data to both clusters through dual writes.
====
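
In the following steps, a direct `cqlsh` connection might look like this sketch, which uses hypothetical contact points and credentials; substitute the addresses and authentication options for your own clusters:

[source,bash]
----
# Connect directly to the origin cluster (hypothetical address and credentials)
cqlsh origin-node-1.example.com 9042 -u cassandra -p cassandra

# Connect directly to the target cluster (hypothetical address and credentials)
cqlsh target-node-1.example.com 9042 -u cassandra -p cassandra
----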

Verifying that the correct routing is taking place is a slightly cumbersome operation, due to the fact that the purpose of the {product-short} process is to align the clusters and therefore, by definition, the data will be identical on both sides.
. Create a small test table on both clusters, such as a simple key/value table.
You can use an existing keyspace, or create one for this test specifically.
For example:
+
[source,cql]
----
CREATE TABLE test_keyspace.test_table(k TEXT PRIMARY KEY, v TEXT);
----

. Use `cqlsh` to connect _directly to the origin cluster_, and then insert a row with any key and a value that is specific to the origin cluster.
For example:
+
[source,cql]
----
INSERT INTO test_keyspace.test_table(k, v) VALUES ('1', 'Hello from the origin cluster!');
----

. Use `cqlsh` to connect _directly to the target cluster_, and then insert a row with the same key and a value that is specific to the target cluster.
For example:
+
[source,cql]
----
INSERT INTO test_keyspace.test_table(k, v) VALUES ('1', 'Hello from the target cluster!');
----

. Use `cqlsh` to xref:connect-clients-to-proxy.adoc#_connecting_cqlsh_to_the_zdm_proxy[connect to {product-proxy}], and then issue a read request to your test table.
For example:
+
[source,cql]
----
SELECT * FROM test_keyspace.test_table WHERE k = '1';
----
+
The cluster-specific value in the response tells you which cluster received the read request.
For example:
+
* If the read request was correctly routed to the target cluster, the result from `test_table` contains `Hello from the target cluster!`.
* If the read request was incorrectly routed to the origin cluster, the result from `test_table` contains `Hello from the origin cluster!`.

. When you're done testing, drop the test tables from both clusters.
If you created dedicated test keyspaces, drop the keyspaces as well.
--

Use the Themis sample client application::
+
--
The xref:connect-clients-to-proxy.adoc#_themis_client[Themis sample client application] connects directly to the origin cluster, the target cluster, and {product-proxy}.
It inserts some test data in its own dedicated table.
Then, you can view the results of reads from each source.
For more information, see the https://github.com/absurdfarce/themis/blob/main/README.md[Themis README].
--
======

=== System tables cannot validate read routing

Issuing a `DESCRIBE` command or read request to any system table through {product-proxy} cannot sufficiently validate read routing.

When {product-proxy} receives system reads, it intercepts them and always routes them to the origin, regardless of the `primary_cluster` variable.
In some cases, {product-proxy} partially populates these queries at the proxy level.

This means that system reads don't represent how {product-proxy} routes regular read requests.

Although `DESCRIBE` requests aren't system reads, they are also resolved differently than regular read requests.
Don't use `DESCRIBE` requests to verify read routing behavior.

== Monitor and troubleshoot read performance

After changing read routing, monitor the performance of {product-proxy} and the target cluster to ensure reads are succeeding and meeting your performance expectations.

If read requests fail or perform poorly, you can set `primary_cluster` back to `ORIGIN` while you investigate the issue, as described in <<change-the-read-routing-configuration>>.
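
A minimal sketch of that rollback, assuming the same configuration file, playbook, and inventory used earlier in this phase:

[source,bash]
----
# In vars/zdm_proxy_core_config.yml, set the primary cluster back to the origin:
#   primary_cluster: ORIGIN
# Then apply the change with a rolling restart of the proxy instances:
ansible-playbook rolling_update_zdm_proxy.yml -i zdm_ansible_inventory
----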

If read requests fail due to missing data, go back to xref:ROOT:migrate-and-validate-data.adoc[Phase 2] and repeat your data validation and reconciliation processes as needed to rectify the missing data errors.

If your data model includes non-idempotent operations, ensure that this data is handled correctly during data migration, reconciliation, and ongoing dual writes.
For more information, see xref:ROOT:feasibility-checklists.adoc#non-idempotent-operations[Lightweight Transactions and other non-idempotent operations].

If your target cluster performs poorly, or you skipped Phase 3 previously, go back to xref:ROOT:enable-async-dual-reads.adoc[Phase 3] to test, adjust, and retest the target cluster before reattempting Phase 4.

For this reason, the only way to do a manual verification test is to force a discrepancy of some test data between the clusters.
To do this, you could consider using the xref:connect-clients-to-proxy.adoc#_themis_client[Themis sample client application].
This client application connects directly to the origin cluster, the target cluster, and {product-proxy}.
It inserts some test data in its own table, and then you can view the results of reads from each source.
Refer to the Themis README for more information.
== Next steps

Alternatively, you could follow this manual procedure:
You can stay at this phase as long as you like.
{product-proxy} continues to perform dual writes to both clusters, keeping the origin and target clusters synchronized.

* Create a small test table on both clusters, for example a simple key/value table (it could be in an existing keyspace, or in one that you create specifically for this test).
For example `CREATE TABLE test_keyspace.test_table(k TEXT PRIMARY KEY, v TEXT);`.
* Use `cqlsh` to connect *directly to the origin cluster*.
Insert a row with any key, and with a value specific to the origin cluster, for example `INSERT INTO test_keyspace.test_table(k, v) VALUES ('1', 'Hello from the origin cluster!');`.
* Now, use `cqlsh` to connect *directly to the target cluster*.
Insert a row with the same key as above, but with a value specific to the target cluster, for example `INSERT INTO test_keyspace.test_table(k, v) VALUES ('1', 'Hello from the target cluster!');`.
* Now, use `cqlsh` to xref:connect-clients-to-proxy.adoc#_connecting_cqlsh_to_the_zdm_proxy[connect to {product-proxy}], and then issue a read request for this test table: `SELECT * FROM test_keyspace.test_table WHERE k = '1';`.
The result will clearly show you where the read actually comes from.
When you're ready to complete the migration and stop using your origin cluster, proceed to xref:ROOT:connect-clients-to-target.adoc[Phase 5] to disable dual writes and cut over to the target cluster exclusively.
2 changes: 1 addition & 1 deletion modules/ROOT/pages/components.adoc
@@ -111,7 +111,7 @@ You can use these tools alone or with {product-proxy}.
{sstable-sideloader} is a service running in {astra-db} that imports data from snapshots of your existing {cass-short}-based cluster.
This tool is exclusively for migrations that move data to {astra-db}.

For more information, see xref:sideloader:sideloader-zdm.adoc[].
For more information, see xref:sideloader:sideloader-overview.adoc[].

=== {cass-migrator}

9 changes: 6 additions & 3 deletions modules/ROOT/pages/connect-clients-to-proxy.adoc
@@ -1,5 +1,4 @@
= Connect your client applications to {product-proxy}
:navtitle: Connect client applications to {product-proxy}
= Connect client applications to {product-proxy}

{product-proxy} is designed to mimic communication with a typical cluster based on {cass-reg}.
This means that your client applications connect to {product-proxy} in the same way that they already connect to your existing {cass-short}-based clusters.
@@ -246,4 +245,8 @@ If you need to provide credentials for an {astra-db} database, don't use the {sc
Instead, use the token-based authentication option explained in <<expected-authentication-credentials-for-astra-db>>.

If you include the {scb-short}, `cqlsh` ignores all other connection arguments and connects exclusively to your {astra-db} database instead of {product-proxy}.
====
====

== Next steps

After you connect your client applications to {product-proxy}, you can begin xref:ROOT:migrate-and-validate-data.adoc[Phase 2] of the migration, which is the data migration phase.
19 changes: 10 additions & 9 deletions modules/ROOT/pages/connect-clients-to-target.adoc
@@ -1,11 +1,12 @@
= Phase 5: Connect your client applications directly to the target
:navtitle: Phase 5: Connect client applications directly to the target
= Phase 5: Connect client applications to the target cluster
:navtitle: Phase 5: Connect client applications to the target

Phase 5 is the last phase of the xref:ROOT:introduction.adoc[migration process].
In this phase, you configure your client applications to connect directly and exclusively to the target cluster.
This removes the dependency on {product-proxy} and completes the migration.
Phase 5 is the last phase of the xref:ROOT:introduction.adoc[migration process], after you route all read requests to the target cluster in xref:ROOT:change-read-routing.adoc[Phase 4].

image::migration-phase5ra.png[In Phase 5, your applications no longer using the proxy and, instead, connect directly to the target.]
In this final phase, you connect your client applications directly and exclusively to the target cluster.
This removes the dependency on {product-proxy} and the origin cluster, thereby completing the migration process.

image::migration-phase5ra.png[In Phase 5, your applications no longer use the proxy and, instead, connect directly to the target cluster]

The minimum requirements for reconfiguring these connections depend on whether your target cluster is {astra-db} or a generic CQL cluster, such as {cass-reg}, {dse}, or {hcd}.

@@ -185,7 +186,7 @@ Depending on your application's requirements, you might need to make these chang

== Switch to the Data API

If you migrated to {astra-db} or {hcd-short}, and you have the option of using the Data API instead of, or in addition to, a {cass-short} driver.
If you migrated to {astra-db} or {hcd-short}, you have the option of using the Data API instead of, or in addition to, a {cass-short} driver.

Although the Data API can read and write to CQL tables, it is significantly different from driver code.
To use the Data API, you must rewrite your application code or create a new application.
@@ -201,6 +202,6 @@ For more information, see the following:

Your migration is now complete, and your target cluster is the source of truth for your client applications and data.

When you are ready, you can decommission your origin cluster and {product-proxy}, as these are no longer needed and clean xref:ROOT:rollback.adoc[rollback] is no longer possible.
When you are ready, you can decommission your origin cluster and {product-proxy} because these are no longer needed and xref:ROOT:rollback.adoc[seamless rollback] is no longer possible.

If you need to revert to the origin cluster after this point, you must perform a full migration with your previous origin cluster as the target to ensure that all data is rewritten and synchronized back to the origin.
If you need to revert to the origin cluster after this point, you must perform a full migration in the opposite direction, with your previous origin cluster as the target, to ensure that all data is rewritten and synchronized back to the origin.