DOC-13246: Sync Gateway 3.3 Partitioned Indexes Feature #864
I didn't do a full review of this; I am going to come back to it. I wanted to get this into a position for PM review.
The document was correct for the behavior as I originally documented it, but unfortunately some of that reflected the planned implementation rather than the final implementation. I tried to make it simpler when I realized how complicated it can get.
Some of the comments are specific to dev discussion with PM and might be settled off this PR. I wanted to flag them here for reference as things that should be addressed, though how to address them might take more conversation.
.Procedure
. Take the database offline via xref:rest_api_admin_static.adoc#tag/Database-Configuration/operation/post_db-_config[`POST /{db}/_config`] with `offline` set to `true`.

. Manually delete old indexes -- those with names matching `^sg_.*x1$`.
This is explicitly wrong: it will delete all indexes.
I am actually leaning toward not documenting this behavior because I think it's unlikely and I do not like users knowing the details of our index names.
If you are running a test environment, you can probably start from scratch without any indexes.
@idulo what do you think about removing this? If we do want to keep this, we want to remove the indexes for each collection attached to a database. The names of the indexes are internal to Sync Gateway and we do reserve the right to change them. As of Sync Gateway 3.0, if you are using `enable_shared_bucket_access=true`, they will be named `sg_allDocs_x1` and `sg_channels_x1` on each collection, but I want to stress this is very much an internal API.
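To illustrate the concern about the pattern, here is a minimal sketch that runs the regular expression from the draft against a few index names. Only `sg_allDocs_x1` and `sg_channels_x1` come from this thread; `sg_access_x1` and `my_app_index` are hypothetical names added for illustration, and the helper `matching` is not part of Sync Gateway.

```python
import re

# Pattern quoted from the draft procedure under review.
OLD_INDEX_PATTERN = re.compile(r"^sg_.*x1$")

index_names = [
    "sg_allDocs_x1",   # named in this thread
    "sg_channels_x1",  # named in this thread
    "sg_access_x1",    # assumed name, for illustration only
    "my_app_index",    # assumed application index, not owned by Sync Gateway
]


def matching(names):
    """Return the names the draft pattern would select for deletion."""
    return [name for name in names if OLD_INDEX_PATTERN.match(name)]


if __name__ == "__main__":
    print(matching(index_names))
```

Under these assumptions the pattern selects every `sg_`-prefixed name ending in `x1`, so it cannot tell apart the indexes that should be removed from the ones that should stay.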
I understand where you are coming from, @torcolvin. What I want to communicate to users is that they do have the option of zero downtime, especially in less risky environments, but without exposing any prefixes or suffixes that our indexes have. What happens if users delete indexes that are NOT partitioned or CAN'T be partitioned, such as the access/roleAccess indexes?
Procedure
1. Take the database offline via [POST /{db}/_config](https://preview.docs-test.couchbase.com/docs-sync-gateway-DOC-13246/sync-gateway/current/rest_api_admin.html#tag/Database-Configuration/operation/post_db-_config) with offline set to true.
2. Manually delete old indexes.
3. Bring the database online using [POST /{db}/_config](https://preview.docs-test.couchbase.com/docs-sync-gateway-DOC-13246/sync-gateway/current/rest_api_admin.html#tag/Database-Configuration/operation/post_db-_config) with index.num_partitions set to the required number of partitions.
4. Run integration and performance testing.
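As a rough sketch of the two REST calls in this procedure, the snippet below builds the `POST /{db}/_config` requests without sending them. The admin address `http://localhost:4985`, the database name `db1`, the partition count `4`, the nested JSON shape for `index.num_partitions`, and the use of `offline: false` to bring the database online are all assumptions for illustration; check them against the Admin REST API reference before relying on them. The index deletion step is deliberately left out.

```python
import json
import urllib.request

# Assumed admin API address, for illustration only.
ADMIN_URL = "http://localhost:4985"


def config_request(db, payload):
    """Build (but do not send) a POST /{db}/_config request."""
    return urllib.request.Request(
        url=f"{ADMIN_URL}/{db}/_config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def take_offline(db):
    """Request body for taking the database offline."""
    return config_request(db, {"offline": True})


def bring_online(db, num_partitions):
    """Request body for bringing the database online with partitioning.

    Assumed shape: the dotted name index.num_partitions maps to nested JSON.
    """
    payload = {"offline": False, "index": {"num_partitions": num_partitions}}
    return config_request(db, payload)


if __name__ == "__main__":
    for req in (take_offline("db1"), bring_online("db1", 4)):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To actually send a request, pass it to `urllib.request.urlopen` after adding whatever admin credentials the deployment requires.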
Should we instead have a warning saying that you do have an option for zero downtime, which is to manually delete non-partitioned indexes that have already been partitioned? The warning would add: check that they have been partitioned before deletion, and be very cautious about what you delete, because you can leave your system in a messy state.
This is not all the indexes that need to be deleted, and I really think figuring out which indexes to delete is very hard.
We could say all Sync Gateway indexes, given that it requires downtime. That is, indexes starting with `sg_`, but that might still affect other databases that are running, so I don't really feel good about advising this strategy at all.
If they delete all Sync Gateway indexes, then they have to pay the cost of rebuilding them, but maybe this isn't worth covering at all in this document, and this isn't a common use case?
Should we replace the current Zero Downtime section with:
For production environments where uptime is critical, Sync Gateway supports a zero-downtime path to enable index partitioning. This requires temporary overprovisioning, typically doubling index node capacity, while new partitioned indexes are created alongside existing ones.
Incorrect handling (e.g. manual index deletion) can cause data loss or leave the system in an unstable state. Incorrect deletion of internal indexes can lead to system instability and requires full index rebuilds. We strongly advise using the standard procedure with temporary downtime unless you are fully aware of the implications.
@torcolvin something along these lines?
> We strongly advise using the standard procedure with temporary downtime unless you are fully aware of the implications.
I'm confused by this conversation, because this is the section on migrating with downtime. The standard procedure — the one which is recommended for production — is the zero downtime procedure, with the automatic removal of old indexes.
For the moment I've updated this to say, "Manually delete old Sync Gateway indexes", and added a warning as outlined above, but changed to say "the standard procedure with zero downtime".
Merging this PR as is, so that I can get all the changes for SGW 3.3 visible in one place, as requested. We can continue this conversation at #877.
Docs issue: DOC-13246
Docs preview: Partitioned Indexes
Credentials: Preview docs for internal review
Other changes include:
- `num_index_replicas` deprecated in favor of `index.num_replicas`