-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catalog: Catalog backend migration design #23652
catalog: Catalog backend migration design #23652
Conversation
This commit adds a design doc for how to migrate user environments from using the stash to persist as a backing store for the catalog. It uses multiple tombstone values. Works towards resolving #22392
This commit adds a design doc for how to migrate user environments from using the stash to persist as a backing store for the catalog. It uses a single tombstone values. Works towards resolving #22392
44e36ae
to
d11d862
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach makes sense to me!
|
||
Opening a catalog involves the following two steps: | ||
|
||
1. Increment the persist epoch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the "persist epoch"? Is it the same as this one "The catalog durably stores a fencing token called the epoch."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, copy and paste error, just fixed this.
|
||
## Alternatives | ||
|
||
Below is a similar algorithm that uses two tombstone values, one in persist and one in the stash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also prefer the proposed design over this alternative because I like having a single place to check for which one to use
initial state. For example, if we are migrating from the stash to persist and the stash has never | ||
been initialized, then don't initialize the stash. Another example is that if we're rolling back | ||
from persist to the stash and there's data format migrations, then we should skip migrations in | ||
persist. Should we make these optimizations? My opinion is no because they will complicate the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think no. Reducing complexity and the potential for bugs here is more important than minor startup time optimizations
2. Open the persist catalog. | ||
3. If the stash tombstone is `true`, then we're done. | ||
4. Read stash catalog snapshot. | ||
5. Write catalog snapshot to persist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about the previous state of the persist catalog? Must it be empty? Should be written down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed an update, we should replace whatever the previous state was with the new snapshot.
2. Open the persist catalog. | ||
3. If the stash tombstone is `false` or doesn't exist, then we're done. | ||
4. Read persist catalog snapshot. | ||
5. Write catalog snapshot to stash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question: do we need to assert anything about the state of the stash?
3. If the stash tombstone is `false` or doesn't exist, then we're done. | ||
4. Read persist catalog snapshot. | ||
5. Replace the contents of the stash catalog with the persist catalog snapshot. | ||
6. Set the stash tombstone to `false`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if there is a crash between steps 5 and 6? If this data was from step 5 above, then the persist tombstone will be empty or false (since the stash to persist migration is done at step 3 if it is true). That seems ok? Was trying to think through if steps 5 and 6 here and maybe above MUST be done as a single transaction, but seems like that does not need to be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if we crash between steps 5 and 6 we should be fine. When writing this I actually tried to add a specific section on that scenario, but couldn't figure out the right words.
The idea is that at all times one of the backing stores is the "source of truth", i.e. contains correct data, and the other backing store can be treated as having complete garbage. It might have stale data, it might have no data, it might even have the correct up to date data, but we just treat it as complete garbage. So after step 5, both the stash and persist have the exact same correct data. However, since we haven't flipped the tombstone flag, we treat the stash data as the source of truth and the persist data as garbage. If we crash here, we have the start the whole process over, as if we had never copied the data into persist, even if we end up making no real changes. The same idea applies if another writer fences us out in-between step 5 and 6, they have to start from the beginning and treat the persist data as garbage and the stash as the source of truth.
This commit adds a design doc for how to migrate user environments from
using the stash to persist as a backing store for the catalog. It uses
a single tombstone values.
Works towards resolving MaterializeInc/database-issues#6762
Motivation
This PR adds a known-desirable feature.
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.