persist: Stabilize Schema Evolution #30205

Merged

Member

@ParkMyCar ParkMyCar commented Oct 25, 2024

Requires #30725 to be merged.

This PR stabilizes Schema Evolution in Persist, which unblocks ALTER TABLE work. There are a few changes in this one PR; they are all geared around handling the slight instability we have with the nullability of columns in Materialized Views.

  1. Internally in Persist, all columns are marked as nullable at the Arrow/Parquet level.
  2. A one-time migration of durably persisted arrow::DataTypes in Persist's Schema Registry that allows them to become more nullable, to account for the change in [1].
  3. Deprecation of the existing SchemaIds in Part and Run metadata. We do this by renaming the existing schema_id fields to deprecated_schema_id and introducing a schema_id field with a new tag. Also a dyncfg is added so we can turn off writing to the new schema_id field until we're confident all nodes have rolled to a new version.
  4. During bootstrapping of the Coordinator, if the nullability of columns for a MatView has changed, we compare and evolve the new schema in Persist.
  5. Changed the upgrade check in catalog-debug to validate that the RelationDescs as planned by the new version of MZ are compatible with the ones durably recorded in Persist.
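
To make change [1] concrete, here is a minimal sketch of relaxing nullability on a schema. The `DataType` and `Field` types are simplified, hypothetical stand-ins for the Arrow ones, and `make_all_nullable` is an illustrative name, not a function from this PR. It also assumes nested fields are relaxed along with top-level columns, which the description above does not spell out.

```rust
/// Simplified, hypothetical stand-in for Arrow's data type enum
/// (the real one has many more variants).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
    Struct(Vec<Field>),
}

/// Simplified, hypothetical stand-in for an Arrow field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Returns a copy of `field` where the field itself and every nested field
/// is marked nullable. Names and types are left unchanged.
fn make_all_nullable(field: &Field) -> Field {
    let data_type = match &field.data_type {
        DataType::List(inner) => DataType::List(Box::new(make_all_nullable(inner))),
        DataType::Struct(fields) => {
            DataType::Struct(fields.iter().map(make_all_nullable).collect())
        }
        // Primitive types carry no nested fields, so there is nothing to relax.
        other => other.clone(),
    };
    Field {
        name: field.name.clone(),
        data_type,
        nullable: true,
    }
}
```

Because only the `nullable` flags change, every value that was valid under the old schema remains valid under the relaxed one.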

Also included in this PR are two feature gate changes to disable new features in our tests until > v0.126.

  1. Builtin Continual Tasks
  2. 0dt Enabled Sources

Both of these features cause the new version of MZ to issue writes during a 0dt upgrade when it's supposed to be in read-only mode. Because this PR changes the Arrow datatypes in a non-forward compatible way, this causes the 0dt tests to fail since the old version panics when it sees the new batches.

Motivation

Fixes https://github.com/MaterializeInc/database-issues/issues/8660

Tips for reviewer

All of these changes have been made in separate commits of the PR which ideally makes reviewing easier.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@ParkMyCar ParkMyCar changed the title from "persist: Stability Schema Evolution" to "persist: Stabilize Schema Evolution" Oct 25, 2024
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch 2 times, most recently from 7547fdb to 73a82eb Compare October 25, 2024 23:47
@ParkMyCar ParkMyCar marked this pull request as ready for review October 26, 2024 00:53
@ParkMyCar ParkMyCar requested review from a team as code owners October 26, 2024 00:53
@ParkMyCar ParkMyCar requested a review from jkosh44 October 26, 2024 00:53

shepherdlybot bot commented Oct 26, 2024

Risk Score: 83 / 100, Bug Hotspots: 7, Resilience Coverage: 66%

Mitigations

Completing required mitigations increases Resilience Coverage.

  • (Required) Code Review 🔍 Detected
  • (Required) Feature Flag
  • (Required) Integration Test
  • (Required) Observability 🔍 Detected
  • (Required) QA Review 🔍 Detected
  • (Required) Run Nightly Tests
  • Unit Test 🔍 Detected
Risk Summary:

The pull request has a high risk score of 83, primarily driven by the predictors "Avg Line Count In Files" and "Executable Lines Within Files," with 7 modified files identified as hotspots. Historically, PRs with these predictors are 155% more likely to cause a bug than the repository baseline. Additionally, the observed bug trend in the repository is increasing.

Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity.

Bug Hotspots:

File Percentile
../row/encode.rs 98
../src/main.rs 93
../src/arrow.rs 90
../src/coord.rs 99
../src/controller.rs 92
../src/cfg.rs 95
../src/lib.rs 98

Contributor

@jkosh44 jkosh44 left a comment

Adapter changes LGTM

Contributor

@danhhz danhhz left a comment

Scanned this and everything seems reasonable on the surface, but I think it'll be hard to make time this week for the full detailed review. @bkirwi mind picking up the persist review on this?

@danhhz danhhz requested a review from bkirwi October 29, 2024 17:00
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 73a82eb to b5cb8e3 Compare October 30, 2024 15:59
///
/// Errors if `new` is less nullable than `old`, or `old` and `new` are different types or have
/// different nested fields.
pub(crate) fn is_atleast_as_nullable(old: &DataType, new: &DataType) -> Result<(), anyhow::Error> {
Contributor

Can you expand on the need for this? AFAICT this is ~identical to backward_compatible_typ(old, new).is_some(), aside from slightly different behaviour when fields are added. OTOH, it will silently allow fields to be dropped...

Is there some way to avoid duplicating the similar logic?

Member Author

It is nearly identical! We could try to reuse backward_compatible_typ(...), but I shied away from it because backward_compatible_typ(...) allows adding new columns.

I do believe we should be able to delete this code after a release or two since the new DataTypes will be migrated, so maybe code reuse isn't super important if we're going to get rid of it?

Contributor

Okay! I don't know that allowing additions is so much worse than allowing deletes, since neither should come up in practice... but if you're confident we can get rid of this in short order I won't worry about it too much one way or the other.
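
For readers following this thread, a rough sketch of the kind of check being discussed. It uses simplified, hypothetical `DataType` and `Field` types rather than the Arrow ones, and returns a `String` error instead of `anyhow::Error` to stay dependency-free. Note one deliberate difference: this sketch rejects any change in the number of nested fields, whereas the review comment above suggests the function in the PR may allow fields to be dropped.

```rust
/// Simplified, hypothetical stand-in for Arrow's data type enum.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
    Struct(Vec<Field>),
}

/// Simplified, hypothetical stand-in for an Arrow field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Errors if `new` is less nullable than `old`, or if the two types differ
/// in anything other than nullability.
fn is_atleast_as_nullable(old: &DataType, new: &DataType) -> Result<(), String> {
    match (old, new) {
        (DataType::Int64, DataType::Int64) | (DataType::Utf8, DataType::Utf8) => Ok(()),
        (DataType::List(o), DataType::List(n)) => check_field(o, n),
        (DataType::Struct(os), DataType::Struct(ns)) => {
            // Stricter than the PR's function may be: no adds and no drops.
            if os.len() != ns.len() {
                return Err(format!("field count changed: {} vs {}", os.len(), ns.len()));
            }
            for (o, n) in os.iter().zip(ns) {
                check_field(o, n)?;
            }
            Ok(())
        }
        (o, n) => Err(format!("type changed: {o:?} vs {n:?}")),
    }
}

/// Compares one pair of fields: same name, not less nullable, compatible type.
fn check_field(old: &Field, new: &Field) -> Result<(), String> {
    if old.name != new.name {
        return Err(format!("field renamed: {} vs {}", old.name, new.name));
    }
    if old.nullable && !new.nullable {
        return Err(format!("field {} became less nullable", old.name));
    }
    is_atleast_as_nullable(&old.data_type, &new.data_type)
}
```

The check is asymmetric on purpose: going from non-nullable to nullable passes, while the reverse direction fails.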

@@ -208,6 +208,9 @@ pub enum BatchPart<T> {
updates: LazyInlineBatchPart,
ts_rewrite: Option<Antichain<T>>,
schema_id: Option<SchemaId>,

/// ID of a schema that has since been deprecated and exists only to cleanly roundtrip.
deprecated_schema_id: Option<SchemaId>,
Contributor

There's no real possibility of ever getting rid of this, hey? Since parts would fail to roundtrip if we ever removed it?

Member Author

I'm not totally sure! I need to think through State a bit more and how we might support removing fields, but I wouldn't be surprised if we need to do a bit of work to support migration here
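
A small sketch of the roundtrip concern raised here, under loud assumptions: `ProtoPart`, `Part`, and the `write_schema_id` flag are hypothetical names that model the rename-plus-new-tag approach from the PR description, not the actual Persist types or dyncfg. The idea is that the old tag's value is carried through untouched, and the new field is only populated when the flag allows it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SchemaId(u64);

/// Hypothetical wire form. The renamed field keeps its original tag, and the
/// new `schema_id` lives under a fresh tag.
#[derive(Debug, Clone, PartialEq, Default)]
struct ProtoPart {
    deprecated_schema_id: Option<u64>,
    schema_id: Option<u64>,
}

/// Hypothetical in-memory form of a part's schema metadata.
#[derive(Debug, Clone, PartialEq)]
struct Part {
    schema_id: Option<SchemaId>,
    deprecated_schema_id: Option<SchemaId>,
}

impl Part {
    /// Builds metadata for a newly written part. `write_schema_id` stands in
    /// for the dyncfg that gates writing the new field.
    fn new(current: SchemaId, write_schema_id: bool) -> Part {
        Part {
            schema_id: write_schema_id.then_some(current),
            deprecated_schema_id: None,
        }
    }

    /// Decodes both fields verbatim; an old part only has the deprecated one.
    fn from_proto(proto: &ProtoPart) -> Part {
        Part {
            schema_id: proto.schema_id.map(SchemaId),
            deprecated_schema_id: proto.deprecated_schema_id.map(SchemaId),
        }
    }

    /// Encodes both fields verbatim so decode followed by encode is lossless.
    fn to_proto(&self) -> ProtoPart {
        ProtoPart {
            deprecated_schema_id: self.deprecated_schema_id.map(|id| id.0),
            schema_id: self.schema_id.map(|id| id.0),
        }
    }
}
```

In this model, dropping `deprecated_schema_id` from `Part` would make `to_proto(from_proto(p))` differ from `p` for any part written before the change, which is the roundtrip failure the question above is pointing at.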

@bkirwi
Contributor

bkirwi commented Nov 1, 2024

Thanks for the followup etc! I'll do a final pass on this today.

Contributor

@bkirwi bkirwi left a comment

One important-but-hopefully uncontroversial change suggested, but otherwise looks good! Thanks for making the idempotency change... feels much easier to reason about now.

I think we can probably get rid of DatumEncoder entirely now, since IIUC it only exists to preserve field nullability for DatumColumnEncoder. But we can leave that as a followup.

@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 63c3efa to ec6ef90 Compare November 5, 2024 16:27
@ParkMyCar
Member Author

Moving this into draft while I iterate a bit here to figure out the nightly failures

@ParkMyCar ParkMyCar marked this pull request as draft November 12, 2024 19:23
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 23d5c11 to 3f5b265 Compare November 12, 2024 19:24
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch 2 times, most recently from cb9b7b6 to 2434665 Compare November 27, 2024 15:46
@ParkMyCar ParkMyCar marked this pull request as ready for review November 27, 2024 16:19
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 2434665 to cbc956b Compare December 2, 2024 17:57
@ParkMyCar ParkMyCar marked this pull request as draft December 2, 2024 19:46
@ParkMyCar
Member Author

Moving back to draft while I iterate more

@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from cbc956b to 497a477 Compare December 2, 2024 19:46
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from b8ca9ba to 5e3a637 Compare December 4, 2024 17:54
@ParkMyCar
Member Author

@bkirwi this ended up not changing all that much, but is ready for review again whenever you have a moment!

Note: Nightlies won't pass until #30725 is merged

@ParkMyCar ParkMyCar marked this pull request as ready for review December 4, 2024 17:55
ParkMyCar added a commit that referenced this pull request Dec 4, 2024
This PR disables Persist's compaction when environmentd or clusterd are
in read-only mode. As part of
#30205 we discovered
that during a 0dt deployment the read-only instance of Materialize was
scheduling compaction requests which was causing it to write data.

We disable compaction by adding a `process_compaction_requests:
Arc<AtomicBool>` to the `PersistConfig`, and setting it to `true`
when `clusterd` receives the already existing `AllowWrites` command.
Also included is a CYA dyncfg that gates whether or not we check the
flag, it's set to `true` by default.

I also added a Prometheus metric to count the number of requests dropped
because compaction is disabled, a unit test to exercise the basic
enable/disable behavior, and manually checked when running a 0dt test
that compaction was successfully disabled and then enabled across
`environmentd` and all `clusterd`s when a deployment was promoted.
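
As a rough illustration of the gating described above (not the code from the referenced commit), the sketch below uses hypothetical `Config` and `Compactor` types. `check_flag` stands in for the dyncfg, `dropped` stands in for the Prometheus counter, and the flag is assumed to start as `false` while the process is read-only.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the relevant part of `PersistConfig`.
struct Config {
    /// Shared flag; assumed `false` while the process is read-only.
    process_compaction_requests: Arc<AtomicBool>,
    /// Stand-in for the dyncfg that gates whether the flag is checked at all.
    check_flag: bool,
}

/// Hypothetical component that receives compaction requests.
struct Compactor {
    cfg: Config,
    /// Stand-in for the metric counting requests dropped while disabled.
    dropped: AtomicU64,
}

impl Compactor {
    fn new(check_flag: bool) -> Self {
        Compactor {
            cfg: Config {
                process_compaction_requests: Arc::new(AtomicBool::new(false)),
                check_flag,
            },
            dropped: AtomicU64::new(0),
        }
    }

    /// What a handler for the `AllowWrites` command would do in this sketch.
    fn allow_writes(&self) {
        self.cfg
            .process_compaction_requests
            .store(true, Ordering::Relaxed);
    }

    /// Returns `true` if the request would be scheduled, `false` if dropped.
    fn request_compaction(&self) -> bool {
        let enabled = self.cfg.process_compaction_requests.load(Ordering::Relaxed);
        if self.cfg.check_flag && !enabled {
            self.dropped.fetch_add(1, Ordering::Relaxed);
            return false;
        }
        true
    }
}
```

With `check_flag` set to `false`, requests are always scheduled, which is the escape hatch the dyncfg provides in this model.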

### Motivation

Fix issue found in
#30205

### Checklist

- [x] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [x] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
  <!-- Reference the design in the description. -->
- [x] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [x] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
<!-- Ask in #team-cloud on Slack if you need help preparing the cloud
PR. -->
- [x] If this PR includes major [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note),
I have pinged the relevant PM to schedule a changelog post.

---------

Co-authored-by: Nikhil Benesch
Co-authored-by: Ben Kirwin
@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 5e3a637 to 8bf98b5 Compare December 4, 2024 21:26
@ParkMyCar
Member Author

Rebased (apologies if this messed up your review Ben!) and kicked off Nightlies

Contributor

@bkirwi bkirwi left a comment

Nits and a lingering question, but in any case I think this should be good to go.

// updates to our optimizer.
self.controller
.storage
.evolve_nullability_for_bootstrap(storage_metadata, compute_collections)
Contributor

Do I understand correctly that:

  • This is a write on the read path: starting the read-only replica irrevocably migrates the collection.
  • This is fine in the short term because we're only making things more nullable at the data type level, so all the old writes are valid with the new schema.
  • This is fine in the longer term because it should be a noop at the data type level.

Member Author

Exactly! Also, we only migrate collections for Materialized Views and Continual Tasks, not Tables or Sources

crate::schema::is_atleast_as_nullable(&old_v_datatype, &new_v_datatype);

// If the Arrow DataType for `k` or `v` has changed, but it's only become more
// nullable, then we allow in-place re-writing of the schema.
Contributor

Maybe this is friday brain, but: since we're updating existing schema ids here with a backwards compatible change, what would go wrong if we didn't do the schema-id-deprecation thing? Or nothing breaks, and we're doing it to avoid ambiguity?

Member Author

Not Friday brain at all, there are a couple of trade-offs here. We could skip deprecating the schema_id field and just write a new SchemaId(1) for all collections, but:

  • We would have the same RelationDesc mapping to two different arrow::DataTypes, which I don't think is too bad, but would probably require us to also implement something like #30627 (persist: Plumb relevant arrow::DataTypes to compaction) in the future.
  • Right now the Persist SchemaId maps 1:1 with the Adapter's RelationVersion for Tables. If we evolve the schemas for all collections to be more nullable, we'd need some additional mapping in the Adapter layer to support ALTER TABLE.

These are all things that we can work around, but IMO the concept of "starting over" is the cleanest choice.

prost::Message::encode_to_vec(&proto).into()
}

fn decode_data_type(buf: Bytes) -> Result<DataType, anyhow::Error> {
Contributor

Tiny nit, but the names here aren't quite consistent: encoded vs. decode.

Member Author

Fixed!

val,
val_data_type: new_v_encoded_datatype,
},
);
Contributor

We should give ourselves some way to check if we're hitting this path later than we'd expect. (I think a log is fine if wiring up a metric is a hassle... shouldn't be too high of a multiplicity.)

Member Author

Wired up a metric which should be easier to observe! We theoretically could write a query over our logs, but wiring that up is maybe harder than adding a prom metric haha

@bkirwi
Contributor

bkirwi commented Dec 6, 2024

Not sure what to think of the parallel ddl failure... hit restart to see if it's sticky!

@def-
Contributor

def- commented Dec 6, 2024

Not sure what to think of the parallel ddl failure... hit restart to see if it's sticky!

It's a test issue, fixed on main

@ParkMyCar ParkMyCar force-pushed the persist/all-columns-nullable branch from 6068b96 to 93a5822 Compare December 9, 2024 18:20
@ParkMyCar
Member Author

The Nightly tests we care about are Green, merging!

@ParkMyCar ParkMyCar merged commit 9375fd3 into MaterializeInc:main Dec 9, 2024
223 of 234 checks passed
ParkMyCar added a commit that referenced this pull request Dec 16, 2024
As part of #30205 we
check if a new `arrow::DataType` is "at least as nullable" as the
previous type, and if so we update the registered data type in-place.

This check failed to account for the change we made a little while ago
where we use `arrow::DataType::List` for `ScalarType::Map`, instead of
`arrow::DataType::Map`.

### Motivation

Fixes MaterializeInc/database-issues#8834

### Checklist

- [ ] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
  <!-- Reference the design in the description. -->
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
<!-- Ask in #team-cloud on Slack if you need help preparing the cloud
PR. -->
- [ ] If this PR includes major [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note),
I have pinged the relevant PM to schedule a changelog post.
ParkMyCar added a commit that referenced this pull request Feb 6, 2025
Removes the one-time schema migration that was introduced in
#30205. I looked back
on 30 days' worth of metrics and validated that
`mz_persist_one_time_migration_more_nullable` did not show up for any
users, which indicates that our schema representation should be stable.

### Motivation

Code cleanup

### Checklist

- [x] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [x] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
  <!-- Reference the design in the description. -->
- [x] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [x] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
<!-- Ask in #team-cloud on Slack if you need help preparing the cloud
PR. -->
- [x] If this PR includes major [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note),
I have pinged the relevant PM to schedule a changelog post.