persist: Stabilize Schema Evolution #30205

ParkMyCar · 2024-10-25T21:03:17Z

Requires #30725 to be merged.

This PR stabilizes Schema Evolution in Persist, which unblocks ALTER TABLE work. There are a few changes in this one PR, all geared around handling the slight instability we have with the nullability of columns in Materialized Views.

1. Internally in Persist, all columns are marked as nullable at the Arrow/Parquet level.
2. A one-time migration of the durably persisted arrow::DataTypes in Persist's Schema Registry allows them to be more nullable, to account for the changes from [1].
3. Deprecation of the existing SchemaIds in Part and Run metadata. We do this by renaming the existing schema_id fields to deprecated_schema_id and introducing a schema_id field with a new tag. A dyncfg is also added so we can turn off writing to the new schema_id field until we're confident all nodes have rolled to a new version.
4. During bootstrapping of the Coordinator, if the nullability of columns for a Materialized View has changed, we compare and evolve the new schema in Persist.
5. Changed the upgrade check in catalog-debug to validate that the RelationDescs as planned by the new version of MZ are compatible with the ones durably recorded in Persist.
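
To illustrate the first change, here is a minimal sketch of recursively marking every field as nullable. The `DataType` and `Field` types are simplified stand-ins defined only for this example (they are not the real `arrow` crate types), and `make_all_nullable` is a hypothetical name, not a function from this PR.

```rust
/// Simplified stand-ins for Arrow's data type and field descriptions.
/// These are NOT the real `arrow` types, only enough structure to recurse over.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

impl Field {
    pub fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Returns a copy of `field` in which it and every nested field are nullable.
/// (Hypothetical helper for illustration.)
pub fn make_all_nullable(field: &Field) -> Field {
    let data_type = match &field.data_type {
        DataType::List(inner) => DataType::List(Box::new(make_all_nullable(inner))),
        DataType::Struct(fields) => {
            DataType::Struct(fields.iter().map(make_all_nullable).collect())
        }
        // Leaf types have no nested fields to visit.
        leaf => leaf.clone(),
    };
    Field { name: field.name.clone(), data_type, nullable: true }
}
```

Applying the function twice gives the same result as applying it once, which is the property that makes a rewrite like this safe to repeat.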

Also included in this PR are two feature gate changes to disable new features in our tests until > v0.126:

- Builtin Continual Tasks
- 0dt Enabled Sources

Both of these features cause the new version of MZ to issue writes during a 0dt upgrade when it's supposed to be in read-only mode. Because this PR changes the Arrow datatypes in a non-forward-compatible way, the 0dt tests fail: the old version panics when it sees the new batches.

Motivation

Fixes https://github.com/MaterializeInc/database-issues/issues/8660

Tips for reviewer

All of these changes have been made in separate commits of the PR which ideally makes reviewing easier.

Checklist

This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

shepherdlybot · 2024-10-26T00:54:32Z

Mitigations

Completing required mitigations increases Resilience Coverage.

(Required) Code Review 🔍 Detected
(Required) Feature Flag
(Required) Integration Test
(Required) Observability 🔍 Detected
(Required) QA Review 🔍 Detected
(Required) Run Nightly Tests
Unit Test 🔍 Detected

Risk Summary:

The pull request has a high risk score of 83, primarily driven by the predictors "Avg Line Count In Files" and "Executable Lines Within Files," with 7 modified files identified as hotspots. Historically, PRs with these predictors are 155% more likely to cause a bug than the repository baseline. Additionally, the observed bug trend in the repository is increasing.

Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity.

Bug Hotspots:
File	Percentile
../row/encode.rs	98
../src/main.rs	93
../src/arrow.rs	90
../src/coord.rs	99
../src/controller.rs	92
../src/cfg.rs	95
../src/lib.rs	98

jkosh44

Adapter changes LGTM

danhhz

Scanned this and everything seems reasonable on the surface, but I think it'll be hard to make time this week for the full detailed review. @bkirwi mind picking up the persist review on this?

src/persist-client/src/batch.rs

src/catalog-debug/src/main.rs

bkirwi · 2024-10-30T20:15:17Z

src/persist-client/src/schema.rs

+///
+/// Errors if `new` is less nullable than `old`, or `old` and `new` are different types or have
+/// different nested fields.
+pub(crate) fn is_atleast_as_nullable(old: &DataType, new: &DataType) -> Result<(), anyhow::Error> {


Can you expand on the need for this? AFAICT this is ~identical to backward_compatible_typ(old, new).is_some(), aside from slightly different behaviour when fields are added. OTOH, it will silently allow fields to be dropped...

Is there some way to avoid duplicating the similar logic?

It is nearly identical! We could try to re-use backward_compatible_typ(...), but I shied away from it because backward_compatible_typ(...) allows adding new columns.

I do believe we should be able to delete this code after a release or two since the new DataTypes will be migrated, so maybe code reuse isn't super important if we're going to get rid of it?

Okay! I don't know that allowing additions is so much worse than allowing deletes, since neither should come up in practice... but if you're confident we can get rid of this in short order I won't worry about it too much one way or the other.
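
For readers following this thread, here is a rough sketch of what an "at least as nullable" check could look like. It uses simplified stand-in types rather than the real `arrow::datatypes::DataType`, operates on fields instead of bare data types, and rejects both added and removed nested fields. That last choice is an assumption made for this sketch and may differ from the function in the PR, which the comment above suggests would allow fields to be dropped.

```rust
/// Simplified stand-ins for Arrow's types; not the real `arrow` crate API.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

/// Small constructor to keep call sites short.
pub fn field(name: &str, data_type: DataType, nullable: bool) -> Field {
    Field { name: name.to_string(), data_type, nullable }
}

/// Errors if `new` is less nullable than `old`, if the types differ, or if
/// the nested fields differ. (Hypothetical sketch, stricter than necessary.)
pub fn at_least_as_nullable(old: &Field, new: &Field) -> Result<(), String> {
    if old.name != new.name {
        return Err(format!("field renamed: {} -> {}", old.name, new.name));
    }
    if old.nullable && !new.nullable {
        return Err(format!("field {} became less nullable", old.name));
    }
    match (&old.data_type, &new.data_type) {
        (DataType::Int64, DataType::Int64) | (DataType::Utf8, DataType::Utf8) => Ok(()),
        (DataType::List(o), DataType::List(n)) => at_least_as_nullable(o, n),
        (DataType::Struct(os), DataType::Struct(ns)) => {
            if os.len() != ns.len() {
                return Err(format!("field {} gained or lost nested fields", old.name));
            }
            // Compare nested fields pairwise, stopping at the first error.
            os.iter()
                .zip(ns.iter())
                .try_for_each(|(o, n)| at_least_as_nullable(o, n))
        }
        (o, n) => Err(format!("type changed: {:?} -> {:?}", o, n)),
    }
}
```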

src/catalog-debug/src/main.rs

bkirwi · 2024-10-30T20:35:20Z

src/persist-client/src/internal/state.rs

@@ -208,6 +208,9 @@ pub enum BatchPart<T> {
        updates: LazyInlineBatchPart,
        ts_rewrite: Option<Antichain<T>>,
        schema_id: Option<SchemaId>,
+
+        /// ID of a schema that has since been deprecated and exists only to cleanly roundtrip.
+        deprecated_schema_id: Option<SchemaId>,


There's no real possibility of ever getting rid of this, hey? Since parts would fail to roundtrip if we ever removed it?

I'm not totally sure! I need to think through State a bit more and how we might support removing fields, but I wouldn't be surprised if we need to do a bit of work to support migration here
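
The round-trip concern can be sketched with a toy encoding. Everything here is hypothetical: the tag numbers, the `(tag, value)` wire format, and the `write_schema_id` flag (a stand-in for the dyncfg mentioned in the PR description) are assumptions for illustration and not Persist's actual proto layout.

```rust
/// Hypothetical wire tags; the real proto tag numbers are not given in this PR text.
pub const TAG_DEPRECATED_SCHEMA_ID: u32 = 1;
pub const TAG_SCHEMA_ID: u32 = 2;

#[derive(Debug, Clone, PartialEq, Default)]
pub struct PartMeta {
    /// ID written under the new tag.
    pub schema_id: Option<u64>,
    /// ID read from the old tag, kept only so old parts re-encode unchanged.
    pub deprecated_schema_id: Option<u64>,
}

/// A toy "encoded" form: a list of (tag, value) pairs.
pub type Wire = Vec<(u32, u64)>;

pub fn encode(meta: &PartMeta) -> Wire {
    let mut out = Vec::new();
    if let Some(id) = meta.deprecated_schema_id {
        out.push((TAG_DEPRECATED_SCHEMA_ID, id));
    }
    if let Some(id) = meta.schema_id {
        out.push((TAG_SCHEMA_ID, id));
    }
    out
}

pub fn decode(wire: &Wire) -> PartMeta {
    let mut meta = PartMeta::default();
    for &(tag, value) in wire {
        match tag {
            TAG_DEPRECATED_SCHEMA_ID => meta.deprecated_schema_id = Some(value),
            TAG_SCHEMA_ID => meta.schema_id = Some(value),
            _ => {} // Unknown tags are ignored in this toy format.
        }
    }
    meta
}

/// Metadata for a newly written part. New parts never set the deprecated
/// field, and only set `schema_id` when the (hypothetical) flag allows it.
pub fn new_part_meta(current_schema: u64, write_schema_id: bool) -> PartMeta {
    PartMeta {
        schema_id: write_schema_id.then_some(current_schema),
        deprecated_schema_id: None,
    }
}
```

In this model, dropping `deprecated_schema_id` from `PartMeta` would make `encode(&decode(&old))` differ from `old` for any part written before the change, which is the failure mode the question above is pointing at.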

src/persist-client/src/internal/state.rs

src/storage-controller/src/lib.rs

bkirwi · 2024-11-01T15:31:37Z

Thanks for the followup etc! I'll do a final pass on this today.

bkirwi

One important-but-hopefully uncontroversial change suggested, but otherwise looks good! Thanks for making the idempotency change... feels much easier to reason about now.

I think we can probably get rid of DatumEncoder entirely now, since IIUC it only exists to preserve field nullability for DatumColumnEncoder. But we can leave that as a followup.

src/persist-client/src/internal/state.rs

src/repr/src/row/encode.rs

ParkMyCar · 2024-11-12T19:23:50Z

Moving this into draft while I iterate a bit here to figure out the nightly failures

ParkMyCar · 2024-12-02T19:46:16Z

Moving back to draft while I iterate more

ParkMyCar · 2024-12-04T17:55:22Z

@bkirwi this ended up not changing all that much, but is ready for review again whenever you have a moment!

Note: Nightlies won't pass until #30725 is merged

This PR disables Persist's compaction when environmentd or clusterd are in read-only mode. As part of #30205 we discovered that during a 0dt deployment the read-only instance of Materialize was scheduling compaction requests, which was causing it to write data.

We disable compaction by adding a `process_compaction_requests: Arc<AtomicBool>` to the `PersistConfig`, and setting it to `true` when `clusterd` receives the already existing `AllowWrites` command. Also included is a CYA dyncfg that gates whether or not we check the flag; it's set to `true` by default.

I also added a Prometheus metric to count the number of requests dropped because compaction is disabled, a unit test to exercise the basic enable/disable behavior, and manually checked when running a 0dt test that compaction was successfully disabled and then enabled across `environmentd` and all `clusterd`s when a deployment was promoted.

### Motivation

Fix issue found in #30205

### Checklist

- [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design.
- [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)).
- [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

---------

Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Ben Kirwin <[email protected]>
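
The gating described above can be sketched with the standard library alone. Only the `process_compaction_requests: Arc<AtomicBool>` field and the `AllowWrites` trigger come from the description; the `CompactionGate` type, its method names, the plain `bool` standing in for the dyncfg, the `AtomicU64` standing in for the Prometheus counter, and the assumption that the flag starts out `false` are all hypothetical.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;

pub struct CompactionGate {
    /// Mirrors the `process_compaction_requests` flag from the description.
    process_compaction_requests: Arc<AtomicBool>,
    /// Stand-in for the dyncfg that gates whether the flag is checked at all.
    check_flag: bool,
    /// Stand-in for the metric counting requests dropped while disabled.
    dropped: AtomicU64,
}

impl CompactionGate {
    /// Starts with compaction disabled (an assumption about read-only startup).
    pub fn new(check_flag: bool) -> Self {
        CompactionGate {
            process_compaction_requests: Arc::new(AtomicBool::new(false)),
            check_flag,
            dropped: AtomicU64::new(0),
        }
    }

    /// Called when the process is told it may write (e.g. on `AllowWrites`).
    pub fn allow_writes(&self) {
        self.process_compaction_requests.store(true, Ordering::Relaxed);
    }

    /// Returns true if a compaction request should be processed. Requests
    /// rejected here are counted and otherwise dropped.
    pub fn admit(&self) -> bool {
        if self.check_flag && !self.process_compaction_requests.load(Ordering::Relaxed) {
            self.dropped.fetch_add(1, Ordering::Relaxed);
            return false;
        }
        true
    }

    pub fn dropped(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}
```

With `check_flag` set to `false`, every request is admitted regardless of the flag, which corresponds to turning the safety check off.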

ParkMyCar · 2024-12-04T21:28:01Z

Rebased (apologies if this messed up your review Ben!) and kicked off Nightlies

bkirwi

Nits and a lingering question, but in any case I think this should be good to go.

bkirwi · 2024-12-06T19:18:53Z

src/adapter/src/coord.rs

+        // updates to our optimizer.
+        self.controller
+            .storage
+            .evolve_nullability_for_bootstrap(storage_metadata, compute_collections)


Do I understand correctly that:

This is a write on the read path: starting the read-only replica irrevocably migrates the collection.

This is fine in the short term because we're only making things more nullable at the data type level, so all the old writes are valid with the new schema.

This is fine in the longer term because it should be a noop at the data type level.

Exactly! Also, we only migrate collections for Materialized Views and Continual Tasks, not Tables or Sources
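
The behavior confirmed in this exchange can be modeled with a toy schema registry. This is a sketch under stated assumptions, not the storage controller's implementation: a schema is reduced to a list of `(column name, nullable)` pairs, schema IDs are vector indices, and every type and function name below is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CollectionKind {
    Table,
    Source,
    MaterializedView,
    ContinualTask,
}

/// A schema reduced to (column name, nullable) pairs for illustration.
pub type Schema = Vec<(String, bool)>;

#[derive(Debug, PartialEq)]
pub enum Evolved {
    /// Not a Materialized View or Continual Task, so nothing was attempted.
    Skipped,
    /// The planned schema already matches the latest one: a no-op.
    Unchanged(usize),
    /// A new, more nullable schema was registered under this ID.
    NewSchema(usize),
}

pub struct Registry {
    schemas: Vec<Schema>,
}

/// True if `new` has the same columns as `old` and no column lost nullability.
fn only_more_nullable(old: &[(String, bool)], new: &[(String, bool)]) -> bool {
    old.len() == new.len()
        && old
            .iter()
            .zip(new)
            .all(|((on, onull), (nn, nnull))| on == nn && (*nnull || !*onull))
}

impl Registry {
    pub fn new(initial: Schema) -> Self {
        Registry { schemas: vec![initial] }
    }

    fn latest(&self) -> (usize, &Schema) {
        let id = self.schemas.len() - 1;
        (id, &self.schemas[id])
    }

    pub fn evolve_for_bootstrap(
        &mut self,
        kind: CollectionKind,
        planned: Schema,
    ) -> Result<Evolved, String> {
        match kind {
            CollectionKind::Table | CollectionKind::Source => return Ok(Evolved::Skipped),
            CollectionKind::MaterializedView | CollectionKind::ContinualTask => {}
        }
        let (latest_id, latest) = self.latest();
        if latest == &planned {
            return Ok(Evolved::Unchanged(latest_id));
        }
        if !only_more_nullable(latest, &planned) {
            return Err(format!("planned schema is incompatible with schema {}", latest_id));
        }
        self.schemas.push(planned);
        Ok(Evolved::NewSchema(self.schemas.len() - 1))
    }
}
```

Running the evolution a second time with the same planned schema returns `Unchanged`, matching the point above that the migration should be a no-op in the longer term.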

bkirwi · 2024-12-06T19:34:40Z

src/persist-client/src/internal/state.rs

+                        crate::schema::is_atleast_as_nullable(&old_v_datatype, &new_v_datatype);
+
+                    // If the Arrow DataType for `k` or `v` has changed, but it's only become more
+                    // nullable, then we allow in-place re-writing of the schema.


Maybe this is friday brain, but: since we're updating existing schema ids here with a backwards compatible change, what would go wrong if we didn't do the schema-id-deprecation thing? Or nothing breaks, and we're doing it to avoid ambiguity?

Not Friday brain at all, there are a couple of trade-offs here. We could skip deprecating the schema_id field and just write a new SchemaId(1) for all collections, but:

We would have the same RelationDesc mapping to two different arrow::DataTypes, which I don't think is too bad, but would probably require us to also implement something like persist: Plumb relevant arrow::DataTypes to compaction #30627 in the future.

Right now the Persist SchemaId maps 1:1 with the Adapter's RelationVersion for Tables. If we evolve the schemas for all collections to be more nullable, we'd need some additional mapping in the Adapter layer to support ALTER TABLE.

These are all things that we can work around, but also IMO the concept of "starting" over is the cleanest choice.

bkirwi · 2024-12-06T19:38:20Z

src/persist-client/src/internal/state.rs

            prost::Message::encode_to_vec(&proto).into()
        }

+        fn decode_data_type(buf: Bytes) -> Result<DataType, anyhow::Error> {


Tiny nit, but the names here aren't quite consistent: encoded vs. decode.

bkirwi · 2024-12-06T19:45:19Z

src/persist-client/src/internal/state.rs

+                                    val,
+                                    val_data_type: new_v_encoded_datatype,
+                                },
+                            );


We should give ourselves some way to check if we're hitting this path later than we'd expect. (I think a log is fine if wiring up a metric is a hassle... shouldn't be too high of a multiplicity.)

Wired up a metric which should be easier to observe! We theoretically could write a query over our logs, but wiring that up is maybe harder than adding a prom metric haha

bkirwi · 2024-12-06T20:03:04Z

Not sure what to think of the parallel ddl failure... hit restart to see if it's sticky!

def- · 2024-12-06T21:26:36Z

> Not sure what to think of the parallel ddl failure... hit restart to see if it's sticky!

It's a test issue, fixed on main

ParkMyCar · 2024-12-09T20:44:13Z

The Nightly tests we care about are Green, merging!

As part of #30205 we check if a new `arrow::DataType` is "at least as nullable" as the previous type, and if so we update the registered data type in-place. This check failed to account for the change we made a little while ago where we use `arrow::DataType::List` for `ScalarType::Map`, instead of `arrow::DataType::Map`.

### Motivation

Fixes MaterializeInc/database-issues#8834

### Tips for reviewer

### Checklist

- [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design.
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)).
- [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Removes the onetime schema migration that was introduced in #30205. I looked back on 30 days worth of metrics and validated that `mz_persist_one_time_migration_more_nullable` did not show up for any users which indicates that our schema representation should be stable.

### Motivation

Code cleanup

### Checklist

- [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design.
- [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)).
- [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

ParkMyCar changed the title ~~persist: Stability Schema Evolution~~ persist: Stabilize Schema Evolution Oct 25, 2024

ParkMyCar force-pushed the persist/all-columns-nullable branch 2 times, most recently from 7547fdb to 73a82eb Compare October 25, 2024 23:47

ParkMyCar marked this pull request as ready for review October 26, 2024 00:53

ParkMyCar requested review from a team as code owners October 26, 2024 00:53

ParkMyCar requested a review from jkosh44 October 26, 2024 00:53

jkosh44 approved these changes Oct 28, 2024

danhhz reviewed Oct 29, 2024

src/persist-client/src/batch.rs Outdated

src/catalog-debug/src/main.rs Outdated

danhhz requested a review from bkirwi October 29, 2024 17:00

ParkMyCar force-pushed the persist/all-columns-nullable branch from 73a82eb to b5cb8e3 Compare October 30, 2024 15:59

bkirwi reviewed Oct 30, 2024

bkirwi approved these changes Nov 1, 2024

src/persist-client/src/internal/state.rs

src/repr/src/row/encode.rs

ParkMyCar force-pushed the persist/all-columns-nullable branch from 63c3efa to ec6ef90 Compare November 5, 2024 16:27

ParkMyCar marked this pull request as draft November 12, 2024 19:23

ParkMyCar force-pushed the persist/all-columns-nullable branch from 23d5c11 to 3f5b265 Compare November 12, 2024 19:24

ParkMyCar mentioned this pull request Nov 25, 2024

persist: Plumb relevant arrow::DataTypes to compaction #30627

Closed

5 tasks

ParkMyCar force-pushed the persist/all-columns-nullable branch 2 times, most recently from cb9b7b6 to 2434665 Compare November 27, 2024 15:46

ParkMyCar marked this pull request as ready for review November 27, 2024 16:19

ParkMyCar force-pushed the persist/all-columns-nullable branch from 2434665 to cbc956b Compare December 2, 2024 17:57

ParkMyCar marked this pull request as draft December 2, 2024 19:46

ParkMyCar force-pushed the persist/all-columns-nullable branch from cbc956b to 497a477 Compare December 2, 2024 19:46

ParkMyCar mentioned this pull request Dec 4, 2024

persist: disable compaction in read-only mode #30725

Merged

5 tasks

ParkMyCar force-pushed the persist/all-columns-nullable branch from b8ca9ba to 5e3a637 Compare December 4, 2024 17:54

ParkMyCar marked this pull request as ready for review December 4, 2024 17:55

ParkMyCar force-pushed the persist/all-columns-nullable branch from 5e3a637 to 8bf98b5 Compare December 4, 2024 21:26

bkirwi approved these changes Dec 6, 2024

ParkMyCar added 9 commits December 9, 2024 13:16

start, all columns nullable

8eda798

* recusively mark all columns as nullable at the Arrow/Parquet level

implement one time migration for Arrow DataTypes in Persist Schema registry

c6b46b5

deprecate all existing schema IDs in Part and Run metadata

fd694e5

during bootstrapping of the coordinator, evolve the schema for Materialized Views and Continual Tasks, if need be

ae3dd11

add schema check to the catalog_upgrade check

53295d8

regenerate JSON snapshot

1f24001

update 'backward_compatible_struct' logic to handle recursive nullability changes

94675d8

small changes to test flags

15a4b81

respond to feedback, tweak a name, add some metrics

93a5822

ParkMyCar force-pushed the persist/all-columns-nullable branch from 6068b96 to 93a5822 Compare December 9, 2024 18:20

ParkMyCar merged commit 9375fd3 into MaterializeInc:main Dec 9, 2024
223 of 234 checks passed

ParkMyCar mentioned this pull request Dec 16, 2024

persist: fix schema stability migration #30837

Merged

5 tasks

ParkMyCar mentioned this pull request Feb 6, 2025

[persist] Remove "onetime" schema migration #31313

Merged

5 tasks
