-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
persist: disable compaction in read-only mode #30725
persist: disable compaction in read-only mode #30725
Conversation
…ly mode * add an Arc<AtomicBool> to PersistConfig that determines whether or not we process a compaction request * add a dyncfg to CYA that determines whether or not we check the Arc<AtomicBool> * add a test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Deferring the real review to @bkirwi but no qualms from me.
Co-authored-by: Nikhil Benesch <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Made a suggestion on the doc to make it more clear what the default behaviour will be when the dyncfg is disabled.
Co-authored-by: Ben Kirwin <[email protected]>
TFTR! |
Requires #30725 to be merged. This PR stabilizes Schema Evolution in Persist which unblocks `ALTER TABLE` work. There are a few changes in this one PR, they are all geared around handling the slight instability we have with the nullability of columns in Materialized Views. 1. Internally in Persist, all columns are marked as nullable at the Arrow/Parquet level. 2. A one-time migration of durably persisted `arrow::DataType`s in Persist's Schema Registery that allows them to be more nullable to account for the changes from [1]. 3. Deprecation of the existing `SchemaId`s in `Part` and `Run` metadata. We do this by renaming the existing `schema_id` fields to `deprecated_schema_id` and introducing a `schema_id` field with a new tag. Also a dyncfg is added so we can turn off writing to the new `schema_id` field until we're confident all nodes have rolled to a new version. 4. During bootstrapping of the Coordinator, if the nullability of columns for a MatView have changed, we compare and evolve the new schema in Persist. 5. Changed the upgrade check in `catalog-debug` to validate that the `RelationDesc`s as planned by the new version of MZ are compatible with the ones durably recorded in Persist. Also included in this PR are two feature gate changes to disable new features in our tests until > `v0.126`. 1. Builtin Continual Tasks 2. 0dt Enabled Sources Both of these features cause the new version of MZ to issue writes during a 0dt upgrade when it's supposed to be in read-only mode. Because this PR changes the Arrow datatypes in a non-forward compatible way, this causes the 0dt tests to fail since the old version panics when it sees the new batches. ### Motivation Fixes MaterializeInc/database-issues#8660 ### Tips for reviewer All of these changes have been made in separate commits of the PR which ideally makes reviewing easier. ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
This PR disables Persist's compaction when environmentd or clusterd are in read-only mode. As part of #30205 we discovered that during a 0dt deployment the read-only instance of Materialize was scheduling compaction requests which was causing it to write data.
We disable compaction by adding a
process_compaction_requests: Arc<AtomicBool>
to thePersistConfig
, and when setting it totrue
whenclusterd
receives the already existingAllowWrites
command. Also included is a CYA dyncfg that gates whether or not we check the flag, it's set totrue
by default.I also added a Prometheus metric to count the number of requests dropped because compaction is disabled, a unit test to exercise the basic enable/disable behavior, and manually checked when running a 0dt test that compaction was successfully disabled and then enabled across
environmentd
and allclusterd
s when a deployment was promoted.Motivation
Fix issue found in #30205
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.