Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc/developer: add "Catalog tables for Optimizer Notices" design doc #23562

Merged
merged 1 commit into from
Jan 9, 2024

Conversation

aalexandrov
Copy link
Contributor

@aalexandrov aalexandrov commented Nov 30, 2023

Design doc for the infrastructure requested in MaterializeInc/database-issues#6468.

Rendered version.

Motivation

  • This PR adds a known-desirable feature.

Tips for reviewers

Still needs some sections to be filled in. I hope to move it out from draft mode later today.

  • @ggnall / @sthm please make sure that the problem statement and the UX diagram are aligned with your vision / feature requests.
  • @ggnall I'm also looking into guidance about the RBAC issues alternatives from the Open questions section.
  • @ggevay as the primary reviewer of adapter: add catalog tables for optimizer notices #23360 the most relevant parts for you are the Catalog Schema and Schema Maintenance parts of the Solution Proposal.
  • @ParkMyCar mainly I want to double-check for possible performance regressions in the way I intend to do mz_all_optimizer_notices maintenance.
  • @aljoscha please review the Changes required for Platform V2 and let me know if this aligns with your vision for distributed cooordd.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:
    • There are no user-facing behavior changes.

@aalexandrov aalexandrov requested review from ggnall and ggevay November 30, 2023 09:07
@aalexandrov aalexandrov self-assigned this Nov 30, 2023
@aalexandrov aalexandrov requested a review from sthm November 30, 2023 09:09
@aalexandrov aalexandrov force-pushed the issue_21513_design branch 6 times, most recently from 8f4505c to b844f0f Compare November 30, 2023 13:20
Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote some comments. I'm still going through the design doc.

re-optimize the entry due to changes in the optimizer code or in the catalog
state.

On the other, for a frictionless UX we should be able to remember a user's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand


#### Inspecting a Notice

![Listing all objects wireframe](./static/20231113_optimizer_notice_catalog/ux_inspecting_a_notice.png)
Copy link
Contributor

@ggevay ggevay Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For the corrective action of IndexTooWideForLiteralConstraints, we should mention that the user might want to drop the old index if it is not used by some other queries.)

Copy link
Contributor Author

@aalexandrov aalexandrov Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you with OK with just emitting a dedicated notice for indexes that are not used at all?

Then handling the IndexTooWideForLiteralConstraints and re-creating the objects might result in another notice IndexNotUsed which the user will have to ignore if they have created the index for serving purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tricky, because we'd like to keep notices quite high-signal, i.e., false positives should be rare. But IndexNotUsed would definitely show a false positive for each serving index, if we implement it by only looking at static objects. So, I'd say we should add this notice only when we'll also have the last usage time of each index implemented (https://github.com/MaterializeInc/materialize/issues/20877), because then we can take that also into account when deciding whether to show an IndexNotUsed.

Copy link
Contributor

@ggevay ggevay Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you with OK with just emitting a dedicated notice for indexes that are not used at all?

Ok! And then when we have IndexNotUsed later implemented in the above way, then these things will nicely fall into place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'd say we should add this notice only when we'll also have the last usage time of each index implemented (https://github.com/MaterializeInc/materialize/issues/20877), because then we can take that also into account when deciding whether to show an IndexNotUsed.

The design also provisions for ignoring a notice in a way that will survive restarts. If this is the case the user can just manually ignore all false positive notices for IndexNotUsed.


### Catalog Schema

#### <a id="mz_optimizer_notices"></a> `mz_internal.mz_optimizer_notices` (new)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking that maybe instead of "Optimizer notices" we should call these things "Optimization notices", because some of them are actually emitted not by the optimizer. For example, DroppedInUseIndex is created in sequence_drop_common.

Copy link
Contributor Author

@aalexandrov aalexandrov Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression was that the current iteration focuses precisely on the notices emitted by the optimizer as well as other notices that we can add in the future. For example, dropping an index that is still in use will create dangling import_ids in some of the plans of running indexes, so we can have other notices attached to the specific index or materialized view that depends on that index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename this to "Optimization notices".

| `redacted_message` | `text` | A redacted version of the `message` column. `NULL` if no redaction is needed. |
| `redacted_hint` | `text` | A redacted version of the `hint` column. `NULL` if no redaction is needed. |
| `redacted_action` | `text` | A redacted version of the `action` column. `NULL` if no redaction is needed. |
| `action_type` | `text` | The type of the `action` string (`sql_statements` for a valid SQL string or `plain_text` for plain text). |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are recording action_type because at some point in the future we might want to add a button that simply executes it if it is sql_statements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes (this request came from @sthm). A button probably won't cut it until we have zero-downtime REPLAN which seems some way down the road. However, I guess even now it is better UX if the customer can copy-paste a valid SQL string the notice into their dbt project or whatever else they use to populate their catalog.

| `redacted_action` | `text` | A redacted version of the `action` column. `NULL` if no redaction is needed. |
| `action_type` | `text` | The type of the `action` string (`sql_statements` for a valid SQL string or `plain_text` for plain text). |
| `object_id` | `text` | The ID of the materialized view or index. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). For global notices, this column is `NULL`. |
| `dependency_ids` | `text list` | A list of dependency IDs that need to exist for this notice to be still valid. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). |
Copy link
Contributor

@ggevay ggevay Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about the exact connection between object_id and dependency_ids. For an example, let's consider the CommonArrangedInput notice (I just made up this name; it's https://github.com/MaterializeInc/materialize/issues/20802).

In this example, would dependency_ids list all those objects that arrange the same one input? But then if one of these objects is dropped, the notice might still be valid if originally there were more than two objects arranging the same input. It becomes invalid only if all but one of the ids in dependency_ids is dropped.

Another question for this example: what will be the object_id? Will we have one row for each object in dependency_ids?

Copy link
Contributor Author

@aalexandrov aalexandrov Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an example, let's consider the CommonArrangedInput notice (I just made up this name; it's https://github.com/MaterializeInc/materialize/issues/20802).

Here is my idea:

  • The notice is scoped under multiple catalog objects at once (so object_id is null).
  • We have a background thread that periodically scans all catalog plans and extracts an index of such notices, one for each index.
  • The notice is referring to objects that need to exist in order to make it valid. Rather it is suggesting an index that will be picked up by different dataflows, so we are collecting dependants and not dependencies.
  • The dependants can be modeled in memory as a field CommonArrangedInput::dependant_ids: BTreeSet<GlobalId>. - When we "resolve" CommonArrangedInput into an OptimzierNotice we will render a message containing a list of fully-qualified names of these ids.
  • When one of these dependant_ids entries is dropped we will remove it from the in-memory state of associated CommonArrangedInput, emit a retraction for the old mz_all_optimzier_notices for these notices, and an insertion with the new state.
  • When the dependant_ids becomes a singleton or with only one entry we will emit a retraction for the old notice and remove it from the in-memory state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe another way to model this is to have dependant_ids (as a separate relation or a list) instead of object_id. This will contain one entry for notices scoped to a single element and multiple entries for things like CommonArrangedInput.

Copy link
Contributor

@ggevay ggevay Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok, so this will be used by Rust code in a custom way for each notice when computing table updates, and not by some SQL in a view that just simply hides the row when any (or all) of the things in dependency_ids is gone, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe another way to model this is to have dependant_ids (as a separate relation or a list) instead of object_id. This will contain one entry for notices scoped to a single element and multiple entries for things like CommonArrangedInput.

Sounds good.

Copy link
Contributor Author

@aalexandrov aalexandrov Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Eager cleanup of notices when a dependency is deleted explores the alternatives. At the moment the design is a bit asymmetric because for object_id I eagerly retract from mz_all_optimizer_notices, but for dependency_ids I use a SQL filter in the mz_optimizer_notices view definition. This is definitely something that I want to unify in the long term, currently leaning towards "eagerly remove entries" in both cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would love if we could unify the two, specifically if we could also eagerly retract from mz_all_optimizer_notices for the dependency_ids column.

@aalexandrov aalexandrov force-pushed the issue_21513_design branch 2 times, most recently from 410bbae to d10e74d Compare November 30, 2023 14:49
@aalexandrov aalexandrov marked this pull request as ready for review November 30, 2023 14:50
@aalexandrov aalexandrov force-pushed the issue_21513_design branch 2 times, most recently from a56e30a to 5d68a86 Compare November 30, 2023 16:15
@aalexandrov aalexandrov requested review from aljoscha, a team and ParkMyCar November 30, 2023 16:17
- A SQL statement that marks a notice as no longer ignored results in a
retraction added to the source.

#### SQL syntax extensions
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ParkMyCar I guess since you are currently in the SQL Council I need your approval on this as well.

docs should at least keep the suggested sections.
-->

## The Problem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section looks good to me. For our current service model this also means Field Engineering can't look up prior notices and can only rely on asking for (or running) realtime EXPLAINs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section looks good to me. For our current service model this also means Field Engineering can't look up prior notices and can only rely on asking for (or running) realtime EXPLAINs.

FYI several months ago we landed support for

EXPLAIN MATERIALIZED VIEW <item>
EXPLAIN INDEX <item>

which will the notices generated when we last planned <item>. In that sense the EXPLAINs are not realtime, but they still need to be executed ad-hoc by field eng. and there is no good way to get a sense of common notices occurring in many items.

objects have been (re-)created.
- A notice marked as "ignored" indicates that the user considers it irrelevant
or low priority and they don't want to see it in the foreseeable future.
- The page will contain an extra filter dropdown that will allow customers to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These defaults look good. And it's nice that a user can reveal a class of notice they've previously dismissed.

to be built on top of a connection that uses the `mz_support` user.
- ☒ The UX might leak information by listing notices for catalog items that
are not owned by the current user with the same implications as above.
2. Use `DataSensitivity::SuperuserAndSupport` (restrictive).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems the most straightforward for an FE-targeted first version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would notice maintenance be restricted from support users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While FE are the primary users of this feature we should find a way to enable it for them. Maybe once regular users starts seeing the notices this type of access should be revisited.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I got a bit lost here. As a general comment for this RBAC section, could you maybe structure the text in a way that the measures that we take (for each of the 3 options) and their consequences are somehow separated?

I don't really understand in what way 2. is any better than 1. They both say "All notices will be visible to all Materialize users.". I understand this as everybody sees everything. Does it matter whether it's only through the Console or SQL?

Notices can be only interpreted by superusers

What does "interpreted" mean here? Do I understand correctly that if we go with 3., then the notices view in the Console will be simply unusable only by ordinary users who are not superusers and support?

The UX will need to be built on top of a connection that uses the mz_support user.

How does the Console currently perform its other queries? Does it simply use the same user as the user who is viewing the Console (and mz_support if impersonated)? If yes, then do you mean we would introduce an exception in the Console code, where this specific thing would be performed on a different connection? Or you mean that we perform these queries only if we are the mz_support user?

The UX will hide notices if the current user is not either a superuser or a support user.

Is this sentence describing a measure that we take (i.e., guard the notice page in the Console code by an if that checks whether we are a superuser or a support user and show), or is this somehow a consequence of some other measures that we take?

The UX won't leak information by listing notices for catalog items that are not owned by the current user.

You mean that the Console will examine the RBAC attributes of each id in dependency_ids and/or object_id, and show only those notices for which RBAC says that it is owned by the current user? This will be the final solution in the long run, right? Btw. as an intermediate step before implementing this, could we make the Console show only the redacted columns to non-superusers?

2. Use `DataSensitivity::SuperuserAndSupport` (restrictive).
- ☒ All notices will be visible only to Materialize superusers and
Materialize support (through the impersonated console).
- ☒ Notices will be visible to customers only from the UX. The UX will hide
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not following why this must be the case? UX seems like the primary interface, but I'm not sure why SQL wouldn't be supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't consider that regular users where User::is_external_admin() is true will also be able to see the notices through the SQL console. I updated the text a bit to reflect that.

Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more minor comments

| `action_type` | `text` | The type of the `action` string (`sql_statements` for a valid SQL string or `plain_text` for plain text). |
| `object_id` | `text` | The ID of the materialized view or index. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). For global notices, this column is `NULL`. |
| `dependency_ids` | `text list` | A list of dependency IDs that need to exist for this notice to be still valid. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). |
| `created_at` | `timestamp with time zone` | The time at which the notice was created. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the "creation" that also happens at bootstrap, or will this survive restarts? (I guess the latter would be more useful to users.)

`mz_objects`. This ensures that whenever a dependency is deleted we also
implicitly delete any notices that depend on it.
- `C` corresponds to [all `mz_all_optimizer_notices`
entries](#mz_all_optimizer_notices) that that have a `fingerprint` contained
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that that

| `object_id` | `text` | The ID of the materialized view or index. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). For global notices, this column is `NULL`. |
| `dependency_ids` | `text list` | A list of dependency IDs that need to exist for this notice to be still valid. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). |
| `created_at` | `timestamp with time zone` | The time at which the notice was created. |
| `updated_at` | `timestamp with time zone` | The time at which the notice was last updated. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are status updates by the user, right?

schema of that table is identical to `mz_optimizer_notices` and its contents are
[maintained directly by the adapter](#notice-generation-and-management).

#### `mz_ignored_optimizer_notices`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, do I understand correctly that mz_ignored_optimizer_notices needs to handle only the "ignored" state but is not concerned with the "resolved" state, because the "resolved" state doesn't need to survive restarts, because we expect those notices to simply not be recreated upon a restart?

Copy link
Contributor

@sthm sthm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, thanks for working on this!

-->

1. Design details about the actual console UX beyond basic wireframe diagrams.
2. Implementation details about the integration path for optimizer notices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems an interesting followup, but it's fair to not consider it for the first release.

Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this idea! I need to understand a bit better how retractions from mz_all_optimizer_notices, would work. More than happy to chat in a huddle or zoom if that would help!

Comment on lines +269 to +287
item). However, the notice structure of the notices generated by the optimizer
is not suitable for emitting retractions - for example:

1. It doesn't have an optional `created_at` field.
2. Some of the text fields need to be resolved against catalog items that might
be renamed or removed in the meantime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aalexandrov I'm wondering if we can update the schema of mz_all_optimizer_notices to make the entries deterministic, and thus suitable for retractions? Do we need the created_at field or updated_at columns, and can we re-write message and hint to rely on object ids and not names?

Comment on lines 229 to 258
Conceptually, the `mz_optimizer_notices` view is defined as:

```sql
A EXCEPT ALL (B UNION C)
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make a single builtin source, mz_optimizer_notices instead of maintaining mz_all_optimizer_notices, mz_ignored_optimizer_notices, and then a mz_optimizer_notices view on top of the two?

Could mz_optimizer_notices be a builtin source, with the schema you describe above, but an additional state column, which is how we could mark certain notices as ignored?

| `redacted_action` | `text` | A redacted version of the `action` column. `NULL` if no redaction is needed. |
| `action_type` | `text` | The type of the `action` string (`sql_statements` for a valid SQL string or `plain_text` for plain text). |
| `object_id` | `text` | The ID of the materialized view or index. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). For global notices, this column is `NULL`. |
| `dependency_ids` | `text list` | A list of dependency IDs that need to exist for this notice to be still valid. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would love if we could unify the two, specifically if we could also eagerly retract from mz_all_optimizer_notices for the dependency_ids column.

Comment on lines 276 to 303
In order to protect ourselves against invalid retraction issues arising from
these problems we make a distinction between the notice type produced by the
optimizer (`RawOptimizerNotice`) and the notice type stored in the
`CatalogPlans` struct (`OptimizerNotice`).

Further, the type of the notices stored in `DataflowMeatinfo::optimizer_notices`
depends on the execution stage - within the optimizer we still need to use
`RawOptimizerNotice`, but before storing the final result in the `CatalogPlans`
struct we need to move to a `OptimizerNotice` vector (done with a
`CatalogState::render_notices` call). We use the `CatalogPlans` entries to
generate insertions and deletions in the `mz_all_optimizer_notices` table.

Also, since we expect to have more than two notice types in the long term, we
reorganize the code a bit so adding a new notice is easier in the future. The
required steps are as follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you go into more detail about how we would issue retractions? For example, if a dependent object was dropped, or an object was renamed, how do we issue a retraction, even by using the Catalog? The previous dependent ID is gone from the catalog, as is the previous name.

If there's already code for this in your PR that handles this scenario, I'm more than happy to read through that :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two ways we remove remove / supress notices from the mz_optimizer_notices view:

  1. When an entry referenced in object_id is deleted, we remove the notice at the end of Catalog::transact.
  2. When an entry referenced in dependency_ids is deleted, the current mz_optimizer_notices view definition will filter out the enclosing notice.

In the long term I would like to emit eager retractions also for (2), but for the the sake of merging an MVP that would allow me to put together an internal console app and expose the feature to field engineering for beta testing I thought this discrepancy won't hurt too much.

5. Add the notice type to the `raw_optimizer_notices` macro which generates the
`RawOptimizerNotice` enum and other boilerplate code.

#### `mz_all_optimizer_notices` maintenance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an estimate to how many entries might exist in mz_all_optimizer_notices? We should double check that filter pushdown works on a column, preferably on the object_id column which I'm guessing is what users will mostly filter on

Comment on lines 331 to 351
```sql
ALTER NOTICE STATUS SET '<fingerprint>' = '<status>';
ALTER NOTICE STATUS RESET '<fingerprint>';
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally ALTER commands take the format ALTER <type> <name> <field> (e.g. alter cluster or alter index) what do you think about the following format?

ALTER NOTICE <fingerprint> SET STATUS = '<status>';
ALTER NOTICE <finterprint> RESET STATUS;

- This can be quite tricky to achieve.
-->

## Alternatives
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about durably storing the notices, or some portion of the notice, in the Stash, instead of a builtin table/source? This is what we do for object comments, and it might alleviate some of the problems around retractions?

Copy link
Contributor Author

@aalexandrov aalexandrov Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about durably storing the notices, or some portion of the notice, in the Stash, instead of a builtin table/source?

I think the main reason not to do that is that that optimizer notices are a function of the catalog state, more specifically a lot of them would be a function of properties of the plans for materialized views and indexes created during bootstrap.

This means that when we restart / upgrade materialize the plans might change. We would need to somehow discard all optimizer notices created before the restart and create new set of notice entries for the builtin source. Given these semantics, I thought that maintaining the optimizer notices in a builtin table is sufficient for now.

The only durable state that we really want to avoid losing between restart is the bit that defines whether the notice is ignored or not.

@aalexandrov
Copy link
Contributor Author

aalexandrov commented Dec 7, 2023

I need to understand a bit better how retractions from mz_all_optimizer_notices, would work. More than happy to chat in a huddle or zoom if that would help!

Thanks for the review! I have an MVP open in #23360. I think it's best to have a huddle and go over the main parts of the PR together because I plan to change some things before I merge this PR and align the mechanisms used to retract invalid notices shortly after that with a follow-up PR.

Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aalexandrov and I chatted over Zoom and we're aligned on this design!

For posterity, the follow ups we have for later versions of the feature are:

  • Eagerly retracing notices when a dependency goes away, e.g. an index whose key was too wide gets dropped. We think this can be handled by adding a dependencies field to CatalogPlans, and then add a check to the Coordinator's consistency checks.
  • Handling of the ignored/resolved state. This is a bit trickier because we need to survive restarts of envd, recreation of objects, and ideally renaming of the object, or the schema/database it exists in. There are some ideas on how to "fingerprint" this ignored/resolved state, but punting on it for now as it's not required for V0.

@aalexandrov
Copy link
Contributor Author

@ParkMyCar I updated the design doc. If I can squeeze in the refactor to do the first bit (Eagerly retracing notices when a dependency goes away) I might tag you for a fast follow-up PR tomorrow.

@aalexandrov aalexandrov merged commit f61f445 into MaterializeInc:main Jan 9, 2024
@aalexandrov aalexandrov deleted the issue_21513_design branch January 9, 2024 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants