
Create a special optimizer pipeline for constant INSERTs #30666

Merged
merged 2 commits on Dec 3, 2024

Conversation

@ggevay (Contributor) commented Dec 1, 2024

This PR fixes this regression in INSERT performance: https://github.com/MaterializeInc/database-issues/issues/8801
It adds a new, very simple optimizer pipeline whose only job is to handle constant INSERTs.

The test that, per the issue, had regressed by 10-20% compared to v0.125.3 is now 30-40% faster than v0.125.3:
https://buildkite.com/materialize/nightly/builds/10584
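The fast-path idea can be sketched with a toy model. This is a hypothetical sketch, not Materialize's actual API: names like `try_fold_to_constant`, `optimize_insert`, and `full_logical_optimizer` are made up for illustration.

```rust
// Toy model of the fast path: optimistically fold the INSERT's source
// expression to a constant; only fall back to the full logical optimizer
// when folding fails. All names here are illustrative.

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Constant(Vec<i64>),          // a fully evaluated collection of rows
    Union(Box<Expr>, Box<Expr>), // stand-in for arbitrary relational structure
    Get(String),                 // reference to stored data: not constant
}

// Returns None when the expression depends on stored data.
fn try_fold_to_constant(e: &Expr) -> Option<Vec<i64>> {
    match e {
        Expr::Constant(rows) => Some(rows.clone()),
        Expr::Union(l, r) => {
            let mut rows = try_fold_to_constant(l)?;
            rows.extend(try_fold_to_constant(r)?);
            Some(rows)
        }
        Expr::Get(_) => None,
    }
}

fn optimize_insert(e: Expr) -> Expr {
    match try_fold_to_constant(&e) {
        Some(rows) => Expr::Constant(rows), // fast path: done, no full pipeline
        None => full_logical_optimizer(e),  // slow path, as before
    }
}

// Placeholder for the real, much more expensive pipeline.
fn full_logical_optimizer(e: Expr) -> Expr {
    e
}

fn main() {
    let e = Expr::Union(
        Box::new(Expr::Constant(vec![1, 2])),
        Box::new(Expr::Constant(vec![3])),
    );
    assert_eq!(optimize_insert(e), Expr::Constant(vec![1, 2, 3]));
    println!("fast path folded the insert to a constant");
}
```

The point is that a fully constant INSERT never pays for the full transform sequence; only non-constant sources fall through to the general pipeline.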

Motivation

Tips for reviewer

The first commit is just some trivial cleanup and can be reviewed separately.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@ggevay ggevay marked this pull request as ready for review December 2, 2024 11:37
@ggevay ggevay requested review from a team as code owners December 2, 2024 11:37
@ggevay ggevay requested a review from jkosh44 December 2, 2024 11:37
@ggevay ggevay added the A-optimization (Area: query optimization and transformation), A-ADAPTER (Topics related to the ADAPTER layer), and A-CLUSTER (Topics related to the CLUSTER layer) labels Dec 2, 2024
@def- (Contributor) commented Dec 2, 2024

The test that regressed in the issue by 10-20% compared to v0.125.3 is now 30-40% faster compared to v0.125.3:

Wonderful, thank you for checking. I'll verify locally with a larger scale too.

@def- (Contributor) commented Dec 2, 2024

Full Nightly run triggered; good for me if green, since this code path will be exercised across a lot of existing tests: https://buildkite.com/materialize/nightly/builds/10588

@ggevay (Contributor, Author) commented Dec 2, 2024

Well, I've also run the test locally, and it showed a 15-16% regression. What could be the reason for it behaving differently locally? Also, I'm curious to see what your local runs show.

I did

bin/mzcompose --find feature-benchmark down && bin/mzcompose --find feature-benchmark run default --scenario InsertMultiRow --other-tag v0.125.3 --scale=5

as recommended in the issue.

@def- (Contributor) commented Dec 2, 2024

The main difference is that in Nightly this scenario runs with scale 4, while this is running it with scale 5 (10 times larger).

Similar for me:

+++ Benchmark Report for run 1:
NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
InsertMultiRow                      | wallclock       |           1.158 |           0.976 |   s    |    10%     |    !!YES!!    | worse:  18.6% slower
InsertMultiRow                      | memory_mz       |         620.461 |         857.162 |   MB   |    20%     |      no       | better: 27.6% less
InsertMultiRow                      | memory_clusterd |         132.656 |         213.432 |   MB   |    50%     |      no       | better: 37.8% less
+++ Benchmark Report for run 2:
NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
InsertMultiRow                      | wallclock       |           1.192 |           1.056 |   s    |    10%     |    !!YES!!    | worse:  12.9% slower
InsertMultiRow                      | memory_mz       |         655.174 |         660.229 |   MB   |    20%     |      no       | better:  0.8% less
InsertMultiRow                      | memory_clusterd |         113.010 |         111.771 |   MB   |    50%     |      no       | worse:   1.1% more

So this fixes the performance for smaller inserts (1000 rows) but not with many rows (10000 rows), interesting.

@frankmcsherry (Contributor) left a comment:

The code seems fine, but the structure, naming, and comments are strongly bound to INSERT statements, even though the logic applies beyond insert statements. It seems to treat constant expressions better in general, independent of whether they come from inserts or other statements. I think we should make sure the code reflects that, rather than treating it as a special case.

Comment on lines 10 to 11
//! Optimizer implementation for `CREATE VIEW` statements and other misc statements, such as
//! `INSERT`.
@frankmcsherry (Contributor):

I'm not sure the additional context helps much. Surely we can just say "for relational expressions" or something? We're still in a file called view.rs, so there's more to do if we really want to clean things up, but I think "and other misc statements" could be tightened up or removed.

@ggevay (Contributor, Author):

Done

@frankmcsherry (Contributor):

Looks the same to me at the moment. Not the most important detail, but also perhaps something wasn't pushed / refreshed?

@ggevay (Contributor, Author):

Oops, sorry. Changed now to

//! An Optimizer that
//! 1. Optimistically calls `optimize_mir_constant`.
//! 2. Then, if we haven't arrived at a constant, it calls `optimize_mir_local`, i.e., the
//!    logical optimizer.
//!
//! This is used for `CREATE VIEW` statements and in various other situations where no physical
//! optimization is needed, such as for `INSERT` statements.

pub fn constant_insert_optimizer(_ctx: &mut TransformCtx) -> Self {
let transforms: Vec<Box<dyn Transform>> = vec![
Box::new(NormalizeLets::new(false)),
Box::new(canonicalization::ReduceScalars),
@frankmcsherry (Contributor):

I don't follow why ReduceScalars is here. It doesn't do relation constant folding, but the other two do.

@ggevay (Contributor, Author):

Just in case a user writes something like

insert into t values (1+1+1+1);
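The kind of reduction being defended here can be sketched with a toy scalar folder that collapses constant arithmetic like `1+1+1+1` to a single literal. This is hypothetical illustration code, not Materialize's actual ReduceScalars transform.

```rust
// Toy scalar reducer: fold constant arithmetic bottom-up, leaving anything
// non-constant in place. Illustrative sketch only.

#[derive(Debug, PartialEq)]
enum Scalar {
    Lit(i64),
    Add(Box<Scalar>, Box<Scalar>),
}

fn reduce(s: Scalar) -> Scalar {
    match s {
        Scalar::Add(l, r) => match (reduce(*l), reduce(*r)) {
            // Both sides are literals: fold them into one literal.
            (Scalar::Lit(a), Scalar::Lit(b)) => Scalar::Lit(a + b),
            // Otherwise, keep the (partially reduced) structure.
            (l, r) => Scalar::Add(Box::new(l), Box::new(r)),
        },
        lit => lit,
    }
}

fn main() {
    // Models the expression from: insert into t values (1+1+1+1);
    let e = Scalar::Add(
        Box::new(Scalar::Add(
            Box::new(Scalar::Lit(1)),
            Box::new(Scalar::Lit(1)),
        )),
        Box::new(Scalar::Add(
            Box::new(Scalar::Lit(1)),
            Box::new(Scalar::Lit(1)),
        )),
    );
    assert_eq!(reduce(e), Scalar::Lit(4));
    println!("1+1+1+1 reduced to 4");
}
```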

@ggevay (Contributor, Author):

(This is now included in fold_constants_fixpoint().)

@ggevay (Contributor, Author) commented Dec 2, 2024

Ok, I figured out why it was still showing a regression locally: I was running it with a larger scale (as Dennis mentioned), and the query was so large that FoldConstants gave up after hitting FOLD_CONSTANTS_LIMIT, so we still fell back to the full logical optimizer. I think there is no need to put more effort into optimizing such large queries, so I'll just leave that as is.
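The give-up behavior described here can be sketched as follows. The names and the limit value are made up for illustration; the real FOLD_CONSTANTS_LIMIT and FoldConstants are more involved.

```rust
// Sketch of a size guard: constant folding bails out when the folded
// constant would get too big, leaving the expression for the full
// optimizer. Illustrative only; the limit is artificially small.

const FOLD_LIMIT: usize = 8; // illustrative; the real limit is much larger

// Fold a union of constant row sets, or refuse if the result is too big.
fn fold_union(inputs: &[Vec<i64>]) -> Option<Vec<i64>> {
    let total: usize = inputs.iter().map(|rows| rows.len()).sum();
    if total > FOLD_LIMIT {
        return None; // give up: fall back to the full optimizer
    }
    Some(inputs.iter().flatten().copied().collect())
}

fn main() {
    let small = vec![vec![1, 2], vec![3]];
    assert_eq!(fold_union(&small), Some(vec![1, 2, 3]));

    let large = vec![vec![0; 100]]; // 100 rows > FOLD_LIMIT: folding refuses
    assert_eq!(fold_union(&large), None);
    println!("small insert folded, large insert left unfolded");
}
```

This matches the observed behavior: small constant INSERTs take the fast path, while very large ones exceed the limit and still pay for the general pipeline.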

@ggevay (Contributor, Author) commented Dec 2, 2024

Thanks for the review @frankmcsherry! I've addressed the comments.

I'm running Nightly again: https://buildkite.com/materialize/nightly/builds/10589
The result I'm expecting is that the speedup will be slightly less now that we are calling fold_constants_fixpoint() instead of a custom sequence of transforms, but I think we'll still have a speedup.

@ggevay (Contributor, Author) commented Dec 2, 2024

Locally, scale=4 is showing me 10-20% speedup.

@frankmcsherry (Contributor) commented Dec 2, 2024

So this fixes the performance for smaller inserts (1000 rows) but not with many rows (10000 rows), interesting.

@antiguru observed that with large constants, much of the time of FoldConstants is in trying to determine if there is a column that forms a unique key.
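A toy version of that key check shows where the time goes: deciding whether a column forms a unique key means hashing every value in that column, so for a large constant with r rows and c columns it is O(r·c) work. Hypothetical code; Materialize's real key analysis is more involved.

```rust
use std::collections::HashSet;

// Toy unique-key detection over a constant collection: for each column,
// test whether all of its values are distinct. Illustrative sketch only.
fn unique_key_columns(rows: &[Vec<i64>], arity: usize) -> Vec<usize> {
    (0..arity)
        .filter(|&col| {
            // HashSet::insert returns false on a duplicate, ending the scan.
            let mut seen = HashSet::with_capacity(rows.len());
            rows.iter().all(|row| seen.insert(row[col]))
        })
        .collect()
}

fn main() {
    let rows = vec![vec![1, 5], vec![2, 5], vec![3, 7]];
    // Column 0 is all-distinct; column 1 repeats the value 5.
    assert_eq!(unique_key_columns(&rows, 2), vec![0]);
    println!("unique key columns: {:?}", unique_key_columns(&rows, 2));
}
```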

@frankmcsherry (Contributor) left a comment:

Looks good; there are still some comments, but I think this is better going forward. We can further optimize / specialize if we need to, but I approve of starting from here rather than from a more deeply specialized implementation!


@@ -565,11 +567,11 @@ pub fn fold_constants_fixpoint() -> Fixpoint {
name: "fold_constants_fixpoint",
limit: 100,
transforms: vec![
Box::new(NormalizeLets::new(false)),
@frankmcsherry (Contributor):

I don't understand why this moved. We end up with a thing that may not be normalized, but I don't see why we want that.

@ggevay (Contributor, Author):

Ah, sorry, I thought the order wouldn't matter for which result we arrive at, because it's a fixpoint loop. But that may not be true if FoldConstants and NormalizeLets fight with each other for some reason. I don't see a reason why they would, but I'll change the order back anyway: we are more robust that way, since normalization is important.

(The reason why I changed the order is that in the INSERT constant scenario (and probably in other similar cases) this would settle down with one less NormalizeLets run with the new order. But this is not so important.)
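The fixpoint behavior under discussion can be sketched with a toy driver (the real Fixpoint transform in Materialize is more elaborate; this is an illustrative model only). If the transforms don't undo each other, the order within the loop shouldn't change which fixpoint is reached, but it can change how many iterations are needed to settle, which is the trade-off described above.

```rust
// Toy fixpoint driver: apply the transform sequence repeatedly until the
// expression stops changing or an iteration limit is hit.
fn fixpoint<T: Clone + PartialEq>(mut e: T, transforms: &[fn(T) -> T], limit: usize) -> T {
    for _ in 0..limit {
        let before = e.clone();
        for t in transforms {
            e = t(e);
        }
        if e == before {
            return e; // settled: no transform made progress this round
        }
    }
    e // gave up at the iteration limit
}

fn main() {
    // Two tiny "transforms" on integers, standing in for things like
    // NormalizeLets and FoldConstants: halve even numbers, clamp negatives.
    let halve: fn(i64) -> i64 = |x| if x % 2 == 0 { x / 2 } else { x };
    let clamp: fn(i64) -> i64 = |x| if x < 0 { 0 } else { x };
    // 40 -> 20 -> 10 -> 5, then nothing changes, so the loop stops.
    assert_eq!(fixpoint(40, &[halve, clamp], 100), 5);
    println!("fixpoint of 40 under halve+clamp: 5");
}
```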

@ggevay (Contributor, Author) commented Dec 2, 2024

So this fixes the performance for smaller inserts (1000 rows) but not with many rows (10000 rows), interesting.

@antiguru observed that with large constants, much of the time of FoldConstants is in trying to determine if there is a column that forms a unique key.

Actually, this had a different reason, see above. (Btw., I think it worked for 10000 and didn't work for 100000, for the reason explained above.)

@ggevay (Contributor, Author) commented Dec 2, 2024

Addressed the remaining comments.

Again, new Nightly run, hopefully the last: https://buildkite.com/materialize/nightly/builds/10591

Will merge when (enough of) Nightly completes.

@ggevay (Contributor, Author) commented Dec 3, 2024

Nightly is fine.

@ggevay ggevay merged commit 8874d3e into MaterializeInc:main Dec 3, 2024
221 of 224 checks passed