-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a special optimizer pipeline for constant INSERTs #30666
Conversation
Wonderful, thank you for checking. I'll verify locally with a larger scale too. |
Fully nightly run triggered, good for me if green since this should be used across a lot of existing tests: https://buildkite.com/materialize/nightly/builds/10588 |
Well, I've also run the test locally, and it showed a 15-16% regression. What could be the reason for it behaving differently locally? Also, I'm curious to see what your local runs show. I did
as recommended in the issue. |
The main difference is that in Nightly this scenario runs with scale 4, while this is running it with scale 5 (10 times larger). Similar for me:
So this fixes the performance for smaller inserts (1000 rows) but not with many rows (10000 rows), interesting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code seems fine, but the structure, naming, and comments have a strong binding to INSERT
statements, but apply the logic beyond insert statements. It seems to generally treat constant expressions better, independent of whether they are inserts or other statements. I think we should make sure the code reflects that, vs treating it as a special case.
src/adapter/src/optimize/view.rs
Outdated
//! Optimizer implementation for `CREATE VIEW` statements and other misc statements, such as | ||
//! `INSERT`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the additional context helps much. Sure we can just say "for relational expressions" or something? We're still in a file called view.rs
, so .. there's more to do if we really want to clean things up, but I think "and other misc statements" could be tightened up or removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks the same to me at the moment. Not the most important detail, but also perhaps something wasn't pushed / refreshed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, sorry. Changed now to
//! An Optimizer that
//! 1. Optimistically calls `optimize_mir_constant`.
//! 2. Then, if we haven't arrived at a constant, it calls `optimize_mir_local`, i.e., the
//! logical optimizer.
//!
//! This is used for `CREATE VIEW` statements and in various other situations where no physical
//! optimization is needed, such as for `INSERT` statements.
src/transform/src/lib.rs
Outdated
pub fn constant_insert_optimizer(_ctx: &mut TransformCtx) -> Self { | ||
let transforms: Vec<Box<dyn Transform>> = vec![ | ||
Box::new(NormalizeLets::new(false)), | ||
Box::new(canonicalization::ReduceScalars), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow why ReduceScalars
is here. It doesn't do relation constant folding, but the other two do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case a user writes something like
insert into t values (1+1+1+1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is now included in fold_constants_fixpoint()
.)
Ok, figured out that it was still showing a regression locally for me because I was running it with a larger scale (as mentioned by Dennis), and the query was so large that |
Thanks for the review @frankmcsherry! I've addressed the comments. I'm running Nightly again: https://buildkite.com/materialize/nightly/builds/10589 |
Locally, |
@antiguru observed that with large constants, much of the time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, still some comments, but I think this is better going forward. We can further optimize / specialize if we need, but I approve of starting from here, vs a more deeply specialized implementation!
src/adapter/src/optimize/view.rs
Outdated
//! Optimizer implementation for `CREATE VIEW` statements and other misc statements, such as | ||
//! `INSERT`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks the same to me at the moment. Not the most important detail, but also perhaps something wasn't pushed / refreshed?
src/transform/src/lib.rs
Outdated
@@ -565,11 +567,11 @@ pub fn fold_constants_fixpoint() -> Fixpoint { | |||
name: "fold_constants_fixpoint", | |||
limit: 100, | |||
transforms: vec![ | |||
Box::new(NormalizeLets::new(false)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this moved. We end up with a thing that may not be normalized, but I don't see why we want that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, I thought that the order doesn't matter from the point of view of which result we arrive at, because it's a fixpoint loop. But this may not be true if FoldConstants
and NormalizeLets
fight with each other for some reason. I don't see a reason why they would fight, but anyhow I'll change the order back, because we are more robust that way, because normalization is important.
(The reason why I changed the order is that in the INSERT constant scenario (and probably in other similar cases) this would settle down with one less NormalizeLets
run with the new order. But this is not so important.)
Actually, this had a different reason, see above. (Btw., I think it worked for 10000 and didn't work for 100000, for the reason explained above.) |
Addressed the remaining comments. Again, new Nightly run, hopefully the last: https://buildkite.com/materialize/nightly/builds/10591 Will merge when (enough of) Nightly completes. |
Nightly is fine:
|
This PR fixes this regression in INSERT performance: https://github.com/MaterializeInc/database-issues/issues/8801
It adds a new, very simple optimizer pipeline, whose job is just to take care of constant INSERTs.
The test that regressed in the issue by 10-20% compared to v0.125.3 is now 30-40% faster compared to v0.125.3:
https://buildkite.com/materialize/nightly/builds/10584
Motivation
Tips for reviewer
The first commit is just some trivial cleanup, can be reviewed separately.
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.