@56quarters 56quarters commented Jan 7, 2026

What this PR does

This change extends the EliminateDeduplicateAndMergeOptimizationPass to attempt to remove deduplicate and merge nodes on either side of a binary expression.

Which issue(s) this PR fixes or relates to

Prerequisite for #13863

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

Note

Improves query plan optimization by pruning unnecessary DeduplicateAndMerge nodes around binary expressions, respecting delayed name removal semantics.

  • Enhances EliminateDeduplicateAndMergeOptimizationPass to keep only the required dedupe node for binary ops and eliminate others when safe
  • Adds Prometheus counters: cortex_mimir_query_engine_eliminate_dedupe_attempted_total and cortex_mimir_query_engine_eliminate_dedupe_modified_total; increments on attempts/plan changes
  • Planner now registers the optimization pass with a Registerer; all call sites updated
  • Test suite updated to reflect simplified plans (fewer dedupe nodes), assert new metrics, and adjust dispatcher node paths/indices
  • Minor cleanup: simplify types.HasDuplicateSeries for 2-series case

Written by Cursor Bugbot for commit ae195fe.

@56quarters 56quarters added the changelog-not-needed PRs that don't need a CHANGELOG.md entry label Jan 7, 2026
@56quarters 56quarters force-pushed the 56quarters/mqe-dedupe branch 2 times, most recently from 414c1eb to fc4bc16 Compare January 8, 2026 00:04
This change extends the `EliminateDeduplicateAndMergeOptimizationPass`
to attempt to remove deduplicate and merge nodes on either side of a
binary expression.

Prerequisite for #13863

Signed-off-by: Nick Pillitteri <[email protected]>
@56quarters 56quarters force-pushed the 56quarters/mqe-dedupe branch from fc4bc16 to fadd1cc Compare January 8, 2026 17:31
@charleskorn charleskorn left a comment

Overall LGTM. Could you please add a test for cases where binary operations are nested? (e.g., foo or bar or baz, or rate(foo[1m]) or rate(bar[1m]) or rate(baz[1m]))

@56quarters 56quarters marked this pull request as ready for review January 9, 2026 17:07
@56quarters 56quarters requested a review from a team as a code owner January 9, 2026 17:07
@56quarters 56quarters requested a review from charleskorn January 9, 2026 17:41
@charleskorn charleskorn left a comment

LGTM modulo suggestions below

* use slices.ContainsFunc
* assert metrics when queries are not modified

Signed-off-by: Nick Pillitteri <[email protected]>
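
As a rough illustration of the `slices.ContainsFunc` suggestion: the standard library helper replaces a hand-written "loop until a match is found" with a single call. The sketch below is not taken from the PR. The `matcher` type and `hasExactNameMatcher` function are hypothetical stand-ins, loosely modeled on the "exact name matcher" check mentioned in the `getSelectorType` comment.

```go
package main

import (
	"fmt"
	"slices"
)

// matcher is a hypothetical, simplified label matcher used only for this
// sketch. It is not the type used by Mimir or Prometheus.
type matcher struct {
	name  string
	value string
	exact bool // true for an equality matcher such as foo="bar"
}

// hasExactNameMatcher reports whether any matcher is an equality matcher on
// the metric name label. slices.ContainsFunc (Go 1.21+) stops at the first
// element for which the predicate returns true.
func hasExactNameMatcher(ms []matcher) bool {
	return slices.ContainsFunc(ms, func(m matcher) bool {
		return m.exact && m.name == "__name__"
	})
}

func main() {
	exact := []matcher{
		{name: "__name__", value: "foo", exact: true},
		{name: "job", value: "api", exact: true},
	}
	regex := []matcher{
		{name: "__name__", value: "foo.*", exact: false},
	}

	fmt.Println(hasExactNameMatcher(exact)) // true
	fmt.Println(hasExactNameMatcher(regex)) // false
	fmt.Println(hasExactNameMatcher(nil))   // false
}
```

The predicate form keeps the matching condition next to the call site and behaves sensibly for empty or nil slices, which always yield false.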
}

// getSelectorType determines if node is a selector and whether it has an exact name matcher.
func getSelectorType(node planning.Node) SelectorType {

Label replace handling incorrectly marks sibling branch nodes

Medium Severity

The label_replace handling at lines 148-150 assumes nodes[len-2] is always an ancestor of the label_replace function. However, with the new binary expression traversal (lines 93-103), the nodes list can now contain DeduplicateAndMerge nodes from sibling branches. For a query like rate(foo[5m]) or label_replace(rate(bar[5m]), ...), when processing label_replace, nodes[len-2] incorrectly points to foo_rate's dedup node (from the LHS branch) rather than an ancestor of label_replace. This causes the LHS dedup node to be incorrectly retained when it should be eliminated (since foo has an exact name matcher). Before this PR, binary expressions caused early return so this code path was never exercised.
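
The situation described above can be sketched with a toy tree. This is a minimal model, not Mimir's code: the `node` type, the plan shapes, and every function name here are assumptions made for illustration. It only shows why, once dedupe nodes from several branches accumulate in one flat list, the second-to-last entry need not be an ancestor of the node currently being visited.

```go
package main

import "fmt"

// node is a toy plan node for this sketch; it is not planning.Node.
type node struct {
	kind     string // "dedupe", "binary", "func" or "selector"
	name     string
	children []*node
}

// walker appends every dedupe node it meets to one shared list (seen) and
// copies that list at the moment it visits target.
type walker struct {
	seen     []*node
	target   *node
	snapshot []*node
}

func (w *walker) walk(n *node) {
	if n.kind == "dedupe" {
		w.seen = append(w.seen, n)
	}
	if n == w.target {
		w.snapshot = append([]*node(nil), w.seen...)
	}
	for _, c := range n.children {
		w.walk(c)
	}
}

// isAncestor reports whether a is a strict ancestor of b.
func isAncestor(a, b *node) bool {
	for _, c := range a.children {
		if c == b || isAncestor(c, b) {
			return true
		}
	}
	return false
}

// secondToLastIsAncestor checks the assumption under discussion: that the
// second-to-last collected dedupe node is an ancestor of target.
func secondToLastIsAncestor(root, target *node) bool {
	w := &walker{target: target}
	w.walk(root)
	if len(w.snapshot) < 2 {
		return false
	}
	return isAncestor(w.snapshot[len(w.snapshot)-2], target)
}

func dedupe(name string, child *node) *node {
	return &node{kind: "dedupe", name: name, children: []*node{child}}
}

func rate(metric string) *node {
	sel := &node{kind: "selector", name: metric}
	return &node{kind: "func", name: "rate", children: []*node{sel}}
}

// nestedPlan models abs(label_replace(rate(bar[5m]), ...)) as a single chain.
func nestedPlan() (root, target *node) {
	target = &node{kind: "func", name: "label_replace", children: []*node{dedupe("bar_rate", rate("bar"))}}
	abs := &node{kind: "func", name: "abs", children: []*node{dedupe("label_replace", target)}}
	return dedupe("abs", abs), target
}

// binaryPlan models rate(foo[5m]) or label_replace(rate(bar[5m]), ...).
func binaryPlan() (root, target *node) {
	target = &node{kind: "func", name: "label_replace", children: []*node{dedupe("bar_rate", rate("bar"))}}
	lhs := dedupe("foo_rate", rate("foo"))
	rhs := dedupe("label_replace", target)
	return &node{kind: "binary", name: "or", children: []*node{lhs, rhs}}, target
}

func main() {
	root, target := nestedPlan()
	fmt.Println("nested:", secondToLastIsAncestor(root, target)) // nested: true

	root, target = binaryPlan()
	fmt.Println("binary:", secondToLastIsAncestor(root, target)) // binary: false
}
```

In the single-chain plan every collected dedupe node lies on the path to `label_replace`, so the index-based lookup finds an ancestor. In the binary plan the left-hand `foo_rate` dedupe node is already in the list when the right-hand side is visited, so the same index lands on a sibling branch instead.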

@56quarters 56quarters Jan 12, 2026

This results in keeping an extra DeduplicateAndMerge node, which is less efficient than dropping it but doesn't cause incorrect results. I'll handle this in a follow-up PR.

@56quarters 56quarters merged commit 58dfac5 into main Jan 12, 2026
39 checks passed
@56quarters 56quarters deleted the 56quarters/mqe-dedupe branch January 12, 2026 15:14