Skip to content

Conversation

@tcp13equals2
Copy link
Contributor

@tcp13equals2 tcp13equals2 commented Jan 2, 2026

What this PR does

This PR introduces metrics for tracking the effectiveness of the step invariant operator.

When an expression is marked as a step invariant, the float or histogram points are assumed to be the same at each step. As such, the inner data only needs to be queried once and the value re-used in each subsequent step.

This PR provides a metric to track the number of points saved / re-used.

This PR also adds a metric for tracking the total number of step invariant nodes which have been observed. Note that not ever step invariant node will result in savings. This can occur if the inner operator returns no data.

An enhancement has also been made to remove the step invariant operator if the query is an instant query - as this operator does not provide any useful value when there is only a single step of data.

The new metrics are;

  • cortex_mimir_query_engine_step_invariant_nodes_total
  • cortex_mimir_query_engine_step_invariant_points_saved_total - and this will have a type label of fpoint or hpoint

To assist in supporting a metric tracker being passed into the operator, a new MetricsTracker has been added to the query planner which is passed down to operators within the OperatorParameters. This could be re-used to allow other metric trackers to be passed into operators.

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • [ x] Tests updated.
  • Documentation added.
  • [ x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

Note

Introduces metrics and planner changes for step-invariant handling, plus an instant-query fast path.

  • Add planning metrics tracker and counters: cortex_mimir_query_engine_step_invariant_nodes_total and cortex_mimir_query_engine_step_invariant_steps_saved_total
  • Wire metrics into QueryPlanner and increment on step-invariant node creation; pass timeRange through planning and compute subquery child ranges to calculate saved steps
  • Skip adding StepInvariantExpression for instant queries
  • Update tests to reflect metrics, planning changes, and instant vs range behavior; update CHANGELOG.md

Written by Cursor Bugbot for commit 48928e9. This will update automatically on new commits. Configure here.

@tcp13equals2 tcp13equals2 marked this pull request as ready for review January 2, 2026 06:23
@tcp13equals2 tcp13equals2 requested a review from a team as a code owner January 2, 2026 06:23
Copy link
Contributor

@tacole02 tacole02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog LGTM

@charleskorn charleskorn changed the title Adding step invariant metrics and past path for instant queries. Adding step invariant metrics and fast path for instant queries. Jan 5, 2026
cursor[bot]

This comment was marked as outdated.

@cursor
Copy link

cursor bot commented Jan 20, 2026

Bugbot Autofix prepared a fix for the bug found in the latest run.

  • ✅ Fixed: Aggregate parameter uses wrong time range for step invariants
    • Changed AggregateExpr parameter parsing to use parent timeRange instead of inner.ChildrenTimeRange, preserving step invariant behavior.

Create PR

// It increments the nodes counter by 1, and the steps saved counter is incremented by the given stepCount-1.
// If the stepCount is less than or equal to 1 then this function is a no-op.
func (t *StepInvariantExpressionMetricsTracker) OnStepInvariantExpressionAdded(stepCount int) {
if stepCount <= 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this method is called, then doesn't that mean that a step-invariant node has been added and therefore we should always call t.nodes.Inc()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just an extra safety check ... but I will remove since the planner already checks the step count before calling this function.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issues.

@cursor
Copy link

cursor bot commented Jan 20, 2026

Bugbot Autofix prepared a fix for 2 of the 2 bugs found in the latest run.

  • ✅ Fixed: Test will panic: counter Add with negative value
    • Added a guard to no-op when stepCount <= 1, preventing negative counter adds and matching test expectations.
  • ✅ Fixed: Error checked after nil pointer dereference risk
    • Reordered the test to assert no error before accessing planner fields, avoiding nil pointer dereference.

Create PR

@tcp13equals2 tcp13equals2 merged commit 7debf14 into main Jan 20, 2026
39 checks passed
@tcp13equals2 tcp13equals2 deleted the track_step_invariant_usage branch January 20, 2026 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants