Skip to content

Record search metrics on cancelation #5743

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

rdettai
Copy link
Collaborator

@rdettai rdettai commented Apr 10, 2025

Description

This PRs moves [root|leaf]_search_requests_total [root|leaf]_search_request_duration_seconds [root|leaf]_search_targeted_splits into tasks so that they are recorded even on future cancelation. This will help having a more accurate view on queries that timeout.

How was this PR tested?

Describe how you tested this PR.

@guilload
Copy link
Member

Comment on lines -1450 to -1455
let label_values = match leaf_search_response_reresult {
Ok(Ok(_)) => ["success"],
_ => ["error"],
};
SEARCH_METRICS
.leaf_search_targeted_splits
.with_label_values(label_values)
.observe(num_splits as f64);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there any specific reason why this was measured per index here?

@rdettai rdettai force-pushed the cancelation-search-metrics branch 2 times, most recently from 082adcb to 81b6b7f Compare April 14, 2025 12:03
@rdettai
Copy link
Collaborator Author

rdettai commented Apr 14, 2025

Writing a custom future would be much cleaner

Good point, I updated the PR to use a future.

For the root search the state machine is a bit more complicated because the request can also fail/cancel during planning (fetch of the split metadata). I did some refactoring to have a clearer split between the plan and exec phases, and applied a tracking future to both. All in all it's probably not much simpler than my original proposal, but I think it's a bit more readable.

I also refactored a few functions names to make it clearer the granularity at which we are working (multi-index vs single doc mapping)

Comment on lines +1788 to +1790
(RootSearchMetricsStep::Plan, Some(false)) => (0, "plan-error"),
(RootSearchMetricsStep::Plan, None) => (0, "plan-cancelled"),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These extra statuses seem valuable, but we should also try to avoid creating too many series. No strong opinion on whether we should have these or not.

(On that topic, I think root_search_requests_total is actually redundant because root_search_request_duration_seconds is a histogram and that creates a _count series)

@rdettai rdettai requested a review from guilload April 14, 2025 12:24
@rdettai rdettai force-pushed the cancelation-search-metrics branch from 81b6b7f to 86b6611 Compare April 17, 2025 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants