-
Notifications
You must be signed in to change notification settings - Fork 28
feat: Add CostAttributionLabels to checks proto definition #1524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8a788e5
to
8307377
Compare
First step in enabling cost attribution for agents within Synthetic Monitoring. This first step updates the protobuf definition of a check so that the changes can be brought into upstream repositories that will propogate the changes across checks. This does not update any of the code to handle cost attribution labels. That will come in a future PR. - refs grafana/synthetic-monitoring#370
8307377
to
4d36caf
Compare
string job = 11 [(gogoproto.jsontag) = "job"]; // name of job this check belongs to | ||
bool basicMetricsOnly = 12 [(gogoproto.jsontag) = "basicMetricsOnly"]; // include only basic metrics | ||
string alertSensitivity = 13 [(gogoproto.jsontag) = "alertSensitivity"]; | ||
repeated Label costAttributionLabels = 14 [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we considered the pros and cons of having this associated with the tenant
instead? Conceptually I think that would be more accurate. This also leads me to a comment I wanted to make about the process that is triggered when a CAL is set/updated:
Based on some diagrams I've seen, I believe, our initial approach is to, once a CAL is set, update all the checks for that tenant and explicitly set a label for them with value _missing_
in case it's missing, or do nothing if it's already set for the check.
I have some doubts on whether this should work that way. Here's one example where we can find "conflicts":
(let's imagine we have a single CAL)
- Admin sets CAL to
team
- We update all checks to include the
team
label, setting it to_missing_
if it does not exist - Admin updates CAL to
service
- We update all checks to include the
service
label, setting it to_missing_
if it does not exist
In this case, when updating CAL, what do we do with the cases of team
labels that have value _missing_
? Do we plan to update them? Otherwise users will have checks that have new labels set by us which no longer have any function.
This whole "explicit" manipulation of the check labels seems odd to me.
The way I understand CALs and their impact on our checks and check telemetry, is that CALs behave as a "mask".
In the UI we can use the CALs as a mask with the check labels to show users that CAL labels are unset.
In the agent we can use the CALs as a mask with the check labels to know when a check has/has not the CAL labels set, and therefore set their value for telemetry purposes to _missing_
as that's the main use case we are interested so far, the telemetry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I considered having the CAL stored directly on the tenant.
Before diving in to that, let me at least expand upon your questions if we stuck with this approach:
In this case, when updating CAL, what do we do with the cases of team labels that have value missing? Do we plan to update them? Otherwise users will have checks that have new labels set by us which no longer have any function.
That's how we would have handle that scenario, otherwise we end up in a state where data is just not accurate.
Coming back, let me trace through some of the code and share what I see(read, mostly my way of thinking outloud and getting some confirmation). The main reason to have it on the check was to track executions, but I suppose if we're leaning in on using labels directly, we may be able to pivot. One limitation was the data available on a check when telemeter.AddTelemetry is called. The scraper struct only has the probe and check, and the check only has access to the TenantID.
So what I'm not seeing is how to propogate cost attributation label data from the tenant to the checks. There is a channel for tenants that seems to sync some tenant info in checks.handleBatchChanges. Is there a clear path that you see that could allow us to wire the defined cost attribution labels from the tenant to the checks? If so, I can see storing the data at the tenant level being more palatable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One limitation was the data available on a check when telemeter.AddTelemetry is called. The scraper struct only has the probe and check, and the check only has access to the TenantID.
I think we can just add it. For example see the implementation of TenantLimits, which is passed to the scraper here. We could follow a similar pattern for some "CALProvider" that uses the shared tenant's cache in the agent to retrieve the CALs.
I'm closing this out in favor of #1544, which adds the cost attribution label to the tenant. Thanks @ka3de for the nudge on how to propagate the cost attribution labels over to the checks. The main benefit to having this on the tenant vs checks is that when we need to propagate changes from CMAB => Agent, we only need to make one database call to update the tenant. The original design, we would need to update all of the checks for a tenant on each change. |
First step in enabling cost attribution for agents within Synthetic Monitoring. This first step updates the protobuf definition of a check so that the changes can be brought into upstream repositories that will propogate the changes across checks.
This does not update any of the code to handle cost attribution labels. That will come in a future PR.