-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Blazor - rendering metrics #61516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blazor - rendering metrics #61516
Conversation
6f6a498
to
51c936d
Compare
@@ -90,6 +92,10 @@ public Renderer(IServiceProvider serviceProvider, ILoggerFactory loggerFactory, | |||
_logger = loggerFactory.CreateLogger("Microsoft.AspNetCore.Components.RenderTree.Renderer"); | |||
_componentFactory = new ComponentFactory(componentActivator, this); | |||
|
|||
// TODO register RenderingMetrics as singleton in DI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to register RenderingMetrics
as singleton. But I think it should be done in one of the "Extensions" helpers and I'm not sure which. This could be done in next PR when @javiercn is back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one place for server https://github.com/dotnet/aspnetcore/blob/main/src/Components/Endpoints/src/DependencyInjection/RazorComponentsServiceCollectionExtensions.cs#L36 and one for WebAssembly https://github.com/dotnet/aspnetcore/blob/main/src/Components/WebAssembly/WebAssembly/src/Hosting/WebAssemblyHostBuilder.cs#L299
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to be "painful" because it lives in the Microsoft.AspNetCore.Components
. Typically, what we do in these cases is expose a helper method that is then called by the different hosts to register it on DI.
There's no way around not introducing a bit of public API for this. If I were to do something, I would go with creating an additional extension method here (look at AddValueProvider for a sample pattern) and have that called in the places that @maraf pointed out.
Later on, we can decide if we prefer to introduce a single extension method that we can pile up things in the future. The pattern MVC follows has AddMvcCore
for this reason, we could have something like AddComponentsCore
that is meant to be called out by the individual hosts, but for now let's just stick with what we've been doing so far (that would be my recommendation).
@@ -90,6 +92,10 @@ public Renderer(IServiceProvider serviceProvider, ILoggerFactory loggerFactory, | |||
_logger = loggerFactory.CreateLogger("Microsoft.AspNetCore.Components.RenderTree.Renderer"); | |||
_componentFactory = new ComponentFactory(componentActivator, this); | |||
|
|||
// TODO register RenderingMetrics as singleton in DI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one place for server https://github.com/dotnet/aspnetcore/blob/main/src/Components/Endpoints/src/DependencyInjection/RazorComponentsServiceCollectionExtensions.cs#L36 and one for WebAssembly https://github.com/dotnet/aspnetcore/blob/main/src/Components/WebAssembly/WebAssembly/src/Hosting/WebAssemblyHostBuilder.cs#L299
/ba-g CI timeout is #60989 |
var meterFactory = serviceProvider.GetService<IMeterFactory>(); | ||
_renderingMetrics = meterFactory != null ? new RenderingMetrics(meterFactory) : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a case where IMeterFactory
is not ever on DI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In unit tests I think. Also I can imagine that somebody would like to disable it in non-cloud environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the unit tests we would simply update them. For production workloads our hosts should just call AddMetrics
and rely on it being there. I don't think this dependency should be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about WASM, do you think it should always have metrics in DI ? That would make impossible to disable/trim metrics for Blazor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's fine, if we are concerned, we can look at the size before and after the change to understand the delta. In any case we could put it behind and app compat switch so that it gets trimmed by default if not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a fair bit of code involved with metrics. I think you'll want a way to toggle it on and off in WASM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking [FeatureSwitchDefinition("System.Diagnostics.Metrics.Meter.IsSupported")]
var startTime = (_renderingMetrics != null && _renderingMetrics.IsDurationEnabled()) ? Stopwatch.GetTimestamp() : 0; | ||
_renderingMetrics?.RenderStart(componentState.Component.GetType().FullName); | ||
componentState.RenderIntoBatch(_batchBuilder, renderQueueEntry.RenderFragment, out var renderFragmentException); | ||
if (renderFragmentException != null) | ||
{ | ||
// If this returns, the error was handled by an error boundary. Otherwise it throws. | ||
HandleExceptionViaErrorBoundary(renderFragmentException, componentState); | ||
} | ||
_renderingMetrics?.RenderEnd(componentState.Component.GetType().FullName, renderFragmentException, startTime, Stopwatch.GetTimestamp()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several questions on this block:
- Do we understand what this is measuring?
- This is measuring the
Render
method on the component, which just updates the RenderTree, which might not be very interesting. - It might make more sense to measure when a render batch starts and ends, as that more clearly represents the time it took the app to create a single "snapshot" update.
- A more representative thing to measure is
SetComponentParametersAsync
which is where most of the application logic for an app lives.
- This is measuring the
- Do we know the perf cost of this? (Since it happens per render)
- Does it make sense to cache the component
FullName
inside ComponentState
- Does it make sense to cache the component
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* It might make more sense to measure when a render batch starts and ends
I agree that duration of whole batch could be separate metric.
* A more representative thing to measure is `SetComponentParametersAsync` which is where most of the application logic for an app lives.
OK. I will need some help to understand what are all the places in which this is called.
You mean ComponentBase.SetParametersAsync()
, right ?
* Do we know the perf cost of this? (Since it happens per render)
It depends on if there is listener attached or not. When it's not it's negligible.
* Does it make sense to cache the component `FullName` inside ComponentState
Reflection already does some caching.
src/Components/Server/src/DependencyInjection/ComponentServiceCollectionExtensions.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall changes look great. That said, there are a few things that I think we need to revisit:
- Measuring renders:
- We want to track from the start of a render batch to the completion of such render batch.
- We want to track SetParametersAsync on component instances as this is where most of the user code lives, the execution time of
RenderFragment
is "generally" not relevant as all that code does is fill in an array of RenderTreeFrames and for the most part is compiler generated code.- It's valuable though to measure/track the size of the render tree of a given component, as that is a good indicator of components that are rendering a lot of data.
- If we want to track the "impact" of a render, it's best to do so in
RenderTreeDiffBuilder.ComputeDiff
this is the bit of logic that actually does work like instantiate and set parameters on child components and so on.
- Circuit metrics:
- I think we might want to track the reason why a circuit is terminated (gracefully (session ended), timeout (circuit got disconnected and timed out), capacity (there were too many disconnected circuits)).
- Performance considerations:
- In general, we need to be very careful on the things that we do on a per-component basis, as a general principle, minimize allocations and avoid doing work unless needed. This is the "hotest" path for Blazor, so we need to ensure we don't regress perf.
"aspnetcore.components.circuits.duration", | ||
unit: "s", | ||
description: "Duration of circuit.", | ||
advice: new InstrumentAdvice<double> { HistogramBucketBoundaries = MetricsConstants.VeryLongSecondsBucketBoundaries }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is wrong with the existing LongSecondsBucketBoundaries? That's already used for Kestrel connection duration and signalr connection duration.
I don't like having 3 different durations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is used to bucketize the circuit duration, which I think naturally spans longer than the SignalR/Http one. If we were to use those, doesn't that mean that most data will end up in a single bucket? Ideally the buckets should be representative of the "durations" that we expect to track, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The biggest value in th existing buckets is 5 minutes. Do you think most circuits last longer than 5 minutes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. Circuits are associated with the "session" (a browser tab), so it's very feasible that they remain open while the user is looking at the tab. As for how long they can last, it's very app specific, but I think we want to have enough granularity to track whether sessions are very short (1-10 minutes) or last significantly longer (10 minutes, hours if the users leave the browser tab open and don't have energy saving settings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific problem with having a different set of numbers? Is there a recommended number of values that we should strive for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No there isn't a specific problem. Just there is an extra choice for people to choose from, and good values for the buckets need to be decided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you mean people, do you mean us, or do you mean developers consuming the metrics. I assume you mean us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean us.
The comment on the new buckets doesn't seem right. SignalR connection duration goes up to 300 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that Blazor circuit could be alive for days. And since circuit keeps the state of the blazor session it could be memory hungry. So understanding the histogram of how long your users keep the tab open matters for sizing your cluster.
I thought that it could happen that 80% of your circuits live over 5 minutes but with the previous buckets you don't know if that's 6 minutes or 6 days.
The comment on the new buckets doesn't seem right. SignalR connection duration goes up to 300 seconds.
Could you please be more specific? I will fix it on next PR. Thanks.
These metrics need to be documented at https://learn.microsoft.com/en-us/aspnet/core/log-mon/metrics/built-in |
Thank you. I will open another PR to address the feedback. |
Blazor - rendering metrics
new
Microsoft.AspNetCore.Components.Rendering
metercomponent.type
which isGetType().FullName
of the component being renderedaspnetcore.components.rendering.count
aspnetcore.components.rendering.duration
new
Microsoft.AspNetCore.Components.Server.Circuits
meteraspnetcore.components.circuits.count
aspnetcore.components.circuits.active_circuits
aspnetcore.components.circuits.connected_circuits
aspnetcore.components.circuits.duration
How to enable with OpenTelemetry/Aspire
Fixes #53613