Replies: 4 comments
-
Prometheus is what I'm familiar with and what many tools support nowadays, instead of, e.g., … I'm actually considering turning …
-
As for CPU usage: I don't know to what extent GHC keeps per-thread metrics. It would be cool if there are some we can expose (though, as you know, that's not the same as a reduction count). To be investigated. Also, per-process metrics might get expensive: having millions of processes would imply millions of metrics, which would imply millions of (Prometheus) time-series, many of them potentially "short" (when the lifetime of a process is short, e.g., a process handling a single user request), which can get expensive.
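For context on what GHC exposes today: the runtime keeps process-wide (not per-thread) CPU and GC statistics, available via `GHC.Stats` when the program is run with `+RTS -T`. A minimal sketch of reading those:

```haskell
-- Sketch: GHC's RTS exposes runtime-wide CPU metrics via GHC.Stats.
-- These are aggregate (whole-runtime) numbers, not per-thread or
-- per-process-in-the-actor-sense figures.
import GHC.Stats (getRTSStats, getRTSStatsEnabled,
                  RTSStats (mutator_cpu_ns, gc_cpu_ns))

main :: IO ()
main = do
  -- Stats are only collected when the program runs with +RTS -T.
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "run with +RTS -T to enable RTS stats"
    else do
      s <- getRTSStats
      putStrLn ("mutator CPU (ns): " ++ show (mutator_cpu_ns s))
      putStrLn ("GC CPU (ns):      " ++ show (gc_cpu_ns s))
```

Anything finer-grained (per green thread) would need support the RTS doesn't currently provide out of the box.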
-
Yes, that's a good point about the potential scalability issues with per-process metrics. It's important to consider the trade-off between the granularity of metrics and the cost of collecting and storing them. Where per-process metrics are too expensive, we may need to settle for aggregating at a coarser level, such as per-service or per-host, while still keeping enough visibility to identify performance bottlenecks and other issues. It's also worth considering approaches like sampling or probabilistic counting to reduce the number of metrics collected while maintaining a representative view of system behavior.
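The coarser-aggregation idea above can be sketched in a few lines: instead of one time-series per process, processes report against a bounded "service" label, so cardinality is bounded by the number of services rather than the number of (possibly millions of short-lived) processes. Names like `incrRequests` and the metric name are purely illustrative, not part of troupe:

```haskell
-- Hypothetical sketch: bound Prometheus cardinality by aggregating
-- per-process counts under a coarse "service" label.
import qualified Data.Map.Strict as Map
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')

type Service = String

-- One counter per service, not per process.
newtype Metrics = Metrics (IORef (Map.Map Service Int))

newMetrics :: IO Metrics
newMetrics = Metrics <$> newIORef Map.empty

-- A process reports against its service name; its identity is dropped,
-- so short-lived processes don't create short-lived time-series.
incrRequests :: Metrics -> Service -> IO ()
incrRequests (Metrics ref) svc =
  atomicModifyIORef' ref (\m -> (Map.insertWith (+) svc 1 m, ()))

-- Render in a Prometheus-style exposition format (name is illustrative).
render :: Metrics -> IO String
render (Metrics ref) = do
  m <- readIORef ref
  pure $ unlines
    [ "requests_total{service=\"" ++ svc ++ "\"} " ++ show n
    | (svc, n) <- Map.toList m ]

main :: IO ()
main = do
  ms <- newMetrics
  mapM_ (incrRequests ms) ["web", "web", "db"]
  render ms >>= putStr
```

Sampling or probabilistic counting would slot in at `incrRequests` (e.g., only recording a fraction of events and scaling up), trading exactness for even lower cost.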
-
Raised by @TristanCacqueray in https://discourse.haskell.org/t/towards-an-actor-framework-for-haskell/5929/10, based on the content of https://github.com/NicolasT/troupe/wiki/Design-Ideas/.