You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For historical reasons, ridley allows metrics to be sampled independently from when the relevant prometheus endpoint is hit; this decoupling is convenient for certain aspects (like avoid too-frequent sampling of expensive metrics), but it's not what the Prometheus' best practices advocate, e.g.:
Metrics should only be pulled from the application when Prometheus scrapes them, exporters should not perform scrapes based on their own timers. That is, all scrapes should be synchronous.
Another downside is that clients pulling the metrics (like Grafana) wouldn't necessarily have the most up-to-date value for those metrics, if the "internal" sampling time didn't fire yet. Furthermore, Grafana's default scraping interval is 60s, so if anything the default sampling time of 5s that ridley uses is way too aggressive.
Considering all of that, it would be nice for ridley to comply to the Prometheus' guidelines on writing exporters, with a particular attention on scheduling.
This might require some reworking of ridley's internals. It has been a while since Ridley's inception, but essentially ridley uses
the Haskell prometheus library, which exposes a bunch of functions for scraping, most of which allow passing an IO action that will perform the sampling for each new incoming request.
Currently ridley uses serveMetrics and sample as this IO action, which as far as I can see is not not really synchronous: yes, it reads the values for counters and gauges synchronously, but the library expects library users to keep those counters up-to-date "out-of-band", as per README:
module Example where
import Control.Monad.IO.Class (liftIO)
import System.Metrics.Prometheus.Http.Scrape (serveMetricsT)
import System.Metrics.Prometheus.Concurrent.RegistryT
import System.Metrics.Prometheus.Metric.Counter (inc)
import System.Metrics.Prometheus.MetricId
main :: IO ()
main = runRegistryT $ do
-- Labels can be defined as lists or added to an empty label set
connectSuccessGauge <- registerGauge "example_connections" (fromList [("login", "success")])
connectFailureGauge <- registerGauge "example_connections" (addLabel "login" "failure" mempty)
connectCounter <- registerCounter "example_connection_total" mempty
latencyHistogram <- registerHistogram "example_round_trip_latency_ms" mempty [10, 20..100]
liftIO $ inc connectCounter -- increment a counter
-- [...] pass metric handles to the rest of the app
serveMetricsT 8080 ["metrics"] -- http://localhost:8080/metric server
As you can see the advocated way seems to be the one of passing those counter "handlers" around the app, which should take care of updating those, and then prometheus 's sample just takes a snapshot of these values.
I guess that in order to write a compliant exporter we should write our own IO RegistrySample version that calls each updateMetricHandler for a new HTTP request, rather than relying on "just reading the counters and gauges".
How all the HELP story and caching relates to this I'm not sure yet.
The text was updated successfully, but these errors were encountered:
For historical reasons,
ridley
allows metrics to be sampled independently from when the relevantprometheus
endpoint is hit; this decoupling is convenient for certain aspects (like avoid too-frequent sampling of expensive metrics), but it's not what the Prometheus' best practices advocate, e.g.:Another downside is that clients pulling the metrics (like Grafana) wouldn't necessarily have the most up-to-date value for those metrics, if the "internal" sampling time didn't fire yet. Furthermore, Grafana's default scraping interval is 60s, so if anything the default sampling time of 5s that
ridley
uses is way too aggressive.Considering all of that, it would be nice for
ridley
to comply to the Prometheus' guidelines on writing exporters, with a particular attention on scheduling.This might require some reworking of ridley's internals. It has been a while since Ridley's inception, but essentially
ridley
usesthe Haskell prometheus library, which exposes a bunch of functions for scraping, most of which allow passing an
IO
action that will perform the sampling for each new incoming request.Currently
ridley
usesserveMetrics
and sample as thisIO
action, which as far as I can see is not not really synchronous: yes, it reads the values for counters and gauges synchronously, but the library expects library users to keep those counters up-to-date "out-of-band", as perREADME
:As you can see the advocated way seems to be the one of passing those counter "handlers" around the app, which should take care of updating those, and then
prometheus
'ssample
just takes a snapshot of these values.I guess that in order to write a compliant exporter we should write our own
IO RegistrySample
version that calls eachupdateMetricHandler
for a new HTTP request, rather than relying on "just reading the counters and gauges".How all the
HELP
story and caching relates to this I'm not sure yet.The text was updated successfully, but these errors were encountered: