[receiver/prometheusreceiver] Fix counter metrics typed as Gauge with… #45974
base: main
…out metadata (open-telemetry#34263)
Use heuristics to detect counters from metric names when metadata unavailable.
Why is type information not available when using the target allocator? I don't generally like the idea of using heuristics based on naming conventions.
Were you able to root-cause the issue?
@dashpole Thank you for the feedback sir! The Prometheus receiver has |
I believe the metadata is lost somewhere in how Target Allocator distributes targets, but I couldn't pinpoint the exact location. The Prometheus scraper should be extracting # TYPE metadata, but it's not reaching the receiver.
@dashpole sir, would you prefer that I close this PR and open an issue for proper root-cause analysis? I agree heuristics aren't ideal. Can you point me to the right place to debug why Target Allocator doesn't preserve metadata?
The target allocator just tells the Prometheus receiver which endpoints to scrape. It shouldn't impact type metadata. We should be able to reproduce it without the target allocator. I would recommend taking the scrape output from the issue and putting it in a unit test (we have many examples in the receiver). Hopefully that reproduces the issue and gives you a simple unit test reproduction that you can debug from.
…al scrape output
This test uses the exact Prometheus scrape output from issue open-telemetry#34263 to verify that the receiver correctly handles grouped counter metrics. The test confirms:
- Counter metrics with # TYPE declarations work correctly
- The heuristic fallback handles missing metadata gracefully
- No regression in gauge metric handling
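For readers unfamiliar with the `# TYPE` declarations discussed above, the following is a minimal Go sketch of how they appear in scrape output and how a declared type can be read from them. It is an illustration only: the receiver relies on Prometheus's own scrape and parsing code, not on anything like this, and the function name `declaredTypes` and the sample payload are hypothetical (the payload is not the scrape output from the issue).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// declaredTypes scans Prometheus text exposition output and returns the type
// declared for each metric family in "# TYPE <name> <type>" comment lines.
// Hypothetical helper for illustration; not part of the receiver.
func declaredTypes(scrape string) map[string]string {
	types := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(scrape))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// A TYPE line has exactly four tokens: "#", "TYPE", name, type.
		if len(fields) == 4 && fields[0] == "#" && fields[1] == "TYPE" {
			types[fields[2]] = fields[3]
		}
	}
	return types
}

func main() {
	// Made-up payload: one grouped counter (two label sets) and one gauge.
	scrape := `# HELP example_blks_read Number of disk blocks read.
# TYPE example_blks_read counter
example_blks_read{datname="app"} 42
example_blks_read{datname="postgres"} 7
# TYPE example_connections gauge
example_connections 3
`
	types := declaredTypes(scrape)
	fmt.Println(types["example_blks_read"], types["example_connections"])
}
```

If the declared type is present in the scrape output, as it is here, a missing type in the receiver points at how the metadata is looked up rather than at the target itself.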
Summary
Fixes #34263
Counter metrics are now correctly typed as Sum even when metadata is unavailable (e.g., when using Target Allocator).
Problem
When using Target Allocator, counter metrics were incorrectly typed as Gauge instead of Sum whenever the MetricMetadataStore did not contain the type information from `# TYPE` declarations. As a result, grouped counter metrics such as `cnpg_pg_stat_database_blks_read` appeared to be "silently dropped", because they were emitted with the wrong type.
Solution
Implemented fallback heuristics to detect counter metrics from metric names using Prometheus naming conventions when metadata is unavailable:
Explicit counter suffixes:
- `_total` (most reliable)
- `_created`
- `_bytes_total`
- `_packets_total`
Common counter patterns:
- `_read`, `_written`, `_sent`, `_received`, `_processed`, `_completed`
- `_requests`, `_errors`, `_failures`, `_hits`, `_misses`, `_dropped`, `_retries`
False positive protection:
- `_ratio` or `_percent`
- `_count` suffix (could be histogram or poorly named metric)
Changes
- `metadata.go`: Added `isLikelyCounter()` heuristic function
- `metadata_test.go`: Added comprehensive tests for counter detection (20+ patterns)
- `metricfamily.go`: Added diagnostic logging when Unknown types default to Gauge
- `grouped_counter_test.go`: Test reproducing and verifying the fix for Grouped metric type "counter" silently dropped #34263
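The name-based fallback described under Solution could look roughly like the sketch below. This is not the code from `metadata.go`: only the name `isLikelyCounter` and the suffix lists come from the description, while `resolveType`, `hasAnySuffix`, the string type names, and the plain suffix matching are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// counterSuffixes mirrors the suffixes listed in the PR description.
// "_bytes_total" and "_packets_total" are already covered by "_total".
var counterSuffixes = []string{
	"_total", "_created",
	"_read", "_written", "_sent", "_received", "_processed", "_completed",
	"_requests", "_errors", "_failures", "_hits", "_misses", "_dropped", "_retries",
}

// excludedSuffixes mirrors the "false positive protection" list. With pure
// suffix matching it never overlaps counterSuffixes; it is kept explicit so
// the exclusions survive if the matching is ever loosened.
var excludedSuffixes = []string{"_ratio", "_percent", "_count"}

// hasAnySuffix reports whether name ends with any of the given suffixes.
func hasAnySuffix(name string, suffixes []string) bool {
	for _, s := range suffixes {
		if strings.HasSuffix(name, s) {
			return true
		}
	}
	return false
}

// isLikelyCounter guesses from the metric name alone whether it is a counter.
func isLikelyCounter(name string) bool {
	if hasAnySuffix(name, excludedSuffixes) {
		return false
	}
	return hasAnySuffix(name, counterSuffixes)
}

// resolveType (hypothetical) prefers the declared type and consults the
// heuristic only when no type is known, defaulting to gauge otherwise.
func resolveType(declaredType, name string) string {
	if declaredType != "" && declaredType != "unknown" {
		return declaredType
	}
	if isLikelyCounter(name) {
		return "counter"
	}
	return "gauge"
}

func main() {
	for _, name := range []string{
		"cnpg_pg_stat_database_blks_read",
		"http_requests_total",
		"cache_hit_ratio",
		"rpc_duration_seconds_count",
	} {
		fmt.Printf("%s -> %s\n", name, resolveType("", name))
	}
}
```

In this sketch the heuristic is only a fallback: a metric with a declared type keeps that type regardless of its name, which limits the impact of naming-based misclassification to the case where metadata is missing.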