prometheus_remote_write: Fix cutoff logic. #225
We noticed that fluent-bit’s prometheus remote_write output plugin was silently dropping some, but not all, process_exporter metrics after about one hour, while the stdout output plugin still showed the metrics being collected. We were also able to shorten the time after which metrics were dropped by modifying CMT_ENCODE_PROMETHEUS_REMOTE_WRITE_CUTOFF_THRESHOLD, which indicates that the problem lies in the cutoff logic.

This merge request treats CMT_ENCODE_PROMETHEUS_REMOTE_WRITE_CUTOFF_ERROR as success and continues encoding the other metrics, so they no longer get dropped. It might be worth removing this “error” code entirely, since it is not really an error and leads to subtle bugs like this one.

After merging this fix, the bundled copy of cmetrics inside fluent-bit should be updated.
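The behaviour described above can be illustrated with a small, self-contained sketch. It is not the cmetrics code: every name here (sketch_metric, sketch_encode_one, sketch_encode_all, the SKETCH_* codes, and the cutoff_is_fatal flag) is hypothetical and only models the idea that a stale sample should be skipped rather than abort the whole encoding pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not the real cmetrics values. */
#define SKETCH_SUCCESS      0
#define SKETCH_CUTOFF_ERROR 1
#define SKETCH_OTHER_ERROR  2

/* Hypothetical stand-in for a metric sample with a last-update time. */
struct sketch_metric {
    uint64_t timestamp; /* seconds */
    int      broken;    /* non-zero simulates a genuine encoding failure */
};

/* Encode one sample; report a cutoff when it is older than the threshold. */
int sketch_encode_one(const struct sketch_metric *metric,
                      uint64_t now, uint64_t cutoff_threshold)
{
    if (metric->broken) {
        return SKETCH_OTHER_ERROR;
    }
    if (now > cutoff_threshold &&
        metric->timestamp < now - cutoff_threshold) {
        return SKETCH_CUTOFF_ERROR;
    }
    return SKETCH_SUCCESS;
}

/*
 * Encode all samples and store the number actually encoded in *encoded.
 * With cutoff_is_fatal set, the first stale sample aborts the loop and
 * every later sample is lost (the behaviour reported in this PR).
 * With it cleared, stale samples are skipped and the loop continues.
 */
int sketch_encode_all(const struct sketch_metric *metrics, size_t count,
                      uint64_t now, uint64_t cutoff_threshold,
                      int cutoff_is_fatal, size_t *encoded)
{
    size_t index;
    int    result;

    *encoded = 0;
    for (index = 0; index < count; index++) {
        result = sketch_encode_one(&metrics[index], now, cutoff_threshold);
        if (result == SKETCH_CUTOFF_ERROR && !cutoff_is_fatal) {
            continue; /* stale sample: skip it, keep encoding the rest */
        }
        if (result != SKETCH_SUCCESS) {
            return result; /* genuine failure (or fatal cutoff): stop */
        }
        (*encoded)++;
    }
    return SKETCH_SUCCESS;
}
```

With a fresh sample, a stale one, and another fresh one in that order, the fatal variant stops after the first sample and reports the cutoff code, while the non-fatal variant encodes both fresh samples and reports success. Genuine failures still abort in either mode, which is the distinction the description argues the cutoff code blurs.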