Add configurable retries with backoff to Prometheus Remote Write exporter #3985

@towseef41

What problem do you want to solve?

Remote Write pushes are currently fail-fast: a transient 429/5xx response or a network blip drops the metric batch with no retry or backpressure. That causes silent data loss during throttling or short outages.

Describe the solution you'd like

Add a configurable retry policy to the exporter: max attempts, exponential backoff with jitter, and a retryable status list (e.g., 429/408/5xx, plus connection errors and timeouts). Log each retry decision and the final failure. Defaults can be modest (e.g., 3 retries, small backoff), with max_retries=0 disabling retries entirely.
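To make the proposal concrete, here is a rough sketch of what such a policy could look like. It is not the exporter's actual code: `RetryPolicy`, `send_with_retry`, the `send` callable (assumed to return an HTTP status code), and the default values are all hypothetical names and numbers chosen for illustration. The built-in `ConnectionError` and `TimeoutError` stand in for whatever exceptions the real HTTP client raises.

```python
import logging
import random
import time
from dataclasses import dataclass
from typing import Callable, FrozenSet

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical retry settings; names and defaults are illustrative."""
    max_retries: int = 3            # 0 disables retries (single attempt)
    initial_backoff: float = 0.5    # seconds
    max_backoff: float = 5.0        # upper bound on any single delay
    retryable_statuses: FrozenSet[int] = frozenset({408, 429})

    def is_retryable(self, status: int) -> bool:
        # Listed statuses plus any 5xx are treated as transient.
        return status in self.retryable_statuses or 500 <= status <= 599

    def backoff(self, attempt: int) -> float:
        # Exponential backoff with "full jitter": uniform in [0, cap].
        cap = min(self.max_backoff, self.initial_backoff * (2 ** attempt))
        return random.uniform(0, cap)


def send_with_retry(
    send: Callable[[], int],
    policy: RetryPolicy = RetryPolicy(),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it succeeds, fails permanently, or retries run out.

    Returns True on a 2xx status, False otherwise.
    """
    for attempt in range(policy.max_retries + 1):
        try:
            status = send()
        except (ConnectionError, TimeoutError) as exc:
            reason = repr(exc)  # network-level failure: always retryable
        else:
            if 200 <= status < 300:
                return True
            if not policy.is_retryable(status):
                logger.error("Remote write failed with HTTP %d; not retrying", status)
                return False
            reason = f"HTTP {status}"

        if attempt == policy.max_retries:
            break  # no retries left
        delay = policy.backoff(attempt)
        logger.warning("Remote write failed (%s); retry %d/%d in %.2fs",
                       reason, attempt + 1, policy.max_retries, delay)
        sleep(delay)

    logger.error("Remote write giving up after %d attempt(s)", policy.max_retries + 1)
    return False
```

With `RetryPolicy(max_retries=0)` the loop runs exactly once, which keeps today's fail-fast behavior available as an opt-out. The `sleep` parameter is injectable only so the backoff can be tested without real delays.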

Describe alternatives you've considered

Accept drops on error (current behavior): loses data during transient issues.
Rely on external queues/proxies: adds operational overhead and still needs exporter-side retries.
Push retry logic to remote endpoints: not always possible; exporters should be polite and back off themselves.

Additional Context

No response

Would you like to implement a fix?

Yes
