add cnpg alerts #558
Conversation
Walkthrough

A new PrometheusRule configuration has been introduced in the PostgreSQL operator templates, defining a set of alerting rules for monitoring PostgreSQL instances managed by the CNPG operator. This configuration includes eight alerts that address various performance and operational metrics, such as long-running transactions, replication lag, transaction ID age, deadlock conflicts, and archiving failures. Each alert contains detailed annotations, triggering conditions, and severity labels to enhance monitoring capabilities.
Actionable comments posted: 5
🧹 Nitpick comments (3)
packages/system/postgres-operator/templates/prometheusrule.yaml (3)
Lines 9-17: Consider enhancing the LongRunningTransaction alert

A few suggestions to improve this alert:
- The 300s threshold might need adjustment based on your workload patterns.
- Consider adding `instance` and `cluster` labels to improve alert routing and filtering.
- The description could be more actionable by including the query details if available.
```diff
   - alert: LongRunningTransaction
     annotations:
-      description: Pod {{ $labels.pod }} is taking more than 5 minutes (300 seconds) for a query.
+      description: Pod {{ $labels.pod }} in cluster {{ $labels.cluster }} has a transaction running for more than 5 minutes. Query details: {{ $labels.query }}
       summary: A query is taking longer than 5 minutes.
     expr: |-
-      cnpg_backends_max_tx_duration_seconds > 300
+      cnpg_backends_max_tx_duration_seconds{job="cnpg"} > 300
     for: 1m
     labels:
       severity: warning
+      cluster: '{{ $labels.cluster }}'
```
Lines 36-44: Enhance PGReplication alert with more context

Consider adding more context to help with troubleshooting:
- Include the actual lag value in the description
- Add primary server information
- Consider different severity levels based on lag duration
```diff
   - alert: PGReplication
     annotations:
-      description: Standby is lagging behind by over 300 seconds (5 minutes)
+      description: Standby {{ $labels.pod }} is lagging behind primary {{ $labels.primary }} by {{ $value }} seconds
       summary: The standby is lagging behind the primary
     expr: |-
       cnpg_pg_replication_lag > 300
     for: 1m
     labels:
       severity: warning
+      primary: '{{ $labels.primary }}'
```
Lines 63-71: Enhance ReplicaFailingReplication alert with more context

The alert logic is correct, but the description could be more helpful during incident response by including more context about the replication state.
```diff
   - alert: ReplicaFailingReplication
     annotations:
-      description: Replica {{ $labels.pod }} is failing to replicate
+      description: |
+        Replica {{ $labels.pod }} is failing to replicate.
+        Pod is in recovery ({{ $labels.in_recovery }}) but WAL receiver is down ({{ $labels.wal_receiver_status }}).
+        Check the replica's logs for potential issues with WAL receiver.
       summary: Checks if the replica is failing to replicate
     expr: |-
       cnpg_pg_replication_in_recovery > cnpg_pg_replication_is_wal_receiver_up
     for: 1m
     labels:
       severity: warning
+      replica_host: '{{ $labels.pod }}'
```
📒 Files selected for processing (1): packages/system/postgres-operator/templates/prometheusrule.yaml (1 hunks)
🔇 Additional comments (1)
packages/system/postgres-operator/templates/prometheusrule.yaml (1)
Lines 1-71: Consider overall monitoring strategy improvements
- Validate metric availability.
- Consider implementing different severity levels:
  - Critical: Data loss risks (XID age)
  - Warning: Performance issues (long transactions)
  - Info: Early warnings (replication lag starting)
- Add runbooks or troubleshooting guides in the alert descriptions (see the sketch below).
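To make the severity-tier and runbook suggestions concrete, here is a minimal sketch; the thresholds, the alert names, and the `runbook_url` value are illustrative assumptions, not part of this PR:

```yaml
# Hypothetical example: one condition expressed at two severity tiers,
# each carrying a runbook link. Thresholds and the URL are placeholders.
- alert: PGReplicationLagWarning
  expr: cnpg_pg_replication_lag > 150
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Replication lag is building up
    runbook_url: https://example.com/runbooks/pg-replication-lag
- alert: PGReplicationLagCritical
  expr: cnpg_pg_replication_lag > 300
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Replication lag has exceeded 5 minutes
    runbook_url: https://example.com/runbooks/pg-replication-lag
```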
✅ Verification successful
Let me search for any existing documentation or metric definitions to validate these metrics further.
CloudNativePG metrics are correctly configured and aligned with the operator's monitoring
Based on the verification results, I can confirm that:
- All metrics used in the PrometheusRule are standard CloudNativePG metrics, defined in the operator's default monitoring queries configuration.
- The metrics are properly aligned with the operator's monitoring capabilities:
- `cnpg_backends_max_tx_duration_seconds` comes from the `backends` query
- `cnpg_backends_waiting_total` comes from the `backends_waiting` query
- `cnpg_pg_database_xid_age` comes from the `pg_database` query
- `cnpg_pg_replication_lag` comes from the `pg_replication` query
- `cnpg_pg_stat_archiver_*` metrics come from the `pg_stat_archiver` query
- `cnpg_pg_stat_database_deadlocks` comes from the `pg_stat_database` query
- `cnpg_pg_replication_in_recovery` and `cnpg_pg_replication_is_wal_receiver_up` come from the `pg_replication` query

The alert thresholds and severity levels are appropriate for their respective scenarios, with warning levels set for conditions that require attention but aren't immediately critical.
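For readers wondering where these metric names come from: CloudNativePG builds metric names as `cnpg_<query_name>_<column>` from its monitoring query definitions. The snippet below is only a sketch of that query format, using the structure the operator accepts for user-supplied queries; the ConfigMap name, the SQL, and the description are illustrative, not copied from the operator's built-in defaults.

```yaml
# Illustrative only: a monitoring query in the CloudNativePG custom-queries format.
# A query named "pg_replication" exposing a "lag" column yields cnpg_pg_replication_lag.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cnpg-custom-monitoring   # hypothetical name
data:
  custom-queries: |
    pg_replication:
      query: |
        SELECT CASE
          WHEN NOT pg_is_in_recovery() THEN 0
          ELSE GREATEST(0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
        END AS lag
      metrics:
        - lag:
            usage: "GAUGE"
            description: "Replication lag behind primary in seconds"
```

A Cluster would point at such a ConfigMap via `spec.monitoring.customQueriesConfigMap`; the default queries behind the alerts reviewed here ship with the operator, so no extra configuration is needed.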
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check if these metrics exist in your Prometheus
metrics=(
  "cnpg_backends_max_tx_duration_seconds"
  "cnpg_backends_waiting_total"
  "cnpg_pg_database_xid_age"
  "cnpg_pg_replication_lag"
  "cnpg_pg_stat_archiver_last_failed_time"
  "cnpg_pg_stat_database_deadlocks"
  "cnpg_pg_replication_in_recovery"
  "cnpg_pg_replication_is_wal_receiver_up"
)

for metric in "${metrics[@]}"; do
  echo "Checking $metric..."
  curl -s "http://prometheus:9090/api/v1/query?query=$metric" 2>/dev/null || \
    rg "$metric" -g '*.yaml' -g '*.yml'
done
```
Length of output: 3480
Script:
```bash
#!/bin/bash
# Look for CloudNativePG documentation or metric definitions
fd . -e md -e yaml -e yml | xargs rg -l "cnpg_" | xargs rg "metrics|monitoring|prometheus"

# Also check if there's any operator CRD that might define these metrics
fd . -e yaml -e yml | xargs rg -l "cloudnative-pg.io" | xargs cat
```
Length of output: 28088
```yaml
- alert: PGDatabaseXidAge
  annotations:
    description: Over 150,000,000 transactions from frozen xid on pod {{ $labels.pod }}
    summary: Number of transactions from the frozen XID to the current one
  expr: |-
    cnpg_pg_database_xid_age > 300000000
  for: 1m
  labels:
    severity: warning
```
Fix threshold inconsistency in PGDatabaseXidAge alert
The description mentions 150,000,000 transactions but the expression uses 300,000,000. This inconsistency could lead to confusion during incident response.
```diff
   - alert: PGDatabaseXidAge
     annotations:
-      description: Over 150,000,000 transactions from frozen xid on pod {{ $labels.pod }}
+      description: Over 300,000,000 transactions from frozen xid on pod {{ $labels.pod }}
       summary: Number of transactions from the frozen XID to the current one
     expr: |-
       cnpg_pg_database_xid_age > 300000000
```
Also consider adding a second alert with a lower threshold as an early warning:
```yaml
- alert: PGDatabaseXidAgeWarning
  annotations:
    description: Over 150,000,000 transactions from frozen xid on pod {{ $labels.pod }}
    summary: Transaction ID age is approaching critical levels
  expr: |-
    cnpg_pg_database_xid_age > 150000000
  for: 1m
  labels:
    severity: info
```
```yaml
- alert: BackendsWaiting
  annotations:
    description: Pod {{ $labels.pod }} has been waiting for longer than 5 minutes
    summary: If a backend is waiting for longer than 5 minutes
  expr: |-
    cnpg_backends_waiting_total > 300
  for: 1m
  labels:
    severity: warning
```
Fix critical issues in BackendsWaiting alert
There are several issues with this alert:
- The metric `cnpg_backends_waiting_total` appears to be a counter (based on the `_total` suffix), but it's being compared directly to a threshold. Counters should typically be evaluated using rate().
- The expression doesn't match the description: it's comparing a total count to 300, not a duration.
```diff
   - alert: BackendsWaiting
     annotations:
       description: Pod {{ $labels.pod }} has been waiting for longer than 5 minutes
       summary: If a backend is waiting for longer than 5 minutes
     expr: |-
-      cnpg_backends_waiting_total > 300
+      rate(cnpg_backends_waiting_total[5m]) > 0
     for: 1m
     labels:
       severity: warning
```
```yaml
- alert: DatabaseDeadlockConflicts
  annotations:
    description: There are over 10 deadlock conflicts in {{ $labels.pod }}
    summary: Checks the number of database conflicts
  expr: |-
    cnpg_pg_stat_database_deadlocks > 10
  for: 1m
  labels:
    severity: warning
```
Fix counter usage in DatabaseDeadlockConflicts alert
The metric appears to be a counter but is being compared directly to a threshold. This should be using rate() to measure the increase over time.
```diff
   - alert: DatabaseDeadlockConflicts
     annotations:
-      description: There are over 10 deadlock conflicts in {{ $labels.pod }}
+      description: High rate of deadlock conflicts (>10/5m) detected in {{ $labels.pod }}
       summary: Checks the number of database conflicts
     expr: |-
-      cnpg_pg_stat_database_deadlocks > 10
+      rate(cnpg_pg_stat_database_deadlocks[5m]) * 300 > 10
     for: 1m
     labels:
       severity: warning
```
```yaml
- alert: LastFailedArchiveTime
  annotations:
    description: Archiving failed for {{ $labels.pod }}
    summary: Checks the last time archiving failed. Will be < 0 when it has not failed.
  expr: |-
    (cnpg_pg_stat_archiver_last_failed_time - cnpg_pg_stat_archiver_last_archived_time) > 1
  for: 1m
  labels:
    severity: warning
```
Revise LastFailedArchiveTime alert logic
The current implementation has several issues:
- The summary mentions "Will be < 0 when it has not failed" but the expression checks for "> 1"
- Comparing timestamps directly might lead to false positives
- The metric names suggest these are Unix timestamps, which should be handled differently
```diff
   - alert: LastFailedArchiveTime
     annotations:
       description: Archiving failed for {{ $labels.pod }}
-      summary: Checks the last time archiving failed. Will be < 0 when it has not failed.
+      summary: Archiving has failed recently
     expr: |-
-      (cnpg_pg_stat_archiver_last_failed_time - cnpg_pg_stat_archiver_last_archived_time) > 1
+      (time() - cnpg_pg_stat_archiver_last_failed_time) < 300
+      and
+      cnpg_pg_stat_archiver_last_failed_time > cnpg_pg_stat_archiver_last_archived_time
     for: 1m
     labels:
       severity: warning
```
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-default-alerts
spec:
```
💡 Codebase verification
Resource name should use the `cloudnative-pg.fullname` template

The PrometheusRule name should follow the Helm chart's naming convention by using the `cloudnative-pg.fullname` template helper, which is available in the chart's `_helpers.tpl`. This ensures consistent naming across all resources and allows for proper name overrides through Helm values.
Change the name from:

```yaml
name: cnpg-default-alerts
```

to:

```yaml
name: {{ include "cloudnative-pg.fullname" . }}-alerts
```

This follows the pattern seen in other charts and matches the naming template defined in the chart's helpers.
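For reference, the standard `helm create` scaffold defines a fullname helper roughly as follows; this is a generic sketch, not the actual contents of this chart's `_helpers.tpl`, which may differ in detail:

```yaml
{{/* Generic fullname helper from the standard Helm scaffold, shown for illustration only. */}}
{{- define "cloudnative-pg.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
```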
🔗 Analysis chain
Verify monitoring resource naming convention
Let's verify if the resource name follows the established pattern for monitoring resources in the cluster.
Resource name should use Helm templating for consistency

The resource name `cnpg-default-alerts` is hardcoded while other PrometheusRules in the codebase consistently use Helm templating functions for naming. Let's verify if there's a template function available for this chart.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check naming patterns of existing PrometheusRules
kubectl get prometheusrules -A -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' 2>/dev/null || \
  rg -g '*.yaml' -g '*.yml' 'kind:\s*PrometheusRule' -A 2
```
Length of output: 2480
Script:
```bash
#!/bin/bash
# Check if there's a template helper function defined
rg -g '*.yaml' -g '*.tpl' '_helpers' packages/system/postgres-operator/templates/
```
Length of output: 81
Script:
```bash
#!/bin/bash
# Look for _helpers.tpl in the chart directory and its parent directories
fd -g '_helpers.tpl' packages/system/postgres-operator

# Also check Chart.yaml to get the chart name
fd -g 'Chart.yaml' packages/system/postgres-operator -x cat {}
```
Length of output: 963
Script:
```bash
#!/bin/bash
# Check the content of _helpers.tpl to see available naming templates
cat packages/system/postgres-operator/charts/cloudnative-pg/templates/_helpers.tpl

# Also check if there are any other PrometheusRules in the same chart for naming pattern
rg -g '*.yaml' 'kind:\s*PrometheusRule' packages/system/postgres-operator/charts/cloudnative-pg/templates/
```
Length of output: 2045
983d5ef to 2190502 (Compare)
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/system/postgres-operator/templates/prometheusrule.yaml (1)
Lines 6-8: Consider restructuring alert groups and adding multi-level alerts

Consider the following improvements to the alert structure:
- Split alerts into logical groups (e.g., replication, performance, maintenance)
- Add warning levels for critical alerts (e.g., replication lag > 150s as warning, >300s as critical)
- Define alert inhibition rules to prevent alert storms (see the sketch below)
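To illustrate the inhibition-rule suggestion above, here is a minimal Alertmanager sketch; muting warning alerts while a matching critical alert fires for the same pod is an assumed policy, not configuration that exists in this chart:

```yaml
# Hypothetical Alertmanager configuration fragment.
inhibit_rules:
  - source_matchers:
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal:
      - alertname
      - pod
```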
📒 Files selected for processing (1): packages/system/postgres-operator/templates/prometheusrule.yaml (1 hunks)
🔇 Additional comments (6)
packages/system/postgres-operator/templates/prometheusrule.yaml (6)
- Line 4: Resource name should use the `cloudnative-pg.fullname` template. The resource name should follow the Helm chart's naming convention.
- Lines 20-29: Fix critical issues in BackendsWaiting alert. The metric appears to be a counter but is being compared directly to a threshold.
- Lines 31-40: Fix threshold inconsistency in PGDatabaseXidAge alert. Description mentions a different threshold than the expression.
- Lines 53-62: Revise LastFailedArchiveTime alert logic. The timestamp comparison logic needs revision.
- Lines 64-73: LGTM! Good use of increase() function. The implementation correctly uses `increase()` for the counter metric over a 10-minute window.
- Lines 75-84: LGTM! Well-structured replica monitoring. The alert correctly combines recovery status and WAL receiver status to detect replication issues. The critical severity is appropriate for this type of failure.
```yaml
- alert: LongRunningTransaction
  annotations:
    description: >-
      Pod {{ $labels.pod }} has a transaction running longer than 5 minutes (300 seconds).
      This could indicate a potential lock issue or unoptimized query execution.
    summary: Long-running transaction detected.
  expr: rate(cnpg_backends_max_tx_duration_seconds[5m]) > 300
  for: 1m
  labels:
    severity: warning
```
Fix metric usage in LongRunningTransaction alert

The current expression uses `rate()` on what appears to be a gauge metric measuring duration. This is incorrect as `rate()` should only be used with counter metrics.

Apply this fix:
```diff
-  expr: rate(cnpg_backends_max_tx_duration_seconds[5m]) > 300
+  expr: max(cnpg_backends_max_tx_duration_seconds) > 300
```
```yaml
- alert: PGReplication
  annotations:
    description: >-
      Standby on pod {{ $labels.pod }} is lagging behind the primary by more than 5 minutes (300 seconds).
      This can lead to outdated data on replicas.
    summary: Replication lag detected.
  expr: rate(cnpg_pg_replication_lag[5m]) > 300
  for: 1m
  labels:
    severity: critical
```
Fix metric usage in PGReplication alert

The current expression uses `rate()` on what appears to be a gauge metric measuring replication lag. This is incorrect as `rate()` should only be used with counter metrics.

Apply this fix:
```diff
-  expr: rate(cnpg_pg_replication_lag[5m]) > 300
+  expr: max(cnpg_pg_replication_lag) > 300
```
Apart from what Coderabbit said, the rest seems okay to me
fix regression introduced in #558

Summary by CodeRabbit
- **Refactor**
  - Updated label formatting in PostgreSQL operator default alerts configuration
  - Enhanced alert template generation to dynamically include multiple alert configurations from separate files

Signed-off-by: Andrei Kvapil <[email protected]>
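The "include multiple alert configurations from separate files" change could plausibly be done with Helm's file helpers; the sketch below is only a guess at the general shape (the `alerts/` directory name is an assumption), not the actual template from that follow-up commit:

```yaml
# Hypothetical template fragment: inline every alert snippet found under alerts/ in the chart.
spec:
  groups:
    - name: cnpg-default-alerts
      rules:
        {{- range $path, $_ := .Files.Glob "alerts/*.yaml" }}
        {{- $.Files.Get $path | nindent 8 }}
        {{- end }}
```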
Summary by CodeRabbit
- A `PrometheusRule` named `cnpg-default-alerts`.
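For orientation, the rules discussed in this review sit inside the usual PrometheusRule group structure. A minimal sketch of the overall file shape follows; the group name and the `release` label are assumptions that depend on how the Prometheus instance selects rules, and only one of the eight alerts is shown:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-default-alerts
  labels:
    release: prometheus   # assumption: label the Prometheus ruleSelector matches, if any
spec:
  groups:
    - name: cnpg-default-alerts   # illustrative group name
      rules:
        - alert: LongRunningTransaction
          annotations:
            description: Pod {{ $labels.pod }} is taking more than 5 minutes (300 seconds) for a query.
            summary: A query is taking longer than 5 minutes.
          expr: |-
            cnpg_backends_max_tx_duration_seconds > 300
          for: 1m
          labels:
            severity: warning
```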