feat(pgbouncer): trigger reconnection on PostgreSQL primary failover #27
This patch implements a mechanism to detect PostgreSQL primary changes.
When a change is detected, we delete the PgBouncer pods to force a connection pool refresh,
so that stale connections stop routing traffic to the demoted former primary after a failover.
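The detect-then-delete flow described above can be sketched roughly as follows. This is an illustration only, not the code in this patch: `PrimaryWatcher`, `observe`, and the `delete_pod` callback are hypothetical names, and the callback stands in for whatever Kubernetes client call actually deletes a pod.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PrimaryWatcher:
    """Hypothetical helper: remembers the last observed primary and reacts to changes."""

    # Stand-in for a real Kubernetes "delete pod" call (assumption, injected by the caller).
    delete_pod: Callable[[str], None]
    last_primary: Optional[str] = None

    def observe(self, current_primary: str, pgbouncer_pods: List[str]) -> bool:
        """Record the current primary; delete the pooler pods if it changed.

        Returns True when a primary change was detected and deletions were triggered.
        """
        # The first observation only sets the baseline; there is nothing to refresh yet.
        if self.last_primary is None:
            self.last_primary = current_primary
            return False
        # Same primary as before: pooled connections are still valid.
        if current_primary == self.last_primary:
            return False
        # Primary changed: delete every PgBouncer pod so it restarts with fresh connections.
        for pod in pgbouncer_pods:
            self.delete_pod(pod)
        self.last_primary = current_primary
        return True


# Usage: collect "deleted" pod names in a list instead of calling a real cluster.
deleted: List[str] = []
watcher = PrimaryWatcher(delete_pod=deleted.append)
watcher.observe("pg-1", ["pgbouncer-0", "pgbouncer-1"])  # baseline, no deletions
watcher.observe("pg-2", ["pgbouncer-0", "pgbouncer-1"])  # primary changed
print(deleted)  # ['pgbouncer-0', 'pgbouncer-1']
```

Injecting the deletion as a callback keeps the change-detection logic testable without a cluster.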
How it works: When a PgBouncer pod is deleted, it receives a SIGTERM,
which puts PgBouncer into SHUTDOWN WAIT_FOR_CLIENTS mode. In this mode PgBouncer
waits for clients to disconnect; if any are still connected when the pod's
termination grace period expires, K8s sends a SIGKILL. When PgBouncer restarts,
it performs a fresh DNS lookup and connects to the correct primary.
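The termination timeline above can be modelled with a small pure function. This is a simplified sketch for reasoning about the outcome, not PgBouncer or Kubernetes code: `termination_outcome` and its parameters are hypothetical, and a disconnect time of `None` represents a persistent client that never disconnects.

```python
from typing import Optional, Sequence, Tuple


def termination_outcome(
    grace_period_s: float,
    client_disconnect_times_s: Sequence[Optional[float]],
) -> Tuple[str, float]:
    """Model how a PgBouncer pod ends after SIGTERM (hypothetical helper).

    Each entry is the time, in seconds after SIGTERM, at which a client
    disconnects, or None if the client never disconnects.
    Returns (outcome, seconds after SIGTERM at which the process ends).
    """
    # No clients connected: nothing to wait for, so PgBouncer exits right away.
    if not client_disconnect_times_s:
        return ("clean_exit", 0.0)
    # A persistent client never leaves, so the grace period always runs out.
    if any(t is None for t in client_disconnect_times_s):
        return ("sigkill", grace_period_s)
    last_disconnect = max(t for t in client_disconnect_times_s if t is not None)
    # All clients left before the deadline: PgBouncer exits on its own.
    if last_disconnect < grace_period_s:
        return ("clean_exit", last_disconnect)
    # Otherwise the process is killed when the grace period expires.
    return ("sigkill", grace_period_s)


print(termination_outcome(30.0, [5.0, 12.0]))  # ('clean_exit', 12.0)
print(termination_outcome(30.0, [5.0, None]))  # ('sigkill', 30.0)
```

In both cases the pod ends no later than the grace period, so the restart happens even when some clients never disconnect.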
Note 1: This approach is more effective than the RECONNECT command for session
mode with persistent clients (MPG clusters). In session mode, RECONNECT closes a
server connection only after its client disconnects, which never happens for
persistent clients. SIGTERM guarantees termination and restart once the grace
period expires.
Note 2: For planned switchovers we could consider a PAUSE + RESUME operation,
but it would only work if we know a switchover is going to happen. This patch's
strategy works for both switchovers and failovers.
This strategy was suggested by a PgBouncer maintainer for handling failovers in
Kubernetes: pgbouncer/pgbouncer#1361.
Added a new e2e test.
Signed-off-by: Juliana Oliveira [email protected]