Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: unhandled connection errors #632

Merged
merged 1 commit into from
Jan 9, 2024
Merged

fix: unhandled connection errors #632

merged 1 commit into from
Jan 9, 2024

Conversation

Quentin-M
Copy link
Contributor

@Quentin-M Quentin-M commented Dec 4, 2023

pg-pool requires attaching an error handler to each connection acquired from the pool, which will handle connection-level errors. Without it, the process ends up crashing whenever the connection is interrupted (e.g. database failover).

This fact can be observed in pg-pool:

  • newClient 1 and _release (back to the pool) 2 attaches an "idleListener" error handler which upon encountering a connection error, destroys the connection / removes it from the pool, and re-emits 3
  • _acquireClient removes the "idleListener", leaving the returned connection without an error handler 4

This PR is in response to production outages we have faced during Aurora PostgreSQL failovers, with the unexpected disconnections for actively-used connections turning error emits into unhandled exceptions and ultimately crashing the entire NodeJS process, despite our existing error handlers on both .on('error' and .adapter.pg.on('error', and our catch-all try/catches.. With this patch in place, the unexpected disconnects (e.g. which could be simulated with a tcpkill -9) no longer crash the processes, proper errors are yielded by the queries - which can then be captured as usual, and the application can eventually recover once the database is available once again.

See brianc/node-postgres#1324 (and more specifically the last few comments there).

Checklist

  • DCO (Developer Certificate of Origin) signed in all commits
  • Tested as fixing our production outages
  • npm test passes on your machine
  • New tests added or existing tests modified to cover all changes
  • Code conforms with the style guide
  • Commit messages are following our guidelines

Copy link
Member

@dhmlau dhmlau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Quentin-M, thanks for your contribution. Your changes LGTM.

@Quentin-M
Copy link
Contributor Author

Quentin-M commented Dec 6, 2023

Thank you!

Let me know if you want me to spend the time to fix the commitlint...

✖   body must have leading blank line [body-leading-blank]
✖   subject may not be empty [subject-empty]
✖   type may not be empty [type-empty]

@dhmlau
Copy link
Member

dhmlau commented Dec 7, 2023

@Quentin-M, yes please!

…crash

pg-pool requires attaching an error handler to each item checked out off the pool, which will
handle connection-level errors. Without it, the application straight-up crashes whenever the
connection is interrupted (e.g. database failover).

See brianc/node-postgres#1324

Signed-off-by: Quentin Machu <[email protected]>
@dhmlau dhmlau merged commit 86ebae8 into loopbackio:master Jan 9, 2024
5 checks passed
@dhmlau
Copy link
Member

dhmlau commented Jan 9, 2024

@Quentin-M, thanks again for your contribution. Your PR has landed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants