add: use SO_REUSEPORT on platform supporting it by mkleczek · Pull Request #4703 · PostgREST/postgrest

mkleczek · 2026-03-10T14:21:34Z

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

Stacked on top of #4702: starting a new instance is not enough; the old instance must also not fail its in-flight requests.

steve-chavez · 2026-03-10T16:27:13Z

This is failing all tests. I'd suggest opening these types of PRs as drafts.

mkleczek · 2026-03-10T16:29:32Z

> This is failing all tests. I'd suggest opening these types of PRs as drafts.

Hmm... worked before latest push. Will switch to draft and fix.

mkleczek · 2026-03-10T16:53:19Z

> > This is failing all tests. I'd suggest opening these types of PRs as drafts.
>
> Hmm... worked before latest push. Will switch to draft and fix.

@steve-chavez - it looks like the issue is that on my machine PostgREST starts fast enough that it loads the schema cache before it accepts any requests. Here in CI, the new instance fails with 503 because it has not loaded the schema cache yet.

The question: is there any particular reason why we return 503 instead of simply not listening on the socket until the schema cache is loaded? The way we have it right now means we can't really support zero-downtime upgrades, because after we start the new instance but before it loads the schema cache, some clients will get 503.

steve-chavez · 2026-03-10T20:41:52Z

> The question: is there any particular reason why we return 503 instead of simply not listening on the socket until the schema cache is loaded?

We were aiming to have requests wait instead of getting a 503. This waiting does happen during schema cache reload but not on startup; we discussed this in #4129. Would it be better to not listen on the socket? How would clients behave in this case?

develop7 · 2026-03-10T21:36:38Z

They would get a "connection refused" error, which means nobody is there and is IMO more confusing than any 5xx error. UX-wise, I would prefer some waiting to a presumably hard fail any day.

mkleczek · 2026-03-11T13:28:11Z

> We were aiming to have requests wait instead of getting a 503. This waiting does happen during schema cache reload but not on startup; we discussed this in #4129. Would it be better to not listen on the socket? How would clients behave in this case?

> They would get a "connection refused" error, which means nobody is there and is IMO more confusing than any 5xx error. UX-wise, I would prefer some waiting to a presumably hard fail any day.

I am not convinced, see below.

This is a complex topic so let's dig into it a little more. The startup sequence right now is:

1. PostgREST is not running: clients get connection refused.
2. PostgREST starts listening on a socket: clients get 503.
3. PostgREST has loaded the schema cache: normal traffic.

The alternatives are:

a - blocking during schema cache loading

1. PostgREST is not running: clients get connection refused.
2. PostgREST starts listening on a socket: clients are blocked and may time out with a network error.
3. PostgREST has loaded the schema cache: normal traffic.

b - listening on a socket only after the schema cache is loaded

1. PostgREST is not running: clients get connection refused.
2. PostgREST has loaded the schema cache and starts listening: normal traffic.

So from the point of view of the clients (they don't know when Postgrest was started), we have 3 alternatives:

1. connection refused -> 503 -> normal
2. connection refused -> blocked/timed out -> normal
3. connection refused -> normal

I am not sure what value clients get from the first two options compared to the third one. Diagnostics and readiness checks should be done using the admin server anyway.

In case of SO_REUSEPORT the situation is even worse if we start listening early. We have the following situation: instance 1 is running, instance 2 is started. From the point of view of clients:

- Today: normal traffic -> some clients get 503 (both instances are listening, only one is ready) -> normal traffic
- Blocking: normal traffic -> some clients are blocked or time out (both instances are listening, one is blocking) -> normal traffic
- Not listening: just normal traffic (there are no disruptions at all, because all requests are handled by instance 1 until instance 2 is ready and starts listening)

So the first two options cause disruptions whereas the third one is fully zero-downtime and transparent to the clients.
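To make the "not listening" case concrete, here is a minimal Python sketch, not PostgREST code. It assumes a platform where `socket.SO_REUSEPORT` is available (for example Linux), and the helper names `bind_reuseport`, `count_accepted` and `simulate_handover` are made up for illustration. A second socket is bound to the same port but never calls `listen()`, so every connection should still land on the first one:

```python
import socket


def bind_reuseport(host: str, port: int) -> socket.socket:
    """Bind a TCP socket with SO_REUSEPORT set, without listening yet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


def count_accepted(listener: socket.socket, expected: int) -> int:
    """Accept up to `expected` pending connections; return how many arrived."""
    listener.settimeout(1.0)
    accepted = 0
    try:
        for _ in range(expected):
            conn, _addr = listener.accept()
            conn.close()
            accepted += 1
    except socket.timeout:
        pass  # fewer connections than expected reached this listener
    return accepted


def simulate_handover(requests: int = 10) -> int:
    """Return how many of `requests` connections the old instance received."""
    old = bind_reuseport("127.0.0.1", 0)  # "instance 1": bound and listening
    old.listen(64)
    port = old.getsockname()[1]
    new = bind_reuseport("127.0.0.1", port)  # "instance 2": bound, NOT listening
    clients = [
        socket.create_connection(("127.0.0.1", port), timeout=1.0)
        for _ in range(requests)
    ]
    served_by_old = count_accepted(old, requests)
    for client in clients:
        client.close()
    new.close()
    old.close()
    return served_by_old
```

Once the second socket calls `listen()`, the operating system starts distributing new connections between both sockets, which is the point at which the old instance can stop accepting.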

My take on it would be:

1. Start the admin server as early as possible.
2. Load the schema cache.
3. Start listening on the main socket.

This would require splitting binding from listening on the main socket (i.e. we need to bind without listening first so that we can pass the socket to the admin server).
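The proposed ordering can be sketched in a few lines of Python. This is only an illustration of bind-then-listen with plain sockets, not the actual PostgREST (Haskell/Warp) implementation; `staged_startup`, `try_connect` and the `load_schema_cache` callback are hypothetical names:

```python
import socket
from typing import Callable, Tuple


def try_connect(port: int) -> bool:
    """Return True if a TCP connection to 127.0.0.1:port succeeds."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1.0):
            return True
    except ConnectionRefusedError:
        return False


def staged_startup(load_schema_cache: Callable[[], None]) -> Tuple[bool, bool]:
    """Bind first, load the cache, then listen.

    Returns whether clients could connect before and after listen().
    """
    main = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Bind only. At this point the socket (and its port) could already be
    # handed to an admin server for diagnostics.
    main.bind(("127.0.0.1", 0))
    port = main.getsockname()[1]
    reachable_before = try_connect(port)  # bound but not listening

    load_schema_cache()  # stand-in for the real schema cache load

    main.listen(16)  # only now does the main socket accept clients
    reachable_after = try_connect(port)
    main.close()
    return reachable_before, reachable_after
```

With this ordering clients only ever see "connection refused" or normal traffic on the main port; there is no window in which a connection is accepted but cannot be served.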

@steve-chavez @develop7 thoughts?

mkleczek · 2026-03-11T13:51:21Z

Depends on resolving (or working around) yesodweb/wai#853.

steve-chavez · 2026-03-11T19:04:24Z

> So the first two options cause disruptions whereas the third one is fully zero-downtime and transparent to the clients.
>
> My take on it would be:

Agree, sounds much better.

wolfgangwalther · 2026-05-03T11:44:57Z

Conflicted in the changelog.

steve-chavez · 2026-05-04T16:21:02Z

+
+Zero-Downtime Upgrades
+======================
+


Suggested change

:author: `mkleczek <https://github.com/mkleczek>`_

We've been doing this for almost all how-tos:

https://docs.postgrest.org/en/v14/how-tos/sql-user-management-using-postgres-users-and-passwords.html

https://docs.postgrest.org/en/v14/how-tos/create-soap-endpoint.html

..

steve-chavez · 2026-05-04T16:24:25Z


  The TCP port to bind the web server. Use ``0`` to automatically assign a port.

+  On operating systems that support ``SO_REUSEPORT``, you can start multiple


Let's put a heading and anchor here so we can link it from other places

Suggested change

.. _reuseport:

SO_REUSEPORT
~~~~~~~~~~~~

On operating systems that support ``SO_REUSEPORT``, you can start multiple

steve-chavez · 2026-06-24T21:30:58Z

    host=None,
    wait_for=Admin.ready,
-    wait_max_seconds=1,
+    wait_max_seconds=3,


Why did we increase the default? Could this be done for particular tests instead?

It was somewhat flaky on my machine. Can roll it back if needed.

Yes, let's try that

steve-chavez · 2026-06-24T21:51:02Z

+  When running multiple PostgREST instances on the same :ref:`server-port`, use
+  a different ``admin-server-port`` for each instance. Admin ports are not shared
+  between instances, so readiness checks always target one specific PostgREST
+  instance.
+


I'd suggest putting all these paragraphs under the reuseport section, otherwise it's kinda hard to hunt them down.

Added this paragraph on my suggestion: https://github.com/PostgREST/postgrest/pull/4703/changes#r3470538404

Can be deleted from here if you agree
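Because each instance has its own admin port, a deployment script can wait for the replacement instance specifically before stopping the old one. A rough Python sketch, assuming the admin server answers `GET /ready` with status 200 once the instance is ready; the function name `wait_until_ready`, its defaults and the example port are made up for illustration:

```python
import time
import urllib.request


def wait_until_ready(admin_url: str, max_seconds: float = 3.0,
                     interval: float = 0.1) -> bool:
    """Poll one instance's admin endpoint until it returns 200 or time runs out."""
    # Ignore proxy settings from the environment: the check must hit the
    # local process directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + max_seconds
    while time.monotonic() < deadline:
        try:
            with opener.open(admin_url, timeout=1.0) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers connection refused as well as urllib's URLError/HTTPError
            # (for example a 503 while the instance is not ready yet).
            pass
        time.sleep(interval)
    return False
```

For example, `wait_until_ready("http://127.0.0.1:3002/ready")` would check only the instance whose `admin-server-port` is 3002 (a made-up value), regardless of how many instances share the main port.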

steve-chavez · 2026-06-24T21:52:51Z

+  When :ref:`server-reuseport` is enabled on an operating system that supports
+  ``SO_REUSEPORT``, you can start multiple PostgREST instances on the same
+  :ref:`server-host` and ``server-port``. For example, two PostgREST processes
+  can use the same configuration:
+
+  .. code:: ini
+
+    server-host = "127.0.0.1"
+    server-port = 3000
+    server-reuseport = true
+
+  New connections are then distributed by the operating system between the
+  running PostgREST processes. This can be used to start a replacement process
+  before stopping the old one, or to run several PostgREST processes behind one
+  port.
+
+  If ``server-reuseport`` is disabled, starting another PostgREST process on
+  the same host and port will fail with the usual address-in-use error.
+
+.. _server-reuseport:
+
+server-reuseport
+----------------
+
+  =============== =================================
+  **Type**        Bool
+  **Default**     false
+  **Reloadable**  N
+  **Environment** PGRST_SERVER_REUSEPORT
+  **In-Database** `n/a`
+  =============== =================================
+
+  Enables ``SO_REUSEPORT`` on the TCP server socket. This allows multiple
+  PostgREST processes to bind to the same :ref:`server-host` and
+  :ref:`server-port` when the operating system supports it.
+
+  Enabling this setting on an operating system that does not support
+  ``SO_REUSEPORT`` is a configuration error. PostgREST will fail to start
+  instead of falling back to a normal TCP socket.
+
+  This setting does not apply when :ref:`server-unix-socket` is used.
+


Ditto here, maybe like:

Suggested change

.. _server-reuseport:

server-reuseport
----------------

  =============== =================================
  **Type**        Bool
  **Default**     false
  **Reloadable**  N
  **Environment** PGRST_SERVER_REUSEPORT
  **In-Database** `n/a`
  =============== =================================

  Enables ``SO_REUSEPORT`` on the TCP server socket. This allows multiple
  PostgREST processes to bind to the same :ref:`server-host` and
  :ref:`server-port` when the operating system supports it.

  For example, two PostgREST processes can use the same configuration:

  .. code:: ini

    server-host = "127.0.0.1"
    server-port = 3000
    server-reuseport = true

  New connections are then distributed by the operating system between the
  running PostgREST processes. This can be used to start a replacement process
  before stopping the old one, or to run several PostgREST processes behind one
  port.

  Use a different ``admin-server-port`` for each instance. Admin ports are not shared
  between instances:

  - Readiness checks always target one specific PostgREST instance.
  - Give each instance a different :ref:`admin-server-port`, otherwise the new instance will fail to start.

  Enabling this setting on an operating system that does not support
  ``SO_REUSEPORT`` is a configuration error. PostgREST will fail to start
  instead of falling back to a normal TCP socket.

  This setting does not apply when :ref:`server-unix-socket` is used.

steve-chavez · 2026-06-24T23:38:34Z

+  Multiple PostgREST instances can share the same public API host and port when
+  :ref:`server-reuseport` is enabled on operating systems that support
+  ``SO_REUSEPORT``. Admin ports are not shared: give each instance a different
+  :ref:`admin-server-port`, otherwise the new instance will fail to start.


Included this in https://github.com/PostgREST/postgrest/pull/4703/changes#r3470538404 to have everything in one place.

steve-chavez · 2026-06-24T23:39:26Z

+  If the machine has multiple network interfaces, configure concrete
+  :ref:`server-host` and :ref:`admin-server-host` values when you need health
+  checks to target a specific process. Avoid special values (``!4``, ``*``, etc)
+  in this case because the health check could report a false positive.


This doesn't look related to this feature?

> This doesn't look related to this feature?

Not directly but I added it because it is important in this case: we have multiple PostgREST processes running at the same time and it is easy to target the wrong one with health checks.

Right, makes sense. But it feels a bit out of place here. I believe it should go inside the server-reuseport section in the config.

So let me see whether I understand the problem this tries to hint at: I turn on server-reuseport. I set server-host at the default of !4. This automatically applies to admin-server-host as well, I think. I now accidentally set the admin-server-port to the same value for both instances. According to the note further up, I would expect this to fail, because the same port for the admin server is used.

But it's not, because it's using a different interface. So I run two admin servers on the same port, but on different interfaces. Now, things start to break.

Is this what you had in mind?

If yes, I feel like it fits right in here. But it should be more framed as an exception to the above rule ("admin servers on the same port will fail to start").

If not.. please elaborate.

mkleczek requested review from steve-chavez and wolfgangwalther March 10, 2026 14:27

mkleczek force-pushed the so-reuseport branch 3 times, most recently from c252511 to 27c16d7 Compare March 10, 2026 15:43

mkleczek marked this pull request as draft March 10, 2026 16:31

mkleczek force-pushed the so-reuseport branch from 27c16d7 to 070d7a1 Compare March 10, 2026 20:00

mkleczek force-pushed the so-reuseport branch 2 times, most recently from 64bb6ca to 2cb6503 Compare March 11, 2026 18:49

mkleczek force-pushed the so-reuseport branch from 2cb6503 to 83a7a91 Compare March 11, 2026 19:32

mkleczek mentioned this pull request Mar 11, 2026

refactor: provide AppState.waitForSchemaCacheLoaded function #4709

Merged

mkleczek force-pushed the so-reuseport branch 7 times, most recently from 7d7375a to bd4c7ec Compare March 12, 2026 10:20

mkleczek mentioned this pull request Mar 12, 2026

refactor: move socket creation and management to App module #4713

Merged

mkleczek force-pushed the so-reuseport branch from bd4c7ec to 2092d2a Compare March 12, 2026 15:13

mkleczek mentioned this pull request Mar 12, 2026

test(io): add test_so_reuseport_zero_downtime_handover #4715

Merged

mkleczek force-pushed the so-reuseport branch from 2092d2a to 52cb192 Compare March 12, 2026 15:20

mkleczek force-pushed the so-reuseport branch from 033bd60 to 127490e Compare April 29, 2026 09:59

steve-chavez reviewed May 1, 2026

Comment thread docs/how-tos/zero-downtime-upgrades.rst Outdated

mkleczek mentioned this pull request May 2, 2026

Add /$/live and /$/ready endpoints on the API server #4866

Closed

mkleczek force-pushed the so-reuseport branch from 127490e to b580f73 Compare May 3, 2026 12:07

steve-chavez reviewed May 4, 2026

Comment thread src/PostgREST/App.hs

steve-chavez reviewed May 4, 2026

mkleczek force-pushed the so-reuseport branch 2 times, most recently from 0e7d4aa to 680c6e1 Compare May 4, 2026 17:53

mkleczek mentioned this pull request May 4, 2026

A statement timeout can void the schema cache #4873

Closed

mkleczek force-pushed the so-reuseport branch 2 times, most recently from c12c6fd to bec258c Compare May 5, 2026 09:27

steve-chavez mentioned this pull request May 5, 2026

fix: Start listening after schema cache load #4880

Merged

mkleczek force-pushed the so-reuseport branch 8 times, most recently from 5641f65 to 71ce2e3 Compare May 12, 2026 05:01

steve-chavez reviewed Jun 16, 2026

Comment thread docs/how-tos/zero-downtime-upgrades.rst Outdated

steve-chavez reviewed Jun 24, 2026

Comment thread test/io/postgrest.py Outdated

add: use SO_REUSEPORT on platform supporting it

be6e366

steve-chavez reviewed Jun 24, 2026

4 participants

mkleczek commented Mar 10, 2026 •

edited

Loading

steve-chavez commented Mar 10, 2026 •

edited

Loading

mkleczek commented Mar 11, 2026 •

edited

Loading

steve-chavez Jun 24, 2026 •

edited

Loading

steve-chavez Jun 24, 2026 •

edited

Loading