
docs: add hot secondaries rfc #11227

Draft · wants to merge 3 commits into main
Conversation

jcsp (Contributor) commented Mar 13, 2025

Problem

Summary of changes

github-actions bot commented Mar 13, 2025

7953 tests run: 7569 passed, 0 failed, 384 skipped (full report)


Flaky tests (3)

  • Postgres 17
  • Postgres 14

Code coverage* (full report)

  • functions: 32.3% (8732 of 27002 functions)
  • lines: 48.4% (74806 of 154645 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
ce3d23e at 2025-03-14T20:27:59.970Z :recycle:

## Purpose

We aim to provide a sub-second RTO for pageserver failures, for mission
critical workloads. To do this, we should enable the postgres client
Member commented:

I see a second benefit that hot secondaries bring: scaling read traffic. Say someone runs a lot of analytic workloads against one database in parallel. For OLTP workloads this is probably already handled by caches, but I'm not sure.

The average total disk write bandwidth is the sum of WAL generation rate plus L1/image generation rate: this is about the same as a normal attached location. The average disk _read_ bandwidth of a hot secondary is far lower than an attached location because it is not reading back layers to compact them -- layers are only read in periods where the attached location was unavailable, so computes started reading from a hot secondary.
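Restating the quoted bandwidth claim as a rough relation (my paraphrase, not text from the RFC):

$$
BW_{\text{write}}^{\text{hot sec.}} \approx r_{\text{WAL}} + r_{\text{L1/image}} \approx BW_{\text{write}}^{\text{attached}},
\qquad
BW_{\text{read}}^{\text{hot sec.}} \ll BW_{\text{read}}^{\text{attached}}
$$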

The trigger for virtual compaction can be similar to the existing trigger
for L1 compaction on attached locations: once we build up a deep stack of L0s, then we do virtual compaction to trim it. This assumes that the attached location has kept up with compaction. The hot secondary can be
arpad-m (Member) commented Mar 13, 2025:

What if both the primary and the hot secondary are in 100% perfect sync, so they have the same number of L0s?

Then the moment comes when the hot secondary and the primary both decide to compact. At that point the secondary will look for remote layers immediately, while the primary is not ready yet: it hasn't uploaded any files.

Edit: what I'm trying to say is that there is a risk of the hot secondary lagging behind in a similar fashion to the warm secondary. The warm secondary misses out on new layers until they make it into the layer map; the hot secondary doesn't miss out on them, but it accumulates a larger compaction debt.
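As a rough illustration of the trigger described in the quoted RFC text (a deep stack of L0s kicks off virtual compaction), here is a minimal sketch. The type, constant, and method names are hypothetical; the real pageserver configuration and compaction code differ:

```rust
// Hypothetical sketch of the "deep L0 stack" trigger for virtual compaction
// on a hot secondary. Names and the threshold are illustrative only.

/// Threshold mirroring the existing L1-compaction trigger on attached locations.
const VIRTUAL_COMPACTION_THRESHOLD: usize = 10;

struct L0Stack {
    /// Number of L0 delta layers ingested since the last (virtual) compaction.
    depth: usize,
}

impl L0Stack {
    /// Called whenever WAL ingest freezes another L0 delta layer.
    fn on_l0_frozen(&mut self) {
        self.depth += 1;
    }

    /// Once the L0 stack is deep enough, "virtually" compact: look for the
    /// L1/image layers the attached location has uploaded for this LSN range
    /// and drop the local L0s they replace.
    fn should_virtually_compact(&self) -> bool {
        self.depth >= VIRTUAL_COMPACTION_THRESHOLD
    }
}
```

The race described in the comment above would show up here as the secondary's depth crossing the threshold before the attached location has uploaded the corresponding L1/image layers, so a depth check alone cannot guarantee that the remote layers exist yet.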

arpad-m (Member) left a comment:

How would the transition from hot secondary to primary work? We now have some data in remote storage that might be inconsistent with local state; the new primary might be ahead or behind, might have fewer layer files, etc.

In general, the design so far has been that S3 is the pristine version of the state that other places, like secondaries and primaries, are downstream of. But once the hot secondary becomes a primary, it might need a step to delete files that are in S3 but not needed locally, because it has a slightly differently cut local copy of them, and we probably don't want to re-download anything during an attach in order to become operational (that was the goal of the hot secondary, after all).

I'm also wondering about backpressure: should hot secondaries failing to catch up cause backpressure? We can probably answer this later, but if there is no backpressure, we might end up in situations where the hot secondary is behind but has different L0s, so it might be smarter to ditch those L0s rather than ditch what's in S3.

Comment on lines +132 to +133
- after some short timeout (100s of ms), compute gives up on getpage requests to the primary and sends
them to the hot secondary.
Contributor commented:

How does the compute learn about the pageserver hosting the hot secondary location? The RFC does not say, so I'm assuming the current apply-config mechanism is implied.

I think that's fine to start with, but it implies an unbounded availability gap when faced with notification delivery issues (of which we've seen quite a few lately).
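A minimal sketch of the compute-side behaviour the quoted diff lines describe (short timeout against the primary, then the same getpage request is sent to the hot secondary). Everything here is hypothetical: the real compute is C code inside Postgres, and the request/response types, trait, and timeout values below are stand-ins:

```rust
use std::time::Duration;

// Hypothetical stand-ins for the real getpage protocol types.
struct GetPageRequest; // rel, block number, request LSN, ...
struct PageImage(Vec<u8>);

trait Pageserver {
    fn get_page(&self, req: &GetPageRequest, timeout: Duration) -> Result<PageImage, String>;
}

/// Try the primary first; after a short timeout ("100s of ms" in the RFC),
/// give up and send the same request to the hot secondary.
fn get_page_with_failover(
    primary: &dyn Pageserver,
    hot_secondary: &dyn Pageserver,
    req: &GetPageRequest,
) -> Result<PageImage, String> {
    let primary_timeout = Duration::from_millis(300);
    match primary.get_page(req, primary_timeout) {
        Ok(page) => Ok(page),
        // Timed out or errored: fail over. The secondary is assumed warm,
        // so a longer timeout is acceptable here.
        Err(_) => hot_secondary.get_page(req, Duration::from_secs(5)),
    }
}
```

This also makes the question above concrete: for the failover path to exist at all, the compute needs an up-to-date address for the hot secondary, which today would arrive via the apply-config notification path.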
