Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catalog: Pass timestamps for transaction commit #30006

Merged
merged 2 commits into from
Nov 25, 2024

Conversation

jkosh44
Copy link
Contributor

@jkosh44 jkosh44 commented Oct 15, 2024

This commit adds plumbing for the catalog to accept a commit timestamp
for transactions.

In general, it is up to the adapter to provide that timestamp.
Currently, the timestamp that the adapter uses depends on the context.
During startup, when the timestamp oracle isn't always readily
available, the adapter will pass some timestamp that it knows will
be greater than or equal to the current catalog upper. Sometimes this
is a write timestamp from the timestamp oracle, other times it's the
current catalog upper. During all other times, the adapter
passes a write timestamp from the timestamp oracle.

After startup, the adapter applies the catalog upper with the timestamp
oracle to linearize all writes to the catalog that happened during
startup.

When getting a timestamp to allocate IDs in the catalog, the
adapter does not apply the received timestamps. If the adapter did
apply these timestamps, then strict serializable reads on tables might
block if the table upper is not past the applied timestamp.

Catalog transactions are able to re-use the oracle write timestamp
that is already available and used for audit log writes. The adapter
always performs a write to builtin tables immediately after catalog
transactions, which will allocate and apply a new write timestamp. So
effectively catalog transaction write timestamps are applied.

The goal of this PR is to move the catalog closer to the EpochMillis
timeline and set up the necessary infrastructure to pass timestamps
from the adapter to the catalog.

This work doesn't fully move the catalog onto the EpochMillis
timeline, the following work still remains:

  • Always apply the commit timestamp with the timestamp oracle after
    committing.
  • Add the catalog shard to the txn shard with all other tables.
    This would prevent blocking table reads when applying the commit
    timestamp. Additionally, it would ensure that the catalog is
    readable at the timestamp oracle read timestamp after a table
    write.

An alternative framing of this commit is that it actually does move the
catalog shard onto the EpochMillis timeline, but the catalog shard
exhibits eventual consistency.

Works towards resolving #MaterializeInc/database-issues/8384

Motivation

This PR adds a known-desirable feature.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@jkosh44 jkosh44 force-pushed the catalog-write-ts branch 10 times, most recently from d2c65c3 to 7dc531a Compare October 16, 2024 20:27
@jkosh44 jkosh44 changed the title Catalog write ts catalog: Pass timestamps for transaction commit Oct 16, 2024
@jkosh44 jkosh44 force-pushed the catalog-write-ts branch 9 times, most recently from dff51ab to 54ddb59 Compare October 21, 2024 14:22
@jkosh44 jkosh44 force-pushed the catalog-write-ts branch 2 times, most recently from 35a2c5c to 44628b9 Compare November 1, 2024 21:16
@jkosh44 jkosh44 force-pushed the catalog-write-ts branch 8 times, most recently from 0ebe505 to 7365135 Compare November 15, 2024 20:20
@jkosh44 jkosh44 force-pushed the catalog-write-ts branch 4 times, most recently from e6764a0 to 015ebb7 Compare November 22, 2024 17:53
Comment on lines +3786 to +3788
// Opening the durable catalog uses one or more timestamps without communicating with
// the timestamp oracle. Here we make sure to apply the catalog upper with the timestamp
// oracle to linearize future operations with opening the catalog.
Copy link
Contributor Author

@jkosh44 jkosh44 Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block, and the ones below, are the exceptions to the "don't apply catalog timestamps" rule. These happens at the end of startup and don't have a negative impact on read latency because we're going to do a bunch of table writes after this which bumps and applies the oracle timestamp.

This commit adds plumbing for the catalog to accept a commit timestamp
for transactions.

In general, it is up to the adapter to provide that timestamp.
Currently, the timestamp that the adapter uses depends on the context.
During startup, when the timestamp oracle isn't always readily
available, the adapter will pass some timestamp that it knows will
be greater than or equal to the current catalog upper. Sometimes this
is a write timestamp from the timestamp oracle, other times it's the
current catalog upper. During all other times, the adapter always
passes a write timestamp from the timestamp oracle.

Importantly, the adapter never applies the commit timestamps with the
timestamp oracle. If the adapter did apply these timestamps, then
strict serializable reads on tables might block if the table upper is
not past the applied timestamp.

The goal of this PR is to move the catalog closer to the EpochMillis
timeline and set up the necessary infrastructure to pass timestamps
from the adapter to the catalog.

This work doesn't fully move the catalog onto the EpochMillis
timeline, the following work still remains:

    - Always apply the commit timestamp with the timestamp oracle after
    committing.
    - Add the catalog shard to the txn shard with all other tables.
    This would prevent blocking table reads when applying the commit
    timestamp.

An alternative framing of this commit is that it actually does move the
catalog shard onto the EpochMillis timeline, but the catalog shard
exhibits eventual consistency.

Works towards resolving #MaterializeInc/database-issues/8384
@jkosh44 jkosh44 marked this pull request as ready for review November 22, 2024 20:37
@jkosh44 jkosh44 requested a review from a team as a code owner November 22, 2024 20:37
@jkosh44 jkosh44 requested a review from ParkMyCar November 22, 2024 20:37
Copy link

shepherdlybot bot commented Nov 22, 2024

Risk Score:83 / 100 Bug Hotspots:7 Resilience Coverage:33%

Mitigations

Completing required mitigations increases Resilience Coverage.

  • (Required) Code Review 🔍 Detected
  • (Required) Feature Flag
  • (Required) Integration Test
  • (Required) Observability
  • (Required) QA Review 🔍 Detected
  • (Required) Run Nightly Tests
  • Unit Test
Risk Summary:

The pull request has a high-risk score of 83, primarily driven by predictors such as the average line count in files and the number of executable lines within files. Historically, PRs with these predictors are 156% more likely to cause a bug than the repo baseline. Additionally, the repository has a steady observed bug trend, and 7 modified files are identified as hotspots with elevated levels of recent bug fixes.

Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity.

Bug Hotspots:
What's This?

File Percentile
../durable/upgrade.rs 95
../durable/persist.rs 95
../inner/create_continual_task.rs 96
../src/coord.rs 98
../src/catalog.rs 90
../sequencer/inner.rs 95
../catalog/open.rs 92

@jkosh44
Copy link
Contributor Author

jkosh44 commented Nov 25, 2024

@def- the RQG tests seem to be consistently failing on Nightly (https://buildkite.com/materialize/nightly/builds/10500#0193553d-f924-4716-b493-c3a809bd6705), but I can't imagine how this PR would impact SELECT statements. Is that a known flake?

@def-
Copy link
Contributor

def- commented Nov 25, 2024

I think you just got unlucky and hit a bad seed. The difference is a slight rounding difference compared to Postgres, which we are tracking as https://github.com/MaterializeInc/database-issues/issues/8366

Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jkosh44 jkosh44 merged commit 3e34715 into MaterializeInc:main Nov 25, 2024
213 of 223 checks passed
@jkosh44 jkosh44 deleted the catalog-write-ts branch November 25, 2024 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants