feat(l1): snap sync overhaul #1763

Open: wants to merge 202 commits into main

@fmoletta fmoletta commented Jan 21, 2025

Motivation
This PR introduces the following upgrades for snap-sync:

  • Use DB-persisted checkpoints so that sync progress survives restarts & cycles
  • Stop ForkChoices & NewPayloads from being applied while syncing
  • Improved handling of stale pivot during sub-processes
  • Improved handling of pending requests when aborting due to stale pivot
  • Fetching of large storage tries (that don't fit in a single range request)
  • Safer (but a bit slower) healing that can be restarted
  • Faster storage fetching (multiple parallel fetches)
  • Periodically shows state sync progress

It also simplifies snap-sync by removing the following logic:

  • No longer downloads bodies and receipts for blocks before the pivot during snap sync (WARNING: this goes against the spec but shouldn't be a problem for the time being)
  • Removes the restart from the latest block when latest - 64 becomes stale. (By this point it is more effective to wait for the next fork choice update)
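To illustrate the "large storage tries" item above, here is a minimal sketch of fetching a trie that doesn't fit in a single range request by issuing several requests back to back. Everything in it is hypothetical and not taken from this PR: `request_range` stands in for a peer, keys are `u64` instead of 32-byte hashes, and range proofs are left out entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a peer: returns up to `limit` slots whose key is
/// >= `start`, plus a flag telling whether more slots remain after the batch.
fn request_range(
    trie: &BTreeMap<u64, u64>,
    start: u64,
    limit: usize,
) -> (Vec<(u64, u64)>, bool) {
    let mut iter = trie.range(start..);
    let batch: Vec<(u64, u64)> = iter.by_ref().take(limit).map(|(k, v)| (*k, *v)).collect();
    let more = iter.next().is_some();
    (batch, more)
}

/// Fetches the whole trie by repeating range requests, each one resuming
/// right after the last key received in the previous response.
fn fetch_large_storage(trie: &BTreeMap<u64, u64>, limit: usize) -> Vec<(u64, u64)> {
    let mut slots = Vec::new();
    let mut start = 0u64;
    loop {
        let (batch, more) = request_range(trie, start, limit);
        // An empty response means there is nothing left to fetch.
        let last_key = match batch.last() {
            Some((key, _)) => *key,
            None => break,
        };
        slots.extend(batch);
        if !more {
            break;
        }
        // Resume after the last key; stop if the key space is exhausted.
        start = match last_key.checked_add(1) {
            Some(next) => next,
            None => break,
        };
    }
    slots
}
```

The resume key (`start` here) is the kind of value that could be checkpointed, so that an aborted fetch continues from where it stopped instead of from the first slot.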

Description

  • Stores the last downloaded block's hash in the DB during snap sync to serve as a checkpoint if the sync is aborted halfway (common case when syncing from genesis). This checkpoint is cleared upon successful snap sync.
  • No longer fetches receipts or block bodies for blocks before the pivot block during snap sync
  • Adds a sync_status method that returns an enum with the current sync status (Inactive, Active or Pending), and uses it in the ForkChoiceUpdate & NewPayload Engine RPC endpoints so that their logic is not applied during an active or pending sync.
  • Fetcher processes now identify stale pivots and remain passive until they receive the end signal
  • Fetcher processes now return their current queue when they finish so that it can be persisted into the next cycle
  • Stores the latest state root during state sync and healing as a checkpoint
  • Stores the last fetched key during state sync as a checkpoint
  • Healing no longer stores the nodes received via p2p. Instead, it inserts the leaf values and rebuilds the trie, which avoids trie corruption between restarts.
  • The current progress percentage and estimated time to finish are periodically reported during state sync
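One possible way to derive the progress percentage and time estimate from the "last fetched key" checkpoint is sketched below. This is an assumption, not necessarily what the PR does: it supposes account hashes are spread uniformly over the key space, so the position of the last fetched key approximates the fraction of state already downloaded. The names `progress_fraction` and `estimate_remaining` are hypothetical.

```rust
use std::time::Duration;

/// Approximates the fraction of the key space already covered, using the
/// first 8 bytes of the last fetched 32-byte key (assumes uniform keys).
fn progress_fraction(last_key: &[u8; 32]) -> f64 {
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(&last_key[..8]);
    u64::from_be_bytes(prefix) as f64 / u64::MAX as f64
}

/// Extrapolates the remaining time from the time spent so far.
/// Returns None when no progress was made yet or the estimate is not representable.
fn estimate_remaining(elapsed: Duration, fraction: f64) -> Option<Duration> {
    if fraction <= 0.0 || fraction > 1.0 {
        return None;
    }
    let remaining_secs = elapsed.as_secs_f64() * (1.0 - fraction) / fraction;
    Duration::try_from_secs_f64(remaining_secs).ok()
}
```

A periodic reporter could call both functions with the latest checkpointed key and log the result. The estimate assumes a constant fetch rate, so it will drift whenever peers slow down or the pivot goes stale.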
Misc:
  • Replaces some noisy unwraps in networking module with errors
  • Applies annotated hacky fixes for the problems reported in #1684 (bug(l1): connection attempt on already connected peer due to revalidation), #1685 (bug(l1): wrong logic when validating peer ForkID) and #1686 (bug(l1): peer connection broken due to failure to add transaction to mempool before sync)
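For reviewers, the sync_status gating described in the list above can be pictured roughly as follows. Only the variant names Inactive, Active and Pending come from this description. The meaning attached to each variant in the comments, the `PayloadStatus` type and the `handle_new_payload` function are hypothetical, and answering with a "syncing" status is an assumption about the endpoint's behavior.

```rust
/// Variant names as listed in the description; the comments are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncStatus {
    /// No sync in progress (assumption).
    Inactive,
    /// A sync cycle is currently running (assumption).
    Active,
    /// A sync was started and is waiting to resume (assumption).
    Pending,
}

/// Hypothetical, reduced response type for the sketch.
#[derive(Debug, PartialEq, Eq)]
enum PayloadStatus {
    Valid,
    Syncing,
}

/// Runs the endpoint logic only when no sync is active or pending.
fn handle_new_payload(
    status: SyncStatus,
    execute: impl FnOnce() -> PayloadStatus,
) -> PayloadStatus {
    match status {
        SyncStatus::Inactive => execute(),
        // During an active or pending sync the payload is not applied.
        SyncStatus::Active | SyncStatus::Pending => PayloadStatus::Syncing,
    }
}
```

Passing the endpoint logic as a closure keeps the check in one place, so the same guard could wrap both ForkChoiceUpdate and NewPayload.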

Closes None


@fmoletta fmoletta marked this pull request as ready for review January 22, 2025 21:53
@fmoletta fmoletta requested a review from a team as a code owner January 22, 2025 21:53
3 participants