Skip to content

Pipeline certificate batch downloads during chain sync#5954

Open
ndr-ds wants to merge 1 commit intomainfrom
ndr-ds/pipelined-cert-downloads-main
Open

Pipeline certificate batch downloads during chain sync#5954
ndr-ds wants to merge 1 commit intomainfrom
ndr-ds/pipelined-cert-downloads-main

Conversation

@ndr-ds
Copy link
Copy Markdown
Contributor

@ndr-ds ndr-ds commented Apr 8, 2026

Motivation

Chain synchronization downloads certificate batches sequentially: download batch →
process batch → download next batch. The CPU is idle during downloads and the network is
idle during processing. For a chain with ~13,000 blocks, this results in ~70-89
blocks/sec with each batch cycle taking progressively longer (2.7s → 22s).

Proposal

Replace the sequential download loop in download_certificates_from with a pipelined
sliding window using FuturesOrdered. Up to max_concurrent_batch_downloads (default:
5) batches are downloaded concurrently, and completed batches are processed sequentially
in order as they arrive.

The number of network requests is unchanged — same batch count as before, just
overlapped with processing instead of serialized.

A new CLI option --max-concurrent-batch-downloads controls the concurrency level.

The RequestsScheduler is now wrapped in Arc to allow sharing across download futures
without requiring Env: Clone.

Future follow-up (needs validator deploy): Expose a streaming gRPC endpoint where
the client requests a height range and the validator streams blocks back in order. This
would reduce N batch requests to a single streaming request.

Test Plan

CI.

Release Plan

  • These changes should be backported to the latest testnet branch.

Copy link
Copy Markdown
Contributor Author

ndr-ds commented Apr 8, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

Comment thread linera-core/src/client/mod.rs Outdated
pub static DEFAULT_CERTIFICATE_DOWNLOAD_BATCH_SIZE: u64 = 500;
pub static DEFAULT_CERTIFICATE_UPLOAD_BATCH_SIZE: u64 = 500;
pub static DEFAULT_SENDER_CERTIFICATE_DOWNLOAD_BATCH_SIZE: usize = 20_000;
pub static DEFAULT_MAX_CONCURRENT_BATCH_DOWNLOADS: usize = 5;
Copy link
Copy Markdown
Contributor

@ma2bd ma2bd Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we start with a very conservative value (1 ?) so that we don't accidentally roll out expensive experimental features to the web clients? Alternatively web/@linera/client could have a different default.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, makes total sense, will decrease

@ndr-ds ndr-ds force-pushed the ndr-ds/pipelined-cert-downloads-main branch from a32e5ec to 2b878b2 Compare April 9, 2026 01:31
@ndr-ds ndr-ds requested review from afck and deuszx April 9, 2026 02:12
Comment thread linera-core/src/client/mod.rs Outdated
>;

let mut download_height = next_height;
let mut futures: FuturesOrdered<CertificateBatchFuture> = FuturesOrdered::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think this style is preferred?)

Suggested change
let mut futures: FuturesOrdered<CertificateBatchFuture> = FuturesOrdered::new();
let mut futures = FuturesOrdered::<CertificateBatchFuture>::new();

Comment thread linera-core/src/client/mod.rs Outdated
scheduler
.download_certificates(&remote, chain_id, height, limit)
.await
}));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this duplication be avoided by using e.g. StreamExt::for_each_concurrent?

Comment thread linera-core/src/client/mod.rs Outdated
while let Some(result) = futures.next().await {
let certificates = result?;
let Some(info) = self
.process_certificates(slice::from_ref(remote_node), certificates)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that mean we're still not doing downloading and processing concurrently? While we're running process_certificates, nobody is polling futures.next() now.

Copy link
Copy Markdown
Contributor

@deuszx deuszx Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it definitely is not running in parallel.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original implementation used tokio::spawn, I think Claude removed it and I didn't see it 🤦🏻‍♂️
Good catch

@ndr-ds ndr-ds force-pushed the ndr-ds/pipelined-cert-downloads-main branch from 2b878b2 to 86ef317 Compare April 9, 2026 20:19
@ndr-ds ndr-ds requested review from afck and ma2bd April 9, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants