Pipeline certificate batch downloads during chain sync#5954
Conversation
| pub static DEFAULT_CERTIFICATE_DOWNLOAD_BATCH_SIZE: u64 = 500; | ||
| pub static DEFAULT_CERTIFICATE_UPLOAD_BATCH_SIZE: u64 = 500; | ||
| pub static DEFAULT_SENDER_CERTIFICATE_DOWNLOAD_BATCH_SIZE: usize = 20_000; | ||
| pub static DEFAULT_MAX_CONCURRENT_BATCH_DOWNLOADS: usize = 5; |
Can we start with a very conservative value (1?) so that we don't accidentally roll out expensive experimental features to the web clients? Alternatively, web/@linera/client could have a different default.
True, makes total sense, will decrease
Force-pushed from a32e5ec to 2b878b2
| >; | ||
| let mut download_height = next_height; | ||
| let mut futures: FuturesOrdered<CertificateBatchFuture> = FuturesOrdered::new(); |
(I think this style is preferred?)
| let mut futures: FuturesOrdered<CertificateBatchFuture> = FuturesOrdered::new(); | |
| let mut futures = FuturesOrdered::<CertificateBatchFuture>::new(); |
| scheduler | ||
| .download_certificates(&remote, chain_id, height, limit) | ||
| .await | ||
| })); |
Can this duplication be avoided by using e.g. StreamExt::for_each_concurrent?
| while let Some(result) = futures.next().await { | ||
| let certificates = result?; | ||
| let Some(info) = self | ||
| .process_certificates(slice::from_ref(remote_node), certificates) |
Doesn't that mean we're still not doing downloading and processing concurrently? While we're running process_certificates, nobody is polling futures.next() now.
Yes, it definitely is not running in parallel.
The original implementation used tokio::spawn; I think Claude removed it and I didn't see it 🤦🏻♂️
Good catch
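The problem the thread identifies is that while `process_certificates` runs, nothing polls `futures.next()`, so the in-flight downloads stop making progress. Spawning each download as its own task (tokio::spawn in the PR; a plain thread below, as a dependency-free sketch with illustrative names) lets the download proceed while the caller is busy processing.

```rust
use std::sync::mpsc;
use std::thread;

fn pipeline() -> u64 {
    let (tx, rx) = mpsc::channel::<Vec<u64>>();
    // The "download" runs on its own thread, so it advances even when the
    // main loop is not polling anything.
    let downloader = thread::spawn(move || {
        let batch: Vec<u64> = (0..5).collect(); // stand-in for a network fetch
        tx.send(batch).unwrap();
    });
    // The main loop can do CPU-bound processing and still receive the batch
    // as soon as the download finishes.
    let batch = rx.recv().unwrap();
    downloader.join().unwrap();
    batch.iter().sum()
}

fn main() {
    assert_eq!(pipeline(), 10);
}
```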
Force-pushed from 2b878b2 to 86ef317
Motivation
Chain synchronization downloads certificate batches sequentially: download batch →
process batch → download next batch. The CPU is idle during downloads and the network is
idle during processing. For a chain with ~13,000 blocks, this results in ~70-89
blocks/sec with each batch cycle taking progressively longer (2.7s → 22s).
Proposal
Replace the sequential download loop in `download_certificates_from` with a pipelined sliding window using `FuturesOrdered`. Up to `max_concurrent_batch_downloads` (default: 5) batches are downloaded concurrently, and completed batches are processed sequentially, in order, as they arrive.
The number of network requests is unchanged: the batch count is the same as before, but downloads are now overlapped with processing instead of serialized.
A new CLI option `--max-concurrent-batch-downloads` controls the concurrency level. The `RequestsScheduler` is now wrapped in `Arc` to allow sharing across download futures without requiring `Env: Clone`.
Future follow-up (needs a validator deploy): expose a streaming gRPC endpoint where the client requests a height range and the validator streams blocks back in order. This would reduce N batch requests to a single streaming request.
Test Plan
CI.
Release Plan
`testnet` branch.