
feat(ethexe-consensus): mini-announces for instant injected TX promises #5321

Open
ukint-vs wants to merge 27 commits into master from vs/mini-announces

Conversation

Member

@ukint-vs ukint-vs commented Apr 8, 2026

Summary

  • Injected TX end-to-end latency was 0.4-12.4s (avg ~6.4s). Processing is fast (~400ms); the bottleneck was the 0-12s wait for the next ETH block
  • After the first announce computes, the producer polls the TX pool every producer_delay interval. If TXs are found, it creates a batched mini-announce; if the pool is empty, it creates the batch commitment immediately
  • Promise delivery latency: ~400ms average (0-200ms poll wait + ~400ms computation)
  • Canonical events skipped for same-block announces (compute layer fix)
  • Block-aware CDL: All CDL-bounded loops now count block transitions instead of announce hops, fixing 7 functions that assumed 1 block = 1 announce

Architecture

Block N arrives
     │
produce_announce()
parent = best_parent_announce()
     │
┌────▼───────────────────────┐
│ WaitingAnnounceComputed    │◄──── mini-announce computed
└────┬───────────────────────┘
     │
┌────▼───────────────────────┐
│ ReadyForMiniAnnounce       │
│ (poll timer running)       │
└──┬──────────┬──────────┬──┘
   │          │          │
 Timer fires  │    New head arrives
   │          │          │
 Pool check   │    AggregateBatchCommitment
   │          │          │
 ┌─▼──┐      │     ┌────▼────────┐
 │TXs?│      │     │ Coordinator │
 └┬───┘      │     └─────────────┘
  │Yes: mini-announce → compute → back to Ready
  │No: AggregateBatchCommitment → Coordinator → Initial

Changes

Producer (ethexe/consensus/src/validator/producer.rs)

  • Timer-based polling: ReadyForMiniAnnounce state with poll_timer. Every tick, checks TX pool via select_for_announce. TXs found → produce_mini_announce_with_txs (batched). Empty → AggregateBatchCommitment.
  • process_new_head override: When new block arrives in ReadyForMiniAnnounce or AggregateBatchCommitment, buffers it in next_block and creates batch commitment. Passes next_block through Coordinator to Initial.
  • No process_injected_transaction override: TXs go to pool via DefaultProcessing, timer picks them up.
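A minimal sketch of the poll-tick decision described above, assuming a plain `Vec` stands in for the TX pool; `Tick` and `on_poll_tick` are illustrative names, not the actual producer.rs API:

```rust
#[derive(Debug, PartialEq)]
enum Tick {
    /// Pool had transactions: batch them all into one mini-announce.
    MiniAnnounce(Vec<u64>),
    /// Pool was empty: create the batch commitment immediately.
    BatchCommitment,
}

/// On each poll_timer tick, drain whatever is queued in the injected-TX pool.
fn on_poll_tick(pool: &mut Vec<u64>) -> Tick {
    if pool.is_empty() {
        Tick::BatchCommitment
    } else {
        // All queued TXs go into a single batched mini-announce.
        Tick::MiniAnnounce(pool.drain(..).collect())
    }
}

fn main() {
    let mut pool = vec![1, 2, 3];
    assert_eq!(on_poll_tick(&mut pool), Tick::MiniAnnounce(vec![1, 2, 3]));
    // The pool is now drained, so the next tick triggers the batch commitment.
    assert_eq!(on_poll_tick(&mut pool), Tick::BatchCommitment);
}
```

Batching falls out naturally: every TX that arrived since the last tick rides the same mini-announce.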

Coordinator (ethexe/consensus/src/validator/coordinator.rs)

  • next_block buffering: Overrides process_new_head to buffer the block instead of dying. Passes next_block to Initial after submission.
  • Timeout on second head: If a second new head arrives while still waiting for signatures, gives up and transitions to Initial. Prevents hang when threshold can't be met.
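The buffering behavior can be sketched as a tiny state machine. `Block`, `State`, and `process_new_head` here are illustrative stand-ins for the coordinator.rs types, and keeping the newest head on the second-arrival give-up is an assumption:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
struct Block(u64);

#[derive(Debug, PartialEq)]
enum State {
    /// Still collecting signatures; at most one new head is buffered.
    Coordinating { next_block: Option<Block> },
    /// Coordination ended; next_block is handed to Initial.
    Initial { next_block: Option<Block> },
}

fn process_new_head(state: State, head: Block) -> State {
    match state {
        // First head while waiting: buffer it instead of dying.
        State::Coordinating { next_block: None } => {
            State::Coordinating { next_block: Some(head) }
        }
        // Second head: threshold can't be met in time, give up to Initial
        // (assumption: the newest head wins).
        State::Coordinating { next_block: Some(_) } => {
            State::Initial { next_block: Some(head) }
        }
        done => done,
    }
}

fn main() {
    let s = process_new_head(State::Coordinating { next_block: None }, Block(1));
    assert_eq!(s, State::Coordinating { next_block: Some(Block(1)) });
    let s = process_new_head(s, Block(2));
    assert_eq!(s, State::Initial { next_block: Some(Block(2)) });
}
```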

Subordinate (ethexe/consensus/src/validator/subordinate.rs)

  • ReadyForMoreAnnounces state: Accepts mini-announces from the producer after first announce computes.
  • replay_pending_events: After entering ReadyForMoreAnnounces, replays pending events oldest-first so parent announces are processed before children.
  • VR handling: Validation request from producer → enqueue as pending → Participant::create. Non-validators drop the VR.
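The oldest-first replay can be illustrated with a `VecDeque` that, like `pending_events`, stores the newest event at the front via `push_front` (illustrative types, not the subordinate.rs API):

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
struct Announce {
    id: u32,
    parent: Option<u32>,
}

/// Drain the newest-first queue and reverse, so parents replay before children.
fn replay_pending_events(pending: &mut VecDeque<Announce>) -> Vec<Announce> {
    let mut replayed: Vec<Announce> = pending.drain(..).collect();
    replayed.reverse(); // oldest event (back of the deque) comes first
    replayed
}

fn main() {
    let mut pending = VecDeque::new();
    // Events arrive oldest first; each push_front puts the newest at the front.
    pending.push_front(Announce { id: 1, parent: None });    // parent announce
    pending.push_front(Announce { id: 2, parent: Some(1) }); // child mini-announce
    let replayed = replay_pending_events(&mut pending);
    // Parent replays before its child, so acceptance never sees an orphan.
    assert_eq!(replayed[0].id, 1);
    assert_eq!(replayed[1].id, 2);
    assert!(pending.is_empty());
}
```

Iterating front-to-back instead would hand children to `accept_announce` before their parents are included, which is exactly the bug the fix addresses.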

Compute (ethexe/compute/src/compute.rs)

  • Skip canonical events: If parent announce has the same block_hash, canonical Ethereum events were already processed. Pass empty events, only process injected transactions.
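A minimal sketch of the same-block check, with illustrative field and function names rather than the actual compute.rs types:

```rust
struct Announce {
    block_hash: u64,
    parent_block_hash: Option<u64>,
}

/// Canonical ETH events for a block must be applied exactly once. If the
/// parent announce already covered this block, return an empty event list
/// so only the injected transactions are processed.
fn canonical_events_for(announce: &Announce, canonical: Vec<u64>) -> Vec<u64> {
    match announce.parent_block_hash {
        Some(parent) if parent == announce.block_hash => Vec::new(),
        _ => canonical,
    }
}

fn main() {
    // First announce for block 7: parent lives in block 6, events apply.
    let first = Announce { block_hash: 7, parent_block_hash: Some(6) };
    // Mini-announce chained within block 7: parent already applied the events.
    let mini = Announce { block_hash: 7, parent_block_hash: Some(7) };
    assert_eq!(canonical_events_for(&first, vec![10, 11]), vec![10, 11]);
    assert!(canonical_events_for(&mini, vec![10, 11]).is_empty());
}
```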

Announces correctness (ethexe/consensus/src/announces.rs, batch/utils.rs)

Mini-announces chain within the same block (parent-child), but all CDL-bounded loops assumed 1 announce = 1 block. Fixed 7 functions:

| Function | Fix |
| --- | --- |
| propagate_announces | leaf_announces() filter prevents exponential growth |
| propagate_one_base_announce | Block-aware CDL expiry check |
| find_announces_common_predecessor | Leaf filter + block-aware iteration |
| best_announce | Score across CDL blocks, not CDL announce hops |
| best_parent_announce | Base-filter to get cross-block parents only |
| recover_announces_chain_if_needed | Block-aware recovery loop |
| calculate_batch_expiry | Block-aware depth counting |

Behavioral change: Non-base announces now live 1 block longer than before. The old code expired at CDL-1 blocks distance (off-by-one vs S1 which says <= CDL). This is intentional and matches the protocol spec. Two existing test assertions updated.
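The hop-vs-block distinction can be sketched as follows; the real announces.rs walks database-backed announces, while here a slice index stands in for the parent link:

```rust
struct Announce {
    block_hash: u64,
    parent: Option<usize>, // index of the parent announce; None for the root
}

/// Distance in *blocks* from `start` back to the chain root. Mini-announces
/// chained within the same block contribute zero, so a CDL bound expressed
/// in blocks is unaffected by how many mini-announces a block contains.
fn block_depth(announces: &[Announce], start: usize) -> u32 {
    let mut depth = 0;
    let mut cur = start;
    while let Some(parent) = announces[cur].parent {
        // Count a hop only when the block actually changes.
        if announces[parent].block_hash != announces[cur].block_hash {
            depth += 1;
        }
        cur = parent;
    }
    depth
}

fn main() {
    // Root in block A, then one base announce plus two chained mini-announces
    // in block B.
    let announces = vec![
        Announce { block_hash: 0xA, parent: None },
        Announce { block_hash: 0xB, parent: Some(0) },
        Announce { block_hash: 0xB, parent: Some(1) }, // mini-announce
        Announce { block_hash: 0xB, parent: Some(2) }, // mini-announce
    ];
    // Three announce hops back to the root, but only one block transition.
    assert_eq!(block_depth(&announces, 3), 1);
}
```

A loop that counted announce hops here would see depth 3 and expire the chain three times faster than the protocol's CDL intends.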

Trade-offs

  • Batch commitment deferred: Created when poll timer finds empty pool or new head arrives. Adds ~200ms vs immediate creation. Promise delivery unaffected (happens during computation).
  • Coordinator timeout: Second new head kills Coordinator. If threshold wasn't met, that batch is lost. Next block's producer creates a new batch via collect_not_committed_predecessors.

Known issues (tracked)

Test plan

  • cargo nextest run -p ethexe-consensus — 85 tests pass (7 new for chained mini-announces)
  • cargo nextest run -p ethexe-compute — 21 tests pass
  • cargo clippy -p ethexe-consensus — clean
  • Codex adversarial challenge — no new issues in announces.rs changes
  • 1000-case proptests for announce propagation
  • Manual test: submit injected TX via RPC, verify promise arrives in ~400ms

🤖 Generated with Claude Code


semanticdiff-com Bot commented Apr 8, 2026

Review changes with SemanticDiff

Changed Files
File Status
  ethexe/compute/src/compute.rs  56% smaller
  ethexe/consensus/src/validator/initial.rs  49% smaller
  ethexe/consensus/src/validator/subordinate.rs  19% smaller
  ethexe/consensus/src/validator/coordinator.rs  19% smaller
  ethexe/consensus/src/validator/batch/utils.rs  14% smaller
  ethexe/consensus/src/announces.rs  9% smaller
  ethexe/consensus/src/validator/producer.rs  3% smaller
  .DS_Store Unsupported file format
  .gitignore Unsupported file format
  .graphify_python Unsupported file format
  .rust-analyzer.toml Unsupported file format
  .serena/.gitignore Unsupported file format
  .serena/memories/project_overview.md Unsupported file format
  .serena/memories/style_and_conventions.md Unsupported file format
  .serena/memories/suggested_commands.md Unsupported file format
  .serena/memories/task_completion.md Unsupported file format
  .serena/project.yml  0% smaller
  TODOS.md Unsupported file format
  conductor/archive/ethexe_version_20260123/index.md Unsupported file format
  conductor/archive/ethexe_version_20260123/metadata.json  0% smaller
  conductor/archive/ethexe_version_20260123/plan.md Unsupported file format
  conductor/archive/ethexe_version_20260123/spec.md Unsupported file format
  conductor/archive/fix_build_rs_git_tracking_20260123/index.md Unsupported file format
  conductor/archive/fix_build_rs_git_tracking_20260123/metadata.json  0% smaller
  conductor/archive/fix_build_rs_git_tracking_20260123/plan.md Unsupported file format
  conductor/archive/fix_build_rs_git_tracking_20260123/spec.md Unsupported file format
  conductor/code_styleguides/general.md Unsupported file format
  conductor/index.md Unsupported file format
  conductor/setup_state.json  0% smaller
  conductor/tracks.md Unsupported file format
  conductor/workflow.md Unsupported file format
  docs/plans/2026-02-23-ethexe-benchmarking-design.md Unsupported file format
  docs/plans/2026-02-23-ethexe-benchmarking-implementation.md Unsupported file format
  docs/plans/2026-03-09-gear-adlc-skills-design.md Unsupported file format
  docs/plans/2026-03-09-tentative-validator-events-design.md Unsupported file format
  docs/plans/2026-03-09-tentative-validator-events-implementation.md Unsupported file format
  docs/plans/2026-03-11-gemini-github-config-implementation.md Unsupported file format
  ethexe-cli Unsupported file format
  ethexe/.DS_Store Unsupported file format
  ethexe/consensus/DEPTH_1_PLAN.md Unsupported file format
  ethexe/consensus/EDGE_CASES.md Unsupported file format
  ethexe/consensus/proptest-regressions/announces.txt Unsupported file format
  ethexe/consensus/src/mock.rs  0% smaller
  ethexe/consensus/src/validator/mock.rs  0% smaller
  ethexe/contracts/lib/forge-std  0% smaller
  ethexe/contracts/lib/frost-secp256k1-evm  0% smaller
  ethexe/service/src/tests/mod.rs  0% smaller
  graphify-out/.DS_Store Unsupported file format
  graphify-out/GRAPH_REPORT.md Unsupported file format
  graphify-out/cache/addb0ad370347ffb1b598523ce8226634d6febf8e34e67c142f4026b908aaf63.json  0% smaller
  graphify-out/cache/eca2c266320e07d95e3247cde799351daadba229000e65ad6e0e5084da0f2261.json  0% smaller
  graphify-out/cost.json  0% smaller
  graphify-out/graph.html  0% smaller
  graphify-out/graph.json  0% smaller
  graphify-out/graphify-out.zip Unsupported file format
  graphify-out/injected.md Unsupported file format
  graphify-out/manifest.json  0% smaller
  injected.md Unsupported file format

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the transaction promise delivery mechanism in the ethexe-consensus module. By implementing a 'mini-announce' flow, the system can now inject and process transactions immediately after the primary block announce is computed, rather than waiting for the next full block cycle. This significantly improves responsiveness for injected transactions while maintaining system stability through a configurable cap on mini-announces per block.

Highlights

  • Mini-Announce Implementation: Introduced 'mini-announces' to allow for rapid delivery of injected transaction promises between main ETH blocks, reducing latency from up to 12 seconds to approximately 400ms.
  • State Machine Updates: Updated the Producer and Subordinate state machines to support a 'ReadyForMiniAnnounce'/'ReadyForMoreAnnounces' state, enabling the validator to process additional transactions after the initial block announce is computed.
  • Deferred Batch Commitment: Modified the batch commitment process to be deferred until the arrival of the next block, ensuring that on-chain state commitment does not block the low-latency promise delivery.

@ukint-vs ukint-vs self-assigned this Apr 8, 2026
@ukint-vs ukint-vs requested a review from ecol-master April 8, 2026 16:52
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a state machine update to support 'mini-announces' in the ethexe consensus layer, allowing for more granular transaction processing within a block cycle. While the implementation enables this new flow, there are critical issues regarding state transitions in process_new_head and process_computed_announce that could lead to dropped batch commitments or deadlocks in the subordinate state machine. I have provided feedback on these critical logic gaps and a medium-severity suggestion for handling redundant mini-announce inclusions.

Comment on lines +189 to +193
let state = Coordinator::create(self.ctx, self.validators, batch, self.block)?;
let state = match next_block {
    Some(block) => state.process_new_head(block)?,
    None => state,
};
Contributor


critical

In a multi-validator setup (where signatures_threshold > 1), Coordinator::create returns a Coordinator state that must wait for signatures from other validators. However, the code immediately calls state.process_new_head(block) on this new state. Since Coordinator does not override process_new_head, it uses the default implementation which transitions to Initial. This effectively drops the coordination for the previous block immediately upon the arrival of the next block, making it impossible to collect signatures and commit the batch.

The state machine needs to be able to handle the coordination of the previous block while simultaneously starting the processing of the new block, or the transition to Initial must be deferred until coordination is complete.

Member Author


Fixed. The code now checks state.is_initial() before passing next_block. For threshold>1, the Coordinator keeps running to collect validation replies, and the next block arrives naturally via the service event loop.

Member Author


Updated in 102613a. Coordinator now overrides process_new_head to buffer the block instead of dying. When submission completes, next_block is passed to Initial. Works for all threshold values.

Comment on lines +155 to +175
fn process_new_head(mut self, block: SimpleBlockData) -> Result<ValidatorState> {
    if let State::ReadyForMiniAnnounce { last_announce_hash } = &self.state {
        // Create batch commitment before transitioning to Initial for the new head.
        // This defers batch creation from block N's announce-compute time to block N+1's
        // arrival, but ensures the batch is still created before processing the new block.
        let last_announce_hash = *last_announce_hash;
        self.next_block = Some(block);
        self.state = State::AggregateBatchCommitment {
            future: self
                .ctx
                .core
                .batch_manager
                .clone()
                .create_batch_commitment(self.block, last_announce_hash)
                .boxed(),
        };
        Ok(self.into())
    } else {
        DefaultProcessing::new_head(self, block)
    }
}
Contributor


high

The process_new_head implementation currently only handles the ReadyForMiniAnnounce state. If a new ETH block arrives while the producer is in the WaitingAnnounceComputed state (i.e., waiting for a mini-announce to be computed), it will fall through to DefaultProcessing::new_head, which transitions to Initial. This causes the producer to drop its state for the current block without ever triggering the AggregateBatchCommitment. As a result, the batch commitment for the block that was just finished will be lost.

You should handle WaitingAnnounceComputed by triggering the batch commitment using the last successfully computed announce hash (which can be retrieved from the database as the parent of the pending mini-announce).

Member Author


This is pre-existing behavior, not introduced by this PR. The original code also had WaitingAnnounceComputed with default process_new_head → Initial. The mini-announce changes didn't modify this path.

Comment on lines +75 to 86
    mut self,
    computed_announce_hash: HashOf<Announce>,
) -> Result<ValidatorState> {
    match &self.state {
        State::WaitingAnnounceComputed { announce_hash }
            if *announce_hash == computed_announce_hash =>
        {
-           if self.is_validator {
-               Participant::create(self.ctx, self.block, self.producer)
-           } else {
-               Initial::create(self.ctx)
-           }
+           // Enter waiting state for more announces (mini-announces).
+           // The Participant transition happens when the validation request arrives.
+           self.state = State::ReadyForMoreAnnounces;
+           Ok(self.into())
        }
Contributor


high

When a subordinate finishes computing an announce and enters ReadyForMoreAnnounces, it must check if a ValidationRequest from the producer is already present in ctx.pending_events. If the producer sent the validation request while the subordinate was still computing the announce, the subordinate will now hang in ReadyForMoreAnnounces indefinitely (or until another mini-announce arrives), missing the opportunity to participate in the batch commitment.

    fn process_computed_announce(
        mut self,
        computed_announce_hash: HashOf<Announce>,
    ) -> Result<ValidatorState> {
        match &self.state {
            State::WaitingAnnounceComputed { announce_hash }
                if *announce_hash == computed_announce_hash =>
            {
                self.state = State::ReadyForMoreAnnounces;

                // Check if we should already transition to Participant
                if self.is_validator && self.ctx.pending_events.iter().any(|e| {
                    matches!(e, PendingEvent::ValidationRequest(r) if r.address() == self.producer)
                }) {
                    return Participant::create(self.ctx, self.block, self.producer);
                }

                Ok(self.into())
            }
            _ => DefaultProcessing::computed_announce(self, computed_announce_hash),
        }
    }

Member Author


Fixed. Added replay_pending_events() which drains pending events after entering ReadyForMoreAnnounces. Any mini-announces or VRs that arrived during computation are now processed immediately.

Comment on lines +344 to +346
if !newly_included {
    return Ok(self.into());
}
Contributor


medium

If newly_included is false, it means this mini-announce was already included in the database (e.g., by another validator or a previous attempt). In this case, the producer should still transition to WaitingAnnounceComputed(announce_hash) to wait for its computation result, rather than staying in ReadyForMiniAnnounce. Staying in ReadyForMiniAnnounce might lead to redundant attempts to produce the same mini-announce or a failure to progress once the computation finishes.

Suggested change
- if !newly_included {
-     return Ok(self.into());
- }
+ if !newly_included {
+     self.state = State::WaitingAnnounceComputed(announce_hash);
+     return Ok(self.into());
+ }

Member Author


This is intentional. If the announce was already included (duplicate from RPC abuse), there's no computation to wait for. Transitioning to WaitingAnnounceComputed would block forever waiting for a computation event that will never come. Staying in ReadyForMiniAnnounce is the correct fallback.

Eliminate the 0-12s ETH block wait for injected transaction promise
delivery. After the first announce computes, the producer enters
ReadyForMiniAnnounce state and immediately creates a new announce
when an injected TX arrives, delivering promises in ~400ms.

Key changes:
- Producer: ReadyForMiniAnnounce state with mini-announce creation,
  batch commitment deferred to next block's process_new_head
- Subordinate: ReadyForMoreAnnounces state accepts mini-announces
  from producer, transitions to Participant on validation request
- DoS protection: MAX_MINI_ANNOUNCES_PER_BLOCK cap (30)
- Pool drain: queued TXs checked on entering ReadyForMiniAnnounce

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs ukint-vs force-pushed the vs/mini-announces branch from b2ff58f to 1af3fa5 Compare April 8, 2026 16:57
@ukint-vs ukint-vs added the A0-pleasereview PR is ready to be reviewed by the team label Apr 8, 2026
    } else {
        DefaultProcessing::new_head(self, block)
    }
}


severity: medium — process_new_head only guards the ReadyForMiniAnnounce state; when a new Ethereum head arrives while the producer is in WaitingAnnounceComputed (computing a mini-announce), it falls through to DefaultProcessing::new_head, which transitions to Initial and silently abandons the current computation.

The batch commitment for block N is intended to be triggered by the next process_new_head call from ReadyForMiniAnnounce (lines 156-174). However, if a TX injection caused a mini-announce (entering WaitingAnnounceComputed again), and a new head arrives before that computation completes, the batch commitment for block N is never created — all injected transactions from the last confirmed announce are deferred to block N+1 in the pool, and block N's commit is permanently skipped.

Consider saving the incoming block as next_block when in WaitingAnnounceComputed and applying it after the computation completes via the existing ReadyForMiniAnnounce → process_new_head path. No test currently covers this race.

Member Author


Documented as TODO in code (producer.rs:188-193). Pre-existing behavior, not introduced by mini-announces. Also added a batch timer in ReadyForMiniAnnounce (af65fee) so batch commitment fires on a deadline without waiting for the next block.

}
_ => DefaultProcessing::validation_request(self, request),
}
}


severity: low — Non-validator subordinates in ReadyForMoreAnnounces that receive a validation request from the producer hit the _ if request.address() == self.producer arm (line 125), which re-adds the request to pending and stays in Subordinate. On the next mini-announce computation, replay_pending_events drains pending and replays the request, which re-adds it again — creating an unbounded recycle loop until the next block arrives.

The loop is harmless (pending queue is bounded by MAX_PENDING_EVENTS = 10) but wastes processing. Since non-validators never transition to Participant, validation requests are meaningless for them in ReadyForMoreAnnounces. The first arm (line 119) should either drop the request instead of saving it, or the guard could be !self.is_validator rather than relying on the fallthrough arm.

Member Author


Fixed in f3c2e76. Non-validators in ReadyForMoreAnnounces now drop the VR instead of re-enqueuing it.


claude Bot commented Apr 8, 2026

Summary

Introduces mini-announces for the ethexe Producer — after the main block announce is computed, the producer stays in a new ReadyForMiniAnnounce state and can chain additional announces for injected transactions; batch commitment is deferred to when the next block head arrives. The Subordinate gains a symmetric ReadyForMoreAnnounces state to accept mini-announces from the producer before transitioning to Participant.

Findings

| Severity | Finding | Suggestion |
| --- | --- | --- |
| medium | When a new Ethereum head arrives while the producer is in WaitingAnnounceComputed computing a mini-announce, process_new_head falls through to DefaultProcessing and transitions to Initial — permanently skipping the batch commitment for block N even though the original announce was already confirmed | In process_new_head, detect the WaitingAnnounceComputed case and save the incoming block as next_block so it is applied once the computation completes and ReadyForMiniAnnounce is entered |
| low | Non-validator subordinates in ReadyForMoreAnnounces re-add producer validation requests to pending on every mini-announce replay cycle, creating a loop until the next block | Drop the request (rather than save it) when !is_validator in ReadyForMoreAnnounces |

After pushing new commits, comment /review-delta to get an incremental review.

ukint-vs and others added 2 commits April 9, 2026 00:08
…unces

Non-validators receiving a validation request in ReadyForMoreAnnounces
would re-enqueue it as pending, causing replay_pending_events to recycle
it indefinitely. Drop it instead since non-validators never transition
to Participant.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…nounce computation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ecol-master
Member

@ukint-vs The idea looks good and makes sense, but in ethexe/compute/src/compute.rs we currently fetch from the database, for each announce, the Ethereum events corresponding to announce.block_hash.

So in the current implementation, each mini-announce will re-execute the same Ethereum events that were already executed for the first announce of the block.

For me, this is the major issue that should be fixed before the normal PR review.

Still, the idea of letting the producer collect transactions within a slot looks good.

@ecol-master
Member

I also checked the state switching in the subordinate, producer and participant.
At first glance it looks correct and follows the current consensus rules.

When a mini-announce shares the same block_hash as its parent announce,
canonical Ethereum events were already processed by the parent. Skip
them to avoid duplicate state transitions (message queueing, program
creation, etc.). Only injected transactions are processed.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Member Author

ukint-vs commented Apr 8, 2026

Fixed in 6c02fd7. prepare_executable_for_announce() now checks if the parent announce has the same block_hash. If so, canonical events are skipped (empty events vec) — only injected transactions are processed. The parent already executed the block's canonical events, and the mini-announce inherits its program_states and schedule from the parent.

Member Author

ukint-vs commented Apr 8, 2026

/delta-review

@ukint-vs ukint-vs force-pushed the vs/mini-announces branch from e5a0e1f to 72c90a8 Compare April 8, 2026 18:15
…ests

Mini-announces defer batch commitment to the next ETH block via
process_new_head. Tests without continuous block generation hang
because no next block arrives. Flip the default so blocks arrive
automatically. The multiple_validators test explicitly opts out.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Member Author

ukint-vs commented Apr 9, 2026

/review-delta


claude Bot commented Apr 9, 2026

Delta Review

Three fixes: skip duplicate canonical events on mini-announces, document the WaitingAnnounceComputed batch-skip race with a TODO, and drop the non-validator VR recycle loop in ReadyForMoreAnnounces — plus flipping the test default to continuous_block_generation=true.

Previous Issues

| Status | Finding |
| --- | --- |
| open | Producer in WaitingAnnounceComputed skips batch commitment when next block arrives — now documented as TODO but not resolved |
| fixed | Non-validator subordinate in ReadyForMoreAnnounces recycled producer VR into pending on every replay cycle |

New Findings

No new issues.

ukint-vs and others added 6 commits April 9, 2026 14:12
The hack in wait_for() already handles deferred batch commitment by
periodically forcing new blocks. Setting continuous_block_generation
to true broke the mailbox test which relies on the hack being active
(hack is None when continuous_block_generation is true).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
When hack forces new blocks during batch aggregation, process_new_head
was falling through to DefaultProcessing which killed the batch future.
Now AggregateBatchCommitment absorbs new heads by updating next_block,
letting the batch complete before processing the latest block.

Also reverts continuous_block_generation default back to false since
the hack already handles deferred batch commitment correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
pending_events stores newest at front (push_front). replay_pending_events
was iterating front-to-back, processing child mini-announces before
parents. accept_announce rejects children whose parent isn't yet
included. Reverse iteration order so parents are processed first.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ests

Mini-announces defer batch commitment to the next ETH block. Without
auto block generation, the producer hangs in ReadyForMiniAnnounce and
never commits, causing test processes to leak on shutdown.

AggregateBatchCommitment now absorbs new heads (71a9eb3), so
continuous blocks no longer kill in-flight batches.

The multiple_validators test explicitly opts out since it requires
manual block control.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Instead of waiting indefinitely for the next ETH block to trigger
batch commitment, start a timer (producer_delay duration) when
entering ReadyForMiniAnnounce. When it fires, batch commitment
proceeds without needing a new head. If a mini-announce or new head
arrives first, the timer is naturally reset/superseded.

This fixes test hangs where no ETH block arrives to trigger batch
commitment. Reverts continuous_block_generation default back to false.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
With 4 live validators (threshold>1), the Coordinator waits for
validation replies. Previously next_block was discarded for threshold>1,
losing block N+1 entirely.

Now Coordinator overrides process_new_head to buffer the block.
When submission completes, next_block is passed to Initial.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Replace event-driven mini-announces with timer-based polling per
ecol-master's feedback. Every poll_timer tick (producer_delay), check
the TX pool: if TXs found, create a batched mini-announce; if empty,
create batch commitment immediately.

Changes:
- Remove MAX_MINI_ANNOUNCES_PER_BLOCK, mini_announce_count,
  process_injected_transaction override
- Rename batch_timer to poll_timer in ReadyForMiniAnnounce
- poll_next_state polls pool on timer, batches TXs naturally
- produce_mini_announce_with_txs takes pre-selected TXs
- Restore threshold_one, threshold_two, code_commitments_only tests

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs
Member Author

/review-delta

// batch commitment for this block is skipped. The announces are still in DB
// and will be picked up by the next block's collect_not_committed_predecessors,
// but block-specific code/validator/reward commitments could be missed.
DefaultProcessing::new_head(self, block)

The TODO at line 166 () is still present. With the removal of MAX_MINI_ANNOUNCES_PER_BLOCK, the producer now spends more time cycling through WaitingAnnounceComputed → ReadyForMiniAnnounce as long as the pool is non-empty. The window where a new block head can arrive in WaitingAnnounceComputed (and miss batch commitment) has grown relative to the old eager-trigger design. Previously, the cap at 30 mini-announces would park the state in ReadyForMiniAnnounce where the new block correctly triggers batch commitment. Now the only exit is an empty pool, which may not happen under sustained TX load. Consider bumping this TODO to a tracked issue if it can't be fixed here.

@claude

claude Bot commented Apr 14, 2026

Delta Review

Replaces the eager per-TX mini-announce trigger and MAX_MINI_ANNOUNCES_PER_BLOCK cap with a timer-based poll loop that collects from the TX pool on each tick and creates batch commitment when the pool drains; also removes the custom Libp2pMetrics wrapper in favour of direct libp2p::metrics types and adds gossipsub Prometheus metrics.

Previous Issues

| Status | Finding |
| --- | --- |
| open | Producer in WaitingAnnounceComputed skips batch commitment when next block arrives — TODO still present |
| open | Coordinator::process_new_head silently overwrites a previously buffered block — Coordinator not in this diff |

New Findings

| Severity | Finding | Suggestion |
| --- | --- | --- |
| medium | Removing MAX_MINI_ANNOUNCES_PER_BLOCK widens the window for the existing WaitingAnnounceComputed batch-skip bug: the producer now stays in the mini-announce loop until the pool drains, so a sustained TX stream increases the chance a block head arrives in WaitingAnnounceComputed rather than ReadyForMiniAnnounce | Track the TODO as a named issue; or restore a per-block cap (e.g. 30) as a safety net while the full fix is deferred |

When Coordinator buffers a new head but can't reach signature threshold
(not enough validators), it hangs forever. Now if a second head arrives
while still waiting, Coordinator gives up and transitions to Initial.

Also un-comments the validator restart test logic that ecol-master
disabled for debugging.
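The buffer-then-give-up behavior can be sketched as a two-branch handler. All names here (`Coordinator`, `Outcome`, the `u64` block stand-in) are illustrative, not the real consensus types.

```rust
// Sketch of Coordinator head handling: buffer the first new head while
// waiting for signatures, give up and reset on the second (illustrative).
enum Outcome {
    Buffered, // keep waiting for validation replies
    GiveUp,   // threshold unreachable: transition back to Initial
}

struct Coordinator {
    next_block: Option<u64>,
}

impl Coordinator {
    fn process_new_head(&mut self, block: u64) -> Outcome {
        match self.next_block {
            None => {
                // First head: buffer it so block N+1 isn't lost.
                self.next_block = Some(block);
                Outcome::Buffered
            }
            // Second head while still below threshold: stop waiting
            // instead of hanging forever.
            Some(_) => Outcome::GiveUp,
        }
    }
}

fn main() {
    let mut c = Coordinator { next_block: None };
    assert!(matches!(c.process_new_head(101), Outcome::Buffered));
    assert!(matches!(c.process_new_head(102), Outcome::GiveUp));
    println!("ok");
}
```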

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs

Member Author

All 3 issues addressed:

  1. process_injected_transaction override removed, replaced with timer-based polling per your proposal (39c75fb). Poll timer checks the pool every producer_delay interval, batching all pending TXs into one mini-announce. When pool is empty → batch commitment created immediately.

  2. Coordinator now buffers next_block via process_new_head override and times out on second new head if signatures can't be collected (57d0257). Validator restart test un-commented and passing.

  3. MAX_MINI_ANNOUNCES_PER_BLOCK removed. Timer naturally rate-limits.

Pre-existing issues (batch lost during WaitingAnnounceComputed, synced/prepared events lost outside Initial) tracked in #5342 and #5343.

Mini-announces chain within the same block (parent-child), but all
CDL-bounded loops assumed 1 announce = 1 block. This caused:
- Exponential announce set growth in propagate_announces
- Convergence failure in find_announces_common_predecessor
- Shallow scoring in best_announce (CDL budget consumed intra-block)
- Premature expiry or missed expiry in propagate_one_base_announce
- Recovery failure when chain has more announces than CDL blocks
- Incorrect batch expiry calculation

Fixes:
- Add leaf_announces() filter for propagation and predecessor search
- Count block transitions (not announce hops) in all CDL-bounded loops
- Filter best_parent_announce to base announces (cross-block parents)
- Fix off-by-one: use > not >= to match S1's <= CDL semantics
- Update S3 theory comment for mini-announce parent rules
- Add 7 tests covering chained mini-announce scenarios

NOTE: Behavioral change — non-base announces now live 1 block longer
than before. The old code expired at CDL-1 blocks distance (off-by-one
vs S1 which says <= CDL). This is intentional and matches the protocol
spec. Two existing test assertions were updated accordingly.
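Counting block transitions instead of announce hops can be sketched like this. The `Announce` struct and index-based parent links are simplified stand-ins for the real DB records; the boundary check uses `>` per the off-by-one note above.

```rust
// Sketch: walk an announce chain counting *block transitions* rather than
// announce hops, so intra-block mini-announces consume no CDL budget.
struct Announce {
    block: u64,            // ETH block this announce belongs to
    parent: Option<usize>, // index of the parent announce, if any
}

/// Walk from `start` toward the root, visiting announces within CDL block
/// transitions (`> cdl`, i.e. "<= CDL" semantics). Returns announces visited.
fn announces_within_cdl(chain: &[Announce], start: usize, cdl: u64) -> usize {
    let mut visited = 1;
    let mut blocks_seen = 0;
    let mut cur = start;
    while let Some(parent) = chain[cur].parent {
        if chain[parent].block != chain[cur].block {
            blocks_seen += 1;
            if blocks_seen > cdl {
                break;
            }
        }
        visited += 1;
        cur = parent;
    }
    visited
}

fn main() {
    // Block 10: base announce (0) plus two mini-announces (1, 2);
    // block 11: one more (3). Chain: 3 -> 2 -> 1 -> 0.
    let chain = vec![
        Announce { block: 10, parent: None },
        Announce { block: 10, parent: Some(0) },
        Announce { block: 10, parent: Some(1) },
        Announce { block: 11, parent: Some(2) },
    ];
    // With CDL = 1 all four announces are reached: only one block transition
    // occurs, however many mini-announces chain within block 10.
    assert_eq!(announces_within_cdl(&chain, 3, 1), 4);
    println!("ok");
}
```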

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs ukint-vs force-pushed the vs/mini-announces branch from 4250edc to 3cd275d Compare April 14, 2026 17:29
let (announce, _pub_key) = verified_announce.into_parts();
self.send_announce_for_computation(announce)
}
State::ReadyForMoreAnnounces
Member

Where is the parent check? A new mini-announce must build on top of the latest one.

Member Author

Fixed in fc0ea47. ReadyForMoreAnnounces now tracks latest_announce_hash (the just-computed announce). Mini-announces are only accepted if announce.parent == latest_announce_hash. If the parent doesn't match, the announce falls through to DefaultProcessing::announce_from_producer, which queues it as pending with a warning.
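The acceptance rule reduces to a single comparison; the sketch below uses a plain `u64` in place of the real hash type and a hypothetical function name.

```rust
// Sketch of the mini-announce parent check (illustrative names and types).
fn accept_mini_announce(latest_announce_hash: u64, parent: u64) -> bool {
    // Only accept a mini-announce that extends the just-computed announce;
    // anything else is left to the default pending-queue handler.
    parent == latest_announce_hash
}

fn main() {
    assert!(accept_mini_announce(0xabc, 0xabc));
    assert!(!accept_mini_announce(0xabc, 0xdef)); // stale or out-of-order parent
    println!("ok");
}
```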

Comment on lines +119 to +124
State::ReadyForMoreAnnounces
if request.address() == self.producer && self.is_validator =>
{
self.ctx.pending(request);
Participant::create(self.ctx, self.block, self.producer)
}
Member

What if the producer sends a mini-announce (MA), then a validation request (VR), but due to network issues the VR arrives before the MA?

Member Author

Fixed in fc0ea47. Before entering Participant, we now check whether the VR's head announce is computed locally (db.announce_meta(h).computed). If it is not (the MA hasn't arrived yet), the VR is saved to pending and the subordinate stays in ReadyForMoreAnnounces. When the MA arrives and computes, replay_pending_events replays the VR, and by then the head is computed.

Edge case: if VR is deferred and no more MAs arrive, VR sits in pending until next ETH block resets to Initial (~12s). Batch fails threshold and is recovered by collect_not_committed_predecessors. Acceptable for a rare network reordering scenario.
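The defer-and-replay mechanics can be sketched as below. The `Subordinate` struct, `u64` hashes, and method names are illustrative stand-ins for the real subordinate state and DB lookups.

```rust
// Sketch of VR-ordering handling: a validation request whose head announce
// hasn't computed locally is parked and replayed after the next compute.
use std::collections::{HashSet, VecDeque};

struct Subordinate {
    computed: HashSet<u64>, // announce hashes with computed = true
    pending: VecDeque<u64>, // deferred validation requests, by head hash
}

impl Subordinate {
    /// Returns true if the VR can be validated now, false if deferred.
    fn on_validation_request(&mut self, head: u64) -> bool {
        if self.computed.contains(&head) {
            true
        } else {
            self.pending.push_back(head); // replayed after the next compute
            false
        }
    }

    /// Called when an announce finishes computing: replay now-ready VRs.
    fn on_announce_computed(&mut self, hash: u64) -> Vec<u64> {
        self.computed.insert(hash);
        let (ready, rest): (VecDeque<u64>, VecDeque<u64>) = self
            .pending
            .drain(..)
            .partition(|h| self.computed.contains(h));
        self.pending = rest;
        ready.into_iter().collect()
    }
}

fn main() {
    let mut sub = Subordinate { computed: HashSet::new(), pending: VecDeque::new() };
    // VR for head 7 arrives before mini-announce 7 computes: deferred.
    assert!(!sub.on_validation_request(7));
    // Announce 7 computes: the deferred VR is replayed.
    assert_eq!(sub.on_announce_computed(7), vec![7]);
    println!("ok");
}
```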

…race

Addresses two review comments from Grisha on subordinate mini-announce handling:

1. ReadyForMoreAnnounces now tracks latest_announce_hash. Mini-announces
   are only accepted if their parent matches the latest computed announce,
   preventing forks from stale or out-of-order announces.

2. Validation requests are deferred when their head_announce is not yet
   computed locally (VR arrived before MA due to network reordering).
   The VR is saved to pending and replayed after the next announce computes.
   If no more MAs arrive, next block's process_new_head resets state.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs ukint-vs force-pushed the vs/mini-announces branch from ed1f17c to fc0ea47 Compare April 15, 2026 16:20
@ukint-vs
Member Author

Superseded by #5352 (two-phase compute). Same goal, simpler approach: depth-1 with canonical-only compute instead of depth-2 mini-announces. Eliminates CDL patches, subordinate state explosion, and gossip reorder complexity.

@ukint-vs
Member Author

Reopening: this is the approach that actually delivers 400ms promise latency. Looking for ways to simplify the implementation (reduce CDL patches, subordinate complexity).

@ukint-vs ukint-vs reopened this Apr 16, 2026
ukint-vs and others added 6 commits April 16, 2026 15:47
Replace ReadyForMoreAnnounces state + replay_pending_events with a
simpler 2-state loop: WaitingForAnnounce ↔ WaitingAnnounceComputed.

After an announce computes, the subordinate loops back to
WaitingForAnnounce instead of entering a third state. VR handling
moves to WaitingForAnnounce (checks head_computed directly).
Pending events are replayed after each announce computes via
process_pending_after_compute.

Saves ~130 lines vs the original mini-announces subordinate.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Two off-by-one bugs in the block-aware CDL conversion introduced by
mini-announces:

1. best_announce: used `>` instead of `>=` to check blocks_seen against
   CDL. Scored CDL+1 blocks instead of CDL, giving weight to stale
   announces outside the commit window.

2. calculate_batch_expiry: broke before examining the announce at the
   CDL boundary block. Moved is_base() check before the break so the
   boundary announce is accounted for. This changes expiry from 1 to 0
   for chains where the oldest not-base announce sits at the boundary.

Both bugs follow the same pattern: the boundary condition fires before
the announce at that boundary is examined. The master code (simple
for-loop) didn't have this issue because the loop body ran before the
counter incremented.
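The ordering pattern can be reproduced in a few lines. This is a simplified model with a hypothetical `walk` function, not the actual best_announce / calculate_batch_expiry code; the flag just switches the break position.

```rust
// Minimal reproduction of the boundary-ordering bug: if the break fires
// before the boundary element is examined, the announce at exactly CDL
// block distance is silently skipped.
struct A {
    block: u64,
}

fn walk(chain: &[A], cdl: u64, examine_first: bool) -> usize {
    let mut examined = 0;
    let mut blocks_seen = 0;
    let mut prev_block = chain[0].block;
    for a in chain {
        if a.block != prev_block {
            blocks_seen += 1;
            prev_block = a.block;
        }
        if !examine_first && blocks_seen >= cdl {
            break; // buggy order: boundary fires before examining this announce
        }
        examined += 1; // "examine" the announce
        if examine_first && blocks_seen >= cdl {
            break; // fixed order: the boundary announce was already examined
        }
    }
    examined
}

fn main() {
    // Walking back from block 12 with CDL = 2: blocks 12, 11, 10.
    let chain = [A { block: 12 }, A { block: 11 }, A { block: 10 }];
    // Buggy order drops the announce at the boundary block...
    assert_eq!(walk(&chain, 2, false), 2);
    // ...while the fixed order examines it before breaking.
    assert_eq!(walk(&chain, 2, true), 3);
    println!("ok");
}
```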

Found by Codex structured review (gpt-5.4).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The UnknownParent deferral in process_announce pushed to pending_events
without checking MAX_PENDING_EVENTS. A Byzantine producer could spam
unknown-parent announces to grow the queue unboundedly. Now drops the
announce if the queue is full.

Found by Codex adversarial challenge (gpt-5.4).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Rewrite based on actual code tracing and cross-model review findings.
Key corrections from this session:
- Programs only init via Ethereum (not from injected TXs)
- Processor ordering is correct as-is (injected first = priority)
- CDL boundary must use >= not > (off-by-one found and fixed)
- Subordinate simplified to 2 states (ReadyForMoreAnnounces removed)
- Pending queue cap enforced during steady-state deferral
- Batch commitment loss is pre-existing, not mini-announces-specific

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ukint-vs
Member Author

/review-delta

ukint-vs and others added 2 commits April 16, 2026 19:42
Extract shared patterns to eliminate duplication and centralize
block-aware CDL counting logic:

- AnnounceChainWalker: single source of truth for block-transition
  tracking across 3 call sites (propagate_one_base_announce,
  best_announce, calculate_batch_expiry)
- Producer::finalize_announce: consolidates announce inclusion,
  signing, publishing, and state transition from two 96%-identical
  methods
- Subordinate::transition_to_computing: DRYs accept→emit→transition
  sequence used in two call sites

All 82 consensus tests + 21 compute tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
These are local Claude Code worktree references that should never be
committed. They cause CI failures because git treats them as submodule
paths with no matching .gitmodules entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Labels

A1-inprogress Issue is in progress or PR draft is not ready to be reviewed A3-gotissues PR occurred to have issues after the review
