@adlrocha
Collaborator

@adlrocha adlrocha commented May 5, 2025

Note: This PR is not meant to be merged

This PR includes a draft spec proposal for the submission and verification of child shard segments into the upper layers of the hierarchy.

The outcome of this discussion should be something that enables the implementation of this part of the protocol.

Owner

@nazar-pc nazar-pc left a comment

Like #220, this is a start, but it doesn't go into sufficient depth to actually compose a provable verification scheme.

For example, I'm really interested in what the actual proof looks like: what information do we need? It turned out that the crucial thing, SuperSegmentProof's contents, is exactly what is missing here right now. I have discovered multiple pieces of information that would be needed, and there are a few alternatives regarding how and where to store them.

Pseudo-code needs to contain the actual logic, just as the data structures do. For example, this is not very informative: the contents of verify_proof are what matters, and that is the only thing that is actually missing:

fn verify_piece_inclusion_proof(
    piece: Piece,
    proof: PieceProof,
    segment: Segment,
) -> bool {
    segment.verify_proof(piece, proof)
}

Comment on lines +27 to +39
```rust
struct ChildSegmentDescription {
    // Shard ID of the child shard the segment belongs to.
    shard_id: ShardId,
    // The root of the segment.
    segment_root: Hash,
    // Root of the previous segment created in the shard.
    prev_segment_root: Hash,
    // Local index of the segment (it may be redundant if we assume
    // that segments are always submitted in increasing order).
    local_index: u64,
}
```
Owner

shard_id is implicitly clear from the rest of the information sent from the child shard, so I don't think it needs to be included here. local_index can also be tracked by the parent shard if necessary (it is always trivially +1 from the previous), though I'm not sure how it is helpful. Similarly, it is not clear what the purpose of prev_segment_root is: there is nothing in the data structure to make sense of it, and the data isn't tied together in any way.

It looks like only segment_root is truly needed here or do you have plans for other fields too?
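To illustrate the point about `local_index` being derivable, here is a minimal sketch of a parent shard that only receives `segment_root` and infers the local index from arrival order. `ChildSegmentTracker`, `record`, and the concrete `ShardId` and `SegmentRoot` types are hypothetical names introduced for this example, and it assumes the shard ID is already known from the channel the data arrived on.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real types; the sizes are assumptions.
type ShardId = u32;
type SegmentRoot = [u8; 32];

/// Parent-side record of segment roots received from child shards.
#[derive(Default)]
struct ChildSegmentTracker {
    // Segment roots per child shard, in the order they were received.
    roots: HashMap<ShardId, Vec<SegmentRoot>>,
}

impl ChildSegmentTracker {
    /// Records a segment root and returns the local index implied by
    /// arrival order (always the previous index plus one).
    fn record(&mut self, shard_id: ShardId, segment_root: SegmentRoot) -> u64 {
        let roots = self.roots.entry(shard_id).or_default();
        roots.push(segment_root);
        (roots.len() - 1) as u64
    }
}
```

With this shape, neither `shard_id` nor `local_index` has to travel inside the segment description itself.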

Comment on lines +41 to +45
3. The status of child shard segments is tracked, indexed by their `local_index`, through their
`IndexStatus`. `IndexStatus` indicates whether the segment has already been submitted to
the parent, is pending confirmation in the beacon chain, or has been submitted to the global
history of the beacon chain and has already been assigned its global segment index (its
position in the global history).
Owner

Tracked where and how? I assume this is a client-side thing?

Comment on lines +47 to +58
```rust
// A segment index is a pair: (local_index, global_index).
type SegmentIndex = (u64, IndexStatus);

enum IndexStatus {
    // The segment has been committed to the global history
    // (carries the assigned global index).
    Committed(u64),
    // The segment is pending to be committed to the global history.
    Pending,
    // The segment has not been submitted yet.
    NotSubmitted,
}
```
Owner

SegmentIndex today is simply a u64. Why does it need to be a tuple of two values? If we're talking about global history it assumes every segment has a unique increasing number, not a tuple.
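A rough sketch of the alternative implied here: the global index stays a single increasing number, and any per-segment progress is tracked separately. `SegmentStatus` and `GlobalHistory` are hypothetical names for this illustration, not types from the codebase.

```rust
/// Global segment index: a single increasing number.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct SegmentIndex(u64);

/// Hypothetical view of a child shard segment's progress,
/// kept separate from the index type itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SegmentStatus {
    NotSubmitted,
    Pending,
    Committed(SegmentIndex),
}

/// Hypothetical counter handing out global indices in commit order.
#[derive(Default)]
struct GlobalHistory {
    next_index: u64,
}

impl GlobalHistory {
    /// Assigns the next global index to a newly committed segment.
    fn commit(&mut self) -> SegmentStatus {
        let index = SegmentIndex(self.next_index);
        self.next_index += 1;
        SegmentStatus::Committed(index)
    }
}
```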

Comment on lines +74 to +77
5. The parent chain pulls all the `ChildSegmentDescription` (or `SegmentHeader`) from the child
segments propagated and after performing the corresponding light verification includes them on
the `consensus_info` of their next block along with any new local segment created in the parent
shard to propagate them all up to the beacon chain.
Owner

Let's maybe standardize on either using "chain" or "shard" here.

Also, I have a suggestion on how to (subjectively) improve the description of such things. For me it is easier to build a mental model of the thing being described when I can find an anchor, a reference point. Here you're talking about the parent chain (so it is not an anchor, it is the parent of something we'll mention later), then child segments (so not an anchor either, because it is a child of something that will be mentioned later), and then the parent shard (which I initially thought was not the same parent as at the beginning of the sentence, but later realized it was). By "anchor" I mean a place where I would "stand" if I were to visualize what is happening.

So imagine there are layers of shards on top of each other:

               beacon chain
            /                \
      shard 1                 shard 2
     /       \               /       \
shard 11    shard 12    shard 21    shard 22

In the description given right now, I happen to "stand" somewhere between "shard 1" and "shard 11" for example, which is an awkward place, especially when "parent" is mentioned twice in the same sentence, meaning different things. Let's rewrite the sentence assuming "standing" on "shard 1" specifically:

A shard aggregates its own SegmentHeader along with any SegmentHeaders referenced by corresponding child shard block headers in consensus_info, so they are all propagated to the beacon chain.

I was trying to assume a reference point of "a shard" and count everything relative to it (child shard and parent shard/beacon chain). We can also reduce the verbosity a lot when there is a stable reference.

If we start with a SegmentHeader (for which we have already established when it is created and included in the history), we don't need to describe it again here, and we don't need to describe where it is created either, because it is implicitly "a shard" we just started the sentence with. So whenever we do that (and we know exactly when), if there happen to be block headers of child shards containing segment headers too, we aggregate them.

It also implies that the local segment header might be missing, but the logic we have described is still valid: we still aggregate them like before. Mentioning a segment header implies we have verified its contents, whatever they happen to be (which is already described somewhere), so it doesn't need to be re-explained here.

At least this is the way I build a mental model about what is happening here.
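The "standing on a shard" description can be sketched as a single function. This is only an illustration under assumptions: `SegmentHeader` and `ChildBlockHeader` are heavily simplified hypothetical types, the function name is made up, and placing the shard's own header first is an arbitrary ordering choice that the discussion does not specify.

```rust
// Hypothetical, simplified types; the real headers carry more fields.
#[derive(Clone, Debug, PartialEq)]
struct SegmentHeader {
    segment_root: [u8; 32],
}

struct ChildBlockHeader {
    // Segment headers referenced by this child shard block, if any.
    segment_headers: Vec<SegmentHeader>,
}

/// Collects what a shard would put into `consensus_info` of its next block:
/// its own segment header (when one was produced) followed by the segment
/// headers referenced by child shard block headers.
fn aggregate_segment_headers(
    own: Option<SegmentHeader>,
    child_headers: &[ChildBlockHeader],
) -> Vec<SegmentHeader> {
    own.into_iter()
        .chain(
            child_headers
                .iter()
                .flat_map(|header| header.segment_headers.iter().cloned()),
        )
        .collect()
}
```

Note that the `Option` covers the case where the shard has no local segment header at that point: the child headers are aggregated the same way regardless.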

Comment on lines +135 to +137
fn generate_piece_inclusion_proof(piece: Piece, segment: Segment) -> Option<PieceProof> {
    segment.generate_proof(piece)
}
Owner

Step 1 in the previous section started with the piece that already contains the proof that it is a part of the segment.


segment.generate_proof(piece) kind of makes sense, but wrapping it in generate_piece_inclusion_proof doesn't really help with the description, it is an unnecessary (in this case) abstraction and is basically a tautology.

2. **Locate the corresponding super segment**:
Owner

Locating the super segment is not an important or interesting part. I think it is safe to say that we will not actually be locating anything: proof generation will be a reaction to super segment creation, implying we already have it and it is certainly not missing in that case (so Option<> isn't needed).

Comment on lines +151 to +159
3. **Generate the super segment proof**:

- Using the `SuperSegment`'s Merkle tree, generate a proof of inclusion for `segment4`.

```rust
fn generate_super_segment_proof(segment: Segment, super_segment: SuperSegment) -> Option<SuperSegmentProof> {
    super_segment.generate_proof(segment)
}
```
Owner

This is the part that is meant to be useful, but it isn't really; it is just a tautology without any meat.

Something like this would be more descriptive if we are doing pseudo-code:

let tree = MerkleTree::new(super_segment.components());
let super_segment_root = tree.root();
let segment_proof = tree.get_proof(segment_offset);

It indicates that we turn a super segment into some set of "things" that we can build a Merkle Tree with. Then we create a proof that our RawSegment was there at segment_offset.

What we'll need to include in the piece to make it verifiable is:

  • segment_root
  • segment_offset
  • segment_proof
  • super_segment.num_segments (assuming segment roots are the only thing we create a Merkle Tree over)

These are the things both necessary and sufficient to securely verify inclusion against a single root hash. See Merkle Tree API:

pub fn verify(
    root: &[u8; OUT_LEN],
    proof: &[[u8; OUT_LEN]],
    leaf_index: usize,
    leaf: [u8; OUT_LEN],
    num_leaves: usize,
) -> bool {

root -> super_segment_root
proof -> segment_proof
leaf_index -> segment_offset
leaf -> segment_root
num_leaves -> num_segments
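To make the role of each of the five inputs concrete, here is a self-contained toy sketch of proof generation and verification over a list of segment roots. It is not the Merkle Tree implementation referenced above: the 64-bit `DefaultHasher`-based hash is a non-cryptographic placeholder, and the rule for odd-sized levels (an unpaired last node is promoted unchanged) is an assumption made for this example. `hash_pair`, `next_level`, and `root_and_proof` are hypothetical helpers; only the parameter order of `verify` mirrors the signature quoted above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy 64-bit hash for illustration only; NOT cryptographic.
type Hash = u64;

fn hash_pair(left: Hash, right: Hash) -> Hash {
    let mut hasher = DefaultHasher::new();
    hasher.write_u64(left);
    hasher.write_u64(right);
    hasher.finish()
}

/// Computes the next tree level; an unpaired last node is promoted unchanged.
fn next_level(level: &[Hash]) -> Vec<Hash> {
    level
        .chunks(2)
        .map(|pair| match pair {
            [left, right] => hash_pair(*left, *right),
            _ => pair[0],
        })
        .collect()
}

/// Returns the root and the proof (sibling hashes, bottom-up) for `index`.
fn root_and_proof(leaves: &[Hash], mut index: usize) -> (Hash, Vec<Hash>) {
    assert!(index < leaves.len());
    let mut level = leaves.to_vec();
    let mut proof = Vec::new();
    while level.len() > 1 {
        let sibling = index ^ 1;
        // A promoted node has no sibling on this level.
        if sibling < level.len() {
            proof.push(level[sibling]);
        }
        level = next_level(&level);
        index /= 2;
    }
    (level[0], proof)
}

/// Verifies that `leaf` sits at `leaf_index` among `num_leaves` leaves under `root`.
fn verify(
    root: Hash,
    proof: &[Hash],
    mut leaf_index: usize,
    leaf: Hash,
    mut num_leaves: usize,
) -> bool {
    if leaf_index >= num_leaves {
        return false;
    }
    let mut current = leaf;
    let mut proof = proof.iter();
    while num_leaves > 1 {
        // `num_leaves` tells us whether a sibling exists on this level.
        if (leaf_index ^ 1) < num_leaves {
            let sibling = match proof.next() {
                Some(sibling) => *sibling,
                None => return false,
            };
            current = if leaf_index % 2 == 0 {
                hash_pair(current, sibling)
            } else {
                hash_pair(sibling, current)
            };
        }
        leaf_index /= 2;
        num_leaves = (num_leaves + 1) / 2;
    }
    // All proof elements must be consumed.
    proof.next().is_none() && current == root
}
```

In this sketch the verifier cannot decide whether a node was hashed with a sibling or promoted without knowing the leaf count, which is one way to see why `num_segments` belongs in the list of required fields.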

Owner

After reading the verification section below, I think we also need to store a mapping from segment index to super segment index somewhere, so we can figure out which super segment root to use for piece verification. Previously, with a single chain, it was a 1:1 mapping, but now piece index N can be in any super segment, depending on how many segments were included in each.
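One possible shape for such a mapping, sketched under assumptions: each super segment stores the global index of its first segment and its segment count, and lookup is a binary search. `SuperSegmentInfo` and `locate` are hypothetical names, and the field widths (`u64` plus `u32`) are only one plausible choice.

```rust
/// Hypothetical per-super-segment record kept by a verifier.
struct SuperSegmentInfo {
    // Global index of the first segment included in this super segment.
    first_segment_index: u64,
    // Number of segments included in this super segment.
    num_segments: u32,
}

/// Finds the super segment containing `segment_index` and the segment's
/// offset within it. `infos` must be sorted by `first_segment_index`.
fn locate(infos: &[SuperSegmentInfo], segment_index: u64) -> Option<(usize, u32)> {
    // Number of super segments that start at or before `segment_index`.
    let count = infos.partition_point(|info| info.first_segment_index <= segment_index);
    let super_segment_index = count.checked_sub(1)?;
    let info = &infos[super_segment_index];
    let offset = segment_index - info.first_segment_index;
    (offset < u64::from(info.num_segments)).then(|| (super_segment_index, offset as u32))
}
```

The returned offset is what would be passed as the leaf index when verifying the segment root against the chosen super segment root.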

Comment on lines +161 to +180
4. **Combine the proofs**:

- Package the piece inclusion proof and the super segment proof into a single proof structure.

```rust
struct GlobalHistoryProof {
    piece_proof: PieceProof,
    super_segment_proof: SuperSegmentProof,
}

fn generate_global_history_proof(piece: Piece, segment: Segment, beacon_chain: BeaconChain) -> Option<GlobalHistoryProof> {
    let piece_proof = generate_piece_inclusion_proof(piece, segment)?;
    let super_segment = locate_super_segment(segment, beacon_chain)?;
    let super_segment_proof = generate_super_segment_proof(segment, super_segment)?;
    Some(GlobalHistoryProof {
        piece_proof,
        super_segment_proof,
    })
}
```
Owner

I'd say this is less a combination of proofs and more a matter of inserting the segment proof and segment offset into the pieces we produced earlier, which were basically unverifiable until then, to complete them.
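A minimal sketch of that "completion" view, assuming the fields listed earlier in this review (`segment_root`, `segment_offset`, `segment_proof`, `num_segments`). The `Piece` layout, `SegmentInclusion`, and `complete_pieces` are hypothetical and far simpler than a real piece format.

```rust
type Hash = [u8; 32];

/// Fields needed to verify a segment root against a super segment root.
#[derive(Clone, Debug, PartialEq)]
struct SegmentInclusion {
    segment_root: Hash,
    segment_offset: u32,
    segment_proof: Vec<Hash>,
    num_segments: u32,
}

/// Hypothetical, simplified piece layout.
struct Piece {
    // Record data together with the proof of inclusion in the segment.
    record_and_proof: Vec<u8>,
    // `None` until the super segment containing the piece's segment exists.
    segment_inclusion: Option<SegmentInclusion>,
}

impl Piece {
    /// A piece can only be checked against global history once completed.
    fn is_verifiable(&self) -> bool {
        self.segment_inclusion.is_some()
    }
}

/// Completes all pieces of one segment as a reaction to super segment creation.
fn complete_pieces(pieces: &mut [Piece], inclusion: &SegmentInclusion) {
    for piece in pieces.iter_mut() {
        piece.segment_inclusion = Some(inclusion.clone());
    }
}
```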

Comment on lines +191 to +221
1. **Verify the piece inclusion proof**:

- Use the Merkle root of `segment4` to validate the inclusion proof for `piece1`.

```rust
fn verify_piece_inclusion_proof(piece: Piece, proof: PieceProof, segment: Segment) -> bool {
    segment.verify_proof(piece, proof)
}
```

2. **Verify the super segment proof**:

- Use the Merkle root of the `SuperSegment` to validate the inclusion proof for `segment4`.

```rust
fn verify_super_segment_proof(segment: Segment, proof: SuperSegmentProof, super_segment: SuperSegment) -> bool {
    super_segment.verify_proof(segment, proof)
}
```

3. **Combine the verification steps**:

- Ensure both the piece inclusion proof and the super segment proof are valid.

```rust
fn verify_global_history_proof(proof: GlobalHistoryProof, piece: Piece, segment: Segment, super_segment: SuperSegment) -> bool {
    let piece_valid = verify_piece_inclusion_proof(piece, proof.piece_proof, segment);
    let super_segment_valid = verify_super_segment_proof(segment, proof.super_segment_proof, super_segment);
    piece_valid && super_segment_valid
}
```
Owner

Similarly this pseudo-code really says nothing about how things are verified, the contents of those methods matter!

We have neither the segment nor the super segment here. We only have a piece and the super segment root, so the signature is still the same as in the existing code below, just with segment_root replaced by super_segment_root:

/// Validate proof embedded within a piece produced by the archiver
pub fn is_valid(&self, segment_root: &SegmentRoot, position: u32) -> bool {
    let (record, &record_root, parity_chunks_root, record_proof) = self.split();
    let source_record_merkle_tree_root = BalancedHashedMerkleTree::compute_root_only(record);
    let record_merkle_tree_root = BalancedHashedMerkleTree::compute_root_only(&[
        source_record_merkle_tree_root,
        **parity_chunks_root,
    ]);
    if record_merkle_tree_root != *record_root {
        return false;
    }
    record_root.is_valid(segment_root, record_proof, position)
}

This also likely indicates we should include the super segment index in the piece itself (the segment index, just like the position, could be inferred from piece_index before, but the super segment index cannot). That is unless we do something like "we only create a super segment when we have N segments", which I don't think we want to do due to the delays it would cause, especially early in the history of the blockchain.

Alternatively, we may store the information necessary to map segment index to super segment index. We may also store the number of segments in each super segment there, if the storage increase is not a concern for light clients. Those two together will take something like 11-12 bytes of extra storage per super segment.

@nazar-pc nazar-pc marked this pull request as draft May 5, 2025 22:29
@adlrocha
Collaborator Author

adlrocha commented May 30, 2025

Closing this PR. We have made a lot of progress on the end-to-end flow of segment commitments to global history, which makes this discussion outdated. It has been superseded by #267, which has a more up-to-date and cleaner description of this protocol mechanism.

@adlrocha adlrocha closed this May 30, 2025
@nazar-pc nazar-pc deleted the discussion/segment-proofs branch May 31, 2025 00:22