@adlrocha
Collaborator

@adlrocha adlrocha commented Apr 21, 2025

This PR:

  • Adds a README that makes it possible to track the state of progress in a single view.
  • Adds a draft for the sharded archiving spec.

As mentioned in the README, the goal is not for this to become the definitive spec for the protocol, but to be the groundwork to start uncovering issues, dig deep into details that I may not be considering, get feedback, and already enable the implementation of some basic prototypes.

Owner

@nazar-pc nazar-pc left a comment

Please move to specs and squash (this and the other PR)

@nazar-pc
Owner

Something wasn't right with the last push, commits ended up being duplicated with a bunch of commits that should not be here. I'm force-pushing a fixed version.

@adlrocha adlrocha changed the title from "Added spec directory and sharded archiving draft" to "spec: sharded archiving draft" on Apr 21, 2025
@adlrocha
Collaborator Author

Thanks, I really don't understand what happened to my main branch; it got super polluted and keeps doing this. Hopefully this fixes it; otherwise I'll clone a clean version of the repo to avoid this in the future.

@nazar-pc
Owner

nazar-pc commented Apr 21, 2025

This is why I always say JetBrains' git integration is the best. I'm yet to see an issue like this happen to me.
VS Code users manage to do this regularly for whatever reason.

@adlrocha
Collaborator Author

I use raw git (or graphite as a wrapper on top of this), but I agree that with raw git this shit keeps happening sometimes, and it is a PITA.

Owner

@nazar-pc nazar-pc left a comment

Just started, didn't read the whole thing yet.

On one hand for me it is probably a bit too verbose since I assume this is based on Subspace. But for someone who is not familiar with Subspace this may not really be a sufficient description.


- `segment_commitments[]`: hash map of KZG commitments computed over pieces in a segment and stored
in the runtime of each shard. This data structure represents the segment history of the shard and
include both, local segment commitments, and segment commitments from child shards (through
Owner

Why does it need to store segment commitments of child shards?

Or I guess the better thing to ask is "where"?

When I think about data structures I think of struct S, these look like storage entries, but not clear if they are stored in the runtime, just part of the block header/body, or something off-chain entirely.

Collaborator Author

Good catch! I forgot to update this after the change to Merkle Trees instead of KZG, and the commitment of segments directly to the beacon chain (edited the description accordingly).

My idea here was to store this information however it is currently done for `segment_commitments[]` in Subspace. From skimming through the code and the spec, my understanding is that this is stored on-chain as part of the blockchain state, but please correct me if this is not the case.

Let me know if I need to elaborate on this in the description after my edit.

Owner

The reason it is done in Subspace is such that we can verify solutions later. With votes not in the runtime it may not be that critical, but still probably worth doing.

Here it is not useful for the same purpose since pieces will be verified against beacon chain's super segments instead. It doesn't seem to add any value to permanently store roots of segments of the child shards, I don't see how it will be used.

in the runtime of each shard. This data structure represents the segment history of the shard and
include both, local segment commitments, and segment commitments from child shards (through
`SuperSegment`s). Each of them should be accordingly tagged as such.
- `shard_blocks[]`: hash map of blocks committed from child shards. It stores the history of the
Owner

Similar question: where is it stored?

Collaborator Author

Same answer as above: I was thinking of this as a convenient state object stored on-chain and updated when a block includes a `BlockInfo` or a new segment, but happy to stand corrected.

Owner

Blocks are different, how do you plan to use this?

- Input: Genesis block of the shard.
- The genesis block is archived as soon as it is produced. We extend the encoding of the genesis
block with extra pseudorandom data up to `RECORDED_HISTORY_SEGMENT_SIZE`, such that the very first
archived segment can be produced right away, bootstrapping the farming process. This extra data is
Owner

We may do that (and it might look easier in the code), but depending on piece selection rules, this is only required on the beacon chain. Anything that starts after it can use existing pieces of the beacon chain or other shards for plotting.

Collaborator Author

The reason I went this way is to avoid needing different genesis logic for shards that start with the network and shards that start when the system is already online.

Also, I was having a hard time figuring out how the farming process would be bootstrapped otherwise, even for shards that come later in the history of the system (but I see your point, we may need to adjust this once we have the details for sharded farming).

Owner

I think the bootstrapping process will likely be technically the same for shards started at any time, even those that start originally can be forced to reference beacon chain block 1, so they can observe the history already.

Even if we create a segment on a shard during bootstrapping, it doesn't help in any way with plotting since we're plotting global history and this newly produced segment is not confirmed yet.

Owner

@nazar-pc nazar-pc left a comment

Left some immediate responses before looking at the updated contents


@nazar-pc
Owner

I don't know what happened, but looks like you overrode my squash and pushed even more commits 😕

I'll fix it up, but try to maybe pull the latest version first next time.

@adlrocha
Collaborator Author

> I don't know what happened, but looks like you overrode my squash and pushed even more commits 😕
>
> I'll fix it up, but try to maybe pull the latest version first next time.

I did pull it first, this is super annoying. I will just go ahead and clone a clean repo to prevent this from happening again. Sorry about that!

- The execution of the `SegmentInfo` transaction when included in a block of the beacon chain
triggers the proof of the shard's segment in the global history of the beacon chain.
- To verify the `SegmentInfo` and execute the transaction, nodes verify the following:
- Ensure `segment_index` is the subsequent index after the last one committed for the shard.
Owner

This is not really sufficient. I'd say all this should happen as part of block header inclusion from the child shard, along with a proof that the segment was part of the block (or place it in the header of the block, not sure).
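
For illustration only, the index check quoted from the draft could look roughly like the sketch below. Everything here is an assumption rather than something taken from the spec: `ShardId`, the trimmed-down `SegmentInfo` fields, `SegmentInfoError`, and `check_segment_index` are hypothetical names, and indices are assumed to start at 0. As the comment above points out, this check on its own would not be enough without a proof tying the segment to a child shard block.

```rust
use std::collections::HashMap;

/// Hypothetical shard identifier.
type ShardId = u32;

/// Hypothetical, trimmed-down `SegmentInfo`: only the fields needed for the index check.
struct SegmentInfo {
    shard: ShardId,
    segment_index: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum SegmentInfoError {
    /// The submitted index is not the one directly after the last committed index.
    UnexpectedIndex { expected: u64, got: u64 },
}

/// Checks that `info.segment_index` directly follows the last index committed for its shard.
fn check_segment_index(
    last_committed: &HashMap<ShardId, u64>,
    info: &SegmentInfo,
) -> Result<(), SegmentInfoError> {
    // Assumption: a shard with no committed segments starts at index 0.
    let expected = last_committed
        .get(&info.shard)
        .map_or(0, |last| last + 1);

    if info.segment_index == expected {
        Ok(())
    } else {
        Err(SegmentInfoError::UnexpectedIndex {
            expected,
            got: info.segment_index,
        })
    }
}
```

A gap or a repeated index is rejected with the expected value attached, which keeps the failure easy to diagnose; proof verification would have to be layered on top of this.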

Comment on lines +231 to +233
- If the verification of the `SegmentInfo` is successful, the child shard segment is added to the
beacon chain shard's history by adding the shard's `segment_root` to the `segment_roots[]` that
indexes the all proofs from the global history (including beacon chain and all child segments).
Owner

This doesn't make a lot of sense, the whole point of super segment is to avoid storing roots of all shards and instead compress it into a single super segment root.
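
To make the compression idea concrete, here is a minimal sketch of folding several segment roots into one super segment root with a binary Merkle tree. It is only an illustration under loud assumptions: `Root`, `hash_pair`, and `super_segment_root` are hypothetical names, the 64-bit `DefaultHasher` is a non-cryptographic stand-in for a real hash function, and duplicating the last node on odd levels is just one possible padding rule.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in 64-bit digest. NOT cryptographic; a real implementation would use a proper hash.
type Root = u64;

/// Hashes two child nodes into their parent node.
fn hash_pair(left: Root, right: Root) -> Root {
    let mut hasher = DefaultHasher::new();
    left.hash(&mut hasher);
    right.hash(&mut hasher);
    hasher.finish()
}

/// Folds all segment roots into a single root, so only that one value needs to be stored.
/// Returns `None` when there are no segment roots.
fn super_segment_root(segment_roots: &[Root]) -> Option<Root> {
    if segment_roots.is_empty() {
        return None;
    }

    let mut level = segment_roots.to_vec();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [left, right] => hash_pair(*left, *right),
                // Assumption: an odd node is paired with itself.
                [single] => hash_pair(*single, *single),
                _ => unreachable!("chunks(2) yields one or two items"),
            })
            .collect();
    }

    level.first().copied()
}
```

With this shape, a piece would be checked against the single root using a Merkle path instead of a per-shard list of stored roots.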

Comment on lines +243 to +244
- When a new super segment is created instead of broadcasting the network the full super segment,
only the `SuperSegmentHeader` is broadcasted, allowing nodes to proactively determine if they
Owner

Suggested change
Before:
- When a new super segment is created instead of broadcasting the network the full super segment,
only the `SuperSegmentHeader` is broadcasted, allowing nodes to proactively determine if they
After:
- When a new super segment is created instead of broadcasting to the network the full super segment,
only the `SuperSegmentHeader` is broadcast, allowing nodes to proactively determine if they

I'm not sure how nodes can proactively determine, just from a Merkle root, which commitments they have and which they don't. In fact, this `SuperSegmentHeader` is never broadcast; it is composed locally by the block producer using the segments they see (if they see any at all) in the transaction pool.

Comment on lines +247 to +248
- `history_delta`: Number of segment proofs included in the `super_segment`. This will be used to
determine the number of segments committed in the beacon chain at a specific `block_height`.
Owner

The delta can always be computed by comparing two segment headers. The only thing you need to know the size of the history (which is crucial for consensus) is the latest segment index that exists, which the beacon chain knows by definition because it assigns them to shard segments.
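
As a rough sketch of this point, both values can be derived from headers that record only the latest assigned segment index. The field `last_segment_index` and the two helper functions are hypothetical, and indices are assumed to be zero-based and assigned contiguously by the beacon chain.

```rust
/// Hypothetical header shape: only the field needed for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SuperSegmentHeader {
    /// Assumption: global index of the last segment covered by this super segment.
    last_segment_index: u64,
}

/// Size of the history in segments, assuming zero-based indices.
/// Returns `None` only if the count would overflow `u64`.
fn history_size(latest: &SuperSegmentHeader) -> Option<u64> {
    latest.last_segment_index.checked_add(1)
}

/// Number of segments added between two headers.
/// Returns `None` if `current` is older than `previous`.
fn history_delta(previous: &SuperSegmentHeader, current: &SuperSegmentHeader) -> Option<u64> {
    current
        .last_segment_index
        .checked_sub(previous.last_segment_index)
}
```

Under these assumptions a separate `history_delta` field would carry no information that the headers do not already contain.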

- Input: Transactions with `BlockInfo`, new local segments, and `SegmentInfo`.
- The protocol is recursive, which means that if we have a hierarchical architecture with more than
one level below the beacon chain, all shards independently of the level they belong to will
perform the same operations described in the sections from above: i.e. submit new
Owner

This is not exactly accurate. Super segments according to our latest discussion are only created on the beacon chain, so at least that is different.

@adlrocha
Collaborator Author

adlrocha commented May 5, 2025

Closing in favor of #220 and #227, which are less high-level, dive deeper into the details, and constrain the discussions to the different parts of the subprotocol.

@adlrocha adlrocha closed this May 5, 2025
@nazar-pc nazar-pc deleted the adlrocha/spec branch May 31, 2025 00:22