Redundant write on EigenDA failure #242
base: main
Conversation
@Inkvi thanks for this. Bit overwhelmed at the moment but will review asap. Ping me if I forget.
log.Error("Failed to write to EigenDA backend", "err", err) | ||
// write to EigenDA failed, which shouldn't happen if the backend is functioning properly | ||
// use the payload as the key to avoid data loss | ||
if m.secondary.Enabled() && !m.secondary.AsyncWriteEntry() { | ||
redundantErr := m.secondary.HandleRedundantWrites(ctx, value, value) | ||
if redundantErr != nil { | ||
log.Error("Failed to write to redundant backends", "err", redundantErr) | ||
return nil, redundantErr | ||
} | ||
|
||
return crypto.Keccak256(value), nil | ||
} |
IIUC the idea here is to use secondary backends as primary in the event of a failed dispersal to EigenDA. This will create complications for some of our integrations (e.g., in Arbitrum x EigenDA the commitment is posted/verified against the inbox directly, which would now fail with failover). Would prefer if this was opt-in and feature-guarded by some dangerous config flag...
cc @samlaf
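For illustration only, such an opt-in guard could be wired through a CLI flag like the sketch below; the flag name, the env var, and the assumption that the proxy wires flags through urfave/cli v2 are mine, not taken from this PR.

	package flags

	import "github.com/urfave/cli/v2"

	// Hypothetical opt-in flag; the name and env var below are illustrative only.
	var FallbackOnEigenDAFailureFlag = &cli.BoolFlag{
		Name:    "routing.dangerous-fallback-on-eigenda-failure",
		Usage:   "DANGEROUS: on a failed EigenDA dispersal, write the payload to secondary backends keyed by its keccak256 hash",
		Value:   false,
		EnvVars: []string{"EIGENDA_PROXY_DANGEROUS_FALLBACK_ON_EIGENDA_FAILURE"},
	}

The write path would then check the resulting config field (e.g. a hypothetical cfg.FallbackOnEigenDAFailure) before attempting the redundant write, so current behavior stays unchanged unless the operator explicitly opts in.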
Overall this seems like an approach that could work, but the PR in its current form would require a lot more work (a sketch of these guards follows below):
- we would want to revert to a keccak commitment mode when this kind of failover happens, so that the derivation pipeline knows the failover happened (this would also be more robust than the ad-hoc reading you currently have)
- as @epociask said, this failover behavior should be feature-guarded
- this failover should only happen if the blob's size is < 128 KiB (to avoid hitting that size limit in the derivation pipeline)
- we would need to add some tests
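A minimal sketch of how the feature guard and the size limit could compose on the write path, assuming hypothetical names (maxKeccakBlobSize, keccakFailoverCommitment, failoverEnabled) that are not taken from the proxy codebase:

	package store

	import (
		"errors"
		"fmt"

		"github.com/ethereum/go-ethereum/crypto"
	)

	// maxKeccakBlobSize mirrors the 128 KiB derivation-pipeline limit mentioned above.
	const maxKeccakBlobSize = 128 * 1024

	var ErrFailoverDisabled = errors.New("keccak failover to secondary backends is not enabled")

	// keccakFailoverCommitment returns the keccak256 commitment to publish when an
	// EigenDA dispersal fails, but only if the operator opted in and the blob is
	// small enough for a keccak (preimage-based) commitment.
	func keccakFailoverCommitment(value []byte, failoverEnabled bool) ([]byte, error) {
		if !failoverEnabled {
			return nil, ErrFailoverDisabled
		}
		if len(value) >= maxKeccakBlobSize {
			return nil, fmt.Errorf("blob of %d bytes exceeds keccak failover limit of %d bytes", len(value), maxKeccakBlobSize)
		}
		return crypto.Keccak256(value), nil
	}

Signalling the switch to a keccak commitment mode (so the derivation pipeline can tell this commitment apart from an EigenDA cert) would still have to happen in the caller; commitment-mode handling is outside this sketch.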
@epociask thanks for highlighting the integration problems with Arbitrum. I am not familiar with the Arbitrum stack and how alt-DA is handled there. Could you shed more light on the failure mechanism? Are you saying that Arbitrum's inbox contract verifies the EigenDA certificate directly on-chain? That would imply a direct integration between the Arbitrum stack and EigenDA, which is unlikely in my mind.
@samlaf the last time I looked at the op-batcher (v1.9.5), it couldn't revert to a different commitment mode, and the generic commitment mode is used by the batcher whenever a DA service is specified. If that still holds true, then the blob size limitation of 128 KiB is not applicable anymore.
That would imply a direct integration between Arbitrum stack and EigenDA which is unlikely in my mind.
This is what we've done in our Arbitrum fork, feel free to look at the code here where the on-chain cert verification is performed.
If that still holds true, then the blob size limitation of 128 KiB is not applicable anymore.
In a world with fraud proofs enabled, wouldn't you need to one-step prove the reading of blob contents from a preimage oracle? Curious how that would work using a hash-based commitment scheme, where opcode resolution would require uploading the entire preimage, i.e. the full blob contents. Agree the size limitation is irrelevant for insecure integrations, but it has dramatic security implications for stage 1-2 rollups.
Changes proposed
The current proxy implementation lacks protection against write failures. If EigenDA fails for any reason, we still want to write our commitment to the DA layer when caches or fallbacks are enabled. Otherwise our DA batches will be postponed until EigenDA recovers, which might overwhelm the batcher if that period is prolonged.
Since a certificate is not available at this point, a keccak hash of the payload has to be used as the commitment.
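For the read side, a keccak-keyed commitment only needs a secondary-backend lookup plus a hash check; a minimal sketch, where secondaryGet is a stand-in for whatever retrieval the secondary backends actually expose:

	package store

	import (
		"bytes"
		"context"
		"fmt"

		"github.com/ethereum/go-ethereum/crypto"
	)

	// getByKeccakKey fetches a payload that was stored under its own keccak256
	// hash and verifies the returned bytes against that key, so a misbehaving
	// cache or fallback backend cannot serve arbitrary data.
	func getByKeccakKey(ctx context.Context, key []byte,
		secondaryGet func(context.Context, []byte) ([]byte, error)) ([]byte, error) {
		payload, err := secondaryGet(ctx, key)
		if err != nil {
			return nil, err
		}
		if !bytes.Equal(crypto.Keccak256(payload), key) {
			return nil, fmt.Errorf("retrieved payload does not match keccak256 commitment")
		}
		return payload, nil
	}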
Note to reviewers
I am aware of your PR that achieves a similar result with Eth DA as the backend instead of S3, but that PR is a few months old and there has been no traction from the OP team.