EOFCREATE: Don't hash the init-container #162
Comments
A relevant piece of code is CREATE3: https://github.com/Vectorized/solady/blob/main/src/utils/CREATE3.sol (we should ask for feedback from library authors). |
If we decide to take away initcontainer hashing (or change it somehow), we need to revisit and ask for an update to the EOF considerations for Verkle EIPs. A tentative PR which handles this (and a thread about initcontainer hashing) is linked here.
|
A different perspective on this: the hash of the init container locks in the semantics, not the exact code contents of the contract. This doesn't prevent future rewrites of the code because they will be semantics-preserving or inherently breaking regardless of code observability. Code observability should be removed to the extent that CODECOPY/EXTCODECOPY can cause semantics-preserving rewrites to become breaking changes indirectly. In my opinion, it should be totally okay to rewrite code even when the address is a witness of the original code on the account. In fact, it's a good thing that there's a way to recover CODECOPY via this witness in a way that doesn't risk being broken by rewrites! I also think this ability to provide on-chain proof of the code/semantics of an account is an important primitive that we shouldn't get rid of. |
Thank you for this perspective. We're gathering inputs on this one, so this feedback is very useful.
Do the Solution 1 & 2 above still qualify as getting rid of? Note that |
Because of Solution 2 would work if there was a way to trace back to a "root" deployer whose address was computed with codehash. If EOFCREATE doesn't do that (and CREATE2 is "removed" via EOF), I don't think it would be possible to get a root deployer like that because creation transactions don't use the codehash. I think the current state where the code hash is directly included is better though, because it takes a single hash to compute the address rather than a tracing process involving multiple hashes. Additionally, you may only care about proving the code of an account ignoring the code of its deployer, and if you have to trace it back to the deployer you are not able to do that. Overall I think including code hash into the address directly is significantly better. |
A note about CREATE3.sol and similar patterns: this is often used to deploy contracts via CREATE2 at a deterministic address that doesn't depend on the creation parameters (or only some of them). For example Uniswap v3 does this here:
This kind of use case is actually natively addressed by EOFCREATE! The workaround will no longer be needed because the input is not included in the address formula. The other side to this is that users of CREATE2 that do care about the creation parameters will need a new way to validate them. Either a trusted factory mixes them into the salt, or the contract exposes getters. There is another potential use case for CREATE3.sol, which is to deploy at a deterministic address that is fully independent of the creation code. This other use case is not only not addressed by EOFCREATE, it becomes impossible to strictly implement under EOF, although it is easy to work around by deploying a proxy instead. I don't know how common this use case is honestly. More context here: https://github.com/moodysalem/EIPs/blob/46350bb/EIPS/eip-3171.md#motivation |
Analysis of input data for create address
It is preferable the inputs be less than 136 bytes (the keccak256 block size).
|
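To make the 136-byte remark concrete, here is a small sketch that counts how many Keccak-f permutations a preimage of a given length needs. It assumes whole-byte input and the standard Keccak padding, which always appends at least one byte, so n bytes take n // 136 + 1 blocks. The field widths (20-byte addresses, 32-byte salt and hash, 1-byte prefix and sub-container index) are assumptions for illustration, not taken from the spec.

```python
RATE_BYTES = 136  # keccak256 rate: 1088 bits absorbed per permutation


def keccak256_blocks(preimage_len: int) -> int:
    """Number of Keccak-f permutations needed to absorb `preimage_len` bytes.

    Padding always adds at least one byte, so an input that is an exact
    multiple of the rate still spills into one extra block.
    """
    if preimage_len < 0:
        raise ValueError("length must be non-negative")
    return preimage_len // RATE_BYTES + 1


# Assumed field widths, in bytes (illustrative only).
PREFIX, ADDRESS, SALT, HASH, INDEX = 1, 20, 32, 32, 1

# CREATE2-style preimage: 0xff + sender + salt + keccak256(init-container)
create2_len = PREFIX + ADDRESS + SALT + HASH
# Solution 2 style preimage: sender + code_address + salt + sub-container index
solution2_len = ADDRESS + ADDRESS + SALT + INDEX

for name, length in [("create2", create2_len), ("solution2", solution2_len)]:
    print(name, length, "bytes ->", keccak256_blocks(length), "block(s)")
```

Under these assumed widths both layouts fit in a single block, so the cost concern is about hashing the init-container itself rather than the final address preimage.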
Interesting. I was thinking about extending solution 2 to hash also inputs provided to
I think the pattern may be for the user to hash the inputs and combine them with the salt. |
What about chained EOF creates going deep? In this case would the "code address" of the second level create be the code address of the parent? If the code address is the address of the topmost container... not so much. Because then the index could be re-used at different depths to cause different contracts to be deployed at the same address based on call data (although returncontract can do that more cleanly). Or is it the "sender address" that gets updated in nested EOFCREATEs? Either way we need tests for this scenario. |
O: O: O: O: |
Is this a problem though? Seems OK to me. We have a deterministic address to deploy D / E at, but the code itself (contents) is not in the witness |
Wouldn't this scheme work?
The combination Amending @chfast's examples above: O: O: O: O: E no longer equal to D Any further nested EOFCREATE would have different |
EDIT: this post is likely some kind of misunderstanding on my part. We mentioned originally that code address is the address of the outer-most EOFCREATE, but actually, a scheme where code_address changes during each EOFCREATE seems to avoid address conflicts better... Having revisited @chfast 's example after a while, I think by O: CALL A This already makes D != E or putting it differently:
However, this only changes the way one runs into the D==E conflict: O: CALL C (deployed above) but initcodes used for D and E are different (at different nesting depth in A) |
...or wait, maybe we actually do want the |
It doesn't sound right for nested EOFCREATE to have |
The problem stems from the fact that we're swapping the code executing at the context of C - during init it is the initcode, after it is the initcode's subcontainer (RETURNCONTRACTed). This makes the two instances of
|
From my perspective in case of EOFCREATE nested in outer EOFCREATE initcode, the inner EOFCREATE's initcode doesn't "live at C" at all, it has almost nothing to do with C. C is an address that will be deployed (or not) when outer EOFCREATE finishes. But this is a bit of bikeshedding. I agree |
Yeah, I see your point here. But actually the bikeshedding is useful. I revisited the option with "code_address doesn't change on EOFCREATE + during_init" with this new perspective and now it seems to me it works too, I must've made a mistake somewhere yesterday, PTAL: O: CALL A O: CALL C (deployed above) O: CALL C (deployed above) In this version |
Yes, I like this version more. Seems to work and not conflict on deeper nesting levels, too. O: CALL A |
This seems to work. The approach seems equivalent to a list of indices pointing at a deeply nested subcontainer of I find it hard to reason about though. I think if you see So the init containers for each of these contracts are:
There seems to be some redundancy here:
So an alternative could be to remove
Note that EOFCREATE in a DELEGATECALL context always results in |
Since it looks like we may have solved this issue I'll resurface my previous comment. With this change we would be losing the ability to make on-chain proofs about the behavior of an account without a trusted factory (although it would be recoverable with a zk-coprocessor). I do think we need to consider whether it's okay to remove that, or if it's a primitive that applications are relying on. I'm currently weakly leaning towards probably okay to remove. |
Can you clarify what kind of a behavior proof? Just that |
Yeah that works if you have a trusted/known factory. This is probably enough. |
I like this variant, too. I would reframe it as we'd have 2 different schemes depending on whether EOFCREATE is called inside initcode:
|
how about just removing the witness entirely? i.e. |
one issue with using the initcontainer index in the hash is that it makes counterfactual address calculation potentially impossible on chain. since the EOFCREATE-ing contract cannot introspect the code of the factory address (that it delegates to), it cannot counterfactually produce the target address. |
also wanted to point out that using initcontainer index rules out certain types of code rewrites as well. for instance, reordering of initcontainers or fusing them. |
It would be possible if the target of DELEGATECALL uses the proposed If we say that developers should only delegate to code they trust not to do this, we're back in the argument about the degree of responsibility put on them. CREATE2 collisions triggered by DELEGATECALL targets are not something they're responsible for today in legacy code. |
During the conversation on the Nov 27 call, a new separate metadata section was proposed and generally agreed upon: data that can't be read by the EVM, such that any change to it does not affect the code (spec/EIP TBD). I'd like to point out that, if we are to keep the initcode hash in the |
I see it more like the metadata section's being there becomes a strong argument for leaving out the initcode hash entirely - in any of the variants proposed here. Calculating initcode hash from a subset of sections sounds impractical to me. |
Oh yes, impossible when
Isn't it already the case only trusted code should be delegated to? That is, in broader context than just creation logic. If we have a target function Is there a usage pattern where that wouldn't work well enough @frangio? |
speaking of counterfactuality, i think the issue with |
Yes but the way creation salts are constructed is not a property one has to audit of DELEGATECALL targets at the moment. It's not impossible to audit, it's just a new checklist item with respect to legacy. The way I'm thinking about this is global vs local properties, where global collision avoidance should be taken care of by the EVM and local collision avoidance should be ensured by the contract code, where "contract code" is the developer and their compiler, and the property should not be breakable by an end user (eventually an attacker) under any circumstance (barring Keccak256 breaking) including if the user is able to choose a DELEGATECALL target. I recognize this last part is pretty strong so I'm not attached to it. |
Not sure what you mean by this. It can be computed on chain if you know the parameters, which the contract should document and would already be documented anyway, among other things because the salt is often constructed and not explicit in the input. Can you describe end to end a scenario where you see an issue? |
i mean like you need more arguments than are required for just the eofcreate, e.g.
def create_something() -> address:
    salt: bytes32 = self._compute_salt()
    return factory.create(args, salt)  # calls EOFCREATE or delegates to another factory
def counterfactual() -> address:  # compute the address counterfactually of factory.create()
    salt: bytes32 = self._compute_salt()
    return ...  # can't, need the final code_address and potentially the initcontainer index depending on the scheme |
Ok, I was going to suggest the factory should expose a getter for counterfactual addresses because it knows those parameters. But this is not possible if the factory is upgradeable because the code_address parameter becomes time dependent, future values are not known and all past values need to be stored. |
right -- so i think the point is it would break an existing property of CREATE2, which is that you can counterfactually predict the address of invoking CREATE2 from just its inputs. |
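The counterfactual concern in the last few comments can be sketched as follows. This is an illustration under stated assumptions, not the spec: hashlib.sha3_256 stands in for keccak256 (the two differ in padding, so these are not real Ethereum addresses), the index is encoded as one byte, and all addresses and the init code bytes are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in only: sha3_256 differs from keccak256 in its padding.
    return hashlib.sha3_256(data).digest()


def predict_create2(sender: bytes, salt: bytes, initcode: bytes) -> bytes:
    """Legacy CREATE2 layout: every term is an input the caller already holds."""
    return h(b"\xff" + sender + salt + h(initcode))[12:]


def predict_index_scheme(sender: bytes, code_address: bytes, salt: bytes, index: int) -> bytes:
    """Index-based layout discussed here: also needs the executing code's address."""
    return h(sender + code_address + salt + bytes([index]))[12:]


# Hypothetical addresses: an upgradeable factory whose logic moves from v1 to v2.
factory = bytes.fromhex("11" * 20)
impl_v1 = bytes.fromhex("22" * 20)
impl_v2 = bytes.fromhex("33" * 20)
salt = bytes(32)
initcode = b"\xef\x00 hypothetical init container"

# CREATE2: the prediction depends only on the opcode's own inputs.
create2_prediction = predict_create2(factory, salt, initcode)

# Index scheme: the same (factory, salt, index) gives a different address
# once the factory delegates to new code, so old predictions go stale.
before_upgrade = predict_index_scheme(factory, impl_v1, salt, 0)
after_upgrade = predict_index_scheme(factory, impl_v2, salt, 0)

print("prediction survives upgrade:", before_upgrade == after_upgrade)
```

The extra code_address argument is what a caller would have to learn and track over time, which is the property of CREATE2 that the comments above say would be lost.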
Maybe this has already been discussed and my comment is completely off-topic, but what I personally care about is having a way to guarantee cross-chain runtime bytecode equivalence. Both proposals here |
I'd argue you need init code equivalence because the same runtime code initialized differently can have wildly different properties.
Indeed but note that if your goal is to build a generic factory like CreateX you will not be able to do that with EOFCREATE. Generic factories would be enabled by TXCREATE, where the init code hash is available and can be used in the salt to guarantee code equivalence. At the moment it's unclear if this will ship in Osaka. See EOF Implementers Call #63 for more discussion. |
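A rough sketch of the idea of putting the init code hash into the salt, so that a generic factory's addresses still commit to the code. Everything here is an assumption for illustration: hashlib.sha3_256 stands in for keccak256 (different padding, so not real addresses), the address+salt layout with a 0xff prefix is assumed rather than taken from the spec, and effective_salt and deployed_address are hypothetical helper names.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in only: sha3_256 differs from keccak256 in its padding.
    return hashlib.sha3_256(data).digest()


def effective_salt(user_salt: bytes, initcode: bytes) -> bytes:
    """Salt that commits to the init code, as a generic factory might build it."""
    return h(user_salt + h(initcode))


def deployed_address(factory: bytes, salt: bytes) -> bytes:
    """Assumed address+salt layout: no code hash in the address formula itself."""
    return h(b"\xff" + factory + salt)[12:]


factory = bytes.fromhex("44" * 20)  # hypothetical factory, same address on every chain
user_salt = bytes(32)
code_a = b"init container A"
code_b = b"init container B"

addr_a = deployed_address(factory, effective_salt(user_salt, code_a))
addr_b = deployed_address(factory, effective_salt(user_salt, code_b))

print("different init code, same address:", addr_a == addr_b)
```

With this construction the guarantee comes from trusting the factory's salt derivation rather than from the address formula, which matches the trusted-factory caveat raised earlier in the thread.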
That's an interesting point, although any observable difference in chain state can result in runtime code with different properties (examples: reading |
One more thing to mention is that so far we weren't considering altering the creation tx hashing scheme. It's currently the same as legacy (sender + sender_nonce), but for EOF it could be changed to include the initcode hash and salt instead of the sender_nonce, at the expense of an ugly requirement to append 32 bytes of salt after the init container and before calldata. Plot twist: instead of after the initcontainer, the salt could be... in the tx initcontainer's EIP-7834 metadata section :P. This is slightly offtopic, b/c this thread is for EOFCREATE, but if reliable-deterministic-cross-chain addresses are a concern, and we can't afford to wait for TXCREATE, maybe that is a way out? |
Right, I think it depends on the use case. For
Interesting - well after skimming through that proposal my first view is: Having a new transaction type to access this feature is an unnecessary overhead IMHO. We should strive for KISS, and not start adding new transaction types due to a bad design in
Interesting - I have to read that EIP first again. I would like to mention that in yesterday's EOF call, it was mentioned that Lastly, if you're interested in some stats on |
We've put together a summary doc with some possible scenarios of revising the hashing schemes/deployment methods for EOF: https://notes.ethereum.org/@ipsilon/SyrzctZSJg. The goal of this is to lay out our options in the clear and discuss them - whether we still can alter the EOF address schemes and if yes - what is the best way to do it. All feedback very welcome. If the doc is missing a solution/scenario you'd like to make a case for, please let me know. I'm looking forward to discussing this in depth on the next EOF implementers call. Taking the liberty to tag @pcaversaccio @charles-cooper @cameel @kuzdogan |
@pdobacz What does it mean for an approach to support AA deployments? What are the AA-specific challenges? More generally, can you provide a description of each of the items in the comparison table? I think it would be useful to have as a list of desiderata. These are the ones I can recall:
Additionally the following have come up:
IMO (1) is the main requirement, and (2) is just one technique currently used to achieve it. That is because same-address multi-chain deployments are extremely challenging, unless there is a preexisting multi-chain generic factory, so we make a big effort into deploying that factory to enable it. The issue with EOF is that it kills generic factories, but if multi-chain deployments were solved I believe this would not be a significant issue. Since it seems like TXCREATE would be a big source of ACDE friction, perhaps we can focus on solving multi-chain deployments in some other way. |
Since EOF is the current "headliner" in Osaka I don't think we will have the same friction getting TXCREATE in, especially since it is being driven by end-user requirements and not driven by the evm devs. Given that, I think Address+salt for both EOFCREATE and TXCREATE and a set of ERC standard contracts (with "standard" deployments) that address the use cases is what I see working. We may want to add / wrap the hash with a per-opcode value to prevent EOFCREATE and TXCREATE from having same-contract collisions. like |
"AA deployments" come up in the context of comparison of the pieces labelled (D/) and (F/), so TXCREATE and the nonce-less creation txs. While (D/) TXCREATE allows a smart contract wallet to deploy arbitrary code, while such creation txs do not, which is what sets the latter at a disadvantage. Thanks for noting, I think this wording is too much of a mental shortcut. Should read "support deployments by smart contract wallets, as required by AA"
Good point. I tried to make this self-explanatory by elaborating in the A, B, C... sections, but this seems to not be clear enough.
Got it, so I think, in the table "Bytecode guarantees" boils down to 1., while "Generic factories" boils down to 2. My question is, whether or not 2. isn't of merit beyond providing 1. I initially thought having a factory which on one hand supports deploying arbitrary code, and on the other running some fixed logic (like registering the new contract in some registry or whatnot), would be useful. But if we can ascertain this is not a useful tool, I can put the row "Generic factories" off the comparison table and treat it only as means to accomplishing 1. |
What deployments are needed? Say in the context of ERC-4337, a UserOp can specify a factory, but the factory doesn't need to be generic as far as I can tell. |
Hm, Okay, maybe I'm missing sth, but let's take 4337's: (EDIT: I was missing something indeed, see below)
I understand that this step I think same challenge is with 7702 wallets deploying contracts. EDIT: as noted by Frangio, I didn't understand the main usecase behind that |
So |
OK, so it definitely misled me, thank you for bearing with me.
Yes, so this is my question as well, or more generally - how, if at all, can ERC-4337 SC Wallets deploy new contracts? In legacy EVM this is possible using CREATE2, but I don't know if it is a pattern actually used or intended. |
Oh! I see what you mean now. I just looked through Safe and Coinbase Smart Wallet as two examples, and couldn't find any way to directly deploy a contract from either. In the case of Safe it could be done through DELEGATECALL into a factory. I'll try asking other AA teams, but I don't see why this feature would be needed tbh. |
On the last EOF implementers call 65 we arrived at some consensus to aim for pushing the Scenario 1b from https://notes.ethereum.org/@ipsilon/SyrzctZSJg, being the addition of The change would be proposed in a new EIP (currently in the making), but it can be previewed in the PR to the EOF Megaspec document. The spec there is equivalent to that of the EIP being prepared. Please take a look @pcaversaccio @charles-cooper @cameel @kuzdogan and provide feedback. I'm looking forward mainly to receiving confirmation that this approach satisfies the required deployment methods in use today. Or of course, if they don't, please let us know why and how we should fix it. We can also move that preliminary discussion to that EOF Megaspec document PR, before we have the EIP draft and a corresponding EthMag thread. (Last minute heads-up: meanwhile we've identified what might be an issue. Quick gist: the EIP-7834 metadata section is at odds with the approach for a factory to include |
I don't want to sidetrack the discussion but this is interesting from EOF Megaspec document: If I'm not mistaken, this would be the first predeploy (!= precompile) on Ethereum. Is there some discussion (e.g. an EIP) for this? I have seen such an approach for many L2s, but not yet for Ethereum. So scenario 1b) states:
I'm sorry if I missed the conversation around it, but you claim this scenario has "Bytecode guarantees" in here: How can you exactly guarantee the bytecode for a counterfactual contract address without using the init code in |
Thanks @pdobacz I think I understood most of it but as a less versed person on the spec and terminology I'd want to summarize my takeaways and maybe you can correct me:
Where is the
But I don't see it in the 1b option. It's just [image] |
Thank you for taking a look, this is much appreciated! Let me try to clarify everything. Actually, I'm sorry, but the Creator Contract source code mentioned in the PR got an error (now fixed) - TXCREATE looks up the initcontainer by its hash, not its index. TXCREATE's description was correct, but maybe the incorrect Creator Contract source-code misled you both. Having fixed that, to answer your specific questions:
@pcaversaccio Consider a factory pseudocode (I omitted
If such factory is deployed at
@kuzdogan looks all correct. A minor remark is that the intention is for the
Maybe the above answer to pcaversaccio helps? It would be up to the specific factory to include it. I now realized that the "bootstrap" Creator Contract should work more similar to the |
The create address derivation for EOFCREATE is based on CREATE2, where the sender_address is the logical address of the contract invoking EOFCREATE.
We identified that the keccak256(init-container) goes against "code non-observability" because it locks in the contents of the init-container, e.g. preventing rewriting it in some future upgrade.
It also seems unnecessarily expensive: EOFCREATE can only pick up one of the deploy-time sub-containers.
Solution 1: Use sub-container index
The create address is already bound to the "sender address" and code is immutable (no SELFDESTRUCT), so replacing the hash of the sub-container with just its index may be enough.
Solution 2: Use code's address + sub-container index
The CREATE2 scheme uses the "sender address", which may not be the address of the code (see DELEGATECALL). I'm not sure if this is a desired property for CREATE2. But for EOFCREATE this looks to be a problem. A contract may deploy a different contract using a DELEGATECALL proxy: for EOFCREATE inside a DELEGATECALL the same sub-container index will point to a different sub-container. To fix this we can replace/combine the physical code address:
keccak256(code_address + salt + sub-container-index)
keccak256(sender_address + code_address + salt + sub-container-index)
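The difference between the two solutions under DELEGATECALL can be sketched as follows. This is only an illustration under stated assumptions: hashlib.sha3_256 stands in for keccak256 (the two differ in padding, so the outputs are not real Ethereum addresses), the sub-container index is encoded as a single byte, "+" is read as byte concatenation, and the addresses P, X and Y are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash: hashlib.sha3_256 is NOT keccak256 (different padding),
    # so these are not real Ethereum addresses. Only the structure matters.
    return hashlib.sha3_256(data).digest()


def addr_solution1(sender: bytes, salt: bytes, index: int) -> bytes:
    """Solution 1: sender address + salt + sub-container index."""
    return h(sender + salt + bytes([index]))[12:]


def addr_solution2(sender: bytes, code_address: bytes, salt: bytes, index: int) -> bytes:
    """Solution 2 (second variant): also bind the address of the executing code."""
    return h(sender + code_address + salt + bytes([index]))[12:]


# Hypothetical 20-byte addresses: proxy P delegates to implementation X or Y.
P = bytes.fromhex("aa" * 20)
X = bytes.fromhex("bb" * 20)
Y = bytes.fromhex("cc" * 20)
salt = bytes(32)

# Under DELEGATECALL the sender stays P, but index 0 refers to a different
# sub-container depending on whether X's or Y's code is running.
via_x_s1 = addr_solution1(P, salt, 0)
via_y_s1 = addr_solution1(P, salt, 0)
via_x_s2 = addr_solution2(P, X, salt, 0)
via_y_s2 = addr_solution2(P, Y, salt, 0)

print("solution 1 same address:", via_x_s1 == via_y_s1)
print("solution 2 same address:", via_x_s2 == via_y_s2)
```

In this sketch Solution 1 maps two different sub-containers to one address, while including code_address keeps them apart.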