`EOFCREATE`: Don't hash the init-container #162

chfast · 2024-09-06T10:19:08Z

The create address derivation for EOFCREATE is based on CREATE2.

keccak256(sender_address + salt + keccak256(init-container))

where the sender_address is the logical address of the contract invoking EOFCREATE.

We identified that the keccak256(init-container) goes against the "code non-observability" because it locks in the contents of the init-container e.g. preventing re-writing it in some future upgrade.

It also seems unnecessarily expensive: EOFCREATE can only pick up one of the deploy-time sub-containers.

Solution 1: Use sub-container index

The create address is already bound to the "sender address", code is immutable (no SELFDESTRUCT) so replacing the hash of the sub-container with just its index may be enough.

Solution 2: Use code's address + sub-container index

The CREATE2 scheme uses the "sender address", which may not be the address of the code (see DELEGATECALL). I'm not sure if this is a desired property for CREATE2, but for EOFCREATE it looks like a problem. A contract may deploy different contracts through a DELEGATECALL proxy: for an EOFCREATE inside a DELEGATECALL, the same sub-container index will point to a different sub-container. To fix this we can replace the sender address with the physical code address, or combine the two:

keccak256(code_address + salt + sub-container-index)
keccak256(sender_address + code_address + salt + sub-container-index)
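The difference between Solution 1 and the combined form of Solution 2 can be sketched as follows. This is only an illustration: `h` uses `hashlib.sha3_256` as a stand-in for keccak256 (the padding differs, so the outputs are not real EVM addresses), and the addresses `A` and `B`, the salt, and the helper `addr` are made-up names.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption): sha3_256 pads differently, so these
    # are NOT real EVM hashes. Only equality of results matters in this sketch.
    return hashlib.sha3_256(data).digest()

def addr(*parts: bytes) -> bytes:
    # Hash the concatenated inputs and keep the low 20 bytes, as for addresses.
    return h(b"".join(parts))[-20:]

A = bytes.fromhex("aa" * 20)      # hypothetical contract invoking EOFCREATE
B = bytes.fromhex("bb" * 20)      # hypothetical DELEGATECALL target
SALT = (1).to_bytes(32, "big")
IDX = bytes([1])                  # sub-container index

# Solution 1: sender + salt + index.
sol1_direct = addr(A, SALT, IDX)      # A executes EOFCREATE[1] itself
sol1_delegated = addr(A, SALT, IDX)   # A DELEGATECALLs B, B executes EOFCREATE[1]

# Solution 2 (combined): sender + code + salt + index.
sol2_direct = addr(A, A, SALT, IDX)
sol2_delegated = addr(A, B, SALT, IDX)

print(sol1_direct == sol1_delegated)  # same preimage, different sub-container
print(sol2_direct == sol2_delegated)  # code address separates the two cases
```

Under Solution 1 both paths hash the same bytes even though index 1 refers to different sub-containers in A and B; adding the code address makes the preimages differ.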


axic · 2024-10-03T13:41:34Z

A relevant piece of code is CREATE3: https://github.com/Vectorized/solady/blob/main/src/utils/CREATE3.sol

Should ask for feedback from library authors.

pdobacz · 2024-10-07T12:45:56Z

If we decide to take away initcontainer hashing (or change it somehow), we need to revisit and ask for an update to the EOF considerations for Verkle EIPs. A tentative PR which handles this (and a thread about initcontainer hashing) is linked here.

check Verkle EIP(s) if update is necessary

frangio · 2024-10-07T20:38:13Z

We identified that the keccak256(init-container) goes against the "code non-observability" because it locks in the contents of the init-container e.g. preventing re-writing it in some future upgrade.

A different perspective on this: the hash of the init container locks in the semantics, not the exact code contents of the contract. This doesn't prevent future rewrites of the code because they will be semantics-preserving or inherently breaking regardless of code observability.

Code observability should be removed to the extent that CODECOPY/EXTCODECOPY can cause semantics-preserving rewrites to become breaking changes indirectly. In my opinion, it should be totally okay to rewrite code even when the address is a witness of the original code on the account. In fact, it's a good thing that there's a way to recover CODECOPY via this witness in a way that doesn't risk being broken by rewrites!

I also think this ability to provide on-chain proof of the code/semantics of an account is an important primitive that we shouldn't get rid of.

pdobacz · 2024-10-08T07:16:30Z

Thank you for this perspective. We're gathering inputs on this one, so this feedback is very useful.

I also think this ability to provide on-chain proof of the code/semantics of an account is an important primitive that we shouldn't get rid of.

Do Solutions 1 & 2 above still qualify as getting rid of it? Note that EOFCREATE in its current form doesn't allow deploying arbitrary off-chain code, only code listed as one of its subcontainers. So it can be proven on-chain that a given account has code deployed from a known address' subcontainer.

frangio · 2024-10-08T14:59:17Z

Because of DELEGATECALL I don't think Solution 1 gives any guarantees about the code/semantics of an account.

Solution 2 would work if there was a way to trace back to a "root" deployer whose address was computed with codehash. If EOFCREATE doesn't do that (and CREATE2 is "removed" via EOF), I don't think it would be possible to get a root deployer like that because creation transactions don't use the codehash.

I think the current state where the code hash is directly included is better though, because it takes a single hash to compute the address rather than a tracing process involving multiple hashes. Additionally, you may only care about proving the code of an account ignoring the code of its deployer, and if you have to trace it back to the deployer you are not able to do that. Overall I think including code hash into the address directly is significantly better.

frangio · 2024-10-08T19:15:59Z

A note about CREATE3.sol and similar patterns: this is often used to deploy contracts via CREATE2 at a deterministic address that doesn't depend on the creation parameters (or only some of them). For example Uniswap v3 does this here:

This kind of use case is actually natively addressed by EOFCREATE! The workaround will no longer be needed because the input is not included in the address formula.

The other side to this is that users of CREATE2 that do care about the creation parameters will need a new way to validate them. Either a trusted factory mixes them into the salt, or the contract exposes getters.

There is another potential use case for CREATE3.sol, which is to deploy at a deterministic address that is fully independent of the creation code. This other use case is not only not addressed by EOFCREATE, it becomes impossible to strictly implement under EOF, although it is easy to work around by deploying a proxy instead. I don't know how common this use case is honestly.

More context here: https://github.com/moodysalem/EIPs/blob/46350bb/EIPS/eip-3171.md#motivation

chfast · 2024-10-15T16:16:10Z

Analysis of input data for create address

It is preferable the inputs be less than 136 bytes (the keccak256 block size).

`CREATE`

Input length: 23–31
Prefix byte: 0xd6–0xde

The input is RLP encoded sender's address and sender's nonce. The encoding is variable-length with fixed encoding for the address and variable-length encoding of the nonce.

`CREATE2`

Input length: 85
Prefix byte: 0xff

The input is a fixed-length concatenation of the prefix byte 0xff, the sender's address (20 bytes), the user-provided salt (32 bytes) and the initcode hash (32 bytes). The prefix byte was added to avoid collisions with the CREATE scheme, but this is unnecessary because the two schemes never have inputs of the same length.

`EOFCREATE` solution 2

Input length: 73 or 97
Prefix byte: none

Concatenation of the addresses of the sender and the code (2x 20 bytes), the user-provided salt (32 bytes) and the subcontainer index (1 byte). Alternatively, we can allocate 32 bytes per address for compatibility with Address Space Expansion.
None of these total lengths match any existing schemes so the prefix byte is not necessary.
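The lengths in this analysis can be checked with a small sketch that builds each preimage and measures it. The helper names (`rlp_uint`, `create_preimage`, `create2_preimage`, `eofcreate_preimage`) and the sample values are illustrative; the RLP helper only covers what the CREATE scheme needs (a 20-byte address and an integer nonce up to 2^64-1).

```python
KECCAK_RATE = 136  # keccak256 block size in bytes

def rlp_uint(n: int) -> bytes:
    # Minimal RLP encoding of a non-negative integer.
    if n == 0:
        return b"\x80"
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) == 1 and raw[0] < 0x80:
        return raw
    return bytes([0x80 + len(raw)]) + raw

def create_preimage(sender: bytes, nonce: int) -> bytes:
    # RLP list [sender, nonce]; the payload is always shorter than 56 bytes.
    payload = bytes([0x80 + 20]) + sender + rlp_uint(nonce)
    return bytes([0xC0 + len(payload)]) + payload

def create2_preimage(sender: bytes, salt: bytes, initcode_hash: bytes) -> bytes:
    return b"\xff" + sender + salt + initcode_hash

def eofcreate_preimage(sender: bytes, code: bytes, salt: bytes, idx: int,
                       addr_width: int = 20) -> bytes:
    # addr_width=32 left-pads the addresses (Address Space Expansion variant).
    pad = lambda a: a.rjust(addr_width, b"\x00")
    return pad(sender) + pad(code) + salt + bytes([idx])

SENDER = bytes.fromhex("aa" * 20)
CODE = bytes.fromhex("bb" * 20)
SALT = bytes(32)
HASH = bytes(32)

create_min = create_preimage(SENDER, 0)
create_max = create_preimage(SENDER, 2**64 - 1)
create2 = create2_preimage(SENDER, SALT, HASH)
eof20 = eofcreate_preimage(SENDER, CODE, SALT, 1)
eof32 = eofcreate_preimage(SENDER, CODE, SALT, 1, addr_width=32)

for name, p in [("CREATE min", create_min), ("CREATE max", create_max),
                ("CREATE2", create2), ("EOFCREATE 20", eof20),
                ("EOFCREATE 32", eof32)]:
    print(name, len(p), hex(p[0]), len(p) < KECCAK_RATE)
```

All five preimages fit in a single keccak256 block, and no EOFCREATE length coincides with a CREATE or CREATE2 length.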

chfast · 2024-10-15T16:21:04Z

This kind of use case is actually natively addressed by EOFCREATE! The workaround will no longer be needed because the input is not included in the address formula.

Interesting. I was thinking about extending solution 2 to also hash the inputs provided to EOFCREATE. But it looks like there are some use cases where this is undesirable.

The other side to this is that users of CREATE2 that do care about the creation parameters will need a new way to validate them. Either a trusted factory mixes them into the salt, or the contract exposes getters.

I think the pattern may be for the user to hash the inputs and combine them with the salt.
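A minimal sketch of that pattern, assuming the Solution 2 address formula and using `hashlib.sha3_256` as a stand-in for keccak256 (so the results are not real EVM values). The names `final_salt` and `eofcreate_addr` and the sample inputs are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption); only equality matters here.
    return hashlib.sha3_256(data).digest()

def final_salt(raw_salt: bytes, init_input: bytes) -> bytes:
    # Commit to the creation inputs by mixing their hash into the salt.
    return h(raw_salt + h(init_input))

def eofcreate_addr(sender: bytes, code: bytes, salt: bytes, idx: int) -> bytes:
    # Solution 2 style: sender + code + salt + sub-container index.
    return h(sender + code + salt + bytes([idx]))[-20:]

FACTORY = bytes.fromhex("aa" * 20)
RAW_SALT = (7).to_bytes(32, "big")

addr_a = eofcreate_addr(FACTORY, FACTORY, final_salt(RAW_SALT, b"owner=alice"), 1)
addr_b = eofcreate_addr(FACTORY, FACTORY, final_salt(RAW_SALT, b"owner=bob"), 1)
addr_a_again = eofcreate_addr(FACTORY, FACTORY, final_salt(RAW_SALT, b"owner=alice"), 1)

print(addr_a != addr_b)        # different inputs, different addresses
print(addr_a == addr_a_again)  # same inputs, same address
```

A factory that wants input-independent addresses simply passes the raw salt through unchanged, so both behaviours remain available to the contract author.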

shemnon · 2024-10-15T16:27:54Z

What about chained EOF creates going deep? In this case would the "code address" of the second level create be the code address of the parent? If the code address is the address of the topmost container... not so much. Because then the index could be re-used at different depths to cause different contracts to be deployed at the same address based on call data (although returncontract can do that more cleanly).

Or is it the "sender address" that gets updated in nested EOFCREATES? Either way we need tests for this scenario.

chfast · 2024-10-15T16:43:10Z

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, idx=1)

O: CALL A
A: DELEGATECALL B
B: EOFCREATE[2](X)
C: eofcreate_addr(sender=A, code=B, salt=X, idx=2)

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, idx=1)
C: EOFCREATE[2](Y) (initcode execution)
D: eofcreate_addr(sender=C, code=C?, salt=Y, idx=2)
C: deployed

O: CALL C (deployed above)
C: EOFCREATE[2](Y)
E: eofcreate_addr(sender=C, code=C?, salt=Y, idx=2)
this will generate the same address and collide with D.

pdobacz · 2024-10-15T17:45:48Z

this will generate the same address and collide with D.

Is this a problem though? Seems OK to me. We have a deterministic address to deploy D / E at, but the code itself (contents) is not in the witness.

frangio · 2024-11-04T22:22:54Z

Wouldn't this scheme work?

keccak256(sender_address || code_address || salt || during_init || init_subcontainer_idx)

during_init would be a boolean that is true iff EOFCREATE executes during sender init.

The combination (code_address, during_init, init_subcontainer_idx) should uniquely identify a container. If during_init == true, take the init container that created code and look at subcontainer number init_subcontainer_idx. If during_init == false, take the runtime container of code and look at subcontainer number init_subcontainer_idx.

Amending @chfast's examples above:

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, during_init=0, idx=1)

O: CALL A
A: DELEGATECALL B
B: EOFCREATE[2](X)
C: eofcreate_addr(sender=A, code=B, salt=X, during_init=0, idx=2)

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, during_init=0, idx=1)
C: EOFCREATE[2](Y) (initcode execution)
D: eofcreate_addr(sender=C, code=C, salt=Y, during_init=1, idx=2)

O: CALL C (deployed above)
C: EOFCREATE[2](Y)
E: eofcreate_addr(sender=C, code=C, salt=Y, during_init=0, idx=2)

E is no longer equal to D.

Any further nested EOFCREATE would have different sender.
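The effect of the flag can be sketched by replaying the D/E scenario with and without it. This is an illustration only: `hashlib.sha3_256` stands in for keccak256 (so these are not real addresses), the one-byte encoding of `during_init` is an assumption, and `A`, `X`, `Y` are made-up values.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption); only equality matters here.
    return hashlib.sha3_256(data).digest()

def eofcreate_addr(sender, code, salt, idx, during_init=None) -> bytes:
    # during_init=None reproduces the scheme without the flag; otherwise the
    # flag is encoded as one byte (an assumed encoding).
    flag = b"" if during_init is None else bytes([int(during_init)])
    return h(sender + code + salt + flag + bytes([idx]))[-20:]

A = bytes.fromhex("aa" * 20)
X = (1).to_bytes(32, "big")
Y = (2).to_bytes(32, "big")

# Without the flag: EOFCREATE[2](Y) in C's initcode vs. in C's deployed code.
C_old = eofcreate_addr(A, A, X, 1)
D_old = eofcreate_addr(C_old, C_old, Y, 2)
E_old = eofcreate_addr(C_old, C_old, Y, 2)

# With the flag.
C = eofcreate_addr(A, A, X, 1, during_init=False)
D = eofcreate_addr(C, C, Y, 2, during_init=True)
E = eofcreate_addr(C, C, Y, 2, during_init=False)

print(D_old == E_old)  # collision without the flag
print(D == E)          # separated by the flag
```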

pdobacz · 2024-11-05T08:26:56Z

EDIT: this post is likely some kind of misunderstanding on my part. We mentioned originally that code address is the address of the outer-most EOFCREATE, but actually, a scheme where code_address changes during each EOFCREATE seems to avoid address conflicts better...

Having revisited @chfast 's example after a while, I think by code_address we mean the address where the outer-most EOFCREATE in a nested chain of EOFCREATEs resides (there is no other address with code in that chain yet!), so:

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, idx=1)
C: EOFCREATE[2](Y) (initcode execution)
D: eofcreate_addr(sender=C, code=A, salt=Y, idx=2) <----- code not C but A here
C: deployed

This already makes D != E or putting it differently:

When CALL is used to change context in a chain of calls, both sender and code will change
DELEGATECALL - code will change, sender won't
EOFCREATE - sender will change, code won't

However, this only changes the way one runs into the D==E conflict:

O: CALL C (deployed above)
C: DELEGATECALL A
A: EOFCREATE[2](Y)
E: eofcreate_addr(sender=C, code=A, salt=Y, idx=2) == D

but initcodes used for D and E are different (at different nesting depth in A)

pdobacz · 2024-11-05T08:46:47Z

...or wait, maybe we actually do want the code to change on EOFCREATE too, even though there is no code at that address yet. Combined with @frangio's during_init addition seems to work to avoid conflicts.

gumb0 · 2024-11-07T05:54:04Z

...or wait, maybe we actually do want the code to change on EOFCREATE too, even though there is no code at that address yet. Combined with @frangio's during_init addition seems to work to avoid conflicts.

It doesn't sound right for nested EOFCREATE to have code_address equal to the address that outer EOFCREATE will deploy... Or maybe it should be called differently then, not code_address, but executing_address

pdobacz · 2024-11-07T09:30:54Z

...or wait, maybe we actually do want the code to change on EOFCREATE too, even though there is no code at that address yet. Combined with @frangio's during_init addition seems to work to avoid conflicts.

It doesn't sound right for nested EOFCREATE to have code_address equal to the address that outer EOFCREATE will deploy... Or maybe it should be called differently then, not code_address, but executing_address

The problem stems from the fact that we're swapping the code executing at the context of C - during init it is the initcode, after it is the initcode's subcontainer (RETURNCONTRACTed). This makes the two instances of EOFCREATE[2](Y) mean different things and is solved by Frangio's proposal. From that PoV it kinda makes sense - it's 2 different codes, but both live at C, so they have a common code_address, so to speak.

code_address name is just a name related to DELEGATECALL (which we are addressing here). Having this in mind, with executing_address it isn't clear to me if it's the code_address or msg.recipient address (the context which "executes" some code)...

gumb0 · 2024-11-07T09:56:31Z

...or wait, maybe we actually do want the code to change on EOFCREATE too, even though there is no code at that address yet. Combined with @frangio's during_init addition seems to work to avoid conflicts.

It doesn't sound right for nested EOFCREATE to have code_address equal to the address that outer EOFCREATE will deploy... Or maybe it should be called differently then, not code_address, but executing_address

The problem stems from the fact that we're swapping the code executing at the context of C - during init it is the initcode, after it is the initcode's subcontainer (RETURNCONTRACTed). This makes the two instances of EOFCREATE[2](Y) mean different things and is solved by Frangio's proposal. From that PoV it kinda makes sense - it's 2 different codes, but both live at C, so they have a common code_address, so to speak.

From my perspective in case of EOFCREATE nested in outer EOFCREATE initcode, the inner EOFCREATE's initcode doesn't "live at C" at all, it has almost nothing to do with C. C is an address that will be deployed (or not) when outer EOFCREATE finishes.
C happens to be msg.recipient when inner EOFCREATE is being executed (this is what I call "executing address")

But this is a bit of bikeshedding. I agree during_init flag seems to solve it.

pdobacz · 2024-11-07T10:29:42Z

inner EOFCREATE's initcode doesn't "live at C" at all, it has almost nothing to do with C

Yeah, I see your point here. But actually the bikeshedding is useful. I revisited the option with "code_address doesn't change on EOFCREATE + during_init" with this new perspective and now it seems to me it works too, I must've made a mistake somewhere yesterday, PTAL:

O: CALL A
A: EOFCREATE[1](X)
C: eofcreate_addr(sender=A, code=A, salt=X, during_init=0, idx=1)
C: EOFCREATE[2](Y) (initcode execution)
D: eofcreate_addr(sender=C, code=A, salt=Y, during_init=1, idx=2)
C: deployed

O: CALL C (deployed above)
C: EOFCREATE[2](Y)
E: eofcreate_addr(sender=C, code=C, salt=Y, during_init=0, idx=2) != D

O: CALL C (deployed above)
C: DELEGATECALL A
A: EOFCREATE[2](Y)
F: eofcreate_addr(sender=C, code=A, salt=Y, during_init=0, idx=2) != D and != E

In this version code_address matches the expectations - it is where the code lives

gumb0 · 2024-11-07T10:38:18Z

In this version code_address matches the expectations - it is where the code lives

Yes, I like this version more. Seems to work and not conflict on deeper nesting levels, too.

O: CALL A
A: EOFCREATE[1]
C: eofcreate_addr(sender=A, code=A, salt=X, during_init=0, idx=1)
C: EOFCREATE[2] (initcode execution)
D: eofcreate_addr(sender=C, code=A, salt=Y, during_init=1, idx=2)
D: EOFCREATE[2] (initcode execution)
G: eofcreate_addr(sender=D, code=A, salt=Y, during_init=1, idx=2)
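As a sanity check of this variant (code_address stays at A during initcode execution, plus during_init), the sketch below derives C, D, G and the earlier E and F cases and confirms they are pairwise distinct. `hashlib.sha3_256` is a stand-in for keccak256, the one-byte flag encoding is assumed, and all values are made up.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption); only equality matters here.
    return hashlib.sha3_256(data).digest()

def eofcreate_addr(sender, code, salt, during_init, idx) -> bytes:
    # sender || code || salt || during_init (1 byte, assumed) || idx (1 byte)
    return h(sender + code + salt + bytes([during_init, idx]))[-20:]

A = bytes.fromhex("aa" * 20)
X = (1).to_bytes(32, "big")
Y = (2).to_bytes(32, "big")

C = eofcreate_addr(A, A, X, 0, 1)  # A: EOFCREATE[1](X)
D = eofcreate_addr(C, A, Y, 1, 2)  # C initcode: EOFCREATE[2](Y), code stays A
G = eofcreate_addr(D, A, Y, 1, 2)  # D initcode: EOFCREATE[2](Y), code stays A
E = eofcreate_addr(C, C, Y, 0, 2)  # deployed C: EOFCREATE[2](Y)
F = eofcreate_addr(C, A, Y, 0, 2)  # deployed C DELEGATECALLs A: EOFCREATE[2](Y)

addresses = {"C": C, "D": D, "G": G, "E": E, "F": F}
print(len(set(addresses.values())) == len(addresses))  # all distinct
```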

frangio · 2024-11-07T15:46:06Z

This seems to work. The approach seems equivalent to a list of indices pointing at a deeply nested subcontainer of code.

I find it hard to reason about though.

I think if you see G = eofcreate_addr(sender=D, code=A, salt=Y, during_init=1, idx=2), during_init=1 means that the runtime container at G was deployed by an init container located somewhere in A, in particular subcontainer index 2 of the init container that deployed D. Recursively you arrive at C, where during_init=0 means that the runtime container at C was deployed by the subcontainer A[1].

So the init containers for each of these contracts are:

C: A[1]
D: A[1][2]
G: A[1][2][2]
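The "list of indices" reading can be sketched as a path lookup in a tree of containers. The `Container` class and the `make` and `resolve` helpers are hypothetical names for illustration; real EOF containers carry code and data sections that are omitted here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Container:
    name: str
    subcontainers: List["Container"] = field(default_factory=list)

def make(name: str, depth: int, width: int = 3) -> Container:
    # Build a toy container tree with `width` sub-containers per level.
    subs = [make(f"{name}[{i}]", depth - 1, width) for i in range(width)] if depth else []
    return Container(name, subs)

def resolve(root: Container, path: List[int]) -> Container:
    # Follow a list of sub-container indices down from the root container.
    node = root
    for idx in path:
        node = node.subcontainers[idx]
    return node

A = make("A", depth=3)

init_of_C = resolve(A, [1])        # init container that deployed C
init_of_D = resolve(A, [1, 2])     # init container that deployed D
init_of_G = resolve(A, [1, 2, 2])  # init container that deployed G

print(init_of_C.name, init_of_D.name, init_of_G.name)
```

Each nested EOFCREATE during init appends one index to the path, which is why the chain of (sender, during_init, idx) values is enough to locate the init container inside A.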

There seems to be some redundancy here:

the runtime container at G was deployed by an init container located somewhere in A, in particular subcontainer index 2 of the init container that deployed D

A is not really used in the procedure.

So an alternative could be to remove during_init and replace during_init=1 with code_address=0, since the actual code_address is implicit in sender_address in this case.

keccak256(sender_address || code_address || salt || init_subcontainer_idx)

Note that EOFCREATE in a DELEGATECALL context always results in code_address set to the target of DELEGATECALL, because that's where the init container is located, regardless of whether the sender is being deployed.

frangio · 2024-11-07T16:11:32Z

Since it looks like we may have solved this issue I'll resurface my previous comment. With this change we would be losing the ability to make on-chain proofs about the behavior of an account without a trusted factory (although it would be recoverable with a zk-coprocessor). I do think we need to consider whether it's okay to remove that, or if it's a primitive that applications are relying on.

I'm currently weakly leaning towards probably okay to remove.

pdobacz · 2024-11-07T17:00:21Z

losing the ability to make on-chain proofs about the behavior of an account without a trusted factory

Can you clarify what kind of a behavior proof? Just that codehash(address) == particular hash? Could this be substituted by address_and_subcontainer_idx_of_a_particular_factory(address) == particular address and idx? That is instead of proving code hash of an address is X, we prove where exactly the code is coming from.

frangio · 2024-11-07T19:09:33Z

Yeah that works if you have a trusted/known factory. This is probably enough.

gumb0 · 2024-11-08T05:27:40Z

So an alternative could be to remove during_init and replace during_init=1 with code_address=0, since the actual code_address is implicit in sender_address in this case.

I like this variant, too. I would reframe it as we'd have 2 different schemes depending on whether EOFCREATE is called inside initcode:

Non-nested EOFCREATE: keccak256(sender_address + code_address + salt + sub-container-index)
Nested EOFCREATE inside initcode: keccak256(sender_address + salt + sub-container-index)
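A sketch of this two-scheme framing, replaying the scenarios discussed earlier in the thread. `hashlib.sha3_256` is a stand-in for keccak256 (not real addresses), and the function name and sample values are made up. The two preimages have different lengths (73 vs. 53 bytes with 20-byte addresses), so they cannot coincide.

```python
import hashlib
from typing import Optional

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption); only equality matters here.
    return hashlib.sha3_256(data).digest()

def eofcreate_preimage(sender: bytes, salt: bytes, idx: int,
                       code: Optional[bytes] = None) -> bytes:
    # code=None models EOFCREATE nested inside initcode (no code_address).
    if code is None:
        return sender + salt + bytes([idx])
    return sender + code + salt + bytes([idx])

def eofcreate_addr(sender, salt, idx, code=None) -> bytes:
    return h(eofcreate_preimage(sender, salt, idx, code))[-20:]

A = bytes.fromhex("aa" * 20)
X = (1).to_bytes(32, "big")
Y = (2).to_bytes(32, "big")

C = eofcreate_addr(A, X, 1, code=A)  # non-nested, in A's runtime code
D = eofcreate_addr(C, Y, 2)          # nested, in C's initcode
G = eofcreate_addr(D, Y, 2)          # nested, in D's initcode
E = eofcreate_addr(C, Y, 2, code=C)  # non-nested, in deployed C
F = eofcreate_addr(C, Y, 2, code=A)  # non-nested, C DELEGATECALLs A

nested_len = len(eofcreate_preimage(C, Y, 2))
plain_len = len(eofcreate_preimage(C, Y, 2, code=C))
print(nested_len, plain_len, len({C, D, G, E, F}))
```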

charles-cooper · 2024-11-21T16:24:04Z

how about just removing the witness entirely? i.e. keccak256(sender_address + salt). i think the user is mainly interested that other users cannot trivially produce a collision with some salt they want to deploy to, but the EVM can let them be responsible for making sure they don't produce a collision with themselves.

charles-cooper · 2024-11-25T12:04:32Z

one issue with using the initcontainer index in the hash is that it makes counterfactual address calculation potentially impossible on chain. since the EOFCREATE-ing contract cannot introspect the code of the factory address (that it delegates to), it cannot counterfactually produce the target address.

charles-cooper · 2024-11-25T12:08:06Z

We identified that the keccak256(init-container) goes against the "code non-observability" because it locks in the contents of the init-container e.g. preventing re-writing it in some future upgrade.

also wanted to point out that using initcontainer index rules out certain types of code rewrites as well. for instance, reordering of initcontainers or fusing them.

frangio · 2024-11-28T17:17:48Z

We've included Foo_hash in the final_salt, so that should be impossible.

It would be possible if the target of DELEGATECALL uses the proposed raw_salt in Solidity or just handwritten EVM code.

If we say that developers should only delegate to code they trust not to do this, we're back in the argument about the degree of responsibility put on them. CREATE2 collisions triggered by DELEGATECALL targets is not something they're responsible for today in legacy code.

kuzdogan · 2024-11-29T10:08:21Z

During the Nov 27 call a new, separate metadata section was proposed and generally agreed upon: data that can't be read by the EVM, where any change to it does not affect the code (spec/EIP TBD).

I'd like to point out that, if we are to keep the initcode hash in the EOFCREATE parameters, there's a benefit to leaving the metadata section out of the hash calculation. This is a common headache in CREATE2 contracts today, and the reason many teams choose to opt out of the metadata hash in the bytecode.

pdobacz · 2024-12-02T09:22:41Z

if we are to keep the initcode hash in the EOFCREATE parameters, there's a benefit to leaving out the metadata section in the hash calculation.

I see it more like the metadata section's being there becomes a strong argument for leaving out the initcode hash entirely - in any of the variants proposed here. Calculating initcode hash from a subset of sections sounds impractical to me.

pdobacz · 2024-12-02T09:57:00Z

It would be possible if the target of DELEGATECALL uses the proposed raw_salt in Solidity or just handwritten EVM code.

Oh yes, impossible when final_salt is used. But I assume using raw_salt would be intentional usage, when the CREATE2/EOFCREATE usage pattern must be somehow customized, and then you depart from the "CREATE2 guarantees parity".

If we say that developers should only delegate to code they trust not to do this

Isn't it already the case only trusted code should be delegated to? That is, in broader context than just creation logic. If we have a target function delegatecall_me_i'll_create_a_contract, I'd expect it would document precisely how it will create, if it uses the non-default raw_salt inside, and the caller would decide, if it can delegate to it.

Is there a usage pattern where that wouldn't work well enough @frangio?

charles-cooper · 2024-12-02T13:13:45Z

Not sure about this, but I think CREATE2 caters for fully counterfactual deployments (code not on-chain), so it needs to include the initcode hash. EOFCREATE doesn't, so it has the opportunity to improve, fully counterfactual deployments will be up to the future TXCREATE.

speaking of counterfactuality, i think the issue with keccak256(msg.sender + code_address + salt) and the variants including the subcontainer index are that they can't be computed on-chain. i'm not sure how big of an issue this is, but it is already stepping outside of the design goals of the original CREATE2, which allow smart contracts to compute counterfactual deploy addresses.

frangio · 2024-12-02T13:14:33Z

Isn't it already the case only trusted code should be delegated to?

Yes but the way creation salts are constructed is not a property one has to audit of DELEGATECALL targets at the moment. It's not impossible to audit, it's just a new checklist item with respect to legacy.

The way I'm thinking about this is global vs local properties, where global collision avoidance should be taken care of by the EVM and local collision avoidance should be ensured by the contract code, where "contract code" is the developer and their compiler, and the property should not be breakable by an end user (eventually an attacker) under any circumstance (barring Keccak256 breaking) including if the user is able to choose a DELEGATECALL target. I recognize this last part is pretty strong so I'm not attached to it.

frangio · 2024-12-02T13:17:51Z

i think the issue with keccak256(msg.sender + code_address + salt) and the variants including the subcontainer index are that they can't be computed on-chain

Not sure what you mean by this. It can be computed on chain if you know the parameters, which the contract should document and would already be documented anyway, among other things because the salt is often constructed and not explicit in the input.

Can you describe end to end a scenario where you see an issue?

charles-cooper · 2024-12-02T15:55:30Z

i think the issue with keccak256(msg.sender + code_address + salt) and the variants including the subcontainer index are that they can't be computed on-chain

Not sure what you mean by this. It can be computed on chain if you know the parameters, which the contract should document and would already be documented anyway, among other things because the salt is often constructed and not explicit in the input.

Can you describe end to end a scenario where you see an issue?

i mean like you need more arguments than are required for just the eofcreate, e.g.

def create_something() -> address:
    salt: bytes32 = self._compute_salt()
    return factory.create(args, salt)  # calls EOFCREATE or delegates to another factory

def counterfactual() -> address:  # compute the address counterfactually of factory.create()
    salt: bytes32 = self._compute_salt()
    return ... # can't, need the final code_address and potentially the initcontainer index depending on the scheme

frangio · 2024-12-02T17:57:36Z

Ok, I was going to suggest the factory should expose a getter for counterfactual addresses because it knows those parameters. But this is not possible if the factory is upgradeable because the code_address parameter becomes time dependent, future values are not known and all past values need to be stored.
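The time dependence can be sketched as follows, assuming a proxy that DELEGATECALLs an implementation which executes EOFCREATE. `hashlib.sha3_256` is a stand-in for keccak256 (not real addresses), and `PROXY`, `IMPL_V1`, `IMPL_V2` are made-up values.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (assumption); only equality matters here.
    return hashlib.sha3_256(data).digest()

def addr_with_code(sender: bytes, code: bytes, salt: bytes, idx: int) -> bytes:
    # Scheme that includes code_address and the sub-container index.
    return h(sender + code + salt + bytes([idx]))[-20:]

def addr_sender_salt(sender: bytes, salt: bytes) -> bytes:
    # Scheme that only uses sender and salt.
    return h(sender + salt)[-20:]

PROXY = bytes.fromhex("aa" * 20)    # upgradeable factory (the sender)
IMPL_V1 = bytes.fromhex("b1" * 20)  # implementation before the upgrade
IMPL_V2 = bytes.fromhex("b2" * 20)  # implementation after the upgrade
SALT = (9).to_bytes(32, "big")

before = addr_with_code(PROXY, IMPL_V1, SALT, 1)
after = addr_with_code(PROXY, IMPL_V2, SALT, 1)

stable_before = addr_sender_salt(PROXY, SALT)
stable_after = addr_sender_salt(PROXY, SALT)

print(before == after)                # prediction changes across the upgrade
print(stable_before == stable_after)  # sender + salt is unaffected
```

A getter on the proxy could only report the address for the current implementation; predicting future deployments or verifying past ones would need the full history of implementation addresses.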

charles-cooper · 2024-12-02T18:12:14Z

right -- so i think the point is it would break an existing property of CREATE2, which is that you can counterfactually predict the address of invoking CREATE2 from just its inputs.

pcaversaccio · 2024-12-11T17:48:12Z

Maybe this has already been discussed and my comment is completely off-topic, but what I personally care about is having a way to guarantee cross-chain runtime bytecode equivalence. Both proposals here keccak256(msg.sender + salt) and keccak256(msg.sender + code_address + salt) do not guarantee this. I understand that we can't introspect code in EOF and that there is no initcode hash that can be used like in CREATE2 but I just want to raise a point that I deem important.

frangio · 2024-12-11T19:39:17Z

I'd argue you need init code equivalence because the same runtime code initialized differently can have wildly different properties.

Both proposals here keccak256(msg.sender + salt) and keccak256(msg.sender + code_address + salt) do not guarantee this.

Indeed but note that if your goal is to build a generic factory like CreateX you will not be able to do that with EOFCREATE. Generic factories would be enabled by TXCREATE, where the init code hash is available and can be used in the salt to guarantee code equivalence. At the moment it's unclear if this will ship in Osaka. See EOF Implementers Call #63 for more discussion.

charles-cooper · 2024-12-11T19:42:51Z

I'd argue you need init code equivalence because the same runtime code initialized differently can have wildly different properties.

That's an interesting point, although any observable difference in chain state can result in runtime code with different properties (examples: reading block.timestamp in the initcode, or calling view functions on some external contract).

pdobacz · 2024-12-11T19:54:18Z

One more thing to mention is that so far we weren't considering altering the creation tx hashing scheme. It's currently the same as legacy (sender + sender_nonce), but for EOF it could be changed to include the initcode hash and a salt instead of the sender_nonce, at the expense of an ugly requirement to append 32 bytes of salt after the init container and before calldata.

Plot twist: instead of after the initcontainer, the salt could be... in the tx initcontainer's EIP-7834 metadata section :P.

This is slightly offtopic, b/c this thread is for EOFCREATE, but if reliable-deterministic-cross-chain addresses are a concern, and we can't afford to wait for TXCREATE, maybe that is some way out?

pcaversaccio · 2024-12-12T10:10:40Z

I'd argue you need init code equivalence because the same runtime code initialized differently can have wildly different properties.

Right, I think it depends on the use case. For CreateX, for example, I care about runtime bytecode equivalence everywhere (CreateX is stateless by design and has an empty constructor) to ensure that the built-in contract creation functions are equivalent everywhere deployed (important as the factory allows for cross-chain frontrun protection for example). But I can see, how init code equivalence would make sense for many other applications.

Indeed but note that if your goal is to build a generic factory like CreateX you will not be able to do that with EOFCREATE. Generic factories would be enabled by TXCREATE, where the init code hash is available and can be used in the salt to guarantee code equivalence. At the moment it's unclear if this will ship in Osaka. See EOF Implementers Call #63 for more discussion.

Interesting - well after skimming through that proposal my first view is: Having a new transaction type to access this feature is an unnecessary overhead IMHO. We should strive for KISS, and not start adding new transaction types due to a bad design in EOFCREATE.

Plot twist: instead of after the initcontainer, the salt could be... in the tx initcontainer's EIP-7834 metadata section :P.

Interesting - I have to read that EIP first again.

I would like to mention that in yesterday's EOF call, it was mentioned that CreateX uses Nick's method. This is not true. I have the private key as backup, and here I elaborate on the decision why. Also, there are many RPCs that don't support pre-EIP-155 transactions even though at the network level they would be supported (the reason being that Geth defaults to non-support of pre-EIP-155 transactions since Berlin; see ethereum/go-ethereum#22339) as well as networks that simply don't support pre-EIP-155 transactions. Furthermore, a note on why Nick's method doesn't scale: I have 3 presigned transactions for CreateX creation available (see here), with one having 45m gasLimit. The last one couldn't be broadcast on Ethereum due to today's block gasLimit. But even that one wouldn't be enough to e.g. deploy on Filecoin, which requires more than 100m gasLimit. So you see, having a backup key makes CreateX deployable on chains that are not EVM-equivalent but are EVM-similar.

Lastly, if you're interested in some stats on CreateX, there is a community-maintained Dune dashboard: https://dune.com/patronumlabs/createx.

pdobacz · 2025-01-08T17:30:20Z

We've put together a summary doc with some possible scenarios of revising the hashing schemes/deployment methods for EOF: https://notes.ethereum.org/@ipsilon/SyrzctZSJg. The goal of this is to lay out our options in the clear and discuss them - whether we still can alter the EOF address schemes and if yes - what is the best way to do it.

All feedback very welcome. If the doc is missing a solution/scenario you'd like to make a case for, please let me know. I'm looking forward to discussing this in depth on the next EOF implementers call.

Taking the liberty to tag @pcaversaccio @charles-cooper @cameel @kuzdogan

frangio · 2025-01-10T22:17:59Z

@pdobacz What does it mean for an approach to support AA deployments? What are the AA-specific challenges?

More generally, can you provide a description of each of the items in the comparison table? I think it would be useful to have as a list of desiderata.

These are the ones I can recall:

EOFCREATE hashing scheme:
- Must prevent collisions between deployers
- Should not include the init container hash (it requires code introspection)
- May not have the same collision guarantees as CREATE/CREATE2 but should allow languages/libraries to recover them by appropriate construction of a salt
- Ideally the developer is free to choose what to mix into the hash (input data, init code index or hash, etc.)
- Contracts should be able to predict addresses as a function of inputs (not including contract state)

Additionally the following have come up:

1. Same-address multi-chain deployments, where equivalent addresses roughly imply equivalent deployments
2. Generic factories, which requires something like TXCREATE

IMO (1) is the main requirement, and (2) is just one technique currently used to achieve it. That is because same-address multi-chain deployments are extremely challenging, unless there is a preexisting multi-chain generic factory, so we make a big effort into deploying that factory to enable it. The issue with EOF is that it kills generic factories, but if multi-chain deployments were solved I believe this would not be a significant issue. Since it seems like TXCREATE would be a big source of ACDE friction, perhaps we can focus on solving multi-chain deployments in some other way.

shemnon · 2025-01-13T15:37:23Z

Since EOF is the current "headliner" in Osaka I don't think we will have the same friction getting TXCREATE in, especially since it is being driven by end-user requirements and not driven by the evm devs.

Given that, I think Address+salt for both EOFCREATE and TXCREATE, plus a set of ERC standard contracts (with "standard" deployments) that address the use cases, is what I see working. We may want to add / wrap the hash with a per-opcode value to prevent EOFCREATE and TXCREATE from having same-contract collisions, like hash(0xef0001ec || <address> || <salt> || 0xef0001ec) for EOFCREATE, and use 0xef0001ed as the bumper for TXCREATE.
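A minimal sketch of the per-opcode "bumper" idea, to make the collision argument concrete. Several things here are assumptions of the sketch rather than part of any spec: the helper names, the 20-byte sender and 32-byte salt widths, and keeping the low 20 bytes of the digest. Python's standard library has no Keccak-256 (`hashlib.sha3_256` is the finalized SHA-3, which pads differently), so it is used purely as a stand-in and the resulting values are not real addresses.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in only: NOT Ethereum's Keccak-256. Swap in a real Keccak-256
    # implementation to obtain actual on-chain values.
    return hashlib.sha3_256(data).digest()


# Per-opcode bumper values as suggested in the comment.
EOFCREATE_BUMPER = bytes.fromhex("ef0001ec")
TXCREATE_BUMPER = bytes.fromhex("ef0001ed")


def bumped_address(bumper: bytes, sender: bytes, salt: bytes) -> bytes:
    """hash(bumper || sender || salt || bumper), low 20 bytes (assumed truncation)."""
    if len(sender) != 20 or len(salt) != 32:
        raise ValueError("expected a 20-byte sender and a 32-byte salt")
    return h(bumper + sender + salt + bumper)[-20:]


# Same contract, same salt, different opcode.
sender = bytes.fromhex("11" * 20)
salt = (1).to_bytes(32, "big")
eof_addr = bumped_address(EOFCREATE_BUMPER, sender, salt)
tx_addr = bumped_address(TXCREATE_BUMPER, sender, salt)
```

With a plain hash(address || salt) the two calls would hash identical input; the bumper makes the preimages differ, so the same contract can reuse a salt across both opcodes without the addresses coinciding.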

pdobacz · 2025-01-14T10:50:36Z

@pdobacz What does it mean for an approach to support AA deployments? What are the AA-specific challenges?

"AA deployments" come up in the context of comparing the pieces labelled (D/) and (F/), that is, TXCREATE and the nonce-less creation txs. (D/) TXCREATE allows a smart contract wallet to deploy arbitrary code, while such creation txs do not, which is what puts the latter at a disadvantage.

Thanks for noting, I think this wording is too much of a mental shortcut. It should read "support deployments by smart contract wallets, as required by AA".

More generally, can you provide a description of each of the items in the comparison table? I think it would be useful to have as a list of desiderata.

Good point. I tried to make this self-explanatory by elaborating in the A, B, C... sections, but this seems to not be clear enough.

These are the ones I can recall:

* EOFCREATE hashing scheme:
  * Must prevent collisions between deployers
  * Should not include the init container hash (it requires code introspection)
  * May not have the same collision guarantees as CREATE/CREATE2 but should allow languages/libraries to recover them by appropriate construction of a salt
  * Ideally the developer is free to choose what to mix into the hash (input data, init code index or hash, etc.)
  * Contracts should be able to predict addresses as a function of inputs (not including contract state)

Additionally the following have come up:

1. Same-address multi-chain deployments, where equivalent addresses roughly imply equivalent deployments
2. Generic factories, which requires something like TXCREATE

IMO (1) is the main requirement, and (2) is just one technique currently used to achieve it. That is because same-address multi-chain deployments are extremely challenging, unless there is a preexisting multi-chain generic factory, so we make a big effort into deploying that factory to enable it. The issue with EOF is that it kills generic factories, but if multi-chain deployments were solved I believe this would not be a significant issue. Since it seems like TXCREATE would be a big source of ACDE friction, perhaps we can focus on solving multi-chain deployments in some other way.

Got it, so I think that in the table "Bytecode guarantees" boils down to 1., while "Generic factories" boils down to 2. My question is whether 2. has merit beyond providing 1. I initially thought that a factory which on the one hand supports deploying arbitrary code, and on the other runs some fixed logic (like registering the new contract in some registry or whatnot), would be useful. But if we can ascertain this is not a useful tool, I can take the row "Generic factories" out of the comparison table and treat it only as a means of accomplishing 1.

frangio · 2025-01-14T18:17:56Z

support deployments by smart contract wallets, as required by AA

What deployments are needed? Say in the context of ERC-4337, a UserOp can specify a factory, but the factory doesn't need to be generic as far as I can tell.

pdobacz · 2025-01-15T09:14:07Z

support deployments by smart contract wallets, as required by AA

What deployments are needed? Say in the context of ERC-4337, a UserOp can specify a factory, but the factory doesn't need to be generic as far as I can tell.

Hm, okay, maybe I'm missing something, but let's take 4337's: (EDIT: I was missing something indeed, see below)

Create the account if it does not yet exist, using the initcode provided in the UserOperation...

I understand that this step ~~requires~~ can use a generic factory (that is, not an EOFCREATE one). In EOF with (D/), instead of deploying initcode provided in data, it would use a TXCREATE pointing to an initcontainer included in the tx's initcodes field (c.f. TXCREATE old spec). This isn't possible with (F/).

I think the same challenge applies to 7702 wallets deploying contracts.

EDIT: as noted by Frangio, I didn't understand the main use case behind that initcode in ERC-4337, see the comments that follow. The main use case behind such initcode in the ERC-4337 sense seems to be one which EOFCREATE does support.

frangio · 2025-01-15T15:24:51Z

initCode is a misleading name so we should clarify that it's defined in ERC-4337 as:

concatenation of factory address and factoryData (or empty)

So initCode is not necessarily bytecode for a generic factory, it could very well be the address of a specialized factory (that can use EOFCREATE) and an encoded function call to request creation of an account from that factory. I'd expect that to be the most common way since it's cheaper. But it could technically also be bytecode for a generic factory, are there scenarios where this becomes necessary?
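To make the layout concrete, here is a minimal sketch of splitting an ERC-4337 initCode value into its two parts, following the definition quoted above (factory address followed by factoryData, or empty). The function name and the example bytes are hypothetical; the 20-byte address width is the standard Ethereum address size.

```python
from typing import Optional, Tuple


def split_init_code(init_code: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Split initCode into (factory_address, factory_data).

    Returns None for empty initCode, i.e. no factory is specified.
    """
    if len(init_code) == 0:
        return None
    if len(init_code) < 20:
        raise ValueError("initCode is shorter than a 20-byte address")
    # First 20 bytes: address of the factory to call.
    # Remainder: calldata passed to that factory.
    return init_code[:20], init_code[20:]


# Hypothetical example: a factory at 0x2222...22 and made-up calldata bytes.
example_factory = bytes.fromhex("22" * 20)
example_data = bytes.fromhex("deadbeef")
parsed = split_init_code(example_factory + example_data)
```

Nothing in this layout carries deployable bytecode by itself: the factory at the given address decides how the account is created, which is why a specialized factory using EOFCREATE fits.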

pdobacz · 2025-01-15T17:41:21Z

initCode is a misleading name so we should clarify that it's defined in ERC-4337 as:

OK, so it definitely misled me, thank you for bearing with me.

But it could technically also be bytecode for a generic factory, are there scenarios where this becomes necessary?

Yes, so this is my question as well, or more generally: how, if at all, can ERC-4337 SC Wallets deploy new contracts? In legacy EVM this is possible using CREATE2, but I don't know if it is a pattern actually used or intended.

frangio · 2025-01-15T21:53:50Z

how, if at all, can ERC-4337 SC Wallets deploy new contracts

Oh! I see what you mean now.

I just looked through Safe and Coinbase Smart Wallet as two examples, and couldn't find any way to directly deploy a contract from either. In the case of Safe it could be done through DELEGATECALL into a factory. I'll try asking other AA teams, but I don't see why this feature would be needed tbh.

pdobacz · 2025-01-29T14:24:26Z

On the last EOF implementers call (#65) we arrived at some consensus to push for Scenario 1b from https://notes.ethereum.org/@ipsilon/SyrzctZSJg, that is, the addition of TXCREATE, InitcodeTransaction and predeployed Creator Contract(s) to serve as toe-hold contracts. In addition, keccak256(sender_address + salt) is proposed as the hashing scheme for both EOFCREATE and TXCREATE. Legacy-like creation transactions for EOF (EIP-7698) would at the same time be removed from EOF. Refer to the call notes/recording for details.

The change would be proposed in a new EIP (currently in the making), but it can be previewed in the PR to the EOF Megaspec document. The spec there is equivalent to that of the EIP being prepared.

Please take a look @pcaversaccio @charles-cooper @cameel @kuzdogan and provide feedback. I'm looking forward mainly to receiving confirmation that this approach satisfies the required deployment methods in use today. Or of course, if they don't, please let us know why and how we should fix it. We can also move that preliminary discussion to that EOF Megaspec document PR, before we have the EIP draft and a corresponding EthMag thread.

(Last minute heads-up: meanwhile we've identified what might be an issue. Quick gist: the EIP-7834 metadata section is at odds with the approach for a factory to include initcontainer_hash in the TXCREATE's salt to obtain bytecode guarantees. Hopefully, it's not a show-stopper. I'll write the details down later.)

pcaversaccio · 2025-01-29T16:39:15Z

I don't want to sidetrack the discussion but this is interesting from EOF Megaspec document:

If I'm not mistaken, this would be the first predeploy (!= precompile) on Ethereum. Is there some discussion (e.g. an EIP) for this? I have seen such an approach for many L2s, but not yet for Ethereum.

So scenario 1b) states:

C/ EOFCREATE hashes with keccak256(sender + salt)
D/ TXCREATE hashes with keccak256(sender + salt)
E/ a predeployed TXCREATE factory contract to bootstrap EOF

I'm sorry if I missed the conversation around it, but you claim this scenario has "Bytecode guarantees" in here:

How can you exactly guarantee the bytecode for a counterfactual contract address without using the init code in EOFCREATE and TXCREATE?

kuzdogan · 2025-01-29T16:42:56Z

Thanks @pdobacz I think I understood most of it but as a less versed person on the spec and terminology I'd want to summarize my takeaways and maybe you can correct me:

- There will be no creation txs for EOF, i.e. txs with to=null won't be able to deploy EOF code, only legacy code.
- There will be a new tx type InitcodeTransaction with a new initcodes field containing the initcodes to be deployed. This tx has to be sent to a contract that utilizes the TXCREATE opcode.
- There will be a TXCREATE factory contract deployed whose sole job will be to deploy each of the codes in the initcodes field's array. Say this contract will be at 0xabcd.
- One can't deploy contracts from an EOA with to=null txs but can send an InitcodeTransaction type transaction to the 0xabcd contract. That means all the EVM tooling for this specific chain needs to know the address of this contract 0xabcd ahead of time?

(Last minute heads-up: meanwhile we've identified what might be an issue. Quick gist: the EIP-7834 metadata section is at odds with the approach for a factory to include initcontainer_hash in the TXCREATE's salt to obtain bytecode guarantees. Hopefully, it's not a show-stopper. I'll write the details down later.)

Where is the initcontainer_hash in this whole picture? I also see some initcode hash being mentioned in the TXCREATE specs:

    - pops one more value from the stack (first argument): `tx_initcode_hash`
    - loads the initcode EOF container from the transaction `initcodes` array which hashes to `tx_initcode_hash`

But I don't see it in the 1b option. It's just keccak(sender + salt):

pdobacz · 2025-01-29T17:36:11Z

Thank you for taking a look, this is much appreciated! Let me try to clarify everything.

Actually, I'm sorry, but the Creator Contract source code mentioned in the PR had an error (now fixed): TXCREATE looks up the initcontainer by its hash, not its index. TXCREATE's description was correct, but maybe the incorrect Creator Contract source code misled you both.

Having fixed that, to answer your specific questions:

How can you exactly guarantee the bytecode for a counterfactual contract address without using the init code in EOFCREATE and TXCREATE?

@pcaversaccio Consider a factory pseudocode (I omitted value and input for brevity):

function createWithInitcodeWitness(initcode_hash, salt) {
    final_salt = keccak256(initcode_hash || salt)
    return txcreate(initcode_hash, final_salt)
}

If such a factory is deployed at guarantees_factory_address and gets CALLed, it will deploy using the initcode entry from the transaction which corresponds to initcode_hash (by the rules of TXCREATE). At the same time initcode_hash is included in the final_salt passed on to TXCREATE, so new_address = keccak256(0xff || guarantees_factory_address || keccak256(initcode_hash || salt)).
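The pseudocode and formula above can be modelled off-chain roughly as follows. This is only a sketch under stated assumptions: `hashlib.sha3_256` stands in for Keccak-256 (they are different functions, so the outputs are not real addresses), keeping the low 20 bytes of the digest is assumed, returning `None` for a missing initcode is a modelling choice, and the example initcode bytes are placeholders rather than valid EOF containers.

```python
import hashlib
from typing import Optional, Sequence


def h(data: bytes) -> bytes:
    # Stand-in only: NOT Ethereum's Keccak-256.
    return hashlib.sha3_256(data).digest()


def txcreate_address(factory: bytes, salt: bytes) -> bytes:
    # Mirrors the formula in the comment: hash(0xff || factory || salt).
    # Keeping the low 20 bytes is an assumption of this sketch.
    return h(b"\xff" + factory + salt)[-20:]


def create_with_initcode_witness(
    factory: bytes,
    tx_initcodes: Sequence[bytes],
    initcode_hash: bytes,
    salt: bytes,
) -> Optional[bytes]:
    # TXCREATE rule: the initcode is looked up by hash among the tx's initcodes.
    initcode = next((c for c in tx_initcodes if h(c) == initcode_hash), None)
    if initcode is None:
        return None  # modelling choice; real failure semantics are not covered
    # The factory mixes the initcode hash into the salt it passes on.
    final_salt = h(initcode_hash + salt)
    return txcreate_address(factory, final_salt)


factory = bytes.fromhex("aa" * 20)
salt = (7).to_bytes(32, "big")
code_a = b"\xef\x00\x01" + b"A"  # placeholder bytes, not a valid EOF container
code_b = b"\xef\x00\x01" + b"B"  # placeholder bytes, not a valid EOF container
initcodes = [code_a, code_b]
addr_a = create_with_initcode_witness(factory, initcodes, h(code_a), salt)
addr_b = create_with_initcode_witness(factory, initcodes, h(code_b), salt)
```

With the same factory and the same user-supplied salt, the two initcodes map to different addresses, which is the sense in which the address acts as a witness for the deployed initcode even though TXCREATE itself only hashes sender and salt.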

I'd want to summarize my takeaways and maybe you can correct me:

@kuzdogan looks all correct. A minor remark is that the intention is for the 0xabcd to be the same for all chains adopting the EIP (and EOF), likely an address from the precompile addresses range. I'm not 100% sure if this is something we can expect to hold?

But I don't see it in the 1b option. It's just keccak(sender + salt):

Maybe the above answer to pcaversaccio helps? It would be up to the specific factory to include it.

I have now realized that the "bootstrap" Creator Contract should work more similarly to the createWithInitcodeWitness function, i.e. it should include initcode_hash in the address. This would ensure that all "derived" TXCREATE factories land at the same addresses iff they have the same code.

chfast assigned axic Sep 6, 2024

pdobacz mentioned this issue Sep 9, 2024

EOFv1 final tuning #165

Open

5 tasks

shemnon mentioned this issue Oct 30, 2024

EOF Implementers Call #61 ethereum/pm#1184

Closed

shemnon mentioned this issue Dec 11, 2024

EOF Implementers Call #63 ethereum/pm#1205

Closed

pdobacz mentioned this issue Jan 27, 2025

Bring back TXCREATE #177

Draft
