Bring back TXCREATE #177

Draft
wants to merge 9 commits into
base: main
from
120 changes: 93 additions & 27 deletions spec/eof.md
@@ -103,6 +103,47 @@ On top of the types defined in the table above, the following validity constrain
- the total size of not yet deployed container might be up to `data_size` lower than the above values due to how the data section is rewritten and resized during deployment (see [Data Section Lifecycle](#data-section-lifecycle))
- the total size of a container must not exceed `MAX_INITCODE_SIZE` (as defined in EIP-3860)

## Transaction Types

Introduce new transaction type `InitcodeTransaction` which extends EIP-1559 (type 2) transaction by adding a new field `initcodes: List[ByteList[MAX_INITCODE_SIZE], MAX_INITCODE_COUNT]`.

The `initcodes` can only be accessed via the `TXCREATE` instruction (see below), therefore `InitcodeTransactions` are intended to be sent to contracts including `TXCREATE` in their execution.

We introduce a standardised Creator Contract (i.e. written in EVM, but existing at a known address, such as precompiles), which eliminates the need to have create transactions with empty `to`. The Creator Contract will be predeployed at the EOF activation block. Note that such introduction of the Creator Contract is needed, because only EOF contracts can create EOF contracts. See the appendix below for Creator Contract code.

Under transaction validation rules `initcodes` are not validated for conformance to the EOF specification. They are only validated when accessed via `TXCREATE`. This avoids potential DoS attacks on the mempool. If no `TXCREATE` instruction is executed during the execution of an `InitcodeTransaction`, the transaction is still valid.

`initcodes` data is similar to calldata for two reasons:
1) It must be fully transmitted in the transaction.
2) It is accessible to the EVM, but it can't be fully loaded into EVM memory.

For these reasons, define the cost of each `initcodes` item to be the same as calldata (16 gas per non-zero byte, 4 gas per zero byte; see EIP-2028). The intrinsic gas of an `InitcodeTransaction` is extended by the sum of the costs of all these items.
Contributor

@gumb0 gumb0 Jan 28, 2025

I wonder how this interacts with EIP-7623, probably in its terms tokens_in_calldata should be extended with counting the tokens of initcodes (which in the end is the same as extending intrinsic gas with initcodes cost and applying calldata cost floor)

Member Author

yea, good catch. I think the intent is that initcodes is charged for just like regular calldata. I propose we delay speccing this, just in case 7623 undergoes some last minute change. I'll leave an issue to handle this

Contributor

@shemnon shemnon Jan 28, 2025

It's contract data. Perhaps we just outright charge the floor rate for the txcreate initcode, and say that encompasses the INITCODE_WORD_COST as well as all EOF code validation costs.


EIP-3860 and EIP-170 still apply, i.e. `MAX_CODE_SIZE` as 24576, `MAX_INITCODE_SIZE` as `2 * MAX_CODE_SIZE`. Define `MAX_INITCODE_COUNT` as 256.
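
The calldata-style charge for `initcodes` described earlier in this section can be sketched in Python as follows. This is only an illustration: the function and constant names are hypothetical, and it deliberately ignores any interaction with EIP-7623, which the review discussion leaves open.

```python
from typing import List

# Illustrative constant names (not from the spec); values are the EIP-2028
# calldata prices quoted in the text.
NONZERO_BYTE_GAS = 16
ZERO_BYTE_GAS = 4


def initcodes_intrinsic_gas(initcodes: List[bytes]) -> int:
    """Extra intrinsic gas charged for the `initcodes` field (hypothetical helper)."""
    total = 0
    for initcode in initcodes:
        zero_bytes = initcode.count(0)
        nonzero_bytes = len(initcode) - zero_bytes
        total += ZERO_BYTE_GAS * zero_bytes + NONZERO_BYTE_GAS * nonzero_bytes
    return total
```

For example, a single 3-byte item `ef 00 01` has two non-zero bytes and one zero byte, so it would add `2 * 16 + 4` gas under this reading.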

`InitcodeTransaction` is invalid if any of the following holds:
- there are more than `MAX_INITCODE_COUNT` entries in `initcodes`
- `initcodes` is an empty array
- length of any entry in `initcodes` exceeds `MAX_INITCODE_SIZE`
- any entry in `initcodes` has zero length
- the `to` is `nil`
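
The validity conditions listed above could be checked roughly as in the sketch below. The function name and the representation of `to` (`None` standing in for `nil`) are assumptions made for illustration; the constants are the ones defined in this section.

```python
from typing import List, Optional

MAX_CODE_SIZE = 24576                  # EIP-170
MAX_INITCODE_SIZE = 2 * MAX_CODE_SIZE  # EIP-3860
MAX_INITCODE_COUNT = 256


def initcode_tx_fields_valid(to: Optional[bytes], initcodes: List[bytes]) -> bool:
    """Return False if any InitcodeTransaction-specific rule is violated.

    Hypothetical helper: `to=None` models a nil `to` field.
    """
    if to is None:
        return False
    if len(initcodes) == 0 or len(initcodes) > MAX_INITCODE_COUNT:
        return False
    for initcode in initcodes:
        if len(initcode) == 0 or len(initcode) > MAX_INITCODE_SIZE:
            return False
    return True
```

Note that this only covers the rules listed here; EOF validation of each entry is intentionally not part of transaction validation and happens when `TXCREATE` accesses the entry.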

#### RLP and signature

Given the definitions from [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718) the `TransactionPayload` for an `InitcodeTransaction` is the RLP serialization of:

```
[chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes, y_parity, r, s]
```

`TransactionType` is `INITCODE_TX_TYPE` (`0x05`) and the signature values `y_parity`, `r`, and `s` are calculated by constructing a secp256k1 signature over the following digest:

```
keccak256(INITCODE_TX_TYPE || rlp([chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes]))
```

The [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718) `ReceiptPayload` for this transaction is `rlp([status, cumulative_transaction_gas_used, logs_bloom, logs])`.
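
To make the signing digest concrete, here is a minimal Python sketch that builds the preimage `INITCODE_TX_TYPE || rlp([...])` with a small hand-written RLP encoder. The helper names are hypothetical. The final `keccak256` step is left to the caller, because Python's standard library does not provide Keccak-256 (`hashlib.sha3_256` uses different padding and gives a different result).

```python
INITCODE_TX_TYPE = 0x05


def _length_prefix(length: int, offset: int) -> bytes:
    """RLP length prefix for a string (offset 0x80) or a list (offset 0xC0)."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode bytes, non-negative ints, or nested lists of those."""
    if isinstance(item, int):
        # Integers are big-endian with no leading zeros; zero is the empty string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


def signing_preimage(chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas,
                     gas_limit, to, value, data, access_list, initcodes) -> bytes:
    """Bytes to be hashed with keccak256 before secp256k1 signing."""
    fields = [chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas,
              gas_limit, to, value, data, access_list, initcodes]
    return bytes([INITCODE_TX_TYPE]) + rlp_encode(fields)
```

The field order follows the list given above; `access_list` is assumed to be a list of `[address, [storage_keys...]]` pairs as in EIP-2930.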

## Execution Semantics

Code executing within an EOF environment will behave differently than legacy code. We can break these differences down into i) changes to existing behavior and ii) introduction of new behavior.
@@ -118,6 +159,7 @@ Code executing within an EOF environment will behave differently than legacy cod
- The instruction `JUMPDEST` is renamed to `NOP` and remains charging 1 gas without any effect.
- Note: jumpdest-analysis is not performed anymore.
- EOF contract may not deploy legacy code (it is naturally rejected on the code validation stage)
- Legacy creation transactions (any transactions with empty `to`) are invalid if `data` contains EOF code (starts with `EF00` magic)
- When executed from a legacy contract, if instructions `CREATE` and `CREATE2` have EOF code as initcode (starting with `EF00` magic)
- deployment fails (returns 0 on the stack)
- caller's nonce is not updated and gas for initcode execution is not consumed
@@ -126,29 +168,6 @@ Code executing within an EOF environment will behave differently than legacy cod

**NOTE** Like for legacy targets, the aforementioned behavior of `EXTCODECOPY`, `EXTCODEHASH` and `EXTCODESIZE` does not apply to EOF contract targets mid-creation, i.e. those report same as accounts without code.

#### Creation transactions

Creation transactions (transactions with empty `to`), with `data` containing EOF code (starting with `EF00` magic) are interpreted as having a concatenation of EOF `initcontainer` and `calldata` in the `data` and:

1. intrinsic gas cost rules and limits defined in EIP-3860 for legacy creation transaction apply. The entire `data` of the transaction is used for these calculations
2. Find the split of `data` into `initcontainer` and `calldata`:
- Parse EOF header
- Find `initcontainer` size by reading all section sizes from the header and adding them up with the header size to get the full container size.
3. Validate the `initcontainer` and all its subcontainers recursively.
- unlike in general validation `initcontainer` is additionally required to have `data_size` declared in the header equal to actual `data_section` size.
- validation includes checking that the `initcontainer` does not contain `RETURN` or `STOP`
4. If EOF header parsing or full container validation fails, transaction is considered valid and failing. Gas for initcode execution is not consumed, only intrinsic creation transaction costs are charged.
5. `calldata` part of transaction `data` that follows `initcontainer` is treated as calldata to pass into the execution frame
6. execute the container and deduct gas for execution
1. Calculate `new_address` as `keccak256(sender || sender_nonce)[12:]`
2. A successful execution ends with initcode executing `RETURNCONTRACT{deploy_container_index}(aux_data_offset, aux_data_size)` instruction (see below). After that:
- load deploy-contract from EOF subcontainer at `deploy_container_index` in the container from which `RETURNCONTRACT` is executed
- concatenate data section with `(aux_data_offset, aux_data_offset + aux_data_size)` memory segment and update data size in the header
- let `deployed_code_size` be updated deploy container size
- if `deployed_code_size > MAX_CODE_SIZE` instruction exceptionally aborts
- set `state[new_address].code` to the updated deploy container
7. deduct `200 * deployed_code_size` gas

**NOTE** Legacy contract and legacy creation transactions may not deploy EOF code, that is behavior from [EIP-3541](https://eips.ethereum.org/EIPS/eip-3541) is not modified.

### New Behavior
@@ -198,13 +217,12 @@ The following instructions are introduced in EOF code:
- perform (and charge for) memory expansion using `[input_offset, input_size]`
- load initcode EOF subcontainer at `initcontainer_index` in the container from which `EOFCREATE` is executed
- let `initcontainer` be that EOF container, and `initcontainer_size` its length in bytes
- deduct `6 * ((initcontainer_size + 31) // 32)` gas (hashing charge)
- check call depth limit and whether caller balance is enough to transfer `value`
- in case of failure returns 0 on the stack, caller's nonce is not updated and gas for initcode execution is not consumed.
- caller's memory slice [`input_offset`:`input_size`] is used as calldata
- execute the container and deduct gas for execution. The 63/64th rule from EIP-150 applies.
- increment `sender` account's nonce
- calculate `new_address` as `keccak256(0xff || sender || salt || keccak256(initcontainer))[12:]`
- calculate `new_address` as `keccak256(0xff || sender || salt)[12:]`
- behavior on `accessed_addresses` and address collision is the same as `CREATE2` (rules for `CREATE2` from [EIP-684](https://eips.ethereum.org/EIPS/eip-684) and [EIP-2929](https://eips.ethereum.org/EIPS/eip-2929) apply to `EOFCREATE`)
- an unsuccessful execution of initcode results in pushing `0` onto the stack
- can populate returndata if execution `REVERT`ed
@@ -216,11 +234,26 @@ The following instructions are introduced in EOF code:
- set `state[new_address].code` to the updated deploy container
- push `new_address` onto the stack
- deduct `200 * deployed_code_size` gas
- `TXCREATE (0xed)` instruction
- Works the same as `EOFCREATE` except:
- does not have `initcontainer_index` immediate
- pops one more value from the stack (first argument): `tx_initcode_hash`
- loads the initcode EOF container from the transaction `initcodes` array which hashes to `tx_initcode_hash`
- fails (returns 0 on the stack) if such initcode does not exist in the transaction, or if called from a transaction of `TransactionType` other than `INITCODE_TX_TYPE`
- caller's nonce is not updated and gas for initcode execution is not consumed. Only `TXCREATE` constant gas was consumed
- let `initcontainer` be that EOF container, and `initcontainer_size` its length in bytes
- deduct `2 * ((initcontainer_size + 31) // 32)` gas (EIP-3860 charge)
- just before executing the initcode container:
- **validates the initcode container and all its subcontainers recursively**
- validation includes checking that the `initcontainer` does not contain `RETURN` or `STOP`
- in addition to this, checks if the initcode container has its `len(data_section)` equal to `data_size`, i.e. data section content is exactly as the size declared in the header (see [Data section lifecycle](#data-section-lifecycle))
- fails (returns 0 on the stack) if any of those was invalid
- caller’s nonce is not updated and gas for initcode execution is not consumed. Only `TXCREATE` constant and EIP-3860 gas were consumed
- `RETURNCONTRACT (0xee)` instruction
- loads `uint8` immediate `deploy_container_index`
- pops two values from the stack: `aux_data_offset`, `aux_data_size` referring to memory section that will be appended to deployed container's data
- cost 0 gas + possible memory expansion for aux data
- ends initcode frame execution and returns control to `EOFCREATE` caller frame (unless called in the topmost frame of a creation transaction).
- ends initcode frame execution and returns control to `EOFCREATE` or `TXCREATE` caller frame.
- `deploy_container_index` and `aux_data` are used to construct deployed contract (see above)
- instruction exceptionally aborts if after the appending, data section size would overflow the maximum data section size or underflow (i.e. be less than data section size declared in the header)
- `DATALOAD (0xd0)` instruction
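
As a rough illustration of the `TXCREATE` lookup and EIP-3860 charge listed above, the Python sketch below models the steps that happen before the initcode container is validated and executed. `find_tx_initcode`, `initcode_word_charge` and the `hash_fn` parameter are hypothetical names. The hash function is passed in by the caller: the text only says the container "hashes to" `tx_initcode_hash` (presumably Keccak-256), and Python's standard library has no Keccak-256.

```python
from typing import Callable, List, Optional

INITCODE_TX_TYPE = 0x05


def initcode_word_charge(initcontainer_size: int) -> int:
    """EIP-3860 style charge: 2 gas per 32-byte word, rounded up."""
    return 2 * ((initcontainer_size + 31) // 32)


def find_tx_initcode(tx_type: int,
                     initcodes: List[bytes],
                     tx_initcode_hash: bytes,
                     hash_fn: Callable[[bytes], bytes]) -> Optional[bytes]:
    """Return the matching initcode, or None if TXCREATE must push 0.

    None covers both failure cases from the list above: a transaction of
    another type, or no entry in `initcodes` with the requested hash.
    """
    if tx_type != INITCODE_TX_TYPE:
        return None
    for initcode in initcodes:
        if hash_fn(initcode) == tx_initcode_hash:
            return initcode
    return None
```

In this reading, a `None` result corresponds to the case where only the `TXCREATE` constant gas has been consumed, while `initcode_word_charge` is deducted only once a container has been found.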
@@ -348,6 +381,40 @@ During scanning, for each instruction:

Annotated examples of EOF formatted containers demonstrating several key features of EOF can be found in [this test file within the `evmone` project repository](https://github.com/ethereum/evmone/blob/master/test/unittests/eof_example_test.cpp).

## Appendix: Creator Contract

```solidity
{
    /// Takes [tx_initcode_hash][salt][init_data] as input,
    /// creates contract and returns the address or failure otherwise

    // init_data.length can be 0, but the first 2 words are mandatory
    let size := calldatasize()
    if lt(size, 64) { revert(0, 0) }

    let tx_initcode_hash := calldataload(0)
    let salt := calldataload(32)
Contributor

@shemnon shemnon Jan 30, 2025

I think we should keccak the tx_initcode_hash with the salt. Having salt only deployment as the principal toehold feels like opening up to a race, with no connection to the deployed code. Subsequent ERC contracts can be deployed that do keep this feature, but if the salt only contract goes first the ERC contract set could get front run by literally anyone.

Notionally keccak(0xef0001 || tx_initcode_hash || salt)

Member Author

yup, same realization in the "big thread". initcode_hash missing would actually not only open a race but break the feature of cross-chain deployments to same address, with no option of recovery! Will fix in a second.

is the addition of 0xef0001 magic just for good measure or do you have a scenario where a collision can be crafted somehow?

Member Author

I'm not a huge fan of the particular choice of magic value coinciding with EOF magic value. For now I'll leave it out for brevity, but I'll add a ticket to track figuring this out properly when time comes

Contributor

It can be another magic value, but I would like the address mining to be in its own "namespace" and adding a fixed unique value does that. Any mini-salt is fine.

Member Author

ah, good point. How do we pick such "namespace" though... 0xff again (and rely on different sizes of pre-image). Or do we want to be creative? I'll put in 0xff to publish and let's work from there

Contributor

@shemnon shemnon Jan 31, 2025

My idea with 0xef0001 is that it is the preamble to EOF.

For the ERCs I plan on 0x87650A, 0x87650B, and 0x87650C assuming 8765 is the ERC number.

So it could be the EIP number in hex as an option too.

> Having salt only deployment as the principal toehold feels like opening up to a race, with no connection to the deployed code.

The same can be said about init_data, right? Seems like the salt should be the hash of all of calldata for this to be secure.

Contributor

If the contract is being called via EXTDELEGATECALL, the caller address is also hashed into deploy address automatically, doesn't this alleviate front-running risk?

Yes, is that the expected usage? EOAs are not able to use EXTDELEGATECALL (they can with EIP-7702 but we can't assume that of every EOA). I wouldn't make this contract insecure if used via normal EXTCALL.


    mstore8(0, 0xff) // a magic value to ensure a specific preimage space
Contributor

I'm not convinced about the necessity of an extra magic value here; there's already another 0xff prepended in the final hashing of TXCREATE itself, isn't that enough?

    calldatacopy(1, 0, 64) // copy tx_initcode_hash and salt to memory to hash
    let final_salt := keccak256(0, 65)

    let init_data_size := sub(size, 64)
    calldatacopy(0, 64, init_data_size)

    let ret := txcreate(tx_initcode_hash, callvalue(), final_salt, 0, init_data_size)
    if iszero(ret) { revert(0, 0) }

    mstore(0, ret)
    return(0, 32)

    // Helper to compile this with existing Solidity (with --strict-assembly mode)
    function txcreate(a, b, c, d, e) -> f {
        f := verbatim_5i_1o(hex"ed", a, b, c, d, e)
    }
}
```
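
The salt handling in the Creator Contract above can be mirrored off-chain with a short Python sketch, which may help when predicting deployment addresses. All function names are hypothetical. The hash is supplied by the caller (Python's standard library has no Keccak-256), and the width of `sender` in the address preimage is an assumption, since the spec text only writes `keccak256(0xff || sender || salt)[12:]`.

```python
from typing import Callable, Tuple


def split_creator_calldata(calldata: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split [tx_initcode_hash][salt][init_data]; the first two words are mandatory."""
    if len(calldata) < 64:
        raise ValueError("calldata shorter than 64 bytes: the contract would revert")
    return calldata[:32], calldata[32:64], calldata[64:]


def final_salt_preimage(tx_initcode_hash: bytes, salt: bytes) -> bytes:
    """65-byte preimage hashed by the contract: 0xff || tx_initcode_hash || salt."""
    return b"\xff" + tx_initcode_hash + salt


def predicted_address(sender: bytes, final_salt: bytes,
                      hash_fn: Callable[[bytes], bytes]) -> bytes:
    """Address per the TXCREATE rule, given a caller-supplied keccak256.

    Assumption: `sender` is passed in whatever byte width the final spec
    requires (a 20-byte address is the natural guess).
    """
    return hash_fn(b"\xff" + sender + final_salt)[12:]
```

Because `tx_initcode_hash` is folded into `final_salt`, two different initcodes deployed with the same user-chosen salt end up at different addresses, which is the property the review thread asks for.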

pdobacz marked this conversation as resolved.
## Appendix: Original EIPs

These are the individual EIPs which evolved into this spec.
@@ -362,4 +429,3 @@ These are the individual EIPs which evolved into this spec.
- 📃[EIP-663](https://eips.ethereum.org/EIPS/eip-663): Unlimited SWAP and DUP instructions [_history_](https://github.com/ethereum/EIPs/commits/master/EIPS/eip-663.md)
- 📃[EIP-7069](https://eips.ethereum.org/EIPS/eip-7069): Revamped CALL instructions (*does not require EOF*) [_history_](https://github.com/ethereum/EIPs/commits/master/EIPS/eip-7069.md)
- 📃[EIP-7620](https://eips.ethereum.org/EIPS/eip-7620): EOF - Contract Creation Instructions [_history_](https://github.com/ethereum/EIPs/commits/master/EIPS/eip-7620.md)
- 📃[EIP-7698](https://eips.ethereum.org/EIPS/eip-7698): EOF - Creation transaction [_history_](https://github.com/ethereum/EIPs/commits/master/EIPS/eip-7698.md)
59 changes: 1 addition & 58 deletions spec/eof_future_upgrades.md
@@ -2,61 +2,4 @@

**This document gathers the designs which were excluded from the [Mega EOF spec](./eof.md), i.e. will not be a part of the first EOF release. They are planned to be introduced in a future upgrade.**

# `TXCREATE` and `InitcodeTransaction`

## Transaction Types

Introduce new transaction type `InitcodeTransaction` which extends EIP-1559 (type 2) transaction by adding a new field `initcodes: List[ByteList[MAX_INITCODE_SIZE], MAX_INITCODE_COUNT]`.

The `initcodes` can only be accessed via the `TXCREATE` instruction (see below), therefore `InitcodeTransactions` are intended to be sent to contracts including `TXCREATE` in their execution.

Under transaction validation rules `initcodes` are not validated for conforming to the EOF specification. They are only validated when accessed via `TXCREATE`. This avoids potential DoS attacks of the mempool. If during the execution of an `InitcodeTransaction` no `TXCREATE` instruction is called, such transaction is still valid.

`initcodes` data is similar to calldata for two reasons:
1) It must be fully transmitted in the transaction.
2) It is accessible to the EVM, but it can't be fully loaded into EVM memory.

For these reasons, define cost of each of the `initcodes` items same as calldata (16 gas for non-zero bytes, 4 for zero bytes -- see EIP-2028). The intrinsic gas of an `InitcodeTransaction` is extended by the sum of all those items' costs.

EIP-3860 and EIP-170 still apply, i.e. `MAX_CODE_SIZE` as 24576, `MAX_INITCODE_SIZE` as `2 * MAX_CODE_SIZE`. Define `MAX_INITCODE_COUNT` as 256.

`InitcodeTransaction` is invalid if either:
- there are more than `MAX_INITCODE_COUNT` entries in `initcodes`
- `initcodes` is an empty array
- length of any entry in `initcodes` exceeds `MAX_INITCODE_SIZE`
- any entry in `initcodes` has zero length
- the `to` is `nil`

#### RLP and signature

Given the definitions from [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718) the `TransactionPayload` for an `InitcodeTransaction` is the RLP serialization of:

```
[chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes, y_parity, r, s]
```

`TransactionType` is `INITCODE_TX_TYPE` (`0x04`) and the signature values `y_parity`, `r`, and `s` are calculated by constructing a secp256k1 signature over the following digest:

```
keccak256(INITCODE_TX_TYPE || rlp([chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes]))
```

The [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718) `ReceiptPayload` for this transaction is `rlp([status, cumulative_transaction_gas_used, logs_bloom, logs])`.

### New Behavior

- `TXCREATE (0xed)` instruction
- Works the same as `EOFCREATE` except:
- does not have `initcontainer_index` immediate
- pops one more value from the stack (first argument): `tx_initcode_hash`
- loads the initcode EOF container from the transaction `initcodes` array which hashes to `tx_initcode_hash`
- fails (returns 0 on the stack) if such initcode does not exist in the transaction, or if called from a transaction of `TransactionType` other than `INITCODE_TX_TYPE`
- caller's nonce is not updated and gas for initcode execution is not consumed. Only `TXCREATE` constant gas was consumed
- let `initcontainer` be that EOF container, and `initcontainer_size` its length in bytes
- in addition to hashing charge as in `EOFCREATE`, deducts `2 * ((initcontainer_size + 31) // 32)` gas (EIP-3860 charge)
- just before executing the initcode container:
- **validates the initcode container and all its subcontainers recursively**
- validation includes checking that the `initcontainer` does not contain `RETURN` or `STOP`
- in addition to this, checks if the initcode container has its `len(data_section)` equal to `data_size`, i.e. data section content is exactly as the size declared in the header (see [Data section lifecycle](#data-section-lifecycle))
- fails (returns 0 on the stack) if any of those was invalid
- caller’s nonce is not updated and gas for initcode execution is not consumed. Only `TXCREATE` constant, EIP-3860 gas and hashing gas were consumed
(nothing here so far)