Alternative design for shift opcodes #10216

Open
lu-pinto wants to merge 8 commits into besu-eth:main from lu-pinto:shift-opcodes-alt-design
Contributor

@lu-pinto lu-pinto commented Apr 10, 2026

PR description

I was not convinced by the current design of the opcodes in EVM V2 from #10154, so I did some experimentation and would like to challenge it.
I managed to reach the same performance level while separating the arithmetic/bitwise computations from the opcodes themselves. Opcodes should be the ones fetching from and updating the stack, not the code that does the computation; the two should be strictly decoupled.

IMO the code looks much cleaner and is easier to read. It also benefits from reusing arithmetic that already exists in UInt256. In another PR I will also look at repurposing shl and shr for modular arithmetic, as I believe we might be able to reuse them.
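
To illustrate the split described above, here is a minimal sketch rather than the code in this PR. `PureShifts` and `ShlOpcodeSketch` are hypothetical names, and `BigInteger` with a `Deque` stands in for the real UInt256 and MessageFrame types. The arithmetic class never sees the stack, and the opcode class does nothing except check for underflow, pop, delegate and push.

```java
import java.math.BigInteger;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical arithmetic layer: pure functions with no stack access. */
final class PureShifts {
  static final BigInteger MOD = BigInteger.ONE.shiftLeft(256);

  /** Shifts value left by shift bits, truncated to 256 bits. Shifts of 256 or more give zero. */
  static BigInteger shl(final BigInteger shift, final BigInteger value) {
    // Operands are assumed non-negative; bitLength > 8 means shift >= 256.
    if (shift.bitLength() > 8) {
      return BigInteger.ZERO;
    }
    return value.shiftLeft(shift.intValue()).mod(MOD);
  }
}

/** Hypothetical opcode layer: the only code that touches the stack. */
final class ShlOpcodeSketch {
  /** Returns false on stack underflow; otherwise replaces both operands with the result. */
  static boolean execute(final Deque<BigInteger> stack) {
    if (stack.size() < 2) {
      return false;
    }
    final BigInteger shift = stack.pop(); // top of stack is the shift amount
    final BigInteger value = stack.pop(); // next item is the value to shift
    stack.push(PureShifts.shl(shift, value));
    return true;
  }

  public static void main(final String[] args) {
    final Deque<BigInteger> stack = new ArrayDeque<>();
    stack.push(BigInteger.valueOf(3)); // value
    stack.push(BigInteger.valueOf(4)); // shift
    execute(stack);
    System.out.println(stack.peek()); // 3 << 4
  }
}
```

With this shape, the arithmetic can be unit tested by calling `PureShifts.shl` directly, without building a frame or a stack.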

Performance stats:

| Test Case | Latency (ns) main@2d4f077c27 | Latency (ns) @065670ffe3 |
|---|---|---|
| SarV2_SHIFT_0 | 6.049 | 6.334 |
| SarV2_NEGATIVE_SHIFT_1 | 8.681 | 8.83 |
| SarV2_POSITIVE_SHIFT_1 | 7.969 | 8.312 |
| SarV2_ALL_BITS_SHIFT_1 | 8.518 | 8.804 |
| SarV2_NEGATIVE_SHIFT_128 | 6.796 | 7.223 |
| SarV2_NEGATIVE_SHIFT_255 | 6.996 | 7.546 |
| SarV2_POSITIVE_SHIFT_128 | 6.756 | 7.101 |
| SarV2_POSITIVE_SHIFT_255 | 6.765 | 6.959 |
| SarV2_OVERFLOW_SHIFT_256 | 6.848 | 7.222 |
| SarV2_OVERFLOW_LARGE_SHIFT | 6.954 | 7.356 |
| SarV2_FULL_RANDOM | 15.349 | 15.379 |
| ShlV2_SHIFT_0 | 5.785 | 6.362 |
| ShlV2_SHIFT_1 | 8.492 | 8.778 |
| ShlV2_SHIFT_128 | 7.149 | 7.105 |
| ShlV2_SHIFT_255 | 6.871 | 7.277 |
| ShlV2_OVERFLOW_SHIFT_256 | 6.647 | 7.698 |
| ShlV2_OVERFLOW_LARGE_SHIFT | 6.832 | 7.798 |
| ShlV2_FULL_RANDOM | 11.927 | 8.183 |
| ShrV2_SHIFT_0 | 5.817 | 6.357 |
| ShrV2_SHIFT_1 | 7.742 | 8.233 |
| ShrV2_SHIFT_128 | 6.833 | 6.975 |
| ShrV2_SHIFT_255 | 6.824 | 7.061 |
| ShrV2_OVERFLOW_SHIFT_256 | 6.642 | 7.69 |
| ShrV2_OVERFLOW_LARGE_SHIFT | 6.805 | 7.846 |
| ShrV2_FULL_RANDOM | 11.381 | 8.628 |

Issue(s)

#10131

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

@lu-pinto lu-pinto requested review from ahamlat and siladu and removed request for siladu April 10, 2026 09:23
@lu-pinto lu-pinto force-pushed the shift-opcodes-alt-design branch from da22999 to 82337b0 Compare April 10, 2026 09:24
public static OperationResult staticOperation(final MessageFrame frame) {
if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
frame.setTopV2(StackArithmetic.sar(stack, frame.stackTopV2()));
long[] _stack = frame.stackDataV2();
Contributor

I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

Comment on lines +51 to +52
if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
frame.setTopV2(StackArithmetic.shl(stack, frame.stackTopV2()));
long[] _stack = frame.stackDataV2();
Contributor

I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

public static OperationResult staticOperation(final MessageFrame frame) {
if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
frame.setTopV2(StackArithmetic.shr(stack, frame.stackTopV2()));
long[] _stack = frame.stackDataV2();
Contributor

I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

Contributor Author

Sure, it is a leftover from the previous method argument, from before I removed it.

|| shift.u2() != 0
|| shift.u1() != 0
|| Long.compareUnsigned(shift.u0(), 256) >= 0) {
bytesToShift = 256;
Contributor

I guess this should be bitsToShift?

Contributor

We use bitShift in the private methods below.

Contributor Author

@lu-pinto lu-pinto Apr 10, 2026

Yes, it is bits. Well spotted.

|| shift.u2() != 0
|| shift.u1() != 0
|| Long.compareUnsigned(shift.u0(), 256) >= 0) {
bytesToShift = 256;
Contributor

The same as above.
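
For reference, the clamp under review can be sketched in isolation as below. This is an illustration, not the PR's code: `ShiftClampSketch` is a hypothetical class, the limbs are passed as plain longs with u3 assumed to be the most significant, and the variable uses the bits-based name the thread settles on. The unsigned comparison matters because a limb with its top bit set is negative as a signed long, so a signed `u0 >= 256` check would let it through.

```java
/** Hypothetical helper: clamps a 256-bit shift amount to the range [0, 256]. */
final class ShiftClampSketch {
  /**
   * @param u3 most significant 64 bits of the shift amount (assumed limb order)
   * @param u2 second most significant 64 bits
   * @param u1 second least significant 64 bits
   * @param u0 least significant 64 bits
   * @return the shift in bits, or 256 when the amount is 256 or larger
   */
  static int bitsToShift(final long u3, final long u2, final long u1, final long u0) {
    if (u3 != 0 || u2 != 0 || u1 != 0 || Long.compareUnsigned(u0, 256) >= 0) {
      return 256; // any shift of 256 or more clears (or sign-fills) every bit
    }
    return (int) u0; // safe: u0 is in [0, 255] here
  }
}
```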

return new UInt256(w3, w2, w1, w0);
}

private static long shiftLeftWord(final long value, final long nextValue, final int bitShift) {
Contributor

Add javadoc.

return (value << bitShift) | (nextValue >>> (64 - bitShift));
}

private static long shiftRightWord(final long value, final long prevValue, final int bitShift) {
Contributor

Add javadoc.
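
As a starting point for that Javadoc, here is a possible sketch. The `shiftLeftWord` body matches the line quoted above. The `shiftRightWord` body is not shown in this excerpt, so the mirror-image version here is an assumption, as are the class name `WordShiftSketch` and the reading of nextValue and prevValue as the neighbouring less and more significant words. One thing worth documenting is the precondition on bitShift: Java masks a long shift distance to its low 6 bits, so with bitShift == 0 the `64 - bitShift` term shifts by 0 and the neighbouring word is ORed in unshifted.

```java
/** Hypothetical container for the two word-level helpers. */
final class WordShiftSketch {
  /**
   * Shifts one 64-bit word left and fills the vacated low bits from the neighbouring word.
   *
   * @param value the word being shifted
   * @param nextValue the next less significant word, whose top bits carry into the result
   * @param bitShift the shift within a word; must be in [1, 63], callers handle 0 separately
   * @return the shifted word with the carried-in bits
   */
  static long shiftLeftWord(final long value, final long nextValue, final int bitShift) {
    return (value << bitShift) | (nextValue >>> (64 - bitShift));
  }

  /**
   * Shifts one 64-bit word right (logically) and fills the vacated high bits from the
   * neighbouring word. Body assumed to mirror shiftLeftWord.
   *
   * @param value the word being shifted
   * @param prevValue the next more significant word, whose low bits carry into the result
   * @param bitShift the shift within a word; must be in [1, 63], callers handle 0 separately
   * @return the shifted word with the carried-in bits
   */
  static long shiftRightWord(final long value, final long prevValue, final int bitShift) {
    return (value >>> bitShift) | (prevValue << (64 - bitShift));
  }
}
```

If the callers already branch on a zero in-word shift, the precondition is only documentation; otherwise it points at a case that needs a test.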

final long[] s = new long[8];
writeLimbs(s, 0, valueVal);
writeLimbs(s, 4, shiftVal);
final UInt256 result = executor.execute(valueVal, shiftVal);
Contributor

👍 (this is a good argument that this design is better)

Contributor

@ahamlat ahamlat left a comment

I like the proposed design. I find it better and the code much cleaner. There is a small performance regression; could you double-check with multiple runs whether it is real, and investigate where it comes from?

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>
@lu-pinto lu-pinto force-pushed the shift-opcodes-alt-design branch from 82337b0 to f4ed77d Compare April 10, 2026 13:21
Signed-off-by: Luis Pinto <luis.pinto@consensys.net>
@lu-pinto
Contributor Author

> I like the proposed design, I find it better and the code much cleaner. There is a small performance regression, could you double check if it is real with multiple runs and investigate the origin.

Looked into it and optimised a little more, but I'm going to park it here. The worst cases (FULL_RANDOM) are now much closer or have improved significantly. IMO these are probably the most realistic ones.
The other cases are very hard to improve without hurting the worst case, because that is what I primarily optimised for.

Arguments.of(
"0x8000000000000000000000000000000000000000000000000000000000000000",
"0x100",
"0x0100",
Contributor

@ahamlat ahamlat Apr 10, 2026

Why did you make this change, and all the changes below, to the unit tests?

Contributor

Is it related to no longer using fromHexStringLenient?

Contributor Author

I changed fromHexStringLenient to fromHexString in the tests to make the hexadecimal values exact, so the reader does not have to guess whether a zero will be prepended. That is hard to know unless you know what lenient does. Since we are hardcoding the values, does it make sense to "disguise" them? For instance, 0x0 is half a byte, so it seems lenient would prepend a zero to complete the byte.
I can revert it if you feel strongly about it.
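
To make the difference concrete, here is a small sketch of the two parsing styles as described in this thread. It is not the project's or any library's implementation: `HexParseSketch`, `fromHexStrict` and `fromHexLenient` are hypothetical names, and the behaviour (strict rejects an odd number of digits, lenient left-pads with one zero digit) is an assumption based on the reply above.

```java
/** Hypothetical illustration of strict versus lenient hex parsing. */
final class HexParseSketch {
  /** Strict: requires an even number of hex digits after the optional 0x prefix. */
  static byte[] fromHexStrict(final String hex) {
    final String digits = stripPrefix(hex);
    if (digits.length() % 2 != 0) {
      throw new IllegalArgumentException("odd number of hex digits: " + hex);
    }
    return decode(digits);
  }

  /** Lenient: left-pads with a single zero digit when the digit count is odd. */
  static byte[] fromHexLenient(final String hex) {
    final String digits = stripPrefix(hex);
    return decode(digits.length() % 2 == 0 ? digits : "0" + digits);
  }

  private static String stripPrefix(final String hex) {
    return hex.startsWith("0x") ? hex.substring(2) : hex;
  }

  private static byte[] decode(final String digits) {
    final byte[] out = new byte[digits.length() / 2];
    for (int i = 0; i < out.length; i++) {
      final int hi = Character.digit(digits.charAt(2 * i), 16);
      final int lo = Character.digit(digits.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("not a hex digit in: " + digits);
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }
}
```

Under these assumptions, `fromHexLenient("0x100")` and `fromHexStrict("0x0100")` yield the same two bytes, which is why the test literal can change from "0x100" to "0x0100" without changing the value under test.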

2 participants