-
Notifications
You must be signed in to change notification settings - Fork 1
ring: reduce allocs and ~5–10% faster Sign/Verify (precomputed H_p, pooled scratch, EncodeInto) #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…oled scratch, EncodeInto, ensureHP)
0858c33 to
2f93470
Compare
Olshansk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's A LOT of intricate logic in here so hard to review without going into all the details.
That said, I'm sue it'll help with performance!
- Added @noot in case she has time to review.
- Requested a couple of minor changes
This will go very well hand-in-hand with some of the other changes we're working on here: pokt-network/shannon-sdk#47
| BenchmarkVerify32_Ed25519-32 246 4,801,361 ns/op 51,904 B/op 704 allocs/op | ||
| BenchmarkVerify64_Ed25519-32 121 9,620,050 ns/op 104,307 B/op 1,407 allocs/op | ||
| BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op
#### After (v0.2.0)
```bash
-------------------------
goos: linux
goarch: amd64
benchmarks.md
Outdated
| ``` | ||
| Before (v0.1.0) | ||
| ------------------------- | ||
| goos: linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before (v0.1.0)
------------
benchmarks.md
Outdated
| **Summary:** | ||
|
|
||
| - **Secp256k1**: ~**5–10% faster** across typical sizes. Examples: | ||
| - Sign32: **11.70ms → 10.60ms** (~**9.4%** faster) | ||
| - Sign64: **22.93ms → 21.02ms** (~**8.3%** faster) | ||
| - Verify32: **11.32ms → 10.32ms** (~**8.9%** faster) | ||
| - Verify64: **22.91ms → 20.69ms** (~**9.7%** faster) | ||
| - **Ed25519**: similar gains on average (**~5–8%** for common sizes). | ||
| - Verify32: **4.80ms → 4.41ms** (~**8.0%**) | ||
| - Sign32: **4.96ms → 4.62ms** (~**7.0%**) | ||
| - **Allocations**: consistently down **15–30%** depending on bench. | ||
| (e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| **Summary:** | |
| - **Secp256k1**: ~**5–10% faster** across typical sizes. Examples: | |
| - Sign32: **11.70ms → 10.60ms** (~**9.4%** faster) | |
| - Sign64: **22.93ms → 21.02ms** (~**8.3%** faster) | |
| - Verify32: **11.32ms → 10.32ms** (~**8.9%** faster) | |
| - Verify64: **22.91ms → 20.69ms** (~**9.7%** faster) | |
| - **Ed25519**: similar gains on average (**~5–8%** for common sizes). | |
| - Verify32: **4.80ms → 4.41ms** (~**8.0%**) | |
| - Sign32: **4.96ms → 4.62ms** (~**7.0%**) | |
| - **Allocations**: consistently down **15–30%** depending on bench. | |
| (e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**) | |
| #### Summary | |
| **Secp256k1**: ~**5–10% faster** across typical sizes. Examples: | |
| - Sign32: **11.70ms → 10.60ms** (~**9.4%** faster) | |
| - Sign64: **22.93ms → 21.02ms** (~**8.3%** faster) | |
| - Verify32: **11.32ms → 10.32ms** (~**8.9%** faster) | |
| - Verify64: **22.91ms → 20.69ms** (~**9.7%** faster) | |
| **Ed25519**: similar gains on average (**~5–8%** for common sizes). | |
| - Verify32: **4.80ms → 4.41ms** (~**8.0%**) | |
| - Sign32: **4.96ms → 4.62ms** (~**7.0%**) | |
| **Allocations**: consistently down **15–30%** depending on bench. | |
| (e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**) |
|
|
||
| toolchain go1.24.3 | ||
|
|
||
| replace github.com/athanorlabs/go-dleq => github.com/jorgecuesta/go-dleq v0.0.0-20250918223310-7a1fc288336f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the following:
- TODO to remove this
- Link to the PR we are waiting to be approved
- Explain why a fork was needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ok
- ok
- The PR explain it self, is redundant, but the general idea is to reduce allocations.
ring.go
Outdated
| type Ring struct { | ||
| pubkeys []types.Point | ||
| curve types.Curve | ||
| // hp[i] = hashToCurve(pubkeys[i]); precomputed once to avoid recomputing in Sign/Verify loops. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, just update to explain why it exists. Removing the part that looks like code.
ring.go
Outdated
| } | ||
|
|
||
| // PublicKeysRef returns a read-only view of the ring's public keys without copying. | ||
| // NOTE: Do not mutate the returned slice or its elements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What forces it to be read only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the wording is wrong, the point on this is avoid clone the PublicKey into a new object. Basically use Reference to reduce allocations.
Olshansk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pokt-network/ring-go can you reopen a pull request from ring-go proper?
I merged my most recent changes, so this will be a good addition.
This PR tunes the hot paths in ring signature Sign/Verify without changing any public APIs.
What changed
H_p(P_i)inRingand reuse in Sign/Verify loops.c[]challenge slice to cut GC churn.types.PointEncodeInto(falls back toEncode()).PublicKeys()copy semantics, addPublicKeysRef()for zero-copy reads.Why
H_p(P_i)and allocated on each iteration.Results (Ryzen 9 5950X, Go1.24.3 linux/amd64)
Secp256k1
Ed25519
Full before/after benchmark output
**Before** ```text goos: linux goarch: amd64 pkg: github.com/pokt-network/ring-go cpu: AMD Ryzen 9 5950X 16-Core Processor BenchmarkSign2_Secp256k1-32 1174 1,026,322 ns/op 5,021 B/op 84 allocs/op BenchmarkSign4_Secp256k1-32 674 1,733,693 ns/op 8,536 B/op 140 allocs/op BenchmarkSign8_Secp256k1-32 373 3,251,426 ns/op 15,567 B/op 252 allocs/op BenchmarkSign16_Secp256k1-32 201 6,046,313 ns/op 29,633 B/op 476 allocs/op BenchmarkSign32_Secp256k1-32 92 11,699,021 ns/op 57,812 B/op 925 allocs/op BenchmarkSign64_Secp256k1-32 49 22,926,795 ns/op 114,472 B/op 1,825 allocs/op BenchmarkSign128_Secp256k1-32 24 45,918,672 ns/op 227,992 B/op 3,634 allocs/op BenchmarkVerify2_Secp256k1-32 1726 725,005 ns/op 3,404 B/op 53 allocs/op BenchmarkVerify4_Secp256k1-32 849 1,384,662 ns/op 6,814 B/op 105 allocs/op BenchmarkVerify8_Secp256k1-32 423 2,760,680 ns/op 13,646 B/op 209 allocs/op BenchmarkVerify16_Secp256k1-32 210 5,600,718 ns/op 27,366 B/op 419 allocs/op BenchmarkVerify32_Secp256k1-32 104 11,323,208 ns/op 55,044 B/op 843 allocs/op BenchmarkVerify64_Secp256k1-32 49 22,912,945 ns/op 111,577 B/op 1,708 allocs/op BenchmarkVerify128_Secp256k1-32 24 47,007,356 ns/op 228,492 B/op 3,502 allocs/opBenchmarkSign2_Ed25519-32 2856 417,582 ns/op 4,672 B/op 70 allocs/op
BenchmarkSign4_Ed25519-32 1622 721,830 ns/op 8,032 B/op 119 allocs/op
BenchmarkSign8_Ed25519-32 888 1,305,862 ns/op 14,706 B/op 214 allocs/op
BenchmarkSign16_Ed25519-32 481 2,494,322 ns/op 28,041 B/op 403 allocs/op
BenchmarkSign32_Ed25519-32 241 4,961,154 ns/op 54,882 B/op 791 allocs/op
BenchmarkSign64_Ed25519-32 123 9,775,632 ns/op 109,027 B/op 1,579 allocs/op
BenchmarkSign128_Ed25519-32 57 19,487,283 ns/op 216,505 B/op 3,103 allocs/op
BenchmarkVerify2_Ed25519-32 3784 294,033 ns/op 3,217 B/op 44 allocs/op
BenchmarkVerify4_Ed25519-32 1953 584,609 ns/op 6,436 B/op 87 allocs/op
BenchmarkVerify8_Ed25519-32 975 1,209,195 ns/op 12,993 B/op 180 allocs/op
BenchmarkVerify16_Ed25519-32 500 2,421,635 ns/op 25,969 B/op 356 allocs/op
BenchmarkVerify32_Ed25519-32 246 4,801,361 ns/op 51,904 B/op 704 allocs/op
BenchmarkVerify64_Ed25519-32 121 9,620,050 ns/op 104,307 B/op 1,407 allocs/op
BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op
Notes
types.PointEncodeIntois used when available; otherwise we fall back toEncode().