@jorgecuesta

This PR tunes the hot paths in ring signature Sign/Verify without changing any existing public APIs.

What changed

  • Precompute H_p(P_i) in Ring and reuse in Sign/Verify loops.
  • Use pooled scratch for the c[] challenge slice to cut GC churn.
  • Challenge fast-path: single buffer with types.PointEncodeInto (falls back to Encode()).
  • Keep PublicKeys() copy semantics, add PublicKeysRef() for zero-copy reads.
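
The first bullet can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `point`, `hashToPoint`, and `newRing` are hypothetical names, and a plain SHA-256 digest stands in for the real hash-to-curve function.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// point is a hypothetical stand-in for types.Point; a real ring stores curve points.
type point [32]byte

// hashToPoint is a placeholder for H_p. It is a plain SHA-256 digest,
// NOT a real hash-to-curve, and is used only to make the sketch runnable.
func hashToPoint(p point) point {
	return sha256.Sum256(p[:])
}

// ring caches H_p(P_i) next to each public key.
type ring struct {
	pubkeys []point
	hp      []point // hp[i] == hashToPoint(pubkeys[i])
}

// newRing computes every H_p(P_i) once, at construction time.
func newRing(pubkeys []point) *ring {
	r := &ring{
		pubkeys: append([]point(nil), pubkeys...), // defensive copy of the input
		hp:      make([]point, len(pubkeys)),
	}
	for i, pk := range r.pubkeys {
		r.hp[i] = hashToPoint(pk)
	}
	return r
}

func main() {
	r := newRing([]point{{1}, {2}, {3}})
	// A Sign/Verify-style loop now does a slice lookup instead of re-hashing each key.
	for i := range r.pubkeys {
		fmt.Printf("hp[%d] = %x...\n", i, r.hp[i][:4])
	}
}
```

The trade-off in this sketch is one extra slice held for the lifetime of the ring in exchange for no hashing inside the per-member loop.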

Why

  • The Verify loop previously recomputed H_p(P_i) and allocated on each iteration.
  • Challenge construction used nested appends and temporary slices.
  • These changes remove per-iteration work and reduce allocations.
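
One common way to remove a per-call slice allocation like the `c[]` challenge slice is a `sync.Pool`. The sketch below is an illustration only: `scalar`, `challengePool`, `getChallenges`, and `putChallenges` are hypothetical names, and the PR's actual pooling code may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// scalar is a hypothetical stand-in for the library's scalar type.
type scalar [32]byte

// challengePool hands out reusable []scalar buffers. Pointers to slices are
// stored so that Put does not allocate a new interface value for the header.
var challengePool = sync.Pool{
	New: func() interface{} {
		s := make([]scalar, 0, 64)
		return &s
	},
}

// getChallenges returns a zeroed slice of length n, reusing pooled capacity when it fits.
func getChallenges(n int) *[]scalar {
	sp := challengePool.Get().(*[]scalar)
	if cap(*sp) < n {
		*sp = make([]scalar, n) // pooled buffer too small: allocate once at the new size
		return sp
	}
	*sp = (*sp)[:n]
	for i := range *sp {
		(*sp)[i] = scalar{} // clear values left over from a previous use
	}
	return sp
}

// putChallenges returns the buffer to the pool for later reuse.
func putChallenges(sp *[]scalar) {
	*sp = (*sp)[:0]
	challengePool.Put(sp)
}

func main() {
	c := getChallenges(32)
	(*c)[0][0] = 0xff // pretend to fill in a challenge
	putChallenges(c)

	c2 := getChallenges(16)
	fmt.Println(len(*c2), (*c2)[0][0])
	putChallenges(c2)
}
```

Whether a given `Get` actually reuses an earlier buffer is up to the runtime, so the sketch zeroes the slice on every checkout rather than relying on pool behavior.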

Results (Ryzen 9 5950X, Go 1.24.3, linux/amd64)

Secp256k1

  • Sign32: 11.70ms → 10.60ms (~9.4%)
  • Sign64: 22.93ms → 21.02ms (~8.3%)
  • Verify32: 11.32ms → 10.32ms (~8.9%)
  • Verify64: 22.91ms → 20.69ms (~9.7%)
  • Allocations: typically 15–30% lower (e.g., Verify32 843 → 682 allocs)

Ed25519

  • Verify32: 4.80ms → 4.41ms (~8.0%)
  • Sign32: 4.96ms → 4.62ms (~7.0%)
  • Allocations drop similarly across sizes.
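
The percentages quoted here are relative reductions, `(before - after) / before`. A small sketch that recomputes two of them from the raw benchmark numbers in this PR (`pctLower` is a helper name chosen for this example):

```go
package main

import "fmt"

// pctLower returns how much lower after is than before, as a percentage of before.
func pctLower(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// ns/op for BenchmarkSign32_Secp256k1, before and after.
	fmt.Printf("Sign32 secp256k1 time:     %.1f%%\n", pctLower(11699021, 10596642)) // 9.4%
	// allocs/op for BenchmarkVerify32_Secp256k1, before and after.
	fmt.Printf("Verify32 secp256k1 allocs: %.1f%%\n", pctLower(843, 682)) // 19.1%
}
```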

Full before/after benchmark output

**Before**
```text
goos: linux
goarch: amd64
pkg: github.com/pokt-network/ring-go
cpu: AMD Ryzen 9 5950X 16-Core Processor
BenchmarkSign2_Secp256k1-32 1174 1,026,322 ns/op 5,021 B/op 84 allocs/op
BenchmarkSign4_Secp256k1-32 674 1,733,693 ns/op 8,536 B/op 140 allocs/op
BenchmarkSign8_Secp256k1-32 373 3,251,426 ns/op 15,567 B/op 252 allocs/op
BenchmarkSign16_Secp256k1-32 201 6,046,313 ns/op 29,633 B/op 476 allocs/op
BenchmarkSign32_Secp256k1-32 92 11,699,021 ns/op 57,812 B/op 925 allocs/op
BenchmarkSign64_Secp256k1-32 49 22,926,795 ns/op 114,472 B/op 1,825 allocs/op
BenchmarkSign128_Secp256k1-32 24 45,918,672 ns/op 227,992 B/op 3,634 allocs/op
BenchmarkVerify2_Secp256k1-32 1726 725,005 ns/op 3,404 B/op 53 allocs/op
BenchmarkVerify4_Secp256k1-32 849 1,384,662 ns/op 6,814 B/op 105 allocs/op
BenchmarkVerify8_Secp256k1-32 423 2,760,680 ns/op 13,646 B/op 209 allocs/op
BenchmarkVerify16_Secp256k1-32 210 5,600,718 ns/op 27,366 B/op 419 allocs/op
BenchmarkVerify32_Secp256k1-32 104 11,323,208 ns/op 55,044 B/op 843 allocs/op
BenchmarkVerify64_Secp256k1-32 49 22,912,945 ns/op 111,577 B/op 1,708 allocs/op
BenchmarkVerify128_Secp256k1-32 24 47,007,356 ns/op 228,492 B/op 3,502 allocs/op

BenchmarkSign2_Ed25519-32 2856 417,582 ns/op 4,672 B/op 70 allocs/op
BenchmarkSign4_Ed25519-32 1622 721,830 ns/op 8,032 B/op 119 allocs/op
BenchmarkSign8_Ed25519-32 888 1,305,862 ns/op 14,706 B/op 214 allocs/op
BenchmarkSign16_Ed25519-32 481 2,494,322 ns/op 28,041 B/op 403 allocs/op
BenchmarkSign32_Ed25519-32 241 4,961,154 ns/op 54,882 B/op 791 allocs/op
BenchmarkSign64_Ed25519-32 123 9,775,632 ns/op 109,027 B/op 1,579 allocs/op
BenchmarkSign128_Ed25519-32 57 19,487,283 ns/op 216,505 B/op 3,103 allocs/op
BenchmarkVerify2_Ed25519-32 3784 294,033 ns/op 3,217 B/op 44 allocs/op
BenchmarkVerify4_Ed25519-32 1953 584,609 ns/op 6,436 B/op 87 allocs/op
BenchmarkVerify8_Ed25519-32 975 1,209,195 ns/op 12,993 B/op 180 allocs/op
BenchmarkVerify16_Ed25519-32 500 2,421,635 ns/op 25,969 B/op 356 allocs/op
BenchmarkVerify32_Ed25519-32 246 4,801,361 ns/op 51,904 B/op 704 allocs/op
BenchmarkVerify64_Ed25519-32 121 9,620,050 ns/op 104,307 B/op 1,407 allocs/op
BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op
```

**After**
```text
goos: linux
goarch: amd64
pkg: github.com/pokt-network/ring-go
cpu: AMD Ryzen 9 5950X 16-Core Processor
BenchmarkSign2_Secp256k1-32 1224 995,215 ns/op 4,254 B/op 75 allocs/op
BenchmarkSign4_Secp256k1-32 724 1,629,372 ns/op 6,985 B/op 121 allocs/op
BenchmarkSign8_Secp256k1-32 409 2,901,838 ns/op 12,447 B/op 213 allocs/op
BenchmarkSign16_Secp256k1-32 218 5,481,979 ns/op 23,384 B/op 397 allocs/op
BenchmarkSign32_Secp256k1-32 97 10,596,642 ns/op 45,218 B/op 767 allocs/op
BenchmarkSign64_Secp256k1-32 49 21,016,084 ns/op 89,338 B/op 1,510 allocs/op
BenchmarkSign128_Secp256k1-32 25 42,112,324 ns/op 178,167 B/op 3,010 allocs/op
BenchmarkVerify2_Secp256k1-32 1854 646,071 ns/op 2,643 B/op 43 allocs/op
BenchmarkVerify4_Secp256k1-32 928 1,259,166 ns/op 5,249 B/op 85 allocs/op
BenchmarkVerify8_Secp256k1-32 454 2,522,735 ns/op 10,494 B/op 169 allocs/op
BenchmarkVerify16_Secp256k1-32 231 5,053,713 ns/op 21,021 B/op 339 allocs/op
BenchmarkVerify32_Secp256k1-32 112 10,319,757 ns/op 42,287 B/op 682 allocs/op
BenchmarkVerify64_Secp256k1-32 55 20,688,756 ns/op 85,595 B/op 1,381 allocs/op
BenchmarkVerify128_Secp256k1-32 26 42,974,759 ns/op 175,867 B/op 2,839 allocs/op

BenchmarkSign2_Ed25519-32 2902 407,265 ns/op 3,513 B/op 56 allocs/op
BenchmarkSign4_Ed25519-32 1677 702,718 ns/op 5,840 B/op 94 allocs/op
BenchmarkSign8_Ed25519-32 940 1,243,208 ns/op 10,326 B/op 159 allocs/op
BenchmarkSign16_Ed25519-32 502 2,346,752 ns/op 19,381 B/op 294 allocs/op
BenchmarkSign32_Ed25519-32 262 4,615,297 ns/op 37,654 B/op 574 allocs/op
BenchmarkSign64_Ed25519-32 133 9,010,359 ns/op 74,074 B/op 1,113 allocs/op
BenchmarkSign128_Ed25519-32 56 18,047,776 ns/op 147,841 B/op 2,223 allocs/op
BenchmarkVerify2_Ed25519-32 4135 275,322 ns/op 2,178 B/op 31 allocs/op
BenchmarkVerify4_Ed25519-32 2133 547,658 ns/op 4,334 B/op 61 allocs/op
BenchmarkVerify8_Ed25519-32 1076 1,096,213 ns/op 8,649 B/op 121 allocs/op
BenchmarkVerify16_Ed25519-32 534 2,203,613 ns/op 17,300 B/op 241 allocs/op
BenchmarkVerify32_Ed25519-32 262 4,414,711 ns/op 34,675 B/op 484 allocs/op
BenchmarkVerify64_Ed25519-32 133 8,848,523 ns/op 69,692 B/op 973 allocs/op
BenchmarkVerify128_Ed25519-32 64 18,091,954 ns/op 141,032 B/op 1,969 allocs/op
```

Notes

  • Interfaces are preserved. types.PointEncodeInto is used when available; otherwise we fall back to Encode().
  • All tests pass locally.
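
The fallback described in the first note follows the usual Go pattern of probing for an optional interface with a type assertion. The sketch below is an assumption-laden illustration: the `encoderInto` interface and its `EncodeInto` method are hypothetical, and the real `types.PointEncodeInto` may have a different shape.

```go
package main

import "fmt"

// encoder mirrors the always-available path: Encode allocates a fresh slice.
type encoder interface {
	Encode() []byte
}

// encoderInto is a hypothetical optional interface for the fast path.
// EncodeInto appends the encoding to dst and returns the extended slice.
type encoderInto interface {
	EncodeInto(dst []byte) []byte
}

// appendPoint uses the fast path when the point supports it, else falls back to Encode().
func appendPoint(buf []byte, p encoder) []byte {
	if fast, ok := p.(encoderInto); ok {
		return fast.EncodeInto(buf)
	}
	return append(buf, p.Encode()...)
}

// slowPoint only implements Encode.
type slowPoint struct{ b byte }

func (p slowPoint) Encode() []byte { return []byte{p.b, p.b} }

// fastPoint implements both Encode and EncodeInto.
type fastPoint struct{ b byte }

func (p fastPoint) Encode() []byte               { return []byte{p.b, p.b} }
func (p fastPoint) EncodeInto(dst []byte) []byte { return append(dst, p.b, p.b) }

// buildChallengeInput writes the message and all points into one buffer sized up front.
func buildChallengeInput(msg []byte, pts ...encoder) []byte {
	buf := make([]byte, 0, len(msg)+2*len(pts))
	buf = append(buf, msg...)
	for _, p := range pts {
		buf = appendPoint(buf, p)
	}
	return buf
}

func main() {
	out := buildChallengeInput([]byte("m"), slowPoint{1}, fastPoint{2})
	fmt.Printf("%x\n", out) // prints 6d01010202
}
```

Both paths produce the same bytes, so a point type that lacks the fast method still yields an identical challenge input; it just pays for the temporary slice.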

@jorgecuesta jorgecuesta force-pushed the perf/ringsig-verify-allocs-and-challenge-fastpath branch from 0858c33 to 2f93470 Compare September 19, 2025 03:57
@Olshansk Olshansk self-assigned this Sep 19, 2025
@Olshansk Olshansk requested review from Olshansk, noot and red-0ne and removed request for noot September 19, 2025 23:32
@Olshansk Olshansk added the enhancement New feature or request label Sep 19, 2025
@Olshansk Olshansk added this to Shannon Sep 19, 2025
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Shannon Sep 19, 2025
@Olshansk Olshansk moved this from 📋 Backlog to 👀 In review in Shannon Sep 19, 2025
Member

@Olshansk Olshansk left a comment

There's A LOT of intricate logic in here, so it's hard to review without going into all the details.

That said, I'm sure it'll help with performance!

  1. Added @noot in case she has time to review.
  2. Requested a couple of minor changes

This will go very well hand-in-hand with some of the other changes we're working on here: pokt-network/shannon-sdk#47

BenchmarkVerify32_Ed25519-32 246 4,801,361 ns/op 51,904 B/op 704 allocs/op
BenchmarkVerify64_Ed25519-32 121 9,620,050 ns/op 104,307 B/op 1,407 allocs/op
BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op
Member

BenchmarkVerify128_Ed25519-32 60 19,432,527 ns/op 211,098 B/op 2,869 allocs/op


#### After (v0.2.0)

```bash
-------------------------
goos: linux
goarch: amd64

benchmarks.md Outdated
Comment on lines 74 to 77
```
Before (v0.1.0)
-------------------------
goos: linux
Member

Before (v0.1.0)

------------

benchmarks.md Outdated
Comment on lines 60 to 71
**Summary:**

- **Secp256k1**: ~**5–10% faster** across typical sizes. Examples:
- Sign32: **11.70ms → 10.60ms** (~**9.4%** faster)
- Sign64: **22.93ms → 21.02ms** (~**8.3%** faster)
- Verify32: **11.32ms → 10.32ms** (~**8.9%** faster)
- Verify64: **22.91ms → 20.69ms** (~**9.7%** faster)
- **Ed25519**: similar gains on average (**~5–8%** for common sizes).
- Verify32: **4.80ms → 4.41ms** (~**8.0%**)
- Sign32: **4.96ms → 4.62ms** (~**7.0%**)
- **Allocations**: consistently down **15–30%** depending on bench.
(e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**)
Member

Suggested change
**Summary:**
- **Secp256k1**: ~**5–10% faster** across typical sizes. Examples:
- Sign32: **11.70ms → 10.60ms** (~**9.4%** faster)
- Sign64: **22.93ms → 21.02ms** (~**8.3%** faster)
- Verify32: **11.32ms → 10.32ms** (~**8.9%** faster)
- Verify64: **22.91ms → 20.69ms** (~**9.7%** faster)
- **Ed25519**: similar gains on average (**~5–8%** for common sizes).
- Verify32: **4.80ms → 4.41ms** (~**8.0%**)
- Sign32: **4.96ms → 4.62ms** (~**7.0%**)
- **Allocations**: consistently down **15–30%** depending on bench.
(e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**)
#### Summary
**Secp256k1**: ~**5–10% faster** across typical sizes. Examples:
- Sign32: **11.70ms → 10.60ms** (~**9.4%** faster)
- Sign64: **22.93ms → 21.02ms** (~**8.3%** faster)
- Verify32: **11.32ms → 10.32ms** (~**8.9%** faster)
- Verify64: **22.91ms → 20.69ms** (~**9.7%** faster)
**Ed25519**: similar gains on average (**~5–8%** for common sizes).
- Verify32: **4.80ms → 4.41ms** (~**8.0%**)
- Sign32: **4.96ms → 4.62ms** (~**7.0%**)
**Allocations**: consistently down **15–30%** depending on bench.
(e.g., Verify32 secp256k1: **843 → 682 allocs**; Sign2 secp256k1: **84 → 75 allocs**)


toolchain go1.24.3

replace github.com/athanorlabs/go-dleq => github.com/jorgecuesta/go-dleq v0.0.0-20250918223310-7a1fc288336f
Member

Please add the following:

  1. TODO to remove this
  2. Link to the PR we are waiting to be approved
  3. Explain why a fork was needed

Author

  1. ok
  2. ok
  3. The PR explains itself, so repeating it here would be redundant, but the general idea is to reduce allocations.

pokt-network/go-dleq#2

ring.go Outdated
type Ring struct {
pubkeys []types.Point
curve types.Curve
// hp[i] = hashToCurve(pubkeys[i]); precomputed once to avoid recomputing in Sign/Verify loops.
Member

Remove?

Author

No, I'll just update it to explain why it exists and remove the part that looks like code.

ring.go Outdated
}

// PublicKeysRef returns a read-only view of the ring's public keys without copying.
// NOTE: Do not mutate the returned slice or its elements.
Member

What forces it to be read only?

Author

The wording is probably wrong. The point of this is to avoid cloning the PublicKey into a new object. Basically, it uses a reference to reduce allocations.
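
To make the question above concrete: in Go, nothing in the type system stops a caller from writing through a returned slice, so "read-only" can only be a convention. A minimal sketch of the difference between the two accessors, using a hypothetical `point` struct in place of `types.Point` and a simplified `ring`:

```go
package main

import "fmt"

// point is a hypothetical stand-in for types.Point.
type point struct{ x int }

// ring is a simplified stand-in for the Ring type in this PR.
type ring struct{ pubkeys []point }

// PublicKeys returns a copy, so callers may mutate the result freely.
func (r *ring) PublicKeys() []point {
	out := make([]point, len(r.pubkeys))
	copy(out, r.pubkeys)
	return out
}

// PublicKeysRef returns the internal slice: no allocation, but it aliases ring state.
func (r *ring) PublicKeysRef() []point {
	return r.pubkeys
}

func main() {
	r := &ring{pubkeys: []point{{1}, {2}}}

	r.PublicKeys()[0].x = 99    // writes to the copy only
	fmt.Println(r.pubkeys[0].x) // prints 1

	r.PublicKeysRef()[0].x = 99 // writes through to the ring
	fmt.Println(r.pubkeys[0].x) // prints 99
}
```

In this sketch the zero-copy accessor is only safe for callers that never write to the result, which is why the doc comment has to state that explicitly.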

Member

@Olshansk Olshansk left a comment

@pokt-network/ring-go can you reopen a pull request from ring-go proper?

I merged my most recent changes, so this will be a good addition.

Labels

enhancement New feature or request

Projects

Status: 👀 In review

2 participants