Skip to content

Conversation

@rod-chapman
Copy link
Contributor

Conjecture: it is better for proof performance and stability to use signed "int" types for quantifiers, loop indexing expressions, for loop counters. This PR implements this change to gain data on its impact on proof and runtime performance.

Specifically, use signed "int" for:

  1. for loop counters
  2. quantified expressions
  3. variables and struct fields that count or index into arrays
  4. formal parameters for array indexes, offsets and counts

Additionally, strengthen pre-conditions on polyvecl_add() and polyveck_add() to stabilize proof of those functions for all parameter sets.

Opening a PR here to get Benchmark results.

@rod-chapman rod-chapman self-assigned this Nov 17, 2025
@rod-chapman rod-chapman requested a review from a team as a code owner November 17, 2025 05:53
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mac Mini (M1, 2020) benchmarks (opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 46496 cycles 46492 cycles 1.00
ML-DSA-44 sign 132781 cycles 132736 cycles 1.00
ML-DSA-44 verify 47846 cycles 47836 cycles 1.00
ML-DSA-65 keypair 81473 cycles 81463 cycles 1.00
ML-DSA-65 sign 219268 cycles 219207 cycles 1.00
ML-DSA-65 verify 80166 cycles 80117 cycles 1.00
ML-DSA-87 keypair 132624 cycles 132624 cycles 1
ML-DSA-87 sign 281142 cycles 281081 cycles 1.00
ML-DSA-87 verify 130398 cycles 130382 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mac Mini (M1, 2020) benchmarks (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 118925 cycles 115293 cycles 1.03
ML-DSA-44 sign 473353 cycles 430366 cycles 1.10
ML-DSA-44 verify 125791 cycles 122175 cycles 1.03
ML-DSA-65 keypair 203509 cycles 197217 cycles 1.03
ML-DSA-65 sign 775494 cycles 700477 cycles 1.11
ML-DSA-65 verify 203368 cycles 197631 cycles 1.03
ML-DSA-87 keypair 333054 cycles 325245 cycles 1.02
ML-DSA-87 sign 966363 cycles 883796 cycles 1.09
ML-DSA-87 verify 336501 cycles 328572 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Mac Mini (M1, 2020) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 118925 cycles 115293 cycles 1.03
ML-DSA-44 sign 473353 cycles 430366 cycles 1.10
ML-DSA-65 keypair 203509 cycles 197217 cycles 1.03
ML-DSA-65 sign 775494 cycles 700477 cycles 1.11
ML-DSA-87 sign 966363 cycles 883796 cycles 1.09

This comment was automatically generated by workflow using github-action-benchmark.

@mkannwischer
Copy link
Contributor

Turning into draft. Please mark it as ready once you are happy.

@mkannwischer mkannwischer marked this pull request as draft November 17, 2025 05:56
@hanno-becker
Copy link
Contributor

hanno-becker commented Nov 17, 2025

This change, if we want it, should be coordinated with mlkem-native. However, I seem to remember that we have done a round-trip on this one already on mlkem-native, changing to int and back. If there is a larger win to be had, OK (the CI should tell!), but otherwise I prefer not to change this again for now. There's a lot of issues open at the moment that we should work on first.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 35477 cycles 34971 cycles 1.01
ML-DSA-44 sign 120769 cycles 119790 cycles 1.01
ML-DSA-44 verify 38070 cycles 38125 cycles 1.00
ML-DSA-65 keypair 62581 cycles 62854 cycles 1.00
ML-DSA-65 sign 202130 cycles 201785 cycles 1.00
ML-DSA-65 verify 62658 cycles 62556 cycles 1.00
ML-DSA-87 keypair 94618 cycles 94404 cycles 1.00
ML-DSA-87 sign 232701 cycles 232302 cycles 1.00
ML-DSA-87 verify 94050 cycles 95034 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 97429 cycles 95284 cycles 1.02
ML-DSA-44 sign 361642 cycles 348188 cycles 1.04
ML-DSA-44 verify 103866 cycles 100631 cycles 1.03
ML-DSA-65 keypair 164406 cycles 163700 cycles 1.00
ML-DSA-65 sign 581400 cycles 565596 cycles 1.03
ML-DSA-65 verify 165510 cycles 164648 cycles 1.01
ML-DSA-87 keypair 272798 cycles 268448 cycles 1.02
ML-DSA-87 sign 745311 cycles 721228 cycles 1.03
ML-DSA-87 verify 276929 cycles 270809 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 sign 361642 cycles 348188 cycles 1.04
ML-DSA-44 verify 103866 cycles 100631 cycles 1.03
ML-DSA-87 sign 745311 cycles 721228 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 229445 cycles 226157 cycles 1.01
ML-DSA-44 sign 664813 cycles 671249 cycles 0.99
ML-DSA-44 verify 226948 cycles 228114 cycles 0.99
ML-DSA-65 keypair 397562 cycles 403995 cycles 0.98
ML-DSA-65 sign 1095420 cycles 1098524 cycles 1.00
ML-DSA-65 verify 382805 cycles 386494 cycles 0.99
ML-DSA-87 keypair 669766 cycles 660767 cycles 1.01
ML-DSA-87 sign 1457310 cycles 1452890 cycles 1.00
ML-DSA-87 verify 641781 cycles 646243 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 69626 cycles 69526 cycles 1.00
ML-DSA-44 sign 185280 cycles 185167 cycles 1.00
ML-DSA-44 verify 69350 cycles 69157 cycles 1.00
ML-DSA-65 keypair 119819 cycles 119298 cycles 1.00
ML-DSA-65 sign 297183 cycles 295971 cycles 1.00
ML-DSA-65 verify 115704 cycles 115271 cycles 1.00
ML-DSA-87 keypair 200395 cycles 202656 cycles 0.99
ML-DSA-87 sign 384705 cycles 386954 cycles 0.99
ML-DSA-87 verify 192822 cycles 194182 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton4

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 69885 cycles 69696 cycles 1.00
ML-DSA-44 sign 213811 cycles 213851 cycles 1.00
ML-DSA-44 verify 72442 cycles 72640 cycles 1.00
ML-DSA-65 keypair 123413 cycles 123343 cycles 1.00
ML-DSA-65 sign 350836 cycles 351128 cycles 1.00
ML-DSA-65 verify 120702 cycles 120764 cycles 1.00
ML-DSA-87 keypair 201178 cycles 201094 cycles 1.00
ML-DSA-87 sign 449802 cycles 449398 cycles 1.00
ML-DSA-87 verify 198402 cycles 198566 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 57401 cycles 56987 cycles 1.01
ML-DSA-44 sign 180088 cycles 179585 cycles 1.00
ML-DSA-44 verify 60897 cycles 61012 cycles 1.00
ML-DSA-65 keypair 100194 cycles 99883 cycles 1.00
ML-DSA-65 sign 298121 cycles 295718 cycles 1.01
ML-DSA-65 verify 100466 cycles 99685 cycles 1.01
ML-DSA-87 keypair 154404 cycles 154185 cycles 1.00
ML-DSA-87 sign 353119 cycles 352663 cycles 1.00
ML-DSA-87 verify 152739 cycles 152476 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton3

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 74059 cycles 74099 cycles 1.00
ML-DSA-44 sign 228347 cycles 228508 cycles 1.00
ML-DSA-44 verify 78024 cycles 78150 cycles 1.00
ML-DSA-65 keypair 130665 cycles 130698 cycles 1.00
ML-DSA-65 sign 378750 cycles 378877 cycles 1.00
ML-DSA-65 verify 129403 cycles 129288 cycles 1.00
ML-DSA-87 keypair 209763 cycles 212127 cycles 0.99
ML-DSA-87 sign 475574 cycles 479633 cycles 0.99
ML-DSA-87 verify 208830 cycles 208843 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 41970 cycles 44173 cycles 0.95
ML-DSA-44 sign 130506 cycles 132703 cycles 0.98
ML-DSA-44 verify 44403 cycles 44975 cycles 0.99
ML-DSA-65 keypair 72647 cycles 72772 cycles 1.00
ML-DSA-65 sign 211062 cycles 211979 cycles 1.00
ML-DSA-65 verify 73343 cycles 73416 cycles 1.00
ML-DSA-87 keypair 111281 cycles 110956 cycles 1.00
ML-DSA-87 sign 250781 cycles 251982 cycles 1.00
ML-DSA-87 verify 111299 cycles 110441 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 138183 cycles 135118 cycles 1.02
ML-DSA-44 sign 565278 cycles 540392 cycles 1.05
ML-DSA-44 verify 150945 cycles 147966 cycles 1.02
ML-DSA-65 keypair 233095 cycles 228775 cycles 1.02
ML-DSA-65 sign 928247 cycles 892245 cycles 1.04
ML-DSA-65 verify 241561 cycles 237901 cycles 1.02
ML-DSA-87 keypair 379658 cycles 375167 cycles 1.01
ML-DSA-87 sign 1163122 cycles 1110977 cycles 1.05
ML-DSA-87 verify 395517 cycles 389295 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 sign 565278 cycles 540392 cycles 1.05
ML-DSA-65 sign 928247 cycles 892245 cycles 1.04
ML-DSA-87 sign 1163122 cycles 1110977 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton2

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 116683 cycles 115793 cycles 1.01
ML-DSA-44 sign 379663 cycles 377845 cycles 1.00
ML-DSA-44 verify 121197 cycles 120399 cycles 1.01
ML-DSA-65 keypair 200502 cycles 200323 cycles 1.00
ML-DSA-65 sign 623400 cycles 623418 cycles 1.00
ML-DSA-65 verify 198456 cycles 198371 cycles 1.00
ML-DSA-87 keypair 327748 cycles 327812 cycles 1.00
ML-DSA-87 sign 791431 cycles 792354 cycles 1.00
ML-DSA-87 verify 325411 cycles 325174 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton4 (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 129372 cycles 128377 cycles 1.01
ML-DSA-44 sign 467582 cycles 456440 cycles 1.02
ML-DSA-44 verify 138206 cycles 136405 cycles 1.01
ML-DSA-65 keypair 222749 cycles 220594 cycles 1.01
ML-DSA-65 sign 763447 cycles 746628 cycles 1.02
ML-DSA-65 verify 224041 cycles 220336 cycles 1.02
ML-DSA-87 keypair 367910 cycles 365330 cycles 1.01
ML-DSA-87 sign 964782 cycles 943639 cycles 1.02
ML-DSA-87 verify 373028 cycles 369203 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 159919 cycles 157920 cycles 1.01
ML-DSA-44 sign 584037 cycles 564384 cycles 1.03
ML-DSA-44 verify 172383 cycles 169300 cycles 1.02
ML-DSA-65 keypair 274775 cycles 269505 cycles 1.02
ML-DSA-65 sign 967580 cycles 925083 cycles 1.05
ML-DSA-65 verify 279616 cycles 275869 cycles 1.01
ML-DSA-87 keypair 455704 cycles 451918 cycles 1.01
ML-DSA-87 sign 1215243 cycles 1182194 cycles 1.03
ML-DSA-87 verify 464075 cycles 460483 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 sign 584037 cycles 564384 cycles 1.03
ML-DSA-65 sign 967580 cycles 925083 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 124113 cycles 122429 cycles 1.01
ML-DSA-44 sign 485318 cycles 458938 cycles 1.06
ML-DSA-44 verify 135294 cycles 130854 cycles 1.03
ML-DSA-65 keypair 209706 cycles 206731 cycles 1.01
ML-DSA-65 sign 795264 cycles 743958 cycles 1.07
ML-DSA-65 verify 217003 cycles 211746 cycles 1.02
ML-DSA-87 keypair 347141 cycles 339781 cycles 1.02
ML-DSA-87 sign 1000928 cycles 933126 cycles 1.07
ML-DSA-87 verify 359099 cycles 347812 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 sign 485318 cycles 458938 cycles 1.06
ML-DSA-44 verify 135294 cycles 130854 cycles 1.03
ML-DSA-65 sign 795264 cycles 743958 cycles 1.07
ML-DSA-87 sign 1000928 cycles 933126 cycles 1.07
ML-DSA-87 verify 359099 cycles 347812 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton3 (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 140868 cycles 138837 cycles 1.01
ML-DSA-44 sign 511047 cycles 493118 cycles 1.04
ML-DSA-44 verify 151495 cycles 148639 cycles 1.02
ML-DSA-65 keypair 245060 cycles 242641 cycles 1.01
ML-DSA-65 sign 834600 cycles 809188 cycles 1.03
ML-DSA-65 verify 245377 cycles 240500 cycles 1.02
ML-DSA-87 keypair 400154 cycles 396800 cycles 1.01
ML-DSA-87 sign 1059923 cycles 1026947 cycles 1.03
ML-DSA-87 verify 407562 cycles 401645 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Graviton3 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 sign 511047 cycles 493118 cycles 1.04
ML-DSA-65 sign 834600 cycles 809188 cycles 1.03
ML-DSA-87 sign 1059923 cycles 1026947 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@oqs-bot oqs-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graviton2 (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 216054 cycles 214284 cycles 1.01
ML-DSA-44 sign 799465 cycles 782417 cycles 1.02
ML-DSA-44 verify 233645 cycles 229728 cycles 1.02
ML-DSA-65 keypair 386891 cycles 385454 cycles 1.00
ML-DSA-65 sign 1314452 cycles 1311169 cycles 1.00
ML-DSA-65 verify 377529 cycles 376493 cycles 1.00
ML-DSA-87 keypair 611139 cycles 607119 cycles 1.01
ML-DSA-87 sign 1661295 cycles 1625312 cycles 1.02
ML-DSA-87 verify 623517 cycles 617064 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 818983 cycles 825937 cycles 0.99
ML-DSA-44 sign 3280090 cycles 3342877 cycles 0.98
ML-DSA-44 verify 905584 cycles 919262 cycles 0.99
ML-DSA-65 keypair 1388748 cycles 1403477 cycles 0.99
ML-DSA-65 sign 5342488 cycles 5445750 cycles 0.98
ML-DSA-65 verify 1449418 cycles 1468946 cycles 0.99
ML-DSA-87 keypair 2279651 cycles 2305788 cycles 0.99
ML-DSA-87 sign 6701296 cycles 6816326 cycles 0.98
ML-DSA-87 verify 2369663 cycles 2395888 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 310722 cycles 310585 cycles 1.00
ML-DSA-44 sign 1229229 cycles 1226029 cycles 1.00
ML-DSA-44 verify 341924 cycles 334141 cycles 1.02
ML-DSA-65 keypair 557707 cycles 588746 cycles 0.95
ML-DSA-65 sign 1973538 cycles 2004197 cycles 0.98
ML-DSA-65 verify 532434 cycles 551130 cycles 0.97
ML-DSA-87 keypair 861565 cycles 883006 cycles 0.98
ML-DSA-87 sign 2491772 cycles 2482222 cycles 1.00
ML-DSA-87 verify 894050 cycles 899288 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 292640 cycles 290379 cycles 1.01
ML-DSA-44 sign 931435 cycles 934175 cycles 1.00
ML-DSA-44 verify 296597 cycles 291054 cycles 1.02
ML-DSA-65 keypair 493055 cycles 493069 cycles 1.00
ML-DSA-65 sign 1536503 cycles 1542022 cycles 1.00
ML-DSA-65 verify 481944 cycles 478936 cycles 1.01
ML-DSA-87 keypair 837005 cycles 834261 cycles 1.00
ML-DSA-87 sign 2027012 cycles 2068513 cycles 0.98
ML-DSA-87 verify 816781 cycles 810807 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@rod-chapman
Copy link
Contributor Author

Proof results, running on an r7g instance with -j16 (so as close as I can get to the CI runner that we use for CBMC). The first figure is real-time, the second is CPU "user" time,

Branch main:

  MLDSA-44: 6m56s, 40m57s
  MLDSA-65: 5m13s, 45m38s
  MLDSA-87: 8m03s, 52m10s

Branch signed_loop_quants:

  MLDSA-44: 3m39s, 31m42s
  MLDSA-65: 3m29s, 34m57s
  MLDSA-87: 7m41s, 40m36s

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 474882 cycles 469390 cycles 1.01
ML-DSA-44 sign 2261281 cycles 2217478 cycles 1.02
ML-DSA-44 verify 550155 cycles 550302 cycles 1.00
ML-DSA-65 keypair 793300 cycles 783147 cycles 1.01
ML-DSA-65 sign 3691370 cycles 3648464 cycles 1.01
ML-DSA-65 verify 854102 cycles 849774 cycles 1.01
ML-DSA-87 keypair 1276265 cycles 1268145 cycles 1.01
ML-DSA-87 sign 4549078 cycles 4484895 cycles 1.01
ML-DSA-87 verify 1379616 cycles 1372709 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 115582 cycles 115605 cycles 1.00
ML-DSA-44 sign 376976 cycles 377307 cycles 1.00
ML-DSA-44 verify 120282 cycles 120319 cycles 1.00
ML-DSA-65 keypair 200134 cycles 200147 cycles 1.00
ML-DSA-65 sign 622885 cycles 622903 cycles 1.00
ML-DSA-65 verify 198036 cycles 198334 cycles 1.00
ML-DSA-87 keypair 327234 cycles 327229 cycles 1.00
ML-DSA-87 sign 790777 cycles 791295 cycles 1.00
ML-DSA-87 verify 325304 cycles 324859 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

Benchmark suite Current: 1e19e9b Previous: 328f13f Ratio
ML-DSA-44 keypair 215471 cycles 214042 cycles 1.01
ML-DSA-44 sign 798434 cycles 781550 cycles 1.02
ML-DSA-44 verify 233112 cycles 229688 cycles 1.01
ML-DSA-65 keypair 385919 cycles 384618 cycles 1.00
ML-DSA-65 sign 1320449 cycles 1323621 cycles 1.00
ML-DSA-65 verify 376518 cycles 375790 cycles 1.00
ML-DSA-87 keypair 610330 cycles 606502 cycles 1.01
ML-DSA-87 sign 1658551 cycles 1622919 cycles 1.02
ML-DSA-87 verify 623112 cycles 616687 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@rod-chapman
Copy link
Contributor Author

The proof performance looks really good, but runtime performance is affected by 1-5% for the C backend(s), particularly on Intel/AMD. I will look at the "components" benchmark to see what's going on.

@rod-chapman
Copy link
Contributor Author

The performance degradation appears to be concentrated in the Inverse NTT in the C backend. Not sure what's going on.

Specifically, use signed "int" for:
1. for loop counters
2. quantified expressions
3. variables and struct fields that count or index into arrays
4. formal parameters for array indexes, offsets and counts

Additionally, strengthen pre-conditions on polyvecl_add()
and polyveck_add() to stabilize proof of those functions
for all parameter sets.

Signed-off-by: Rod Chapman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants