
MSM performance benefits for Groth16 #54

Open
waamm opened this issue Dec 26, 2024 · 4 comments

Comments


waamm commented Dec 26, 2024

Hi! After quickly replacing the VariableBaseMSM::multi_scalar_mul invocations here with msm_cuda::multi_scalar_mult_arkworks, I am not noticing any performance difference. Does that make sense, or am I using it incorrectly? I'm using a Tesla T4.
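
For context, the swap was essentially the following (a minimal sketch assuming the arkworks v0.3 API; the exact signature of `multi_scalar_mult_arkworks` in poc/msm-cuda may differ):

```rust
use ark_bn254::{Fr, G1Affine, G1Projective};
use ark_ec::msm::VariableBaseMSM;
use ark_ff::PrimeField;

type FrRepr = <Fr as PrimeField>::BigInt;

// Before: arkworks' CPU multi-scalar multiplication.
fn msm_cpu(bases: &[G1Affine], scalars: &[FrRepr]) -> G1Projective {
    VariableBaseMSM::multi_scalar_mul(bases, scalars)
}

// After: the GPU MSM from poc/msm-cuda (signature assumed, as noted above).
fn msm_gpu(bases: &[G1Affine], scalars: &[FrRepr]) -> G1Projective {
    msm_cuda::multi_scalar_mult_arkworks(bases, scalars)
}
```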

Edit: Running `cargo build --release --features=bn254 -vv` shows a lot of `stable-x86_64-unknown-linux-gnu`... and after installing nvidia-cuda-toolkit, I think I am having gcc incompatibility issues.

Edit2: It compiled after updating CUDA, but the benchmark is failing:

```
     Running benches/msm.rs (target/release/deps/msm-933742322995d626)
Benchmarking CUDA/2**23: Warming up for 3.0000 s
error: bench failed, to rerun pass `--bench msm`

Caused by:
  process didn't exit successfully: `/home/wicher/sppark/poc/msm-cuda/target/release/deps/msm-933742322995d626 --bench` (signal: 11, SIGSEGV: invalid memory reference)
```

Edit3: Seems to work now for some reason. Is there a specific reason this library uses v0.3 of arkworks?

@waamm waamm changed the title MSM performance benefits MSM performance benefits for Groth16 Dec 26, 2024
dot-asm (Collaborator) commented Jan 9, 2025

> Seems to work now for some reason. Is there a specific reason this library uses v0.3 of arkworks?

The premise isn't true: this library has literally no dependencies (build-dependencies don't count). Arkworks is used only in the test suite, purely for verification purposes.

dot-asm (Collaborator) commented Jan 9, 2025

> After quickly replacing the VariableBaseMSM::multi_scalar_mul invocations here with msm_cuda::multi_scalar_mult_arkworks, I am not noticing any performance difference.

The trouble is that any "quick" swap is likely to fail to deliver the expected improvement, at least based on what we've seen. Orchestrating the data flow and tighter integration with the application is the key. See https://github.com/supranational/supra_seal/tree/main/c2 for an example. Well, "see" is a bit of a misnomer, because you're unlikely to figure it out just like that; the point is rather that it takes over a bigger section of the prover in order to do the "orchestrating and tight integration" thing.
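
To illustrate the shape of the difference (a hypothetical sketch; `DeviceBases`, `upload_bases`, `msm_on_device`, and `gpu_msm` are illustrative names, not this library's API):

```rust
use ark_bn254::{Fr, G1Affine, G1Projective};

// Hypothetical device handle and helpers -- illustrative names only,
// not sppark's API. Only the shape of the integration matters here.
struct DeviceBases; // imagine: bases resident in GPU memory

fn upload_bases(_bases: &[G1Affine]) -> DeviceBases { unimplemented!() }
fn msm_on_device(_bases: &DeviceBases, _scalars: &[Fr]) -> G1Projective { unimplemented!() }
fn gpu_msm(_bases: &[G1Affine], _scalars: &[Fr]) -> G1Projective { unimplemented!() }

// Naive drop-in: every call re-ships the large, fixed bases to the GPU,
// so host-to-device transfers can eat the kernel's speedup.
fn prove_naive(bases: &[G1Affine], batches: &[Vec<Fr>]) -> Vec<G1Projective> {
    batches.iter().map(|s| gpu_msm(bases, s)).collect()
}

// Integrated: upload the bases once, then stream only the scalars per MSM,
// leaving room to overlap transfers with kernel execution.
fn prove_integrated(bases: &[G1Affine], batches: &[Vec<Fr>]) -> Vec<G1Projective> {
    let dev = upload_bases(bases);
    batches.iter().map(|s| msm_on_device(&dev, s)).collect()
}
```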

dot-asm (Collaborator) commented Jan 9, 2025

Though there is another caveat. The current MSM implementation is oversensitive to repeating bit patterns in the scalars. That is, if the scalars have some bit structure, the current implementation will tend to underperform, and some provers have been observed to produce more repetitive patterns than others. The reason the issue hasn't been resolved is that there has been no strong motivating factor so far...
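
One way to observe this (a sketch assuming arkworks v0.3 and poc/msm-cuda's `multi_scalar_mult_arkworks`; drawing the "structured" scalars from only 16 distinct values is just a stand-in for a repetitive bit pattern):

```rust
use ark_bn254::{Fr, G1Affine, G1Projective};
use ark_ec::ProjectiveCurve;
use ark_ff::{PrimeField, UniformRand};
use std::time::Instant;

fn main() {
    let mut rng = ark_std::test_rng();
    let n = 1usize << 20; // keep modest so the random point setup stays quick
    let points: Vec<G1Affine> = (0..n)
        .map(|_| G1Projective::rand(&mut rng).into_affine())
        .collect();

    // Uniformly random scalars: the favorable case for a bucket-method MSM.
    let random: Vec<_> = (0..n).map(|_| Fr::rand(&mut rng).into_repr()).collect();
    // Repetitive structure: only 16 distinct scalar values across the vector.
    let structured: Vec<_> = (0..n)
        .map(|i| Fr::from((i % 16) as u64).into_repr())
        .collect();

    for (label, scalars) in [("random", &random), ("structured", &structured)] {
        let t = Instant::now();
        let _ = msm_cuda::multi_scalar_mult_arkworks(&points, scalars);
        println!("{}: {:?}", label, t.elapsed());
    }
}
```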

dot-asm (Collaborator) commented Jan 9, 2025

> See https://github.com/supranational/supra_seal/tree/main/c2 for an example.

Just in case: that example is Groth16, a demanding one, operating on vectors of ~2**27 elements. And the payoff was absolutely worth the effort.
