Skip to content

Conversation

vsvnakers
Copy link

We have added the RVV implementation code for RISCV, which is benchmarked against NEON, in scalarquantizer.cpp for performance testing. The decode_8_components of CPU and RVV were tested and scored on the RISCV board, and the implementation of RVV showed significant performance improvement.

One of the RVV tests regarding decode_8_components

Test Principle:

This program compares the execution time of Scalar decoding and RVV Vectorized decoding to verify the performance improvements brought by RVV optimization.

  1. Scalar Decoding: Each byte is processed individually and decoded sequentially.
  2. RVV Vectorized Decoding: Uses RVV vector instructions to process 8 bytes at once, utilizing parallel processing to accelerate the decoding.

Test Procedure:

  1. Initial Data: Create a random array code.
  2. Scalar Decoding: Decode each byte sequentially and measure the time.
  3. RVV Decoding: Decode 8 bytes at once using RVV vector instructions and measure the time.
  4. Timing: Record the time for Scalar decoding and RVV decoding.

Output:

  • Dummy sum: Used to ensure consistency in calculations.
  • RVV execution time: Time taken for RVV decoding.
  • Scalar execution time: Time taken for scalar decoding.
[root@EulixOS ~]# ./decode_8
Dummy sum = 9476415.0000
RVV time    = 1571.824 ms
Scalar time = 2450.067 ms

Copy link

meta-cla bot commented Aug 4, 2025

Hi @vsvnakers!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@vsvnakers
Copy link
Author

I hope you’re all doing well! I wanted to kindly follow up on my PR (#4503), which has been open for about a week.
This PR includes an optimization for Scalarquantizer using RVV, and I’d greatly appreciate it if one of potential maintainers(@mnorris11 @Enet4 @alexanderguzhva ) could take a look when you have a chance.

If there are any issues, missing details, or areas that could be improved, please feel free to let me know — I’m more than happy to make adjustments.
Thank you very much for your time and support in helping move this forward! 🙏

@alexanderguzhva
Copy link
Contributor

alexanderguzhva commented Aug 15, 2025

@vsvnakers my comments are the following

  1. please remove unneeded debugging commented out code, such as // printf(...);
  2. please remove unneeded meaningful commented out code, such as // vfloat32m1x2_t res;
  3. please remove debugging code with active printf(...); lines, unless it is critically important debugging info (which needs to be covered with if (verbose) {} construct.
  4. please keep existing code style, so there's no need to change if (cond) { do_it(); } into if (cond) do_it(); for one-liners. The rules for formatting the code can be found in included .clang-format file(s)

overall, the change looks good

@mnorris11
Copy link

Thanks @alexanderguzhva for taking a look

Sorry @vsvnakers , it fell through the cracks during change in oncall who reviews issues.

for context, we are working on a large refactor of the library to allow easier maintenance of SIMD code.

@subhadeepkaran do you want to see how compatible this one is?

@vsvnakers
Copy link
Author

Hi @alexanderguzhva @mnorris11, thanks a lot for the detailed feedback 🙏
I’ve addressed the comments (removed the unnecessary debug/commented code and kept the existing code style as suggested).
When you have a moment, could you please take another look? I’d really appreciate it!

@alexanderguzhva
Copy link
Contributor

@vsvnakers the code looks good

@mnorris11
Copy link

@vsvnakers , thanks for this contribution! After the large SIMD refactor, we can start reviewing it. ETA this month. @subhadeepkaran to check

@mdouze
Copy link
Contributor

mdouze commented Aug 29, 2025

Hi! AFAICS, the diff is still mainly formatting comments. Please use the clang formatter to make it readable. Thanks!

Signed-off-by: lyd1992 <[email protected]>

Signed-off-by: vsvnakers <[email protected]>
Signed-off-by: lyd1992 <[email protected]>

Signed-off-by: vsvnakers <[email protected]>
@vsvnakers
Copy link
Author

Hi @mdouze, I reformatted the file using clang-format as recommended.
Could you please take another look when you have time? Thanks a lot!

@mdouze
Copy link
Contributor

mdouze commented Sep 7, 2025

@vsvnakers, thanks a lot, the PR is much more readable now.
As @mnorris11 says, we are busy reshaping all the SIMD code, mainly to avoid the many ifdefs related to different SIMD flavors. This will impact any SIMD PRs, so we'd rather have a PR against that new API. You can see PR #4557 about how this is going to be implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants