simd_gather/scatter implementations seem fundamentally wrong #640

Open
RalfJung opened this issue Mar 3, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@RalfJung
Member

RalfJung commented Mar 3, 2025

The point of simd_gather/scatter is that I have a vector of pointers and a mask, and only the pointers that are "enabled" will actually be used. The others may point to garbage memory or memory that is being concurrently read/written by other threads or whatever, they must not be touched.
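To make the intended semantics concrete, here is a minimal scalar sketch of a masked gather. It is only a model: `gather_model` is a hypothetical name, the mask is simplified to an array of `bool`, and none of this is the actual intrinsic or the code in rustc_codegen_gcc. The disabled lane holds a dangling pointer that is never dereferenced.

```rust
/// Hypothetical scalar model of a masked gather (not the real intrinsic).
/// Safety: every pointer whose mask lane is `true` must be valid for reads.
/// Pointers in disabled lanes may be dangling; they are never touched.
unsafe fn gather_model<T: Copy, const N: usize>(
    default: [T; N],
    ptrs: [*const T; N],
    mask: [bool; N],
) -> [T; N] {
    let mut out = default;
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes are dereferenced.
            out[i] = unsafe { ptrs[i].read() };
        }
    }
    out
}

fn main() {
    let data = [10i32, 20, 30];
    // A pointer that must never be read from.
    let dangling = std::ptr::NonNull::<i32>::dangling().as_ptr() as *const i32;
    let ptrs = [&data[0] as *const i32, dangling, &data[2] as *const i32];
    let mask = [true, false, true];
    let out = unsafe { gather_model([-1; 3], ptrs, mask) };
    // The disabled lane keeps its default value.
    assert_eq!(out, [10, -1, 30]);
}
```

The example deliberately uses three lanes, so it does not rely on the vector length being a power of 2.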

If I understand the implementations for these intrinsics in compiler/rustc_codegen_gcc/src/intrinsic/simd.rs correctly, then they currently always read and write all the pointers, and use a shuffle to decide which values to keep. Apart from the fact that this does a bunch of sequential loads and stores (completely losing the SIMD effect, making me wonder why a shuffle is used when some basic if-then-else would be a lot simpler), this will make programs that use simd_gather/scatter misbehave in quite subtle ways.
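The "basic if-then-else" approach could look roughly like the following scalar sketch of a masked scatter. The name `scatter_model` and the `bool` mask are assumptions made for illustration, and this is not a proposed patch for the GCC backend. The point is only that a disabled lane performs no store at all, rather than storing and then discarding the result.

```rust
/// Hypothetical scalar model of a masked scatter (not the real intrinsic).
/// Safety: every pointer whose mask lane is `true` must be valid for writes.
/// Pointers in disabled lanes may be dangling; they are never written.
unsafe fn scatter_model<T: Copy, const N: usize>(
    values: [T; N],
    ptrs: [*mut T; N],
    mask: [bool; N],
) {
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes perform a store.
            unsafe { ptrs[i].write(values[i]) };
        }
    }
}

fn main() {
    let mut a = 0u32;
    let mut c = 0u32;
    // A pointer that must never be written through.
    let dangling = std::ptr::NonNull::<u32>::dangling().as_ptr();
    let ptrs = [&mut a as *mut u32, dangling, &mut c as *mut u32];
    unsafe { scatter_model([1, 2, 3], ptrs, [true, false, true]) };
    assert_eq!((a, c), (1, 3));
}
```

An implementation that wrote through all three pointers would store to the dangling one, which is undefined behavior even if the stored value is later ignored.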

As a comparatively minor issue that is not worth tracking separately, the implementation also seems to assume that the length of the vector is a power of 2.
