The point of simd_gather/scatter is that I have a vector of pointers and a mask, and only the pointers that are "enabled" will actually be used. The others may point to garbage memory, or to memory that other threads are concurrently reading or writing; they must not be touched.
If I understand the implementations of these intrinsics in compiler/rustc_codegen_gcc/src/intrinsic/simd.rs correctly, they currently always read from (for gather) or write to (for scatter) all the pointers, and use a shuffle to decide which values to keep. Apart from the fact that this does a bunch of sequential loads and stores (completely losing the SIMD effect, making me wonder why a shuffle is used when some basic if-then-else would be a lot simpler), this means that programs using simd_gather/scatter can misbehave in quite subtle ways: a disabled lane may access invalid memory or race with another thread.
As a comparatively minor issue that is not worth tracking separately, the implementation also seems to assume that the length of the vector is a power of 2.
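To make the expected semantics concrete, here is a minimal scalar sketch of the per-lane if-then-else behavior described above. It is only a model, not the actual codegen: the function names `masked_gather` and `masked_scatter` are hypothetical, and plain arrays of `bool` and raw pointers stand in for the real SIMD vector and mask types. The loops work for any lane count `N`, including ones that are not a power of 2.

```rust
/// Scalar model of a masked gather (hypothetical helper, not the intrinsic):
/// a disabled lane keeps its fallback value and its pointer is never read.
///
/// Safety: every pointer whose mask lane is `true` must be valid for reads.
unsafe fn masked_gather<const N: usize>(
    fallback: [i32; N],
    ptrs: [*const i32; N],
    mask: [bool; N],
) -> [i32; N] {
    let mut out = fallback;
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes dereference their pointer.
            out[i] = unsafe { ptrs[i].read() };
        }
    }
    out
}

/// Scalar model of a masked scatter (hypothetical helper, not the intrinsic):
/// a disabled lane performs no store at all.
///
/// Safety: every pointer whose mask lane is `true` must be valid for writes.
unsafe fn masked_scatter<const N: usize>(
    values: [i32; N],
    ptrs: [*mut i32; N],
    mask: [bool; N],
) {
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes write through their pointer.
            unsafe { ptrs[i].write(values[i]) };
        }
    }
}
```

With three lanes and the middle lane disabled, the middle pointer can be null or dangling: the gather returns the fallback value for that lane and the scatter leaves whatever it points to alone. An implementation that loads or stores through every pointer first and selects afterwards would dereference that pointer.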