The point of simd_gather/scatter is that I have a vector of pointers and a mask, and only the pointers that are "enabled" will actually be used. The others may point to garbage memory, or to memory that is being concurrently read or written by other threads; they must not be touched.
If I understand the implementations of these intrinsics in compiler/rustc_codegen_gcc/src/intrinsic/simd.rs correctly, they currently always read and write all the pointers, and use a shuffle to decide which values to keep. Apart from the fact that this does a bunch of sequential loads and stores (completely losing the SIMD effect, which makes me wonder why a shuffle is used when a basic if-then-else would be a lot simpler), this will make programs that use simd_gather/scatter misbehave in quite subtle ways.
As a comparatively minor issue that is not worth tracking separately, the implementation also seems to assume that the length of the vector is a power of 2.
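To make the expected semantics concrete, here is a minimal scalar sketch of what I mean by "disabled pointers must not be touched". The names `gather_ref` and `scatter_ref` are hypothetical and only for illustration; this is not the code in rustc_codegen_gcc, and it models the mask as plain `bool`s instead of a SIMD mask vector. It is just the per-lane if-then-else, written so that it works for any lane count, not only powers of 2.

```rust
/// Scalar reference model of a masked gather (hypothetical helper).
/// Lanes whose mask is `false` keep the value from `default`, and
/// their pointer is never dereferenced.
///
/// # Safety
/// Every pointer in an enabled lane must be valid for reads of `T`.
/// Pointers in disabled lanes may be null, dangling, or racy.
unsafe fn gather_ref<T: Copy, const N: usize>(
    default: [T; N],
    ptrs: [*const T; N],
    mask: [bool; N],
) -> [T; N] {
    let mut out = default;
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes perform a load.
            out[i] = unsafe { ptrs[i].read() };
        }
    }
    out
}

/// Scalar reference model of a masked scatter (hypothetical helper).
/// Lanes whose mask is `false` perform no store at all.
///
/// # Safety
/// Every pointer in an enabled lane must be valid for writes of `T`.
unsafe fn scatter_ref<T: Copy, const N: usize>(
    values: [T; N],
    ptrs: [*mut T; N],
    mask: [bool; N],
) {
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes perform a store.
            unsafe { ptrs[i].write(values[i]) };
        }
    }
}
```

With this model, passing a null pointer in a disabled lane is fine, because that lane is skipped entirely. An implementation that loads or stores through every pointer first and selects afterwards would dereference it.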