Semantics of simd_select_bitmask for "out-of-bounds" mask? #269

Closed
RalfJung opened this issue Mar 17, 2022 · 4 comments

@RalfJung
Member

What should the semantics of the simd_select_bitmask intrinsic be when the bitmask has "out-of-bounds" bits set? This can happen when the vector has fewer than 8 elements and the bitmask is stored as a u8.

  • This could just be declared UB, so codegen does not have to worry about this case. That's what miri#2029 ("implement simd bitmask intrinsics") implements for Miri. However, the function that wraps the intrinsic is then unsound, as a user can pass it an arbitrary u8. So we'd have to mask out the extra bits before passing them to simd_select_bitmask.
  • Or we could just say that these extra bits are ignored, and then codegen backends will have to make sure they handle that case properly.

The portable-simd test suite currently does not hit this case at all.
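
The two options can be compared with a scalar model. This is only a sketch in plain Rust, not the intrinsic itself: the names select_bitmask_strict, valid_bits, and select_bitmask_safe are hypothetical, the lanes are fixed to i32, and it assumes that lane i is controlled by bit i (least significant bit first). The strict function returns None in exactly the cases the first option would declare UB, and the wrapper shows the masking step that option would need.

```rust
/// Bits 0..N set, all higher bits clear (N <= 8).
fn valid_bits<const N: usize>() -> u8 {
    assert!(N <= 8, "this model only covers masks that fit in a u8");
    // Compute in u16 so that N == 8 does not overflow the shift.
    ((1u16 << N) - 1) as u8
}

/// Scalar model of a bitmask select over N lanes with the mask stored in a u8.
/// Assumption: lane i corresponds to bit i, least significant bit first.
/// Returns None where the "declare it UB" option would have undefined behavior.
fn select_bitmask_strict<const N: usize>(
    mask: u8,
    yes: [i32; N],
    no: [i32; N],
) -> Option<[i32; N]> {
    if mask & !valid_bits::<N>() != 0 {
        return None; // an out-of-bounds bit is set
    }
    let mut out = no;
    for i in 0..N {
        if (mask >> i) & 1 == 1 {
            out[i] = yes[i];
        }
    }
    Some(out)
}

/// Hypothetical safe wrapper: clears the extra bits first, so any u8 is accepted.
fn select_bitmask_safe<const N: usize>(mask: u8, yes: [i32; N], no: [i32; N]) -> [i32; N] {
    select_bitmask_strict(mask & valid_bits::<N>(), yes, no)
        .expect("mask was sanitized, so no out-of-bounds bit can be set")
}
```

With four lanes, a mask of 0b1_0101 has bit 4 set, so the strict model rejects it, while the wrapper treats it the same as 0b0101.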

@calebzulawski
Member

I believe it's not UB in the current LLVM backend, but I'm not sure if that was intentional.

@RalfJung
Member Author

Yeah, the current backend does trunc.

However, @workingjubilee mentioned future plans for other bitmask-taking intrinsics where this would be UB, so it might make sense to make it UB here as well.
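
For comparison, the truncating behavior of the current backend can be modeled in scalar Rust. This is a sketch under assumptions, not the backend's actual code: select_bitmask_trunc is a hypothetical name, the lanes are fixed to i32, and lane i is assumed to be controlled by bit i (least significant bit first). Since only bits 0..N are ever inspected, any higher bits in the u8 have no effect, which is what truncating the mask to an N-bit integer amounts to.

```rust
/// Scalar model of "truncate the mask, then select" for N lanes (N <= 8).
/// Assumption: lane i is controlled by bit i, least significant bit first.
fn select_bitmask_trunc<const N: usize>(mask: u8, yes: [i32; N], no: [i32; N]) -> [i32; N] {
    assert!(N <= 8, "this model only covers masks that fit in a u8");
    // Bits N..8 are never read, so they are silently ignored.
    std::array::from_fn(|i| if (mask >> i) & 1 == 1 { yes[i] } else { no[i] })
}
```

Under this model, 0b1111_0101 and 0b0101 select the same lanes of a four-element vector.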

@workingjubilee
Member

Yes, specifically I was thinking about possibly executing masked loads with a bitvector. We could require it to implicitly trunc to the right size, but in terms of codegen it seems simpler to just say that the bitvector should specify exactly the values you want to read, even if it is read as a larger integer. That way, if you just slam the number into a register and do the masked load, it always works correctly.
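
A scalar sketch of why the exact-bits rule keeps masked loads simple. Everything here is hypothetical: masked_load_model is an illustrative name, the register is assumed to be 8 lanes wide with all 8 mask bits honored, and lane i is assumed to map to bit i. The model reports the offending lane instead of performing the out-of-bounds read a stray bit would cause.

```rust
/// Hypothetical scalar model of a masked load driven by an 8-bit bitvector.
/// `src` is the memory that may legally be read; bit i set means "read src[i]".
/// Lanes whose bit is clear keep the value from `fallback`.
/// Returns Err(lane) if a set bit would read past the end of `src`.
fn masked_load_model(src: &[i32], mask: u8, fallback: [i32; 8]) -> Result<[i32; 8], usize> {
    let mut out = fallback;
    for i in 0..8 {
        if (mask >> i) & 1 == 1 {
            // No truncation step: every set bit is honored as-is.
            out[i] = *src.get(i).ok_or(i)?;
        }
    }
    Ok(out)
}
```

If the caller guarantees that only bits for valid lanes are set, the load needs no extra masking; a stray bit (for example bit 5 with a four-element source) is the case that has to be either ruled out or truncated away.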

@RalfJung
Member Author

RalfJung commented Aug 3, 2024

The docs for the intrinsic also explicitly say "padding bits must be all zero", so I think currently, this has a very clear answer -- out-of-bounds masks are UB.

If there's a motivation to change that, someone should file an issue spelling that out. :)

@RalfJung RalfJung closed this as completed Aug 3, 2024