mask8x8::from_bitmask falls back to scalar code #264

I tried this code:

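```rust
#![feature(portable_simd)] // nightly-only; API names as of the nightly listed under Meta
use std::simd::*;

pub fn func(a: u8, b: u64) -> u8 {
    let c = mask8x8::from_bitmask(a).to_int().cast();
    let d = c | u8x8::from_array(b.to_le_bytes());
    d.horizontal_and()
}
```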

I expected to see this happen: `mask8x8::from_bitmask` is vectorized, for example like this Rust code:
```rust
u8x8::splat(0).lanes_ne(u8x8::splat(a) & u8x8::from_array([1, 2, 4, 8, 16, 32, 64, 128]))
```
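
To make the intended result concrete, here is a minimal lane-by-lane model of that splat / AND / compare sequence. It uses plain arrays on stable Rust rather than `std::simd`, and the name `expand_bitmask_model` is hypothetical; it only illustrates what each lane should contain, not how the library implements it.

```rust
/// Scalar model of the expected expansion: lane `i` becomes 0xFF when
/// bit `i` of `a` is set, and 0x00 otherwise.
/// (Hypothetical helper for illustration; not part of `std::simd`.)
pub fn expand_bitmask_model(a: u8) -> [u8; 8] {
    // Same per-lane constants as in the vectorized expression above.
    const LANE_BITS: [u8; 8] = [1, 2, 4, 8, 16, 32, 64, 128];
    let mut out = [0u8; 8];
    for (lane, bit) in out.iter_mut().zip(LANE_BITS) {
        // `a & bit` is the AND with the splatted byte; the `!= 0` test
        // plays the role of the lane-wise not-equal comparison.
        *lane = if a & bit != 0 { 0xFF } else { 0x00 };
    }
    out
}
```

For example, `expand_bitmask_model(0b0000_0101)` yields `[0xFF, 0, 0xFF, 0, 0, 0, 0, 0]`.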

(On x86 with an appropriate target-cpu, using PDEP may be the best approach.)
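
A rough sketch of the PDEP idea, assuming stable Rust and the `std::arch::x86_64::_pdep_u64` intrinsic: bit `i` of the input is deposited into the lowest bit of byte `i`, and a multiply by 0xFF widens each of those bits to a full byte. The function names and the portable multiply-and-mask fallback are my own illustration, not what the compiler or the library currently emits.

```rust
/// Expands bit `i` of `a` into byte `i` (little-endian) of the result:
/// 0xFF if the bit is set, 0x00 otherwise. (Hypothetical helper.)
pub fn expand_bitmask_u64(a: u8) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("bmi2") {
            // SAFETY: BMI2 support was checked at runtime just above.
            return unsafe { expand_pdep(a) };
        }
    }
    expand_swar(a)
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "bmi2")]
unsafe fn expand_pdep(a: u8) -> u64 {
    use std::arch::x86_64::_pdep_u64;
    // Deposit bit i of `a` into bit 0 of byte i, then turn 0x01 into 0xFF.
    _pdep_u64(a as u64, 0x0101_0101_0101_0101) * 0xFF
}

/// Portable fallback without PDEP.
fn expand_swar(a: u8) -> u64 {
    // Replicate `a` into every byte and keep only bit i in byte i.
    let picked = (a as u64 * 0x0101_0101_0101_0101) & 0x8040_2010_0804_0201;
    // Adding 0x7F sets the top bit of every non-zero byte (no carry can
    // cross a byte boundary because each byte is at most 0x80).
    let ones = ((picked + 0x7F7F_7F7F_7F7F_7F7F) & 0x8080_8080_8080_8080) >> 7;
    // Each byte is now 0x00 or 0x01; widen to 0x00 or 0xFF.
    ones * 0xFF
}
```

The resulting `u64` can be turned into lanes with `to_le_bytes()`; for instance, `expand_bitmask_u64(0b1000_0001).to_le_bytes()` is `[0xFF, 0, 0, 0, 0, 0, 0, 0xFF]`.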

Instead, this happened on x86: each bit is extracted individually with `movl`, `shrb`, and `andb` into its own general-purpose register and then inserted into a vector register with `vpinsrb` or `pinsrw` (depending on target-cpu). Only after that are the bits expanded to 0x00 or 0xff using vectorized code. The scalar bit extraction takes more instructions and more run time than a fully vectorized version, and it may increase register pressure in more complex functions.


### Meta

`rustc --version --verbose`:
```
rustc 1.61.0-nightly (f103b2969 2022-03-12)
binary: rustc
commit-hash: f103b2969b0088953873dc1ac92eb3387c753596
commit-date: 2022-03-12
host: x86_64-unknown-linux-gnu
release: 1.61.0-nightly
LLVM version: 14.0.0
```

