fast methods for any and all for Bool tuples #55673

Open
wants to merge 1 commit into base: master

matthias314
Contributor

The new methods allow for vectorization and are therefore much faster. With nightly, I get:

julia> t = ntuple(Returns(true), 32); @b any($t)
2.440 ns

julia> t = ntuple(>(4), 32); @b any($t)
5.014 ns

julia> t = ntuple(Returns(false), 32); @b any($t)
22.961 ns

With this PR,

julia> t = ntuple(Returns(false), 32); @b any($t)
2.638 ns

As with other Tuple methods, this is only done for tuples of up to 32 elements. Beyond that, the generic method is called. At present (without this PR), the cut-off for the specialized Bool tuple methods is 3 elements.
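
To make the approach concrete, here is a minimal sketch of the reduce-based idea under the hypothetical names `any_bool_reduce` and `all_bool_reduce`, chosen so that nothing in `Base` is overwritten. The fallback to `any(identity, x)` and `all(identity, x)` for longer tuples is an assumption made for illustration; the PR itself falls back to the internal `_any(x, :)` and `_all(x, :)`.

```julia
# Sketch only: hypothetical names, not the methods added by the PR.
# `|` and `&` on Bool evaluate both operands (no short-circuiting), so the
# reduction contains no data-dependent branches.
any_bool_reduce(x::NTuple{N,Bool}) where {N} =
    N <= 32 ? reduce(|, x; init = false) : any(identity, x)

all_bool_reduce(x::NTuple{N,Bool}) where {N} =
    N <= 32 ? reduce(&, x; init = true) : all(identity, x)

t = ntuple(_ -> false, 32)
any_bool_reduce(t)                      # false
all_bool_reduce(t)                      # false
any_bool_reduce(ntuple(i -> i > 4, 32)) # true
```

The all-`false` tuple is the worst case for a short-circuiting `any`, because every element has to be inspected, which is presumably why it is the slowest of the nightly timings above.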

I find this quite useful. What do you think?

@matthias314
Contributor Author

It seems that some people like this proposal. What about the maintainers?

@nsajko nsajko added the collections Data structures holding multiple items, e.g. sets label Oct 4, 2024
Comment on lines +663 to +665
all(x::NTuple{N,Bool}) where N = N <= 32 ? reduce(&, x; init = true) : _all(x, :)

any(x::NTuple{N,Bool}) where N = N <= 32 ? reduce(|, x; init = false) : _any(x, :)
Contributor

@nsajko nsajko Jan 17, 2025

Given that the tuple is homogeneous, there's no need for recursion: we can handle inputs of all sizes with a loop. This results in the same LLVM code for a length-32 tuple as your variant with reduce:

Suggested change
- all(x::NTuple{N,Bool}) where N = N <= 32 ? reduce(&, x; init = true) : _all(x, :)
- any(x::NTuple{N,Bool}) where N = N <= 32 ? reduce(|, x; init = false) : _any(x, :)
+ function _all_bool(x::Tuple{Vararg{Bool}})
+     @_terminates_locally_meta
+     r = true
+     for i in eachindex(x)
+         r &= getfield(x, i, false) # avoid `getindex` bounds checking to help vectorization
+     end
+     r
+ end
+ function _any_bool(x::Tuple{Vararg{Bool}})
+     @_terminates_locally_meta
+     r = false
+     for i in eachindex(x)
+         r |= getfield(x, i, false) # avoid `getindex` bounds checking to help vectorization
+     end
+     r
+ end
+ any(x::Tuple{Vararg{Bool}}) = _any_bool(x)
+ all(x::Tuple{Vararg{Bool}}) = _all_bool(x)
+ all(x::Tuple{Bool}) = _all_bool(x) # disambiguate
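
The suggestion relies on `@_terminates_locally_meta`, which is internal to Base. For experimenting outside of Base, a rough standalone variant might look like the following. The name `any_bool_loop` is hypothetical and the effects annotation is simply left out, so this is a sketch of the loop rather than the suggested patch. Whether it vectorizes on a given Julia version and CPU is not verified here.

```julia
# Sketch only: a standalone loop over a homogeneous Bool tuple.
# `any_bool_loop` is a hypothetical name; the Base-internal
# `@_terminates_locally_meta` annotation from the suggestion is omitted.
function any_bool_loop(x::Tuple{Vararg{Bool}})
    r = false
    for i in eachindex(x)
        # The third argument `false` disables the bounds check in `getfield`.
        # This is safe here because `i` always comes from `eachindex(x)`.
        r |= getfield(x, i, false)
    end
    r
end

any_bool_loop(ntuple(_ -> false, 100))   # false; works beyond length 32
any_bool_loop(ntuple(i -> i == 100, 100)) # true
```

Running `@code_llvm any_bool_loop(t)` on a tuple `t` in the REPL is one way to check whether vector instructions are emitted.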

@nsajko
Contributor

nsajko commented Jan 17, 2025

There's also a merge conflict now.

@nsajko
Contributor

nsajko commented Jan 17, 2025

Could also be a bit more general by dispatching on typeof(identity). Then the single-argument methods wouldn't be necessary.
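
A small sketch of what dispatching on `typeof(identity)` could look like, using the hypothetical function `my_any` as a stand-in for `Base.any` so that no Base method is overwritten. The single-argument form just forwards to the two-argument form, so only the latter needs a specialized method:

```julia
# Sketch only: `my_any` is a hypothetical stand-in for `Base.any`.
my_any(itr) = my_any(identity, itr)   # single-argument entry point
my_any(f, itr) = any(f, itr)          # generic fallback for everything else

# Specialized method: selected when the predicate is `identity` and the
# input is a homogeneous tuple of Bool.
function my_any(::typeof(identity), x::Tuple{Vararg{Bool}})
    r = false
    for i in eachindex(x)
        r |= getfield(x, i, false)
    end
    r
end

my_any((false, true, false))            # true, via the specialized method
my_any(identity, (false, false))        # false, same method
my_any(iseven, (1, 3, 5))               # false, via the generic fallback
```

With this layout, both `my_any(x)` and `my_any(identity, x)` reach the fast path without separate single-argument definitions.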

nsajko added a commit to nsajko/julia that referenced this pull request Jan 18, 2025
In particular:
* Help ensure vectorization for homogeneous tuples of `Bool`. Inspired
  by JuliaLang#55673, but more general by using a loop, thus being performant
  for any input length.
* Delete single-argument methods, instead define methods dispatching on
  `typeof(identity)`. This makes the methods more generally useful.
* Make some optimizations defined for `all` also be defined for `any`
  in a symmetric manner.
* Delete the methods specific to the empty tuple, as they're not
  required for such calls to be foldable.

Closes JuliaLang#55673
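
To illustrate the last two points of the commit message: the empty tuple is itself a `Tuple{Vararg{Bool}}`, so a loop-based method already returns the identity element for it, and `all` and `any` differ only in the operator and the initial value. The sketch below uses the hypothetical names `bool_fold`, `all_bool_sketch` and `any_bool_sketch`. It does not demonstrate that such calls are constant-folded by the compiler, only that the results are correct without dedicated empty-tuple methods.

```julia
# Sketch only: hypothetical helper showing the symmetry between `all` and `any`.
function bool_fold(op, init::Bool, x::Tuple{Vararg{Bool}})
    r = init
    for i in eachindex(x)
        r = op(r, getfield(x, i, false))
    end
    r
end

all_bool_sketch(x) = bool_fold(&, true, x)
any_bool_sketch(x) = bool_fold(|, false, x)

() isa Tuple{Vararg{Bool}}   # true: the empty tuple matches the signature
all_bool_sketch(())          # true
any_bool_sketch(())          # false
```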
@nsajko
Contributor

nsajko commented Jan 18, 2025

Here's a more comprehensive and general PR: #57091. @matthias314, I hope you don't mind.

@nsajko nsajko added the performance Must go faster label Jan 18, 2025
nsajko added six more commits with the same message to nsajko/julia that referenced this pull request between Jan 18 and Jan 22, 2025