use zero(FT) instead of zero(field) in broadcasts #1599
Open
juliasloan25 wants to merge 1 commit into main
Member
Author
cc @petebachant @szy21 @kmdeck
ph-kev
approved these changes
Dec 6, 2025
Comment on lines 610 to 611
@. csf.scalar_temp1 = ifelse(area_fraction == 0, 0, csf.scalar_temp1)
@. csf.scalar_temp2 = ifelse(area_fraction == 0, 0, csf.scalar_temp2)
Member
I am not sure if this matters, but do we need to use something like `zero(FT)` here instead of the literal `0`?
Member
Author
`zero(FT)` can be slightly more type-stable and doesn't require promoting the `Int` to `FT`. I'll switch to that.
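To make the promotion point concrete, here is a minimal sketch using plain `Vector{Float32}` arrays as stand-ins for the fields (an assumption; the real code operates on fields, which may behave differently). It shows that `ifelse` returns its branch as-is, so the literal `0` stays an `Int` until it is converted on assignment into the destination array.

```julia
# Sketch only: plain arrays stand in for the fields used in the PR.
FT = Float32
field1 = FT[1.5, 2.5, 3.5]
area_fraction = FT[0, 1, 0]

# ifelse does not promote its branches: the result keeps the type of
# whichever branch is selected.
a = ifelse(true, 0, field1[1])         # an Int
b = ifelse(true, zero(FT), field1[1])  # an FT
println(typeof(a), " ", typeof(b))

# In-place broadcast: the Int zero is converted to FT when it is stored.
@. field1 = ifelse(area_fraction == 0, 0, field1)
println(field1)
```

With `zero(FT)` both branches already have the destination's element type, so no conversion is needed at the store.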
Member
Author
I also profiled the different options and saw that `zero(FT)` is the fastest even though it has the most allocations, probably because when we use `0` the type has to be promoted.
julia> @btime @. field1 = ifelse(area_fraction == 0, zero(field1), field1)
199.083 μs (3 allocations: 128 bytes)
julia> @btime @. field1 = ifelse(area_fraction == 0, zero(eltype(field1)), field1)
198.125 μs (4 allocations: 160 bytes)
julia> @btime @. field1 = ifelse(area_fraction == 0, 0, field1);
222.667 μs (2 allocations: 96 bytes)
julia> @btime @. field1 = ifelse(area_fraction == 0, zero(FT), field1)
174.708 μs (8 allocations: 288 bytes)
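For anyone who wants to try a similar comparison without the full model setup, the following is a rough sketch using only Base Julia. The function names `mask_int!`, `mask_ft!`, and `min_time` are hypothetical, and plain arrays are assumed in place of fields, so the numbers will not match the ones above and the ranking may differ.

```julia
# Sketch only: plain arrays stand in for fields; helper names are hypothetical.
FT = Float32
n = 100_000
area_fraction = rand(FT[0, 1], n)
field1 = rand(FT, n)

# Variant using the Int literal 0.
function mask_int!(f, a)
    @. f = ifelse(a == 0, 0, f)
    return f
end

# Variant using a zero of the destination's element type.
function mask_ft!(f::AbstractArray{T}, a) where {T}
    @. f = ifelse(a == 0, zero(T), f)
    return f
end

# Minimum wall time over several samples.
function min_time(g!, f, a; samples = 100)
    best = Inf
    for _ in 1:samples
        t = @elapsed g!(f, a)
        best = min(best, t)
    end
    return best
end

# Warm up so compilation is excluded from the measurements.
mask_int!(copy(field1), area_fraction)
mask_ft!(copy(field1), area_fraction)

t_int = min_time(mask_int!, copy(field1), area_fraction)
t_ft = min_time(mask_ft!, copy(field1), area_fraction)
println("Int literal: ", t_int, " s; zero(FT): ", t_ft, " s")
```

Wrapping each variant in a function keeps the broadcast out of global scope, which may account for some of the small allocation counts reported by `@btime` above.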
Purpose
We commonly use the pattern `@. field = ifelse(area_fraction ≈ 0, zero(field), field)`. I was under the impression that the `zero(field)` would be equivalent to `zero(eltype(field))` when broadcasted, but this isn't the case. See the allocation and timing information below.
Content
- Replace `zero(field)` with 0 in all broadcasted `ifelse` calls
- … `ifelse` calls
Timing comparison
Despite having the most allocations, `zero(FT)` is the fastest to run since the type of the zero doesn't need to be promoted (as it does for e.g. `0`).
Allocation comparison
The allocations are even worse when the whole expression isn't broadcasted, which we were doing in a few places: