I recently updated my dplyr version (late to the party), and I'm hitting some increased memory usage. I've traced it back to how warnings are handled. Beginning in dplyr 1.1.1, I get the following output:
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

identity <- function(x, warn) {
  if (warn) {
    warning("fake warning")
  }
  x
}

df <- tibble::tibble(e = rep(1, 1e8))
print(gc())
#>             used  (Mb) gc trigger   (Mb)  max used  (Mb)
#> Ncells    620941  33.2    1306337   69.8   1306337  69.8
#> Vcells 101049658 771.0  148096356 1129.9 101084366 771.3

df <- df %>% mutate(e = identity(e, warn = TRUE))
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `e = identity(e, warn = TRUE)`.
#> Caused by warning in `identity()`:
#> ! fake warning
print(gc())
#>             used  (Mb) gc trigger   (Mb)  max used  (Mb)
#> Ncells    729780  39.0    1306337   69.8   1306337  69.8
#> Vcells 101287706 772.8  148096356 1129.9 102369359 781.1

rm(df)
print(gc())
#>             used  (Mb) gc trigger   (Mb)  max used  (Mb)
#> Ncells    729742  39.0    1306337   69.8   1306337  69.8
#> Vcells 101287654 772.8  148096356 1129.9 102369359 781.1
If I restart R and rerun with `warn = FALSE`, the final memory usage is only 7.9 MB rather than 772.8 MB. If I instead write the call without the pipe, as `df <- mutate(df, e = identity(e, warn = TRUE))`, the final memory usage is only 8.8 MB. Switching from `%>%` to the native pipe `|>` also yields low memory usage. Under dplyr 1.1.0, the same reprex ends at 18.8 MB.
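For anyone who wants to compare the three call forms without allocating 800 MB, here is a small sketch. It assumes dplyr >= 1.1.0 (for `last_dplyr_warnings()`) and R >= 4.1 (for `|>`). `noisy_identity()` and `small` are names made up for this sketch, and the lobstr size check is optional. I have not confirmed that the stored warnings are what holds the memory; this only shows where one might look.

```r
library(dplyr)

# Hypothetical helper mirroring the reprex's identity(), renamed so it
# does not mask base::identity().
noisy_identity <- function(x, warn) {
  if (warn) {
    warning("fake warning")
  }
  x
}

# Small stand-in for the 1e8-row tibble in the reprex.
small <- tibble::tibble(e = rep(1, 10))

# The three call forms compared in the report.
res_magrittr <- small %>% mutate(e = noisy_identity(e, warn = TRUE))
res_plain    <- mutate(small, e = noisy_identity(e, warn = TRUE))
res_native   <- small |> mutate(e = noisy_identity(e, warn = TRUE))

# dplyr keeps a record of warnings raised inside its verbs.
warns <- last_dplyr_warnings()
print(length(warns))

# Optional: lobstr::obj_size() follows references that object.size() ignores,
# so it may give a rough idea of how much the stored warnings keep reachable.
if (requireNamespace("lobstr", quietly = TRUE)) {
  print(lobstr::obj_size(warns))
}
```

Running the size check once after each call form, in a fresh session each time, would show whether the stored warnings differ between `%>%` and the other two.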
I don't fully understand whether a warning can capture my environment, but I wonder whether that is what's happening, either in base R or in dplyr's own record of warnings.
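To make the suspicion concrete, here is a base-R sketch of the general mechanism: a condition object that holds a reference to an environment keeps everything in that environment alive for as long as the condition is stored. `warning_log`, `make_condition()` and the `env` field are hypothetical names for illustration only. Nothing here is taken from dplyr or rlang internals, and I am not claiming this is what dplyr does.

```r
# Hypothetical stand-in for a package-level record of warnings.
warning_log <- list()

make_condition <- function() {
  big <- numeric(1e6)                 # roughly 8 MB, local to this call
  cond <- simpleWarning("fake warning")
  cond$env <- environment()           # the condition now references this frame
  cond
}

warning_log[[1]] <- make_condition()

# `big` went out of scope when make_condition() returned, yet it is still
# reachable through the stored condition, so gc() cannot free it.
still_there <- exists("big", envir = warning_log[[1]]$env, inherits = FALSE)
retained_bytes <- as.numeric(
  object.size(get("big", envir = warning_log[[1]]$env))
)
print(still_there)
print(retained_bytes)

# Dropping the record removes the last reference; gc() can now reclaim it.
warning_log <- list()
invisible(gc())
```

If something similar happens here, the memory should be released once the stored warning is replaced or cleared.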
Created on 2025-01-31 with reprex v2.0.2