Inconsistent mean() in the presence of missing values #410

Open
krlmlr opened this issue Dec 18, 2024 · 0 comments

krlmlr commented Dec 18, 2024

library(duckplyr)
#> Loading required package: dplyr
#> ✔ Overwriting dplyr methods with duckplyr methods.
#> ℹ Turn off with `duckplyr::methods_restore()`.

duck_tbl(a = c(1:3, NA)) |>
  summarize(b = mean(a))
#> # A duckplyr data frame: 1 variable
#>       b
#>   <dbl>
#> 1     2

duck_tbl(a = c(1:3, NA)) |>
  df_to_parquet("duck_tbl.parquet")
#> NULL

duck_parquet("duck_tbl.parquet") |>
  summarize(b = mean(a))
#> # A duckplyr data frame: 1 variable
#>       b
#>   <dbl>
#> 1     2

Created on 2024-12-18 with reprex v2.1.1

None of the dplyr downstream dependencies seems to rely on this, though.
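
For comparison, here is a minimal base R sketch of the same input. It assumes that the inconsistency named in the title refers to how `NA` is propagated. The variable name `x` is introduced only for illustration.

```r
x <- c(1:3, NA)

# Base R propagates missing values by default.
mean(x)
#> [1] NA

# Dropping missing values explicitly gives the value shown in the reprex above.
mean(x, na.rm = TRUE)
#> [1] 2
```

Under that assumption, the duckplyr results above match `mean(x, na.rm = TRUE)` rather than the default `mean(x)`.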

krlmlr changed the title from "Inconsistent mean() in the presence of missing values when reading from Parquet data" to "Inconsistent mean() in the presence of missing values" on Dec 18, 2024