diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml index 85e51c0b..6cabe407 100644 --- a/.github/workflows/R-CMD-check.yaml +++ b/.github/workflows/R-CMD-check.yaml @@ -73,7 +73,7 @@ jobs: shell: Rscript {0} # install julia - - uses: julia-actions/setup-julia@v1 + - uses: julia-actions/setup-julia@v2 # add julia to renviron - name: Create and populate .Renviron file run: echo JULIA_BINDIR= "${{ env.juliaLocation }}" >> ~/.Renviron diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION index 790e6a20..91ce739c 100644 --- a/CRAN-SUBMISSION +++ b/CRAN-SUBMISSION @@ -1,4 +1,4 @@ -Version: 0.14.0 -Date: 2023-07-09 19:55:48 UTC +Version: 0.14.2 +Date: 2023-07-09 20:03:37 UTC SHA: - 431829b4d99cd13f0652e71483b3ce8e87cb5b58 + ecc6d5bda291c1b8af3348f757e4bef98f52dffb diff --git a/DESCRIPTION b/DESCRIPTION index abb9d107..10c0b16d 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,17 +1,17 @@ Package: fwildclusterboot Title: Fast Wild Cluster Bootstrap Inference for Linear Models -Version: 0.14.3 +Version: 0.15.0 Authors@R: c( person("Alexander", "Fischer", , "alexander-fischer1801@t-online.de", role = c("aut", "cre")), person("David", "Roodman", role = "aut"), - person("Megha", "Joshi", - role = "rev", - comment = "Megha reviewed the package (v. 0.13) for ropensci + person("Megha", "Joshi", + role = "rev", + comment = "Megha reviewed the package (v. 0.13) for ropensci , see " ), person("Eunseop", "Kim", - role = "rev", - comment = "Eunseop reviewed the package (v. 0.13) for ropensci + role = "rev", + comment = "Eunseop reviewed the package (v. 0.13) for ropensci , see " ), person("Achim", "Zeileis", role = "ctb", @@ -50,11 +50,9 @@ Imports: dreamerr, Formula, generics, - gtools, JuliaConnectoR, Matrix, Rcpp, - rlang, summclust Suggests: bench, @@ -73,7 +71,7 @@ Suggests: rmarkdown, sandwich, testthat (>= 3.0.0), - tibble, + tibble, MASS LinkingTo: Rcpp, diff --git a/NAMESPACE b/NAMESPACE index 4596da39..8a4dcf9a 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -56,11 +56,7 @@ importFrom(dreamerr,validate_dots) importFrom(dreamerr,warn_up) importFrom(generics,glance) importFrom(generics,tidy) -importFrom(gtools,permutations) importFrom(parallel,detectCores) -importFrom(rlang,abort) -importFrom(rlang,inform) -importFrom(rlang,warn) importFrom(stats,aggregate) importFrom(stats,as.formula) importFrom(stats,coef) diff --git a/NEWS.md b/NEWS.md index 91ed8afd..64844a09 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,13 +1,47 @@ -# fwildclusterboot 0.14.3 +# fwildclusterboot 0.15 + +- Switches from `dqrng::dqsample()` to `dqrng::dqrrademacher()` for sampling of Rademacher weights. In consequence, + the same seed might no longer produce identical results between versions. But if you're number of bootstrap iterations + has been large enough, this should not impact any of your conclusions =) + Good performance improvements. See below for benchmarks. +- Updates ropensci review tags. +- Bumps lifecycle badge to stable. +- Error message handling via base R functions. +- The `rlang` and `gtools` dependencies are dropped. + +```r + > library(bench) +> +> n <- 999999 * 100 +> +> mark( ++ dqrrademacher(n), ++ dqrng::dqsample( ++ x = c(-1L, 1L), ++ size = n, ++ replace = TRUE ++ ), ++ iterations = 5, ++ check = FALSE ++ ) +# A tibble: 2 × 13 + expression min median itr/s…¹ mem_a…² gc/se…³ n_itr n_gc + +1 dqrrademacher(n) 123ms 166ms 5.25 381MB 3.15 5 3 +2 dqrng::dqsample(x = c(-1L, 1L), size = n, replace = TRUE) 615ms 647ms 1.47 763MB 1.47 5 5 +# … with 5 more variables: total_time , result , memory , time , gc , and +# abbreviated variable names ¹​`itr/sec`, ²​mem_alloc, ³​`gc/sec` +``` -- Fix a bug with CI inversion when `r` was set to be close to the estimated parameter. (CI inversion failed). See [#138](https://github.com/s3alfisc/fwildclusterboot/issues/138). Thanks to Achim Zeileis & team for raising this issue! +# fwildclusterboot 0.14.3 +- Fixes a bug with CI inversion when r was set to be close to the estimated parameter, in which case the confidence interval inversion failed. See #138. Thanks to Achim Zeileis & team for raising this issue! +- This might lead to insubstantial differences in computed bootstrap confidence intervals compared to prior versions - the reason is that starting values for the root finding procedure used in inverting the CIs might have changed. # fwildclusterboot 0.14.2 Minor fixes for CRAN release. - # fwildclusterboot 0.14.1 - brings back the `print()` method, it had a use case after all @@ -18,8 +52,8 @@ Minor fixes for CRAN release. ## Breaking Changes -- the `print.boottest()` and `print.mboottest()` method have been deprecated, as both did not have a distinct use case. -- Bugfix: `boottest()` should never have run with `fixest::feols()` and +- the `print.boottest()` and `print.mboottest()` method have been deprecated, as both did not have a distinct use case. +- Bugfix: `boottest()` should never have run with `fixest::feols()` and varying slopes syntax via `var1[var2]`. Unfortunately it did for the heteroskedastic bootstrap - it's a bug. I am very sorry if you are affected by this! This version adds an error message for this case. ## Performance @@ -28,15 +62,15 @@ Version 0.14 ... - sparsifies the "fast and reliable" bootstraps - bootstrap types 31, 33, 13 (which leads to good speed gains for problems with high dimensional fixed effects) - allows to project out cluster fixed effects when running the "fast and reliable" algorithms "11" and "31" -- computes the generalized inverse `pinv` via rcpp eigen instead of `MASS::ginv()` whenever `Matrix::solve()` fails +- computes the generalized inverse `pinv` via rcpp eigen instead of `MASS::ginv()` whenever `Matrix::solve()` fails - unlocks parallelization (nthreads was internally set to 1 for some reason) -## rOpenSci Review feedback +## rOpenSci Review feedback - update docs: - add a vignette on wild bootstrap concepts (wild bootstrap 101) - - better explanation of plot method in docs and vignette + - better explanation of plot method in docs and vignette - some guidelines on how to turn messages and warnings off - reorganization of ropensci ssr tags into code - it is now possible to interrupt rcpp loops @@ -50,15 +84,15 @@ Version 0.14 ... # fwildclusterboot 0.13 -## Potentially Breaking Changes: +## Potentially Breaking Changes: -* `boottest()`, `mboottest()` and `boot_aggregate()`no longer have a dedicated `seed` argument. From version 0.13, reproducibility of results can only be controlled by setting a **global seed** via `drqng::dqset.seed()` and `set.seed()`. For more context, see the discussion below. As a consequence, results produced via old versions of `fwildlcusterboot` are no longer exactly reproducible. +* `boottest()`, `mboottest()` and `boot_aggregate()`no longer have a dedicated `seed` argument. From version 0.13, reproducibility of results can only be controlled by setting a **global seed** via `drqng::dqset.seed()` and `set.seed()`. For more context, see the discussion below. As a consequence, results produced via old versions of `fwildlcusterboot` are no longer exactly reproducible. -* When the bootstrap is run via `engine = "WildBootTests.jl"`, the bootstrapped t-statistics and the original t-statistic are now returned as vectors (to align with the results from other `enginges`). Previously, they were returned as matrices. +* When the bootstrap is run via `engine = "WildBootTests.jl"`, the bootstrapped t-statistics and the original t-statistic are now returned as vectors (to align with the results from other `enginges`). Previously, they were returned as matrices. -## Other Changes: +## Other Changes: -* `boottest()` receives a new argument, `sampling`, which controls if random numbers are drawn via functions from `base` or the `dqrng` package. +* `boottest()` receives a new argument, `sampling`, which controls if random numbers are drawn via functions from `base` or the `dqrng` package. * Some code refactoring. All bootstrap algorithms and their associated files have been renamed (e.g. `boot_algo2.R` is not called `boot_algo_fastnwild.R`, etc.). * Much nicer error and message formatting, via `rlang::abort()`, `warn()` and `inform()`. `rlang` is added as a dependency. @@ -66,9 +100,9 @@ Version 0.14 ... Prior to the changes introduced in `v0.13`, `boottest()` will always call `set.seed()` or `dqrng::dqset.seed()` internally, regardless of whether the `seed` argument is specified or not (in the ladder case, it will create an internal seed by randomly drawing from a large set of integers). I consider this harmless, as setting seeds inside `boottest()` in this way does not affect the reproducibility of scripts run end-to-end. -However, I have learned that is generally considered bad practice to overwrite global variables without notification - for example, the authors of [numpy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html) have deprecated their `np.random.seed()` function for this reason. +However, I have learned that is generally considered bad practice to overwrite global variables without notification - for example, the authors of [numpy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html) have deprecated their `np.random.seed()` function for this reason. -Here is a quick example on what happens if a function "reseeds": it affects the future chain of random draws. +Here is a quick example on what happens if a function "reseeds": it affects the future chain of random draws. ```r fn_reseed <- function(x){set.seed(x)} @@ -85,27 +119,27 @@ rnorm(1); rnorm(1) # [1] -0.5604756 # [1] -0.2301775 ``` -The two 'second' calls to `rnorm(1)` are based on different global seed states. +The two 'second' calls to `rnorm(1)` are based on different global seed states. -As a result, I have decided to deprecate the `seed' function argument. Random number generation must now **be** set outside of `boottest()` using `set.seed()` and `dqrng::dqset.seed()`. +As a result, I have decided to deprecate the `seed' function argument. Random number generation must now **be** set outside of `boottest()` using `set.seed()` and `dqrng::dqset.seed()`. -This means that bootstrap results generated via versions < 0.13 will no longer be exactly replicable under the new version, but with a sufficiently large number of bootstrap iterations, this change should not affect any of your conclusions. +This means that bootstrap results generated via versions < 0.13 will no longer be exactly replicable under the new version, but with a sufficiently large number of bootstrap iterations, this change should not affect any of your conclusions. # fwildclusterboot 0.12.1 -This is a hot-fix release which turns of tests on CRAN that fail in non-standard CRAN test environments. +This is a hot-fix release which turns of tests on CRAN that fail in non-standard CRAN test environments. # fwildclusterboot 0.12 -This is the first CRAN release since version `0.9`. It comes with a set of new features, but also potentially breaking changes. This section summarizes all developments since version `0.9`. +This is the first CRAN release since version `0.9`. It comes with a set of new features, but also potentially breaking changes. This section summarizes all developments since version `0.9`. -### Potentially breaking changes: +### Potentially breaking changes: * `boottest()'s` function argument `boot_algo` has been renamed to `engine` * the `setBoottest_boot_algo()` function was renamed to `setBoottest_engine()` ### Bug fixes and internal changes -* When a multi-parameter hypothesis of the form R beta = r was tested, the *heteroskedastic* wild bootstrap would nevertheless always test +* When a multi-parameter hypothesis of the form R beta = r was tested, the *heteroskedastic* wild bootstrap would nevertheless always test "beta_k = 0" vs "beta_k != 0", with "beta_k = param". I am sorry for that bug! * The `Matrix.utils` package is at danger of CRAN removal - it has been replaced by custom functions for internal use. @@ -116,17 +150,17 @@ This is the first CRAN release since version `0.9`. It comes with a set of new f + The heteroskedastic bootstrap is now significantly faster, and WCR21 and WCR31 versions are now supported (i.e. HC2 and HC3 'imposed' on the bootstrap dgp.) -# fwildclusterboot 0.11.3 +# fwildclusterboot 0.11.3 + significant speed improvements for the heteroskedastic bootstrap -# fwildclusterboot 0.11.2 +# fwildclusterboot 0.11.2 + significant speed improvements for the x1 bootstrap algorithms, `bootstrap_type %in% c("11", "31")`, both for WCR and WCU -# fwildclusterboot 0.11.1 +# fwildclusterboot 0.11.1 ### New bootstrap algorithms following MNW (2022) @@ -136,35 +170,35 @@ This is the first CRAN release since version `0.9`. It comes with a set of new f A `boot_aggregate()` method to supports the aggregation of coefficients in staggered difference-in-differences following the methods by [Sun & Abraham (2021, Journal of Econometrics)](https://arxiv.org/pdf/1804.05785.pdf) in combination with the `sunab()` function from [`fixest`](https://lrberge.github.io/fixest/reference/sunab.html)has been added. Essentially, `boot_aggregate()` is a copy of [`aggregate.fixest`](https://lrberge.github.io/fixest/reference/aggregate.fixest.html): the only difference is that inference is powered by a wild bootstrap. -### Other syntax changes, potentially breaking! +### Other syntax changes, potentially breaking! + The `boot_algo` function argument has been renamed to `engine`. + The `setBoottest_boot_algo()` function has been renamed to `setBoottest_engine()`. -In consequence, the syntax introduced in 0.11 changes to +In consequence, the syntax introduced in 0.11 changes to ``` boottest( - lm_fit, - param = ~treatment, + lm_fit, + param = ~treatment, clustid = ~group_id1, - B = 9999, + B = 9999, impose_null = TRUE, - engine = "R", + engine = "R", bootstrap_type = "11" ) ``` -To run everything through `WildBootTests.jl`, you would have to specify +To run everything through `WildBootTests.jl`, you would have to specify ``` boottest( - lm_fit, - param = ~treatment, + lm_fit, + param = ~treatment, clustid = ~group_id1, - B = 9999, + B = 9999, impose_null = TRUE, - engine = "WildBootTests.jl", + engine = "WildBootTests.jl", bootstrap_type = "11" ) ``` @@ -172,18 +206,18 @@ boottest( # fwildclusterboot 0.11 -+ This release introduces new wild cluster bootstrap variants as described in [MacKinnon, Nielsen & Webb (2022)](https://www.econ.queensu.ca/sites/econ.queensu.ca/files/wpaper/qed_wp_1485.pdf). The implementation is still quite bare-bone: it only allows to test hypotheses of the form $\beta_k = 0$ vs $\beta_k \neq 0$, does not allow for regression weights or fixed effects, and further does not compute confidence intervals. ++ This release introduces new wild cluster bootstrap variants as described in [MacKinnon, Nielsen & Webb (2022)](https://www.econ.queensu.ca/sites/econ.queensu.ca/files/wpaper/qed_wp_1485.pdf). The implementation is still quite bare-bone: it only allows to test hypotheses of the form $\beta_k = 0$ vs $\beta_k \neq 0$, does not allow for regression weights or fixed effects, and further does not compute confidence intervals. -You can run one of the 'new' variants - e.g. the "WCR13", by specifying the `bootstrap_type` function argument accordingly: +You can run one of the 'new' variants - e.g. the "WCR13", by specifying the `bootstrap_type` function argument accordingly: ``` boottest( - lm_fit, - param = ~treatment, + lm_fit, + param = ~treatment, clustid = ~group_id1, - B = 9999, + B = 9999, impose_null = TRUE, - engine = "R", + engine = "R", bootstrap_type = "31" ) ``` @@ -195,14 +229,14 @@ boottest( + multiple (internal) changes for ropensci standards alignment + drop the `t_boot` (`teststat_boot`) function arguments -> they are now TRUE by default -+ fix a bug in the lean algorithms - it always tested hypotheses of the form ++ fix a bug in the lean algorithms - it always tested hypotheses of the form beta = 0 instead of R'beta = r, even when R != 1 and r != 0 -+ enable full enumeration for R-lean tests ++ enable full enumeration for R-lean tests + enable deterministic 'full enumeration tests' - these are exact # fwildclusterboot 0.9 -+ v0.9 moves data pre-processing from `model.frame` methods to `model_matrix` methods. I had wanted to do so for a while, but issue #42, as raised by Michael Topper, has finally convinced me to start working on this project. ++ v0.9 moves data pre-processing from `model.frame` methods to `model_matrix` methods. I had wanted to do so for a while, but issue #42, as raised by Michael Topper, has finally convinced me to start working on this project. + Moving to `model_matrix` methods unlocks new functionality for how `boottest()` plays with `fixest` objects - it is now possible to run `boottest()` after `feols()` models that use syntactic sugar: @@ -221,15 +255,15 @@ boot1 <- boottest(feols_fit, ) feols_fits <- fixest::feols(proposition_vote ~ treatment | sw(Q1_immigration, Q2_defense), data = voters) -res <- lapply(feols_fits, \(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1")) +res <- lapply(feols_fits, \(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1")) voters$split <- sample(1:2, nrow(voters), TRUE) feols_fits <- fixest::feols(proposition_vote ~ treatment, split = ~split, data = voters) -res <- lapply(feols_fits, \(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1")) +res <- lapply(feols_fits, \(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1")) ``` -Some formula sugar still leads to errors, e.g. +Some formula sugar still leads to errors, e.g. ``` feols_fit2 <- feols(proposition_vote ~ treatment | Q1_immigration^Q2_defense, @@ -245,7 +279,7 @@ boot1 <- boottest(feols_fit2, + The release further fixes a multicollinearity bug that occured when `lm()` or `fixest()` silently deleted multicollinar variable(s). Thanks to Kurt Schmidheiny for reporting! -+ The `na_omit` function argument has been dropped. If the cluster variable is not included in the regression model, it is now not allowed to contain NA values. ++ The `na_omit` function argument has been dropped. If the cluster variable is not included in the regression model, it is now not allowed to contain NA values. + Several function arguments can now be fed to `boottest()` as formulas (`param`, `clustid`, `bootcluster`, `fe`). @@ -267,7 +301,7 @@ boot <- boottest(feols_fit, #### boot_algo = 'WildBootTests.jl' -+ `fwildclusterboot` now supports calling [WildBootTests.jl](https://github.com/droodman/WildBootTests.jl), which is a very fast Julia implementation of the wild cluster bootstrap algorithm. To do so, a new function argument is introduced, `boot_algo`, through which it is possible to control the executed bootstrap algorithm. ++ `fwildclusterboot` now supports calling [WildBootTests.jl](https://github.com/droodman/WildBootTests.jl), which is a very fast Julia implementation of the wild cluster bootstrap algorithm. To do so, a new function argument is introduced, `boot_algo`, through which it is possible to control the executed bootstrap algorithm. ```{r} # load data set voters included in fwildclusterboot @@ -275,16 +309,16 @@ data(voters) # estimate the regression model via lm lm_fit <- lm(proposition_vote ~ treatment + ideology1 + log_income + Q1_immigration , data = voters) boot_lm <- boottest( - lm_fit, - clustid = "group_id1", - param = "treatment", - B = 9999, + lm_fit, + clustid = "group_id1", + param = "treatment", + B = 9999, boot_algo = "WildBootTests.jl" ) ``` -+ WildBootTests.jl is (after compilation) orders of magnitudes faster than `fwildclusterboot's` native R implementation, and speed gains are particularly pronounced for large problems with a large number of clusters and many bootstrap iterations. ++ WildBootTests.jl is (after compilation) orders of magnitudes faster than `fwildclusterboot's` native R implementation, and speed gains are particularly pronounced for large problems with a large number of clusters and many bootstrap iterations. -+ Furthermore, `WildBootTests.jl` supports a range of models and tests that were previously not supported by `fwildclusterboot`: most importantly a) wild cluster bootstrap tests of multiple joint hypotheses and b) the WRE bootstrap by Davidson & MacKinnon for instrumental variables estimation. On top of the cake ... the WRE is really fast. ++ Furthermore, `WildBootTests.jl` supports a range of models and tests that were previously not supported by `fwildclusterboot`: most importantly a) wild cluster bootstrap tests of multiple joint hypotheses and b) the WRE bootstrap by Davidson & MacKinnon for instrumental variables estimation. On top of the cake ... the WRE is really fast. ```{r} library(ivreg) @@ -312,23 +346,23 @@ generics::tidy(boot_ivreg) + Also, note that running the wild cluster bootstrap through `WildBootTests.jl` is often very memory-efficient. -#### boot_algo = 'R-lean' +#### boot_algo = 'R-lean' A key limitation of the vectorized 'fast' cluster bootstrap algorithm as implemented in `fwildclusterboot` is that it is very memory-demanding. For 'larger' problems, running `boottest()` might lead to out-of-memory errors. To offer an alternative, `boottest()` now ships a 'new' rcpp- and loop-based implementation of the wild cluster bootstrap (the 'wild2' algorithm in Roodman et al). ```{r} boot_lm <- boottest( - lm_fit, - clustid = "group_id1", - param = "treatment", - B = 9999, + lm_fit, + clustid = "group_id1", + param = "treatment", + B = 9999, boot_algo = "R-lean" ) ``` -### Heteroskeadstic Wild Bootstrap +### Heteroskeadstic Wild Bootstrap -It is now possible to run `boottest()` without specifying a `clustid` function argument. In this case, `boottest()` runs a heteroskedasticity-robust wild bootstrap (HC1), which is implemented in c++. +It is now possible to run `boottest()` without specifying a `clustid` function argument. In this case, `boottest()` runs a heteroskedasticity-robust wild bootstrap (HC1), which is implemented in c++. ```{r} boot_hc1 <- boottest(lm_fit, param = "treatment", B = 9999) @@ -337,11 +371,11 @@ summary(boot_hc1) ### `boottest()` function argument `beta0` deprecated -For consistency with `WildBootTests.jl`, the `boottest()` function argument `beta0` is now replaced by a new function argument, `r`. +For consistency with `WildBootTests.jl`, the `boottest()` function argument `beta0` is now replaced by a new function argument, `r`. ### Frühjahrsputz -I have spent some time to clean up `fwildclusterboot's` internals, which should now hopefully be more readable and easier to maintain. +I have spent some time to clean up `fwildclusterboot's` internals, which should now hopefully be more readable and easier to maintain. ### Testing @@ -350,82 +384,82 @@ I have spent some time to clean up `fwildclusterboot's` internals, which should # fwildclusterboot 0.7 -+ Bug fixes, see issues [#26](https://github.com/s3alfisc/fwildclusterboot/issues/26) and [#27](https://github.com/s3alfisc/fwildclusterboot/issues/27) regarding preprocessing for fixest when weights are passed to feols() as a formula or when cluster is specified in fixest as a column vector. ++ Bug fixes, see issues [#26](https://github.com/s3alfisc/fwildclusterboot/issues/26) and [#27](https://github.com/s3alfisc/fwildclusterboot/issues/27) regarding preprocessing for fixest when weights are passed to feols() as a formula or when cluster is specified in fixest as a column vector. # fwildclusterboot 0.6 -+ Bug fix: for one-sided hypotheses for the WRU bootstrap (if impose_null = FALSE), the returned p-values were incorrect - they were reported as 'p', but should have been '1-p'. E.g. if the reported p-values was reported as 0.4, it should have been reported as 0.6. -+ A new function argument `ssc` gives more control over the small sample adjustments made within `boottest()`. It closely mirrors the `ssc` argument in `fixest`. The only difference is that `fwildclusterboot::boot_ssc()'s` `fixef.K` argument currently has only one option, `'none'`, which means that the fixed effect parameters are discarded when calculating the number of estimated parameters k. -The default argument of `boot_ssc()` are `adj = TRUE, fixef.K = "none", cluster.adj = TRUE` and `cluster.df = "conventional"`. In fixest, the `cluster.df` argument is `"min"` by default. ++ Bug fix: for one-sided hypotheses for the WRU bootstrap (if impose_null = FALSE), the returned p-values were incorrect - they were reported as 'p', but should have been '1-p'. E.g. if the reported p-values was reported as 0.4, it should have been reported as 0.6. ++ A new function argument `ssc` gives more control over the small sample adjustments made within `boottest()`. It closely mirrors the `ssc` argument in `fixest`. The only difference is that `fwildclusterboot::boot_ssc()'s` `fixef.K` argument currently has only one option, `'none'`, which means that the fixed effect parameters are discarded when calculating the number of estimated parameters k. +The default argument of `boot_ssc()` are `adj = TRUE, fixef.K = "none", cluster.adj = TRUE` and `cluster.df = "conventional"`. In fixest, the `cluster.df` argument is `"min"` by default. Prior to v 0.6, by default, no small sample adjustments regarding the sample size N and the number of estimated parameters k were applied. The changes in v0.6 may slightly affect the output of `boottest()`. For exact reproducibility of previous results, set `adj = FALSE`. Setting `adj = TRUE` will not affect p-values and confidence intervals for *oneway clustering*, but the internally calculated t-stat, which is divided by sqrt(N-k)/(N-1). For *twoway* clustering, it might affect the number and order of invalid bootstrapped t-statistics (due to non-positive definite covariance matrices) and, through this channel, affect bootstrapped inferential parameters. -+ Testing: unit tests are now run on [github actions](https://github.com/s3alfisc/fwildclusterboot/actions/workflows/R-CMD-check.yaml) against [wildboottestjlr](https://github.com/s3alfisc/wildboottestjlr), which is a [JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR) based wrapper around [WildBootTests.jl](https://github.com/droodman/WildBootTests.jl), a Julia implementation of the fast wild cluster bootstrap algorithm. -+ Additionally, minor speed tweaks. ++ Testing: unit tests are now run on [github actions](https://github.com/s3alfisc/fwildclusterboot/actions/workflows/R-CMD-check.yaml) against [wildboottestjlr](https://github.com/s3alfisc/wildboottestjlr), which is a [JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR) based wrapper around [WildBootTests.jl](https://github.com/droodman/WildBootTests.jl), a Julia implementation of the fast wild cluster bootstrap algorithm. ++ Additionally, minor speed tweaks. # fwildclusterboot 0.5.1 -+ Fixes a bug with Mammen weights introduced in version 0.5 -> switch back to `sample()` function. To guarantee reproducibilty with Mammen weights, either a seed ++ Fixes a bug with Mammen weights introduced in version 0.5 -> switch back to `sample()` function. To guarantee reproducibilty with Mammen weights, either a seed needs to be specified in `boottest()` or a global seed needs to be set via `set.seed()`. + Deletes some unnecessary computations from boot_algo2() -> speed improvements -+ For B = 2^(#number of clusters), Rademacher weights should have been enumerated - ++ For B = 2^(#number of clusters), Rademacher weights should have been enumerated - instead, they were drawn randomly and enumeration only occured for B > 2^(#number of clusters). Now, enumeration occurs if B >= 2^(#number of clusters). # fwildclusterboot 0.5 -+ Version 0.5 fixes an error for the bootstrap with weighted least squares introduced with version 0.4. All unit tests ++ Version 0.5 fixes an error for the bootstrap with weighted least squares introduced with version 0.4. All unit tests that compare fwildclusterboot with weighted least squares results from boottest.stata pass. In particular, enumerated cases pass with exact equality (in such cases, the bootstrap weights matrices are exactly identical in both R and Stata). + `boottest()` now stops if `fixest::feols()` deletes non-NA values (e.g. singleton fixed effects deletion) and asks the user to delete such rows prior to estimation via `feols()` & `boottest()`. Currently, `boottest()'s` pre-processing cannot handle such deletions - this remains future work. + To align `fwildclusterboot` with Stata's boottest command (Roodman et al, 2019), Mammen weights are no longer enumerated in `fwildclusterboot::boottest()`. -+ `boottest()` no longer sets an internal seed (previously set.seed(1)) if no seed is provided as a function argument. ++ `boottest()` no longer sets an internal seed (previously set.seed(1)) if no seed is provided as a function argument. + Sampling of the bootstrap weights is now powered by the `dqrng` package, which speeds up the creation of the bootstrap weights matrix. To set a "global" seed, one now has use the `dqset.seed()` function from the `dqrng package`, which is added as a dependency. # fwilclusterboot 0.4 -+ New feature I: `boottest()` now allows for univariate tests that involve multiple - variables. E.g. one can now test hypothesis as `var1 + var2 = c` where c is a scalar. More details on the syntax can be found in the vignette. All methods of for ++ New feature I: `boottest()` now allows for univariate tests that involve multiple + variables. E.g. one can now test hypothesis as `var1 + var2 = c` where c is a scalar. More details on the syntax can be found in the vignette. All methods of for objects of class `boottest` have been updated. -+ New feature II: `boottest()` now also supports "equal-tailed" p-values and one-sided hypotheses. For one-sided tests, confidence intervals are currently not supported. -+ Internal changes: To allow for multivariable tests, the `boot_algo2()` function has slightly been modified. `invert_p_val2()` is superseded by `invert_p_val()`. -+ Further, a CRAN error is fixed - some tests for exact equality failed with relative difference e-05 on openBLAS. In consequence, all exact tests are set to reltol = 1e-04. -+ Make better use of functionality of `dreamerr` package. ++ New feature II: `boottest()` now also supports "equal-tailed" p-values and one-sided hypotheses. For one-sided tests, confidence intervals are currently not supported. ++ Internal changes: To allow for multivariable tests, the `boot_algo2()` function has slightly been modified. `invert_p_val2()` is superseded by `invert_p_val()`. ++ Further, a CRAN error is fixed - some tests for exact equality failed with relative difference e-05 on openBLAS. In consequence, all exact tests are set to reltol = 1e-04. ++ Make better use of functionality of `dreamerr` package. # fwildclusterboot 0.3.7 -+ Bug fix: the output of `boottest()` varied depending on the class of the - input fixed effects for regressions both via `lfe::felm()` and `fixest::feols()`. - This bug occurred because `boottest()` does not work with a pre-processed model.frame object from either `felm()` or `feols()` but works with the original input data. While both `felm()` and `feols()` change non-factor fixed effects variables to factors internally, `boottest()` did not check but implicitely assumed that all fixed effects used in the regression models are indeed factors in the original data set. As a consequence, if one or more fixed effects were e.g. numeric, `boottest()` would produce incorrect results without throwing an error. - With version 0.3.7, `boottest()` checks internally if all variables in the original data set which are used as fixed effects are factor variables and if not, changes them to factors. - Thanks for timotheedotc for raising the issue on github, which can be found here: https://github.com/s3alfisc/fwildclusterboot/issues/14. ++ Bug fix: the output of `boottest()` varied depending on the class of the + input fixed effects for regressions both via `lfe::felm()` and `fixest::feols()`. + This bug occurred because `boottest()` does not work with a pre-processed model.frame object from either `felm()` or `feols()` but works with the original input data. While both `felm()` and `feols()` change non-factor fixed effects variables to factors internally, `boottest()` did not check but implicitely assumed that all fixed effects used in the regression models are indeed factors in the original data set. As a consequence, if one or more fixed effects were e.g. numeric, `boottest()` would produce incorrect results without throwing an error. + With version 0.3.7, `boottest()` checks internally if all variables in the original data set which are used as fixed effects are factor variables and if not, changes them to factors. + Thanks for timotheedotc for raising the issue on github, which can be found here: https://github.com/s3alfisc/fwildclusterboot/issues/14. + Some tests have been added that compare output from `boottest()` with the wild cluster bootstrap implemented via `clusterSEs`. # fwildclusterboot 0.3.6 - -+ Bug fix regarding suggested packages and CRAN: [see github issue #12](https://github.com/s3alfisc/fwildclusterboot/issues/12). + ++ Bug fix regarding suggested packages and CRAN: [see github issue #12](https://github.com/s3alfisc/fwildclusterboot/issues/12). Added `if(requireNamespace("pkgname"))` statements for suggested packages in the vignettes, -examples and tests. Note that unit tests will now only execute on CRAN if both `fixest` and `lfe` +examples and tests. Note that unit tests will now only execute on CRAN if both `fixest` and `lfe` can be installed on the OS. # fwildclusterboot 0.3.5 + Bug fix: For Rademacher and Mammen weights and cases where (2^ number of clusters) < # boostrap iterations, (deterministic ) full enumeration should have been employed for sampling the bootstrap weights. -Full enumeration means the following: for e.g. 6 numbers of clusters, only -2^6 = 64 unique draws from either the Rademacher or Mammen distributions exists. -Therefore, `boottest()` overwrites the user-provided number of bootstrap iterations -to $B = \text{(2^ number of clusters)}$ if a larger number is chosen. -The bug now occured because the bootstrap weights were drawn **randomly with +Full enumeration means the following: for e.g. 6 numbers of clusters, only +2^6 = 64 unique draws from either the Rademacher or Mammen distributions exists. +Therefore, `boottest()` overwrites the user-provided number of bootstrap iterations +to $B = \text{(2^ number of clusters)}$ if a larger number is chosen. +The bug now occured because the bootstrap weights were drawn **randomly with replacement** instead of using **full enumeration**. Note: full enumeration was introduced with version 0.3.3. Thanks to fschoner for finding the bug! [see github issue #11](https://github.com/s3alfisc/fwildclusterboot/issues/11) -+ Bug fix: A small bug has been fixed related to missing values in the cluster variables. ++ Bug fix: A small bug has been fixed related to missing values in the cluster variables. -+ By default, `boottest()` now sets an internal seed if no seed is provided by ++ By default, `boottest()` now sets an internal seed if no seed is provided by the user via the `seed` function argument. -+ Several improvements to the documentation. ++ Several improvements to the documentation. # fwildclusterboot 0.3.4 @@ -438,30 +472,30 @@ Note: full enumeration was introduced with version 0.3.3. Thanks to fschoner for # fwildclusterboot 0.3.2 -+ Fixes a CRAN test error message for Oracle Solaris. ++ Fixes a CRAN test error message for Oracle Solaris. -# fwildclusterboot 0.3.1 +# fwildclusterboot 0.3.1 -+ A `glance.boottest()` method was added, which enables the use of the `modelsummary` package with `fwildclusterboot`. ++ A `glance.boottest()` method was added, which enables the use of the `modelsummary` package with `fwildclusterboot`. + The `tidy.boottest()` method is no longer exported. You can still access it via `fwildclusterboot:::tidy.boottest()` or by loading the `generics` package via `library(generics)`. # fwildclusterboot 0.3.0 -+ Additional performance improvements through parallelization. By default, - `boottest()` uses half the available threads for parallel execution. The ++ Additional performance improvements through parallelization. By default, + `boottest()` uses half the available threads for parallel execution. The number of threads can be set via the `nthreads` function argument. -+ Additional function arguments for `boottest()` - the user can now set the ++ Additional function arguments for `boottest()` - the user can now set the tolerance and maximum number of iterations for the calculation of - confidence intervals. By default, `tol = 1e-6` and + confidence intervals. By default, `tol = 1e-6` and `maxiter = 10`. + The package no longer depends on `data.table` and `fabricatr` - both are now - only suggested. Further, the package now comes with an example data set + only suggested. Further, the package now comes with an example data set 'voters'. # fwildclusterboot 0.2.0 -Add support for +Add support for + tests of two-sided, univariate hypotheses + regression weights diff --git a/R/arg_checks.R b/R/arg_checks.R index 75e82ae6..2f8944bb 100644 --- a/R/arg_checks.R +++ b/R/arg_checks.R @@ -2,8 +2,6 @@ #' @param object an object of type lm, fixest or felm #' @param param character vector - boottest() function arg 'param' #' -#' @importFrom rlang abort warn inform -#' #' @noRd check_params_in_model <- function(object, param) { @@ -11,22 +9,14 @@ check_params_in_model <- function(object, param) { # for lm and fixest if (inherits(object, "lm")) { if (mean(param %in% c(names(coef(object)))) != 1) { - rlang::abort(paste("The parameter", param, "is not included in the estimated - model. Maybe you are trying to test for an interaction - parameter? To see all model parameter names, - run names(coef(object))."), - use_cli_format = TRUE) + param_not_in_model_error(param) } } # for lm and fixest if (inherits(object, "fixest")) { if (mean(param %in% c(names(coef(object)))) != 1) { - rlang::abort(paste("The parameter", param, "is not included in the estimated - model. Maybe you are trying to test for an interaction - parameter? To see all model parameter names, - run names(coef(model))."), - use_cli_format = TRUE) + param_not_in_model_error(param) } } @@ -34,21 +24,15 @@ check_params_in_model <- function(object, param) { if (inherits(object, "felm")) { # check if param(s) is (are) in model if (mean(param %in% c(rownames(object$coefficients))) != 1) { - rlang::abort(paste("The parameter", param, "is not included - in the estimated model. Maybe you are trying to - test for an interaction parameter? To see all model - parameter names, run names(coef(model))."), - use_cli_format = TRUE) + param_not_in_model_error(param) } } if (inherits(object, "ivreg")) { # which parametrs can be tested? if (mean(param %in% names(c(object$exogenous, object$endogenous)) != 1)) { - rlang::abort(paste("The parameter", param, "is not included in the estimated - model. Maybe you are trying to test for an interaction - parameter? To see all model parameter names, - run names(coef(model)).")) } + param_not_in_model_error(param) + } } } @@ -60,35 +44,23 @@ check_params_in_model <- function(object, param) { #' @param B the number of bootstrap iterations #' @param fe NULL or numeric scalar - the fixed effect to be projected #' out in the bootstrap -#' @noRd #' check_boottest_args_plus <- function( object, R, param, sign_level, B, fe = NULL) { if (inherits(object, "ivreg")) { if (object$method != "OLS") { - rlang::abort("Currently, only 2SLS is supported. Please set the `ivreg` - function argument `method` to `OLS`.") + only_2SLS_for_IV_error() } } - if (inherits(object, "felm")) { if (!is.null(fe)) { if (fe %in% param) { - rlang::abort(paste("The function argument fe =", fe, "is included in the - hypothesis (via the `param` argument). This is not allowed. - Please set fe to another factor variable or NULL."), - call. = FALSE - ) + fe_in_test_error(fe) } if (!(fe %in% names(object$fe))) { - rlang::abort(paste( - "The fixed effect to be projected out in the bootstrap,", - fe, "is not included as a dedicated fixed effect in the - estimated model." - )) - } + fixef_not_fixef_error(fe) } } } @@ -103,7 +75,8 @@ check_boottest_args_plus <- function( if ( # '^' illegal in fixef argument, but legal in main formula - # e.g. fml = y ~ x1 + I(x2^2) shold be possible - #' @srrstats {G2.4c} *explicit conversion to character via `as.character()` + #' @srrstats {G2.4c} *explicit conversion to character via + #' `as.character()` #' (and not `paste` or `paste0`)* Done ("fixef_vars" %in% names(object) && @@ -114,16 +87,16 @@ check_boottest_args_plus <- function( # note: whitespace ~ - for IV # grepl("~", deparse_fml, fixed = TRUE) ) { - rlang::abort("Advanced formula notation in fixest / fixest via ^ to interact - fixed effects is currently not supported in boottest().") + advanced_formula_error() } - + if ( # '^' illegal in fixef argument, but legal in main formula - # e.g. fml = y ~ x1 + I(x2^2) shold be possible - #' @srrstats {G2.4c} *explicit conversion to character via `as.character()` + #' @srrstats {G2.4c} *explicit conversion to character via + #' `as.character()` #' (and not `paste` or `paste0`)* Done - + ("fixef_vars" %in% names(object) && grepl("[", Reduce(paste, as.character(as.formula(object$fml_all$fixef))), @@ -132,36 +105,29 @@ check_boottest_args_plus <- function( # note: whitespace ~ - for IV # grepl("~", deparse_fml, fixed = TRUE) ) { - rlang::abort("Varying slopes in fixest / fixest via [] to interact - fixed effects is currently not supported in boottest().") + varying_slopes_error() } - + if (!is.null(fe)) { if (fe %in% param) { - rlang::abort(paste("The function argument fe =", fe, "is included in the - hypothesis (via the `param` argument). This is not allowed. - Please set fe to another factor variable or NULL."), - call. = FALSE - ) + fe_in_test_error(fe) } if (!(fe %in% object$fixef_vars)) { - rlang::abort(paste( - "The fixed effect to be projected out in the bootstrap,", - fe, "is not included as a dedicated fixed effect - in the estimated model." - )) - } + fixef_not_fixef_error(fe) } } } if (((1 - sign_level) * (B + 1)) %% 1 != 0) { - rlang::inform( - paste("Note: The bootstrap usually performs best when the confidence", - "level (here,", 1 - sign_level, "% times the number of replications", - "plus 1 (", B, "+ 1 = ", B + 1, ") is an integer."), - use_cli_format = TRUE + message( + paste( + "Note: The bootstrap usually performs best when the confidence", + "level (here,", + 1 - sign_level, + "%) times the number of replications", + "plus 1 (", B, "+ 1 = ", B + 1, ") is an integer." + ) ) } } @@ -181,12 +147,7 @@ check_mboottest_args_plus <- function(object, R, r, fe) { if (inherits(object, "felm")) { if (!is.null(fe)) { if (!(fe %in% names(object$fe))) { - rlang::abort(paste( - "The fixed effect to be projected out in the bootstrap,", - fe, "is not included as a dedicated fixed effect in the - estimated model." - )) - } + fixef_not_fixef_error(fe) } } } @@ -208,28 +169,19 @@ check_mboottest_args_plus <- function(object, R, r, fe) { )) ) { - rlang::abort("Advanced formula notation in fixest / fixest via ^ to interact - fixed effects is currently not supported in boottest().") + advanced_formula_error() } if (!is.null(fe)) { if (!(fe %in% object$fixef_vars)) { - rlang::abort(paste( - "The fixed effect to be projected out in the bootstrap,", - fe, "is not included as a dedicated fixed effect in the - estimated model." - )) + fixef_not_fixef_error(fe) } } } if (nrow(R) != length(r)) { - rlang::abort(paste( - "The dimensions of func args R and r do not match. The number - of rows of R is ", nrow(R), ", but the length of r is", - length(r), "." - )) + test_dimension_error() } } @@ -248,23 +200,19 @@ check_mboottest_args_plus <- function(object, R, r, fe) { check_r_lean <- function(weights, clustid, fe, impose_null){ if(length(clustid) > 1){ - rlang::abort("The R-lean algorithm currently only supports oneway clustering.") + no_clustering_for_rlean_error() } if (!is.null(fe)) { - rlang::abort("boottest() currently does not support - fixed effects with engine = 'R-lean'.") + no_fixef_for_rlean_error() } if(!is.null(weights)){ - rlang::abort("boottest() currently does not support regression - weights with engine = 'R-lean'.") + no_weights_for_rlean_error() } if(impose_null != TRUE){ - rlang::abort("boottest() currently does not support the 'WCU' bootstrap - - which does not impose the null on the - bootstrap dgp - for engine = 'R-lean'.") + no_wcu_for_rlean_error() } } @@ -273,7 +221,7 @@ check_r_lean <- function(weights, clustid, fe, impose_null){ check_boot_algo_fastnreliable <- function( weights, clustid, - bootcluster, + bootcluster, fe, bootstrap_type, R, @@ -293,49 +241,46 @@ check_boot_algo_fastnreliable <- function( if(R != 1L || r != 0){ - rlang::abort("Bootstraps of type '11', '13', '31', '33' currently - only support hypotheses of the form R * beta = 0 vs beta <> 0, - where R is a scalar equal to 1.") + non_scalar_test_error() + } if(length(clustid) > 1){ - rlang::abort("The '13', '31', and '33' bootstrap variants currently only support oneway clustering when 'boot_engine' == 'R'." - ) + only_onweway_clustering_for_fast_reliable } if (!is.null(fe)) { if(bootstrap_type %in% c("11", "31")){ if(!(fe == clustid)){ - rlang::abort("The '11' and '31'' - bootstrap variants currently only support cluster fixed effects in the bootstrap. Please set 'fe = NULL'.") + only_cluster_fixef_for_11_31_error() } - } - + } + if (bootstrap_type %in% c("13", "33")) { - rlang::abort("The '13' and '33'' - bootstrap variants currently do support cluster fixed effects in the bootstrap. Please set 'fe = NULL'.") - } + no_fixef_for_13_33_error() + } } if(!is.null(weights)){ - rlang::abort("The '11','13', '31', and '33' - bootstrap variants currently only support weights for engine = 'WildBootTests.jl'.") + no_weights_for_fast_reliable_error() } - + if(length(bootcluster) > 1){ - rlang::abort(paste("The subcluster bootstrap is currently not supported for the bootstrap types '11', '31', '13' and '33'.")) + no_subcluster_bootstrap_fast_reliable() } - + if(!(bootcluster %in% c("max", "min", clustid))){ - rlang::abort(paste("For the 'fast and reliable' implementations, the 'bootcluster' argument must be 'max', 'min' or", clustid, ".")) + bootcluster_not_max_min_clustid() } - + } -#' check if full enumeration should be employed, provide message when it is +#' check if full enumeration should be employed, provide message when it is. +#' in this case, overwrite the number of bootstrap iterations to the number of +#' enumerations. #' @param heteroskedastic Logical. Is the heteroskedastic or a wild cluster #' bootstrap being run? #' @param preprocess A list created via the preprocess2 function @@ -358,25 +303,7 @@ check_set_full_enumeration <- N_G_2 <- 2^N_G_bootcluster if (type == "rademacher") { if (N_G_2 <= B) { - rlang::warn( - paste( - "There are only", - N_G_2, - "unique draws from the rademacher distribution for", - N_G_bootcluster, - "bootstrap clusters. Therefore, B = ", - N_G_2, - " with full enumeration. Consider using webb weights instead. - Further, note that under full enumeration and with B =", - N_G_2, - "bootstrap draws, only 2^(#clusters - 1) = ", - 2^(N_G_bootcluster - 1), - " distinct t-statistics and p-values can be computed. For a - more thorough discussion, see Webb `Reworking wild bootstrap - based inference for clustered errors` (2013)." - ), - use_cli_format = TRUE - ) + full_enumeration_warning(N_G_2, N_G_bootcluster) full_enumeration <- TRUE if (engine != "WildBootTests.jl") { # this is handled internally by WildBootTests.jl, so don't update B @@ -404,36 +331,21 @@ check_set_full_enumeration <- r_algo_checks <- function(R, p_val_type, conf_int, B) { if (!is.null(R)) { if (length(nrow(R)) != 0) { - rlang::abort( - "Hypotheses with q>1 are currently only supported via WildBootTests.jl. - Please set the function argument 'engine = WildBootTests.jl'." - ) + multi_hypothesis_only_via_julia() } } if (p_val_type %in% c(">", "<")) { if (conf_int == TRUE) { conf_int <- FALSE - rlang::warn( - paste( - "Currently, boottest() calculates confidence intervals for one-sided - hypotheses only for `engine = 'WildBootTests.jl'`." - ), - use_cli_format = TRUE - ) + one_sided_cis_only_via_julia() } } if (conf_int == TRUE || is.null(conf_int)) { if (B < 100) { - rlang::abort( - "The function argument B is smaller than 100. The number of bootstrap - iterations needs to be 100 or higher in order to guarantee that the - root finding procudure used to find the confidence set - works properly.", - call. = FALSE - ) + B_too_small_error() } } } @@ -450,8 +362,7 @@ process_R <- function(R, param) { R <- rep(1, length(param)) } else { if (length(R) != length(param)) { - rlang::abort("The constraints vector must either be NULL or a numeric of - the same length as the `param` input vector.") + R_must_be_NULL_or_param() } } R @@ -464,32 +375,14 @@ check_engine_btype <- function( if(engine == "WildBootTests.jl"){ if(!(bootstrap_type %in% c("11", "fnw11", "31"))){ - rlang::abort( - paste( - "The bootstrap of type", - bootstrap_type, - "is not yet supported via 'WildBootTests. You can run it via - the 'R' engine instead.'") - ) + no_13_33_bootstap_via_julia(bootstrap_type) } } else if(engine == "R-lean"){ if(bootstrap_type != "fnw11"){ if(bootstrap_type == "31"){ - rlang::abort( - paste( - "The bootstrap of type", - bootstrap_type, - "is not yet supported via 'R-lean'. You can run it via - the 'R' or 'WildBootTests.jl' engines instead.'") - ) + no_bootstrap_type_for_rlean_error() } else { - rlang::abort( - paste( - "The bootstrap of type", - bootstrap_type, - "is not yet supported via 'R-lean'. You can run it via - the 'R' engine instead.'") - ) + no_bootstrap_type_for_rlean_error() } } @@ -502,11 +395,10 @@ check_bootstrap_types <- function(param, bootstrap_type){ if(length(param) > 1){ if(bootstrap_type != "fnw11"){ - rlang::abort("Only bootstrap_type = 'fnw11' is currently supported with - hypotheses that contain more than one parameter. This feature - will be added in the near future.") + non_scalar_test_error() } } } + diff --git a/R/boot_aggregate.R b/R/boot_aggregate.R index 7ee10778..38f64381 100644 --- a/R/boot_aggregate.R +++ b/R/boot_aggregate.R @@ -1,37 +1,37 @@ #' Simple tool that aggregates the value of CATT coefficients in -#' staggered difference-in-difference setups with inference based on +#' staggered difference-in-difference setups with inference based on #' a wild cluster bootstrap (see details) - similar to `fixest::aggregate()` -#' +#' #' This is a function helping to replicate the estimator from Sun and -#' Abraham (2021, Journal of Econometrics). You first need to perform -#' an estimation with cohort and relative periods dummies -#' (typically using the function i), this leads to estimators of the +#' Abraham (2021, Journal of Econometrics). You first need to perform +#' an estimation with cohort and relative periods dummies +#' (typically using the function i), this leads to estimators of the #' cohort average treatment effect on the treated (CATT). Then you can -#' use this function to retrieve the average treatment effect on each +#' use this function to retrieve the average treatment effect on each #' relative period,or for any other way you wish to aggregate the CATT. -#' -#' Note that contrary to the SA article, here the cohort share -#' in the sample is considered to be a perfect measure for the +#' +#' Note that contrary to the SA article, here the cohort share +#' in the sample is considered to be a perfect measure for the #' cohort share in the population. -#' -#' Most of this function is written by Laurent Bergé and used -#' in the fixest package published under GPL-3, +#' +#' Most of this function is written by Laurent Bergé and used +#' in the fixest package published under GPL-3, #' https://cran.r-project.org/web/packages/fixest/index.html #' minor changes by Alexander Fischer -#' +#' #' @param x An object of type fixest estimated using `sunab()` #' @param agg A character scalar describing the variable names to be -#' aggregated, it is pattern-based. All variables that match the pattern -#' will be aggregated. It must be of the form `"(root)"`, the parentheses -#' must be there and the resulting variable name will be `"root"`. You +#' aggregated, it is pattern-based. All variables that match the pattern +#' will be aggregated. It must be of the form `"(root)"`, the parentheses +#' must be there and the resulting variable name will be `"root"`. You #' can add another root with parentheses: `"(root1)regex(root2)"`, in -#' which case the resulting name is `"root1::root2"`. To name the resulting +#' which case the resulting name is `"root1::root2"`. To name the resulting #' variable differently you can pass a named vector: `c("name" = "pattern")` #' or `c("name" = "pattern(root2)")`. It's a bit intricate sorry, please #' see the examples. -#' @param full Logical scalar, defaults to `FALSE`. If `TRUE`, then all +#' @param full Logical scalar, defaults to `FALSE`. If `TRUE`, then all #' coefficients are returned, not only the aggregated coefficients. -#' @param use_weights Logical, default is `TRUE`. If the estimation was +#' @param use_weights Logical, default is `TRUE`. If the estimation was #' weighted, whether the aggregation should take into account the weights. #' Basically if the weights reflected frequency it should be `TRUE`. #' @param clustid A character vector or rhs formula containing the names of the @@ -63,7 +63,7 @@ #' of the inference procedure. E.g. sign_level = 0.05 #' returns 0.95% confidence intervals. By default, sign_level = 0.05. #' @param engine Character scalar. Either "R", "R-lean" or "WildBootTests.jl". -#' Controls if `boottest()` should run via its native R implementation +#' Controls if `boottest()` should run via its native R implementation #' or `WildBootTests.jl`. #' "R" is the default and implements the cluster bootstrap #' as in Roodman (2019). "WildBootTests.jl" executes the @@ -77,10 +77,10 @@ #' defaults to the "lean" algorithm. You can set the employed #' algorithm globally by using the #' `setBoottest_engine()` function. -#' @param bootstrap_type Determines which wild cluster bootstrap type should be -#' run. Options are "fnw11", which runs a "11" type -#' wild cluster bootstrap via the algorithm outlined in "fast and wild" -#' (Roodman et al (2019)). +#' @param bootstrap_type Determines which wild cluster bootstrap type should be +#' run. Options are "fnw11", which runs a "11" type +#' wild cluster bootstrap via the algorithm outlined in "fast and wild" +#' (Roodman et al (2019)). #' @param beta0 Deprecated function argument. Replaced by function argument 'r'. #' @param type character or function. The character string specifies the type #' of boostrap to use: One of "rademacher", "mammen", "norm" @@ -127,24 +127,24 @@ #' bootstrap-c instead of bootstrap-t. Only relevant when #' 'engine = "WildBootTests.jl"' #' @param sampling 'dqrng' or 'standard'. If 'dqrng', the 'dqrng' package is -#' used for random number generation (when available). If 'standard', -#' functions from the 'stats' package are used when available. -#' This argument is mostly a convenience to control random number generation in -#' a wrapper package around `fwildclusterboot`, `wildrwolf`. -#' I recommend to use the fast' option. -#' @param ... misc function arguments -#' +#' used for random number generation (when available). If 'standard', +#' functions from the 'stats' package are used when available. +#' This argument is mostly a convenience to control random number generation in +#' a wrapper package around `fwildclusterboot`, `wildrwolf`. +#' I recommend to use the fast' option. +#' @param ... misc function arguments +#' #' @export -#' +#' #' @importFrom utils setTxtProgressBar txtProgressBar -#' -#' @return A data frame with aggregated coefficients, p-values and -#' confidence intervals. -#' +#' +#' @return A data frame with aggregated coefficients, p-values and +#' confidence intervals. +#' #' @importFrom dreamerr check_value_plus -#' -#' @examples -#' +#' +#' @examples +#' #' \dontrun{ #' if(requireNamespace("fixest")){ #'library(fixest) @@ -153,30 +153,30 @@ #'res_sunab = feols(y ~ x1 + sunab(year_treated, year) | id + year, base_stagg) #'res_sunab_3ref = feols(y ~ x1 + sunab( #' year_treated, year, ref.p = c(.F + 0:2, -1)) | -#' id + year, +#' id + year, #' cluster = "id", -#' base_stagg, +#' base_stagg, #' ssc = ssc(adj = FALSE, cluster.adj = FALSE)) #' #' aggregate(res_sunab, agg = "ATT") #'# test ATT equivalence -#'boot_att <- +#'boot_att <- #' boot_aggregate( -#' res_sunab, -#' B = 9999, -#' agg = "ATT", -#' clustid = "id" +#' res_sunab, +#' B = 9999, +#' agg = "ATT", +#' clustid = "id" #' ) #' head(boot_att) -#' -#'#'boot_agg2 <- +#' +#'#'boot_agg2 <- #' boot_aggregate( -#' res_sunab, -#' B = 99999, +#' res_sunab, +#' B = 99999, #' agg = TRUE, #' ssc = boot_ssc(adj = FALSE, cluster.adj = FALSE) #' ) -#' +#' #' } #' } @@ -186,7 +186,7 @@ boot_aggregate <- function( agg, full = FALSE, use_weights = TRUE, - clustid = NULL, + clustid = NULL, B, bootstrap_type = "fnw11", bootcluster = "max", @@ -197,7 +197,7 @@ boot_aggregate <- function( impose_null = TRUE, p_val_type = "two-tailed", nthreads = getBoottest_nthreads(), - tol = 1e-06, + tol = 1e-06, maxiter = 10, ssc = boot_ssc( adj = TRUE, @@ -212,35 +212,29 @@ boot_aggregate <- function( getauxweights = FALSE, sampling = "dqrng", ...){ - - + + # note: all boottest function arguments are tested in boottest() # therefore, only check for supported subset of features - + check_arg(bootstrap_type, "charin(fnw11)") check_arg(full, "logical scalar") # => later => extend it to more than one set of vars to agg - - + + ssc <- quote( boot_ssc( - adj = ssc$adj, - fixef.K = ssc$fixef.K, - cluster.adj = ssc$cluster.adj, + adj = ssc$adj, + fixef.K = ssc$fixef.K, + cluster.adj = ssc$cluster.adj, cluster.df = ssc$cluster.df ) ) - - - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-bootagg", - engine = engine - ) - + + dots <- list(...) from_summary <- isTRUE(dots$from_summary) - + no_agg <- FALSE agg_rm <- NULL check_value_plus(agg, "match(att, period, cohort, TRUE) | scalar") @@ -253,56 +247,54 @@ boot_aggregate <- function( agg_rm <- gsub("E::", "E::-?", agg, fixed = TRUE) } else if(agg == "cohort"){ agg <- c("cohort" = "::[^-].*:cohort::(.+)") - agg_rm <- gsub("E::", "E::-?", + agg_rm <- gsub("E::", "E::-?", x$model_matrix_info$sunab$agg_att, fixed = TRUE) } else { agg <- x$model_matrix_info$sunab$agg_period } if(!is.null(agg_name)) names(agg) <- agg_name } - } else if(isFALSE(agg)){ + } else if(base::isFALSE(agg)){ agg <- c("nothing to remove" = "we want all the coefficients") } - + is_name <- !is.null(names(agg)) - + if(!is_name && !grepl("(", agg, fixed = TRUE)){ - rlang::abort( - c("Argument 'agg' must be a character in - which the pattern to match must be in between parentheses. - So far there are no parenthesis: please have a look at the examples."), - use_cli_format = TRUE + stop( + "Argument 'agg' must be a character in ", + "which the pattern to match must be in between parentheses.", + "So far there are no parenthesis: please have a look at the examples." ) } - + coef <- x$coefficients cname <- names(coef) V <- x$cov.scaled - - + + qui <- grepl(agg, cname, perl = TRUE) if(!any(qui)){ if(from_summary){ # We make it silent when aggregate is used in summary # => this way we can pool calls to agg even for models that don't have it # ==> useful in etable eg - return(list(coeftable = x$coeftable, + return(list(coeftable = x$coeftable, model_matrix_info = x$model_matrix_info)) } else if(no_agg){ x <- summary(x, agg = FALSE, ...) return(x$coeftable) } else { - rlang::abort( - "The argument 'agg' does not match any variable.", - use_cli_format = TRUE + stop( + "The argument 'agg' does not match any variable." ) } } - + if(!isTRUE(x$summary)){ x <- summary(x, ...) } - + cname_select <- cname[qui] if(is_name){ root <- rep(names(agg), length(cname_select)) @@ -311,60 +303,60 @@ boot_aggregate <- function( root <- gsub(paste0(".*", agg, ".*"), "\\1", cname_select, perl = TRUE) val <- gsub(paste0(".*", agg, ".*"), "\\2", cname_select, perl = TRUE) } - + mm <- model.matrix(x) - - cat("Run the wild bootstrap: this might take some time...(but + + cat("Run the wild bootstrap: this might take some time...(but hopefully not too much time =) ).", "\n") - + name_df <- unique(data.frame(root, val, stringsAsFactors = FALSE)) - + nk <- nrow(name_df) c_all <- vector(mode = "numeric", length = nk) se_all <- vector(mode = "numeric", length = nk) pvalues <- vector(mode = "numeric", length = nk) teststat <- vector(mode = "numeric", length = nk) conf_int <- matrix(NA, nk, 2) - - pb <- txtProgressBar(min = 0, max = nk, initial = 0, style = 3) - + + pb <- txtProgressBar(min = 0, max = nk, initial = 0, style = 3) + for(i in 1:nk){ - + r <- name_df[i, 1] v <- name_df[i, 2] v_names <- cname_select[root == r & val == v] - + if(use_weights && !is.null(x$weights)){ shares <- colSums(x$weights * sign(mm[, v_names, drop = FALSE])) } else { shares <- colSums(sign(mm[, v_names, drop = FALSE])) } - + shares <- shares / sum(shares) - + # The coef c_value <- sum(shares * coef[v_names]) - + # The variance n <- length(v_names) s1 <- matrix(shares, n, n) s2 <- matrix(shares, n, n, byrow = TRUE) - + s <- s1 * s2 - + var_value <- sum(s * V[v_names, v_names]) se_value <- sqrt(var_value) - + v_names_pos <- which(names(coef) %in% v_names) - + c_all[i] <- c_value se_all[i] <- se_value - + param <- names(shares) R <- shares - - boot_fit <- + + boot_fit <- boottest( object = x, param = param, @@ -388,13 +380,13 @@ boot_aggregate <- function( floattype = floattype, maxmatsize = maxmatsize, bootstrapc = bootstrapc, - getauxweights = getauxweights, + getauxweights = getauxweights, sampling = sampling, ssc = ssc ) - + setTxtProgressBar(pb,i) - + pvalues[i] <- pval(boot_fit) teststat[i] <- teststat(boot_fit) if(!is.null(clustid)){ @@ -406,23 +398,23 @@ boot_aggregate <- function( } # th z & p values zvalue <- c_all/se_all - + res <- cbind( c_all, - teststat, - pvalues, + teststat, + pvalues, conf_int ) colnames(res) <- c( "Estimate", "t value", - "Pr(>|t|)", + "Pr(>|t|)", paste0("[",sign_level / 2, "%"), paste0(1 - (sign_level / 2), "%","]") ) - + res - + } diff --git a/R/boot_algo_fastnreliable.R b/R/boot_algo_fastnreliable.R index c163de6c..d5e179bc 100644 --- a/R/boot_algo_fastnreliable.R +++ b/R/boot_algo_fastnreliable.R @@ -225,7 +225,14 @@ boot_algo_fastnreliable <- function( tXgXg <- lapply(tXgXg,function(g) as.matrix(g)) scores_list <- lapply(scores_list,function(g) as.matrix(g)) - H <- compute_H(G = G, R = matrix(R, 1, k), tXXinv = as.matrix(tXXinv), tXgXg = tXgXg, scores_list = scores_list, cores = nthreads) + H <- compute_H( + G = G, + R = matrix(R, 1, k), + tXXinv = as.matrix(tXXinv), + tXgXg = tXgXg, + scores_list = scores_list, + cores = nthreads + ) denom <- boot_algo3_crv1_denom( B = B, @@ -358,9 +365,16 @@ inv <- function(x, g){ Matrix::solve(x) }, error = function(e) { - rlang::warn(message = paste0( - "Matrix inversion error when computing beta(g) for cluster ", g, ". Using Pseudo-Inverse instead. Potentially, you can suppress this message by specifying a cluster fixed effect in the bootstrap via the `fe` argument of `boottest()`."), - use_cli_format = TRUE) + message( + message = + paste0( + "Matrix inversion error when computing beta(g) for cluster ", + g, ". Using Pseudo-Inverse instead. Potentially, you can suppress", + "this message by specifying a cluster fixed effect in + the bootstrap", + "via the `fe` argument of `boottest()`." + ) + ) eigen_pinv(as.matrix(x)) } ) diff --git a/R/boot_algo_fastnwild.R b/R/boot_algo_fastnwild.R index aeb8b699..f74ea4af 100644 --- a/R/boot_algo_fastnwild.R +++ b/R/boot_algo_fastnwild.R @@ -63,11 +63,11 @@ boot_algo_fastnwild <- #' finding procedure to find the confidence interval. #' 10 by default. #' @param sampling 'dqrng' or 'standard'. If 'dqrng', the 'dqrng' package is - #' used for random number generation (when available). If 'standard', - #' functions from the 'stats' package are used when available. - #' This argument is mostly a convenience to control random number generation in - #' a wrapper package around `fwildclusterboot`, `wildrwolf`. - #' I recommend to use the fast' option. + #' used for random number generation (when available). If 'standard', + #' functions from the 'stats' package are used when available. + #' This argument is mostly a convenience to control random + #' number generation in a wrapper package around `fwildclusterboot`, + #' `wildrwolf`. I recommend to use the fast' option. #' @return A list of ... #' @importFrom collapse fsum GRP #' @importFrom stats as.formula coef model.matrix model.response diff --git a/R/boot_algo_textbook_cpp.R b/R/boot_algo_textbook_cpp.R index 6972185d..69a668e5 100644 --- a/R/boot_algo_textbook_cpp.R +++ b/R/boot_algo_textbook_cpp.R @@ -96,28 +96,15 @@ boot_algo_textbook_cpp <- } else if (type == "webb"){ type <- 1 } else { - rlang::abort( - paste("For the 'lean' bootstrap algorithm, only webb and rademacher - weights are supported."), - use_cli_format = TRUE - ) + only_webb_rademacher_for_rlean_error() } if (impose_null == FALSE) { - rlang::abort( - c("The 'lean' bootstrap algorithm is currently not supported without - imposing the null on the bootstrap dgp."), - use_cli_format = TRUE - ) + no_wcu_for_rlean_error() } if ((length(R) - sum(R != 1)) > 1) { - rlang::abort( - c("The 'lean' bootstrap algorithm - which runs the - heteroskedastic wild bootstrap - is currently not supported for - hypotheses about more than one parameter."), - use_cli_format = TRUE - ) + only_scalar_test_rlean_error() } if(bootstrap_type == "11"){ diff --git a/R/boottest_felm.R b/R/boottest_felm.R index 9357cf2a..edd6721e 100644 --- a/R/boottest_felm.R +++ b/R/boottest_felm.R @@ -198,16 +198,9 @@ #' @section Standard Errors: #' `boottest` does not calculate standard errors. #' @section Multiple Fixed Effects: -#' If your felm() model contains fixed effects, boottest() will internally convert all fixed -#' effects but the one specified via the `fe` argument to dummy variables. -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Not that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. +#' If your felm() model contains fixed effects, boottest() will internally +#' convert all fixed effects but the one specified via the `fe` +#'argument to dummy variables. #' @section Stata, Julia and Python Implementations: #' The fast wild cluster bootstrap algorithms are further implemented in the #' following software packages: @@ -390,47 +383,21 @@ boottest.felm <- function(object, check_arg(sampling, "charin(dqrng, standard)") - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-boot-felm", - engine = engine - ) if(bootstrap_type %in% c("33", "13")){ - rlang::abort( - c( - "The bootstrap types '13' and '33' are not yet supported for objects - of type felm. Support will be added in the future. Until then, you - can run the '13' and '33' bootstrap types with objects of type 'lm' - and 'fixest'." - ), - use_cli_format = TRUE - ) + no_13_33_for_felm_error() } if(bootstrap_type != "fnw11"){ if(engine == "R"){ if(conf_int){ - rlang::inform( - "Confidence Intervals are currently only supported for - the R engine with 'bootstrap_type = 'fnw11' '.", - use_cli_format = TRUE, - .frequency = "regularly", - .frequency_id = "CI only for fnw algo." - ) + cis_only_for_fnw11() } } } if (!is.null(beta0)) { - rlang::abort( - c( - "The function argument 'beta0' is deprecated. - Please use the function argument 'r' instead, - by which it is replaced." - ), - use_cli_format = TRUE - ) + arg_beta0_is_deprecated_error() } if (inherits(clustid, "formula")) { diff --git a/R/boottest_fixest.R b/R/boottest_fixest.R index d3554b81..d6502235 100644 --- a/R/boottest_fixest.R +++ b/R/boottest_fixest.R @@ -181,16 +181,13 @@ #' @section Standard Errors: #' `boottest` does not calculate standard errors. #' @section Multiple Fixed Effects: -#' If your feols() model contains fixed effects, boottest() will internally convert all fixed -#' effects but the one specified via the `fe` argument to dummy variables. -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Note that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. +#' If your feols() model contains fixed effects, boottest() will internally +#' convert all fixed effects but the one specified via the `fe` argument to +#' dummy variables. +#' @srrstats {RE2.0} *Regression Software should document any transformations +#' applied to input data, for example conversion of label-values to `factor`, +#' and should provide ways to explicitly avoid any default +#' transformations (with error or warning conditions where appropriate).* #' @section Stata, Julia and Python Implementations: #' The fast wild cluster bootstrap algorithms are further implemented in the #' following software packages: @@ -344,7 +341,8 @@ #' #' @srrstats {RE1.0} *Regression Software should enable models to be specified #' via a formula interface, unless reasons for not doing so are explicitly -#' documented.* The following function arguments can be formulas: param, clustid, fe. +#' documented.* The following function arguments can be formulas: param, +#' clustid, fe. @@ -412,35 +410,16 @@ boottest.fixest <- function(object, check_arg(sampling, "charin(dqrng, standard)") - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-boot-fixest", - engine = engine - ) - if(bootstrap_type != "fnw11"){ if(engine == "R"){ if(conf_int){ - rlang::inform( - "Confidence Intervals are currently only supported for - the R engine with 'bootstrap_type = 'fnw11' '.", - use_cli_format = TRUE, - .frequency = "regularly", - .frequency_id = "CI only for fnw algo." - ) + cis_only_for_fnw11() } } } if (!is.null(beta0)) { - rlang::abort( - c( - "The function argument 'beta0' is deprecated. - Please use the function argument 'r' instead, - by which it is replaced." - ), - use_cli_format = TRUE - ) + arg_beta0_is_deprecated_error() } if (inherits(clustid, "formula")) { @@ -461,17 +440,7 @@ boottest.fixest <- function(object, if (!is.null(object$fixef_removed)) { - rlang::abort( - paste( - "feols() removes fixed effects with the following values: ", - object$fixef_removed, - ". Currently, boottest()'s internal pre-processing does not - account for this deletion. Therefore, please exclude such fixed - effects prior to estimation with feols(). You can find them listed - under '$fixef_removed' of your fixest object." - ), - use_cli_format = TRUE - ) + please_remove_fixest_na_error() } # -------------------------------------------- diff --git a/R/boottest_ivreg.R b/R/boottest_ivreg.R index fc60dd85..4abe9816 100644 --- a/R/boottest_ivreg.R +++ b/R/boottest_ivreg.R @@ -115,14 +115,6 @@ #' @section Setting Seeds: #' To guarantee reproducibility, you need to #' set a global random seed via `set.seed()` -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Not that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. #' @references Roodman et al., 2019, "Fast and wild: Bootstrap inference in #' STATA using boottest", The STATA Journal. #' () @@ -256,16 +248,6 @@ boottest.ivreg <- function(object, small_sample_adjustment <- FALSE } - if (ssc[["fixef.K"]] != "none" || - ssc[["cluster.df"]] != "conventional") { - rlang::warn( - "Currently, boottest() only supports fixef.K = 'none'.", - use_cli_format = TRUE - ) - } - - - check_params_in_model( object = object, param = param @@ -316,18 +298,8 @@ boottest.ivreg <- function(object, clusteradj <- julia_ssc$clusteradj clustermin <- julia_ssc$clustermin - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-boot-iv", - engine = "WildBootTests.jl" - ) - if (ssc[["fixef.K"]] != "none") { - rlang::inform( - paste( - "Currently, boottest() only supports fixef.K = 'none'."), - use_cli_format = TRUE - ) + no_fixef.K_warning() } diff --git a/R/boottest_lm.R b/R/boottest_lm.R index 4e310e74..14c1f17b 100644 --- a/R/boottest_lm.R +++ b/R/boottest_lm.R @@ -182,14 +182,6 @@ #' } #' @section Standard Errors: #' `boottest` does not calculate standard errors. -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Not that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. #' @section Stata, Julia and Python Implementations: #' The fast wild cluster bootstrap algorithms are further implemented in the #' following software packages: @@ -361,20 +353,10 @@ boottest.lm <- function(object, check_arg(sampling, "charin(dqrng, standard)") - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-boot-lm", - engine = engine - ) if(bootstrap_type != "fnw11"){ if(engine == "R"){ if(conf_int){ - rlang::inform( - c('*',"Confidence Intervals are currently only supported for", - "the R engine with 'bootstrap_type = 'fnw11'."), - use_cli_format = TRUE, - .frequency = "regularly", - .frequency_id = "CI only for fnw algo.") + cis_only_for_fnw11() } } } @@ -385,14 +367,7 @@ boottest.lm <- function(object, ) if (!is.null(beta0)) { - rlang::abort( - c( - "The function argument 'beta0' is deprecated. - Please use the function argument 'r' instead, - by which it is replaced." - ), - use_cli_format = TRUE - ) + arg_beta0_is_deprecated_error() } if (inherits(clustid, "formula")) { diff --git a/R/error_warning_messages.R b/R/error_warning_messages.R new file mode 100644 index 00000000..45a58a45 --- /dev/null +++ b/R/error_warning_messages.R @@ -0,0 +1,415 @@ +cis_only_for_fnw11 <- function(){ + message( + paste( + "Note: Confidence Intervals are currently only supported for", + "the R engine with 'bootstrap_type = 'fnw11' '." + ) + ) +} + +arg_beta0_is_deprecated_error <- function(){ + + stop( + "The function argument 'beta0' is deprecated.", + "Please use the function argument 'r' instead, ", + "by which it is replaced." + ) + +} + +please_remove_fixest_na_error <- function(object){ + + stop( + paste( + "feols() removes fixed effects with the following values: ", + object$fixef_removed, + ". Currently, boottest()'s internal pre-processing does not", + "account for this deletion. Therefore, please exclude such fixed", + "effects prior to estimation with feols(). You can find them listed", + "under '$fixef_removed' of your fixest object." + ) + ) + +} + +no_plot_when_no_ci_error <- function(){ + stop( + "No plot method if boottest()'s function argument 'conf_int = FALSE'." + ) +} + +only_ols_for_fixest_error <- function(){ + stop( + "boottest() only supports OLS estimation via fixest::feols() - it ", + "does not support non-linear models computed via e.g. fixest::fepois() ", + "or fixest::feglm." + ) +} + +no_weights_when_fe_error <- function(){ + stop( + "boottest() unfortunately currently does not support WLS and fixed", + "effects. Please set fe = NULL to run a bootstrap with WLS." + ) +} + +no_fixef_for_fast_reliable_error <- function(){ + + stop( + "No fixed effects are supported for bootstrap_types ", + "'11', '13', '31', '33'." + ) + +} + +bootcluster_neither_cluster_nor_params <- function(){ + stop( + "A bootcluster variable is neither contained in the cluster ", + "variables nor in the model coefficients." + ) +} + +cluster_nobs_error <- function(){ + stop( + "The number of observations in 'cluster' and 'nobs()' do not match." + ) +} + +bootcluster_nobs_error <- function(){ + stop( + "The number of observations in 'bootcluster' and 'nobs()' do not match." + ) +} + +nas_in_clustid_error <- function(){ + stop( + "`boottest()` cannot handle NAs in `clustid` variables that are not ", + "part of the estimated model object." + ) +} + +nas_in_bootclustid_error <- function(){ + stop( + "`boottest()` cannot handle NAs in `bootcluster` variables that are ", + "not part of the estimated model object." + ) +} + +boottest_engine_error <- function(){ + stop( + "The value of `getOption(\"boottest_engine\")` is currently not legal. ", + "Please use function `setBoottest_engine()` to set it to an appropriate ", + "value." + ) +} + +boottest_nthreads_error <- function(){ + stop( + "The value of `getOption(\"boottest_nthreads\")` is currently", + "not legal. ", + "Please use function `setBoottest_nthreads` to set it to an", + "appropriate value." + ) +} + +# error messages +advanced_formula_error <- function(){ + stop( + "Advanced formula notation in fixest / fixest via ^ to interact ", + "fixed effects is currently not supported in boottest().", + call = TRUE + ) +} + +varying_slopes_error <- function(){ + stop( + "Varying slopes in fixest / fixest via [] to interact ", + "fixed effects is currently not supported in boottest()." + ) +} + +fixef_not_fixef_error <- function(fe){ + + stop( + paste( + "The fixed effect to be projected out in the bootstrap,",fe, + "is not included as a dedicated fixed effect in the", + "estimated model." + ) + ) + +} + +fe_in_test_error <- function(fe){ + + stop( + paste( + "The function argument fe =", fe, "is included in the", + "hypothesis (via the `param` argument). This is not allowed.", + "Please set fe to another factor variable or NULL." + ) + ) + + +} + +B_too_small_error <- function(){ + stop( + "The function argument B is smaller than 100. The number of bootstrap", + "iterations needs to be 100 or higher in order to guarantee that the", + "root finding procudure used to find the confidence set", + "works properly." + ) +} + +test_dimension_error <- function(R, r){ + stop( + paste( + "The dimensions of func args R and r do not match. The number", + "of rows of R is ", nrow(R), ", but the length of r is", + length(r), "." + ) + ) +} + +non_scalar_test_error <- function(){ + + stop( + "Only bootstrap_type = 'fnw11' is currently supported with ", + "hypotheses that contain more than one parameter. This feature ", + "will be added in the near future. If you receive this error message ", + "please complain on github, it will motivate Alex to implement this ", + "feature." + ) + +} + +only_scalar_test_rlean_error <- function(){ + stop( + "The 'lean' bootstrap algorithm - which runs the ", + "heteroskedastic wild bootstrap - is currently not supported for ", + "hypotheses about more than one parameter." + ) +} + +no_clustering_for_rlean_error <- function(){ + + stop( + "The R-lean algorithm currently only supports oneway clustering." + ) + +} + +no_fixef_for_rlean_error <- function(){ + + stop( + "boottest() currently does not support ", + "fixed effects with engine = 'R-lean'." + ) + +} + +no_weights_for_rlean_error <- function(){ + stop( + "boottest() currently does not support regression ", + "weights with engine = 'R-lean'." + ) +} + +no_wcu_for_rlean_error <- function(){ + + stop( + "boottest() currently does not support the 'WCU' bootstrap ", + "- which does not impose the null on the ", + "bootstrap dgp - for engine = 'R-lean'." + ) + +} + +only_onweway_clustering_for_fast_reliable <- function(){ + + stop( + "The '13', '31', and '33' bootstrap variants currently ", + "only support oneway clustering when 'boot_engine' == 'R'." + ) + +} + +no_subcluster_bootstrap_fast_reliable <- function(){ + stop( + paste( + "The subcluster bootstrap is currently not supported", + "for the bootstrap types '11', '31', '13' and '33'." + ) + ) +} + +bootcluster_not_max_min_clustid <- function(clustid){ + stop( + paste0( + "For the 'fast and reliable' implementations, ", + "the 'bootcluster' argument must be 'max', 'min' or '", + clustid, "'." + ) + ) +} + +full_enumeration_warning <- function(N_G_2, N_G_bootcluster){ + message( + paste( + "Note: ", + "There are only", + N_G_2, + "unique draws from the rademacher distribution for", + N_G_bootcluster, + "bootstrap clusters. Therefore, B = ", + N_G_2, + " with full enumeration. Consider using webb weights instead.", + "Further, note that under full enumeration and with B =", + N_G_2, + "bootstrap draws, only 2^(#clusters - 1) = ", + 2^(N_G_bootcluster - 1), + " distinct t-statistics and p-values can be computed. For a", + "more thorough discussion, see Webb `Reworking wild bootstrap", + "based inference for clustered errors` (2013)." + ) + ) + +} + +multi_hypothesis_only_via_julia <- function(){ + stop( + "Hypotheses with q>1 are currently only supported via WildBootTests.jl. ", + "Please set the function argument 'engine = WildBootTests.jl'." + ) +} + +one_sided_cis_only_via_julia <- function(){ + message( + paste( + "Note:", + "Currently, boottest() calculates confidence intervals for one-sided", + "hypotheses only for `engine = 'WildBootTests.jl'`." + ) + ) + +} + +R_must_be_NULL_or_param <- function(){ + stop( + "The constraints vector must either be NULL or a numeric of", + "the same length as the `param` input vector." + ) +} + +no_13_33_bootstap_via_julia <- function(bootstrap_type){ + stop( + paste( + "The bootstrap of type", + bootstrap_type, + "is not yet supported via 'WildBootTests. You can run it via", + "the 'R' engine instead.'. But note that this algos are still", + "written in pure R, and therefore rather slow.") + ) +} + +param_not_in_model_error <- function(param){ + stop( + paste( + "The parameter", param, "is not included in the estimated", + "model. Maybe you are trying to test for an interaction", + " parameter? To see all model parameter names,", + "run names(coef(model))." + ) + ) +} + +only_2SLS_for_IV_error <- function(){ + stop( + "Currently, only 2SLS is supported. Please set the `ivreg`", + "function argument `method` to `OLS`." + ) +} + +only_cluster_fixef_for_11_31_error <- function(){ + stop( + "The '11' and '31' bootstrap variants currently only support ", + "cluster fixed effects in the bootstrap. Please set 'fe = NULL' or to ", + "the cluster variable." + ) + +} + +no_fixef_for_13_33_error <- function(){ + stop( + "The '13' and '33' bootstrap variants currently do support ", + "fixed effects in the bootstrap. Please set 'fe = NULL'." + ) +} + +no_weights_for_fast_reliable_error <- function(){ + stop( + "The '11' and '13' ", + "bootstrap variants currently only support weights ", + "for engine = 'WildBootTests.jl' The '13' and '33' do not support ", + "weights for both engines." + ) +} + +no_bootstrap_type_for_rlean_error <- function(bootstrap_type){ + stop( + paste( + "The bootstrap of type", + bootstrap_type, + "is not yet supported via 'R-lean'. You can potentially run it via ", + "the 'R' or 'WildBootTests.jl' engines instead.'" + ) + ) + +} + +only_webb_rademacher_for_rlean_error <- function(){ + stop( + "For the 'lean' bootstrap algorithm, only webb and rademacher ", + "weights are supported." + ) +} + +no_13_33_for_felm_error <- function(){ + stop( + "The bootstrap types '13' and '33' are not yet supported for objects ", + "of type felm. Support will be added in the future. Until then, you ", + "can run the '13' and '33' bootstrap types with objects of type 'lm' ", + "and 'fixest'." + ) +} + +ci_root_finding_initiation_failed <- function(){ + stop( + "The inflation factor for initial guesses for standard errors was not ", + "large enough. In consequence, the root-finding procedure to compute ", + "confidence intervals via p-value inversion could not be initiated. ", + "In a future release, it will be possible to specify a costum inflation ", + "factor as a function argument to boottest(). Until then, you can still ", + "use boottest() to calculate p-values by setting the boottest() function ", + "argument conf_int to FALSE." + ) +} + +no_iv_for_felm_error <- function(){ + + stop( + "IV regression is currently not supported by boottest() for ", + "objects of type 'felm'. Please use ", + "or 'ivreg::ivreg' for IV-regression." + ) + +} + +no_fixef.K_warning <- function(){ + message( + paste( + "Note: Currently, boottest() only supports fixef.K = 'none'." + ) + ) +} diff --git a/R/get_weights.R b/R/get_weights.R index 4f103021..5c3579e5 100644 --- a/R/get_weights.R +++ b/R/get_weights.R @@ -12,13 +12,12 @@ get_weights <- function(type, #' employed #' @param N_G_bootcluster Integer. The number of bootstrap clusters #' @param boot_iter The number of bootstrap iterations - #' @param sampling 'fast' or 'standard'. If 'fast', the 'dqrng' package is used - #' for random number generation. If 'standard', functions from the 'stats' - #' package are used when available. This argument is mostly a convenience for - #' a wrapper package around fwildclusterboot, wildrwolf. I recommend to use the - #' 'fast' option. + #' @param sampling 'fast' or 'standard'. If 'fast', the 'dqrng' package is + #' used for random number generation. If 'standard', functions from the + #' 'stats' package are used when available. This argument is mostly a + #' convenience for a wrapper package around fwildclusterboot, wildrwolf. + #' I recommend to use the fast' option. #' @return A matrix of dimension N_G_bootcluster x (boot_iter + 1) - #' @importFrom gtools permutations #' @noRd @@ -29,11 +28,7 @@ get_weights <- function(type, # (uses less memory # than numeric) rademacher = function(n) { - dqrng::dqsample( - x = c(-1L, 1L), - size = n, - replace = TRUE - ) + dqrng::dqrrademacher(n) }, mammen = function(n) { sample( @@ -99,8 +94,7 @@ get_weights <- function(type, # full_enumeration only for rademacher weights (set earlier) if (full_enumeration) { v0 <- - # gtools_permutations( - permutations( + gtools_permutations( n = 2, r = N_G_bootcluster, v = c(1, -1), diff --git a/R/gtools_permutations.R b/R/gtools_permutations.R index 0bbfa43c..ec549dac 100644 --- a/R/gtools_permutations.R +++ b/R/gtools_permutations.R @@ -4,7 +4,7 @@ gtools_permutations <- v = 1:n, set = TRUE, repeats.allowed = FALSE) { - #' copied from the permutations package as long as it is orphaned on CRAN + #' Copied from the permutations package. Published under GPL-2. #' @param n size of source vector #' @param r size of target vector #' @param v source vector. defaults to 1:n diff --git a/R/invert_p_val.R b/R/invert_p_val.R index 57ca169b..2ef1bb11 100644 --- a/R/invert_p_val.R +++ b/R/invert_p_val.R @@ -191,16 +191,7 @@ invert_p_val <- function(ABCD, #' I have plans to revisit this going forward. if (sum(p_start < sign_level) < 2) { - rlang::abort(c( - "The inflation factor for initial guesses for standard errors was not - large enough. In consequence, the root-finding procedure to compute - confidence intervals via p-value inversion could not be initiated. - In a future release, it will be possible to specify a costum inflation - factor as a function argument to boottest(). Until then, you can still - use boottest() to calculate p-values by setting the boottest() function - argument conf_int to FALSE."), - use_cli_format = TRUE - ) + ci_root_finding_initiation_failed() } # ---------------------------------------------------------------------- diff --git a/R/mboottest_felm.R b/R/mboottest_felm.R index 89d59446..05c356a1 100644 --- a/R/mboottest_felm.R +++ b/R/mboottest_felm.R @@ -107,16 +107,9 @@ #' To guarantee reproducibility, you need to #' set a global random seed via `set.seed()` when using #' @section Multiple Fixed Effects: -#' If your felm() model contains fixed effects, boottest() will internally convert -#' all fixed effects but the one specified via the `fe` argument to dummy variables. -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Not that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. +#' If your felm() model contains fixed effects, boottest() will internally +#' convert all fixed effects but the one specified via the `fe` +#' argument to dummy variables. #' @references Roodman et al., 2019, "Fast and wild: Bootstrap inference in #' STATA using boottest", The STATA Journal. #' () @@ -202,12 +195,6 @@ mboottest.felm <- function(object, check_arg(maxmatsize, "scalar integer | NULL") check_arg(bootstrapc, "scalar logical") - # remind packages users to set a global seed - inform_seed( - frequency_id = "seed-reminder-m-felm", - engine = "WildBootTests.jl" - ) - if (inherits(clustid, "formula")) { clustid <- attr(terms(clustid), "term.labels") } @@ -257,10 +244,7 @@ mboottest.felm <- function(object, clustermin <- julia_ssc$clustermin if (ssc[["fixef.K"]] != "none") { - rlang::inform( - "Currently, boottest() only supports fixef.K = 'none'.", - use_cli_format = TRUE - ) + no_fixef.K_warning() } diff --git a/R/mboottest_fixest.R b/R/mboottest_fixest.R index 6d2dae34..32ce9c27 100644 --- a/R/mboottest_fixest.R +++ b/R/mboottest_fixest.R @@ -107,16 +107,9 @@ #' To guarantee reproducibility, you need to #' set a global random seed via`set.seed()` #' @section Multiple Fixed Effects: -#' If your feols() model contains fixed effects, boottest() will internally convert all fixed -#' effects but the one specified via the `fe` argument to dummy variables. -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Not that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. +#' If your feols() model contains fixed effects, boottest() will internally +#' convert all fixed effects but the one specified via the `fe` argument to +#' dummy variables. #' @references Roodman et al., 2019, "Fast and wild: Bootstrap inference in #' STATA using boottest", The STATA Journal. #' () @@ -206,11 +199,6 @@ mboottest.fixest <- function(object, check_arg(maxmatsize, "scalar integer | NULL") check_arg(bootstrapc, "scalar logical") - inform_seed( - frequency_id = "seed-reminder-m-fixest", - engine = "WildBootTests.jl" - ) - if (inherits(clustid, "formula")) { clustid <- attr(terms(clustid), "term.labels") } @@ -225,25 +213,11 @@ mboottest.fixest <- function(object, # fixest specific checks if (object$method != "feols") { - rlang::abort( - c("mboottest() only supports OLS estimation via fixest::feols() - it - does not support non-linear models computed via e.g. - fixest::fepois() or fixest::feglm."), - use_cli_format = TRUE - ) + only_ols_for_fixest_error() } if (!is.null(object$fixef_removed)) { - rlang::abort( - paste( - "feols() removes fixed effects with the following values: ", - object$fixef_removed, ". Currently, boottest()'s - internal pre-processing does not account for this deletion. - Therefore, please exclude such fixed effects prior - to estimation with feols(). You can find them listed under - '$fixef_removed' of your fixest object."), - use_cli_format = TRUE - ) + please_remove_fixest_na_error() } fedfadj <- 0L @@ -292,10 +266,7 @@ mboottest.fixest <- function(object, clustermin <- julia_ssc$clustermin if (ssc[["fixef.K"]] != "none") { - rlang::inform( - "Currently, boottest() only supports fixef.K = 'none'.", - use_cli_format = TRUE - ) + no_fixef.K_warning() } diff --git a/R/mboottest_lm.R b/R/mboottest_lm.R index e3654f81..ec347c19 100644 --- a/R/mboottest_lm.R +++ b/R/mboottest_lm.R @@ -101,14 +101,6 @@ #' @section Setting Seeds: #' To guarantee reproducibility, you need to #' set a global random seed via `set.seed()` -#' @section Run `boottest` quietly: -#' You can suppress all warning and error messages by setting the following global -#' options: -#' `options(rlib_warning_verbosity = "quiet")` -#' `options(rlib_message_verbosity = "quiet")` -#' Note that this will turn off all warnings (messages) produced via `rlang::warn()` and -#' `rlang::inform()`, which might not be desirable if you use other software build on -#' `rlang`, as e.g. the `tidyverse`. #' @references Roodman et al., 2019, "Fast and wild: Bootstrap inference in #' STATA using boottest", The STATA Journal. #' () @@ -194,11 +186,6 @@ mboottest.lm <- function(object, check_arg(p_val_type, "charin(two-tailed, equal-tailed,>, <)") check_arg(tol, "numeric scalar GT{0}") - inform_seed( - frequency_id = "seed-reminder-m-lm", - engine = "WildBootTests.jl" - ) - if (inherits(clustid, "formula")) { clustid <- attr(terms(clustid), "term.labels") } @@ -207,16 +194,6 @@ mboottest.lm <- function(object, bootcluster <- attr(terms(bootcluster), "term.labels") } - - if (ssc[["fixef.K"]] != "none" || - ssc[["cluster.df"]] != "conventional") { - rlang::inform( - "Currently, boottest() only supports fixef.K = 'none'.", - use_cli_format = TRUE - ) - } - - check_mboottest_args_plus( object = object, R = R, @@ -252,9 +229,7 @@ mboottest.lm <- function(object, clustermin <- julia_ssc$clustermin if (ssc[["fixef.K"]] != "none") { - rlang::inform( - paste("Currently, boottest() only supports fixef.K = 'none'."), - use_cli_format = TRUE) + no_fixef.K_warning() } diff --git a/R/methods.R b/R/methods.R index 3cab1194..1e0e37f9 100644 --- a/R/methods.R +++ b/R/methods.R @@ -8,7 +8,7 @@ #' @param object An object of type lm, fixest, felm or ivreg #' @param ... other arguments #' -#' @seealso +#' @seealso #' [boottest.lm][fwildclusterboot::boottest.lm], #' [boottest.fixest][fwildclusterboot::boottest.fixest], #' [boottest.felm][fwildclusterboot::boottest.felm], @@ -49,11 +49,12 @@ #' or Normal weights #' #' @return An object of class `boottest`. -#' -#' @details Technical Details +#' +#' @details Technical Details #' For technical details, either take a look at the references below, or check -#' out the [wild (cluster) bootstrap vignette](wild_bootstrap.html). -#' +#' out the +#' [wild (cluster) bootstrap vignette]( +#' https://s3alfisc.github.io/fwildclusterboot/articles/wild_bootstrap.html). #' @section Stata, Julia and Python Implementations: #' The fast wild cluster bootstrap algorithms are further implemented in the #' following software packages: @@ -99,24 +100,6 @@ boottest <- function(object, ...) { - rlang::warn( - message = " - Please note that the seeding behavior for random number - generation for `boottest()` has changed with `fwildclusterboot` - version 0.13. - - It will no longer be possible to - exactly reproduce results produced by versions lower than 0.13. - - If your prior results were produced under sufficiently many bootstrap - iterations, none of your conclusions will change. - For more details about this change, please read the notes in - [news.md](https://cran.r-project.org/web/packages/fwildclusterboot/news/news.html).", - .frequency = "once", - .frequency_id = "random-seed-message", - use_cli_format = TRUE - ) - UseMethod("boottest") } @@ -548,9 +531,10 @@ summary.boottest <- function(object, digits = 3, ...) { plot.boottest <- function(x, ...) { - #' Plots bootstrapped p-values as a function of the hypothesized effect size r for - #' a hypothesis test of the form R beta = r.The points where the p-values are 0.05 - #' are the boundaries of the bootstrapped confidence interval. + #' Plots bootstrapped p-values as a function of the hypothesized effect + #' size r for a hypothesis test of the form R beta = r.The points + #' where the p-values are 0.05 are the boundaries of the bootstrapped + #' confidence interval. #' @param x An object of type boottest #' @param ... Further arguments passed to or from other methods. #' @method plot boottest @@ -579,10 +563,7 @@ plot.boottest <- function(x, ...) { dreamerr::validate_dots(stop = TRUE) if (is.null(x$conf_int)) { - rlang::abort( - c("No plot method if boottest()'s function argument 'conf_int = FALSE'."), - use_cli_format = TRUE - ) + no_plot_when_no_ci_error() } test_vals <- x$grid_vals p_test_vals <- x$p_grid_vals @@ -773,20 +754,24 @@ nobs.mboottest <- function(object, ...) { #' clustid = "group_id1" #' ) #' print(boot) -#' +#' @srrstats {RE4.17} *Model objects returned by Regression Software +#' should implement or appropriately extend a default `print` method +#' which provides an on-screen summary of model (input) parameters +#' and (output) coefficients.* + print.boottest <- function(x, ..., digits = 4) { stopifnot(inherits(x, "boottest")) - + print(x$call) cat("", "\n") - + vals <- lapply( c("p_val", "conf_int", "t_stat"), function(y) { round(x[[y]], digits = digits) } ) - + cat("p value:", vals[[1]], "\n") cat("confidence interval:", vals[[2]], "\n") cat("test statistic", vals[[3]], "\n") @@ -817,17 +802,17 @@ print.boottest <- function(x, ..., digits = 4) { #' print.mboottest <- function(x, ..., digits = 4) { stopifnot(inherits(x, "mboottest")) - + print(x$call) cat("", "\n") - + vals <- lapply( c("p_val", "teststat"), function(y) { round(x[[y]], digits = digits) } ) - + cat("p value:", vals[[1]], "\n") cat("test statistic", vals[[2]], "\n") } diff --git a/R/onLoad.R b/R/onLoad.R index 3e38ff13..a57ac4d9 100644 --- a/R/onLoad.R +++ b/R/onLoad.R @@ -16,8 +16,16 @@ .onAttach <- function(libname, pkgname) { - packageStartupMessage("\nPlease cite as: \n") - packageStartupMessage(" Fischer & Roodman. (2021). fwildclusterboot: Fast Wild Cluster.") - packageStartupMessage(" Bootstrap Inference for Linear Regression Models.") - packageStartupMessage(" Available from https://cran.r-project.org/package=fwildclusterboot/.") + packageStartupMessage( + "\nPlease cite as: \n", + "Fischer & Roodman. (2021). fwildclusterboot: Fast Wild Cluster.", + "Bootstrap Inference for Linear Regression Models. \n", + "Available from https://cran.r-project.org/package=fwildclusterboot/. \n" + ) + packageStartupMessage( + "Too guarantee reproducibility, please don't forget to set a ", + "global random seed **both** via `set.seed()` and `dqrng::dqset.seed()`. ", + "This is required as `boottest()` uses different random number generators", + "for different algorithms." + ) } diff --git a/R/pairs_bootstrap.R b/R/pairs_bootstrap.R index 9127adae..8067bc43 100644 --- a/R/pairs_bootstrap.R +++ b/R/pairs_bootstrap.R @@ -26,6 +26,11 @@ boot_algo_pairs <- function( small_sample_correction, object){ + stop(" + This function is in development. It is only included because + looking at the source code might be useful to some people =) + ") + X <- preprocessed_object$X y <- preprocessed_object$Y diff --git a/R/preprocess2.R b/R/preprocess2.R index 74574564..adfdf951 100644 --- a/R/preprocess2.R +++ b/R/preprocess2.R @@ -8,9 +8,11 @@ #' @importFrom Matrix Diagonal #' @return An object of class `preprocess2`. #' @srrstats {G2.8} *Software should provide appropriate conversion or dispatch -#' routines as part of initial pre-processing to ensure that all other -#' sub-functions of a package receive inputs of a single defined class or -#' type.* Similar preprocess method + run_bootstrap function. +#' routines as part of initial pre-processing to ensure that all other +#' sub-functions of a package receive inputs of a single defined class or +#' type.* The preprocess2 methods ensure that all information of individually +#' supported classes - fixest, lfe, ivreg, lm - are processed into the +#' required format. preprocess2 <- function(object, ...) { UseMethod("preprocess2") } @@ -54,21 +56,9 @@ preprocess2.fixest <- method <- object$family # fixest specific checks if (object$method != "feols") { - rlang::abort( - "boottest() only supports OLS estimation via fixest::feols() - it - does not support non-linear models computed via e.g. fixest::fepois() - or fixest::feglm.", - use_cli_format = TRUE - ) + only_ols_for_fixest_error() } - # if (!is.null(object$is_sunab)) { - # if(object$is_sunab == TRUE){ - # rlang::abort( - # "boottest() does not support the Sun-Abrams - # estimator via `sunab()`." - # ) - # } - # } + is_iv <- ifelse(!is.null(object$fml_all$iv), TRUE, FALSE) has_fe <- ifelse(!is.null(object$fml_all$fixef), TRUE, FALSE) if (!is_iv) { @@ -108,11 +98,7 @@ preprocess2.fixest <- } if (has_weights) { if (!is.null(fe)) { - rlang::abort( - "boottest() unfortunately currently does not support WLS and fixed - effects. Please set fe = NULL to run a bootstrap with WLS.", - use_cli_format = TRUE - ) + no_weights_when_fe_error() } } fixed_effect <- NULL @@ -263,12 +249,7 @@ preprocess2.felm <- rhs = 3 ) ) != "~0") { - rlang::abort( - "IV regression is currently not supported by boottest() for - objects of type 'felm'. You can either use 'fixest::feols()' - or 'ivreg::ivreg' for IV-regression.", - use_cli_format = TRUE - ) + no_iv_for_felm_error() is_iv <- TRUE } X <- model_matrix(object, type = "rhs", collin.rm = TRUE) @@ -283,11 +264,7 @@ preprocess2.felm <- } if (has_weights) { if (!is.null(fe)) { - rlang::abort( - "boottest() unfortunately currently does not support WLS and - fixed effects. Please set fe = NULL to run a bootstrap with WLS.", - use_cli_format = TRUE - ) + no_weights_when_fe_error() } } if (has_fe) { @@ -640,12 +617,7 @@ demean_fe <- function(X, Y, fe, has_weights, N) { levels(fixed_effect_W) <- (1 / table(fe)) # because duplicate levels are forbidden } else { - rlang::abort( - "Currently, boottest() does not jointly support regression weights / - WLS and fixed effects. If you want to use - boottest() for inference based on WLS, please set fe = NULL.", - use_cli_format = TRUE - ) + no_weights_when_fe_error() # levels(fixed_effect_W) <- 1 / table(fixed_effect) } #' @srrstats {G2.4c} *explicit conversion to character via `as.character()` @@ -720,13 +692,10 @@ transform_fe <- } # project out fe if (engine == "R") { - + if(bootstrap_type != "fnw11"){ if(clustid_char != fe){ - rlang::abort("No fixed effects are supported for bootstrap_types - '11', '13', '31', '33'.", - use_cli_format = TRUE - ) + no_fixef_for_fast_reliable_error() } } # WildBootTests.jl does demeaning internally @@ -740,25 +709,41 @@ transform_fe <- W <- prep_fe$W n_fe <- prep_fe$n_fe } - + } else { add_fe <- all_fe add_fe_names <- names(add_fe) fml_fe <- reformulate(add_fe_names, response = NULL) if(engine == "R" && bootstrap_type %in% c("11", "31", "13","33")){ add_fe_dummies <- - Matrix::sparse.model.matrix(fml_fe, model.frame(fml_fe, data = as.data.frame(add_fe))) - + Matrix::sparse.model.matrix( + fml_fe, + model.frame( + fml_fe, + data = as.data.frame( + add_fe + ) + ) + ) + } else { - + add_fe_dummies <- - model.matrix(fml_fe, model.frame(fml_fe, data = as.data.frame(add_fe))) + model.matrix( + fml_fe, + model.frame( + fml_fe, + data = as.data.frame( + add_fe + ) + ) + ) } - + X <- cbind(X, add_fe_dummies) - + } - + res <- list( X = X, Y = Y, @@ -788,9 +773,8 @@ get_cluster <- #' @noRd #' #' @return a list, containing, among other things, a data.frame of the - #' cluster variables, - #' a data.frame of the bootcluster variable(s), and a helper - #' matrix, all_c, used in `engine_julia()` + #' cluster variables, a data.frame of the bootcluster variable(s), + #' and a helper matrix, all_c, used in `engine_julia()` #' @srrstats {G2.4} *Provide appropriate mechanisms to convert between #' different data types, potentially including:* All cluster variables #' are set to factor internally. @@ -803,9 +787,10 @@ get_cluster <- #' In this case, `boottest()` throws an error.* #' @srrstats {G2.15} *Functions should never assume non-missingness, #' and should never pass data with potential missing values to any base - #' routines with default `na.rm = FALSE`-type parameters*. NAs are dropped via - #' the modeling functions (lm, felm, feols, ivreg). NAs in the cluster variables - #' are either dropped here (if dropped in the original model) or throw an error. + #' routines with default `na.rm = FALSE`-type parameters*. NAs are dropped + #' via the modeling functions (lm, felm, feols, ivreg). NAs in the cluster + #' variables are either dropped here (if dropped in the original model) + #' or throw an error. #' @srrstats {G2.4} *Provide appropriate mechanisms to convert between #' different #' data types, potentially including:* All cluster variables are set to @@ -821,11 +806,11 @@ get_cluster <- # no essential changes, but slight reorganization of pieces of code dreamerr::check_arg(clustid_char, "character scalar|charakter vector") dreamerr::check_arg(bootcluster, "character scalar | character vector") - + clustid_fml <- reformulate(clustid_char) # Step 1: create cluster df - - + + manipulate_object <- function(object){ if(inherits(object, "fixest")){ if(!is.null(object$fixef_vars)){ @@ -837,7 +822,7 @@ get_cluster <- object } } - + cluster_tmp <- if ("Formula" %in% loadedNamespaces()) { ## FIXME to suppress potential warnings due to | in Formula @@ -859,7 +844,7 @@ get_cluster <- envir = call_env ) } - + cluster_df <- model.frame(clustid_fml, cluster_tmp, na.action = na.pass) # without cluster intersection @@ -882,7 +867,7 @@ get_cluster <- } else { bootcluster_char <- bootcluster } - + # add bootcluster variable to formula of clusters cluster_bootcluster_fml <- update( @@ -893,8 +878,8 @@ get_cluster <- ) ) ) - - + + cluster_bootcluster_tmp <- if ("Formula" %in% loadedNamespaces()) { ## FIXME to suppress potential warnings due to | in Formula @@ -916,29 +901,25 @@ get_cluster <- envir = call_env ) } - + # data.frame as needed for WildBootTests.jl cluster_bootcluster_df <- model.frame( cluster_bootcluster_fml, cluster_bootcluster_tmp, na.action = na.pass ) - + # data.frames with clusters, bootcluster cluster <- cluster_bootcluster_df[, clustid_char, drop = FALSE] bootcluster <- cluster_bootcluster_df[, bootcluster_char, drop = FALSE] - + if (!any(bootcluster_char %in% clustid_char)) { is_subcluster <- TRUE if (!(any(names(bootcluster) %in% c(clustid_char, names(coef( object )))))) { - rlang::abort( - "A bootcluster variable is neither contained in the cluster - variables nor in the model coefficients.", - use_cli_format = TRUE - ) + bootcluster_neither_cluster_nor_params() } } else { is_subcluster <- FALSE @@ -960,30 +941,16 @@ get_cluster <- bootcluster[unlist(object$obs_selection), , drop = FALSE] } if (NROW(cluster) != N) { - rlang::abort( - "The number of observations in 'cluster' and 'nobs()' do not match", - use_cli_format = TRUE - ) + cluster_nobs_error() } if (NROW(bootcluster) != N) { - rlang::abort( - "The number of observations in 'bootcluster' and 'nobs()' do not match", - use_cli_format = TRUE - ) + bootcluster_nobs_error() } if (any(is.na(cluster))) { - rlang::abort( - "`boottest()` cannot handle NAs in `clustid` variables that are not - part of the estimated model object.", - use_cli_format = TRUE - ) + nas_in_clustid_error() } if (any(is.na(bootcluster))) { - rlang::abort( - "`boottest()` cannot handle NAs in `bootcluster` variables that are - not part of the estimated model object.", - use_cli_format = TRUE - ) + nas_in_bootclustid_error() } clustid_dims <- length(clustid_char) #' @srrstats {G2.4} @@ -1014,7 +981,9 @@ get_cluster <- #' @srrstats {G5.8c} *Data with all-`NA` fields or columns or all identical #' fields or columns* If all cluster vars are NA, this leads to an error. if(any(N_G == 1)){ - stop("A clustering variable contains only one group. This is not allowed.") + stop( + "A clustering variable contains only one group. This is not allowed." + ) } # now do all the other bootcluster things c1 <- diff --git a/R/run_bootstrap.R b/R/run_bootstrap.R index 857fecca..b4794e0d 100644 --- a/R/run_bootstrap.R +++ b/R/run_bootstrap.R @@ -55,8 +55,8 @@ run_bootstrap <- function( #' algorithm globally by using the #' `setBoottest_engine()` function. #' @param preprocess A list: output of the preprocess2 function. - #' @param bootstrap_type Determines which wild cluster bootstrap type should be - #' run. Options are "fnw11","11", "13", "31" and "33" for the wild cluster + #' @param bootstrap_type Determines which wild cluster bootstrap type should + #' be run. Options are "fnw11","11", "13", "31" and "33" for the wild cluster #' bootstrap and "11" and "31" for the heteroskedastic bootstrap. #' For more information, see the details section. "fnw11" is the default for #' the cluster bootstrap, which runs a "11" type @@ -64,7 +64,8 @@ run_bootstrap <- function( #' (Roodman et al (2019)). "11" is the default for the heteroskedastic #' bootstrap. #' @param B number of bootstrap iterations - #' @param point_estimate The constraints vector R times the estimated coefficients, R x beta + #' @param point_estimate The constraints vector R times the + #' estimated coefficients, R x beta #' @param impose_null logical scalar. Should the null be imposed on the #' bootstrap dgp or not? #' @param r Shifts the null hypothesis. @@ -99,8 +100,8 @@ run_bootstrap <- function( #' @param clustid The name of the cluster variables, as a character vector #' @param fe The name of the fixed effect, as a character scalar #' @param R_long The (internally) transformed constraints vector - #' @param heteroskedastic If TRUE, runs the heteroskedastic wild bootstrap. FALSE for - #' all wild cluster bootstrap variants + #' @param heteroskedastic If TRUE, runs the heteroskedastic wild bootstrap. + #' FALSE for all wild cluster bootstrap variants #' @param ssc An object of class `boot_ssc.type` obtained with the function #' [fwildclusterboot::boot_ssc()]. Represents how the small sample #' adjustments are computed. The defaults are `adj = TRUE, fixef.K = "none", @@ -121,8 +122,8 @@ run_bootstrap <- function( #' @param sampling 'dqrng' or 'standard'. If 'dqrng', the 'dqrng' package is #' used for random number generation (when available). If 'standard', #' functions from the 'stats' package are used when available. - #' This argument is mostly a convenience to control random number generation in - #' a wrapper package around `fwildclusterboot`, `wildrwolf`. + #' This argument is mostly a convenience to control random number generation + #' in a wrapper package around `fwildclusterboot`, `wildrwolf`. #' I recommend to use the fast' option. #' @param bootcluster A character vector or rhs formula of length 1. Specifies #' the bootstrap clustering variable or variables. If more @@ -148,14 +149,15 @@ run_bootstrap <- function( #' #' @section Different Bootstrap Implementations / Algorithms: #' \itemize{ - #' \item `boot_algo_textbook_cpp.R`: Implements the heteroskedastic wild bootstrap + #' \item `boot_algo_textbook_cpp.R`: Implements the heteroskedastic wild + #' bootstrap #' (`WildboottestHC`) as well as the textbook wild cluster bootstrap #' (`WildboottestHC`) in (R)cpp. - #' \item `boot_algo_fastnwild.R`: Implements the 'classical' wild cluster bootstrap - #' via the "Fast and Wild" algorithm (Roodman et al, 2019). - #' \item `boot_algo_fastnreliable.R`: Implements the ('classical') as well as 'S', 'C' and - #' 'V' wild cluster bootstrap types following algorithms in - #' "Fast and Reliable" (MacKinnon et al, 2023). + #' \item `boot_algo_fastnwild.R`: Implements the 'classical' wild cluster + #' bootstrap via the "Fast and Wild" algorithm (Roodman et al, 2019). + #' \item `boot_algo_fastnreliable.R`: Implements the ('classical') as + #' well as 'S', 'C' and V' wild cluster bootstrap types following algorithms + #' in "Fast and Reliable" (MacKinnon et al, 2023). #' \item `boot_algo_julia.R`: Wrapper around 'WildBootTests.jl". #' } #' @@ -273,10 +275,7 @@ run_bootstrap <- function( fedfadj <- 0L if (ssc[["fixef.K"]] != "none") { - rlang::inform( - "Currently, boottest() only supports fixef.K = 'none'.", - use_cli_format = TRUE - ) + no_fixef.K_warning() } res <- boot_algo_julia( diff --git a/R/srr-stats-standards.R b/R/srr-stats-standards.R index bd5538de..9fabdd07 100644 --- a/R/srr-stats-standards.R +++ b/R/srr-stats-standards.R @@ -14,152 +14,104 @@ #' time. #' #' @srrstats {G1.3} *All statistical terminology should be clarified and -#' unambiguously defined.* I have prepared a small paper that attempts to -#' clarify all statistical terms, which I am happy to send to reviewers. +#' unambiguously defined.* I have added extensive sets of vignettes, +#' e.g. in this +#' [vignette]( +#' https://s3alfisc.github.io/fwildclusterboot/articles/wild_bootstrap.html). #' #' @srrstats {G1.4} *`roxygen2` is used throughout the package to document -#' all functions.* +#' all functions.* You can open any package or check the docs to verify this. #' #' @srrstats {G1.4a} *All internal (non-exported) functions should also be #' documented in standard [`roxygen2`](https://roxygen2.r-lib.org/) format, #' along with a final `@noRd` tag to suppress automatic generation of `.Rd` -#' files.* Done. +#' files.* Everything is documented or has a noRd tag. #' #' @srrstats {G1.6} *Software should include code necessary to compare #' performance claims with alternative implementations in other R packages.* -#' Is a link to this blog post sufficient? +#' Here is a link to a blog post that demonstrates that the package is fast. +#' But I don't think the package makes any performance claims against other +#' packages, just mentions that it is fast =). #' ://s3alfisc.github.io/blog/post/1000x-faster-wild-cluster-bootstrap #' -inference-in-r-with-fwildclusterboot/ #' #' @srrstats {G2.0a} Provide explicit secondary documentation of any -#' expectations on lengths of inputs. I think this is sufficiently documented. -#' - -#' -#' @srrstats {G2.6} *Software which accepts one-dimensional input should ensure -#' values are appropriately pre-processed regardless of class structures.* (?) -#' different methods are checked via 'check_methods_equivalence.R' -#' -#' @srrstats {G2.7} *Software should accept as input as many of the above -#' standard tabular forms as possible, including extension to domain-specific -#' forms.* No tabular data is passed to boottest(). -#' -#' @srrstats {G2.10} *Software should ensure that extraction or filtering of -#' single columns from tabular inputs should not presume any particular -#' default behaviour, and should ensure all column-extraction operations -#' behave consistently regardless of the class of tabular data used as input.* -#' NA, as no tabular data inputs. -#' -#' @srrstats {G2.11} *Software should ensure that `data.frame`-like tabular -#' objects which have columns which do not themselves have standard class -#' attributes (typically, `vector`) are appropriately processed, and do -#' not error without reason. This behaviour should be tested. Again, columns -#' created by the [`units` package](https://github.com/r-quantities/units/) -#' provide a good test case.* NA, as no tabular input data. -#' -#' @srrstats {G2.12} *Software should ensure that `data.frame`-like tabular -#' objects which have list columns should ensure that those columns are -#' appropriately pre-processed either through being removed, converted to -#' equivalent vector columns where appropriate, or some other appropriate -#' treatment such as an informative error. This behaviour should be tested.* -#' Does not happen. NA, as no tabular input data. +#' expectations on lengths of inputs. - I am not sure where to best place +#' this tag in my codebase. #' -#' -#' @srrstats {G2.14} *see G2.13* -#' -#' @srrstats {G2.14a} *see G2.13* +#' @srrstatsTODO {G2.14a} *error on missing data*. *see G2.13* #' #' @srrstats {G2.14b} *see G2.13* #' #' @srrstats {G2.14c} *see G2.13* #' #' @srrstats {G3.0} *No floating point numbers (rational numbers) are -#' compared with equality.* Yes, checked. +#' compared with equality.* I have verified this. #' #' @srrstats {G3.1} *Statistical software which relies on covariance #' calculations should enable users to choose between different algorithms #' for calculating covariances, and should not rely solely on covariances -#' from the `stats::cov` function.* The package deals with "clustered" -#' standard errors, but does not produce covariance matrices. -#' Non-clustered tests based on the 'regular' wild bootstrap -#' are also supported. +#' from the `stats::cov` function.* +#' The `bootstrap_type` function argument provides this +#' functionality allows users to apply different covariance matrices for the +#' wild cluster bootstrap. #' #' @srrstats {G3.1a} *The ability to use arbitrarily specified covariance #' methods should be documented (typically in examples or vignettes).* -#' See above. -#' -#' @srrstats {G4.0} *Statistical Software which enables outputs to be -#' written to local files should parse parameters specifying file names -#' to ensure appropriate file suffices are automatically generated where -#' not provided.* No output can be written to local files. -#' +#' Documented in [this vignette]( +#' https://s3alfisc.github.io/fwildclusterboot/articles/ +#' Different-Variants-of-the-Wild-Cluster-Bootstrap.html). #' #' @srrstats {G5.2a} *Every message produced within R code by `stop()`, -#' `warning()`, `message()`, or equivalent should be unique* Done. -#' +#' `warning()`, `message()`, or equivalent should be unique*. Errors +#' warnings and messages are unique. #' #' @srrstats {G5.8} **Edge condition tests** *to test that these conditions #' produce expected behaviour such as clear warnings or errors when confronted #' with data with extreme properties including but not limited to:* Tests for #' non-randomness under full enumeration: #' -#' @srrstats {RE1.1} *Regression Software should document how formula interfaces -#' are converted to matrix representations of input data.* Not applicable. Formulas -#' only create scalars. -#' -#' @srrstats {RE1.4} *Regression Software should document any assumptions made -#' with regard to input data; for example distributional assumptions, or -#' assumptions that predictor data have mean values of zero. Implications of -#' violations of these assumptions should be both documented and tested.* -#' The bootstrap weight options are described in a separate vignette article. -#' In general, the wild bootstrap does not make any distributional assumptions -#' for estimation. -#' -#' @srrstats {RE3.1} *Enable such messages to be optionally suppressed, yet -#' should ensure that the resultant model object nevertheless includes -#' sufficient data to identify lack of convergence.* In my experience, -#' convergence is never a problem, but confidence interval inversion -#' sometimes fails. I suspect that I can improve the error messageas :) -#' - #' @srrstats {RE4.4} *The specification of the model, generally as a -#' formula (via `formula()`)* Done as not applicable. -#' -#' -#' @srrstats {RE4.17} *Model objects returned by Regression Software should -#' implement or appropriately extend a default `print` method which provides -#' an on-screen summary of model (input) parameters and (output) coefficients.* -#' Done. -#' +#' formula (via `formula()`)* Not applicable. #' #' @srrstats {RE5.0} *Scaling relationships between sizes of input data -#' (numbers of observations, with potential extension to numbers of -#' variables/columns) and speed of algorithm.* -#' +#' (numbers of observations, with potential extension to numbers of +#' variables/columns) and speed of algorithm.* +#' The asymptotic computational is O(G**2 x B). #' -#' @srrstats {RE6.1} *Where the default `plot` method is **NOT** a generic -#' `plot` method dispatched on the class of return objects (that is, -#' through an S3-type `plot.` function or equivalent), that -#' method dispatch (or equivalent) should nevertheless exist in order -#' to explicitly direct users to the appropriate function.* -#' -#' @srrstats {RE6.2} *The default `plot` method should produce a plot of -#' the `fitted` values of the model, with optional visualisation of confidence -#' intervals or equivalent.* -#' -#' @srrstats {RE6.3} *Where a model object is used to generate a forecast -#' (for example, through a `predict()` method), the default `plot` method -#' should provide clear visual distinction between modelled (interpolated) -#' and forecast (extrapolated) values.* -#' -#' @srrstats {RE7.0} *Tests with noiseless, exact relationships between -#' predictor (independent) data.* +#' @srrstatsTODO {G5.9a} *Adding trivial noise (for example, at the +#' scale of `.Machine$double.eps`) to data does not meaningfully change +#' results* #' #' @noRd + + NULL + #' NA_standards #' +#' +#' @srrstatsNA {RE3.1} *Enable such messages to be optionally suppressed, yet +#' should ensure that the resultant model object nevertheless includes +#' sufficient data to identify lack of convergence.* In my experience, +#' convergence is never a problem, but confidence interval inversion +#' sometimes fails. I suspect that I can improve the error messageas :) +#' +#' @srrstatsNA {RE4.1} *Regression Software may enable an ability to +#' generate a model object without actually fitting values. This may +#' be useful for controlling batch processing of computationally intensive +#' fitting algorithms.* I don't see how this +#' would work and be useful for `boottest()`. +#' +#' @srrstatsNA {RE4.2} *Model coefficients (via `coef()` / `coefficients()`)* +#' Not relevant, provided by the regression package. +#' +#' @srrstatsNA {G4.0} *Statistical Software which enables outputs to be +#' written to local files should parse parameters specifying file names +#' to ensure appropriate file suffices are automatically generated where +#' not provided.* No output can be written to local files. +#' #' Any non-applicable standards can have their tags changed from `@srrstatsTODO` #' to `@srrstatsNA`, and placed together in this block, along with explanations #' for why each of these standards have been deemed not applicable. @@ -182,6 +134,13 @@ NULL #' other routines to ensure inputs follow these expectations.* No input #' assumed to be of type factor. #' +#' @srrstatsNA {G2.6} *Software which accepts one-dimensional input should ensure +#' values are appropriately pre-processed regardless of class structures.* +#' +#' @srrstatsNA {G2.7} *Software should accept as input as many of the above +#' standard tabular forms as possible, including extension to domain-specific +#' forms.* No tabular data is passed to boottest(). +#' #' @srrstatsNA {G2.9} *Software should issue diagnostic messages for type #' conversion in which information is lost (such as conversion of variables #' from factor to character; standardisation of variable names; or removal @@ -190,6 +149,31 @@ NULL #' (such as insertion of variable or column names where none were provided).* #' Type conversion with information loss never happens. #' +#' @srrstatsNA {G2.10} *Software should ensure that extraction or filtering of +#' single columns from tabular inputs should not presume any particular +#' default behaviour, and should ensure all column-extraction operations +#' behave consistently regardless of the class of tabular data used as input.* +#' NA, as no tabular data inputs. +#' +#' @srrstatsNA {G2.11} *Software should ensure that `data.frame`-like tabular +#' objects which have columns which do not themselves have standard class +#' attributes (typically, `vector`) are appropriately processed, and do +#' not error without reason. This behaviour should be tested. Again, columns +#' created by the [`units` package](https://github.com/r-quantities/units/) +#' provide a good test case.* NA, as no tabular input data. +#' +#' @srrstatsNA {G2.12} *Software should ensure that `data.frame`-like tabular +#' objects which have list columns should ensure that those columns are +#' appropriately pre-processed either through being removed, converted to +#' equivalent vector columns where appropriate, or some other appropriate +#' treatment such as an informative error. This behaviour should be tested.* +#' Does not happen. NA, as no tabular input data. +#' +#' @srrstatsNA {G2.14} *Where possible, all functions should provide options +#' for users to specify how to handle missing (`NA`) data, with options +#' minimally including:* This is not applicable. The data is pre-processed +#' in the regression class, there is no need to do so again for `boottest()`. +#' #' @srrstatsNA {G5.3} *For functions which are expected to return objects #' containing no missing (`NA`) or undefined (`NaN`, `Inf`) values, the #' absence of any such values in return objects should be explicitly tested.* @@ -233,10 +217,12 @@ NULL #' implementations is not available* Code is tested against #' WildBootTests.jl / fixest::feols(). #' -#' @srrstatsNA {G5.8a} *Zero-length data* +#' @srrstatsNA {G5.8a} *Zero-length data* - No data is passed to any of the +#' `boottest()` functions. #' #' @srrstatsNA {G5.8b} *Data of unsupported types (e.g., character or complex -#' numbers in for functions designed only for numeric data)* +#' numbers in for functions designed only for numeric data)* No data is passed +#' to `boottest()`. #' #' @srrstatsNA {G5.8d} *Data outside the scope of the algorithm (for example, #' data with more fields (columns) than observations (rows) for some @@ -246,8 +232,10 @@ NULL #' for expected stochastic behaviour, such as through the following #' conditions:* This might not make sense for a bootstrap package? #' -#' @srrstatsNA {G5.9a} *Adding trivial noise (for example, at the scale of -#' `.Machine$double.eps`) to data does not meaningfully change results* +#' @srrstatsNA {RE1.1} *Regression Software should document how formula interfaces +#' are converted to matrix representations of input data.* Not applicable. +#' Formulas only create one dimensional columns (for cluster variables or +#' fixed effect). #' #' @srrstatsNA {RE1.2} *Regression Software should document expected format #' (types or classes) for inputting predictor variables, including descriptions @@ -255,14 +243,9 @@ NULL #' regression package, unless fixed effects are projected out - then #' boottest() might transform them into factors, if required. #' -#' @srrstatsNA {RE1.3} *Regression Software which passes or otherwise -#' transforms aspects of input data onto output structures should ensure -#' that those output structures retain all relevant aspects of input data, -#' notably including row and column names, and potentially information from -#' other `attributes()`.* -#' #' @srrstatsNA {RE1.3a} *Where otherwise relevant information is not #' transferred, this should be explicitly documented.* +#' No relevant info is not retained. #' #' @srrstatsNA {RE2.1} *Regression Software should implement explicit #' parameters controlling the processing of missing values, ideally @@ -310,53 +293,83 @@ NULL #' sense to me :) #' #' @srrstatsNA {RE4.6} *The variance-covariance matrix of the model parameters -#' (via `vcov()`)* Inference based on p-values and t-stats, not vcov's. +#' (via `vcov()`)* Inference based on p-values and t-stats, not vcov's. In +#' fact, `boottest()` is fast because it never explicitly computes a vcov +#' matrix. #' #' @srrstatsNA {RE4.7} *Where appropriate, convergence statistics* I think #' supplying convergence stats for the test inversion procedure would be too -#' much. +#' much. #' #' @srrstatsNA {RE4.8} *Response variables, and associated "metadata" where -#' applicable.* +#' applicable.* Done by the regression package. #' -#' @srrstatsNA {RE4.9} *Modelled values of response variables.* +#' @srrstatsNA {RE4.9} *Modelled values of response variables.* Done by the +#' regression package. #' #' @srrstatsNA {RE4.10} *Model Residuals, including sufficient documentation #' to enable interpretation of residuals, and to enable users to submit -#' residuals to their own tests.* +#' residuals to their own tests.* Unfortunately not possible. I could +#' output all bootstrapped residuals, but I don't really see the use for it? #' #' @srrstatsNA {RE4.11} *Goodness-of-fit and other statistics associated such -#' as effect sizes with model coefficients.* +#' as effect sizes with model coefficients.* Done by the regression model. #' #' @srrstatsNA {RE4.12} *Where appropriate, functions used to transform input -#' data, and associated inverse transform functions.* +#' data, and associated inverse transform functions.*Done by the regression +#' model. #' #' @srrstatsNA {RE4.13} *Predictor variables, and associated "metadata" where -#' applicable.* +#' applicable.*Done by the regression model. #' #' @srrstatsNA {RE4.14} *Where possible, values should also be provided for -#' extrapolation or forecast *errors*.* +#' extrapolation or forecast *errors*. Done by the regression model. #' #' @srrstatsNA {RE4.15} *Sufficient documentation and/or testing should be #' provided to demonstrate that forecast errors, confidence intervals, or -#' equivalent values increase with forecast horizons.* +#' equivalent values increase with forecast horizons.* Done by the regression +#' model. #' #' @srrstatsNA {RE4.16} *Regression Software which models distinct responses #' for different categorical groups should include the ability to submit new -#' groups to `predict()` methods.* +#' groups to `predict()` methods.* Done by the regression model. +#' +#' @srrstatsNA {RE6.1} *Where the default `plot` method is **NOT** a generic +#' `plot` method dispatched on the class of return objects (that is, +#' through an S3-type `plot.` function or equivalent), that +#' method dispatch (or equivalent) should nevertheless exist in order +#' to explicitly direct users to the appropriate function.* +#' Not applicable, as plot method is a generic, and that standard only +#' applies to packages without generic plot methods +#' +#' @srrstatsNA {RE6.2} *The default `plot` method should produce a plot of +#' the `fitted` values of the model, with optional visualisation of confidence +#' intervals or equivalent.* This is not really relevant here, instead +#' plot() visualises the bootstrap results. +#' +#' @srrstatsNA {RE6.3} *Where a model object is used to generate a forecast +#' (for example, through a `predict()` method), the default `plot` method +#' should provide clear visual distinction between modelled (interpolated) +#' and forecast (extrapolated) values.* No forecasting possible with +#' `boottest()` +#' +#' @srrstatsNA {RE7.0} *Tests with noiseless, exact relationships between +#' predictor (independent) data.* #' #' @srrstatsNA {RE7.0a} In particular, these tests should confirm ability to -#' reject perfectly noiseless input data. +#' reject perfectly noiseless input data. Done by the regression model. #' #' @srrstatsNA {RE7.1} *Tests with noiseless, exact relationships between -#' predictor (independent) and response (dependent) data.* +#' predictor (independent) and response (dependent) data.* Done by the +#' regression model. #' #' @srrstatsNA {RE7.1a} *In particular, these tests should confirm that model #' fitting is at least as fast or (preferably) faster than testing with -#' equivalent noisy data (see RE2.4b).* +#' equivalent noisy data (see RE2.4b).* Done by the regression model. #' #' @srrstatsNA {RE7.2} Demonstrate that output objects retain aspects of input -#' data such as row or case names (see **RE1.3**). +#' data such as row or case names (see **RE1.3**). Done by the regression +#' model. #' #' @srrstatsNA {RE7.3} Demonstrate and test expected behaviour when objects #' returned from regression software are submitted to the accessor methods @@ -366,4 +379,4 @@ NULL #' tests should demonstrate and confirm that forecast errors, confidence #' intervals, or equivalent values increase with forecast horizons. -NULL +NULL \ No newline at end of file diff --git a/R/utils.R b/R/utils.R index 18589d6c..91ef7de6 100644 --- a/R/utils.R +++ b/R/utils.R @@ -226,12 +226,7 @@ getBoottest_engine <- function() { x <- getOption("boottest_engine") if (!(x %in% c("R", "WildBootTests.jl"))) { - rlang::abort( - "The value of getOption(\"boottest_engine\") is currently not legal. - Please use function setBoottest_engine to set it to an appropriate - value. ", - use_cli_format = TRUE - ) + boottest_engine_error() } x } @@ -292,14 +287,8 @@ getBoottest_nthreads <- function() { x <- getOption("boottest_nthreads") if (length(x) != 1 || !is.numeric(x) || is.na(x) || x %% 1 != 0 || x < 0) { - rlang::abort("The value of getOption(\"boottest_nthreads\") is currently not legal. - Please use function setBoottest_nthreads to set it to an appropriate - value. ", - use_cli_format = TRUE - ) + boottest_nthreads_error() } - # cat("getBoottest nr threads \n") - # print(x) x } @@ -452,7 +441,8 @@ find_proglang <- function(lang){ #' can be found on the PATH. #' #' Based on Mauro Lepore's great suggestion - #' https://github.com/ropensci/software-review/issues/546#issuecomment-1416728843 + #' https://github.com/ropensci/ + #' software-review/issues/546#issuecomment-1416728843 #' #' @param lang which language to check. Either 'julia' or 'python' #' @@ -470,61 +460,4 @@ find_proglang <- function(lang){ language_found -} - - -inform_seed <- function(frequency_id, engine){ - - if(engine != "WildBootTests.jl"){ - - rlang::inform( - "Too guarantee reproducibility, don't forget to set a - global random seed **both** via `set.seed()` and `dqrng::dqset.seed()`.", - use_cli_format = TRUE, - .frequency = "regularly", - .frequency_id = frequency_id - ) - - } else { - - rlang::inform( - "Too guarantee reproducibility, don't forget to set a - global random seed via `set.seed()`.", - use_cli_format = TRUE, - .frequency = "regularly", - .frequency_id = frequency_id - ) - - } - - - -} - -is_congruent <- function(x, y){ - - #' check if two vectors x and y are congruent - #' @param x a vector - #' @param y another vector of length(y) == length(x) - #' @return A logical. TRUE if the two vectors are congruent - #' @examples - #' x <- c(1, 1, 1, 2, 2, 3, 4, 4, 4) - #' y <- c("a", "a", "a", "b", "b", "b", "b", "b", "b") - #' is_congruent(x, y) - #' @noRd - - dreamerr::check_arg(x,"vector") - dreamerr::check_arg(y, "vector") - - u_x <- unique(x) - u_y <- unique(y) - - if(length(u_x) < length(u_y)){ - rlang::abort("") - } - - tab <- table(x,y) - !any(colSums(tab != 0) != 1) - -} - +} \ No newline at end of file diff --git a/README.Rmd b/README.Rmd index 5737900d..bcdcb883 100644 --- a/README.Rmd +++ b/README.Rmd @@ -1,6 +1,6 @@ --- output: github_document -editor_options: +editor_options: chunk_output_type: console --- @@ -15,14 +15,14 @@ knitr::opts_chunk$set( ) ``` -# fwildclusterboot +# fwildclusterboot [![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/546_status.svg)](https://github.com/ropensci/software-review/issues/546) -[![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html) +[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html) [![CRAN status](https://www.r-pkg.org/badges/version/fwildclusterboot)](https://CRAN.R-project.org/package=fwildclusterboot) ![runiverse-package](https://s3alfisc.r-universe.dev/badges/fwildclusterboot) [![R-CMD-check](https://github.com/s3alfisc/fwildclusterboot/workflows/R-CMD-check/badge.svg)](https://github.com/s3alfisc/fwildclusterboot/actions) @@ -48,14 +48,14 @@ df$`boot algo` <- df$type ``` ```{r, fig.width=10, fig.height=3, echo = FALSE, warning = FALSE, message = FALSE} -ggplot(data = df, aes(x = B, y = time, color = `boot algo`)) + - facet_wrap(~N_G, nrow = 1) + - geom_point() + -scale_y_continuous(trans='log10') + +ggplot(data = df, aes(x = B, y = time, color = `boot algo`)) + + facet_wrap(~N_G, nrow = 1) + + geom_point() + +scale_y_continuous(trans='log10') + labs(title = "Benchmarks", caption = "N = 10_000, k = 20 covariates and one cluster of dimension N_G (3 iterations each, median runtime is plotted).")+ - #theme_bw() + - xlab("Bootstrap iterations") + - ylab("time in seconds, log scale") + + #theme_bw() + + xlab("Bootstrap iterations") + + ylab("time in seconds, log scale") + theme_bw() ``` @@ -76,17 +76,17 @@ Additional features are provided through `WildBootTests.jl`: + A highly optimized version of the Wild Restricted Efficient bootstrap (WRE) for IV/2SLS/LIML [(Davidson & MacKinnon, 2010)](https://www.tandfonline.com/doi/abs/10.1198/jbes.2009.07221). + Arbitrary and multiple linear hypotheses in the parameters. -`{fwildclusterboot}` supports the following models: +`{fwildclusterboot}` supports the following models: + OLS: `lm` (from stats), `fixest` (from fixest), `felm` from (lfe) -+ IV: `ivreg` (from ivreg). ++ IV: `ivreg` (from ivreg). ### Installation You can install compiled versions of`{fwildclusterboot}` from CRAN (compiled), R-universe (compiled) or github by following one of the steps below: ```{r, eval = FALSE} -# from CRAN +# from CRAN install.packages("fwildclusterboot") # from r-universe (windows & mac, compiled R > 4.0 required) @@ -106,7 +106,7 @@ library(fwildclusterboot) # set seed via dqset.seed for engine = "R" & Rademacher, Webb & Normal weights dqrng::dqset.seed(2352342) -# set 'familiar' seed for all other algorithms and weight types +# set 'familiar' seed for all other algorithms and weight types set.seed(23325) data(voters) @@ -118,7 +118,7 @@ lm_boot <- boottest(lm_fit, clustid = c("group_id1"), B = 9999, param = "treatme summary(lm_boot) ``` -## Citation +## Citation If you are in `R`, you can simply run the following command to get the BibTeX citation for `{fwildclusterboot}`: @@ -126,4 +126,4 @@ If you are in `R`, you can simply run the following command to get the BibTeX ci citation("fwildclusterboot") ``` -Alternatively, if you prefer to cite the "Fast & Wild" paper by Roodman et al, it would be great if you mentioned `{fwildclusterboot}` in a footnote =) ! +Alternatively, if you prefer to cite the "Fast & Wild" paper by Roodman et al, it would be great if you mentioned `{fwildclusterboot}` in a footnote =) ! diff --git a/man/boot_aggregate.Rd b/man/boot_aggregate.Rd index 00e0fb25..ea1fa24b 100644 --- a/man/boot_aggregate.Rd +++ b/man/boot_aggregate.Rd @@ -206,26 +206,26 @@ data(base_stagg) res_sunab = feols(y ~ x1 + sunab(year_treated, year) | id + year, base_stagg) res_sunab_3ref = feols(y ~ x1 + sunab( year_treated, year, ref.p = c(.F + 0:2, -1)) | - id + year, + id + year, cluster = "id", - base_stagg, + base_stagg, ssc = ssc(adj = FALSE, cluster.adj = FALSE)) aggregate(res_sunab, agg = "ATT") # test ATT equivalence -boot_att <- +boot_att <- boot_aggregate( - res_sunab, - B = 9999, - agg = "ATT", - clustid = "id" + res_sunab, + B = 9999, + agg = "ATT", + clustid = "id" ) head(boot_att) -#'boot_agg2 <- +#'boot_agg2 <- boot_aggregate( - res_sunab, - B = 99999, + res_sunab, + B = 99999, agg = TRUE, ssc = boot_ssc(adj = FALSE, cluster.adj = FALSE) ) diff --git a/man/boottest.Rd b/man/boottest.Rd index 42f50ace..036701d1 100644 --- a/man/boottest.Rd +++ b/man/boottest.Rd @@ -22,7 +22,8 @@ the fast wild bootstrap algorithm developed in Roodman et al., 2019. \details{ Technical Details For technical details, either take a look at the references below, or check -out the \href{wild_bootstrap.html}{wild (cluster) bootstrap vignette}. +out the +\href{https://s3alfisc.github.io/fwildclusterboot/articles/wild_bootstrap.html}{wild (cluster) bootstrap vignette}. } \section{Setting Seeds}{ diff --git a/man/boottest.felm.Rd b/man/boottest.felm.Rd index 848dc964..bdad2405 100644 --- a/man/boottest.felm.Rd +++ b/man/boottest.felm.Rd @@ -252,19 +252,9 @@ stops the procedure after 10 iterations. \section{Multiple Fixed Effects}{ -If your felm() model contains fixed effects, boottest() will internally convert all fixed -effects but the one specified via the \code{fe} argument to dummy variables. -} - -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Not that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. +If your felm() model contains fixed effects, boottest() will internally +convert all fixed effects but the one specified via the \code{fe} +argument to dummy variables. } \section{Stata, Julia and Python Implementations}{ diff --git a/man/boottest.fixest.Rd b/man/boottest.fixest.Rd index 994bd124..d803b421 100644 --- a/man/boottest.fixest.Rd +++ b/man/boottest.fixest.Rd @@ -249,19 +249,9 @@ stops the procedure after 10 iterations. \section{Multiple Fixed Effects}{ -If your feols() model contains fixed effects, boottest() will internally convert all fixed -effects but the one specified via the \code{fe} argument to dummy variables. -} - -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Note that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. +If your feols() model contains fixed effects, boottest() will internally +convert all fixed effects but the one specified via the \code{fe} argument to +dummy variables. } \section{Stata, Julia and Python Implementations}{ diff --git a/man/boottest.ivreg.Rd b/man/boottest.ivreg.Rd index 1590cace..a6f0f6f9 100644 --- a/man/boottest.ivreg.Rd +++ b/man/boottest.ivreg.Rd @@ -167,17 +167,6 @@ To guarantee reproducibility, you need to set a global random seed via \code{set.seed()} } -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Not that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. -} - \examples{ \dontrun{ requireNamespace("ivreg") diff --git a/man/boottest.lm.Rd b/man/boottest.lm.Rd index d5cef508..9d0082ab 100644 --- a/man/boottest.lm.Rd +++ b/man/boottest.lm.Rd @@ -244,17 +244,6 @@ stops the procedure after 10 iterations. \code{boottest} does not calculate standard errors. } -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Not that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. -} - \section{Stata, Julia and Python Implementations}{ The fast wild cluster bootstrap algorithms are further implemented in the diff --git a/man/check_boottest_args_plus.Rd b/man/check_boottest_args_plus.Rd new file mode 100644 index 00000000..c0811316 --- /dev/null +++ b/man/check_boottest_args_plus.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/arg_checks.R +\name{check_boottest_args_plus} +\alias{check_boottest_args_plus} +\title{several checks on the input args for 'boottest()'} +\usage{ +check_boottest_args_plus(object, R, param, sign_level, B, fe = NULL) +} +\arguments{ +\item{object}{an object of type lm, fixest or felm} + +\item{R}{Numeric vector defining the test} + +\item{param}{character vector - the parameters to be tested} + +\item{sign_level}{The significance level} + +\item{B}{the number of bootstrap iterations} + +\item{fe}{NULL or numeric scalar - the fixed effect to be projected +out in the bootstrap} +} +\description{ +several checks on the input args for 'boottest()' +} diff --git a/man/find_proglang.Rd b/man/find_proglang.Rd index 218be7e1..ed20b820 100644 --- a/man/find_proglang.Rd +++ b/man/find_proglang.Rd @@ -15,7 +15,8 @@ logical. TRUE if lang is found on path, FALSE if not } \description{ Based on Mauro Lepore's great suggestion -https://github.com/ropensci/software-review/issues/546#issuecomment-1416728843 +https://github.com/ropensci/ +software-review/issues/546#issuecomment-1416728843 } \examples{ diff --git a/man/mboottest.felm.Rd b/man/mboottest.felm.Rd index d799da7a..ccb30843 100644 --- a/man/mboottest.felm.Rd +++ b/man/mboottest.felm.Rd @@ -145,19 +145,9 @@ set a global random seed via \code{set.seed()} when using \section{Multiple Fixed Effects}{ -If your felm() model contains fixed effects, boottest() will internally convert -all fixed effects but the one specified via the \code{fe} argument to dummy variables. -} - -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Not that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. +If your felm() model contains fixed effects, boottest() will internally +convert all fixed effects but the one specified via the \code{fe} +argument to dummy variables. } \examples{ diff --git a/man/mboottest.fixest.Rd b/man/mboottest.fixest.Rd index 48f49c50..a63cc007 100644 --- a/man/mboottest.fixest.Rd +++ b/man/mboottest.fixest.Rd @@ -145,19 +145,9 @@ set a global random seed via\code{set.seed()} \section{Multiple Fixed Effects}{ -If your feols() model contains fixed effects, boottest() will internally convert all fixed -effects but the one specified via the \code{fe} argument to dummy variables. -} - -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Not that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. +If your feols() model contains fixed effects, boottest() will internally +convert all fixed effects but the one specified via the \code{fe} argument to +dummy variables. } \examples{ diff --git a/man/mboottest.lm.Rd b/man/mboottest.lm.Rd index 0988903e..231b47b9 100644 --- a/man/mboottest.lm.Rd +++ b/man/mboottest.lm.Rd @@ -137,17 +137,6 @@ To guarantee reproducibility, you need to set a global random seed via \code{set.seed()} } -\section{Run \code{boottest} quietly}{ - -You can suppress all warning and error messages by setting the following global -options: -\code{options(rlib_warning_verbosity = "quiet")} -\code{options(rlib_message_verbosity = "quiet")} -Note that this will turn off all warnings (messages) produced via \code{rlang::warn()} and -\code{rlang::inform()}, which might not be desirable if you use other software build on -\code{rlang}, as e.g. the \code{tidyverse}. -} - \examples{ \dontrun{ requireNamespace("clubSandwich") diff --git a/man/plot.boottest.Rd b/man/plot.boottest.Rd index e3ee133c..73143f18 100644 --- a/man/plot.boottest.Rd +++ b/man/plot.boottest.Rd @@ -2,9 +2,10 @@ % Please edit documentation in R/methods.R \name{plot.boottest} \alias{plot.boottest} -\title{Plots bootstrapped p-values as a function of the hypothesized effect size r for -a hypothesis test of the form R beta = r.The points where the p-values are 0.05 -are the boundaries of the bootstrapped confidence interval.} +\title{Plots bootstrapped p-values as a function of the hypothesized effect +size r for a hypothesis test of the form R beta = r.The points +where the p-values are 0.05 are the boundaries of the bootstrapped +confidence interval.} \usage{ \method{plot}{boottest}(x, ...) } @@ -17,9 +18,10 @@ are the boundaries of the bootstrapped confidence interval.} A plot of bootstrap t-statistics under different null hypotheses } \description{ -Plots bootstrapped p-values as a function of the hypothesized effect size r for -a hypothesis test of the form R beta = r.The points where the p-values are 0.05 -are the boundaries of the bootstrapped confidence interval. +Plots bootstrapped p-values as a function of the hypothesized effect +size r for a hypothesis test of the form R beta = r.The points +where the p-values are 0.05 are the boundaries of the bootstrapped +confidence interval. } \examples{ requireNamespace("fwildclusterboot") diff --git a/man/print.boottest.Rd b/man/print.boottest.Rd index f725cab1..4b28ebb7 100644 --- a/man/print.boottest.Rd +++ b/man/print.boottest.Rd @@ -33,5 +33,4 @@ boot <- boottest(lm_fit, clustid = "group_id1" ) print(boot) - } diff --git a/tests/testthat/test-convergence.R b/tests/testthat/test-convergence.R index a5917b1d..13841909 100644 --- a/tests/testthat/test-convergence.R +++ b/tests/testthat/test-convergence.R @@ -1,19 +1,21 @@ test_that("algorithm performance test", { - #' @srrstats {G5.7} **Algorithm performance tests** *to test that implementation + #' @srrstats {G5.7} **Algorithm performance tests** *to test that + #' implementation #' performs as expected as properties of data change. For instance, a #' test may show that parameters approach correct estimates within tolerance #' as data size increases, or that convergence times decrease for higher #' convergence thresholds.* See test-convergence.R, which tests if the #' bootstrap converges in probability for B -> infinity. - #' @srrstats {G5.9b} *Running under different random seeds or initial conditions - #' does not meaningfully change results* See the convergence tests. + #' @srrstats {G5.9b} *Running under different random seeds or initial + #' conditions + #' does not meaningfully change results* See the convergence tests. skip_on_cran() - skip_on_ci() + #skip_on_ci() skip_if_not( find_proglang("julia"), diff --git a/tests/testthat/test-error_warning.R b/tests/testthat/test-error_warning.R index e53bcaf6..803c887c 100644 --- a/tests/testthat/test-error_warning.R +++ b/tests/testthat/test-error_warning.R @@ -241,7 +241,7 @@ test_that("errors and warnings q = 1", { # rademacher enumeration case if (engine != "R-lean") { - suppressWarnings(expect_warning( + suppressWarnings(expect_message( boottest( object = lm_fit, clustid = "group_id1", @@ -251,7 +251,7 @@ test_that("errors and warnings q = 1", { engine = engine ) )) - suppressWarnings(expect_warning( + suppressWarnings(expect_message( boottest( object = feols_fit, clustid = "group_id1", @@ -261,7 +261,7 @@ test_that("errors and warnings q = 1", { engine = engine ) )) - suppressWarnings(expect_warning( + suppressWarnings(expect_message( boottest( object = felm_fit, clustid = "group_id1", @@ -273,7 +273,7 @@ test_that("errors and warnings q = 1", { )) } - suppressWarnings(expect_warning( + suppressWarnings(expect_message( boottest( object = lm_fit, clustid = "group_id1", @@ -722,7 +722,7 @@ test_that("errors and warnings q = 1", { ) # no confidence intervals calculated: expect warning - expect_warning( + expect_message( boottest( object = lm_fit, clustid = "group_id1", @@ -1468,7 +1468,7 @@ test_that("error warning IV/WRE and q > 1", { )) # enumeration warning - expect_warning(suppressMessages( + expect_message( boottest( object = ivreg_fit, clustid = "ethnicity", @@ -1477,7 +1477,7 @@ test_that("error warning IV/WRE and q > 1", { type = "rademacher", conf_int = FALSE ) - )) + ) # drop all NA values from SchoolingReturns # SchoolingReturns <- diff --git a/tests/testthat/test-full-enumeration.R b/tests/testthat/test-full-enumeration.R index 6f87f8ee..22f46155 100644 --- a/tests/testthat/test-full-enumeration.R +++ b/tests/testthat/test-full-enumeration.R @@ -107,7 +107,8 @@ test_that("test full enumeration cases: r and r-lean", { # note: difference in p-values due to discrete jumps: # t-statistics are not directly identical (discrepancies at order e-14) - # therefore, the actual t-statistic t can be smaller, larger, or lie within + # therefore, the actual t-statistic t can be smaller, larger, or lie + # within # the bootstrapped test statisics for which all weights are 1 or -1 # as there are only 2^N_G_bootcluster t-stats and only # 2^(N_G_bootcluster -1 ) @@ -117,7 +118,8 @@ test_that("test full enumeration cases: r and r-lean", { # and depending where t lies (smaller, larger, or within), the p-value can # be x / (2^N_G_bootcluster), (x-1) / (2^N_G_bootcluster), # (x+1) / (2^N_G_bootcluster) - # I therefore set the tolerance to 1 / (2^N_G_bootcluster) for all p-values + # I therefore set the tolerance to 1 / (2^N_G_bootcluster) for all + # p-values # see sum(sort(boot_r$t_boot) - sort(boot_jl$t_boot)) != 0L diff --git a/tests/testthat/test-method_equivalence.R b/tests/testthat/test-method_equivalence.R index 072a772b..97e09624 100644 --- a/tests/testthat/test-method_equivalence.R +++ b/tests/testthat/test-method_equivalence.R @@ -6,22 +6,22 @@ test_that("Do different, but equivalent ways to specify message = "skip test as julia installation not found." ) - set.seed(2351) - dqrng::dqset.seed(2351) + set.seed(23351) + dqrng::dqset.seed(23513) print_results <- FALSE data1 <<- fwildclusterboot:::create_data( - N = 10000, + N = 14000, N_G1 = 20, icc1 = 0.01, N_G2 = 10, icc2 = 0.01, numb_fe1 = 10, numb_fe2 = 10, - seed = 71986045 + seed = 7198045 ) # sapplydata1, class) @@ -188,7 +188,7 @@ test_that("Do different, but equivalent ways to specify # boottest() cat("boottest()", "\n") set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_lm_", engine), suppressWarnings( @@ -207,7 +207,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest1_", engine), suppressWarnings( @@ -226,7 +226,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest2_", engine), suppressWarnings( @@ -245,7 +245,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest3_", engine), suppressWarnings( @@ -264,7 +264,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest4_", engine), suppressWarnings( @@ -283,7 +283,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest5_", engine), suppressWarnings( @@ -302,7 +302,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest6_", engine), suppressWarnings( @@ -321,7 +321,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest7_", engine), suppressWarnings( @@ -340,7 +340,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest8_", engine), suppressWarnings( @@ -359,7 +359,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest9_", engine), suppressWarnings( @@ -378,7 +378,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest10_", engine), suppressWarnings( @@ -397,7 +397,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest11_", engine), suppressWarnings( @@ -416,7 +416,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest12_", engine), suppressWarnings( @@ -435,7 +435,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest13_", engine), suppressWarnings( @@ -454,7 +454,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest6fe_", engine), suppressWarnings( @@ -474,7 +474,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest7fe_", engine), suppressWarnings( @@ -494,7 +494,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest8fe_", engine), suppressWarnings( @@ -514,7 +514,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest9fe_", engine), suppressWarnings( @@ -534,7 +534,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest10fe_", engine), suppressWarnings( @@ -554,7 +554,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) # why suddenly fe = Q2_defense? Should give the same models assign(paste0("boot_fixest11fe_", engine), @@ -575,7 +575,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest12fe_", engine), suppressWarnings( @@ -595,7 +595,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_fixest13fe_", engine), suppressWarnings( @@ -615,7 +615,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm1_", engine), suppressWarnings( @@ -634,7 +634,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm2_", engine), suppressWarnings( @@ -653,7 +653,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm3_", engine), suppressWarnings( @@ -672,7 +672,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm4_", engine), suppressWarnings( @@ -691,7 +691,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm5_", engine), suppressWarnings( @@ -710,7 +710,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm6_", engine), suppressWarnings( @@ -729,7 +729,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm4fe_", engine), suppressWarnings( @@ -749,7 +749,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm5fe_", engine), suppressWarnings( @@ -769,7 +769,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm6fe_", engine), suppressWarnings( @@ -789,7 +789,7 @@ test_that("Do different, but equivalent ways to specify ) set.seed(8965) - dqrng::dqset.seed(765) + dqrng::dqset.seed(12111221) assign(paste0("boot_felm7fe_", engine), suppressWarnings( @@ -813,7 +813,7 @@ test_that("Do different, but equivalent ways to specify if (engine == "WildBootTests.jl") { cat("mboottest()", "\n") - set.seed(86908) + set.seed(1231) assign(paste0("wboot_lm_", engine), suppressWarnings( mboottest( @@ -828,7 +828,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest1_", engine), suppressWarnings( @@ -843,7 +843,7 @@ test_that("Do different, but equivalent ways to specify ), envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest2_", engine), suppressWarnings( @@ -859,7 +859,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest3_", engine), suppressWarnings( @@ -875,7 +875,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest4_", engine), suppressWarnings( @@ -891,7 +891,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest5_", engine), suppressWarnings( @@ -910,7 +910,7 @@ test_that("Do different, but equivalent ways to specify # effects, so R needs to be of dimension of # length(names(coef(object))) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest6_", engine), suppressWarnings( @@ -926,7 +926,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest7_", engine), suppressWarnings( @@ -942,7 +942,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest8_", engine), suppressWarnings( @@ -958,7 +958,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_fixest9_", engine), suppressWarnings( @@ -974,7 +974,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest10_", engine), @@ -991,7 +991,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest6fe_", engine), @@ -1009,7 +1009,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest7fe_", engine), @@ -1027,7 +1027,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest8fe_", engine), @@ -1045,7 +1045,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest9fe_", engine), @@ -1063,7 +1063,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign( paste0("wboot_fixest10fe_", engine), @@ -1081,7 +1081,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) # why suddenly fe = Q2_defense? Should give the same models assign( @@ -1100,7 +1100,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm1_", engine), suppressWarnings( @@ -1116,7 +1116,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm2_", engine), suppressWarnings( @@ -1132,7 +1132,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm3_", engine), suppressWarnings( @@ -1148,7 +1148,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm4_", engine), suppressWarnings( @@ -1164,7 +1164,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm5_", engine), suppressWarnings( @@ -1180,7 +1180,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm6_", engine), suppressWarnings( @@ -1196,7 +1196,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm4fe_", engine), suppressWarnings( @@ -1213,7 +1213,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm5fe_", engine), suppressWarnings( @@ -1230,7 +1230,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm6fe_", engine), suppressWarnings( @@ -1247,7 +1247,7 @@ test_that("Do different, but equivalent ways to specify envir = .GlobalEnv ) - set.seed(86908) + set.seed(1231) assign(paste0("wboot_felm7fe_", engine), suppressWarnings( diff --git a/tests/testthat/test-new-bootstrap-variants.R b/tests/testthat/test-new-bootstrap-variants.R index 8f4bfec5..3428512a 100644 --- a/tests/testthat/test-new-bootstrap-variants.R +++ b/tests/testthat/test-new-bootstrap-variants.R @@ -22,7 +22,8 @@ test_that("test r-fnw vs r-, stochastic", { #' [RStata](https://github.com/lbraglia/RStata)).* Several correctness #' tests are implemented. First, it is tested if the non-bootstrapped #' t-statistics - #' produced via boottest() *exactly* match those computed by the fixest package + #' produced via boottest() *exactly* match those computed by the fixest + #' package #' (see test_tstat_equivalence). Second, `fwildclusterboot` is heavily tested #' against `WildBootTests.jl` - see "test-r-vs-julia". Last, multiple R #' implementations of the WCB are tested against each other. @@ -520,7 +521,8 @@ test_that("test cluster fixed effects", { weights = 1:N / N ) - feols_fit <- fixest::feols(proposition_vote ~ treatment + log_income | group_id1 + Q2_defense, + feols_fit <- fixest::feols( + proposition_vote ~ treatment + log_income | group_id1 + Q2_defense, data = data1 ) diff --git a/tests/testthat/test-r-vs-julia.R b/tests/testthat/test-r-vs-julia.R index d7ba1f4b..e09e4481 100644 --- a/tests/testthat/test-r-vs-julia.R +++ b/tests/testthat/test-r-vs-julia.R @@ -29,7 +29,7 @@ test_that("test r against Julia I: stochastic tests", { skip_on_cran() - skip_on_ci() + #skip_on_ci() skip_if_not( find_proglang("julia"), message = "skip test as julia installation not found." diff --git a/tests/testthat/test-seed.R b/tests/testthat/test-seed.R index 5991b760..5587177c 100644 --- a/tests/testthat/test-seed.R +++ b/tests/testthat/test-seed.R @@ -11,8 +11,12 @@ test_that("seed works for OLS", { #' @srrstats {G5.0} *Where applicable or practicable, tests should use #' standard data sets with known properties (for example, the #' [NIST Standard Reference Datasets](https://www.itl.nist.gov/div898/strd/), - #' or data sets provided by other widely-used R packages).* All tests are based - #' on random data sets. + #' or data sets provided by other widely-used R packages).* All tests are + #' based on random data sets. + #' + #' @srrstats {G5.1} *Data sets created within, and used to test, a package + #' should be exported (or otherwise made generally available) so + #' that users can confirm tests and run examples.* data1 <<- fwildclusterboot:::create_data( diff --git a/tests/testthat/test_sunab.R b/tests/testthat/test_sunab.R index ba29ecbf..88e42152 100644 --- a/tests/testthat/test_sunab.R +++ b/tests/testthat/test_sunab.R @@ -1,5 +1,5 @@ test_that("test sunab", { - + skip_if_not( find_proglang("julia"), @@ -16,7 +16,7 @@ test_that("test sunab", { cluster = "id", base_stagg, ssc = fixest::ssc(adj = FALSE, fixef.K = "none", cluster.adj = FALSE) - + ) res_sunab_3ref <- fixest::feols( @@ -55,7 +55,7 @@ test_that("test sunab", { ignore_attr = TRUE, tolerance = 0.01 ) - + expect_equal( fixest_teststat[2], boot_att[, 2], @@ -67,7 +67,7 @@ test_that("test sunab", { fixest_ci <- as.matrix(confint(summary(res_sunab, agg = TRUE))) fixest_pval <- as.vector(pvalue(summary(res_sunab, agg = TRUE))) fixest_teststat <- as.vector(tstat(summary(res_sunab, agg = TRUE))) - + boot_year <- boot_aggregate( res_sunab, @@ -90,35 +90,35 @@ test_that("test sunab", { ignore_attr = TRUE, tolerance = 0.065 ) - + expect_equal( fixest_teststat[-1], boot_year[, "t value"], ignore_attr = TRUE ) - - + + }) test_that("test sunab hetero", { - + # as it takes too long ... skip_on_cran() - skip_on_ci() - + #skip_on_ci() + library(fixest) # Simple DiD example base_stagg <- fixest::base_stagg - + # The DiD estimation res_sunab <- fixest::feols( - y ~ x1 + fixest:::sunab(year_treated, year) | id + year, base_stagg, + y ~ x1 + fixest:::sunab(year_treated, year) | id + year, base_stagg, se = "hetero", ssc = fixest::ssc(adj = FALSE, cluster.adj = FALSE) ) - + res_sunab_3ref <- fixest::feols( y ~ x1 + fixest:::sunab(year_treated, year, ref.p = c(.F + 0:2, -1)) | id + year, @@ -126,12 +126,12 @@ test_that("test sunab hetero", { base_stagg, ssc = fixest::ssc(adj = FALSE, cluster.adj = FALSE) ) - + fixest_att <- aggregate(res_sunab, agg = "ATT") fixest_ci <- as.matrix(confint(summary(res_sunab, agg = "ATT"))) fixest_pval <- as.vector(pvalue(summary(res_sunab, agg = "ATT"))) fixest_teststat <- as.vector(tstat(summary(res_sunab, agg = "ATT"))) - + # test ATT equivalence boot_att <- boot_aggregate( @@ -141,55 +141,55 @@ test_that("test sunab hetero", { ssc = boot_ssc(adj = FALSE, cluster.adj = FALSE), nthreads = 1 ) - + # expect_equal( # fixest_ci["ATT", ], # boot_att[, 4:5], # ignore_attr = TRUE, # tolerance = 0.03 # ) - + expect_equal( fixest_pval[2], boot_att[, 3], ignore_attr = TRUE, tolerance = 0.01 ) - + expect_equal( fixest_teststat[2], boot_att[, 2], ignore_attr = TRUE ) - - + + # effects aggregated by year fixest_ci <- as.matrix(confint(summary(res_sunab, agg = TRUE))) fixest_pval <- as.vector(pvalue(summary(res_sunab, agg = TRUE))) fixest_teststat <- as.vector(tstat(summary(res_sunab, agg = TRUE))) - + boot_year <- boot_aggregate( res_sunab, B = 9999, agg = TRUE, - ssc = boot_ssc(adj = FALSE, cluster.adj = FALSE), + ssc = boot_ssc(adj = FALSE, cluster.adj = FALSE), nthreads = 1 ) - - + + expect_equal( fixest_pval[-1], boot_year[, "Pr(>|t|)"], ignore_attr = TRUE, tolerance = 0.1 ) - + expect_equal( fixest_teststat[-1], boot_year[, "t value"], ignore_attr = TRUE ) - - + + }) diff --git a/tests/testthat/test_tidy.R b/tests/testthat/test_tidy.R index 31538219..788338f3 100644 --- a/tests/testthat/test_tidy.R +++ b/tests/testthat/test_tidy.R @@ -58,7 +58,7 @@ test_that("test tidiers with q = 1", { test_that("test tidiers with q > 1", { skip_on_cran() - skip_on_ci() + #skip_on_ci() skip_if_not( find_proglang("julia"), message = "skip test as julia installation not found." diff --git a/vignettes/Literature.Rmd b/vignettes/Literature.Rmd index 5191b887..023d6af5 100644 --- a/vignettes/Literature.Rmd +++ b/vignettes/Literature.Rmd @@ -16,16 +16,16 @@ knitr::opts_chunk$set( ## Academic Papers -+ [Roodman, MacKinnon, Nielsen & Webb - Fast and Wild Bootstrap Inference (Stata Journal), 2019](https://journals.sagepub.com/doi/pdf/10.1177/1536867X19830877). The paper to read if you think that the performance of `fwildclusterboot` resembles black magic. Introduces `boottest` - the Stata software package `fwildclusterboot` is modeled after - and contains a great introduction to (almost) all features of the wild cluster bootstrap implemented in `fwildclusterboot`. ++ [Roodman, MacKinnon, Nielsen & Webb - Fast and Wild Bootstrap Inference (Stata Journal), 2019](https://journals.sagepub.com/doi/pdf/10.1177/1536867X19830877). The paper to read if you think that the performance of `fwildclusterboot` resembles black magic. Introduces `boottest` - the Stata software package `fwildclusterboot` is modeled after - and contains a great introduction to (almost) all features of the wild cluster bootstrap implemented in `fwildclusterboot`. + [MacKinnon, Nielsen & Webb - Fast and Reliable Jackknife and Bootstrap -Methods for Cluster-Robust Inference (Journal of Applied Econometrics), 2023](https://arxiv.org/pdf/2301.04527.pdf) Introduces the "31", "33", "13" bootstrap types, explains how to compute them quickly, and contrasts their empirical performance in Monte Carlo studies. Argues in favour of the "31" method. +Methods for Cluster-Robust Inference (Journal of Applied Econometrics), 2023](https://arxiv.org/pdf/2301.04527.pdf) Introduces the "31", "33", "13" bootstrap types, explains how to compute them quickly, and contrasts their empirical performance in Monte Carlo studies. Argues in favour of the "31" method. -+ [MacKinnon - Fast cluster bootstrap methods for linear regression models (Econometrics and Statistics, 2021)](https://www.econstor.eu/bitstream/10419/247206/1/qed-wp-1465.pdf) Discusses computational tricks for speeding up wild cluster bootstrap inference. Further provides a nice discussion of (bootstrap) test inversion to compute confidence intervals. ++ [MacKinnon - Fast cluster bootstrap methods for linear regression models (Econometrics and Statistics, 2021)](https://www.econstor.eu/bitstream/10419/247206/1/qed-wp-1465.pdf) Discusses computational tricks for speeding up wild cluster bootstrap inference. Further provides a nice discussion of (bootstrap) test inversion to compute confidence intervals. + [MacKinnon, Nielsen & Webb - Cluster-robust inference: A guide to empirical practice (Journal of Econometrics), 2023](https://www.sciencedirect.com/science/article/pii/S0304407622000781) Broad introduction and state-of-the-art literature survey of concepts around clustered errors. -+ [Webb - Reworking wild bootstrap based inference for clustered errors (forthcoming at Canadian Journal of Economics)](https://ideas.repec.org/p/qed/wpaper/1315.html) Introduces "Webb" weights, which are the recommended wild bootstrap weight type when the number of clusters are very small. ++ [Webb - Reworking wild bootstrap based inference for clustered errors (forthcoming at Canadian Journal of Economics)](https://ideas.repec.org/p/qed/wpaper/1315.html) Introduces "Webb" weights, which are the recommended wild bootstrap weight type when the number of clusters are very small. + [MacKinnon & Webb - The wild bootstrap for few (treated) clusters (Econometrics Journal), 2018](https://academic.oup.com/ectj/article-abstract/21/2/114/5078969) Introduces the subcluster bootstrap for regressions with few treated clusters (e.g. difference-in-differences regressions with one treated cluster). @@ -33,18 +33,16 @@ Methods for Cluster-Robust Inference (Journal of Applied Econometrics), 2023](ht + [Davidson & MacKinnon - Wild bootstrap tests for IV regression (Journal of Economic and Business Statistics), 2010](https://www.tandfonline.com/doi/abs/10.1198/jbes.2009.07221) Introduces the WRE bootstrap / wild cluster bootstrap for instrumental variables regression. -+ [Cameron, Gelbach & Miller - Bootstrap-based improvements for inference with clustered errors (Review of Economics and Statistics)](https://www.nber.org/system/files/working_papers/t0344/t0344.pdf) The paper that started the literature on wild cluster bootstrap inference. Simulation evidence that the wild cluster bootstrap works remarkably well when there are only few clusters. ++ [Cameron, Gelbach & Miller - Bootstrap-based improvements for inference with clustered errors (Review of Economics and Statistics)](https://www.econstor.eu/bitstream/10419/31404/1/514886757.pdf) The paper that started the literature on wild cluster bootstrap inference. Simulation evidence that the wild cluster bootstrap works remarkably well when there are only few clusters. -+ [Flachaire - Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap (Computational Statistics & Data Analysis), 2005](https://shs.hal.science/file/index/docid/175910/filename/Flachaire_03a.pdf) Provides simulation evidence of the performance of *heteroskedastic* wild bootstrap procedures. ++ [Flachaire - Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap (Computational Statistics & Data Analysis), 2005](https://shs.hal.science/file/index/docid/175910/filename/Flachaire_03a.pdf) Provides simulation evidence of the performance of *heteroskedastic* wild bootstrap procedures. + [MacKinnon - Thirty Years of Heteroskedasticity-Robust Inference, 2012](https://www.econ.queensu.ca/sites/econ.queensu.ca/files/qed_wp_1268.pdf) More simulation evidence on the performance of the HC1-HC3 vcov estimators vs the wild bootstrap. -+ [Joshi, Megha, James E. Pustejovsky, and S. Natasha Beretvas - "Cluster wild bootstrapping to handle dependent effect sizes in meta‐analysis with a small number of studies." (Research Synthesis Methods), 2022](https://onlinelibrary.wiley.com/doi/abs/10.1002/jrsm.1554) Nice simulations on the empirical performance of the wild cluster bootstrap for tests of multiple joint hypotheses. - -+ [Kline & Santos - A Score Based Approach to Wild Bootstrap Inference (Journal of Econometric Methods), 2012](https://www.nber.org/system/files/working_papers/w16127/w16127.pdf) Introduces a score based wild bootstrap for non-linear regression models. ++ [Joshi, Megha, James E. Pustejovsky, and S. Natasha Beretvas - "Cluster wild bootstrapping to handle dependent effect sizes in meta‐analysis with a small number of studies." (Research Synthesis Methods), 2022](https://onlinelibrary.wiley.com/doi/abs/10.1002/jrsm.1554) Nice simulations on the empirical performance of the wild cluster bootstrap for tests of multiple joint hypotheses. -## Links, blogposts, etc - -[Stata blog - Heteroskedasticity robust standard errors: Some practical considerations](https://blog.stata.com/2022/10/06/heteroskedasticity-robust-standard-errors-some-practical-considerations/) Extensive simulations on small sample properties of HC estimators, inlcuding the wild bootstrap ++ [Kline & Santos - A Score Based Approach to Wild Bootstrap Inference (Journal of Econometric Methods), 2012](https://eml.berkeley.edu/~pkline/papers/ScoreFinal_web.pdf) Introduces a score based wild bootstrap for non-linear regression models. +## Links, blogposts, etc +[Stata blog - Heteroskedasticity robust standard errors: Some practical considerations](https://blog.stata.com/2022/10/06/heteroskedasticity-robust-standard-errors-some-practical-considerations/) Extensive simulations on small sample properties of HC estimators, inlcuding the wild bootstrap \ No newline at end of file diff --git a/vignettes/articles/Multiple-Estimations-with-fixest.Rmd b/vignettes/articles/Multiple-Estimations-with-fixest.Rmd index 42fe73fd..11dea383 100644 --- a/vignettes/articles/Multiple-Estimations-with-fixest.Rmd +++ b/vignettes/articles/Multiple-Estimations-with-fixest.Rmd @@ -123,7 +123,8 @@ boot1 <- boottest(feols_fit2, param = "treatment", clustid = "group_id1" ) -# Error in `check_boottest_args_plus()` at fwildclusterboot/R/boottest_fixest.R:514:2: +# Error in `check_boottest_args_plus()` at +# fwildclusterboot/R/boottest_fixest.R:514:2: # ! Advanced formula notation in fixest / fixest via ^ to interact # fixed effects is currently not supported in boottest(). @@ -142,7 +143,8 @@ boot <- boottest(feols_fit3, param = "treatment", clustid = "group_id1" ) -# Error in `check_boottest_args_plus()` at fwildclusterboot/R/boottest_fixest.R:514:2: +# Error in `check_boottest_args_plus()` at +# fwildclusterboot/R/boottest_fixest.R:514:2: # ! Varying slopes in fixest / fixest via [] to interact # fixed effects is currently not supported in boottest(). diff --git a/vignettes/articles/wild_bootstrap.Rmd b/vignettes/articles/wild_bootstrap.Rmd index e8f46e77..38000418 100644 --- a/vignettes/articles/wild_bootstrap.Rmd +++ b/vignettes/articles/wild_bootstrap.Rmd @@ -13,6 +13,13 @@ knitr::opts_chunk$set( ### Wild Cluster Bootstrap 101 +#' @srrstatsNA {RE1.4} *Regression Software should document any assumptions made +#' with regard to input data; for example distributional assumptions, or +#' assumptions that predictor data have mean values of zero. Implications of +#' violations of these assumptions should be both documented and tested.* +#' The wild bootstrap does not make any distributional assumptions +#' for estimation beyond the assumption of a linear regression model. + We are interested in testing a linear hypothesis $H_0: \beta_{j} = 0$ against $H_1: \beta_{j} \neq 0$ for a linear model of interest @@ -43,7 +50,7 @@ u_{G} with $E(u|X) = 0$, where group $g$ contains $N_{g}$ observations so that $N = \sum_{g = 1}^{G} N_{g}$. The regression residuals $u$ are allowed to be correlated within clusters, but are assumed to be uncorrelated across clusters. In consequence, the models' covariance matrix is block diagonal. For each cluster, we denote $E(u_{g} u_{g}') =\Sigma_{g}$ and $E(u u') =\Sigma$. -A generic wild bootstrap test then proceeds in the following steps: +A generic wild bootstrap test then proceeds in the following steps: ::: callout * Step 1: Either @@ -92,9 +99,9 @@ The subscript $x$ for the variance-covariance estimate $\hat{\Sigma}_{jj,x}$ den ### Multiple Variants of the Wild Cluster Bootstrap -The above algorithm leads to several variants of the wild cluster bootstrap: first, the bootstrap variants may differ in the choice of the variance-covariance estimator $\Sigma_{x}$. Second, they may differ in the functional transformation of the estimated residuals $f(\hat{u})$. Finally, the choice of imposing or not imposing the null on the bootstrap data generating process, as well as the choice of bootstrap weights lead to different bootstrap variants. +The above algorithm leads to several variants of the wild cluster bootstrap: first, the bootstrap variants may differ in the choice of the variance-covariance estimator $\Sigma_{x}$. Second, they may differ in the functional transformation of the estimated residuals $f(\hat{u})$. Finally, the choice of imposing or not imposing the null on the bootstrap data generating process, as well as the choice of bootstrap weights lead to different bootstrap variants. -Based on recent work by @mackinnon2022fast, we will put emphasis on the choice of the variance-covariance estimator $\Sigma_{x}$ and how the residuals $f(\hat{u})$ are transformed. +Based on recent work by @mackinnon2022fast, we will put emphasis on the choice of the variance-covariance estimator $\Sigma_{x}$ and how the residuals $f(\hat{u})$ are transformed. For reasons of computational feasibility, @mackinnon2022fast focus on CRV1 and CRV3 covariance matrix estimators. @@ -182,7 +189,7 @@ For Rademacher and Mammen weights, only $2^G$ unique combinations of draws exist ### Wild Bootstrap Confidence Intervals -In theory, multiple ways to calculate wild (cluster) bootstrapped confidence intervals exists [@mackinnon2022cluster]. +In theory, multiple ways to calculate wild (cluster) bootstrapped confidence intervals exists [@mackinnon2022cluster]. @@ -202,7 +209,7 @@ In theory, multiple ways to calculate wild (cluster) bootstrapped confidence int -Based on simulation results in @mackinnon2015wild and higher order asymptotic theory in @djogbenou2019asymptotic, [fwildclusterboot]{.pkg} computes confidence intervals by test inversion. +Based on simulation results in @mackinnon2015wild and higher order asymptotic theory in @djogbenou2019asymptotic, [fwildclusterboot]{.pkg} computes confidence intervals by test inversion. While inverting bootstrap tests is computationally more demanding, in the case of the wild cluster bootstrap, the procedure can be massively accelerated by pre-computing multiple objects that are constant across all iterations of the inversion algorithm. Details on how this acceleration is achieved in [fwildclusterboot]{.pkg} are presented in the appendix (which is to be added. If you are curious, you can email Alex, and he'll share his notes). In the following section, for illustrative purposes, we will demonstrate how to invert a simple t-test for a regression model (vs a bootstrap test inversion). @@ -217,24 +224,24 @@ While inverting bootstrap tests is computationally more demanding, in the case o -Based on the definition of the p-value, we can define a confidence interval at significance level $\alpha$ as +Based on the definition of the p-value, we can define a confidence interval at significance level $\alpha$ as $$ C = \{\theta \in \Theta: p(\theta) \geq \alpha \}. $$ -In other words: the confidence interval is the set of all values $\theta \in \Theta$ with p-value larger than the chosen significance level $\alpha$. +In other words: the confidence interval is the set of all values $\theta \in \Theta$ with p-value larger than the chosen significance level $\alpha$. -All of this implies that if we have a function that calculates p-values for different values of $\theta$, $p(\theta)$, to obtain a confidence interval, we simply have to collect all values $\theta$ for which $p(\theta) > \alpha$. Or, in other words, we need to *invert* $p(\theta)$. +All of this implies that if we have a function that calculates p-values for different values of $\theta$, $p(\theta)$, to obtain a confidence interval, we simply have to collect all values $\theta$ for which $p(\theta) > \alpha$. Or, in other words, we need to *invert* $p(\theta)$. -We will illustrate all of this based on a simple linear regression model. +We will illustrate all of this based on a simple linear regression model. -The data generating process is +The data generating process is -$$ +$$ Y = \beta_0 + \beta_1 X + u -$$ +$$ -with $E[u|X] = 0$, and we are interested in testing the null hypothesis +with $E[u|X] = 0$, and we are interested in testing the null hypothesis $$ H_0: \beta_1 = 0 \textit{ vs } H_1: \beta_1 \neq 0. @@ -253,7 +260,7 @@ df <- data.frame(X = X, y = y) fit <- (lm(y ~ 1 + X, data = df)) ``` -The estimated confidence interval of the regression model is +The estimated confidence interval of the regression model is ```{r} confint(fit) @@ -261,9 +268,9 @@ confint(fit) Note that this confidence interval is build on *estimated standard errors*. -This means that in order to calculate standard errors, `confint()` computes a standard error and multiplies it with a critical value. +This means that in order to calculate standard errors, `confint()` computes a standard error and multiplies it with a critical value. -To compute a confidence without estimating standard errors, we first need to define a function that calculates p-values for different values of $\beta$ given the model and data. To do so, we will simply create a function that will allow us to test hypotheses of the form +To compute a confidence without estimating standard errors, we first need to define a function that calculates p-values for different values of $\beta$ given the model and data. To do so, we will simply create a function that will allow us to test hypotheses of the form $$ H_0: \beta_1 - r = 0 \textit{ vs } H_1: \beta_1 -r \neq 0. @@ -275,50 +282,69 @@ $$ H_0: \beta_1 = r \textit{ vs } H_1: \beta_1 \neq r. $$ -Tests of such a form are implemented in the `car` package, via the `linearHypothesis` function, and we create a small wrapper function, `p_val(y, X, r)` around `car::linearHypothesis`: +Tests of such a form are implemented in the `car` package, via the `linearHypothesis` function, and we create a small wrapper function, `p_val(y, X, r)` around `car::linearHypothesis`: ```{r} p_val <- function(y, X, r){ res <- lm(y ~ 1 + X) - p_val <- car::linearHypothesis(model = res, hypothesis.matrix = c(0,1), rhs = r)$`Pr(>F)`[2] + p_val <- car::linearHypothesis( + model = res, + hypothesis.matrix = c(0,1), + rhs = r)$`Pr(>F)`[2] p_val } ``` -As can be seen in the plot below, for different values of $r$, `p_val()` returns a range of p-values: +As can be seen in the plot below, for different values of $r$, `p_val()` returns +a range of p-values: ```{r} -p_val_r <- unlist(lapply(seq(-0.05, 0.05, 0.005), function(i) p_val(y = y, X = X, r = i))) +p_val_r <- unlist( + lapply( + seq(-0.05, 0.05, 0.005), + function(i) p_val(y = y, X = X, r = i) + ) + ) p_val_df <- data.frame(r = seq(-0.05, 0.05, 0.005), p_val_r = p_val_r) -plot(x = p_val_df$r, y = p_val_df$p_val_r,type = "b", pch = 20, lty = 2, xlab = "r", ylab = "p-value") +plot( + x = p_val_df$r, + y = p_val_df$p_val_r, + type = "b", + pch = 20, + lty = 2, + xlab = "r", + ylab = "p-value" +) lines(p_val_df$r, p_val_df$p_val_r, type = "l", lty = 1) -abline(h = 0.05, col = "red") +abline(h = 0.05, col = "red") abline(v = confint(fit)["X", ][1], col = "blue") -text(x = confint(fit)["X", ][1] - 0.01, y = 0.8, "lower", srt=0.2, col = "blue") +text( + x = confint(fit)["X", ][1] - 0.01, y = 0.8, "lower", srt=0.2, col = "blue") abline(v = confint(fit)["X", ][2], col = "blue") -text(x = confint(fit)["X", ][2] + 0.01, y = 0.8, "upper", srt=0.2, col = "blue") +text( + x = confint(fit)["X", ][2] + 0.01, y = 0.8, "upper", srt=0.2, col = "blue") abline(v = confint(fit)["X", ][1] - 0.01, col = "green", lty = 2) abline(v = confint(fit)["X", ][1] + 0.01, col = "green", lty = 2) abline(v = confint(fit)["X", ][2] - 0.01, col = "green", lty = 2) abline(v = confint(fit)["X", ][2] + 0.01, col = "green", lty = 2) ``` -The p-value peaks for the "true" null hypothesis $\beta_1 = r = 0.01$ and decreases when moving further away from the true value. +The p-value peaks for the "true" null hypothesis $\beta_1 = r = 0.01$ and decreases when moving further away from the true value. The two points where the red line crosses with the black line - marked by a blue line - are the confidence interval for our hypothesis test of interest! (Note that this plot is very similar to the output of `plot.boottest()`). -In consequence, our goal is to find the intersection of the blue, red, and black lines. +In consequence, our goal is to find the intersection of the blue, red, and black lines. -To do so, we need to find two starting values for the line search. Those are marked as green. In practice, `boottest()` needs to do some work to find them, but here we will skip this step. +To do so, we need to find two starting values for the line search. Those are marked as green. In practice, `boottest()` needs to do some work to find them, but here we will skip this step. -We will start from the empirical confidence interval calculated by `confint()`: +We will start from the empirical confidence interval calculated by `confint()`: ```{r} confint(fit) ``` -We create two sets of starting values by adding a value $\epsilon \neq 0$ to the confidence boundaries of the confidence set obtained by `confint()`: +We create two sets of starting values by adding a value $\epsilon \neq 0$ to the confidence boundaries of the confidence set obtained by `confint()`: ```{r} epsilon <- 0.01 @@ -327,20 +353,20 @@ startset1 <- confint(fit)["X",][1] + c(-epsilon, epsilon) startset2 <- confint(fit)["X",][2] + c(-epsilon, epsilon) ``` -With these starting values at hand, we can invert $p(.)$ numerically by a root finding procedure - we will run a simple bisection. +With these starting values at hand, we can invert $p(.)$ numerically by a root finding procedure - we will run a simple bisection. ```{r} invert_p_val <- function(X, y, startset1, startset2, alpha){ - - # p-val - sign_level + + # p-val - sign_level p_val_x_sign_level <- function(r) { p_val(X = X, y = y, r) - alpha } - + # bisection for both startset1, startset2 - res <- lapply(list(startset1, startset2), function(x){ - + res <- lapply(list(startset1, startset2), function(x){ + tmp <- suppressWarnings(stats::uniroot(f = p_val_x_sign_level, lower = min(x), upper = max(x), @@ -350,17 +376,23 @@ invert_p_val <- function(X, y, startset1, startset2, alpha){ }) unlist(res) - + } ``` Now, which confidence interval do we get from the numerical p-value inversion? ```{r} -invert_p_val(X = X, y = y, startset1 = startset1, startset2 = startset2, alpha = 0.05) +invert_p_val( + X = X, + y = y, + startset1 = startset1, + startset2 = startset2, + alpha = 0.05 +) ``` -As it turns out, this confidence interval is practically identical with the confidence interval based on estimated standard errors: +As it turns out, this confidence interval is practically identical with the confidence interval based on estimated standard errors: ```{r} confint(fit)