From f683b18bc7d7ea2fcef6e30d474bf40a033e7f5e Mon Sep 17 00:00:00 2001
From: Pat Bills
Date: Fri, 26 Sep 2025 12:47:54 -0400
Subject: [PATCH 01/13] WIP improving documentation

---
 R/install_dev_packages.r        |  3 ++
 README.Rmd                      | 40 ++++++++++-------
 README.md                       | 78 +++++++++++++++++++++++++--------
 vignettes/google_sheets_api.Rmd | 30 +++++++------
 4 files changed, 104 insertions(+), 47 deletions(-)
 create mode 100644 R/install_dev_packages.r

diff --git a/R/install_dev_packages.r b/R/install_dev_packages.r
new file mode 100644
index 0000000..9694f94
--- /dev/null
+++ b/R/install_dev_packages.r
@@ -0,0 +1,3 @@
+
+dev_packages <- c('devtools', 'pkgdown', 'renv')
+install.packages(dev_packages, repos = "https://cran.rstudio.com")
\ No newline at end of file
diff --git a/README.Rmd b/README.Rmd
index 59afba2..aa06463 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -15,37 +15,47 @@ knitr::opts_chunk$set(
 # collaboratR
 
-### [MSU IBEEM](https://ibeem.msu.edu) commRULES project
+### A package to support collaborative meta-analysis for [MSU IBEEM](https://ibeem.msu.edu)
 
 [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
-[![CRAN status](https://www.r-pkg.org/badges/version/collaboratR)](https://CRAN.R-project.org/package=collaboratR)
 
-*This is very early version under heavy development.*
+### Motivation
+
+Performing a meta-analysis requires collating and harmonizing data extracted
+from many different sources, most frequently scientific publications.
+Collaborative meta-analysis requires a group of scientists to collectively develop
+and agree on their goals, the type of data extracted, and the format of those data,
+and to do so extremely consistently across papers. This package helps to support
+that process efficiently.
 
 This R package is part of 3 repositories that support the data entry, validation
 and accumulation of a meta-analysis for the commRULES project.
 
-1. 
commRULES data: version controlled data collection for tracking provenance using git, this is the L0 and L1 layers in the EDI framework
-2. collaboratR: commRULES data management code for L0 and L0->L1 layer in EDI framework
-3. commRULES-analysis: R code for reproducible data analysis , L1->L2 layers in EDI framework

-## Installation - Package
+1. collaboratR: commRULES data management code for L0 and L0->L1 layer in EDI framework
+1. data: version controlled data collection for tracking provenance using git, this is the L0 and L1 layers in the EDI framework
+1. analysis: R code for reproducible data analysis, L1->L2 layers in EDI framework

-This package uses [renv](https://rstudio.github.io/renv/) to manage the packages you need to install, which creates an `renv.lock` file for you.
+## Installation - Package

-- install RENV: this can go into your R environment used for all packages, so fire up R with now project select and `install.packages('renv')`
 - clone this repository into a [new Rstudio project](https://docs.posit.co/ide/user/ide/guide/code/projects.html) and open it
-- inside the Rstudio project in the R console, `renv::restore()`

-## Google Drive Project Setup
+- install required packages:
+  This package uses [renv](https://rstudio.github.io/renv/) to manage the packages you need to install, which creates an `renv.lock` file for you. 1. install the renv package: this can go into your R environment used for all packages.
+  2. in R run `renv::restore()` to install the locked package versions (it may warn if your R version differs from the lockfile)
+
+*additional packages are required to build the package and this website; source the script*
+`R/install_dev_packages.R`

-Using google drive via MSU seems to require creating a Google Cloud project, enabling the proper
-APIs and and assigning permissions
+## Data Google Drive Project Setup

-Note that for safety, this package only reads from google drive and it never writes to google drive. Therefore it only requests 'read-only' access. 
+
+See the Vignette ["Google Sheets API setup using Google Cloud"](vignettes/google_sheets_api.Rmd)
+for details about setting up a google sheets connection with R, which requires
+a google cloud project in your institution.

-Full documentation for how to set this up is forthcoming
+Note that for safety, this package only reads from google drive and it never
+writes to google drive. Therefore it only requests 'read-only' access.
 
 ## Usage
diff --git a/README.md b/README.md
index 6d7a892..30a8414 100644
--- a/README.md
+++ b/README.md
@@ -3,39 +3,81 @@
 # collaboratR
 
-### [MSU IBEEM](https://ibeem.msu.edu) commRULES project
+### A package to support collaborative meta-analysis for [MSU IBEEM](https://ibeem.msu.edu)
 
 [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
-[![CRAN
-status](https://www.r-pkg.org/badges/version/collaboratR)](https://CRAN.R-project.org/package=collaboratR)
 
-*This is very early version under heavy development.*
+### Motivation
+
+Performing a meta-analysis requires collating and harmonizing data
+extracted from many different sources, most frequently scientific
+publications.
+Collaborative meta-analysis requires a group of scientists to
+collectively develop and agree on their goals, the type of data
+extracted, and the format of those data, and to do so extremely
+consistently across papers. This package helps to support that process
+efficiently.
 
 This R package is part of 3 repositories that support the data entry,
 validation and accumulation of a meta-analysis for the commRULES
 project.
 
-1. commRULES data: version controlled data collection for tracking
-   provenance using git, this is the L0 and L1 layers in the EDI
-   framework
-2. collaboratR: commRULES data management code for L0 and L0-\>L1 layer
+1. collaboratR: commRULES data management code for L0 and L0-\>L1 layer
    in EDI framework
-3. 
commRULES-analysis: R code for reproducible data analysis , L1-\>L2
-   layers in EDI framework
-
-## Installation
+2. data: version controlled data collection for tracking provenance
+   using git, this is the L0 and L1 layers in the EDI framework
+3. analysis: R code for reproducible data analysis, L1-\>L2 layers in
+   EDI framework

-This package uses [renv](https://rstudio.github.io/renv/) to manage the
-packages you need to install, which creates an `renv.lock` file for you.
+## Installation - Package

-- install RENV: this can go into your R environment used for all
-  packages, so fire up R with now project select and
-  `install.packages('renv')`
 - clone this repository into a [new Rstudio
   project](https://docs.posit.co/ide/user/ide/guide/code/projects.html)
   and open it
-- inside the Rstudio project in the R console, `renv::restore()`
+
+- install required packages: This package uses
+  [renv](https://rstudio.github.io/renv/) to manage the packages you
+  need to install, which creates an `renv.lock` file for you. 1. install
+  the renv package: this can go into your R environment used for all
+  packages.
+
+  2. in R run `renv::restore()` to install the locked package versions
+  (it may warn if your R version differs from the lockfile)
+
+*additional packages are required to build the package and this website;
+source the script* `R/install_dev_packages.R`
+
+## Data Google Drive Project Setup
+
+See the Vignette [“Google Sheets API setup using Google
+Cloud”](vignettes/google_sheets_api.Rmd) for details about setting up a
+google sheets connection with R, which requires a google cloud project
+in your institution.
+
+Note that for safety, this package only reads from google drive and it
+never writes to google drive. Therefore it only requests ‘read-only’
+access.
+
+## Usage
+
+When reading in data sheets, you provide a URL for a datasheet that
+exists in any folder that you have access to. The system will attempt to
+log you in to google drive and request your permission for this code
+to access files on your behalf. 
+
+``` r
+gurl<- 'https://docs.google.com/spreadsheets/d/1w6sYozjybyd53eeiTdigrRTonteQW2KXUNZNmEhQyM8/edit?gid=0#gid=0'
+study_data<- read_commrules_sheet(gurl)
+```
+
+### 
+
+## References
+
+van der Loo, Mark PJ and de Jonge, Edwin (2021). Data validation
+infrastructure for R. *Journal of Statistical Software*, 97(10), 1-33.
+doi:10.18637/jss.v097.i10
diff --git a/vignettes/google_sheets_api.Rmd b/vignettes/google_sheets_api.Rmd
index 026e1cd..a1e48da 100644
--- a/vignettes/google_sheets_api.Rmd
+++ b/vignettes/google_sheets_api.Rmd
@@ -8,32 +8,34 @@ vignette: >
 ---
 
 A goal of this package is to enable the use of Google Sheets for collaborative data entry with
-familiar spreadsheet features. The package could also have used MS Office on-line excel editing, but excel files
+familiar spreadsheet features. The package could also have used MS Office on-line excel editing, but excel files on desktop
 have historically had issues with datetime conversion (Mac vs Windows), line endings, and character encoding.
 
 A main feature then is to read in google 'sheets' files as data for validation and processing.
 However, accessing google drive files requires significant setup for an 'app' to access files.
 This vignette describes how that works.
 
-This package heavily depends on the great work from Posit in the `googlesheets4` package, which in turn relies on the Posit-authored [gargle](Bryan J, Citro C, Wickham H 2023). gargle: Utilities for Working with Google APIs. https://gargle.r-lib.org.) package
+This package relies upon the great work from Posit in the `googlesheets4` package, which in turn relies on the Posit-authored [gargle](https://gargle.r-lib.org) package (Bryan J, Citro C, Wickham H (2023). gargle: Utilities for Working with Google APIs.)
 
-However before even starting R you need a Google Cloud project. 
The gargle package has some instructions for this
-
-`vignette("get-api-credentials", package = "gargle")`
+However, before even starting R you need a Google Cloud project. The gargle package has some instructions for this: `vignette("get-api-credentials", package = "gargle")`
 
 This is essentially a desktop app to read private google sheets, so in that guide we want to create an "OAuth 2.0 client ID and secret"
 
-The 'gargle' vignette says
->Note that most users of gargle-using packages do not need to read this and can just enjoy the automatic token flow. This article is for people who have a specific reason to be more proactive about auth.
-
-But I've found that one get several "insufficient permissions" at my institution to access files on a shared drive without going through this process.
-
-
+The 'gargle' vignette states:
+>Note that most users of gargle-using packages do not need to read this and can just enjoy the automatic token flow. This article is for people who have a specific reason to be more proactive about auth.
+We've found that one gets several "insufficient permissions" errors at our institution
+since Google has added several layers of permissions to Google Workspace; to
+access files on shared drives we have to go through this process. YMMV

-*DRAFT* summary:
+Workflow summary:

-- get a google workspace account
-- get a google cloud account. If your team is all using the same google workspace (e.g. at the same university or company) it's much easier and you should create a google cloud account in that domain. Most institutions restrict this so you may need help from your institutions cloud gatekeepers. Often it's the IT department. You can create a free account which requires a credit card, but this process will not have any charges and you will have to use
+- get a google workspace account, or use your institution's account
+- get a google cloud account. Each institution regulates google cloud accounts
+  differently. 
At ours you must request it and request access for each person.
+  If your team is all using the same google workspace (e.g. at the same university or company) it's much easier and you should create a google cloud account in that domain. Most institutions restrict this so you may need help from your institution's cloud gatekeepers.
+  The process described here does not require a "billing account" associated
+  with your google cloud account, as it's only needed to enable the APIs and there
+  are no charges for doing so.
 - create a new cloud project in the console
 - in the console go to the APIs page https://console.cloud.google.com/apis/

From 8537c82dd15449906dfca5f3f54897943ac963dc Mon Sep 17 00:00:00 2001
From: Pat Bills
Date: Fri, 13 Mar 2026 14:09:57 -0400
Subject: [PATCH 02/13] typo in gitignore

---
 .gitignore | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 5fd4bbc..9e53786 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,8 +6,9 @@ renv/
 ~$*.xlsx
 *.json
 inst/doc
-int/credentials/*.json
+inst/credentials/
 *.csv
+L0

From de2021f5409424cd7bc357f54d37d7702f875308 Mon Sep 17 00:00:00 2001
From: Pat Bills
Date: Fri, 13 Mar 2026 14:32:02 -0400
Subject: [PATCH 03/13] dependency version management for package

---
 DESCRIPTION     |  18 ++++---
 renv.lock       |  68 ++++++------------------
 renv/activate.R | 101 ++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 123 insertions(+), 64 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 7721604..e617a8e 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,11 +2,18 @@ Package: collaboratR
 Title: validation and version control for level 0 data using Google Drive
 Version: 0.0.0.9100
 Authors@R:
-    person("Patrick", "Bills", , "billspat@msu.edu", role = c("aut", "cre"),
-    comment = c(ORCID = "0000-0003-4235-255X"))
+    c(person("Patrick", "Bills", "S", "billspat@msu.edu", role = c("aut", "cre"),
+    comment = c(ORCID = "0000-0003-4235-255X")),
+    person("Ashwini", "Ramesh", , 
"ashwini.ramesh@yale.edu", role = c("aut","dtc"), + comment = c(ORCID = "0000-0003-1629-7024")), + person("Laís", "Petri",, "petrila1@msu.edu", role = c("aut", "dtc"), + comment = c(ORCID = "0000-0001-9727-1939")), + person("Phoebe", "Zarnetske", "Lehman", "plz@msu.edu", role = c("aut","fnd"), + comment = c(ORCID = "0000-0001-6257-6951")) + ) Description: Data management for Community Ecology RULES project of MSU IBEEM, - using the EDI data package framework. Reads data entered using from google drive - and validates using standard set of rules. + using the EDI data package framework. Reads data and schema from google drive, + validates using standard set of rules, and combines into CSV. License: MIT + file LICENSE Suggests: knitr, @@ -20,10 +27,11 @@ Imports: gargle, googledrive, googlesheets4, + httpuv, readr, validate, dplyr VignetteBuilder: knitr Depends: - R (>= 2.10) + R (>= 4.1.0) LazyData: true diff --git a/renv.lock b/renv.lock index cb4a058..f164bfc 100644 --- a/renv.lock +++ b/renv.lock @@ -35,6 +35,11 @@ "Maintainer": "Winston Chang ", "Repository": "CRAN" }, + "Rcpp": { + "Package": "Rcpp", + "Version": "1.1.1", + "Source": "Repository" + }, "askpass": { "Package": "askpass", "Version": "1.2.1", @@ -347,6 +352,11 @@ "Maintainer": "Jeroen Ooms ", "Repository": "CRAN" }, + "fastmap": { + "Package": "fastmap", + "Version": "1.2.0", + "Source": "Repository" + }, "fs": { "Package": "fs", "Version": "1.6.6", @@ -613,6 +623,11 @@ "Maintainer": "Kirill Müller ", "Repository": "CRAN" }, + "httpuv": { + "Package": "httpuv", + "Version": "1.6.16", + "Source": "Repository" + }, "httr": { "Package": "httr", "Version": "1.4.7", @@ -1156,57 +1171,8 @@ }, "renv": { "Package": "renv", - "Version": "1.1.5", - "Source": "Repository", - "Type": "Package", - "Title": "Project Environments", - "Authors@R": "c( person(\"Kevin\", \"Ushey\", role = c(\"aut\", \"cre\"), email = \"kevin@rstudio.com\", comment = c(ORCID = \"0000-0003-2880-7407\")), person(\"Hadley\", \"Wickham\", 
role = c(\"aut\"), email = \"hadley@rstudio.com\", comment = c(ORCID = \"0000-0003-4757-117X\")), person(\"Posit Software, PBC\", role = c(\"cph\", \"fnd\")) )", - "Description": "A dependency management toolkit for R. Using 'renv', you can create and manage project-local R libraries, save the state of these libraries to a 'lockfile', and later restore your library as required. Together, these tools can help make your projects more isolated, portable, and reproducible.", - "License": "MIT + file LICENSE", - "URL": "https://rstudio.github.io/renv/, https://github.com/rstudio/renv", - "BugReports": "https://github.com/rstudio/renv/issues", - "Imports": [ - "utils" - ], - "Suggests": [ - "BiocManager", - "cli", - "compiler", - "covr", - "cpp11", - "devtools", - "generics", - "gitcreds", - "jsonlite", - "jsonvalidate", - "knitr", - "miniUI", - "modules", - "packrat", - "pak", - "R6", - "remotes", - "reticulate", - "rmarkdown", - "rstudioapi", - "shiny", - "testthat", - "uuid", - "waldo", - "yaml", - "webfakes" - ], - "Encoding": "UTF-8", - "RoxygenNote": "7.3.2", - "VignetteBuilder": "knitr", - "Config/Needs/website": "tidyverse/tidytemplate", - "Config/testthat/edition": "3", - "Config/testthat/parallel": "true", - "Config/testthat/start-first": "bioconductor,python,install,restore,snapshot,retrieve,remotes", - "NeedsCompilation": "no", - "Author": "Kevin Ushey [aut, cre] (ORCID: ), Hadley Wickham [aut] (ORCID: ), Posit Software, PBC [cph, fnd]", - "Maintainer": "Kevin Ushey ", - "Repository": "CRAN" + "Version": "1.1.8", + "Source": "Repository" }, "rlang": { "Package": "rlang", diff --git a/renv/activate.R b/renv/activate.R index 2753ae5..31a6969 100644 --- a/renv/activate.R +++ b/renv/activate.R @@ -2,7 +2,8 @@ local({ # the requested version of renv - version <- "1.1.5" + version <- "1.1.8" + attr(version, "md5") <- NULL attr(version, "sha") <- NULL # the project directory @@ -168,6 +169,16 @@ local({ if (quiet) return(invisible()) + # also check for config 
environment variables that should suppress messages + # https://github.com/rstudio/renv/issues/2214 + enabled <- Sys.getenv("RENV_CONFIG_STARTUP_QUIET", unset = NA) + if (!is.na(enabled) && tolower(enabled) %in% c("true", "1")) + return(invisible()) + + enabled <- Sys.getenv("RENV_CONFIG_SYNCHRONIZED_CHECK", unset = NA) + if (!is.na(enabled) && tolower(enabled) %in% c("false", "0")) + return(invisible()) + msg <- sprintf(fmt, ...) cat(msg, file = stdout(), sep = if (appendLF) "\n" else "") @@ -215,6 +226,16 @@ local({ section <- header(sprintf("Bootstrapping renv %s", friendly)) catf(section) + # try to install renv from cache + md5 <- attr(version, "md5", exact = TRUE) + if (length(md5)) { + pkgpath <- renv_bootstrap_find(version) + if (length(pkgpath) && file.exists(pkgpath)) { + file.copy(pkgpath, library, recursive = TRUE) + return(invisible()) + } + } + # attempt to download renv catf("- Downloading renv ... ", appendLF = FALSE) withCallingHandlers( @@ -240,7 +261,6 @@ local({ # add empty line to break up bootstrapping from normal output catf("") - return(invisible()) } @@ -257,12 +277,20 @@ local({ repos <- Sys.getenv("RENV_CONFIG_REPOS_OVERRIDE", unset = NA) if (!is.na(repos)) { - # check for RSPM; if set, use a fallback repository for renv - rspm <- Sys.getenv("RSPM", unset = NA) - if (identical(rspm, repos)) - repos <- c(RSPM = rspm, CRAN = cran) + # split on ';' if present + parts <- strsplit(repos, ";", fixed = TRUE)[[1L]] - return(repos) + # split into named repositories if present + idx <- regexpr("=", parts, fixed = TRUE) + keys <- substring(parts, 1L, idx - 1L) + vals <- substring(parts, idx + 1L) + names(vals) <- keys + + # if we have a single unnamed repository, call it CRAN + if (length(vals) == 1L && identical(keys, "")) + names(vals) <- "CRAN" + + return(vals) } @@ -511,6 +539,51 @@ local({ } + renv_bootstrap_find <- function(version) { + + path <- renv_bootstrap_find_cache(version) + if (length(path) && file.exists(path)) { + catf("- Using renv 
%s from global package cache", version) + return(path) + } + + } + + renv_bootstrap_find_cache <- function(version) { + + md5 <- attr(version, "md5", exact = TRUE) + if (is.null(md5)) + return() + + # infer path to renv cache + cache <- Sys.getenv("RENV_PATHS_CACHE", unset = "") + if (!nzchar(cache)) { + root <- Sys.getenv("RENV_PATHS_ROOT", unset = NA) + if (!is.na(root)) + cache <- file.path(root, "cache") + } + + if (!nzchar(cache)) { + tools <- asNamespace("tools") + if (is.function(tools$R_user_dir)) { + root <- tools$R_user_dir("renv", "cache") + cache <- file.path(root, "cache") + } + } + + # start completing path to cache + file.path( + cache, + renv_bootstrap_cache_version(), + renv_bootstrap_platform_prefix(), + "renv", + version, + md5, + "renv" + ) + + } + renv_bootstrap_download_tarball <- function(version) { # if the user has provided the path to a tarball via @@ -979,7 +1052,7 @@ local({ renv_bootstrap_validate_version_release <- function(version, description) { expected <- description[["Version"]] - is.character(expected) && identical(expected, version) + is.character(expected) && identical(c(expected), c(version)) } renv_bootstrap_hash_text <- function(text) { @@ -1181,6 +1254,18 @@ local({ } + renv_bootstrap_cache_version <- function() { + # NOTE: users should normally not override the cache version; + # this is provided just to make testing easier + Sys.getenv("RENV_CACHE_VERSION", unset = "v5") + } + + renv_bootstrap_cache_version_previous <- function() { + version <- renv_bootstrap_cache_version() + number <- as.integer(substring(version, 2L)) + paste("v", number - 1L, sep = "") + } + renv_json_read <- function(file = NULL, text = NULL) { jlerr <- NULL From 9726b3bb704e946cb0210a613d9cbc6b18979c75 Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 14:32:27 -0400 Subject: [PATCH 04/13] edits for clarity in overview doc --- vignettes/process_overview.Rmd | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 
deletions(-)

diff --git a/vignettes/process_overview.Rmd b/vignettes/process_overview.Rmd
index b3144df..6944e1f 100644
--- a/vignettes/process_overview.Rmd
+++ b/vignettes/process_overview.Rmd
@@ -19,23 +19,36 @@ library(collaboratR)
 ```
 
 This package enables the use of Google Sheets as a collaborative data and meta-
-data entry tool for researchers of different skill sets,
+data entry tool for researchers of different skill sets, automated validation
+of the data using standardized rules, and tracking of all changes to data using git.
 
 ### Components
 
-Automated:
+Workflow Overview:
+
+- read metadata
+    - schema (column definitions) from google sheets
+    - list of data sheet URLs
+    - validation rules (from Rdata file)
+- read data files (from Google Sheets)
+- validate
+    - data format against schema
+    - data values against rules
+- report errors
+- save all sheets as CSV for commit to git
+- combine to master list(s)
 
-- pull metadata from google sheets to allow collaboration
-- pull data from Google Sheets
-- validation using coded rules against standardized rule sets and reporting of errors
-- automate pulling
-- commiting data files in text CSV format for change management using git
 
 ### Setup/Requirements
 
+This code is written specifically to accommodate a multi-tab sheet setup
+that is unique to a meta-analysis of plant competition experiments.
 1. 
Metadata in a google sheet -- list of all fields and +- list of all fields and field types + + + From 2ba93dee12f3b85774cc927ad2cb36ff55a5ff38 Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 15:02:41 -0400 Subject: [PATCH 05/13] update package down website --- docs/404.html | 111 +++ docs/LICENSE-text.html | 105 +++ docs/LICENSE.html | 90 +++ docs/articles/google_sheets_api.html | 242 ++++++ docs/articles/index.html | 87 +++ docs/articles/process_overview.html | 156 ++++ .../validating_commassemblyrules.html | 730 ++++++++++++++++++ docs/authors.html | 116 +++ docs/bootstrap-toc.css | 60 ++ docs/bootstrap-toc.js | 159 ++++ docs/docsearch.css | 148 ++++ docs/docsearch.js | 85 ++ docs/index.html | 194 +++++ docs/link.svg | 12 + docs/pkgdown.css | 384 +++++++++ docs/pkgdown.js | 108 +++ docs/pkgdown.yml | 8 + docs/reference/aggregate_csvs.html | 100 +++ docs/reference/as.Date.flexible.html | 107 +++ docs/reference/errorSaver.html | 138 ++++ docs/reference/gdrive_client_setup.html | 96 +++ docs/reference/gdrive_setup.html | 113 +++ docs/reference/get_api_key.html | 94 +++ docs/reference/get_col_type_from_spec.html | 110 +++ docs/reference/get_drive_email.html | 90 +++ docs/reference/get_gsfile.html | 125 +++ docs/reference/gfile_modified_time.html | 102 +++ docs/reference/gsheet_auth_setup.html | 108 +++ docs/reference/index.html | 184 +++++ docs/reference/read_data_csv.html | 108 +++ docs/reference/read_data_sheet.html | 142 ++++ docs/reference/read_gcsv.html | 131 ++++ docs/reference/read_gsheet_by_url.html | 133 ++++ docs/reference/read_url_list.html | 96 +++ docs/reference/read_validate_and_save.html | 90 +++ docs/reference/remove_comment_line.html | 112 +++ docs/reference/spec_to_readr_col_types.html | 102 +++ docs/reference/type_code_to_readr_code.html | 106 +++ docs/reference/type_converter_fun.html | 110 +++ docs/reference/validate_all.html | 96 +++ docs/reference/validate_data.html | 106 +++ docs/reference/validate_data_columns.html | 102 +++ 
docs/reference/validate_from_file.html | 106 +++ docs/sitemap.xml | 38 + 44 files changed, 5740 insertions(+) create mode 100644 docs/404.html create mode 100644 docs/LICENSE-text.html create mode 100644 docs/LICENSE.html create mode 100644 docs/articles/google_sheets_api.html create mode 100644 docs/articles/index.html create mode 100644 docs/articles/process_overview.html create mode 100644 docs/articles/validating_commassemblyrules.html create mode 100644 docs/authors.html create mode 100644 docs/bootstrap-toc.css create mode 100644 docs/bootstrap-toc.js create mode 100644 docs/docsearch.css create mode 100644 docs/docsearch.js create mode 100644 docs/index.html create mode 100644 docs/link.svg create mode 100644 docs/pkgdown.css create mode 100644 docs/pkgdown.js create mode 100644 docs/pkgdown.yml create mode 100644 docs/reference/aggregate_csvs.html create mode 100644 docs/reference/as.Date.flexible.html create mode 100644 docs/reference/errorSaver.html create mode 100644 docs/reference/gdrive_client_setup.html create mode 100644 docs/reference/gdrive_setup.html create mode 100644 docs/reference/get_api_key.html create mode 100644 docs/reference/get_col_type_from_spec.html create mode 100644 docs/reference/get_drive_email.html create mode 100644 docs/reference/get_gsfile.html create mode 100644 docs/reference/gfile_modified_time.html create mode 100644 docs/reference/gsheet_auth_setup.html create mode 100644 docs/reference/index.html create mode 100644 docs/reference/read_data_csv.html create mode 100644 docs/reference/read_data_sheet.html create mode 100644 docs/reference/read_gcsv.html create mode 100644 docs/reference/read_gsheet_by_url.html create mode 100644 docs/reference/read_url_list.html create mode 100644 docs/reference/read_validate_and_save.html create mode 100644 docs/reference/remove_comment_line.html create mode 100644 docs/reference/spec_to_readr_col_types.html create mode 100644 docs/reference/type_code_to_readr_code.html create mode 100644 
docs/reference/type_converter_fun.html create mode 100644 docs/reference/validate_all.html create mode 100644 docs/reference/validate_data.html create mode 100644 docs/reference/validate_data_columns.html create mode 100644 docs/reference/validate_from_file.html create mode 100644 docs/sitemap.xml diff --git a/docs/404.html b/docs/404.html new file mode 100644 index 0000000..95d393a --- /dev/null +++ b/docs/404.html @@ -0,0 +1,111 @@ + + + + + + + +Page not found (404) • collaboratR + + + + + + + + + + + +
+
+ + + + +
+
+ + +Content not found. Please use links in the navbar. + +
+ + + +
+ + + +
+ +
+

+

Site built with pkgdown 2.2.0.

+
+ +
+
+ + + + + + + + diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html new file mode 100644 index 0000000..8e237b5 --- /dev/null +++ b/docs/LICENSE-text.html @@ -0,0 +1,105 @@ + +License • collaboratR + + +
+
+ + + +
+
+ + +
MIT License
+
+Copyright (c) 2025 Patrick Bills, Ashwini Ramesh, Laís Petri, Phoebe Zarnetske
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+ +
+ + + +
+ + + +
+ +
+

Site built with pkgdown 2.2.0.

+
+ +
+ + + + + + + + diff --git a/docs/LICENSE.html b/docs/LICENSE.html new file mode 100644 index 0000000..dba3eab --- /dev/null +++ b/docs/LICENSE.html @@ -0,0 +1,90 @@ + +MIT License • collaboratR + + +
+
+ + + +
+
+ + +
+ +

Copyright (c) 2024 collaboratR authors

+

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

+

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

+

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

+
+ +
+ + + +
+ + + +
+ +
+

Site built with pkgdown 2.2.0.

+
+ +
+ + + + + + + + diff --git a/docs/articles/google_sheets_api.html b/docs/articles/google_sheets_api.html new file mode 100644 index 0000000..fa8033b --- /dev/null +++ b/docs/articles/google_sheets_api.html @@ -0,0 +1,242 @@ + + + + + + + +Google Sheets API setup using Google Cloud • collaboratR + + + + + + + + + + + +
+
+ + + + +
+
+ + + + +

A goal of this package is to enable the use of Google Sheets for +collaborative data entry with familiar spreadsheet features. The package +could also have used MS Office on-line excel editing, but excel files on +desktop have historically had issues with datetime conversion (Mac vs +Windows), line endings, and character encoding.

+

A main feature then is to read in google ‘sheets’ files as data for +validation and processing. However, accessing google drive files +requires significant setup for an ‘app’ to access files. This vignette +describes how that works.

+

This package relies upon the great work from Posit in the
+googlesheets4 package, which in turn relies on the
+Posit-authored gargle package (Bryan J, Citro C, Wickham H (2023).
+gargle: Utilities for Working with Google APIs. https://gargle.r-lib.org)

+

However, before even starting R you need a Google Cloud project. The
+gargle package has some instructions for this:
+vignette("get-api-credentials", package = "gargle")

+

This is essentially a desktop app to read private google sheets, so
+in that guide we want to create an “OAuth 2.0 client ID and secret”

+

The ‘gargle’ vignette states:

+
+

Note that most users of gargle-using packages do not need to read +this and can just enjoy the automatic token flow. This article is for +people who have a specific reason to be more proactive about auth.

+
+

We’ve found that one gets several “insufficient permissions” errors at our institution, since Google has added several layers of permissions to Google Workspace; to access files on shared drives we have to go through this process. YMMV.

+

Workflow summary:

+
    +
  • get a google workspace account, or use your institution's account

  • +
  • get a google cloud account. Each institution regulates Google Cloud accounts differently; at ours you must request an account and request access for each person.
    If your team is all using the same google workspace (e.g. at the same university or company) it's much easier, and you should create a Google Cloud account in that domain. Most institutions restrict this, so you may need help from your institution's cloud gatekeepers.
    The process described here does not require a “billing account” associated with your Google Cloud account, as one is only needed to enable the APIs and there are no charges for doing so.

  • +
  • create a new cloud project in the console

  • +
  • in the console go to the APIs page https://console.cloud.google.com/apis/

  • +
  • +

    create ‘credentials’, i.e. the “OAuth 2.0 client ID and secret” described above

    + +
  • +
  • +

    set up the OAuth consent screen. If you are all close collaborators you may wonder why you have to set up a consent screen, but it's Google and you do

    +
      +
    • give it a name related to your project that is unique but that all collaborators would instantly recognize, perhaps “ProjectX-R”

    • +
    • it's much easier to make this consent ‘internal’, which limits users to the same ‘workspace’ as you (my institution has essentially one workspace for all users). If using external collaborators with email/Google Drive accounts that are not at your institution, pick ‘external’. This will change what you need to do, but these notes don't cover that.

    • +
    • google also calls these ‘consent screens’ “apps”, so you have to give it an app name - use the same as the credentials name above.

    • +
    • user support email is your email

    • +
    • app logo - you can skip all of that

    • +
    • App domain: if you are working in a lab or interdisciplinary +project, use that domain, or the domain of some department in your +institution that you are a member of or is helpful for this kind of +thing, like “biology.myuni.edu” or +“x-project.biology.myuni.edu”

    • +
    • +

      “Authorized domains” - if you are using internal app, then put +the lowest domain which is typically the domain on your institutional +email address (e.g. “myuni.edu” or “myuni.ac.uk” or +“my-non-profit.org”)

      +
        +
      • since this is just for internal collaborators, use the same URL for the home page, privacy policy link, and terms of service link.
        +
      • +
      +
    • +
    • If using external users, you may have to add a domain for each user that is their institution, or perhaps use Gmail for everyone. (not tested)

    • +
    +
  • +
  • +

    enable APIs

    +
      +
    • to use the googlesheets4 package, enable the Google Sheets API. It may also be necessary to enable the Google Drive API (which I did)

    • +
    • +

      set for read-only

      +
        +
      • there is a ‘scope’ setting, and you may want to change it to ‘read only’ since this package only reads Google Drive files. When setting up the APIs, use this scope if possible.
      • +
      +
    • +
    +
  • +
  • download credentials

  • +
+

In the OAuth 2.0 Client IDs section, download the JSON file; it needs to be accessible by your R session. It should not be inside your R code folder (a git repository), and this identity file should never be checked into git, for security. It does not go into the package itself, and must be given to each person using the package. To make this flexible, update the .Renviron file to point to that file, using the full path. For example: PROJECT_AUTH_FILE='/Users/myuserid/downloads/client_secret_77etc_blahblah.apps.googleusercontent.com.json'

+

Commands to try:

+
drive_auth_json_file <- Sys.getenv('PROJECT_AUTH_FILE')
+# check that the auth file exists before configuring
+stopifnot(file.exists(drive_auth_json_file))
+googlesheets4::gs4_auth_configure(path = drive_auth_json_file)
+googlesheets4::gs4_auth(email = 'myname@institution.whatever', scopes = "drive.readonly")
+
+library(collaboratR)
+
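After running the auth commands above, it's worth verifying that a token was actually cached. A quick check, sketched with googlesheets4's own helpers (`gs4_has_token()` and `gs4_user()`):

```r
# verify that an OAuth token is cached and which account it belongs to
if (googlesheets4::gs4_has_token()) {
  googlesheets4::gs4_user()   # prints the authenticated email
} else {
  message("No token yet - re-run googlesheets4::gs4_auth()")
}
```

If this reports no token, re-check that the JSON client file path in `.Renviron` is correct.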
+ + + +
+ + + +
+ +
+

+

Site built with pkgdown 2.2.0.

+
+ +
+
+ + + + + + + + diff --git a/docs/articles/index.html b/docs/articles/index.html new file mode 100644 index 0000000..10b74c8 --- /dev/null +++ b/docs/articles/index.html @@ -0,0 +1,87 @@ + +Articles • collaboratR + + +
+
+ + + +
+ + +
+ +
+

Site built with pkgdown 2.2.0.

+
+ +
+ + + + + + + + diff --git a/docs/articles/process_overview.html b/docs/articles/process_overview.html new file mode 100644 index 0000000..94d5e25 --- /dev/null +++ b/docs/articles/process_overview.html @@ -0,0 +1,156 @@ + + + + + + + +Google Sheet Validation Process Outline • collaboratR + + + + + + + + + + + +
+
+ + + + +
+
+ + + + +
+library(collaboratR)
+

This package enables the use of Google Sheets as a collaborative data and metadata entry tool for researchers of different skill sets, automated validation of the data using standardized rules, and tracking of all changes to data using git.

+
+

Components +

+

Workflow Overview:

+
    +
  • read metadata +
      +
    • schema (column definitions) from google sheets
    • +
    • list of data sheet URLs
    • +
    • validation rules (from Rdata file)
    • +
    +
  • +
  • read data files (from Google Sheets)
  • +
  • validate +
      +
    • data format against schema
    • +
    • data values against rules
    • +
    +
  • +
  • report errors
  • +
  • save all sheets as CSV and for commit to git
  • +
  • combine to master list(s)
  • +
+
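A condensed sketch of the workflow above, using this package's functions (`gsheet_auth_setup`, `read_url_list`, and `save_csvs` appear in the validation vignette; the `PROJECT_EMAIL` and `TEST_ID_LIST_URL` environment variables are assumed to be set in `.Renviron`):

```r
library(collaboratR)

# authenticate first (see the Google Sheets API vignette)
stopifnot(gsheet_auth_setup(drive_email = Sys.getenv('PROJECT_EMAIL')))

# read the list of data-sheet URLs, then read, validate, and save each as CSV
urls.df <- read_url_list(gurl = Sys.getenv('TEST_ID_LIST_URL'),
                         id_column = 'ID_new', url_column = 'url')

for (i in seq_len(nrow(urls.df))) {
  # save_csvs reads both tabs, checks columns, and writes the L0 CSVs
  tryCatch(print(save_csvs(as.list(urls.df[i, ]))),
           error = function(e) print(e))
}
```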
+
+

Setup/Requirements +

+

This code is written specifically to accommodate a sheet setup that is unique to a meta-analysis of plant competition experiments; it expects each sheet to have multiple tabs.
+1. Metadata in a Google Sheet

+
    +
  • list of all fields and field types
  • +
+
+
+ + + +
+ + + +
+ +
+

+

Site built with pkgdown 2.2.0.

+
+ +
+
+ + + + + + + + diff --git a/docs/articles/validating_commassemblyrules.html b/docs/articles/validating_commassemblyrules.html new file mode 100644 index 0000000..da28c97 --- /dev/null +++ b/docs/articles/validating_commassemblyrules.html @@ -0,0 +1,730 @@ + + + + + + + +Validating Community Assembly Rules Project Data • collaboratR + + + + + + + + + + + +
+
+ + + + +
+
+ + + + +
+require(dplyr)
+#> Loading required package: dplyr
+#> Warning: package 'dplyr' was built under R version 4.5.2
+#> 
+#> Attaching package: 'dplyr'
+#> The following objects are masked from 'package:stats':
+#> 
+#>     filter, lag
+#> The following objects are masked from 'package:base':
+#> 
+#>     intersect, setdiff, setequal, union
+library(collaboratR)
+
+

Authenticate to Google Drive/Sheets. Requires an email account

+
+
+## EDIT AND ADD YOUR EMAIL IF YOU DON'T HAVE AN EMAIL IN .Renviron FILE
+# if you have not set up the .Renviron file, or would like to test a different
+# email address, set it here. If none is set, the auth setup function will look
+# in the environment (e.g. .Renviron file)
+drive_email <- NULL
+collaboratR::gsheet_auth_setup(drive_email = drive_email)
+#> [1] TRUE
+
+
+

Load data specific to the Comm Assembly Rules project stored in this +package +

+
+# load the column specifications stored in this package
+data(commassembly_rules_biomass_str)
+data(commassembly_rules_env_str)
+
+# list of URLS from google drive to a data sheet
+doc_with_list_url <- Sys.getenv('TEST_ID_LIST_URL')
+id_column = 'ID_new'
+urls.df <- read_url_list(gurl = doc_with_list_url, id_column = id_column, url_column='url')
+#>  Reading from ID_new-Urls.
+#>  Range ''Sheet1''.
+print(paste(nrow(urls.df), "urls to read"))
+#> [1] "67 urls to read"
+

Read in one URL (chosen at random below) as a test. There will be warnings if the columns are not of the correct type or name.

+
+# get the example from one of the URLS
+
+# random sheet
+test_gsheet_url <- sample(urls.df$url, 1)
+# OR set the example sheet manually
+# test_gsheet_url <- 'https://docs.google.com/spreadsheets/d/1Npcre4y4LnzIU_4v_vqJiVyzLtXGZ1lpYGRnXcfi0X8/edit?gid=0#gid=0'
+
+print(paste("We'll use this test sheet:", test_gsheet_url))
+#> [1] "We'll use this test sheet: https://docs.google.com/spreadsheets/d/1NiTlO-tt8hMT4mvgZXPWYiNyGox3cmWnWihjGoYW5k8/edit?gid=0#gid=0"
+
+# test of basic reading, and validate the column names
+# this will throw warnings if there are invalid column names
+
+biomass.df <- read_data_sheet(gurl = test_gsheet_url, 
+                              tab_name = 'biomass_data', 
+                              spec.df = commassembly_rules_biomass_str)
+
+biomass.df
+#> # A tibble: 16 × 33
+#>    spp_who spp_date   ID_new source external_trt trt_type    reponse_var site 
+#>    <chr>   <chr>       <int> <chr>  <fct>        <fct>       <fct>       <chr>
+#>  1 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#>  2 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#>  3 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#>  4 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#>  5 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#>  6 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#>  7 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#>  8 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#>  9 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#> 10 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#> 11 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#> 12 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#> 13 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#> 14 AR      03/09/2024    983 Fig. 2 NA           monoculture biomass     main 
+#> 15 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#> 16 AR      03/09/2024    983 Fig. 2 NA           mixture     biomass     main 
+#> # ℹ 25 more variables: spp_biomass <fct>, f_spp_name <fct>, f_nat_inv <fct>,
+#> #   f_num_indiv <dbl>, c_spp_name <fct>, c_nat_inv <fct>, c_num_indiv <dbl>,
+#> #   biomass_type <fct>, response_mean <dbl>, response_transformation <chr>,
+#> #   response_mean_unit <fct>, response_var <dbl>, response_var_unit <fct>,
+#> #   sample_size <dbl>, nutrient_general <fct>, nutrient_compound <fct>,
+#> #   nutrient1_mean <chr>, nutrient2_mean <chr>, nutrient3_mean <chr>,
+#> #   nutrient_unit <fct>, nutrient_interval_days <dbl>, …
+

Read the ‘env’ tab:

+
+env.df <- read_data_sheet(gurl = test_gsheet_url, 
+                          tab_name = 'env_data', 
+                          spec.df = commassembly_rules_env_str)
+head(env.df)
+#> # A tibble: 1 × 22
+#>   env_who env_date   ID_new exp_type   biome_type country state_city_province   
+#>   <fct>   <chr>      <fct>  <fct>      <fct>      <chr>   <chr>                 
+#> 1 AR      03/09/2024 983    field_expt marsh      USA     Point Aux Pins, Bayou…
+#> # ℹ 15 more variables: latitude <chr>, longitude <chr>, coord_units <chr>,
+#> #   site <fct>, rainshelter <fct>, avg_rep_temp_C <dbl>, avg_rep_ppt_mm <dbl>,
+#> #   avg_rep_elev_m <dbl>, year_start <dbl>, year_end <dbl>,
+#> #   exp_duration_days <dbl>, notes <chr>, study_design <fct>, ntrt_val <fct>,
+#> #   comparison_level <fct>
+
+
+

Validation +

+

The above does validate the column headers and types. (TBD: problems output from readr.)

+

Let’s try using the validation rules

+
+
+biomass_validation_file <- '../inst/rules/biomass_validation_rules.yaml'
+file.exists(biomass_validation_file)
+#> [1] TRUE
+validation_summary <- validate::summary(validate_from_file(biomass.df, biomass_validation_file))
+validation_summary
+#>                         name items passes fails nNA error warning
+#> 1            ID_new_required    16     16     0   0 FALSE   FALSE
+#> 2          spp_who_character     1      1     0   0 FALSE   FALSE
+#> 3  spp_who_two_or_more_chars    16     16     0   0 FALSE   FALSE
+#> 4                 id_new_int     1      1     0   0 FALSE   FALSE
+#> 5         trt_type_is_factor     1      1     0   0 FALSE   FALSE
+#> 6          trt_type_required    16     16     0   0 FALSE   FALSE
+#> 7        trt_type_required.1     1      1     0   0 FALSE   FALSE
+#> 8        trt_type_valid_code    16     16     0   0 FALSE   FALSE
+#> 9       f_nat_inv_valid_code    16      0     0  16 FALSE   FALSE
+#> 10      c_nat_inv_valid_code    16      0     0  16 FALSE   FALSE
+#> 11       response_mean_range    16     16     0   0 FALSE   FALSE
+#>                                                       expression
+#> 1                                                 !is.na(ID_new)
+#> 2                                          is.character(spp_who)
+#> 3                              nchar(as.character(spp_who)) >= 2
+#> 4                                             is.integer(ID_new)
+#> 5                                            is.factor(trt_type)
+#> 6                                               !is.na(trt_type)
+#> 7                                             !is.null(trt_type)
+#> 8            trt_type %vin% c("monoculture", "mixture", "alone")
+#> 9  (f_nat_inv %vin% c("native", "NA", "invasive", "non-native"))
+#> 10 (c_nat_inv %vin% c("native", "NA", "invasive", "non-native"))
+#> 11                  in_range(response_mean, min = 0, max = 5000)
+
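The rule names and expressions in the summary above are defined in a YAML rules file read by `validate_from_file()`. A minimal sketch of the validate package's YAML rule format, using expressions copied from the output above (this is not the project's actual rules file):

```yaml
rules:
- name: ID_new_required
  expr: '!is.na(ID_new)'
- name: trt_type_valid_code
  expr: trt_type %vin% c("monoculture", "mixture", "alone")
- name: response_mean_range
  expr: in_range(response_mean, min = 0, max = 5000)
```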

Validation: Just show the fails

+
+if(sum(validation_summary$fails) == 0) {
+  print("Validation Passed, return true")
+} else {
+  fails <- validation_summary[validation_summary$fails > 0,]
+  fails
+}
+#> [1] "Validation Passed, return true"
+

Validate the ‘env’ tab

+
+
+read_data_sheet_save_warnings <- errorSaver(read_data_sheet)
+env.df <- read_data_sheet_save_warnings(gurl = test_gsheet_url, 
+                              tab_name = 'env_data', 
+                              spec.df = commassembly_rules_env_str)
+
+if("warnings" %in% names(env.df)) { 
+  print(env.df$warnings)
+  # get just the data
+  env.df <- env.df[[1]]
+}
+
+env_validation_file <- '../inst/rules/env_validation_rules.yaml'
+# file.exists(env_validation_file)
+validation_summary <- validate::summary(validate_from_file(env.df, env_validation_file))
+if(sum(validation_summary$fails) == 0) {
+  print("Validation Passed, return true")
+} else {
+  print("Validation fails")
+  fails <- validation_summary[validation_summary$fails > 0,]
+  fails
+}
+#> [1] "Validation Passed, return true"
+
+

Find missing column error

+

This is example code to show only validation errors, without saving anything.

+

Disabled for now; see the validation loop below.

+
+
+
+

Function to read, check columns, validate and save the CSVs of these two tabs

+
+

Save one to a csv and output the filename +

+
+
+test_url_num <- sample(1:nrow(urls.df), 1)
+test_gsheet <- as.list(urls.df[test_url_num,])
+print(test_gsheet)
+#> $who
+#> [1] "AMB"
+#> 
+#> $ID_new
+#> [1] 747
+#> 
+#> $url
+#> [1] "https://docs.google.com/spreadsheets/d/1yyp5st2Z_khZox4JqY8SLMUiiqcx0JFUugQWOg021YE/edit?usp=sharing"
+#> 
+#> $notes
+#> $notes[[1]]
+#> NULL
+#> 
+#> 
+#> $id
+#> [1] 747
+test_file_names <- save_csvs(test_gsheet)
+
+print(test_file_names)
+#> [1] "../L0/biomass_747.csv" "../L0/env_747.csv"
+
+
+

Try to read the CSV back in +

+
+test_df <- read_data_csv(test_file_names[1], spec.df = commassembly_rules_biomass_str)
+test_df
+#> # A tibble: 48 × 33
+#>    spp_who spp_date   ID_new source  external_trt trt_type    reponse_var site 
+#>    <chr>   <chr>       <int> <chr>   <fct>        <fct>       <fct>       <chr>
+#>  1 AMB     25/10/2024    747 Table 1 CO2          alone       biomass     main 
+#>  2 AMB     25/10/2024    747 Table 1 CO2          alone       biomass     main 
+#>  3 AMB     25/10/2024    747 Table 1 CO2          alone       biomass     main 
+#>  4 AMB     25/10/2024    747 Table 1 CO2          alone       biomass     main 
+#>  5 AMB     25/10/2024    747 Table 1 CO2          mixture     biomass     main 
+#>  6 AMB     25/10/2024    747 Table 1 CO2          mixture     biomass     main 
+#>  7 AMB     25/10/2024    747 Table 1 CO2          mixture     biomass     main 
+#>  8 AMB     25/10/2024    747 Table 1 CO2          mixture     biomass     main 
+#>  9 AMB     25/10/2024    747 Table 1 CO2          monoculture biomass     main 
+#> 10 AMB     25/10/2024    747 Table 1 CO2          monoculture biomass     main 
+#> # ℹ 38 more rows
+#> # ℹ 25 more variables: spp_biomass <fct>, f_spp_name <fct>, f_nat_inv <fct>,
+#> #   f_num_indiv <dbl>, c_spp_name <fct>, c_nat_inv <fct>, c_num_indiv <dbl>,
+#> #   biomass_type <fct>, response_mean <dbl>, response_transformation <chr>,
+#> #   response_mean_unit <fct>, response_var <dbl>, response_var_unit <fct>,
+#> #   sample_size <dbl>, nutrient_general <fct>, nutrient_compound <fct>,
+#> #   nutrient1_mean <chr>, nutrient2_mean <chr>, nutrient3_mean <chr>, …
+
+
+
+

Loop that reads all URLS in the url list. +

+

This is complete code, with duplicates from above, to show how to +validate all CSVs

+
+
+data(commassembly_rules_biomass_str)
+data(commassembly_rules_env_str)
+
+# before running this, authenticate. 
+# if you don't have the PROJECT_EMAIL set in .Renviron, edit the next line
+drive_email = Sys.getenv('PROJECT_EMAIL')
+stopifnot(gsheet_auth_setup(drive_email = drive_email))
+
+
+# list of URLS from google drive to a data sheet
+doc_with_list_url <- Sys.getenv('TEST_ID_LIST_URL')
+id_column = 'ID_new'
+urls.df <- read_url_list(gurl = doc_with_list_url, id_column = id_column, url_column='url')
+#>  Reading from ID_new-Urls.
+#>  Range ''Sheet1''.
+
+for(i in seq_len(nrow(urls.df))) {
+  print(paste(urls.df$ID_new[i], urls.df$id[i], urls.df$who[i]))
+  tryCatch({
+    study_info <- as.list(urls.df[i,])
+    file_names<- save_csvs(study_info)
+    print(file_names)
+    }, error=function(e) print(e))
+}
+#> [1] "741 741 ADT"
+#> [1] "../L0/biomass_741.csv" "../L0/env_741.csv"    
+#> [1] "1047 1047 ADT"
+#> [1] "../L0/biomass_1047.csv" "../L0/env_1047.csv"    
+#> [1] "1163 1163 ADT"
+#> [1] "../L0/biomass_1163.csv" "../L0/env_1163.csv"    
+#> [1] "1203 1203 ADT"
+#> [1] "../L0/biomass_1203.csv" "../L0/env_1203.csv"    
+#> [1] "1225 1225 ADT"
+#> [1] "../L0/biomass_1225.csv" "../L0/env_1225.csv"    
+#> [1] "1246 1246 ADT"
+#> [1] "../L0/biomass_1246.csv" "../L0/env_1246.csv"    
+#> [1] "1255 1255 ADT"
+#> [1] "../L0/biomass_1255.csv" "../L0/env_1255.csv"    
+#> [1] "1546 1546 ADT"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61.2s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_1546.csv" "../L0/env_1546.csv"    
+#> [1] "1575 1575 ADT"
+#> [1] "../L0/biomass_1575.csv" "../L0/env_1575.csv"    
+#> [1] "1702 1702 ADT"
+#> [1] "../L0/biomass_1702.csv" "../L0/env_1702.csv"    
+#> [1] "1847 1847 ADT"
+#> [1] "../L0/biomass_1847.csv" "../L0/env_1847.csv"    
+#> [1] "4 4 AMB"
+#> [1] "../L0/biomass_4.csv" "../L0/env_4.csv"    
+#> [1] "133 133 AMB"
+#> [1] "../L0/biomass_133.csv" "../L0/env_133.csv"    
+#> [1] "351 351 AMB"
+#> [1] "../L0/biomass_351.csv" "../L0/env_351.csv"    
+#> [1] "373 373 AMB"
+#> [1] "../L0/biomass_373.csv" "../L0/env_373.csv"    
+#> [1] "374 374 AMB"
+#> [1] "../L0/biomass_374.csv" "../L0/env_374.csv"    
+#> [1] "456 456 AMB"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61.5s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_456.csv" "../L0/env_456.csv"    
+#> [1] "589 589 AMB"
+#> [1] "../L0/biomass_589.csv" "../L0/env_589.csv"    
+#> [1] "614 614 AMB"
+#> [1] "../L0/biomass_614.csv" "../L0/env_614.csv"    
+#> [1] "749 749 AMB"
+#> [1] "../L0/biomass_749.csv" "../L0/env_749.csv"    
+#> [1] "780 780 AMB"
+#> [1] "../L0/biomass_780.csv" "../L0/env_780.csv"    
+#> [1] "825 825 AMB"
+#> [1] "../L0/biomass_825.csv" "../L0/env_825.csv"    
+#> [1] "747 747 AMB"
+#> [1] "../L0/biomass_747.csv" "../L0/env_747.csv"    
+#> [1] "560 560 AMB"
+#> [1] "../L0/biomass_560.csv" "../L0/env_560.csv"    
+#> [1] "10279 10279 LP"
+#>                         name items passes fails nNA error warning
+#> 1            ID_new_required    47     42     5   0 FALSE   FALSE
+#> 2          spp_who_character     1      1     0   0 FALSE   FALSE
+#> 3  spp_who_two_or_more_chars    47     42     0   5 FALSE   FALSE
+#> 4                 id_new_int     1      1     0   0 FALSE   FALSE
+#> 5         trt_type_is_factor     1      1     0   0 FALSE   FALSE
+#> 6          trt_type_required    47     42     5   0 FALSE   FALSE
+#> 7        trt_type_required.1     1      1     0   0 FALSE   FALSE
+#> 8        trt_type_valid_code    47     42     0   5 FALSE   FALSE
+#> 9       f_nat_inv_valid_code    47     42     0   5 FALSE   FALSE
+#> 10      c_nat_inv_valid_code    47     24     0  23 FALSE   FALSE
+#> 11       response_mean_range    47     42     0   5 FALSE   FALSE
+#>                                                       expression
+#> 1                                                 !is.na(ID_new)
+#> 2                                          is.character(spp_who)
+#> 3                              nchar(as.character(spp_who)) >= 2
+#> 4                                             is.integer(ID_new)
+#> 5                                            is.factor(trt_type)
+#> 6                                               !is.na(trt_type)
+#> 7                                             !is.null(trt_type)
+#> 8            trt_type %vin% c("monoculture", "mixture", "alone")
+#> 9  (f_nat_inv %vin% c("native", "NA", "invasive", "non-native"))
+#> 10 (c_nat_inv %vin% c("native", "NA", "invasive", "non-native"))
+#> 11                  in_range(response_mean, min = 0, max = 5000)
+#> [1] "did not validate biomass FALSE  env  TRUE"
+#> [1] NA NA
+#> [1] "20279 20279 LP"
+#> [1] "../L0/biomass_20279.csv" "../L0/env_20279.csv"    
+#> [1] "938 938 LP"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61.7s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_938.csv" "../L0/env_938.csv"    
+#> [1] "1090 1090 LP"
+#> [1] "../L0/biomass_1090.csv" "../L0/env_1090.csv"    
+#> [1] "1114 1114 LP"
+#> [1] "../L0/biomass_1114.csv" "../L0/env_1114.csv"    
+#> [1] "2253 2253 LP"
+#> [1] "../L0/biomass_2253.csv" "../L0/env_2253.csv"    
+#> [1] "2571 2571 LP"
+#> [1] "../L0/biomass_2571.csv" "../L0/env_2571.csv"    
+#> [1] "2757 2757 LP"
+#> [1] "../L0/biomass_2757.csv" "../L0/env_2757.csv"    
+#> [1] "2827 2827 LP"
+#> [1] "../L0/biomass_2827.csv" "../L0/env_2827.csv"    
+#> [1] "2830 2830 LP"
+#> [1] "../L0/biomass_2830.csv" "../L0/env_2830.csv"    
+#> [1] "2872 2872 LP"
+#> [1] "../L0/biomass_2872.csv" "../L0/env_2872.csv"    
+#> [1] "2876 2876 LP"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61.9s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_2876.csv" "../L0/env_2876.csv"    
+#> [1] "2948 2948 LP"
+#> [1] "../L0/biomass_2948.csv" "../L0/env_2948.csv"    
+#> [1] "2990 2990 LP"
+#> [1] "../L0/biomass_2990.csv" "../L0/env_2990.csv"    
+#> [1] "3184 3184 LP"
+#> [1] "../L0/biomass_3184.csv" "../L0/env_3184.csv"    
+#> [1] "3185 3185 LP"
+#> [1] "../L0/biomass_3185.csv" "../L0/env_3185.csv"    
+#> [1] "3497 3497 LP"
+#> [1] "../L0/biomass_3497.csv" "../L0/env_3497.csv"    
+#> [1] "3506 3506 LP"
+#> [1] "../L0/biomass_3506.csv" "../L0/env_3506.csv"    
+#> [1] "3508 3508 LP"
+#> [1] "../L0/biomass_3508.csv" "../L0/env_3508.csv"    
+#> [1] "368 368 AR"
+#> [1] "../L0/biomass_368.csv" "../L0/env_368.csv"    
+#> [1] "983 983 AR"
+#> [1] "../L0/biomass_983.csv" "../L0/env_983.csv"    
+#> [1] "1116 1116 AR"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_1116.csv" "../L0/env_1116.csv"    
+#> [1] "1943 1943 AR"
+#> [1] "../L0/biomass_1943.csv" "../L0/env_1943.csv"    
+#> [1] "2090 2090 AR"
+#> [1] "../L0/biomass_2090.csv" "../L0/env_2090.csv"    
+#> [1] "102125 102125 AR"
+#> [1] "../L0/biomass_102125.csv" "../L0/env_102125.csv"    
+#> [1] "202125 202125 AR"
+#> [1] "../L0/biomass_202125.csv" "../L0/env_202125.csv"    
+#> [1] "2157 2157 AR"
+#> [1] "../L0/biomass_2157.csv" "../L0/env_2157.csv"    
+#> [1] "2164 2164 AR"
+#> [1] "../L0/biomass_2164.csv" "../L0/env_2164.csv"    
+#> [1] "2362 2362 AR"
+#> [1] "../L0/biomass_2362.csv" "../L0/env_2362.csv"    
+#> [1] "2383 2383 AR"
+#> [1] "../L0/biomass_2383.csv" "../L0/env_2383.csv"    
+#> [1] "2441 2441 AR"
+#> [1] "../L0/biomass_2441.csv" "../L0/env_2441.csv"    
+#> [1] "2464 2464 AR"
+#> [1] "../L0/biomass_2464.csv" "../L0/env_2464.csv"    
+#> [1] "2469 2469 AR"
+#> [1] "../L0/biomass_2469.csv" "../L0/env_2469.csv"    
+#> [1] "2559 2559 AR"
+#> [1] "../L0/biomass_2559.csv" "../L0/env_2559.csv"    
+#> [1] "2581 2581 AR"
+#> [1] "../L0/biomass_2581.csv" "../L0/env_2581.csv"    
+#> [1] "2614 2614 AR"
+#> [1] "../L0/biomass_2614.csv" "../L0/env_2614.csv"    
+#> [1] "2635 2635 AR"
+#> [1] "../L0/biomass_2635.csv" "../L0/env_2635.csv"    
+#> [1] "2636 2636 AR"
+#> [1] "../L0/biomass_2636.csv" "../L0/env_2636.csv"    
+#> [1] "2667 2667 AR"
+#> [1] "../L0/biomass_2667.csv" "../L0/env_2667.csv"    
+#> [1] "2672 2672 AR"
+#>  Request 1 failed [429: RESOURCE_EXHAUSTED, per user quota].
+#>  Will retry in 61.4s.
+#>  Request 2 successful!
+#> [1] "../L0/biomass_2672.csv" "../L0/env_2672.csv"    
+#> [1] "2680 2680 AR"
+#> [1] "../L0/biomass_2680.csv" "../L0/env_2680.csv"    
+#> [1] "2284 2284 AR"
+#> [1] "../L0/biomass_2284.csv" "../L0/env_2284.csv"    
+#> [1] "2072 2072 AR"
+#> [1] "../L0/biomass_2072.csv" "../L0/env_2072.csv"
+
+

Build a database +

+
+# getting the list of CSV file paths is flexible so that some could be 
+# excluded or multiple folders combined
+
+biomass_files <- dir('../L0', pattern = "biomass.*\\.csv", 
+                          full.names = TRUE, include.dirs = TRUE)
+biomass.df <- aggregate_csvs(csv_list=biomass_files, spec.df=commassembly_rules_biomass_str)
+
+# some of these are not saving or reading correctly
+#env_files <- dir('../L0', pattern = "env.*\\.csv", full.names = TRUE, include.dirs = TRUE)
+# env.df <- aggregate_csvs(csv_list=env_files, spec.df=commassembly_rules_env_str)
+
+print(paste('biomass rows', nrow(biomass.df))) # , " env rows", nrow(env.df)))
+#> [1] "biomass rows 3446"
+
+
+
+ + + +
+ + + +
+ +
+

+

Site built with pkgdown 2.2.0.

+
+ +
+
+ + + + + + + + diff --git a/docs/authors.html b/docs/authors.html new file mode 100644 index 0000000..6bb0906 --- /dev/null +++ b/docs/authors.html @@ -0,0 +1,116 @@ + +Authors and Citation • collaboratR + + +
+
+ + + +
+
+
+ + + +
  • +

    Patrick S Bills. Author, maintainer. +

    +
  • +
  • +

    Ashwini Ramesh. Author, data contributor. +

    +
  • +
  • +

    Laís Petri. Author, data contributor. +

    +
  • +
  • +

    Phoebe Lehman Zarnetske. Author, funder. +

    +
  • +
+
+
+

Citation

+ +
+
+ + +

Bills PS, Ramesh A, Petri L, Zarnetske PL (2026). +collaboratR: validation and version control for level 0 data using Google Drive. +R package version 0.0.0.9100. +

+
@Manual{,
+  title = {collaboratR: validation and version control for level 0 data using Google Drive},
+  author = {Patrick S Bills and Ashwini Ramesh and Laís Petri and Phoebe Lehman Zarnetske},
+  year = {2026},
+  note = {R package version 0.0.0.9100},
+}
+ +
+ +
+ + + +
+ +
+

Site built with pkgdown 2.2.0.

+
+ +
+ + + + + + + + diff --git a/docs/bootstrap-toc.css b/docs/bootstrap-toc.css new file mode 100644 index 0000000..5a85941 --- /dev/null +++ b/docs/bootstrap-toc.css @@ -0,0 +1,60 @@ +/*! + * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) + * Copyright 2015 Aidan Feldman + * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ + +/* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ + +/* All levels of nav */ +nav[data-toggle='toc'] .nav > li > a { + display: block; + padding: 4px 20px; + font-size: 13px; + font-weight: 500; + color: #767676; +} +nav[data-toggle='toc'] .nav > li > a:hover, +nav[data-toggle='toc'] .nav > li > a:focus { + padding-left: 19px; + color: #563d7c; + text-decoration: none; + background-color: transparent; + border-left: 1px solid #563d7c; +} +nav[data-toggle='toc'] .nav > .active > a, +nav[data-toggle='toc'] .nav > .active:hover > a, +nav[data-toggle='toc'] .nav > .active:focus > a { + padding-left: 18px; + font-weight: bold; + color: #563d7c; + background-color: transparent; + border-left: 2px solid #563d7c; +} + +/* Nav: second level (shown on .active) */ +nav[data-toggle='toc'] .nav .nav { + display: none; /* Hide by default, but at >768px, show it */ + padding-bottom: 10px; +} +nav[data-toggle='toc'] .nav .nav > li > a { + padding-top: 1px; + padding-bottom: 1px; + padding-left: 30px; + font-size: 12px; + font-weight: normal; +} +nav[data-toggle='toc'] .nav .nav > li > a:hover, +nav[data-toggle='toc'] .nav .nav > li > a:focus { + padding-left: 29px; +} +nav[data-toggle='toc'] .nav .nav > .active > a, +nav[data-toggle='toc'] .nav .nav > .active:hover > a, +nav[data-toggle='toc'] .nav .nav > .active:focus > a { + padding-left: 28px; + font-weight: 500; +} + +/* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 
+nav[data-toggle='toc'] .nav > .active > ul { + display: block; +} diff --git a/docs/bootstrap-toc.js b/docs/bootstrap-toc.js new file mode 100644 index 0000000..1cdd573 --- /dev/null +++ b/docs/bootstrap-toc.js @@ -0,0 +1,159 @@ +/*! + * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) + * Copyright 2015 Aidan Feldman + * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ +(function() { + 'use strict'; + + window.Toc = { + helpers: { + // return all matching elements in the set, or their descendants + findOrFilter: function($el, selector) { + // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ + // http://stackoverflow.com/a/12731439/358804 + var $descendants = $el.find(selector); + return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); + }, + + generateUniqueIdBase: function(el) { + var text = $(el).text(); + var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); + return anchor || el.tagName.toLowerCase(); + }, + + generateUniqueId: function(el) { + var anchorBase = this.generateUniqueIdBase(el); + for (var i = 0; ; i++) { + var anchor = anchorBase; + if (i > 0) { + // add suffix + anchor += '-' + i; + } + // check if ID already exists + if (!document.getElementById(anchor)) { + return anchor; + } + } + }, + + generateAnchor: function(el) { + if (el.id) { + return el.id; + } else { + var anchor = this.generateUniqueId(el); + el.id = anchor; + return anchor; + } + }, + + createNavList: function() { + return $(''); + }, + + createChildNavList: function($parent) { + var $childList = this.createNavList(); + $parent.append($childList); + return $childList; + }, + + generateNavEl: function(anchor, text) { + var $a = $(''); + $a.attr('href', '#' + anchor); + $a.text(text); + var $li = $('
  • '); + $li.append($a); + return $li; + }, + + generateNavItem: function(headingEl) { + var anchor = this.generateAnchor(headingEl); + var $heading = $(headingEl); + var text = $heading.data('toc-text') || $heading.text(); + return this.generateNavEl(anchor, text); + }, + + // Find the first heading level (`

    `, then `

    `, etc.) that has more than one element. Defaults to 1 (for `

    `). + getTopLevel: function($scope) { + for (var i = 1; i <= 6; i++) { + var $headings = this.findOrFilter($scope, 'h' + i); + if ($headings.length > 1) { + return i; + } + } + + return 1; + }, + + // returns the elements for the top level, and the next below it + getHeadings: function($scope, topLevel) { + var topSelector = 'h' + topLevel; + + var secondaryLevel = topLevel + 1; + var secondarySelector = 'h' + secondaryLevel; + + return this.findOrFilter($scope, topSelector + ',' + secondarySelector); + }, + + getNavLevel: function(el) { + return parseInt(el.tagName.charAt(1), 10); + }, + + populateNav: function($topContext, topLevel, $headings) { + var $context = $topContext; + var $prevNav; + + var helpers = this; + $headings.each(function(i, el) { + var $newNav = helpers.generateNavItem(el); + var navLevel = helpers.getNavLevel(el); + + // determine the proper $context + if (navLevel === topLevel) { + // use top level + $context = $topContext; + } else if ($prevNav && $context === $topContext) { + // create a new level of the tree and switch to it + $context = helpers.createChildNavList($prevNav); + } // else use the current $context + + $context.append($newNav); + + $prevNav = $newNav; + }); + }, + + parseOps: function(arg) { + var opts; + if (arg.jquery) { + opts = { + $nav: arg + }; + } else { + opts = arg; + } + opts.$scope = opts.$scope || $(document.body); + return opts; + } + }, + + // accepts a jQuery object, or an options object + init: function(opts) { + opts = this.helpers.parseOps(opts); + + // ensure that the data attribute is in place for styling + opts.$nav.attr('data-toggle', 'toc'); + + var $topContext = this.helpers.createChildNavList(opts.$nav); + var topLevel = this.helpers.getTopLevel(opts.$scope); + var $headings = this.helpers.getHeadings(opts.$scope, topLevel); + this.helpers.populateNav($topContext, topLevel, $headings); + } + }; + + $(function() { + $('nav[data-toggle="toc"]').each(function(i, el) { + var $nav = $(el); + 
Toc.init($nav); + }); + }); +})(); diff --git a/docs/docsearch.css b/docs/docsearch.css new file mode 100644 index 0000000..e5f1fe1 --- /dev/null +++ b/docs/docsearch.css @@ -0,0 +1,148 @@ +/* Docsearch -------------------------------------------------------------- */ +/* + Source: https://github.com/algolia/docsearch/ + License: MIT +*/ + +.algolia-autocomplete { + display: block; + -webkit-box-flex: 1; + -ms-flex: 1; + flex: 1 +} + +.algolia-autocomplete .ds-dropdown-menu { + width: 100%; + min-width: none; + max-width: none; + padding: .75rem 0; + background-color: #fff; + background-clip: padding-box; + border: 1px solid rgba(0, 0, 0, .1); + box-shadow: 0 .5rem 1rem rgba(0, 0, 0, .175); +} + +@media (min-width:768px) { + .algolia-autocomplete .ds-dropdown-menu { + width: 175% + } +} + +.algolia-autocomplete .ds-dropdown-menu::before { + display: none +} + +.algolia-autocomplete .ds-dropdown-menu [class^=ds-dataset-] { + padding: 0; + background-color: rgb(255,255,255); + border: 0; + max-height: 80vh; +} + +.algolia-autocomplete .ds-dropdown-menu .ds-suggestions { + margin-top: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion { + padding: 0; + overflow: visible +} + +.algolia-autocomplete .algolia-docsearch-suggestion--category-header { + padding: .125rem 1rem; + margin-top: 0; + font-size: 1.3em; + font-weight: 500; + color: #00008B; + border-bottom: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--wrapper { + float: none; + padding-top: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--subcategory-column { + float: none; + width: auto; + padding: 0; + text-align: left +} + +.algolia-autocomplete .algolia-docsearch-suggestion--content { + float: none; + width: auto; + padding: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--content::before { + display: none +} + +.algolia-autocomplete .ds-suggestion:not(:first-child) .algolia-docsearch-suggestion--category-header { + padding-top: .75rem; + margin-top: .75rem; + 
border-top: 1px solid rgba(0, 0, 0, .1) +} + +.algolia-autocomplete .ds-suggestion .algolia-docsearch-suggestion--subcategory-column { + display: block; + padding: .1rem 1rem; + margin-bottom: 0.1; + font-size: 1.0em; + font-weight: 400 + /* display: none */ +} + +.algolia-autocomplete .algolia-docsearch-suggestion--title { + display: block; + padding: .25rem 1rem; + margin-bottom: 0; + font-size: 0.9em; + font-weight: 400 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--text { + padding: 0 1rem .5rem; + margin-top: -.25rem; + font-size: 0.8em; + font-weight: 400; + line-height: 1.25 +} + +.algolia-autocomplete .algolia-docsearch-footer { + width: 110px; + height: 20px; + z-index: 3; + margin-top: 10.66667px; + float: right; + font-size: 0; + line-height: 0; +} + +.algolia-autocomplete .algolia-docsearch-footer--logo { + background-image: url("data:image/svg+xml;utf8,"); + background-repeat: no-repeat; + background-position: 50%; + background-size: 100%; + overflow: hidden; + text-indent: -9000px; + width: 100%; + height: 100%; + display: block; + transform: translate(-8px); +} + +.algolia-autocomplete .algolia-docsearch-suggestion--highlight { + color: #FF8C00; + background: rgba(232, 189, 54, 0.1) +} + + +.algolia-autocomplete .algolia-docsearch-suggestion--text .algolia-docsearch-suggestion--highlight { + box-shadow: inset 0 -2px 0 0 rgba(105, 105, 105, .5) +} + +.algolia-autocomplete .ds-suggestion.ds-cursor .algolia-docsearch-suggestion--content { + background-color: rgba(192, 192, 192, .15) +} diff --git a/docs/docsearch.js b/docs/docsearch.js new file mode 100644 index 0000000..b35504c --- /dev/null +++ b/docs/docsearch.js @@ -0,0 +1,85 @@ +$(function() { + + // register a handler to move the focus to the search bar + // upon pressing shift + "/" (i.e. 
"?") + $(document).on('keydown', function(e) { + if (e.shiftKey && e.keyCode == 191) { + e.preventDefault(); + $("#search-input").focus(); + } + }); + + $(document).ready(function() { + // do keyword highlighting + /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ + var mark = function() { + + var referrer = document.URL ; + var paramKey = "q" ; + + if (referrer.indexOf("?") !== -1) { + var qs = referrer.substr(referrer.indexOf('?') + 1); + var qs_noanchor = qs.split('#')[0]; + var qsa = qs_noanchor.split('&'); + var keyword = ""; + + for (var i = 0; i < qsa.length; i++) { + var currentParam = qsa[i].split('='); + + if (currentParam.length !== 2) { + continue; + } + + if (currentParam[0] == paramKey) { + keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); + } + } + + if (keyword !== "") { + $(".contents").unmark({ + done: function() { + $(".contents").mark(keyword); + } + }); + } + } + }; + + mark(); + }); +}); + +/* Search term highlighting ------------------------------*/ + +function matchedWords(hit) { + var words = []; + + var hierarchy = hit._highlightResult.hierarchy; + // loop to fetch from lvl0, lvl1, etc. + for (var idx in hierarchy) { + words = words.concat(hierarchy[idx].matchedWords); + } + + var content = hit._highlightResult.content; + if (content) { + words = words.concat(content.matchedWords); + } + + // return unique words + var words_uniq = [...new Set(words)]; + return words_uniq; +} + +function updateHitURL(hit) { + + var words = matchedWords(hit); + var url = ""; + + if (hit.anchor) { + url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; + } else { + url = hit.url + '?q=' + escape(words.join(" ")); + } + + return url; +} diff --git a/docs/index.html b/docs/index.html new file mode 100644 index 0000000..bf0a6f6 --- /dev/null +++ b/docs/index.html @@ -0,0 +1,194 @@ + + + + + + + +validation and version control for level 0 data using Google Drive • collaboratR + + + + + + + + + + + + +
    +
    + + + + +
    +
    +
    + +
    +

    A package to support collaborative meta-analysis for MSU IBEEM + +

    + + +
    +
    +

    Motivation +

    +

Performing a meta-analysis requires collating and harmonizing data extracted from many different sources, most frequently scientific publications.
+Collaborative meta-analysis requires a group of scientists to collectively develop and agree on their goals, the type of data extracted, and the format of those data, and to apply them consistently across papers. This package helps to support that work efficiently.

    +

This R package is one of 3 repositories that support the data entry, validation, and accumulation of a meta-analysis for the commRULES project.

    +
      +
    1. collaboratR: commRULES data management code for L0 and L0->L1 layer in EDI framework
    2. +
    3. data: version controlled data collection for tracking provenance using git, this is the L0 and L1 layers in the EDI framework
    4. +
5. analysis: R code for reproducible data analysis, L1->L2 layers in the EDI framework
    6. +
    +
    +
    +

    Installation - Package +

    +
      +
    • clone this repository into a new Rstudio project and open it

    • +
    • +

  install required packages: this package uses renv to manage the packages you need to install, recorded in the renv.lock file. 1. install the renv package: this can go into the R environment used for all packages.

      +
        +
      1. in R run renv::restore() or if that complains about R versions
      2. +
      +
    • +
    +

additional packages are required to build the package and this website; source the script R/install_dev_packages.R
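The installation steps above can be sketched from the R console; this assumes you have cloned the repository and opened it as an RStudio project (the script path matches the file added in this patch):

```r
# one-time, outside any project: install renv itself
install.packages("renv")

# inside the project: restore the packages pinned in renv.lock
renv::restore()

# development-only packages (devtools, pkgdown, renv) needed to
# build the package and this website
source("R/install_dev_packages.r")
```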

    +
    +
    +

    Data Google Drive Project Setup +

    +

See the Vignette “Google Sheets API setup using Google Cloud” for details about setting up a google sheets connection with R, which requires a google cloud project at your institution

    +

    Note that for safety, this package only reads from google drive and it never writes to google drive. Therefore it only requests ‘read-only’ access.

    +
    +
    +

    Usage +

    +

When reading in data sheets, you provide a URL for a datasheet that exists in any folder that you have access to. The system will attempt to log you in to google drive and request your permission for this code to access files on your behalf.

    +
    +gurl<- 'https://docs.google.com/spreadsheets/d/1w6sYozjybyd53eeiTdigrRTonteQW2KXUNZNmEhQyM8/edit?gid=0#gid=0'
    +study_data<- read_commrules_sheet(gurl)
    +
    +

    +
    +
    +
    +

    References +

    +

van der Loo, Mark P. J., and Edwin de Jonge (2021). “Data validation infrastructure for R.” Journal of Statistical Software, 97(10), 1–33. doi: 10.18637/jss.v097.i10. https://www.jstatsoft.org/article/view/v097i10

    +
    +
    +
    + + +
    + + +
    + +
    +

    +

    Site built with pkgdown 2.2.0.

    +
    + +
    +
    + + + + + + + + diff --git a/docs/link.svg b/docs/link.svg new file mode 100644 index 0000000..88ad827 --- /dev/null +++ b/docs/link.svg @@ -0,0 +1,12 @@ + + + + + + diff --git a/docs/pkgdown.css b/docs/pkgdown.css new file mode 100644 index 0000000..80ea5b8 --- /dev/null +++ b/docs/pkgdown.css @@ -0,0 +1,384 @@ +/* Sticky footer */ + +/** + * Basic idea: https://philipwalton.github.io/solved-by-flexbox/demos/sticky-footer/ + * Details: https://github.com/philipwalton/solved-by-flexbox/blob/master/assets/css/components/site.css + * + * .Site -> body > .container + * .Site-content -> body > .container .row + * .footer -> footer + * + * Key idea seems to be to ensure that .container and __all its parents__ + * have height set to 100% + * + */ + +html, body { + height: 100%; +} + +body { + position: relative; +} + +body > .container { + display: flex; + height: 100%; + flex-direction: column; +} + +body > .container .row { + flex: 1 0 auto; +} + +footer { + margin-top: 45px; + padding: 35px 0 36px; + border-top: 1px solid #e5e5e5; + color: #666; + display: flex; + flex-shrink: 0; +} +footer p { + margin-bottom: 0; +} +footer div { + flex: 1; +} +footer .pkgdown { + text-align: right; +} +footer p { + margin-bottom: 0; +} + +img.icon { + float: right; +} + +/* Ensure in-page images don't run outside their container */ +.contents img { + max-width: 100%; + height: auto; +} + +/* Fix bug in bootstrap (only seen in firefox) */ +summary { + display: list-item; +} + +/* Typographic tweaking ---------------------------------*/ + +.contents .page-header { + margin-top: calc(-60px + 1em); +} + +dd { + margin-left: 3em; +} + +/* Section anchors ---------------------------------*/ + +a.anchor { + display: none; + margin-left: 5px; + width: 20px; + height: 20px; + + background-image: url(./link.svg); + background-repeat: no-repeat; + background-size: 20px 20px; + background-position: center center; +} + +h1:hover .anchor, +h2:hover .anchor, +h3:hover .anchor, +h4:hover 
.anchor, +h5:hover .anchor, +h6:hover .anchor { + display: inline-block; +} + +/* Fixes for fixed navbar --------------------------*/ + +.contents h1, .contents h2, .contents h3, .contents h4 { + padding-top: 60px; + margin-top: -40px; +} + +/* Navbar submenu --------------------------*/ + +.dropdown-submenu { + position: relative; +} + +.dropdown-submenu>.dropdown-menu { + top: 0; + left: 100%; + margin-top: -6px; + margin-left: -1px; + border-radius: 0 6px 6px 6px; +} + +.dropdown-submenu:hover>.dropdown-menu { + display: block; +} + +.dropdown-submenu>a:after { + display: block; + content: " "; + float: right; + width: 0; + height: 0; + border-color: transparent; + border-style: solid; + border-width: 5px 0 5px 5px; + border-left-color: #cccccc; + margin-top: 5px; + margin-right: -10px; +} + +.dropdown-submenu:hover>a:after { + border-left-color: #ffffff; +} + +.dropdown-submenu.pull-left { + float: none; +} + +.dropdown-submenu.pull-left>.dropdown-menu { + left: -100%; + margin-left: 10px; + border-radius: 6px 0 6px 6px; +} + +/* Sidebar --------------------------*/ + +#pkgdown-sidebar { + margin-top: 30px; + position: -webkit-sticky; + position: sticky; + top: 70px; +} + +#pkgdown-sidebar h2 { + font-size: 1.5em; + margin-top: 1em; +} + +#pkgdown-sidebar h2:first-child { + margin-top: 0; +} + +#pkgdown-sidebar .list-unstyled li { + margin-bottom: 0.5em; +} + +/* bootstrap-toc tweaks ------------------------------------------------------*/ + +/* All levels of nav */ + +nav[data-toggle='toc'] .nav > li > a { + padding: 4px 20px 4px 6px; + font-size: 1.5rem; + font-weight: 400; + color: inherit; +} + +nav[data-toggle='toc'] .nav > li > a:hover, +nav[data-toggle='toc'] .nav > li > a:focus { + padding-left: 5px; + color: inherit; + border-left: 1px solid #878787; +} + +nav[data-toggle='toc'] .nav > .active > a, +nav[data-toggle='toc'] .nav > .active:hover > a, +nav[data-toggle='toc'] .nav > .active:focus > a { + padding-left: 5px; + font-size: 1.5rem; + 
font-weight: 400; + color: inherit; + border-left: 2px solid #878787; +} + +/* Nav: second level (shown on .active) */ + +nav[data-toggle='toc'] .nav .nav { + display: none; /* Hide by default, but at >768px, show it */ + padding-bottom: 10px; +} + +nav[data-toggle='toc'] .nav .nav > li > a { + padding-left: 16px; + font-size: 1.35rem; +} + +nav[data-toggle='toc'] .nav .nav > li > a:hover, +nav[data-toggle='toc'] .nav .nav > li > a:focus { + padding-left: 15px; +} + +nav[data-toggle='toc'] .nav .nav > .active > a, +nav[data-toggle='toc'] .nav .nav > .active:hover > a, +nav[data-toggle='toc'] .nav .nav > .active:focus > a { + padding-left: 15px; + font-weight: 500; + font-size: 1.35rem; +} + +/* orcid ------------------------------------------------------------------- */ + +.orcid { + font-size: 16px; + color: #A6CE39; + /* margins are required by official ORCID trademark and display guidelines */ + margin-left:4px; + margin-right:4px; + vertical-align: middle; +} + +/* Reference index & topics ----------------------------------------------- */ + +.ref-index th {font-weight: normal;} + +.ref-index td {vertical-align: top; min-width: 100px} +.ref-index .icon {width: 40px;} +.ref-index .alias {width: 40%;} +.ref-index-icons .alias {width: calc(40% - 40px);} +.ref-index .title {width: 60%;} + +.ref-arguments th {text-align: right; padding-right: 10px;} +.ref-arguments th, .ref-arguments td {vertical-align: top; min-width: 100px} +.ref-arguments .name {width: 20%;} +.ref-arguments .desc {width: 80%;} + +/* Nice scrolling for wide elements --------------------------------------- */ + +table { + display: block; + overflow: auto; +} + +/* Syntax highlighting ---------------------------------------------------- */ + +pre, code, pre code { + background-color: #f8f8f8; + color: #333; +} +pre, pre code { + white-space: pre-wrap; + word-break: break-all; + overflow-wrap: break-word; +} + +pre { + border: 1px solid #eee; +} + +pre .img, pre .r-plt { + margin: 5px 0; +} + +pre 
.img img, pre .r-plt img { + background-color: #fff; +} + +code a, pre a { + color: #375f84; +} + +a.sourceLine:hover { + text-decoration: none; +} + +.fl {color: #1514b5;} +.fu {color: #000000;} /* function */ +.ch,.st {color: #036a07;} /* string */ +.kw {color: #264D66;} /* keyword */ +.co {color: #888888;} /* comment */ + +.error {font-weight: bolder;} +.warning {font-weight: bolder;} + +/* Clipboard --------------------------*/ + +.hasCopyButton { + position: relative; +} + +.btn-copy-ex { + position: absolute; + right: 0; + top: 0; + visibility: hidden; +} + +.hasCopyButton:hover button.btn-copy-ex { + visibility: visible; +} + +/* headroom.js ------------------------ */ + +.headroom { + will-change: transform; + transition: transform 200ms linear; +} +.headroom--pinned { + transform: translateY(0%); +} +.headroom--unpinned { + transform: translateY(-100%); +} + +/* mark.js ----------------------------*/ + +mark { + background-color: rgba(255, 255, 51, 0.5); + border-bottom: 2px solid rgba(255, 153, 51, 0.3); + padding: 1px; +} + +/* vertical spacing after htmlwidgets */ +.html-widget { + margin-bottom: 10px; +} + +/* fontawesome ------------------------ */ + +.fab { + font-family: "Font Awesome 5 Brands" !important; +} + +/* don't display links in code chunks when printing */ +/* source: https://stackoverflow.com/a/10781533 */ +@media print { + code a:link:after, code a:visited:after { + content: ""; + } +} + +/* Section anchors --------------------------------- + Added in pandoc 2.11: https://github.com/jgm/pandoc-templates/commit/9904bf71 +*/ + +div.csl-bib-body { } +div.csl-entry { + clear: both; +} +.hanging-indent div.csl-entry { + margin-left:2em; + text-indent:-2em; +} +div.csl-left-margin { + min-width:2em; + float:left; +} +div.csl-right-inline { + margin-left:2em; + padding-left:1em; +} +div.csl-indent { + margin-left: 2em; +} diff --git a/docs/pkgdown.js b/docs/pkgdown.js new file mode 100644 index 0000000..6f0eee4 --- /dev/null +++ 
b/docs/pkgdown.js @@ -0,0 +1,108 @@ +/* http://gregfranko.com/blog/jquery-best-practices/ */ +(function($) { + $(function() { + + $('.navbar-fixed-top').headroom(); + + $('body').css('padding-top', $('.navbar').height() + 10); + $(window).resize(function(){ + $('body').css('padding-top', $('.navbar').height() + 10); + }); + + $('[data-toggle="tooltip"]').tooltip(); + + var cur_path = paths(location.pathname); + var links = $("#navbar ul li a"); + var max_length = -1; + var pos = -1; + for (var i = 0; i < links.length; i++) { + if (links[i].getAttribute("href") === "#") + continue; + // Ignore external links + if (links[i].host !== location.host) + continue; + + var nav_path = paths(links[i].pathname); + + var length = prefix_length(nav_path, cur_path); + if (length > max_length) { + max_length = length; + pos = i; + } + } + + // Add class to parent
  • , and enclosing
  • if in dropdown + if (pos >= 0) { + var menu_anchor = $(links[pos]); + menu_anchor.parent().addClass("active"); + menu_anchor.closest("li.dropdown").addClass("active"); + } + }); + + function paths(pathname) { + var pieces = pathname.split("/"); + pieces.shift(); // always starts with / + + var end = pieces[pieces.length - 1]; + if (end === "index.html" || end === "") + pieces.pop(); + return(pieces); + } + + // Returns -1 if not found + function prefix_length(needle, haystack) { + if (needle.length > haystack.length) + return(-1); + + // Special case for length-0 haystack, since for loop won't run + if (haystack.length === 0) { + return(needle.length === 0 ? 0 : -1); + } + + for (var i = 0; i < haystack.length; i++) { + if (needle[i] != haystack[i]) + return(i); + } + + return(haystack.length); + } + + /* Clipboard --------------------------*/ + + function changeTooltipMessage(element, msg) { + var tooltipOriginalTitle=element.getAttribute('data-original-title'); + element.setAttribute('data-original-title', msg); + $(element).tooltip('show'); + element.setAttribute('data-original-title', tooltipOriginalTitle); + } + + if(ClipboardJS.isSupported()) { + $(document).ready(function() { + var copyButton = ""; + + $("div.sourceCode").addClass("hasCopyButton"); + + // Insert copy buttons: + $(copyButton).prependTo(".hasCopyButton"); + + // Initialize tooltips: + $('.btn-copy-ex').tooltip({container: 'body'}); + + // Initialize clipboard: + var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { + text: function(trigger) { + return trigger.parentNode.textContent.replace(/\n#>[^\n]*/g, ""); + } + }); + + clipboardBtnCopies.on('success', function(e) { + changeTooltipMessage(e.trigger, 'Copied!'); + e.clearSelection(); + }); + + clipboardBtnCopies.on('error', function() { + changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); + }); + }); + } +})(window.jQuery || window.$) diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml new file mode 100644 
index 0000000..559c0cf --- /dev/null +++ b/docs/pkgdown.yml @@ -0,0 +1,8 @@ +pandoc: 3.8.3 +pkgdown: 2.2.0 +pkgdown_sha: ~ +articles: + google_sheets_api: google_sheets_api.html + process_overview: process_overview.html + validating_commassemblyrules: validating_commassemblyrules.html +last_built: 2026-03-13T18:52Z diff --git a/docs/reference/aggregate_csvs.html b/docs/reference/aggregate_csvs.html new file mode 100644 index 0000000..c214f06 --- /dev/null +++ b/docs/reference/aggregate_csvs.html @@ -0,0 +1,100 @@ + +combine data csvs — aggregate_csvs • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

given a list of CSV files, reads and checks the specs, +then aggregates them into a single data frame.

    +
    + +
    +
    aggregate_csvs(csv_list, spec.df)
    +
    + +
    +

    Arguments

    + + +
    csv_list
    +

list of files, e.g.
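A hypothetical call sketching how the arguments fit together; the directory path and the spec data frame name here are invented for illustration:

```r
# gather the per-paper extraction files
csv_list <- list.files("data/extracted", pattern = "\\.csv$", full.names = TRUE)

# spec.df describes the expected columns (col_name, col_type);
# see get_col_type_from_spec for its layout
combined <- aggregate_csvs(csv_list, spec.df = spec.df)
```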

    + +
    + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/as.Date.flexible.html b/docs/reference/as.Date.flexible.html new file mode 100644 index 0000000..eeb9fa5 --- /dev/null +++ b/docs/reference/as.Date.flexible.html @@ -0,0 +1,107 @@ + +flexible character to date converter — as.Date.flexible • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

tries to convert a character to a date using one of two delimiters ('/' or '-') and +starts with the international date format first (d/m/Y); if that fails it tries the ISO date format (Y-m-d) +and finally the US format. The value can be 'optional' in that if it is blank or not a date, NA is returned

    +
    + +
    +
    # S3 method for class 'flexible'
    +as.Date(x, ...)
    +
    + +
    +

    Arguments

    + + +
    x
    +

    character to be converted to date, or empty string

    + +
    +
    +

    Value

    +

    Date or NA if x is blank, using as.Date function with 'tryFormats' option
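The parsing order described above can be sketched with base R's tryFormats option; this illustrates the documented behavior and is not necessarily the package's exact implementation:

```r
# try d/m/Y first, then Y-m-d, then US m/d/Y; blanks become NA
parse_date_flexibly <- function(x) {
  if (is.na(x) || !nzchar(trimws(x))) return(as.Date(NA))
  tryCatch(
    as.Date(x, tryFormats = c("%d/%m/%Y", "%Y-%m-%d", "%m/%d/%Y")),
    error = function(e) as.Date(NA)
  )
}

parse_date_flexibly("31/12/2023")  # international d/m/Y
parse_date_flexibly("2023-12-31")  # ISO Y-m-d
parse_date_flexibly("")            # blank -> NA
```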

    +
    + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/errorSaver.html b/docs/reference/errorSaver.html new file mode 100644 index 0000000..a5e7f91 --- /dev/null +++ b/docs/reference/errorSaver.html @@ -0,0 +1,138 @@ + +Function wrapper to capture errors and warnings for storing — errorSaver • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

errorSaver wraps functions to capture error and warning output that would +normally be emitted to the console and can't otherwise be saved. Useful +in apply or a loop; here it is used to collect the warnings issued by readr +functions, which issue warnings to the console that we want to collect. +If there are no errors and no warnings, the regular function result is returned. +If there are warnings or errors, returns a list with $warnings and $errors elements. +This method breaks down if the result

    +
    + +
    +
    errorSaver(fun)
    +
    + +
    +

    Arguments

    + + +
    fun
    +

    The function from which we'll capture errors and warnings

    + +
    +
    +

    Value

    +

a wrapped function

    +
    + + +
    +

    Examples

    +
    log.errors <- errorSaver(log)
    +log.errors("a")
    +#> [[1]]
    +#> NULL
    +#> 
    +#> $warnings
    +#> NULL
    +#> 
    +#> $errors
    +#> [1] "non-numeric argument to mathematical function"
    +#> 
    +log.errors(1)
    +#> [1] 0
    +read_csv_with_warnings <- errorSaver(readr::read_csv)
    +
    +
    +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/gdrive_client_setup.html b/docs/reference/gdrive_client_setup.html new file mode 100644 index 0000000..db469ad --- /dev/null +++ b/docs/reference/gdrive_client_setup.html @@ -0,0 +1,96 @@ + +get a google drive 'client' for authentication from env file — gdrive_client_setup • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

reads values from the environment for configuring a google drive client +for accessing gdrive files or gsheets data

    +
    + +
    +
    gdrive_client_setup()
    +
    + +
    +

    Value

    +

    'client' for use in drive_auth_configure or

    +
    + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/gdrive_setup.html b/docs/reference/gdrive_setup.html new file mode 100644 index 0000000..6ad4b0a --- /dev/null +++ b/docs/reference/gdrive_setup.html @@ -0,0 +1,113 @@ + +connect to your google drive account, required set-up for using the google drive packages — gdrive_setup • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

this is wrapped in a function because it has the side effect of logging in, and is useful for +the two functions that read data from sheets or CSVs.

    +
    + +
    +
    gdrive_setup(drive_email = NULL, reset = FALSE)
    +
    + +
    +

    Arguments

    + + +
    drive_email
    +

    email to be used for google drive. Reads from env var, see get_drive_email

    + + +
    reset
    +

    boolean whether to start over and re-authorize

    + +
    +
    +

    Value

    +

    TRUE if the functions requiring the google R packages complete without error

    +
    +
    +

    Details

    +

This setup is not needed for working with folders/datafiles connected to your computer +via Google Drive Desktop (Mac/Windows), only for reading files directly from google drive
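A hypothetical session, assuming the environment variables described for get_drive_email and get_api_key are set (the email address here is invented):

```r
# authorize the google packages once per session
gdrive_setup(drive_email = "someone@msu.edu")

# or start over and re-authorize if the cached token is stale
gdrive_setup(reset = TRUE)
```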

    +
    + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/get_api_key.html b/docs/reference/get_api_key.html new file mode 100644 index 0000000..89ad881 --- /dev/null +++ b/docs/reference/get_api_key.html @@ -0,0 +1,94 @@ + +check google cloud api key configuration — get_api_key • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

google drive requires an 'API key' set in the cloud project, which you can copy from the cloud console. +How that works is beyond the scope of this help, but it is a 39-character alphanumeric code. This function +checks that the key is in the environment, which can be set in .Renviron. See help for more details

    +
    + +
    +
    get_api_key()
    +
    + + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/get_col_type_from_spec.html b/docs/reference/get_col_type_from_spec.html new file mode 100644 index 0000000..bdb4347 --- /dev/null +++ b/docs/reference/get_col_type_from_spec.html @@ -0,0 +1,110 @@ + +vector of types from column names, in order — get_col_type_from_spec • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

CSVs/sheets created to specs may not be in the order of the specification, but we want to get the column types in the order that the sheet is actually in. +This gets the column types in the order the columns appear in the spreadsheet that is read (in case columns are re-ordered) by checking them one by one. +Requires a spec data frame that must have columns named col_name and col_type

    +
    + +
    +
    get_col_type_from_spec(col_name, spec)
    +
    + +
    +

    Arguments

    + + +
    col_name
    +

    character, vector of column names from the data to look up the formats

    + + +
    spec
    +

data.frame of the specification, with columns 'col_name' to match and 'col_type'

    + +
    +
    +

    Value

    +

    character vector of column types
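A sketch of the lookup with an invented spec and a re-ordered sheet; the column names and type codes are illustrative only:

```r
spec <- data.frame(
  col_name = c("site", "date", "abundance"),
  col_type = c("c",    "D",    "d")
)

# columns as they actually appear in the sheet that was read
sheet_cols <- c("date", "site", "abundance")

get_col_type_from_spec(sheet_cols, spec)
# equivalent lookup in base R:
spec$col_type[match(sheet_cols, spec$col_name)]
```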

    +
    + +
    + +
    + + +
    + +
    +

    Site built with pkgdown 2.2.0.

    +
    + +
    + + + + + + + + diff --git a/docs/reference/get_drive_email.html b/docs/reference/get_drive_email.html new file mode 100644 index 0000000..9b5747a --- /dev/null +++ b/docs/reference/get_drive_email.html @@ -0,0 +1,90 @@ + +pull drive email from environment — get_drive_email • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

simple convenience method for replacing an empty drive email with the value from the environment; called by gdrive_setup

    +
    + +
    +
    get_drive_email(drive_email = NULL)
    +
    + + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/get_gsfile.html b/docs/reference/get_gsfile.html new file mode 100644 index 0000000..7897cf1 --- /dev/null +++ b/docs/reference/get_gsfile.html @@ -0,0 +1,125 @@ + +get a google drive file object given path and share drive — get_gsfile • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

given a google filepath, find it in our shared drive and read it in. If there are multiple files found with the same name on the share drive, throws a warning and reads only the first one it finds (which may not be the most recent one!)

    +
    + +
    +
    get_gsfile(
    +  file_name_or_url,
    +  shared_drive = NULL,
    +  drive_path = NULL,
    +  drive_email = NULL
    +)
    +
    + +
    +

    Arguments

    + + +
    shared_drive
    +

    optional name of the shared drive to look in, will read from the environment 'PROJECT_SHARE_DRIVE' ignored if URL is sent

    + + +
    drive_path
    +

    optional standard path for project files, will read from environment 'PROJECT_SHARE_DRIVE_PATH'; ignored if URL is sent

    + + +
    filepath
    +

    full name of the file (e.g. myfile.csv ), which could include sub-folder (myfiles/myfile.csv) OR google drive URL

    + +
    +
    +

    Value

    +

a gsfile object from the google drive library, useable by other gdrive/gsheet functions

    +
    +
    +

    Details

    +

This is not needed for working with folders/datafiles connected to your computer via Google Drive Desktop (Mac/Windows); it is only for reading files directly from google drive. If you don't have permission to access the share drive, it will not work
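A hedged usage sketch — the drive name, path, and email below are placeholders, and the call assumes you have been granted access to the share drive:

```r
gsfile <- get_gsfile(
  "myfiles/myfile.csv",                # file name, optionally with sub-folder
  shared_drive = "my-project-drive",   # or set PROJECT_SHARE_DRIVE in Renviron
  drive_path   = "data",               # or set PROJECT_SHARE_DRIVE_PATH
  drive_email  = "me@example.edu"
)
# gsfile can then be passed to the other gdrive/gsheet reading functions
```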

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/gfile_modified_time.html b/docs/reference/gfile_modified_time.html new file mode 100644 index 0000000..ab7309a --- /dev/null +++ b/docs/reference/gfile_modified_time.html @@ -0,0 +1,102 @@ + +WIP get time stamp for a particular gfile — gfile_modified_time • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

    WIP get time stamp for a particular gfile

    +
    + +
    +
    gfile_modified_time(gfile)
    +
    + +
    +

    Arguments

    + + +
    gfile
    +

    a file object from google drive

    + +
    +
    +

    Value

    +

    timestamp value

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/gsheet_auth_setup.html b/docs/reference/gsheet_auth_setup.html new file mode 100644 index 0000000..4000507 --- /dev/null +++ b/docs/reference/gsheet_auth_setup.html @@ -0,0 +1,108 @@ + +setup authentication for reading google sheet — gsheet_auth_setup • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

this reads from the environment (or Renviron file) to get configuration details for authenticating to the google sheets service. Note that this is nearly identical to gdrive_setup, but only for google sheets; google sheets has a different API and different read permissions in the cloud console
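A minimal sketch, assuming the email below is yours (it can also come from the environment via get_drive_email()):

```r
# authenticate for google sheets before any sheet reads
ok <- gsheet_auth_setup(drive_email = "me@example.edu")
if (!ok) stop("google sheets authentication failed - check the cloud console setup")
```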

    +
    + +
    +
    gsheet_auth_setup(drive_email = NULL)
    +
    + +
    +

    Arguments

    + + +
    drive_email
    +

your preferred email; can be read from the environment, see get_drive_email

    + +
    +
    +

    Value

    +

TRUE/FALSE indicating whether the authentication/setup was successful

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/index.html b/docs/reference/index.html new file mode 100644 index 0000000..fd18003 --- /dev/null +++ b/docs/reference/index.html @@ -0,0 +1,184 @@ + +Package index • collaboratR + + +
    +
    + + + +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +

    All functions

    +

    +
    +

    aggregate_csvs()

    +

    combine data csvs

    +

    as.Date(<flexible>)

    +

    flexible character to date converter

    +

    errorSaver()

    +

    Function wrapper to capture errors and warnings for storing

    +

    gdrive_client_setup()

    +

    get a google drive 'client' for authentication from env file

    +

    gdrive_setup()

    +

    connect to your google drive account, required set-up for using the google drive packages

    +

    get_api_key()

    +

    check google cloud api key configuration

    +

    get_col_type_from_spec()

    +

    vector of types from column names, in order

    +

    get_drive_email()

    +

    pull drive email from environment

    +

    get_gsfile()

    +

    get a google drive file object given path and share drive

    +

    gfile_modified_time()

    +

    WIP get time stamp for a particular gfile

    +

    gsheet_auth_setup()

    +

    setup authentication for reading google sheet

    +

    read_data_csv()

    +

    read in a CSV from L0 folder, using specification spec

    +

    read_data_sheet()

    +

read in a google sheet formatted for a specific project, similar to how the readr package works for read_csv with a spec sheet. Data sheet features

    • allows you to skip top row(s) of data - allows for sheets that have non-data rows that are descriptive

    • numeric columns can have numeric strings and those will be converted to NAs, e.g. can indicate "NA" in numeric cells

    • requires a spec sheet that uses names per type_converter_fun()

    +

    read_gcsv()

    +

    download a CSV file from the project google shared drive and read into memory

    +

    read_gsheet_by_url()

    +

    read data in a google sheet from a URL and tab number

    +

    read_url_list()

    +

    read in the table of google sheet URLs with id

    +

    read_validate_and_save()

    +

    given a URL and params, read, validate and save a CSV

    +

    remove_comment_line()

    +

    remove line 2 from a csv file, used by data-entry for column directions/description.

    +

    spec_to_readr_col_types()

    +

    convert our specification format to something useable by readr::read_csv

    +

    type_code_to_readr_code()

    +

    convert a type name to a readr convert code

    +

    type_converter_fun()

    +

    code to conversion function mapping

    +

    validate_all()

    +

read the list of urls; for each, read, confirm the column format, validate, and save to CSV (in development mode, run devtools::load_all() first so all the collaboratR functions are loaded)

    +

    validate_data()

    +

    validate data df

    +

    validate_data_columns()

    +

    validate columns against data definition

    +

    validate_from_file()

    +

    convenience to create validation results from a yaml file

    + + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_data_csv.html b/docs/reference/read_data_csv.html new file mode 100644 index 0000000..da08c2c --- /dev/null +++ b/docs/reference/read_data_csv.html @@ -0,0 +1,108 @@ + +read in a CSV from L0 folder, using specification spec — read_data_csv • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

reads in a CSV using the readr package, but ensures the data still matches a specification
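A sketch with a hypothetical file path and spec; per the Value section, a non-data.frame result signals a problem:

```r
# hypothetical spec in the package's specification format
spec <- data.frame(
  col_name = c("site", "biomass"),
  data_str = c("character", "double")
)
df <- read_data_csv("L0/biomass.csv", spec.df = spec)
if (!is.data.frame(df)) message("file not found or validation issues")
```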

    +
    + +
    +
    read_data_csv(csv_file_path, spec.df = NULL)
    +
    + +
    +

    Arguments

    + + +
    csv_file_path
    +

    character path to csv file

    + + +
    spec.df
    +

    optional data frame list of data specifications

    + +
    +
    +

    Value

    +

    data.frame, or NA if the file is not found or if there are validation issues

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_data_sheet.html b/docs/reference/read_data_sheet.html new file mode 100644 index 0000000..a3a3511 --- /dev/null +++ b/docs/reference/read_data_sheet.html @@ -0,0 +1,142 @@ + +read in google sheet for formatted for specific project, similar to how the readr package works for read_csv with a spec sheet data sheet features allows you to skip top row(s) of data - allows for sheets that have non-data rows that are descriptive numeric columns can have numeric strings and those will be converted to NAs, e.g. can indicate "NA" in numeric cells requires a spec sheet that uses names per type_converter_fun() — read_data_sheet • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

read in a google sheet formatted for a specific project, similar to how the readr package works for read_csv with a spec sheet. Data sheet features:

    • allows you to skip top row(s) of data - allows for sheets that have +non-data rows that are descriptive

    • numeric columns can have numeric strings and those will be converted to +NAs, e.g. can indicate "NA" in numeric cells

    • requires a spec sheet that uses names per type_converter_fun()

    + +
    +
    read_data_sheet(
    +  gurl,
    +  tab_name,
    +  spec.df,
    +  rows_to_skip = 1,
    +  use_readr = TRUE,
    +  quiet = TRUE
    +)
    +
    + +
    +

    Arguments

    + + +
    gurl
    +

    google sheet url or ID per googlesheets package

    + + +
    tab_name
    +

    sheet tab name or number, forward to sheet param in googlesheets4::read_sheet

    + + +
    spec.df
    +

    data.frame that is the specification, must have columns col_name and data_str

    + + +
    rows_to_skip
    +

integer, default 1; number of rows to skip, not including col names. Some sheets have non-data or documentation in the first rows. Set to 0 to not skip any rows

    + + +
    use_readr
    +

    logical default TRUE, use readr::type_convert() to validate

    + +
    +
    +

    Value

    +

    data.frame or NA if there is a problem
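Putting the arguments together — the URL and spec below are placeholders, and the call assumes sheets authentication is already set up (see gsheet_auth_setup):

```r
# hypothetical spec with the required col_name and data_str columns
spec <- data.frame(
  col_name = c("site", "biomass"),
  data_str = c("character", "double")
)
df <- read_data_sheet(
  gurl = "https://docs.google.com/spreadsheets/d/<sheet-id>/edit",
  tab_name = 1,
  spec.df = spec,
  rows_to_skip = 1   # skip the descriptive row under the column names
)
```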

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_gcsv.html b/docs/reference/read_gcsv.html new file mode 100644 index 0000000..8b1ca73 --- /dev/null +++ b/docs/reference/read_gcsv.html @@ -0,0 +1,131 @@ + +download a CSV file from the project google shared drive and read into memory — read_gcsv • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

NOTE: this can import a gsheet as CSV, but only the first tab. Use read_gsheet_by_url() for multi-tab sheets. Reads either a CSV file or a gsheet doc from a shared drive into a data frame. If there are multiple files found with the same name on the share drive, throws a warning and reads only the first one it finds (which may not be the most recent one!). This is not needed for working with folders/datafiles connected to your computer via Google Drive Desktop (Mac/Windows), only for reading files directly from the Internet via URL. Requires access to a share drive
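A usage sketch; the file and drive names are placeholders, and the environment-variable fallbacks are described in get_gsfile:

```r
biomass <- read_gcsv(
  "myfile.csv",
  shared_drive = "my-project-drive",
  has_comment_line = TRUE   # strip the line-2 directions row before parsing
)
```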

    +
    + +
    +
    read_gcsv(
    +  file_name_or_url,
    +  shared_drive = NULL,
    +  drive_path = NULL,
    +  has_comment_line = TRUE
    +)
    +
    + +
    +

    Arguments

    + + +
    shared_drive
    +

name of the shared drive to look in, default NULL; passed to get_gsfile, which reads the path from the environment (see get_gsfile)

    + + +
    drive_path
    +

    common project path to use, optional, passed to get_gsfile which reads from environment (see get_gsfile)

    + + +
    has_comment_line
    +

default TRUE; does the google sheet have comments/directions on line 2 that need to be stripped

    + + +
    filepath
    +

    full name of the CSV file (e.g. myfile.csv ) with optional partial path.

    + +
    +
    +

    Value

    +

    a data.frame as returned by read.csv, no row names.

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_gsheet_by_url.html b/docs/reference/read_gsheet_by_url.html new file mode 100644 index 0000000..b3d2a1d --- /dev/null +++ b/docs/reference/read_gsheet_by_url.html @@ -0,0 +1,133 @@ + +read data in a google sheet from a URL and tab number — read_gsheet_by_url • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

Note: this is only for google sheets, not CSVs or other data files. It can read either type of data sheet (e.g. either tab) and returns the contents as a data frame. To remove the 2nd "description" row, it downloads as CSV, removes the line, and reads it back in. This is a generic function and does not use a specification file; see read_data_sheet() for that.
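A sketch with a placeholder URL; the OAuth and cloud-console setup noted in Details are assumed to be in place:

```r
gurl <- "https://docs.google.com/spreadsheets/d/<sheet-id>/edit"
tab1 <- read_gsheet_by_url(gurl, sheet_id = 1, has_description_line = TRUE)
```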

    +
    + +
    +
    read_gsheet_by_url(
    +  gurl,
    +  sheet_id = 1,
    +  has_description_line = TRUE,
    +  drive_email = NULL
    +)
    +
    + +
    +

    Arguments

    + + +
    gurl
    +

    url of a google sheet (and only a google sheet, not doc)

    + + +
    sheet_id
    +

    optional the name or number of the tab (1, 2), defaults to 1, see read_sheet() fn

    + + +
    has_description_line
    +

    does the sheet have row 2 as description of data, if TRUE (default), remove it

    + + +
    drive_email
    +

    optional drive email, required if you have not already logged in or don't have it set in Env. See gdrive_setup()

    + +
    +
    +

    Value

    +

    data frame with contents of the tab

    +
    +
    +

    Details

    +

    requires Oauth and google cloud console setup

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_url_list.html b/docs/reference/read_url_list.html new file mode 100644 index 0000000..de5f24b --- /dev/null +++ b/docs/reference/read_url_list.html @@ -0,0 +1,96 @@ + +read in the table of google sheet URLs with id — read_url_list • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

the google sheet requires two columns, one for the id and one for the url for that id. The names of the columns can be anything, but must then be given via the id_column and url_column parameters
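A sketch with a placeholder URL; the column names shown are the defaults:

```r
urls <- read_url_list(
  gurl = "https://docs.google.com/spreadsheets/d/<list-sheet-id>/edit",
  id_column  = "id",
  url_column = "url"
)
# e.g. urls$url[urls$id == "site_a"] gives the data-sheet URL for one id
```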

    +
    + +
    +
    read_url_list(gurl, id_column = "id", url_column = "url", drive_email = NULL)
    +
    + +
    +

    Value

    +

    dataframe with columns from google sheet with at least two columns 'id' and 'url' plus other columns in original sheet

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/read_validate_and_save.html b/docs/reference/read_validate_and_save.html new file mode 100644 index 0000000..68cdd2a --- /dev/null +++ b/docs/reference/read_validate_and_save.html @@ -0,0 +1,90 @@ + +given a URL and params, read, validate and save a CSV — read_validate_and_save • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

filename <- read_validate_and_save(url, tab_name = 'biomass_data', spec.df = commassembly_rules_biomass_str)

    +
    + +
    +
    read_validate_and_save(url, tab_name, spec.df, csv_folder = "../L0")
    +
    + + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/remove_comment_line.html b/docs/reference/remove_comment_line.html new file mode 100644 index 0000000..3760eff --- /dev/null +++ b/docs/reference/remove_comment_line.html @@ -0,0 +1,112 @@ + +remove line 2 from a csv file, used by data-entry for column directions/description. — remove_comment_line • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

this will read all lines of a text file (which can take a long time/memory for a long file), remove some of the lines by number, and write the rest to disk. If no new_file_path param is sent, it will overwrite the original file, which will be lost. It writes files with standard POSIX (linux/mac) line endings for now
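A usage sketch; writing to a new path (rather than overwriting) keeps the original file:

```r
remove_comment_line(
  "L0/biomass.csv",
  line_numbers  = 2,                       # the directions/description row
  new_file_path = "L0/biomass_clean.csv"   # omit to overwrite in place
)
```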

    +
    + +
    +
    remove_comment_line(local_file_path, line_numbers = 2, new_file_path = NULL)
    +
    + +
    +

    Arguments

    + + +
    local_file_path
    +

    path to text file on your disk, relative or absolute

    + + +
    line_numbers
    +

default 2; a single number or a vector/range of numbers to exclude (e.g. 2:5)

    + + +
    new_file_path
    +

    optional new name to write to, by default will use the local_file_path and overwrite

    + +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/spec_to_readr_col_types.html b/docs/reference/spec_to_readr_col_types.html new file mode 100644 index 0000000..614b183 --- /dev/null +++ b/docs/reference/spec_to_readr_col_types.html @@ -0,0 +1,102 @@ + +convert our specification format to something useable by readr::read_csv — spec_to_readr_col_types • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

    convert our specification format to something useable by readr::read_csv

    +
    + +
    +
    spec_to_readr_col_types(spec.df)
    +
    + +
    +

    Arguments

    + + +
    spec.df
    +

    dataframe with columns 'col_name' and 'data_str'

    + +
    +
    +

    Value

    +

    list of col names and type abbreviations, per read_csv() docs
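With a hypothetical spec, the result should be a named list suitable for read_csv()'s col_types argument; the abbreviations indicated assume the usual readr codes:

```r
spec <- data.frame(
  col_name = c("site", "year", "biomass"),
  data_str = c("character", "integer", "double")
)
spec_to_readr_col_types(spec)
# expected shape: list(site = "c", year = "i", biomass = "d")
```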

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/type_code_to_readr_code.html b/docs/reference/type_code_to_readr_code.html new file mode 100644 index 0000000..b96dbc9 --- /dev/null +++ b/docs/reference/type_code_to_readr_code.html @@ -0,0 +1,106 @@ + +convert a type name to a readr convert code — type_code_to_readr_code • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

readr uses convert codes - see vignette("readr"). This function allows for named types and will convert those to the 1-letter codes used by readr

    +
    + +
    +
    type_code_to_readr_code(type_code)
    +
    + +
    +

    Arguments

    + + +
    type_code
    +

    single letter or string of type

    + +
    +
    +

    Value

    +

    character single letter code

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/type_converter_fun.html b/docs/reference/type_converter_fun.html new file mode 100644 index 0000000..43a9373 --- /dev/null +++ b/docs/reference/type_converter_fun.html @@ -0,0 +1,110 @@ + +code to conversion function mapping — type_converter_fun • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

given a character code, return the conversion function to use on a value. This is useful for spreadsheets/csvs that don't adapt well to conversion (even with formats specified), and hence manual conversion from character is needed. Yes, this is probably a re-make of what's in read.csv or read_csv, but those weren't amenable to this specific use case
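A sketch of the intended use, converting one character column at a time:

```r
f <- type_converter_fun("Date")   # per the Value section, as.Date.flexible
g <- type_converter_fun("i")      # first letter of 'integer' is also accepted
g(c("1", "2", "n/a"))             # unparseable strings become NA (with a warning)
```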

    +
    + +
    +
    type_converter_fun(type_code)
    +
    + +
    +

    Arguments

    + + +
    type_code
    +

character value indicating type; one of character, integer, factor, double, numeric, Date (capital D), or the first letter of any of these

    + +
    +
    +

    Value

    +

one of the converter functions (as.integer, etc.). For Dates, returns the custom as.Date.flexible function defined above

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/validate_all.html b/docs/reference/validate_all.html new file mode 100644 index 0000000..bf90fe5 --- /dev/null +++ b/docs/reference/validate_all.html @@ -0,0 +1,96 @@ + +read the list of urls, read, confirm column format, validate and when using in development mode, make sure do run devtools::load_all() to get all the collaboratR functions loaded save to CSV — validate_all • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

read the list of urls; for each, read, confirm the column format, validate, and save to CSV. When using in development mode, make sure to run devtools::load_all() first so that all the collaboratR functions are loaded
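In development mode the note above suggests a sequence like this (the list-sheet URL is a placeholder):

```r
devtools::load_all()   # make the collaboratR functions available
urls <- read_url_list("https://docs.google.com/spreadsheets/d/<list-sheet-id>/edit")
validate_all(urls.df = urls)
```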

    +
    + +
    +
    validate_all(urls.df = NULL, drive_email = NULL)
    +
    + + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/validate_data.html b/docs/reference/validate_data.html new file mode 100644 index 0000000..ecf5ad8 --- /dev/null +++ b/docs/reference/validate_data.html @@ -0,0 +1,106 @@ + +validate data df — validate_data • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

use the validate package to check a data frame against a set of rules

    +
    + +
    +
    validate_data(data_df, spec_df, validation_rules)
    +
    + +
    +

    Arguments

    + + +
    data_df
    +

    dataframe of biomass data

    + + +
    spec_df
    +

    data frame of table specification

    + + +
    validation_rules
    +

    file with validation rules in it.

    + +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/validate_data_columns.html b/docs/reference/validate_data_columns.html new file mode 100644 index 0000000..a8f0699 --- /dev/null +++ b/docs/reference/validate_data_columns.html @@ -0,0 +1,102 @@ + +validate columns against data definition — validate_data_columns • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

    validate columns against data definition

    +
    + +
    +
    validate_data_columns(data_df, spec_df)
    +
    + +
    +

    Arguments

    + + +
    data_df
    +

    data frame following data definition

    + + +
    spec_df
    +

    data frame of table specification

    + +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/reference/validate_from_file.html b/docs/reference/validate_from_file.html new file mode 100644 index 0000000..06d3447 --- /dev/null +++ b/docs/reference/validate_from_file.html @@ -0,0 +1,106 @@ + +convenience to create validation results from a yaml file — validate_from_file • collaboratR + + +
    +
    + + + +
    +
    + + +
    +

    convenience to create validation results from a yaml file
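A sketch assuming a minimal yaml rules file in the validate package's format:

```r
# rules.yaml, for example:
#   rules:
#   - expr: biomass >= 0
#   - expr: "!is.na(site)"
results <- validate_from_file(my_data_df, file = "rules.yaml")
summary(results)   # per-rule pass/fail counts from validate::confront
```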

    +
    + +
    +
    validate_from_file(data_df, file)
    +
    + +
    +

    Arguments

    + + +
    file
    +

    yaml formatted file with rules for the validate package

    + + +
data_df
    +

    data frame of data

    + +
    +
    +

    Value

    +

the output from the confront function

    +
    + +
    + +
    + + +
    + +
    +


    +
    + +
    + + + + + + + + diff --git a/docs/sitemap.xml b/docs/sitemap.xml new file mode 100644 index 0000000..f5da890 --- /dev/null +++ b/docs/sitemap.xml @@ -0,0 +1,38 @@ + +/404.html +/LICENSE-text.html +/LICENSE.html +/articles/google_sheets_api.html +/articles/index.html +/articles/process_overview.html +/articles/validating_commassemblyrules.html +/authors.html +/index.html +/reference/aggregate_csvs.html +/reference/as.Date.flexible.html +/reference/errorSaver.html +/reference/gdrive_client_setup.html +/reference/gdrive_setup.html +/reference/get_api_key.html +/reference/get_col_type_from_spec.html +/reference/get_drive_email.html +/reference/get_gsfile.html +/reference/gfile_modified_time.html +/reference/gsheet_auth_setup.html +/reference/index.html +/reference/read_data_csv.html +/reference/read_data_sheet.html +/reference/read_gcsv.html +/reference/read_gsheet_by_url.html +/reference/read_url_list.html +/reference/read_validate_and_save.html +/reference/remove_comment_line.html +/reference/spec_to_readr_col_types.html +/reference/type_code_to_readr_code.html +/reference/type_converter_fun.html +/reference/validate_all.html +/reference/validate_data.html +/reference/validate_data_columns.html +/reference/validate_from_file.html + + From 4a9e03d3deed605b15d731fc4c4cac2f19076703 Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 15:03:09 -0400 Subject: [PATCH 06/13] details about Gdrive in readme, minor editing --- README.Rmd | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/README.Rmd b/README.Rmd index aa06463..3b1cca6 100644 --- a/README.Rmd +++ b/README.Rmd @@ -23,19 +23,28 @@ knitr::opts_chunk$set( ### Motivation -Performaing a Meta-analysis requires collating and harmonizing data extracted +Performing a Meta-analysis requires collating and harmonizing data extracted from many different sources but most frequently scientific publications. 
Collaborative meta-analysis requires a group of scientists to collectively develop
and agree on their goals, type of data extracted, format of those data, and to
do so extremely consistently across papers. This package helps to support that
-efficiently by
+efficiently by using easy-to-use Google Sheets for data definition and data entry.
+The workflow in this project can read directly from Google sheets into CSVs and
+validate the structure of a google sheet as well as the data using the Validate package.
 
-This R package is part of 3 repositories that support the data entry, validation and accumulation of a meta-analysis for the commRULES project.
+
+Originally, this R package was part of 3 repositories that support the
+data entry, validation and accumulation of a meta-analysis for a research project
+sponsored by MSU IBEEM.
 
-1. collaboratR: commRULES data management code for L0 and L0->L1 layer in EDI framework
-1. data: version controlled data collection for tracking provenance using git, this is the L0 and L1 layers in the EDI framework
-1. analysis: R code for reproducible data analysis , L1->L2 layers in EDI framework
+
+1. collaboratR: data management code for L0 and L0->L1 layer in EDI framework
+1. data: version controlled data collection for tracking provenance using git,
+   this is the L0 and L1 layers in the EDI framework. The collaboratR package
+   assists with data transfer and validation from Google drive into the data repository.
+1. analysis: R code for reproducible data analysis, L1->L2 layers in EDI framework,
+   using data in the data repository.
 
 ## Installation - Package
 
@@ -48,6 +57,20 @@ This R package is part of 3 repositories that support the data entry, validation
 *additional packages are required to build the package and this website, source the script*
 `R/install_dev_packages.R`
 
+### Installation/Testing
+
+Google drive access in this package is set to interactive mode.
To run any building, installing or checking in R, +you must first manually connect to google drive, which must be set-up properly first. + +See the vignette in this package "Google Sheets API setup using Google Cloud", +or in this source code see [Google Sheets Vignette RMD](vignettes/google_sheets_api.Rmd) + +Once set-up you may have to log-in manually prior to running tests or checks, +use + +```r +source("R/gdrive.R") + ## Data Google Drive Project Setup See the Vignette ["Google Sheets API setup using Google Cloud"](vignettes/google_sheets_api.Rmd) From 8749f80c7bc9a457f3f3bcfdc680a06a0de99893 Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 15:04:02 -0400 Subject: [PATCH 07/13] more dependency management (package updates) --- renv.lock | 1 - renv/activate.R | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/renv.lock b/renv.lock index f164bfc..b8383ba 100644 --- a/renv.lock +++ b/renv.lock @@ -261,7 +261,6 @@ "cli", "covr", "decor", - "desc", "ggplot2", "glue", "knitr", diff --git a/renv/activate.R b/renv/activate.R index 31a6969..8432006 100644 --- a/renv/activate.R +++ b/renv/activate.R @@ -3,7 +3,7 @@ local({ # the requested version of renv version <- "1.1.8" - attr(version, "md5") <- NULL + attr(version, "md5") <- "be71f6cdf4c947eebbffbf9fd2489319" attr(version, "sha") <- NULL # the project directory From 13a61ea45f8e28c01143e6cb91002f84405a6b44 Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 15:17:52 -0400 Subject: [PATCH 08/13] add authors/collaborators and link to website --- README.Rmd | 17 +++++++++++- README.md | 79 +++++++++++++++++++++++++++++++++++++++--------------- 2 files changed, 73 insertions(+), 23 deletions(-) diff --git a/README.Rmd b/README.Rmd index 3b1cca6..53a36ca 100644 --- a/README.Rmd +++ b/README.Rmd @@ -15,7 +15,22 @@ knitr::opts_chunk$set( # collaboratR -### A package to support collaborative meta-analysis for [MSU IBEEM](https://ibeem.msu.edu) +**A package to support collaborative 
meta-analysis for [MSU IBEEM](https://ibeem.msu.edu)** + +https://ibeem-msu.github.io/collaboratR/ + +### Authors: + +- Patrick S Bills +- Ashwini Ramesh +- Laís Petri +- Phoebe Lehman Zarnetske, PI and Director, IBEEM + +### Contributors: + +- Kelly Kapsar, Data Scientist, IBEEM +- Alejandra Martinez Blancas +- Amar Deep Tiwari [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental) diff --git a/README.md b/README.md index 30a8414..4584c70 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,23 @@ # collaboratR -### A package to support collaborative meta-analysis for [MSU IBEEM](https://ibeem.msu.edu) +**A package to support collaborative meta-analysis for [MSU +IBEEM](https://ibeem.msu.edu)** + + + +### Authors: + +- Patrick S Bills +- Ashwini Ramesh +- Laís Petri +- Phoebe Lehman Zarnetske, PI and Director, IBEEM + +### Contributors: + +- Kelly Kapsar, Data Scientist, IBEEM +- Alejandra Martinez Blancas +- Amar Deep Tiwari @@ -13,24 +29,32 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h ### Motivation -Performaing a Meta-analysis requires collating and harmonizing data +Performing a Meta-analysis requires collating and harmonizing data extracted from many different sources but most frequently scientific publications. Collaborative meta-analysis requires a group of scientists to collectively develop and agree on their goals, type of data extracted, format of the those data, and to do so extremely consistently across -papers. This package helps to support that efficiently by +papers. This package helps to support that efficiently by the +easy-to-use Google Sheets for data definition and data entry. The +workflow in this project can read directly from Google sheets into CSVs +and validate the structure of a google sheet as well as the data using +the Validate package. 
+ + -This R package is part of 3 repositories that support the data entry, -validation and accumulation of a meta-analysis for the commRULES -project. +Originally, this This R package was part of 3 repositories that support +the data entry, validation and accumulation of a meta-analysis for a +research project sponsored by MSU IBEEM -1. collaboratR: commRULES data management code for L0 and L0-\>L1 layer - in EDI framework +1. collaboratR: data management code for L0 and L0-\>L1 layer in EDI + framework 2. data: version controlled data collection for tracking provenance - using git, this is the L0 and L1 layers in the EDI framework + using git, this is the L0 and L1 layers in the EDI framework. the + collaboratR package assists with data transfer and validation from + Google drive into the data repository. 3. analysis: R code for reproducible data analysis , L1-\>L2 layers in - EDI framework + EDI framework, using data in the data repository. ## Installation - Package @@ -49,25 +73,36 @@ project. *additional packages are required to build the package and this website, source the script* `R/install_dev_packages.R` +### Installation/Testing + +Google drive in this package is set to interaction. To run any building, +installing or checking in R, you must first manually connect to google +drive, which must be set-up properly first. 
See the vignette in this package “Google Sheets API setup using Google
+Cloud”, or in this source code see [Google Sheets Vignette
+RMD](vignettes/google_sheets_api.Rmd)
+
+Once set-up, you may have to log-in manually prior to running tests or
+checks; use
+
+``` r
+source("R/gdrive.R")
+```
+
 ## Data Google Drive Project Setup
 
-See the Vignette [“Google Sheets API setup using Google
-Cloud”](vignettes/google_sheets_api.Rmd) for details about setting up
-google sheets connection with R, which requires a google cloud project
-in your institution
+See the Vignette ["Google Sheets API setup using Google Cloud"](vignettes/google_sheets_api.Rmd)
+for details about setting up google sheets connection with R, which requires
+a google cloud project in your institution
 
-Note that for safety, this package only reads from google drive and it
-never writes to google drive. Therefore it only requests ‘read-only’
-access.
+Note that for safety, this package only reads from google drive and it never
+writes to google drive. Therefore it only requests 'read-only' access.
 
 ## Usage
 
-When reading in data sheets, you provide a URL for a datasheet that
-exists in any folder that you have access to. The system will attempt to
-log you into to google drive and requests your permission for this code
-to access files on your behalf.
+When reading in data sheets, you provide a URL for a datasheet that exists in any folder that you have access to. The system will attempt to log you in to google drive and request your permission for this code to access files on your behalf. 
-``` r +```R gurl<- 'https://docs.google.com/spreadsheets/d/1w6sYozjybyd53eeiTdigrRTonteQW2KXUNZNmEhQyM8/edit?gid=0#gid=0' study_data<- read_commrules_sheet(gurl) ``` From e9895795abbc0259093b2550c3de51780e0bc43d Mon Sep 17 00:00:00 2001 From: Pat Bills Date: Fri, 13 Mar 2026 15:18:16 -0400 Subject: [PATCH 09/13] update website structure for github pages --- .../validating_commassemblyrules.html | 1147 ++++++++--------- docs/pkgdown.yml | 2 +- docs/sitemap.xml | 38 - 3 files changed, 516 insertions(+), 671 deletions(-) delete mode 100644 docs/sitemap.xml diff --git a/docs/articles/validating_commassemblyrules.html b/docs/articles/validating_commassemblyrules.html index da28c97..d679b77 100644 --- a/docs/articles/validating_commassemblyrules.html +++ b/docs/articles/validating_commassemblyrules.html @@ -1,23 +1,8 @@ - - - - - - -Validating Community Assembly Rules Project Data • collaboratR - - - - - - -Validating Community Assembly Rules Project Data • collaboratR - - +
    @@ -37,8 +22,7 @@