From 13de627528f6711d0ebead453296dc61cec6c992 Mon Sep 17 00:00:00 2001 From: Alex Reinhart Date: Mon, 3 Mar 2025 11:45:11 -0500 Subject: [PATCH] Update CTIS documentation to point to ICPSR Now that the CTIS data is archived with ICPSR, we can point there and deprecate pages describing access to microdata via CMU. ICPSR also archives the contingency tables, including a user guide, so we can remove the redundant contingency table documentation and point to ICPSR instead. --- docs/symptom-survey/collaboration-revision.md | 5 + docs/symptom-survey/contingency-tables.md | 178 ++---------------- docs/symptom-survey/data-access.md | 44 ++--- docs/symptom-survey/end-of-survey.md | 6 +- docs/symptom-survey/index.md | 66 ++++--- docs/symptom-survey/server-access.md | 7 + docs/symptom-survey/survey-files.md | 8 + 7 files changed, 92 insertions(+), 222 deletions(-) diff --git a/docs/symptom-survey/collaboration-revision.md b/docs/symptom-survey/collaboration-revision.md index 6cc65e670..03efee819 100644 --- a/docs/symptom-survey/collaboration-revision.md +++ b/docs/symptom-survey/collaboration-revision.md @@ -2,10 +2,15 @@ title: Collaboration and Survey Revision parent: inactive COVID-19 Trends and Impact Survey nav_order: 1 +nav_exclude: true --- # Collaboration and Survey Revision +
Update: CTIS data collection has ended. We are no longer +revising the survey or hosting collaboration meetings.
+ Delphi continues to revise the COVID-19 Trends and Impact Survey (CTIS) instruments in order to prioritize items that have the greatest utility for the response to the COVID-19 pandemic. We conduct revisions in collaboration with diff --git a/docs/symptom-survey/contingency-tables.md b/docs/symptom-survey/contingency-tables.md index a842a02cd..f92ee1ef4 100644 --- a/docs/symptom-survey/contingency-tables.md +++ b/docs/symptom-survey/contingency-tables.md @@ -8,23 +8,32 @@ nav_order: 4 {: .no_toc} This documentation describes the fine-resolution contingency tables produced by -grouping [US COVID-19 Trends and Impact Survey (CTIS)](./index.md) individual responses by various -self-reported demographic features. +grouping [US COVID-19 Trends and Impact Survey (CTIS)](./index.md) individual +responses by various self-reported demographic features. The contingency tables +are publicly available for download as a complete set from the Inter-university +Consortium for Political and Social Research (ICPSR): -* [Weekly files](https://www.cmu.edu/delphi-web/surveys/weekly-rollup/) -* [Monthly files](https://www.cmu.edu/delphi-web/surveys/monthly-rollup/) +* Reinhart, Alex, Mejia, Robin, and Tibshirani, Ryan J. COVID-19 Trends and + Impact Survey (CTIS), United States, 2020-2022. Inter-university Consortium + for Political and Social Research [distributor], 2025-02-28. + + +Select the dataset "DS0 Study-Level Files" to download the complete set of +contingency tables and all survey documentation files, including the codebooks +and an Aggregate Contingency Table User Guide that describes the data +processing and file formats, and includes example R code. These contingency tables provide granular breakdowns of COVID-related topics such as vaccine uptake and acceptance. 
Compatible tables are also available for the [UMD Global CTIS](https://covidmap.umd.edu/) for more than 100 countries and -territories worldwide, through [UMD's -website](https://covidmap.umd.edu/umdcsvs/Contingency_Tables/). +territories worldwide, [through +ICPSR](https://www.icpsr.umich.edu/web/ICPSR/studies/39206). -These tables are more detailed than the [coarse aggregates reported in the COVIDcast Epidata API](../api/covidcast-signals/fb-survey.md), which are grouped +These tables are more detailed than the [coarse aggregates reported in the +COVIDcast Epidata API](../api/covidcast-signals/fb-survey.md), which are grouped only by geographic region. [Individual response data](survey-files.md) for the -survey is available, but only to academic or nonprofit researchers who sign a -Data Use Agreement, whereas these contingency tables are available to the -general public. +survey is available, but only to researchers who request restricted data access +via ICPSR, whereas these contingency tables are available to the general public. Please see our survey [credits](index.md#credits) and [citation information](index.md#citing-the-survey) for information on how to cite this data if you use it in a publication. @@ -32,152 +41,3 @@ for information on how to cite this data if you use it in a publication. Our [Data and Sampling Errors](problems.md) documentation lists important updates for data users, including corrections to data or updates on data processing delays. - -## Table of contents -{: .no_toc .text-delta} - -1. TOC -{:toc} - -## Available Data - -We currently provide data files at several levels of geographic and temporal -aggregation. The reason for this is that each file is filtered to only include -estimates for a particular group if that group includes 100 or more responses. -Providing several levels of granularity allows us to provide coverage for a -variety of use cases. 
For example, users who need the most up-to-date data or -are interested in time series analysis should use the weekly files, while -those who want to study groups with smaller sample sizes should use the -monthly files. Because monthly aggregates include more responses, they have -lower missingness when grouping by several features at a time. - -* [Weekly files](https://www.cmu.edu/delphi-web/surveys/weekly-rollup/) -* [Monthly files](https://www.cmu.edu/delphi-web/surveys/monthly-rollup/) - -Files contain all time periods for a given period type-aggregation -type combination. - -Individual CSVs containing a single [week](https://www.cmu.edu/delphi-web/surveys/weekly/) or [month](https://www.cmu.edu/delphi-web/surveys/monthly/) for a given aggregation type are also available. - -### Dates - -The included files provide estimates for various metrics of interest over a -period of either a full epiweek (or [MMWR -week](https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf), a -standardized numbering of weeks throughout the year) or a full calendar month. - -Note: If a survey item was introduced in the middle of an aggregation period, -derived indicators will be included in aggregations for that period but will -only use a partial week or month of data. - -### Regions - -At the moment, only nation-wide and state groupings are available. - -Facebook only invites users to take the survey if they appear, based on -attributes in their Facebook profiles, to reside in the 50 states or -Washington, DC. Puerto Rico is sampled separately as part of the -[international version of the survey](https://covidmap.umd.edu/). If Facebook -believes a user qualifies for the survey, but the user then replies that they -live in Puerto Rico or another US territory, we do not include their response -in the aggregations. - -### Privacy - -The aggregates are filtered to only include estimates for a particular group -if that group includes 100 or more responses. 
Especially in the weekly -aggregates, many of the state-level groups have been filtered out due to low -sample size. In such cases, files that group by a single demographic of -interest will likely provide more coverage. - -## File Format - -### Naming - -"Rollup" files containing all time periods for a given period type-aggregation -type combination have names of the form: - - {period_type}_{geo_type}_{aggregation_type}.csv.gz - -Unless noted otherwise, the time period is always a complete month -(`period_type` = `monthly`) or epiweek (`period_type` = `weekly`). `geo_type` is -the geographic level responses were aggregated over. `aggregation_type` is a -concatenated list of other grouping variables used, ordered alphabetically. -Values for variables used in file naming align with those within files as -specified in the column section below. - -### Columns - -Within a CSV, the first few columns store metadata of the aggregation: - -| Column | Description | -| --- | --- | -| `survey_geo` | Survey geography ("US") | -| `period_start` | Date (yyyyMMdd) of first day of time period used in aggregation, in the Pacific time zone (UTC - 7) | -| `period_end` | Date of last day of time period used in aggregation | -| `period_val` | Month or week number | -| `geo_type` | Geography type ("state", "nation") | -| `aggregation_type` | Concatenated list of grouping variables, ordered alphabetically | -| `country` | Country name ("United States") | -| `ISO_3` | Three-letter ISO country code ("USA") | -| `GID_0` | GADM level 0 ID | -| `state` | State name; "Overall" if aggregation not grouped at the state level | -| `GID_1` | GADM level 1 ID | -| `state_fips` | State FIPS code; `NA` if aggregation not grouped at the state level | -| `county` | County name; "Overall" if aggregation not grouped at the county level | -| `county_fips` | County FIPS code; `NA` if aggregation not grouped at the county level | -| `issue_date` | Date on which estimates were generated | - -These are followed 
by the grouping variables used in the aggregation, ordered -alphabetically, and the indicators. Each indicator reports four columns -(unrounded): - -* `val_`: the main value of interest, e.g., percent, average, or - count, estimated using the [survey weights](weights.md) to better match state - demographics -* `se_`: the standard error of `val_` -* `sample_size_`: the number of survey responses used to - calculate `val_` -* `represented_`: the number of people in the population that - `val_` represents over all days in the given time period. This - is the sum of [survey weights](./weights.md) for all survey responses - used. - -All aggregates using the same set of grouping variables appear in a single CSV. - -### Missing Values - -Grouping variables (including region) will be missing (`NA`) to represent -respondents who provided one or more responses to survey items used for -indicators (e.g., vaccine uptake) but who did not provide a response to the -survey item used for the particular grouping variable. For example, if -grouping by gender, we would report the groups: male, female, other, and `NA`, -respondents who did not provide a response to the gender question. - -For a given respondent group (25-34 year old healthcare workers in Nebraska, -e.g.) sample size can vary by indicator because of the survey display logic. -For example, all respondents are asked if they have received a COVID-19 -vaccination (item V1), but only those who say they *have* are asked how many -doses they have received (item V2). This means that the sample size for V2 is -smaller than that for V1. Because indicators are [censored](#privacy) -individually, it is possible that V1-derived indicators will be reported for a -given group while V2-derived indicators are not. In this case, the V2-derived -indicator columns will be marked as missing (`NA`) for that group. - -## Indicators - -
Indicator -codebook: Our contingency table -codebook (CSV) lists all indicators available in the US contingency tables -for download, and specifies the survey questions on which they are based. See -the survey instrument codebook for the full text of -all questions.
- -The files contain [weighted estimates](../api/covidcast-signals/fb-survey.md#survey-weighting-and-estimation) -of the percent of respondents who fulfill one or several criteria. Estimates are -broken out by state, age, gender, race, ethnicity, occupation, and health -conditions. - -We plan to expand the list of indicators based on research needs; if you have a -public health or research need for a particular variable not included in our -current tables please contact us at . diff --git a/docs/symptom-survey/data-access.md b/docs/symptom-survey/data-access.md index 23f76afcc..435f07ee2 100644 --- a/docs/symptom-survey/data-access.md +++ b/docs/symptom-survey/data-access.md @@ -20,13 +20,20 @@ characteristics are available for download. ## Getting Microdata Access De-identified individual survey responses can be made available to researchers -associated with universities or non-profit organizations who sign a Data Use -Agreement (DUA). To request access to the data please submit the information -requested in [Facebook's page on obtaining data access](https://dataforgood.facebook.com/dfg/docs/covid-19-trends-and-impact-survey-request-for-data-access), -which sets out the basic conditions and provides a form to request access. An -[international version of CTIS](https://covidmap.umd.edu/) is conducted by the -University of Maryland (UMD) and access can be requested through the same -form. +associated with universities or non-profit organizations who agree to a Data Use +Agreement (DUA). The microdata is archived by the Inter-university Consortium +for Political and Social Research (ICPSR) at the University of Michigan: + +* Reinhart, Alex, Mejia, Robin, and Tibshirani, Ryan J. COVID-19 Trends and + Impact Survey (CTIS), United States, 2020-2022. Inter-university Consortium + for Political and Social Research [distributor], 2025-02-28. + + +Follow the link to view the data description and documentation, and to request +access to the restricted microdata. 
The survey documentation, including full +codebooks and user guides, is available for public download. Microdata access is +no longer available through direct agreements with Carnegie Mellon University, +so all access must be requested through ICPSR. The United States survey protocol has been reviewed by the Carnegie Mellon University Institutional Review Board with IRB ID STUDY2020_00000162. @@ -44,26 +51,9 @@ Some important notes about obtaining access to the individual survey responses: * Part- or full-time employees of Facebook are **not** eligible to receive data access, since Delphi's agreement with Facebook to protect the privacy of respondents prohibits Facebook employees from receiving any microdata. -* Because this survey is large and many groups have access, the Data Use - Agreements are not negotiable. - -After you complete the request form, staff from Facebook and CMU will be in -contact to guide you through the rest of the process. They will provide data use -agreements for your institution to sign, and will also request a copy of your -Institutional Review Board approval to verify you have ethical approval to -conduct the research. - -After the DUAs are executed, we will ask you to fill out [this -form](http://cmu.ca1.qualtrics.com/jfe/form/SV_89aVsYl29Oay4qq) to set up your -microdata access. This form can be used for new research projects or adding new -researchers to existing projects. - -After completing these forms, credentials for SFTP will be emailed to each -individual on the team. Please **do not share your credentials** with other -users. Only one person per research team needs to fill out this survey. You can -list all relevant team members in one submission. For teams with more than 5 -members, please fill out an additional form(s) to cover your whole team. If you have questions about the process, or your IRB needs information about the survey for their review, contact us at -. +. 
For all questions about ICPSR's +restricted data access process, contact ICPSR through the forms or email +addresses on their website. diff --git a/docs/symptom-survey/end-of-survey.md b/docs/symptom-survey/end-of-survey.md index f04cc55c6..2ec99ca81 100644 --- a/docs/symptom-survey/end-of-survey.md +++ b/docs/symptom-survey/end-of-survey.md @@ -57,11 +57,7 @@ continue to [request access](./data-access.md) to non-public, non-aggregated survey data for their research, and current approved data users will be able to continue accessing the non-aggregated data until their current data use agreements (DUA) expire. Researchers currently holding a fully executed DUA will -have the option to extend their DUA after it expires. Though no new data will be -collected after June 25, 2022, [Meta’s CTIS -visualizations](https://dataforgood.facebook.com/covid-survey/) will continue to -be available, and until the end of 2022, [JH CCP’s COVID Behaviors -dashboard](https://covidbehaviors.org/) will as well. +have the option to extend their DUA after it expires. ## CTIS Impact diff --git a/docs/symptom-survey/index.md b/docs/symptom-survey/index.md index 1c2f46354..15757abb3 100644 --- a/docs/symptom-survey/index.md +++ b/docs/symptom-survey/index.md @@ -5,28 +5,33 @@ nav_order: 4 # COVID-19 Trends and Impact Survey (CTIS) -Since April 2020, Delphi has conducted a voluntary survey about COVID-19, -distributed daily to users in the United States via a partnership with Facebook. -This survey asks respondents about COVID-like symptoms, their behavior (such as -social distancing), mental health, and economic and health impacts they have -experienced as a result of the pandemic. A high-level overview of the survey is -posted [on the Delphi website](https://delphi.cmu.edu/covid19/ctis/), -and an international version is -[conducted by the University of Maryland](https://covidmap.umd.edu/). 
+From April 2020 to June 2022, Delphi conducted a voluntary survey about +COVID-19, distributed daily to users in the United States via a partnership with +Facebook. This survey asked respondents about COVID-like symptoms, their +behavior (such as social distancing), mental health, and economic and health +impacts they had experienced as a result of the pandemic. A high-level overview +of the survey is posted [on the Delphi +website](https://delphi.cmu.edu/epidemic-signals/ctis/), and an international +version was [conducted by the University of Maryland](https://covidmap.umd.edu/). Data collection [ceased on June 25, 2022](end-of-survey.md). This survey was also known unofficially as the Facebook Survey. More survey details are also available [on the COVID-19 Trends and Impact Survey 2020-2022 (inactive) page](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/fb-survey.html) under the COVIDcast Main Endpoint's Data Source and Signals section of this API documentation site. -The [CTIS Methodology -Report](https://dataforgood.facebook.com/dfg/resources/CTIS-methodology-report) -describes the survey design, data collection process, weighting, and aggregation -processes, and is the primary reference for researchers working with the survey -data. This website describes details specific to the US version of the survey -and documents the individual response data, which is available to researchers -with a signed Data Use Agreement. If you are a researcher and would like to get -access to the data, see our page on getting [data access](data-access.md). +The survey dataset is now archived via ICPSR: + +* Reinhart, Alex, Mejia, Robin, and Tibshirani, Ryan J. COVID-19 Trends and + Impact Survey (CTIS), United States, 2020-2022. Inter-university Consortium + for Political and Social Research [distributor], 2025-02-28. 
+ +The archive includes complete documentation, including a Methodology Report on +survey design, a survey weighting guide, guides to the aggregate and microdata, +and example code. Aggregate contingency tables are available for public download +and restricted microdata can be obtained upon request. If you are a researcher +and would like to get access to the data, see our page on getting [data +access](data-access.md). If you have questions about the survey or getting access to data, contact us at . @@ -59,19 +64,15 @@ include: The survey protocol is reviewed by the Carnegie Mellon University Institutional Review Board. -The support of several institutions makes the survey possible. Facebook supports -the survey through recruitment (participants are invited via their News Feed), -survey sampling and weighting procedures, technical assistance in survey design -and implementation, and coordination with researchers and public health -officials. The University of Maryland's Social Data Science Center conducts a -[global version of the survey](https://covidmap.umd.edu/), and we coordinate -closely on survey design and implementation. Delphi collects, aggregates, and -distributes the US survey data, and retains ultimate responsibility for the US -survey instrument and data. - -We develop the survey collaboratively with data users, public health officials, -and others. If you are interested in getting involved, see our -[collaboration and survey revision information](collaboration-revision.md). +The support of several institutions makes the survey possible. Facebook +supported the survey through recruitment (participants were invited via their +News Feed), survey sampling and weighting procedures, technical assistance in +survey design and implementation, and coordination with researchers and public +health officials. 
The University of Maryland's Social Data Science Center +conducted a [global version of the survey](https://covidmap.umd.edu/), and we +coordinated closely on survey design and implementation. Delphi collected, +aggregated, and distributes the US survey data, and retains ultimate +responsibility for the US survey instrument and data. ## Citing the Survey @@ -96,8 +97,11 @@ the survey in publications based on the data. Specifically, we ask that you: individual survey results. If you are unsure whether a particular aggregation will prevent disclosure of individual survey results, please email us at . -4. Finally, send a copy of your publication, once it appears publicly as a - preprint or journal article, to . +4. Finally, please notify us when your research is published. If you obtained + the data via ICPSR, please [visit their + site](https://www.icpsr.umich.edu/web/ICPSR/studies/39207) to find the form + to submit data-related publications; otherwise, send a copy of your + publication to . When referring to the survey in text, we prefer the following formats: diff --git a/docs/symptom-survey/server-access.md b/docs/symptom-survey/server-access.md index c27b3e15b..8342bfdb6 100644 --- a/docs/symptom-survey/server-access.md +++ b/docs/symptom-survey/server-access.md @@ -2,10 +2,17 @@ title: SFTP Server Access parent: inactive COVID-19 Trends and Impact Survey nav_order: 2 +nav_exclude: true --- # SFTP Server Access +
Note: This page describes access for data users who have +an agreement directly with CMU. Those who request access via ICPSR should +follow their instructions and procedures.
+ Researchers with data use agreements to access the raw data from the COVID-19 Trends and Impact Survey (CTIS) can access the data over SFTP. (If you do not have a data use agreement, see the [main survey page](index.md) for diff --git a/docs/symptom-survey/survey-files.md b/docs/symptom-survey/survey-files.md index e2bf0ed44..705233ed3 100644 --- a/docs/symptom-survey/survey-files.md +++ b/docs/symptom-survey/survey-files.md @@ -2,11 +2,19 @@ title: Response Files parent: inactive COVID-19 Trends and Impact Survey nav_order: 3 +nav_exclude: true --- # Response Files {: .no_toc} +
Note: This page is obsolete. Users who obtain the survey +data via ICPSR +should consult the Microdata User Guide provided in the study-level data +package. This guide documents the format of the data and includes example code +for common use cases.
+ Users with access to the [COVID-19 Trends and Impact Survey (CTIS)](./index.md) individual response data should have received SFTP credentials for a private server where the data are stored. To connect to the server, see the