Commit 29bc8be

Merge pull request #786 from cmu-delphi/krivard/gs-no-aggregations
Update google-symptoms geo aggregation
2 parents 51df544 + b797a91 commit 29bc8be

1 file changed: +17 -8 lines changed

docs/api/covidcast-signals/google-symptoms.md

Lines changed: 17 additions & 8 deletions
@@ -11,7 +11,7 @@ grand_parent: COVIDcast Epidata API
 * **Earliest issue available:** November 30, 2020
 * **Number of data revisions since May 19, 2020:** 0
 * **Date of last change:** Never
-* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
+* **Available for:** county, MSA, HRR, state, HHS, nation (see [geography coding docs](../covidcast_geography.md))
 * **Time type:** day (see [date format docs](../covidcast_times.md))
 * **License:** To download or use the data, you must agree to the Google [Terms of Service](https://policies.google.com/terms)

@@ -57,13 +57,22 @@ The state-level and county-level `raw_search` signals for specific symptoms such
 as _anosmia_ and _ageusia_ are taken directly from the [COVID-19 Search Trends
 symptoms
 dataset](https://github.com/google-research/open-covid-19-data/tree/master/data/exports/search_trends_symptoms_dataset)
-without changes. We aggregate the county-level data to the MSA and HRR levels
-using the population-weighted average. For MSAs/HRRs that include counties that
-have no data provided due to quality or privacy issues for a certain day, we
-simply assume the values to be 0 during aggregation. The values for MSAs/HRRs
-with no counties having non-NaN values will not be reported. Thus, the resulting
-MSA/HRR level data does not fully match the _actual_ MSA/HRR level data (which
-we are not provided).
+without changes.
+
+We aggregate county and state data to other geographic levels using
+population-weighted averaging.
+
+| Source level | Aggregated level |
+| ------------ | ---------------- |
+| county       | MSA, HRR         |
+| state        | HHS, nation      |
+
+For aggregation purposes only, we assign a value of 0 to source regions that
+have no data provided due to quality or privacy issues for a certain day (see
+Limitations for details). We do not report aggregated regions if none of their
+source regions have data. Because of this censoring behavior, the resulting data
+for aggregated regions does not fully match the _actual_ search volume for these
+regions (which is not provided to us).
 
 ## Lag and Backfill
 Google does not currently update the search data daily, but usually twice a week.
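
The aggregation rule described in the added documentation lines can be illustrated with a minimal sketch. This is not the indicator's actual code: the function name `aggregate`, the dictionary-based inputs, and the region IDs and populations are all hypothetical. It also assumes that a source region with a missing value keeps its population in the denominator, which is one reading of "assign a value of 0".

```python
import math
from collections import defaultdict


def aggregate(values, population, mapping):
    """Population-weighted average of source regions per aggregated region.

    values:     {source_id: float, NaN, or None}   (hypothetical input shape)
    population: {source_id: population}
    mapping:    {source_id: aggregated_id}
    """
    weighted_sum = defaultdict(float)
    total_pop = defaultdict(float)
    regions_with_data = set()

    for source, target in mapping.items():
        value = values.get(source)
        missing = value is None or math.isnan(value)
        pop = population[source]
        # Assumption: a missing source counts as 0 but still contributes
        # its population to the denominator.
        weighted_sum[target] += 0.0 if missing else value * pop
        total_pop[target] += pop
        if not missing:
            regions_with_data.add(target)

    # Aggregated regions whose sources are all missing are not reported.
    return {
        target: weighted_sum[target] / total_pop[target]
        for target in total_pop
        if target in regions_with_data
    }


# Hypothetical example: counties A and B roll up to MSA "X", county C to MSA "Y".
values = {"A": 2.0, "B": float("nan"), "C": None}
population = {"A": 100, "B": 300, "C": 50}
mapping = {"A": "X", "B": "X", "C": "Y"}

result = aggregate(values, population, mapping)
print(result)
```

In this example, "X" is reported as (2.0 × 100 + 0 × 300) / 400 = 0.5, lower than county A's own value because the censored county B is treated as 0. "Y" is omitted entirely because its only source county has no data.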
