From 2d49aa15de7341f5dcd672b7418ad4342faabdb6 Mon Sep 17 00:00:00 2001 From: Camden Blatchly Date: Thu, 14 Nov 2024 14:39:13 -0500 Subject: [PATCH] language edits for grammar and clarity --- .gitignore | 1 + README.Rmd | 35 ++-- README.md | 204 ++++++++++++---------- vignettes/Check_and_download_NBM_data.Rmd | 12 +- vignettes/articles/NBM.Rmd | 70 ++++---- vignettes/articles/f477.Rmd | 48 ++--- 6 files changed, 194 insertions(+), 176 deletions(-) diff --git a/.gitignore b/.gitignore index 789224b..1d73999 100644 --- a/.gitignore +++ b/.gitignore @@ -9,3 +9,4 @@ data-raw/fcc_staff.R data_swamp/ *.parquet inst/doc +.Rproj.user diff --git a/README.Rmd b/README.Rmd index 6422958..296312f 100644 --- a/README.Rmd +++ b/README.Rmd @@ -20,18 +20,18 @@ knitr::opts_chunk$set( [![R-CMD-check](https://github.com/ruralinnovation/cori.data.fcc/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ruralinnovation/cori.data.fcc/actions/workflows/R-CMD-check.yaml) -The goal of cori.data.fcc is to facilitate the discovery, download and use of FCC public data releases. +The goal of `cori.data.fcc` is to facilitate the discovery, analysis, and use of FCC public data releases. -It covers: +The package provides access to data from the following sources: - National Broadband Map [(NBM)](https://broadbandmap.fcc.gov/home) data[^bdc] - [Form 477](https://www.fcc.gov/general/broadband-deployment-data-fcc-form-477) data -[^bdc]: This is data about internet services available to individual locations across the country, along with new maps of mobile coverage, as reported by Internet Service Providers (ISPs) as part of the FCC’s ongoing [Broadband Data Collection](https://broadbandmap.fcc.gov/data-download/nationwide-data)). +[^bdc]: This data describes what internet services are available to individual locations across the country, along with new maps of mobile coverage, as reported by Internet Service Providers (ISPs). 
It is part of the FCC’s ongoing [Broadband Data Collection](https://broadbandmap.fcc.gov/data-download/nationwide-data)). ## Installation -You can install the development version of cori.data.fcc from [GitHub](https://github.com/) with: +You can install the development version of `cori.data.fcc` from [GitHub](https://github.com/) with: ``` r # install.packages("devtools") @@ -46,50 +46,47 @@ library(cori.data.fcc) ### National Broadband Map -- The package is providing you a way to download zipped `csv` see the vignette "Check and download NBM data" +Key uses: -- Access a parquet files stored in CORI s3 bucket per county: +- Access parquet files stored in a CORI s3 bucket, by county: ```{r nbm_raw} guilford_cty <- get_county_nbm_raw(geoid_co = "37081") -head(guilford_cty) +dplyr::glimpse(guilford_cty) ``` -- Use the CORI opinionated version at the Census block level for the **last NBM's release**: +- Access a CORI-opinionated, Census-block level version of the **latest NBM release**: ```{r nbm_block} # get a county nbm_bl <- get_nbm_bl(geoid_co = "47051") -dim(nbm_bl) +dplyr::glimpse(nbm_bl) # get census block covered by an ISP identified by their FRN skymesh <- get_frn_nbm_bl("0027136753") -dim(skymesh) +dplyr::glimpse(skymesh) ``` ### Form 477 -Sadly automating the download of some of the source data is harder for Form 477. -We are not providing that functionality. - -You can get all data (multiple years) covering a State from Form 477: +Access state data for multiple years: ```{r get_f477_example} f477_vt <- get_f477("VT") -head(f477_vt) +dplyr::glimpse(f477_vt) ``` ### Utilities -Getting the dictionnary for each dataset: +Access the dictionary for each dataset: ```{r get_fcc_dictionary_ex} -head(get_fcc_dictionary()) +dplyr::glimpse(get_fcc_dictionary()) ``` -The package also provide the list of Provider ID and FRN +The package also provides a list of Provider IDs and FRNs. 
```{r fcc_provider} str(fcc_provider) @@ -97,4 +94,4 @@ str(fcc_provider) ## Inspiration -This package was imspired by https://github.com/bbcommons/bfm-explorer +This package was inspired by https://github.com/bbcommons/bfm-explorer diff --git a/README.md b/README.md index ce7f3a0..4ede10d 100644 --- a/README.md +++ b/README.md @@ -10,20 +10,20 @@ coverage](https://codecov.io/gh/ruralinnovation/cori.data.fcc/branch/main/graph/ [![R-CMD-check](https://github.com/ruralinnovation/cori.data.fcc/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ruralinnovation/cori.data.fcc/actions/workflows/R-CMD-check.yaml) -The goal of cori.data.fcc is to facilate the discovery, the download and -uses of FCC’s data. +The goal of `cori.data.fcc` is to facilitate the discovery, analysis, +and use of FCC public data releases. -It covers: +The package provides access to data from the following sources: - National Broadband Map [(NBM)](https://broadbandmap.fcc.gov/home) - data + data[^1] - [Form 477](https://www.fcc.gov/general/broadband-deployment-data-fcc-form-477) data ## Installation -You can install the development version of cori.data.fcc from +You can install the development version of `cori.data.fcc` from [GitHub](https://github.com/) with: ``` r @@ -39,122 +39,132 @@ library(cori.data.fcc) ### National Broadband Map -- The package is providing you a way to download zipped `csv` see the - vignette “Check and download NBM data” +Key uses: -- Access a parquet files stored in CORI s3 bucket per county: +- Access parquet files stored in a CORI s3 bucket, by county: ``` r guilford_cty <- get_county_nbm_raw(geoid_co = "37081") -head(guilford_cty) -#> frn provider_id brand_name location_id technology -#> 1 0001857952 130077 AT&T 1344960789 10 -#> 2 0001857952 130077 AT&T 1344965855 10 -#> 3 0001857952 130077 AT&T 1344971572 10 -#> 4 0001857952 130077 AT&T 1344982708 10 -#> 5 0001857952 130077 AT&T 1344991329 10 -#> 6 0001857952 130077 AT&T 1344996969 10 -#> 
max_advertised_download_speed max_advertised_upload_speed low_latency -#> 1 10 1 TRUE -#> 2 0 0 TRUE -#> 3 10 1 TRUE -#> 4 50 10 TRUE -#> 5 50 10 TRUE -#> 6 75 20 TRUE -#> business_residential_code state_usps geoid_bl geoid_co file_time_stamp -#> 1 X NC 370810161022008 37081 2024-09-03 -#> 2 X NC 370810168003003 37081 2024-09-03 -#> 3 X NC 370810125051020 37081 2024-09-03 -#> 4 X NC 370810171011021 37081 2024-09-03 -#> 5 X NC 370810157042006 37081 2024-09-03 -#> 6 X NC 370810127052022 37081 2024-09-03 -#> release -#> 1 2023-12-01 -#> 2 2023-12-01 -#> 3 2023-12-01 -#> 4 2023-12-01 -#> 5 2023-12-01 -#> 6 2023-12-01 +dplyr::glimpse(guilford_cty) +#> Rows: 1,337,541 +#> Columns: 14 +#> $ frn "0001857952", "0001857952", "0001857952"… +#> $ provider_id "130077", "130077", "130077", "130077", … +#> $ brand_name "AT&T", "AT&T", "AT&T", "AT&T", "AT&T", … +#> $ location_id "1344960789", "1344965855", "1344971572"… +#> $ technology 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, … +#> $ max_advertised_download_speed 10, 0, 10, 50, 50, 75, 50, 10, 50, 0, 10… +#> $ max_advertised_upload_speed 1, 0, 1, 10, 10, 20, 10, 1, 10, 0, 1, 5,… +#> $ low_latency TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE… +#> $ business_residential_code "X", "X", "X", "X", "X", "X", "X", "X", … +#> $ state_usps "NC", "NC", "NC", "NC", "NC", "NC", "NC"… +#> $ geoid_bl "370810161022008", "370810168003003", "3… +#> $ geoid_co "37081", "37081", "37081", "37081", "370… +#> $ file_time_stamp 2024-09-03, 2024-09-03, 2024-09-03, 202… +#> $ release 2023-12-01, 2023-12-01, 2023-12-01, 202… ``` -- Use the CORI opinionated version at the Census block level for the - **last NBM’s release**: +- Access a CORI-opinionated, Census-block level version of the **latest + NBM release**: ``` r # get a county nbm_bl <- get_nbm_bl(geoid_co = "47051") -dim(nbm_bl) -#> [1] 2146 21 +dplyr::glimpse(nbm_bl) +#> Rows: 2,146 +#> Columns: 21 +#> $ geoid_bl "470519601001000", "4705196010… +#> $ geoid_st "47", "47", "47", "47", "47", … +#> $ 
geoid_co "47051", "47051", "47051", "47… +#> $ state_abbr "TN", "TN", "TN", "TN", "TN", … +#> $ cnt_total_locations NA, NA, NA, NA, 8, NA, 8, 3, 1… +#> $ cnt_bead_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_copper_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_cable_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_fiber_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_other_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_unlicensed_fixed_wireless_locations NA, NA, NA, NA, 7, NA, 8, 3, 1… +#> $ cnt_licensed_fixed_wireless_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_LBR_fixed_wireless_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_terrestrial_locations NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_25_3 NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_100_20 NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_100_100 NA, NA, NA, NA, 0, NA, 0, 0, 0… +#> $ cnt_distcint_frn NA, NA, NA, NA, NA, NA, NA, NA… +#> $ array_frn , , , $ combo_frn NA, NA, NA, NA, NA, NA, NA, NA… +#> $ release 2023-12-01, 2023-12-01, 2023-… # get census block covered by an ISP identified by their FRN skymesh <- get_frn_nbm_bl("0027136753") -dim(skymesh) -#> [1] 3 21 +dplyr::glimpse(skymesh) +#> Rows: 3 +#> Columns: 21 +#> $ geoid_bl "390375301004009", "3903755510… +#> $ geoid_st "39", "39", "39" +#> $ geoid_co "39037", "39037", "39109" +#> $ state_abbr "OH", "OH", "OH" +#> $ cnt_total_locations 13, 7, 15 +#> $ cnt_bead_locations 13, 6, 15 +#> $ cnt_copper_locations 9, 2, 10 +#> $ cnt_cable_locations 10, 0, 0 +#> $ cnt_fiber_locations 13, 5, 2 +#> $ cnt_other_locations 0, 0, 0 +#> $ cnt_unlicensed_fixed_wireless_locations 13, 7, 15 +#> $ cnt_licensed_fixed_wireless_locations 13, 6, 14 +#> $ cnt_LBR_fixed_wireless_locations 11, 0, 0 +#> $ cnt_terrestrial_locations 13, 6, 15 +#> $ cnt_25_3 13, 6, 14 +#> $ cnt_100_20 13, 5, 14 +#> $ cnt_100_100 13, 5, 5 +#> $ cnt_distcint_frn 9, 6, 8 +#> $ array_frn <"0002930980", "0004328688", "… +#> $ combo_frn 1.241130e+19, 7.392885e+18, 6.… +#> $ 
release 2023-12-01, 2023-12-01, 2023-1… ``` ### Form 477 -Sadly automating the download of some of the source data is harder for -Form 477. We are not providing that functionality. - -You can get all data (multiple years) covering a State from Form 477: +Access state data for multiple years: ``` r f477_vt <- get_f477("VT") -head(f477_vt) -#> Provider_Id FRN ProviderName DBAName -#> 1 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> 2 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> 3 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> 4 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> 5 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> 6 9395 0021002092 Stowe Cablevision, Inc. Stowe Access, LLC -#> HoldingCompanyName HocoNum HocoFinal StateAbbr -#> 1 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. VT -#> 2 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. VT -#> 3 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. VT -#> 4 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. VT -#> 5 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. VT -#> 6 Stowe Cablevision, Inc. 240090 Stowe Cablevision, Inc. 
VT -#> BlockCode TechCode Consumer MaxAdDown MaxAdUp Business Date -#> 1 500159531001026 42 TRUE 25 5 TRUE 2014-12-01 -#> 2 500159531001026 41 TRUE 25 5 TRUE 2014-12-01 -#> 3 500159531001026 50 FALSE 0 0 TRUE 2014-12-01 -#> 4 500159531001027 42 TRUE 25 5 TRUE 2014-12-01 -#> 5 500159531001027 41 TRUE 25 5 TRUE 2014-12-01 -#> 6 500159531001027 50 FALSE 0 0 TRUE 2014-12-01 +dplyr::glimpse(f477_vt) +#> Rows: 1,147,267 +#> Columns: 15 +#> $ Provider_Id "9395", "9395", "9395", "9395", "9395", "9395", "93… +#> $ FRN "0021002092", "0021002092", "0021002092", "00210020… +#> $ ProviderName "Stowe Cablevision, Inc.", "Stowe Cablevision, Inc.… +#> $ DBAName "Stowe Access, LLC", "Stowe Access, LLC", "Stowe Ac… +#> $ HoldingCompanyName "Stowe Cablevision, Inc.", "Stowe Cablevision, Inc.… +#> $ HocoNum "240090", "240090", "240090", "240090", "240090", "… +#> $ HocoFinal "Stowe Cablevision, Inc.", "Stowe Cablevision, Inc.… +#> $ StateAbbr "VT", "VT", "VT", "VT", "VT", "VT", "VT", "VT", "VT… +#> $ BlockCode "500159531001026", "500159531001026", "500159531001… +#> $ TechCode "42", "41", "50", "42", "41", "50", "42", "41", "50… +#> $ Consumer TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, F… +#> $ MaxAdDown 25, 25, 0, 25, 25, 0, 25, 25, 0, 25, 25, 0, 25, 25,… +#> $ MaxAdUp 5, 5, 0, 5, 5, 0, 5, 5, 0, 5, 5, 0, 5, 5, 0, 5, 5, … +#> $ Business TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU… +#> $ Date 2014-12-01, 2014-12-01, 2014-12-01, 2014-12-01, 20… ``` ### Utilities -Getting the dictionnary for each dataset: +Access the dictionary for each dataset: ``` r -head(get_fcc_dictionary()) -#> dataset var_name var_type -#> 1 f477 Provider_Id TEXT -#> 2 f477 FRN TEXT -#> 3 f477 ProviderName VARCHAR -#> 4 f477 DBAName VARCHAR -#> 5 f477 HoldingCompanyName VARCHAR -#> 6 f477 HocoNum TEXT -#> var_description -#> 1 filing number (assigned by FCC) -#> 2 FCC registration number -#> 3 Provider name -#> 4 'Doing business as' name -#> 5 Holding company name (as filed on Form 477) -#> 6 Holding 
company number (assigned by FCC) -#> var_example -#> 1 8026 -#> 2 0001570936 -#> 3 Arctic Slope Telephone Association Cooperative, Inc. -#> 4 ASTAC -#> 5 Arctic Slope Telephone Association Cooperative, Inc. -#> 6 130067 +dplyr::glimpse(get_fcc_dictionary()) +#> Rows: 50 +#> Columns: 5 +#> $ dataset "f477", "f477", "f477", "f477", "f477", "f477", "f477"… +#> $ var_name "Provider_Id", "FRN", "ProviderName", "DBAName", "Hold… +#> $ var_type "TEXT", "TEXT", "VARCHAR", "VARCHAR", "VARCHAR", "TEXT… +#> $ var_description "filing number (assigned by FCC)", "FCC registration n… +#> $ var_example "8026", "0001570936", "Arctic Slope Telephone Associat… ``` -The package also provide the list of Provider ID and FRN +The package also provides a list of Provider IDs and FRNs. ``` r str(fcc_provider) @@ -168,4 +178,10 @@ str(fcc_provider) ## Inspiration -This package was imspired by +This package was inspired by + +[^1]: This data describes what internet services are available to + individual locations across the country, along with new maps of + mobile coverage, as reported by Internet Service Providers (ISPs). + It is part of the FCC’s ongoing [Broadband Data + Collection](https://broadbandmap.fcc.gov/data-download/nationwide-data)). diff --git a/vignettes/Check_and_download_NBM_data.Rmd b/vignettes/Check_and_download_NBM_data.Rmd index 3fe1dd8..76bca65 100644 --- a/vignettes/Check_and_download_NBM_data.Rmd +++ b/vignettes/Check_and_download_NBM_data.Rmd @@ -12,20 +12,21 @@ library(cori.data.fcc) ``` -This example shows some basic workflow: +This example shows a basic workflow: -1. First, you inspect what release are available: +1. First, you can inspect what releases are available: ```{r get_nbm_release} release <- get_nbm_release() # get the available releases release ``` -2. Second, you check what files are available: +2. 
Second, you can check what files are available: ```{r get_nbm_available} nbm <- get_nbm_available() # get what data is available -# if we are intrested in "Fixed Broadband" / "Nationwide" / released "June 30, 2023" + +# if we are interested in "Fixed Broadband" / "Nationwide" / released "June 30, 2023" nbm_filter <- nbm[which(nbm$release == "June 30, 2023" & nbm$data_type == "Fixed Broadband" & nbm$data_category == "Nationwide"), ] @@ -36,7 +37,8 @@ rownames(nbm_filter) <- NULL nbm_dplyr_filter <- nbm |> dplyr::filter(release == "June 30, 2023" & data_type == "Fixed Broadband" & data_category == "Nationwide") + all.equal(nbm_filter, nbm_dplyr_filter) #> [1] TRUE head(nbm_filter) -``` \ No newline at end of file +``` diff --git a/vignettes/articles/NBM.Rmd b/vignettes/articles/NBM.Rmd index 1d35a33..286fc99 100644 --- a/vignettes/articles/NBM.Rmd +++ b/vignettes/articles/NBM.Rmd @@ -49,36 +49,41 @@ The license can be found [here](https://broadbandmap.fcc.gov/about): ## A Quick Introduction -NBM was started by the FCC in November 2022[^fcc_nbm_start] and is following [Form 477](https://ruralinnovation.github.io/cori.data.fcc/articles/f477.html). +NBM was launched by the FCC in November 2022[^fcc_nbm_start] and follows [Form 477](https://ruralinnovation.github.io/cori.data.fcc/articles/f477.html). [^fcc_nbm_start]: [https://www.fcc.gov/news-events/notes/2022/11/18/new-broadband-maps-are-finally-here](https://www.fcc.gov/news-events/notes/2022/11/18/new-broadband-maps-are-finally-here) -Behind the National Broadband Map they are **two** datasets (see @fig-broadbanddata, below). -We are using the "Broadband Availability" dataset that is derived from the "Fabric" locations dataset (developed by CostQuest). +Behind the National Broadband Map, there are **two** datasets (see @fig-broadbanddata, below). +We use the "Broadband Availability" dataset that is derived from the "Fabric" locations dataset (developed by CostQuest). 
The locations are determined within the Fabric locations data. This dataset can be derived in multiple ways: by States or by Providers. -At the States the Data can be split between summaries and "raw data". -The summaries available are by geographies or by technologies: Fixed Broadband and Mobile Broadband. +At the state level, the data can be split between summaries and "raw data". +The summaries available are by geographies or by technologies (Fixed Broadband and Mobile Broadband). -For every states you need to access the raw data by technologies. -In our works we focused in the Fixed Broadband Availability Data. +For every state, you need to access the raw data by technology. +In our work, we focused on Fixed Broadband Availability data. -Here, the NBM provides information at the scale of a "service" - a location covered by a provider by a technology with specifics maximum speeds. +The NBM provides information about the scale of a "service" - a location covered by a provider and by a technology with a specific maximum speed. -This is one of the big change versus Form 477: we are moving from the scale of a Census block to the scale of a location (See @sec-BSL for a definition). +This formatting approach is one of the big changes compared to Form 477. +We moved from a Census-block scale to a location-based scale (See @sec-BSL for a definition). 
Every location is characterized by: - Who is providing those services (`frn`, `provider_id`, and `brand_name`) -- A description of each services (`technology`, `max_advertised_download_speed`, `max_advertised_upload_speed`, `low_latency`) -- Whether the location associated with residential, business or both -- ways to localize it (`state_abbr`, `block_geoid`, `h3_res8_id`) +- A description of each service (`technology`, `max_advertised_download_speed`, `max_advertised_upload_speed`, `low_latency`) +- Whether the location is residential, business or both +- Ways to localize the location (`state_abbr`, `block_geoid`, `h3_res8_id`) -In our ingestion we did not kept `h3_res8_id` but we added the date of the release and the timestamp provided in the filename (see `data-raw/NBM.R` to get every details). +In our ingestion, we did not keep the `h3_res8_id` property, but we added the +date of the release and the timestamp provided in the +filename (see `data-raw/NBM.R` for full details). -The exact coordinates of every locations is only part of the Fabric dataset and within the Broadband Availability we can only link a record for a location to a Census Block (2020 vintage) or H3 hexagon. +The exact coordinates of every location are only part of the Fabric dataset. +Within the Broadband Availability data, we can only link a record for a location +to a Census Block (2020 vintage) or H3 hexagon. !["What is on the national broadband map" Source: [https://www.fcc.gov/BroadbandData](https://www.fcc.gov/BroadbandData)](whats-on-the-national-broadband-map-113023-1.png){#fig-broadbanddata} @@ -93,8 +98,8 @@ A business BSL includes “all non-residential (business, government, non-profit ### When is this data updated? NBM has two big releases per year (June and December) and have versions every two weeks to take into account challenges[^challenges]. -Experience has told us that sometimes their release can be faster (more than one per week) or slower. 
-The FCC did not (April 2024) provides a changelog between releases or versions (but the documentation has some of the major changes[^nbm_chnagelog]). +Sometimes their release can be faster (more than one per week) or slower. +The FCC did not (April 2024) provide a changelog between releases or versions (but the documentation has some of the major changes[^nbm_chnagelog]). [^challenges]: [https://www.fcc.gov/sites/default/files/bdc-challenge-overview.pdf](https://www.fcc.gov/sites/default/files/bdc-challenge-overview.pdf) @@ -103,41 +108,32 @@ The FCC did not (April 2024) provides a changelog between releases or versions ( ### What is the geographic coverage? -The Broadband Availability data is covering all US States, Puerto Rico and the US territories. +The Broadband Availability data covers all US States, Puerto Rico, and the US territories. -# How does cori.data.fcc help me getting NBM data? +# How does cori.data.fcc help me access NBM data? -`cori.data.fcc` help you access this data set in 3 diffferent ways: +`cori.data.fcc` helps you access this dataset in 3 different ways: -1. The package is providing functions to list available data and download it +1. The package provides functions to list available data and download it -2. We ingested all the Fixed Broadband Availability Data and are providing it in a s3 bucket +2. We ingested all the Fixed Broadband Availability Data and are providing it +in a s3 bucket (for more performant data loading!) -3. We ingested all Fixed Broadband Availability data and transformed it to provides information at the Census block level. +3. We ingested all Fixed Broadband Availability data and transformed all information +to be available at the Census block level. -## Which ways should I go? +### NBM Raw Data dictionary -It will depends on your specific needs. -We were more interested at the information at the census block level but for that we needed to download the data. -So we might as well share how we do it. 
-We then first transformed the data in our database. -The process was taking some time and frequently we had questions that could not be answered with how the schema was made, hence we reingested it. -Ingesting CSV is a coastly operation it was better to have an intermediary data set, close to the raw data but in a better format to either access it and or reingest it. -Parquet was perfect for that need and we might as well provide other to access it. - - -### NBM raws's Data dictionary - -This dataset is called "nbm_raw" and dictionary can be accessed with the function `get_fcc_dictionary`: +This dataset is called "nbm_raw" and its dictionary can be accessed with the function `get_fcc_dictionary`: ```{r nbm-data-dic} table_with_options(get_fcc_dictionary("nbm_raw")) ``` -### NBM Block's Data Dictionary +### NBM Block Data Dictionary This dataset is called "nbm_block" @@ -149,7 +145,7 @@ table_with_options(get_fcc_dictionary("nbm_block"))