---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# absmapsdata
<!-- badges: start -->
[Lifecycle: stable](https://www.tidyverse.org/lifecycle/#stable)
[Build status](https://github.com/wfmackey/absmapsdata/actions)
<!-- badges: end -->
The `absmapsdata` package exists to make it easier to produce maps from
ABS data in R. The package contains compressed, tidied, and
lazily-loadable `sf` objects that hold geometric information about ABS
data structures.
It also provides many of the 2016 population-weighted ABS correspondences (the most recent set), which you can access with the `get_correspondence_absmaps` function. The available correspondences are listed on the [data.gov.au website](https://data.gov.au/data/dataset/asgs-geographic-correspondences-2016/resource/951e18c7-f187-4c86-a73f-fcabcd19af16).
Before we get into the ‘what problem is this package solving’ details,
let’s look at some examples so that you can copy-paste into your own
script and replicate out-of-the-box (and impress your friends).
## Installation
You can install `absmapsdata` from GitHub with:
```{r, eval=FALSE}
# install.packages("remotes")
remotes::install_github("wfmackey/absmapsdata")
```
`absmapsdata` contains a lot of data, so installing with `remotes::install_github` may fail if the download times out. If this happens, set the `timeout` option to a large value (in seconds) and try again:
```{r set_timeout, eval=FALSE}
options(timeout=1000)
remotes::install_github("wfmackey/absmapsdata")
```
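If you would rather not leave the larger timeout in place for the rest of your session, one option is to save the old value and restore it afterwards. This is a minimal sketch using only base R; the install line is commented out so the snippet runs without network access.

```r
# options() returns the previous values of the options it changes
old <- options(timeout = 1000)
getOption("timeout")

# remotes::install_github("wfmackey/absmapsdata")

# Restore whatever the timeout was before
options(old)
```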
The `sf` package is required to handle the `sf` objects:
```{r, eval=FALSE}
# install.packages("sf")
library(sf)
```
## Maps loaded with this package
Available maps are listed below. These will be added to over time.
If you would like to request a map to be added, let me know via an issue on this GitHub repo.
**ASGS Main Structures**
* Statistical Area 1 2011: `sa12011`
* Statistical Area 1 2016: `sa12016`
* Statistical Area 2 2011: `sa22011`
* Statistical Area 2 2016: `sa22016`
* Statistical Area 3 2011: `sa32011`
* Statistical Area 3 2016: `sa32016`
* Statistical Area 4 2011: `sa42011`
* Statistical Area 4 2016: `sa42016`
* Greater Capital Cities 2011: `gcc2011`
* Greater Capital Cities 2016: `gcc2016`
* Remoteness Areas 2011: `ra2011`
* Remoteness Areas 2016: `ra2016`
* State 2011: `state2011`
* State 2016: `state2016`
**ASGS Non-ABS Structures**
* Commonwealth Electoral Divisions 2018: `ced2018`
* State Electoral Divisions 2018: `sed2018`
* Local Government Areas 2016: `lga2016`
* Local Government Areas 2018: `lga2018`
* Regions for the Internet Vacancy Index 2008: `regional_ivi2008`
* Postcodes 2016: `postcodes2016`
* Census of Population and Housing Destination Zones 2011: `dz2011`
* Census of Population and Housing Destination Zones 2016: `dz2016`
**Non-ABS Australian Government Structures**
* Employment Regions 2015-2020: `employment_regions2015`
## Just show me how to make a map with this package
### Using the package’s pre-loaded data
The `absmapsdata` package comes with pre-downloaded and pre-processed
data. To load a particular geospatial object: load the **package**, then
call the object (see list above for object names).
```{r}
library(tidyverse)
library(sf)
library(absmapsdata)
mapdata1 <- sa32011
glimpse(mapdata1)
```
Or:
```{r}
mapdata2 <- sa22016
glimpse(mapdata2)
```
The resulting `sf` object contains one observation per area (in the
following examples, one observation per `sa3`). It stores the geometry
information in the `geometry` variable, which is a nested list
describing the area’s polygon. The object can be joined to a standard
`data.frame` or `tibble` and can be used with `dplyr` functions.
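To see what "can be joined and used with `dplyr`" means in practice without loading any package data, here is a small sketch. The two square "areas" and their values are made up for illustration; only `sf` and `dplyr` are assumed to be installed.

```r
library(sf)
library(dplyr)

# Two unit squares standing in for area polygons (made-up data)
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

toy_areas <- st_sf(
  area_name = c("Area A", "Area B"),
  geometry  = st_sfc(square(0), square(2), crs = 4326)
)

# A plain tibble of values to attach (also made up)
toy_values <- tibble(area_name = c("Area A", "Area B"), value = c(10, 20))

# Join and filter as you would any data frame
toy_joined <- toy_areas %>%
  left_join(toy_values, by = "area_name") %>%
  filter(value > 10)

class(toy_joined) # still an sf object; the geometry column is carried along
```

Because the `sf` object is on the left of the join here, the result keeps its `sf` class.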
### Creating maps with your `sf` object
We do all this so we can create gorgeous maps. And with the `sf` object
in hand, plotting a map via `ggplot` and `geom_sf` is simple.
```{r}
map <-
  sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry)) # use the geometry variable
map
```
The data also include centroids of each area, and we can add these
points to the map with the `cent_lat` and `cent_long` variables using
`geom_point`.
```{r}
map <- sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry)) +  # use the geometry variable
  geom_point(aes(cent_long, cent_lat)) # use the centroid long (x) and lats (y)
map
```
Cool. But this all looks a bit ugly. We can pretty it up
using `ggplot` tweaks. See the comments on each line for its objective.
Also note that we’re filling the areas by their size using `areasqkm_2016`,
another variable included in the `sf` object (we’ll replace this with more
interesting data in the next section).
```{r}
map <- sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry,   # use the geometry variable
              fill = areasqkm_2016), # fill by area size
          lwd = 0,                   # remove borders
          show.legend = FALSE) +     # remove legend
  geom_point(aes(cent_long,
                 cent_lat),          # use the centroid long (x) and lats (y)
             colour = "white") +     # make the points white
  theme_void() +                     # clears other plot elements
  coord_sf()
map
```
## Joining with other datasets
At some point, we’ll want to join our spatial data with
data-of-interest. The variables in our mapping data—stating the numeric
code and name of each area and parent area—will make this *relatively*
easy.
For example: suppose we had a simple dataset of median income by SA3
over time.
```{r}
# Read in some data
income <- read_csv("https://raw.githubusercontent.com/wfmackey/absmapsdata/master/img/data/median_income_sa3.csv")
head(income)
```
This income data contains a variable `sa3_name_2016`, and we can use
`dplyr::left_join()` to combine it with our mapping data.
```{r}
combined_data <- left_join(income,
                           sa32016,
                           by = "sa3_name_2016")
```
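Joining on names only works when the names match exactly, so it can be worth checking for rows that found no geometry before plotting. The sketch below uses `dplyr::anti_join()` on two small made-up tables (the values are invented, and one name is deliberately misspelled) rather than the real `income` and `sa32016` objects.

```r
library(dplyr)

# Made-up stand-ins for the income data and the map data
toy_income <- tibble(
  sa3_name_2016 = c("Melbourne City", "Yarra", "Port Philip"), # note the typo
  median_income = c(50000, 60000, 55000)
)
toy_map <- tibble(
  sa3_name_2016 = c("Melbourne City", "Yarra", "Port Phillip")
)

# Rows of the income data with no matching area in the map data
unmatched <- anti_join(toy_income, toy_map, by = "sa3_name_2016")
unmatched
```

Any rows returned here would end up with a missing geometry after the `left_join()` and would not appear on the map.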
Now that we have a tidy dataset with 1) the income data we want to plot,
and 2) the geometry of the areas, we can plot income by area:
```{r}
map <- combined_data %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry,   # use the geometry variable
              fill = median_income), # fill by median income
          lwd = 0) +                 # remove borders
  theme_void() +                     # clears other plot elements
  labs(fill = "Median income")
map
```
## Get correspondence files
You can use the `get_correspondence_absmaps` function to get population-weighted correspondence tables provided [by the ABS](https://data.gov.au/data/dataset/asgs-geographic-correspondences-2016/resource/951e18c7-f187-4c86-a73f-fcabcd19af16).
Note that while there are lots of correspondence tables, not every combination is available.
For example:
```{r}
get_correspondence_absmaps("cd", 2006,
                           "sa1", 2016)
```
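A population-weighted correspondence is typically used to move data from one set of areas to another: each value is split according to the share of the "from" area that falls in each "to" area, and the pieces are then summed. The sketch below shows that arithmetic on a toy table. The column names `from_code`, `to_code` and `ratio` are assumptions made for illustration, not necessarily what `get_correspondence_absmaps` returns, so check the names in the table you actually get.

```r
library(dplyr)

# Hypothetical correspondence: each row gives the share (`ratio`) of a
# "from" area's population that falls in a "to" area. Column names are assumed.
toy_corr <- tibble(
  from_code = c("A", "A", "B"),
  to_code   = c("X", "Y", "Y"),
  ratio     = c(0.25, 0.75, 1)
)

# Made-up counts on the "from" geography
toy_counts <- tibble(from_code = c("A", "B"), people = c(100, 40))

# Apportion each count by its ratio, then total within each "to" area
converted <- toy_counts %>%
  left_join(toy_corr, by = "from_code") %>%
  mutate(people = people * ratio) %>%
  group_by(to_code) %>%
  summarise(people = sum(people))

converted
```

When the ratios for each "from" area sum to one, as they do here, the total count is preserved.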
## Why does this package exist?
The motivation for this package is that maps are cool and fun and are,
sometimes, the best way to communicate data. And making maps in `R` with
`ggplot` is relatively easy *when you have the right `object`*.
Getting the right `object` is not technically difficult, but requires
research into the best-thing-to-do at each of the following steps:
- Find the ASGS ABS spatial-data page and determine the right file to
download.
- Read the shapefile into `R` using one-of-many import tools.
- Convert the object into something usable.
- Clean up any inconsistencies and apply consistent variable
naming/values across areas and years.
- Find an appropriate compression function and level to optimise
output.
For me, at least, finding the correct information and developing the
best set of steps was a little bit interesting but mostly tedious and
annoying. The `absmapsdata` package holds this data for you, so you can
spend more time making maps, and less time on Stack Overflow, the ABS
website, and [lovely-people’s wonderful
blogs](https://www.neonscience.org/dc-open-shapefiles-r).
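To give a feel for the steps listed above, here is a rough sketch of the read, clean and compress stages. It is not the code used to build this package: the North Carolina sample shapefile that ships with `sf` stands in for a downloaded ABS file, the new column names are invented, and `saveRDS()` with `xz` compression is just one reasonable choice.

```r
library(sf)

# 1. Read a shapefile. The sample file shipped with sf stands in for a
#    downloaded ABS shapefile.
raw <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# 2. Keep the useful columns under consistent lower-case names
#    (names chosen here for illustration only)
clean <- st_sf(
  county_code = raw$FIPS,
  county_name = raw$NAME,
  geometry    = st_geometry(raw)
)

# 3. Save with strong compression, then read back to confirm the round trip
out <- tempfile(fileext = ".rds")
saveRDS(clean, out, compress = "xz")
restored <- readRDS(out)

names(restored)
```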
## Comments/complaints/requests/THOUGHTS
Fair enough! The best avenue is via a GitHub issue at
[wfmackey/absmapsdata/issues](https://github.com/wfmackey/absmapsdata/issues).
This is also the best place to request data that isn't yet available in the package.