Skip to content

Commit 00ffad9

Browse files
committed
codigo apresentacao
1 parent 5562b2d commit 00ffad9

File tree

1 file changed

+215
-0
lines changed

1 file changed

+215
-0
lines changed

IC/Cap11_IC.qmd

+215
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,215 @@
1+
---
2+
title: "What if?"
3+
subtitle: "Cap11: Why model?"
4+
author: "Grupo top de IC"
5+
institute: "UnB"
6+
format:
7+
revealjs:
8+
embed-resources: true
9+
multiplex: true
10+
incremental: true
11+
logo: logo.jpeg
12+
scrollable: true
13+
highlight-style: arrow
14+
transition: fade
15+
---
16+
17+
## Capítulo 11
18+
19+
- Part I conceitual, estimadores não paramétricos
20+
- Part II dados "reais", estimadores paramétricos
21+
22+
- We define nonparametric estimators: those that produce estimates from the data without any a priori restrictions on the conditional mean function: Part I
23+
24+
- Parametric estimation and other approaches to borrow information are our only hope when data are unable to speak for themselves.
25+
26+
27+
## Data cannot speak for themselves
28+
29+
- 16 individuals infected with HIV randomly sampled from a larger target population.
30+
31+
- Each individual receives a certain level of antiretroviral therapy. At the end CD4 cell count, in cells/mm3 is measured: We wish to consistently estimate the mean of cell counts for individuals with level A=a.
32+
33+
. . .
34+
35+
$\hat{E}=[Y|A=a]$ é um estimador consistente.
36+
37+
```{r eval=F}
38+
#install.packages("gfoRmula")
39+
40+
library(tidyverse)
41+
dataurls <- list()
42+
stub <- "https://cdn1.sph.harvard.edu/wp-content/uploads/sites/1268/"
43+
dataurls[[1]] <- paste0(stub, "2012/10/nhefs_sas.zip")
44+
dataurls[[2]] <- paste0(stub, "2012/10/nhefs_stata.zip")
45+
dataurls[[3]] <- paste0(stub, "2017/01/nhefs_excel.zip")
46+
dataurls[[4]] <- paste0(stub, "1268/20/nhefs.csv")
47+
48+
temp <- tempfile()
49+
for (i in 1:3) {
50+
download.file(dataurls[[i]], temp)
51+
unzip(temp, exdir = "data")
52+
}
53+
54+
download.file(dataurls[[4]], here("data", "nhefs.csv"))
55+
```
56+
57+
## Tratamentos
58+
59+
::: panel-tabset
60+
## 2 categorias
61+
62+
```{r echo=F}
63+
library(tidyverse)
64+
A <- c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
65+
Y <- c(200, 150, 220, 110, 50, 180, 90, 170, 170, 30,
66+
70, 110, 80, 50, 10, 20)
67+
68+
ggplot()+
69+
geom_point(aes(A, Y))+
70+
theme_classic(base_size = 14)+
71+
scale_x_continuous(breaks = c(0,1))+
72+
geom_label(aes(x=c(0,1), y=c(200, 30), label=c(67.5,
73+
146.2)))
74+
```
75+
76+
```{r}
77+
78+
summary(Y[A == 0])
79+
summary(Y[A == 1])
80+
```
81+
82+
## 4 categorias
83+
84+
```{r include=F}
85+
library(tidyverse)
86+
A2 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4)
87+
Y2 <- c(110, 80, 50, 40, 170, 30, 70, 50, 110, 50, 180,
88+
89+
130, 200, 150, 220, 210)
90+
m1 <- summary(Y2[A2 == 1])[[4]]
91+
m2 <-summary(Y2[A2 == 2 ])[[4]]
92+
m3 <- summary(Y2[A2 == 3 ])[[4]]
93+
m4 <- summary(Y2[A2 == 4 ])[[4]]
94+
```
95+
96+
```{r}
97+
ggplot()+
98+
geom_point(aes(A2, Y2))+
99+
theme_classic(base_size = 14)+
100+
scale_x_continuous(breaks = c(0,1))+
101+
geom_label(aes(x=c(1,2,3, 4), y=c(195), label=c(m1, m2, m3, m4)))
102+
```
103+
104+
## 0-100mg
105+
106+
- Quanto é a predição para A=90?
107+
108+
```{r echo=F}
109+
110+
A3 <-c(3, 11, 17, 23, 29, 37, 41, 53, 67, 79, 83, 97, 60, 71, 15, 45)
111+
Y3 <-c(21, 54, 33, 101, 85, 65, 157, 120, 111, 200, 140, 220, 230, 217,
112+
11, 190)
113+
114+
ggplot()+
115+
geom_point(aes(A3, Y3))+
116+
theme_classic(base_size = 14)
117+
```
118+
:::
119+
120+
## Parametric estimators of the conditional mean
121+
122+
$E[Y|A]= \theta_0 + \theta_1A$
123+
124+
- modelo linear, $\theta$ são parâmetros.
125+
126+
. . .
127+
128+
```{r echo=F}
129+
#install.packages("ggtext")
130+
library(ggtext)
131+
library(glue)
132+
modelo <- lm(Y3~A3)
133+
134+
a <- format(round(modelo$coefficients[[1]], 1), decimal.mark = ",")
135+
b <- format(round(modelo$coefficients[[2]],1), decimal.mark = ",")
136+
137+
theta0 <- modelo$coefficients[[1]]
138+
theta1 <- modelo$coefficients[[2]]
139+
140+
ggplot()+
141+
geom_point(aes(A3, Y3))+
142+
geom_smooth(aes(A3, Y3),
143+
method="lm", se=F, color="red")+
144+
theme_classic(base_size = 14)+
145+
geom_richtext(aes(50, 150, label=glue("E[Y|A] = {a} + {b}A ")),angle = 28)
146+
```
147+
148+
```{r}
149+
(reta_90 <- theta0 + theta1*90)
150+
```
151+
152+
- A certain degree of model misspecification is almost always expected.
153+
154+
155+
156+
## Mais parametros
157+
158+
::: panel-tabset
159+
160+
## Três parametros
161+
```{r echo=F, message=FALSE, warning=FALSE}
162+
library(geomtextpath)
163+
library(latex2exp)
164+
#> Loading required package: ggplot2
165+
a <- format(round(lm(Y3 ~ A3 + I(A3^2) )$coefficients[[1]],2),decimal.mark = ",")
166+
b <- format(round(lm(Y3 ~ A3 + I(A3^2) )$coefficients[[2]],2),decimal.mark = ",")
167+
c <- format(round(lm(Y3 ~ A3 + I(A3^2) )$coefficients[[3]],2),decimal.mark = ",")
168+
169+
theta0 <- lm(Y3 ~ A3 + I(A3^2) )$coefficients[[1]]
170+
theta1 <- lm(Y3 ~ A3 + I(A3^2) )$coefficients[[2]]
171+
theta2 <- lm(Y3 ~ A3 + I(A3^2) )$coefficients[[3]]
172+
173+
dado <-data.frame(A3, Y3)
174+
label <- expression("~A^2")
175+
176+
177+
ggplot(dado, aes(A3, Y3))+
178+
geom_point()+
179+
geom_textsmooth(aes(label = glue("E[Y|A] = {a} + {b}A + {c}A2")),
180+
method = "lm",formula=y ~ poly(x, 3), rich=TRUE,
181+
size = 5, linetype = 3,
182+
fontface = 2, linewidth = 1, color="darkred")
183+
```
184+
185+
```{r}
186+
(parabola_90 <- theta0 + theta1*90 + theta2*90^2)
187+
```
188+
189+
190+
## 12 parametros
191+
192+
```{r}
193+
ggplot(dado, aes(A3, Y3))+
194+
geom_point()+
195+
geom_smooth(method = "lm",formula=y ~ poly(x, 12),
196+
size = 5, linetype = 3, se=F,
197+
linewidth = 1, color="darkred")
198+
```
199+
200+
- Often modeling can be viewed as a procedure to transform noisy data into more or less smooth curves.
201+
:::
202+
203+
204+
205+
## The bias-variance trade-off
206+
207+
Qual o certo?
208+
209+
```{r}
210+
paste(round(reta_90,2), "IC = 172.1 - 261.6)")
211+
paste(round(parabola_90 ,2), "IC = 142.8 - 251.5)")
212+
213+
```
214+
215+
- Acabou!

0 commit comments

Comments
 (0)