Commit 3365a1f

Finalize solutions for probability
1 parent 643fca0 commit 3365a1f

1 file changed

Lines changed: 158 additions & 1 deletion

content/exercises/probability/index.md

@@ -160,7 +160,7 @@ Pr(B | A) = Pr(A)/Pr(A) = 1
160160

161161
But since _all_ probabilities are at most 1, it follows that `Pr(B | A) ≥ Pr(A)`.
162162

163-
# Base-rate fallacy
163+
# Base-rate fallacy {.solved}
164164

165165
The [base-rate fallacy](https://en.wikipedia.org/wiki/Base_rate_fallacy) is
166166
a common probabilistic fallacy involving conditional probabilities, which
@@ -210,3 +210,160 @@ medicine to a patient where we're more certain than not that they have the
210210
disease. How many times do you need to get a positive result before you should
211211
administer the medicine if you apply Bayesian updating after each positive
212212
result?
213+
214+
## Solution {#base-rate-fallacySolution .solution}
215+
216+
1. Here are the relevant probabilities:
217+
218+
- `Pr(HasDisease) = 0.01`
219+
- `Pr(TestPositive | HasDisease) = 0.9`
220+
- `Pr(¬TestPositive | ¬HasDisease) = 0.9`
221+
222+
2. According to Bayes' formula:
223+
224+
```
225+
Pr(HasDisease | TestPositive)
226+
```
227+
```
228+
=
229+
```
230+
```
231+
[Pr(HasDisease) x Pr(TestPositive | HasDisease) ] / Pr(TestPositive)
232+
```
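
As a hedged illustration, Bayes' formula can be written as a small Python function. The name `bayes_posterior` and the sample numbers are assumptions made for this sketch; they are not part of the exercise.

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return Pr(H | E) = Pr(H) x Pr(E | H) / Pr(E)."""
    if evidence <= 0:
        raise ValueError("Pr(E) must be positive to condition on E")
    return prior * likelihood / evidence


# Illustrative numbers only (hypothetical, not from the exercise):
# Pr(H) = 0.2, Pr(E | H) = 0.5, Pr(E) = 0.25
print(bayes_posterior(0.2, 0.5, 0.25))
```

The function needs `Pr(E)` as an input, which is why the denominator has to be worked out separately.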
233+
234+
We have `Pr(HasDisease)` and `Pr(TestPositive | HasDisease)`, but we need to figure out `Pr(TestPositive)`.
235+
236+
For this, we use the law of total probability. Applying it with `TestPositive` as `A` and `HasDisease` as `B` gives us:
237+
238+
```
239+
Pr(TestPositive)
240+
```
241+
```
242+
=
243+
```
244+
```
245+
Pr(TestPositive | HasDisease)Pr(HasDisease)
246+
```
247+
```
248+
+
249+
```
250+
```
251+
Pr(TestPositive | ¬HasDisease)Pr(¬HasDisease)
252+
```
253+
254+
We have `Pr(TestPositive | HasDisease)` and `Pr(HasDisease)` given. To obtain
255+
`Pr(TestPositive | ¬HasDisease)` and `Pr(¬HasDisease)`, we apply the negation laws:
256+
```
257+
Pr(TestPositive | ¬HasDisease) = 1 - Pr(¬TestPositive | ¬HasDisease)
258+
```
259+
```
260+
Pr(¬HasDisease) = 1 - Pr(HasDisease)
261+
```
262+
263+
Now we have all the relevant values and can calculate:
264+
265+
```
266+
Pr(¬HasDisease) = 1 - Pr(HasDisease) = 1 - 0.01 = 0.99
267+
```
268+
```
269+
Pr(TestPositive | ¬HasDisease) = 1 - Pr(¬TestPositive | ¬HasDisease)
270+
```
271+
```
272+
= 1 - 0.9 = 0.1
273+
```
274+
```
275+
Pr(TestPositive | ¬HasDisease)Pr(¬HasDisease) = 0.1 x 0.99 = 0.099
276+
```
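
The negation-law steps can be checked with exact fractions. This is a minimal sketch: the helper `complement` and the variable names are hypothetical, while the values 0.01 and 0.9 come from the exercise.

```python
from fractions import Fraction


def complement(p: Fraction) -> Fraction:
    """Negation law: Pr(not A) = 1 - Pr(A)."""
    return 1 - p


pr_disease = Fraction(1, 100)  # Pr(HasDisease) = 0.01
specificity = Fraction(9, 10)  # Pr(¬TestPositive | ¬HasDisease) = 0.9

pr_healthy = complement(pr_disease)            # Pr(¬HasDisease)
false_positive_rate = complement(specificity)  # Pr(TestPositive | ¬HasDisease)
joint_false_positive = false_positive_rate * pr_healthy

print(float(pr_healthy), float(false_positive_rate), float(joint_false_positive))
```

Using `Fraction` avoids floating-point rounding, so the intermediate values match the hand calculation exactly.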
277+
278+
So:
279+
280+
```
281+
Pr(TestPositive)
282+
```
283+
```
284+
=
285+
```
286+
```
287+
Pr(TestPositive | HasDisease)Pr(HasDisease)
288+
```
289+
```
290+
+
291+
```
292+
```
293+
Pr(TestPositive | ¬HasDisease)Pr(¬HasDisease)
294+
```
295+
```
296+
=
297+
```
298+
```
299+
(0.9 x 0.01) + 0.099 = 0.108
300+
```
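
The law of total probability for this two-case split can be sketched as follows. The function name `total_probability` and its parameter names are assumptions for illustration; the inputs are the exercise's numbers.

```python
from fractions import Fraction


def total_probability(prior, sensitivity, false_positive_rate):
    """Pr(E) = Pr(E | H) Pr(H) + Pr(E | ¬H) Pr(¬H)."""
    return sensitivity * prior + false_positive_rate * (1 - prior)


pr_test_positive = total_probability(
    prior=Fraction(1, 100),               # Pr(HasDisease)
    sensitivity=Fraction(9, 10),          # Pr(TestPositive | HasDisease)
    false_positive_rate=Fraction(1, 10),  # Pr(TestPositive | ¬HasDisease)
)
print(pr_test_positive, float(pr_test_positive))
```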
301+
302+
Now for the final probability:
303+
304+
```
305+
Pr(HasDisease | TestPositive)
306+
```
307+
```
308+
=
309+
```
310+
```
311+
[Pr(HasDisease) x Pr(TestPositive | HasDisease) ] / Pr(TestPositive)
312+
```
313+
```
314+
=
315+
```
316+
```
317+
(0.01 x 0.9) / 0.108 = 0.009 / 0.108 = 0.083...
318+
```
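
As a cross-check, the whole calculation can be done in one step with exact arithmetic. This is a sketch under the exercise's numbers; `posterior_given_positive` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior_given_positive(prior, sensitivity, specificity):
    """Pr(H | positive test), with the denominator from total probability."""
    true_pos = sensitivity * prior                # Pr(E | H) Pr(H)
    false_pos = (1 - specificity) * (1 - prior)   # Pr(E | ¬H) Pr(¬H)
    return true_pos / (true_pos + false_pos)


posterior = posterior_given_positive(
    Fraction(1, 100), Fraction(9, 10), Fraction(9, 10)
)
print(posterior, round(float(posterior), 4))
```

The exact value is 1/12, which is where the repeating decimal 0.083... comes from.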
319+
320+
In other words, a random person who tests positive in this setup has a
321+
probability of only around 8% of having the disease, even though the test
322+
is 90% reliable. This is because the prior probability of the person having
323+
the disease, the so-called base rate, is very low.
324+
325+
Note that it's important here that the person was chosen randomly. If there
326+
is additional information, such as them showing symptoms, the prior of them
327+
having the disease would be different.
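
To illustrate how much the prior matters, the sketch below recomputes the posterior for a few priors. Only 0.01 comes from the exercise; the priors 0.10 and 0.50 are made-up values standing in for a patient about whom we have additional information.

```python
def posterior_given_positive(prior, sensitivity=0.9, specificity=0.9):
    """Pr(HasDisease | TestPositive) for a given prior Pr(HasDisease)."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)


# 0.01 is the exercise's base rate; 0.10 and 0.50 are hypothetical priors.
for prior in (0.01, 0.10, 0.50):
    print(f"prior={prior:.2f} -> posterior={posterior_given_positive(prior):.3f}")
```

With the same test, a higher prior leads to a much higher posterior, which is the point of the remark about random selection.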
328+
329+
3. If we reset the prior using Bayesian updating, the posterior from the first test becomes the new prior:
330+
331+
```
332+
Pr(HasDisease) = 0.083...
333+
```
334+
335+
This affects the value of `Pr(TestPositive)`, which we calculated using `Pr(HasDisease)`.
336+
337+
We now get:
338+
339+
```
340+
Pr(¬HasDisease) = 1 - Pr(HasDisease) = 1 - 0.083... = 0.916...
341+
```
342+
```
343+
Pr(TestPositive | ¬HasDisease)Pr(¬HasDisease) = 0.1 x 0.916... = 0.0916...
344+
```
345+
346+
So, we now have for `Pr(TestPositive)` that:
347+
348+
```
349+
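Pr(TestPositive) = (0.9 x 0.083...) + 0.0916... = 0.075 + 0.0916... = 0.1666...
350+
```
351+
352+
```
353+
Pr(HasDisease | TestPositive)
354+
```
355+
```
356+
=
357+
```
358+
```
359+
[Pr(HasDisease) x Pr(TestPositive | HasDisease) ] / Pr(TestPositive)
360+
```
361+
```
362+
=
363+
```
364+
```
365+
(0.083... x 0.9) / 0.1666... = 0.075 / 0.1666... = 0.45
366+
```
367+
368+
So, two consecutive positive tests on a random citizen raise the probability only to 45%, which is still below 50%. Updating once more with `Pr(HasDisease) = 0.45` gives `(0.45 x 0.9) / (0.45 x 0.9 + 0.55 x 0.1) = 0.405 / 0.46 ≈ 0.88`, so three consecutive positive tests are needed before we're more certain than not that they have the disease.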
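
Repeated Bayesian updating can be sketched as a loop in which each posterior becomes the next prior. The sketch assumes that test results are independent given the patient's true disease status, which the exercise does not state explicitly. The names `update_on_positive` and `positives_until` are hypothetical.

```python
from fractions import Fraction


def update_on_positive(prior, sensitivity, specificity):
    """One Bayesian update after a positive test result."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)


def positives_until(prior, sensitivity, specificity, threshold, max_tests=100):
    """Posteriors after each positive test; stops once one exceeds
    `threshold` or after `max_tests` updates."""
    if not 0 < prior < 1 or sensitivity <= 1 - specificity:
        raise ValueError("positive results cannot raise the probability")
    history = []
    belief = prior
    for _ in range(max_tests):
        belief = update_on_positive(belief, sensitivity, specificity)
        history.append(belief)
        if belief > threshold:
            break
    return history


history = positives_until(
    Fraction(1, 100), Fraction(9, 10), Fraction(9, 10), threshold=Fraction(1, 2)
)
for n, belief in enumerate(history, start=1):
    print(n, belief, round(float(belief), 3))
```

Under these assumptions the posteriors are 1/12, 9/20 and 81/92, so the probability first exceeds 50% after the third positive result.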
369+
