Commit da8cd41: "Add more on fitting" (1 parent: d3c008c)

fitting.qmd: 96 additions, 0 deletions

@@ -14,6 +14,7 @@ execute:

```{r}
#| include: false
#| cache: false
set.seed(1)
source("common.R")
```

# A pragmatic introduction
@@ -493,3 +494,98 @@ Pass `runner` through to `monty_sample`:

```r
samples <- monty_sample(posterior, sampler, 1000,
                        runner = runner, n_chains = 4)
```

(sorry, this is broken unless your model is in a package!)

## Run chains on different cluster nodes

```r
monty_sample_manual_prepare(posterior, sampler, 10000, "mypath", n_chains = 10)
```

Then queue these up on a cluster, e.g., using hipercow:

```r
hipercow::task_create_bulk_call(
  monty_sample_manual_run, 1:10, args = list("mypath"))
```

And retrieve the result:

```r
samples <- monty_sample_manual_collect("mypath")
```

(sorry, also broken unless your model is in a package!)

## Autocorrelation {.smaller}

* A notion from time series, which translates to (P)MCMC in terms of the steps of the chains
* Autocorrelation refers to the correlation between the values of a time series at different points in time; in MCMC, this means correlation between successive samples
* In the context of MCMC, "high autocorrelation" can most of the time be read as "bad mixing"
* A signature of random-walk MCMC
* Likely to bias estimates (wrong mean) and to reduce variance compared with the true posterior distribution
* Linked with the notion of effective sample size (ESS): roughly speaking, the ESS gives the equivalent number of i.i.d. samples (see the sketch below)
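
As a rough illustration, the coda package can estimate the ESS directly from a chain. A minimal sketch, assuming `samples$pars` is the parameter x sample x chain array returned by `monty_sample` and that the model has a parameter named `"beta"` (both assumptions here):

```r
# ESS sketch: assumes samples$pars is a parameter x sample x chain array
# from monty_sample, with a parameter named "beta" (adjust to your model)
beta <- samples$pars["beta", , 1]  # draws for one parameter, first chain
coda::effectiveSize(beta)          # equivalent number of i.i.d. draws
length(beta)                       # compare with the raw number of draws
```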

## Autocorrelation in practice FAQ {.smaller}

* **Why is autocorrelation a problem?** Ideally we want independent and identically distributed (i.i.d.) samples from the target distribution; correlated samples carry less information per sample.
* **How to detect autocorrelation?** Calculate the **autocorrelation function (ACF)**, which measures the correlation between the samples and their lagged values (see the sketch below).
* **How to reduce autocorrelation?** There are a number of strategies, including running a longer chain, adapting the proposal distribution, and thinning or subsampling. By reducing autocorrelation we obtain better estimates of the target distribution and improve the accuracy of our Bayesian inference.
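
Base R's `acf()` plots this directly; a minimal sketch under the same assumptions about `samples$pars` as above:

```r
# ACF sketch (same assumed samples$pars layout as above); a slow decay
# across lags indicates strong autocorrelation
beta <- samples$pars["beta", , 1]
acf(beta, lag.max = 50, main = "Autocorrelation of beta, chain 1")
```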

## Thinning the chain

* Either during the fit or after it
* Faster, and uses less memory, to thin during the fit
* More flexible to thin later
* No real difference if trajectories are not saved

This is useful because most of your chain is not interesting due to the autocorrelation (a post-hoc sketch follows).
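
A minimal post-hoc thinning sketch, again assuming the `samples$pars` layout described above (thinning during the fit is configured when sampling, and is not shown here):

```r
# Post-hoc thinning sketch: keep every 10th draw (assumes samples$pars is
# a parameter x sample x chain array, as above)
keep <- seq(1, dim(samples$pars)[2], by = 10)
pars_thin <- samples$pars[, keep, , drop = FALSE]
dim(pars_thin)  # the sample dimension is now about 10x smaller
```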

## Saving history

* Save your trajectories at every collected sample
* Save the final state at every sample (for onward simulation; see the sketch below)
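
A sketch of saving the final state alongside trajectories, assuming `dust_likelihood_monty` accepts a `save_state` argument analogous to `save_trajectories` (an assumption here; check your dust2 version):

```r
# Sketch: also save the final filter state with each collected sample, for
# onward simulation (save_state is an assumed argument; see the dust2 docs)
likelihood <- dust_likelihood_monty(filter, packer,
                                    save_trajectories = TRUE,
                                    save_state = TRUE)
```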

## Trajectories

```{r}
likelihood <- dust_likelihood_monty(filter, packer, save_trajectories = TRUE)
posterior <- likelihood + prior
samples2 <- monty_sample(posterior, sampler, 100, initial = samples)
dim(samples2$observations$trajectories)
```

## Trajectories

```{r}
trajectories <- dust_unpack_state(filter,
                                  samples2$observations$trajectories)
matplot(data$time, drop(trajectories$incidence),
        type = "l", lty = 1, col = "#00000033")
points(data, pch = 19, col = "red")
```

# Next steps

* Forward-time predictions
* Posterior predictive checks
* Rerunning the filter in the MCMC
* Multi-parameter models
* Deterministic (expectation) models as starting points
* Adaptive fitting (deterministic models only)
* HMC

# Resources {.smaller}

A nice PMCMC introduction written for epidemiologists:
[Endo, A., van Leeuwen, E. & Baguelin, M. Introduction to particle Markov-chain Monte Carlo for disease dynamics modellers. Epidemics 29, 100363 (2019).](https://www.sciencedirect.com/science/article/pii/S1755436519300301?via%3Dihub)

A tutorial on SMC:
[Doucet, A. & Johansen, A. M. A tutorial on particle filtering and smoothing: fifteen years later. Oxford Handbook of Nonlinear Filtering, 656–705 (2011).](https://www.stats.ox.ac.uk/~doucet/doucet_johansen_tutorialPF2011.pdf)

The reference paper on PMCMC:
[Andrieu, C., Doucet, A. & Holenstein, R. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B (Statistical Methodology) 72, 269–342 (2010).](https://www.stats.ox.ac.uk/~doucet/andrieu_doucet_holenstein_PMCMC.pdf)

A software-oriented paper introducing odin, dust and mcstate:
[FitzJohn, R. G. et al. Reproducible parallel inference and simulation of stochastic state space models using odin, dust, and mcstate. Wellcome Open Res. 5, 288 (2021).](https://wellcomeopenresearch.org/articles/5-288)
