Commit af0447d: Fix spelling and whitespace
1 parent da8cd41
File tree: 6 files changed (+62 -36 lines)


WORDLIST (+40)

````diff
@@ -1,24 +1,64 @@
+'s
+Andrieu
+BSSMC
+BUGs
+Baguelin
+Bayes
+BayesTools
 COVID
 DIDE
 DSLs
+ESS
+Endo
+FitzJohn
+HMC
 HPC
+Holenstein
+Leeuwen
 McElreath's
 ODEs
 OpenMP
+Poisson
 R's
 RStudio
+Randow
 Rtools
+SMC
+SSM
+SimCity
+VCV
+Wellcome
 Wickham
+WinBUGS
 aseasonal
 callout
+cdot
+csv
+dI
+dR
+dS
+discretised
+doi
+dt
 etc
+frac
 ggplot2
+hyperpriors
+leq
+mathrm
+mcstate
 metaprogramming
 monty
+nd
 nondifferentiable
 nonlinear
 odin
 odin's
 pMCMC
+propto
+sim
 standalone
 stochasticity
+th
+unnormalised
+untractable
````

arrays.qmd (+2 -2)

````diff
@@ -99,7 +99,7 @@ so there is an implicit indexing by `i` on the LHS in that equation of the odin
 
 We can see in our first example that it is necessary to declare the dimensions of all arrays within your `odin` code using a `dim` equation.
 
-Dimensions can be hardcoded:
+Dimensions can be hard-coded:
 ```r
 dim(S) <- 2
 ```
@@ -374,4 +374,4 @@ and there would be some other bits we'd need to change to deal with the increase
 ```r
 s_ij[, ] <- m[i, j] * sum(I[j, ,])
 initial(I[, , ]) <- I0[i, j, k]
-``` 
+```
````

fitting.qmd (+4 -18)

````diff
@@ -61,7 +61,7 @@ matplot(t, t(incidence), type = "l", lty = 1, col = "#00000033",
 We need:
 
 * a data set
-  - time series of observed data (incidence? prevalance? something else?)
+  - time series of observed data (incidence? prevalence? something else?)
 * a measure of goodness of fit
   - how do we cope with stochasticity?
 * to know what parameters we are trying to fit
@@ -248,7 +248,7 @@ AKA, the particle filter
 - More complex structures are built up from simpler objects
 - Filter {data, model, n_particles}
 - PMCMC {parameters, filter}
-- Provides you with low-level tools, and little handholding
+- Provides you with low-level tools, and little hand-holding
 - Pretty fast though
 
 # PMCMC {.smaller}
@@ -503,7 +503,7 @@ samples <- monty_sample(posterior, sampler, 1000,
 monty_sample_manual_prepare(posterior, sampler, 10000, "mypath", n_chains = 10)
 ```
 
-Then queue these up on a cluster, e.g., using hipercow:
+Then queue these up on a cluster, e.g., using `hipercow`:
 
 ```r
 hipercow::task_create_bulk_call(
@@ -570,22 +570,8 @@ points(data, pch = 19, col = "red")
 
 * forward time predictions
 * posterior predictive checks
-* rerun filter in mcmc
+* rerun filter in MCMC
 * multi-parameter models
 * deterministic (expectation) models as starting points
 * adaptive fitting (deterministic models only)
 * HMC
-
-# Resources {.smaller}
-
-A nice PMCMC introduction written for the epidemiologist
-[Endo, A., van Leeuwen, E. & Baguelin, M. Introduction to particle Markov-chain Monte Carlo for disease dynamics modellers. Epidemics 29, 100363 (2019).](https://www.sciencedirect.com/science/article/pii/S1755436519300301?via%3Dihub)
-
-A tutorial about SMC
-[Doucet, A. & Johansen, A. M. A Tutorial on Particle filtering and smoothing: Fiteen years later. Oxford Handb. nonlinear Filter. 656–705 (2011). doi:10.1.1.157.772](https://www.stats.ox.ac.uk/~doucet/doucet_johansen_tutorialPF2011.pdf)
-
-The reference paper on PMCMC
-[Andrieu, C., Doucet, A. & Holenstein, R. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B (Statistical Methodol. 72, 269–342 (2010).](https://www.stats.ox.ac.uk/~doucet/andrieu_doucet_holenstein_PMCMC.pdf)
-
-A software oriented paper introducing odin, dust and mcstate
-[R. G. FitzJohn et al. Reproducible parallel inference and simulation of stochastic state space models using odin, dust, and mcstate. Wellcome Open Res. 2021 5288 5, 288 (2021).](https://wellcomeopenresearch.org/articles/5-288)
````

model.qmd (+10 -10)

````diff
@@ -13,7 +13,7 @@ In this book, we explore two fundamentally different yet interconnected approach
 
 Imagine a city model, as in *SimCity*, where thousands of virtual inhabitants follow routines, interact, and respond to changes like new buildings or natural disasters. These elements evolve according to predefined rules and parameters, similar to how **dynamical systems** simulate real-world processes over time.
 
-In a [**dynamical model**](https://en.wikipedia.org/wiki/Dynamical_system), we track changes in a system's "state" over time. The state is a summary of key information at a specific time point; for example, in an epidemiological model, the state might include the numbers of susceptible, infected, and recovered individuals. More detailed models may add variables like age, location, health risks, and symptoms, allowing us to simulate interventions, such as vaccination campaigns, and explore their potential impacts. There exist many formalisms to describe dynamical systems, incorporating e.g. specific input and output functions in control theory but the central element of these systems is this notion of the "state", a mathematical object summarising the system at a timepoint.
+In a [**dynamical model**](https://en.wikipedia.org/wiki/Dynamical_system), we track changes in a system's "state" over time. The state is a summary of key information at a specific time point; for example, in an epidemiological model, the state might include the numbers of susceptible, infected, and recovered individuals. More detailed models may add variables like age, location, health risks, and symptoms, allowing us to simulate interventions, such as vaccination campaigns, and explore their potential impacts. There exist many formalisms to describe dynamical systems, incorporating e.g. specific input and output functions in control theory but the central element of these systems is this notion of the "state", a mathematical object summarising the system at a time-point.
 
 The `odin` package, which supports differential and difference equations, is an effective tool for building dynamical models, enabling users to simulate time-evolving systems. Dynamical models are well-suited for exploring how specific scenarios or interventions impact a system over time, making them particularly useful for modelling real-world phenomena in a structured way.
 
@@ -37,19 +37,19 @@ A natural way to bridge dynamical and statistical models is by introducing [**st
 
 In a **stochastic process**, the system’s state is no longer a single deterministic value but a collection of potential outcomes, each weighted by a probability. This probabilistic view enables us to model fluctuations and uncertainties within the dynamical framework, capturing both the system’s evolution and the uncertainty in each state. Stochastic processes are thus a natural extension of dynamical models, adding an extra layer of realism by treating system evolution as inherently uncertain.
 
-The `odin` package provides an intuitive framework for writing a class of stochastic systems, making it easier to define models that incorporate randomness in their evolution over time. The `dust` package complements `odin` by enabling efficient large-scale simulations, allowing users to capture and analyse the uncertainty inherent to these systems through repeated runs. Together, `odin` and `dust` offer a powerful toolkit for developing and exploring stochastic models that reflect the complexity and variability of real-world dynamics. 
+The `odin` package provides an intuitive framework for writing a class of stochastic systems, making it easier to define models that incorporate randomness in their evolution over time. The `dust` package complements `odin` by enabling efficient large-scale simulations, allowing users to capture and analyse the uncertainty inherent to these systems through repeated runs. Together, `odin` and `dust` offer a powerful toolkit for developing and exploring stochastic models that reflect the complexity and variability of real-world dynamics.
 
 TODO simple dust example or just link with relevant dust section in the book
 
 ### Bayesian inference: statistical modelling of model parameters
 
 **Bayesian inference** is another approach to linking dynamical and statistical models by treating model parameters as random variables rather than fixed values. This introduces a probability distribution over possible parameter values, making parameter estimation a statistical problem.
 
-Using Bayes theorem, Bayesian inference combines:
+Using Bayes' theorem, Bayesian inference combines:
 
 - **The likelihood**: the probability of observing the data given specific parameter values, and
 - **The prior distribution**: our initial assumptions about the parameter values before observing the data.
- 
+
 The result is the **posterior distribution** of a parameter $\theta$ given data $y$:
 
 $$
@@ -72,11 +72,11 @@ By combining **stochastic processes** with **Bayesian inference**, we add a dual
 
 In both dynamical and statistical frameworks, the number of parameters can be adjusted as needed to capture the desired level of complexity. In the `monty` package, random variables - termed 'parameters' with a slight simplification of language - are typically used to summarise processes, and so they often form a more compact set than those found in dynamical models. This distinction is especially relevant in Bayesian models constructed from complex `odin` models.
 
-In dynamical systems, parameters define the structure and evolution of a scenario in detail. For instance, an epidemiological model may include parameters for transmission rates, contact patterns, or intervention schedules. These inputs enable "what-if" scenarios, allowing decision-makers to predict and manage changes in real time. The `odin` package, designed to support such dynamical models, provides the flexibility to specify numerous parameters for exploring system behaviours over time. 
+In dynamical systems, parameters define the structure and evolution of a scenario in detail. For instance, an epidemiological model may include parameters for transmission rates, contact patterns, or intervention schedules. These inputs enable "what-if" scenarios, allowing decision-makers to predict and manage changes in real time. The `odin` package, designed to support such dynamical models, provides the flexibility to specify numerous parameters for exploring system behaviours over time.
 
-Statistical models, by contrast, use parameters to define probability distributions over possible outcomes, capturing uncertainties, predicting risks, or summarising data patterns. In Bayesian models based on a complex `odin` framework, the statistical parameters are usually a subset of those used in the dynamical model itself. Parameters such as those defining a vaccination campaign (e.g. daiy number of doses given to target groups), for example, might be central to shaping the `odin` model but may not necessarily be included in Bayesian inference (that might focus on just vaccine efficacy at most). This selective approach allows us to quantify uncertainties and make probabilistic inferences about key aspects of the model without needing to explore detail of the underlying dynamics that are "known" for what was actually observed.
+Statistical models, by contrast, use parameters to define probability distributions over possible outcomes, capturing uncertainties, predicting risks, or summarising data patterns. In Bayesian models based on a complex `odin` framework, the statistical parameters are usually a subset of those used in the dynamical model itself. Parameters such as those defining a vaccination campaign (e.g. daily number of doses given to target groups), for example, might be central to shaping the `odin` model but may not necessarily be included in Bayesian inference (that might focus on just vaccine efficacy at most). This selective approach allows us to quantify uncertainties and make probabilistic inferences about key aspects of the model without needing to explore detail of the underlying dynamics that are "known" for what was actually observed.
 
-Thus, while dynamical models rely on a broad parameter set for flexibility, statistical parameters summarise uncertainty more compactly, making the combined approach especially effective for realistic, data-driven inferences. 
+Thus, while dynamical models rely on a broad parameter set for flexibility, statistical parameters summarise uncertainty more compactly, making the combined approach especially effective for realistic, data-driven inferences.
 
 ## Probability densities: normalised vs. unnormalised
 
@@ -92,11 +92,11 @@ where:
 
 - $p(y|\theta)$ is the likelihood, and
 - $p(\theta)$ is the prior distribution, as above.
- 
-Note that it does not involve the normalising constant $p(y)$ anymore. The reason is that since calculating $p(y)$ can be very difficult, we often work with the **unnormalised posterior density** $p(y|\theta)p(\theta)$. This unnormalised form is sufficient for many Monte Carlo methods, where only relative densities matter.
+
+Note that it does not involve the normalising constant $p(y)$ any more. The reason is that since calculating $p(y)$ can be very difficult, we often work with the **unnormalised posterior density** $p(y|\theta)p(\theta)$. This unnormalised form is sufficient for many Monte Carlo methods, where only relative densities matter.
 
 A **normalised density** integrates to 1 over its entire parameter space. This is necessary for direct probability interpretations and for certain Bayesian methods, like model comparison using [Bayes factors](https://en.wikipedia.org/wiki/Bayes_factor).
 
 Monte Carlo algorithms, such as **Metropolis-Hastings** and **Importance Sampling**, often operate using unnormalised densities, focusing on relative probabilities rather than absolute values. This makes them efficient for high-dimensional problems where calculating a normalising constant is untractable.
 
-Normalisation in probability densities has parallels in physics. In statistical mechanics, the **partition function** $Z$ normalises probabilities over all possible states, much like normalising densities in Bayesian inference. This connection highlights why algorithms like **Metropolis-Hastings** use unnormalised densities: they mirror physical systems where absolute energies are less relevant than energy differences. 
+Normalisation in probability densities has parallels in physics. In statistical mechanics, the **partition function** $Z$ normalises probabilities over all possible states, much like normalising densities in Bayesian inference. This connection highlights why algorithms like **Metropolis-Hastings** use unnormalised densities: they mirror physical systems where absolute energies are less relevant than energy differences.
````
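The model.qmd text above explains that Metropolis-Hastings needs only an unnormalised density, because only density *ratios* enter the acceptance step. A minimal stand-alone sketch of that idea (plain Python, not the `monty` implementation; all function and variable names here are illustrative):

```python
import math
import random

def metropolis_hastings(log_unnorm_density, x0, n_steps, step_sd, seed=42):
    """Random-walk Metropolis using only an unnormalised log-density.

    The acceptance ratio compares two density values, so any constant
    factor (the normalising constant) cancels and is never computed.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_unnorm_density(x)
    samples = []
    for _ in range(n_steps):
        # Propose a jump drawn from a normal distribution centred on x.
        proposal = x + rng.gauss(0.0, step_sd)
        log_p_prop = log_unnorm_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if math.log(rng.random()) < log_p_prop - log_p:
            x, log_p = proposal, log_p_prop
        samples.append(x)
    return samples

# Target: a standard normal, deliberately left unnormalised
# (the -0.5 * log(2 * pi) constant is omitted on purpose).
samples = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000, 1.0)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Despite never seeing the normalising constant, the chain's sample mean and variance approach 0 and 1, the moments of the standard normal.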

monty.qmd (+4 -4)

````diff
@@ -60,11 +60,11 @@ plot(l,
 
 ## Sampling from our example distribution
 
-We now want to sample from this model, using the `monty_sample()` function. For this we need to tell `monty` which sampler we want to use to explore our distribution. There are a variety of samplers available and you can learn about them in @sec-samplers. One of the simplest is the random walk [Metropolis-Hastings](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm) algorithm that should work almost out of the box (though not necesseraly efficiently) in most cases.
+We now want to sample from this model, using the `monty_sample()` function. For this we need to tell `monty` which sampler we want to use to explore our distribution. There are a variety of samplers available and you can learn about them in @sec-samplers. One of the simplest is the random walk [Metropolis-Hastings](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm) algorithm that should work almost out of the box (though not necessarily efficiently) in most cases.
 
-The random walk sampler uses a variance-covariance (VCV) matrix to guide its exploration, determining the 'jump' from the current point to the next in a random walk by drawing from a multivariate normal distribution parameterised by this matrix. For our single-parameter model here, we use a 1x1 matrix of variance 2 (`matrix(2)`) as our VCV matrix.
+The random walk sampler uses a variance-covariance (VCV) matrix to guide its exploration, determining the 'jump' from the current point to the next in a random walk by drawing from a multivariate normal distribution parametrised by this matrix. For our single-parameter model here, we use a 1x1 matrix of variance 2 (`matrix(2)`) as our VCV matrix.
 
-The choice of the VCV matrix is critical for the sampler's efficiency, especially in more complex cases where the tuning of this matrix can significantly affect performance. A well-chosen VCV matrix optimises moving accross the parameter space, making the random walk sampler more effective in exploring the distribution of interest.
+The choice of the VCV matrix is critical for the sampler's efficiency, especially in more complex cases where the tuning of this matrix can significantly affect performance. A well-chosen VCV matrix optimises moving across the parameter space, making the random walk sampler more effective in exploring the distribution of interest.
 
 ```{r}
 sampler <- monty_sampler_random_walk(matrix(2))
@@ -91,7 +91,7 @@ hist(samples$pars["l", , ], breaks = 100)
 
 `monty` includes a simple probabilistic domain-specific language (DSL) that is inspired by languages of the BUGS family such as `stan` and [Statistical Rethinking](https://xcelab.net/rm/). It is designed to make some tasks a bit easier, particularly when defining priors for your model. We expect that this DSL is not sufficiently advanced to represent most interesting models but it may get more clever and flexible in the future. In particular we do not expect the DSL to be useful in writing likelihood functions for comparison to data; we expect that if your model is simple enough for this you would be better off using `stan` or some similarly flexible system.
 
-*mention SSM, dr jacoby, mcstate, BayesTools
+*mention SSM, `drjacoby`, `mcstate`, `BayesTools`
 
 ## Going further
````

stochasticity.qmd (+2 -2)

````diff
@@ -146,7 +146,7 @@ plot(t, y$incidence, type = "p", xlab = "Time", ylab = "Incidence")
 
 We have already seen use of Binomial draws, and we support other distributions. Support for the distributions includes both random draws and density calculations (more on that later). The support for these functions comes from `monty` and all the distributions are available for use in the both the `odin` DSL and the `monty` DSL (more on that also later).
 
-Some distributions have several parameterisations; these are distinguished by the arguments to the functions. These distributions have a default parameterisation. For example, the Gamma distribution defaults to a `shape` and `rate` parameterisation so:
+Some distributions have several parametrisations; these are distinguished by the arguments to the functions. These distributions have a default parametrisation. For example, the Gamma distribution defaults to a `shape` and `rate` parametrisation so:
 
 ```r
 a <- Gamma(2, 0.1)
@@ -162,7 +162,7 @@ a <- Gamma(shape = 2, scale = 10)
 
 draw from a Gamma distribution with a shape of 2 and a **scale** of 10. You may find it is good practice to specify your arguments with such distributions whether using the default parameterisation or not.
 
-Other supported distributions include the Normal, Uniform, Exponential, Beta and Poisson distributions. A full list of supported distributions and their parameterisations can be found [here](https://mrc-ide.github.io/odin2/articles/functions.html#distribution-functions).
+Other supported distributions include the Normal, Uniform, Exponential, Beta and Poisson distributions. A full list of supported distributions and their parametrisations can be found [here](https://mrc-ide.github.io/odin2/articles/functions.html#distribution-functions).
 
 ## More sections to add:
````