* [Ariel Caticha - 2012 - Entropic Inference and the Foundations of Physics](https://github.com/bmlip/course/blob/main/assets/files/Caticha-2012-Entropic-Inference-and-the-Foundations-of-Physics.pdf), pp.30-34, section 2.8, the Gaussian distribution
* References
- * [E.T. Jaynes - 2003 - Probability Theory, The Logic of Science](http://www.med.mcgill.ca/epidemiology/hanley/bios601/GaussianModel/JaynesProbabilityTheory.pdf) (the best book available on the Bayesian view of probability theory)
+ * [E.T. Jaynes - 2003 - The central, Gaussian or normal distribution, ch.7 in: Probability Theory, The Logic of Science](https://github.com/bmlip/course/blob/main/assets/files/Jaynes%20-%202003%20-%20Probability%20theory%20-%20ch-7%20-%20Gaussian%20distribution.pdf) (a very insightful chapter in Jaynes' book on the Gaussian distribution)
"""
@@ -123,7 +123,7 @@ md"""
##### Solution
- - See later in this lecture.
+ - See [later in this lecture](#Challenge-Revisited:-Gaussian-Density-Estimation).
"""
# ╔═╡ 71f1c8ee-3b65-4ef8-b36f-3822837de410
@@ -203,7 +203,7 @@ Why is the Gaussian distribution so ubiquitously used in science and engineering
* Any smooth function with a single rounded maximum converges to a Gaussian function when raised to higher and higher powers. This is particularly useful in sequential Bayesian inference, where repeated multiplicative updates lead to increasingly Gaussian posteriors. (See also this [tweet](https://x.com/Almost_Sure/status/1745480056288186768).)
* The [Gaussian distribution has higher entropy](https://en.wikipedia.org/wiki/Differential_entropy#Maximization_in_the_normal_distribution) than any other distribution with the same variance.
* Therefore, any operation on a probability distribution that discards information but preserves variance brings it closer to a Gaussian.
- * As an example, see [Jaynes, section 7.1.4](http://www.med.mcgill.ca/epidemiology/hanley/bios601/GaussianModel/JaynesProbabilityTheory.pdf#page=250) for how this leads to the [Central Limit Theorem](https://en.wikipedia.org/wiki/Central_limit_theorem), which results from performing convolution operations on distributions.
+ * As an example, see [Jaynes, section 7.1.4](https://github.com/bmlip/course/blob/main/assets/files/Jaynes%20-%202003%20-%20Probability%20theory%20-%20ch-7%20-%20Gaussian%20distribution.pdf) for how this leads to the [Central Limit Theorem](https://en.wikipedia.org/wiki/Central_limit_theorem), which results from repeated convolution of distributions.
2. Once a Gaussian form has been attained, it tends to be preserved. For example:
@@ -212,7 +212,7 @@ Why is the Gaussian distribution so ubiquitously used in science and engineering
* The product of two Gaussian functions is another Gaussian function (useful in Bayes' rule).
* The Fourier transform of a Gaussian function is another Gaussian function.
- See also [Jaynes, section 7.1.4](http://www.med.mcgill.ca/epidemiology/hanley/bios601/GaussianModel/JaynesProbabilityTheory.pdf#page=250), and the whole of chapter 7 in his book, for more details on why the Gaussian distribution is so useful.
+ See also [Jaynes, section 7.1.4](https://github.com/bmlip/course/blob/main/assets/files/Jaynes%20-%202003%20-%20Probability%20theory%20-%20ch-7%20-%20Gaussian%20distribution.pdf), and the whole of chapter 7 in his book, for more details on why the Gaussian distribution is so useful.
"""
@@ -245,7 +245,7 @@ for given ``A`` and ``b``, the mean and covariance of ``z`` are given by ``\mu_z
Since a Gaussian distribution is fully specified by its mean and covariance matrix, it follows that a linear transformation ``z=Ax+b`` of a Gaussian variable ``x \sim \mathcal{N}(\mu_x,\Sigma_x)`` is Gaussian distributed as ``z \sim \mathcal{N}\left(A\mu_x + b, A\Sigma_x A^T\right)``.
In case ``x`` is not Gaussian, higher-order moments may be needed to specify the distribution for ``z``.
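These transformation rules are easy to check by simulation. The sketch below (our own, not part of the notebook; it uses only the Julia standard library and arbitrary example values for ``A``, ``b``, ``\mu_x``, ``\Sigma_x``) samples ``x`` via a Cholesky factor and compares the sample moments of ``z = Ax + b`` with ``A\mu_x + b`` and ``A\Sigma_x A^T``:

```julia
# Monte Carlo sanity check of μz = A*μx + b and Σz = A*Σx*A' for z = A*x + b.
using LinearAlgebra, Random, Statistics

rng = MersenneTwister(42)

μx = [1.0, -2.0]
Σx = [2.0 0.5; 0.5 1.0]
A  = [1.0 2.0; 0.0 1.0]
b  = [0.5, -1.0]

L = cholesky(Σx).L          # x = μx + L*e with e ~ N(0, I) has mean μx, covariance Σx
E = randn(rng, 2, 200_000)
X = μx .+ L * E             # columns are samples of x ~ N(μx, Σx)
Z = A * X .+ b              # columns are samples of z

μz_hat = vec(mean(Z; dims=2))
Σz_hat = cov(Z; dims=2)

println(μz_hat)             # ≈ A*μx + b  = [-2.5, -3.0]
println(Σz_hat)             # ≈ A*Σx*A'   = [8.0 2.5; 2.5 1.0]
```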
@@ -265,7 +265,7 @@ A commonly occurring example of a linear transformation is the *sum of two indep
Let ``x \sim \mathcal{N} \left(\mu_x, \sigma_x^2 \right)`` and ``y \sim \mathcal{N} \left(\mu_y, \sigma_y^2 \right)``. Prove that the PDF for ``z=x+y`` is given by ``p_z(z) = \mathcal{N}\left(z \,|\, \mu_x + \mu_y, \sigma_x^2 + \sigma_y^2\right)``.
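Before working through the proof, the result can be verified numerically: the PDF of a sum of independent variables is the convolution of their PDFs, so convolving the two Gaussian PDFs on a grid should reproduce a Gaussian with summed means and summed variances. A small sketch (ours, with arbitrary parameter values):

```julia
# Numerical check: the convolution of two Gaussian pdfs equals the pdf
# of N(μx + μy, σx² + σy²).

gauss(x, μ, v) = exp(-(x - μ)^2 / (2v)) / sqrt(2π * v)

μx, σx = 1.0, 0.5
μy, σy = -2.0, 1.2

dx = 0.01
ts = -10:dx:10
# p_z(z) = ∫ p_x(t) p_y(z - t) dt, approximated by a Riemann sum on the grid
pz(z) = dx * sum(gauss(t, μx, σx^2) * gauss(z - t, μy, σy^2) for t in ts)

zs = -5:0.5:3
maxerr = maximum(abs(pz(z) - gauss(z, μx + μy, σx^2 + σy^2)) for z in zs)
println(maxerr)   # tiny: the convolved pdf matches N(μx+μy, σx²+σy²) on the grid
```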
@@ -410,8 +410,10 @@ Let ``\theta =\{\mu,\Sigma\}``. Prove that the log-likelihood (LLH) function ``\
# ╔═╡ f008a742-6900-4e18-ab4e-b5da53fb64a6
hide_proof(
- md" ```math
+ md"""
+ Hint: it may be helpful here to use the matrix calculus rules from the [5SSD0 Formula Sheet](https://github.com/bmlip/course/blob/main/assets/files/5SSD0_formula_sheet.pdf).
It is important to distinguish between two concepts: the *product of Gaussian distributions*, which results in a (possibly unnormalized) Gaussian distribution, and the *product of Gaussian-distributed variables*, which generally does not yield a Gaussian-distributed variable. See the [optional slides below](#OPTIONAL-SLIDES) for further discussion.
"""
- # ╔═╡ 93361b31-022f-46c0-b80d-b34f3ed61d5f
- md"""
- ## Gaussian Distributions in Julia
- Take a look at this mini lecture to see some simple examples of using distributions in Julia: