Commit e91ad0e

proof and theorem env

update proof and thm env

1 parent f63761a commit e91ad0e

1 file changed: lectures/orth_proj.md (38 additions, 18 deletions)

@@ -131,7 +131,10 @@ What vector within a linear subspace of $\mathbb R^n$ best approximates a given
 
 The next theorem answers this question.
 
-**Theorem** (OPT) Given $y \in \mathbb R^n$ and linear subspace $S \subset \mathbb R^n$,
+```{prf:theorem} Orthogonal Projection Theorem
+:label: opt
+
+Given $y \in \mathbb R^n$ and linear subspace $S \subset \mathbb R^n$,
 there exists a unique solution to the minimization problem
 
 $$
@@ -144,6 +147,7 @@ The minimizer $\hat y$ is the unique vector in $\mathbb R^n$ that satisfies
 * $y - \hat y \perp S$
 
 The vector $\hat y$ is called the **orthogonal projection** of $y$ onto $S$.
+```
 
 The next figure provides some intuition
 
@@ -179,7 +183,7 @@ $$
 y \in Y\; \mapsto \text{ its orthogonal projection } \hat y \in S
 $$
 
-By the OPT, this is a well-defined mapping or *operator* from $\mathbb R^n$ to $\mathbb R^n$.
+By the {prf:ref}`opt`, this is a well-defined mapping or *operator* from $\mathbb R^n$ to $\mathbb R^n$.
 
 In what follows we denote this operator by a matrix $P$
 
@@ -192,7 +196,7 @@ The operator $P$ is called the **orthogonal projection mapping onto** $S$.
 
 ```
 
-It is immediate from the OPT that for any $y \in \mathbb R^n$
+It is immediate from the {prf:ref}`opt` that for any $y \in \mathbb R^n$
 
 1. $P y \in S$ and
 1. $y - P y \perp S$
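
The two properties listed at the end of this hunk are easy to check numerically. The following sketch is an editorial illustration, not part of the commit; it assumes NumPy, uses arbitrary values, and builds $P$ from the matrix formula $P = X(X'X)^{-1}X'$ that appears later in this file.

```python
import numpy as np

# Columns of X span a 2-dimensional subspace S of R^3 (values are arbitrary)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 3.0, -2.0])

# Projection matrix P = X (X'X)^{-1} X', taken from a later hunk in this diff
P = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = P @ y

# 1. P y lies in S: some b solves X b = P y exactly
b, *_ = np.linalg.lstsq(X, y_hat, rcond=None)
print(np.allclose(X @ b, y_hat))          # True

# 2. The residual is orthogonal to S: X'(y - P y) = 0
print(np.allclose(X.T @ (y - y_hat), 0))  # True
```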
@@ -224,16 +228,20 @@ such that $y = x_1 + x_2$.
 
 Moreover, $x_1 = \hat E_S y$ and $x_2 = y - \hat E_S y$.
 
-This amounts to another version of the OPT:
+This amounts to another version of the {prf:ref}`opt`:
 
-**Theorem**. If $S$ is a linear subspace of $\mathbb R^n$, $\hat E_S y = P y$ and $\hat E_{S^{\perp}} y = M y$, then
+```{prf:theorem} Orthogonal Projection Theorem (another version)
+:label: opt_another
+
+If $S$ is a linear subspace of $\mathbb R^n$, $\hat E_S y = P y$ and $\hat E_{S^{\perp}} y = M y$, then
 
 $$
 P y \perp M y
 \quad \text{and} \quad
 y = P y + M y
 \quad \text{for all } \, y \in \mathbb R^n
 $$
+```
 
 The next figure illustrates
 
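The decomposition stated in this hunk can be verified the same way with $M = I - P$; again a hedged sketch assuming NumPy, not commit content.

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                  # basis of S
y = np.array([1.0, 3.0, -2.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T        # projection onto S
M = np.eye(3) - P                           # projection onto the orthogonal complement

print(np.allclose(P @ y + M @ y, y))        # y = P y + M y
print(np.isclose((P @ y) @ (M @ y), 0.0))   # P y ⊥ M y
```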
@@ -285,7 +293,7 @@ Combining this result with {eq}`pob` verifies the claim.
 
 When a subspace onto which we project is orthonormal, computing the projection simplifies:
 
-**Theorem** If $\{u_1, \ldots, u_k\}$ is an orthonormal basis for $S$, then
+```{prf:theorem} If $\{u_1, \ldots, u_k\}$ is an orthonormal basis for $S$, then
 
 ```{math}
 :label: exp_for_op
@@ -294,8 +302,9 @@ P y = \sum_{i=1}^k \langle y, u_i \rangle u_i,
 \quad
 \forall \; y \in \mathbb R^n
 ```
+```
 
-Proof: Fix $y \in \mathbb R^n$ and let $P y$ be defined as in {eq}`exp_for_op`.
+```{prf:proof} Fix $y \in \mathbb R^n$ and let $P y$ be defined as in {eq}`exp_for_op`.
 
 Clearly, $P y \in S$.
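
The expression {eq}`exp_for_op` added here can be checked against the matrix formula for $P$. A short illustration (assuming NumPy, with an orthonormal basis of the column space obtained via `np.linalg.qr`; values are arbitrary):

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 3.0, -2.0])

U, _ = np.linalg.qr(X)   # columns of U: orthonormal basis for the column space of X

# P y as a sum of inner products, as in the equation labeled exp_for_op
y_hat = sum((y @ U[:, i]) * U[:, i] for i in range(U.shape[1]))

# Agreement with the matrix formula P = X (X'X)^{-1} X'
P = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(y_hat, P @ y))  # True
```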
@@ -312,6 +321,7 @@ $$
 $$
 
 (Why is this sufficient to establish the claim that $y - P y \perp S$?)
+```
 
 ## Projection Via Matrix Algebra
 
@@ -327,13 +337,17 @@ Evidently $Py$ is a linear function from $y \in \mathbb R^n$ to $P y \in \mathb
 
 [This reference](https://en.wikipedia.org/wiki/Linear_map#Matrices) is useful.
 
-**Theorem.** Let the columns of $n \times k$ matrix $X$ form a basis of $S$. Then
+```{prf:theorem}
+:label: proj_matrix
+
+Let the columns of $n \times k$ matrix $X$ form a basis of $S$. Then
 
 $$
 P = X (X'X)^{-1} X'
 $$
+```
 
-Proof: Given arbitrary $y \in \mathbb R^n$ and $P = X (X'X)^{-1} X'$, our claim is that
+```{prf:proof} Given arbitrary $y \in \mathbb R^n$ and $P = X (X'X)^{-1} X'$, our claim is that
 
 1. $P y \in S$, and
 2. $y - P y \perp S$
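
A consequence of the uniqueness part of the {prf:ref}`opt` is that this formula does not depend on which basis of $S$ the columns of $X$ form. A hedged NumPy check (not part of the commit; the change-of-basis matrix is arbitrary):

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
P = X @ np.linalg.inv(X.T @ X) @ X.T

# Z spans the same subspace as X (invertible change of basis)...
Z = X @ np.array([[2.0, 1.0],
                  [0.0, 3.0]])
P_z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T

# ...so it yields the same projection matrix
print(np.allclose(P, P_z))  # True
```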
@@ -367,6 +381,7 @@ y]
 $$
 
 The proof is now complete.
+```
 
 ### Starting with the Basis
 
@@ -378,7 +393,7 @@ $$
 
 Then the columns of $X$ form a basis of $S$.
 
-From the preceding theorem, $P = X (X' X)^{-1} X' y$ projects $y$ onto $S$.
+From the {prf:ref}`proj_matrix`, $P = X (X' X)^{-1} X' y$ projects $y$ onto $S$.
 
 In this context, $P$ is often called the **projection matrix**
 
@@ -428,15 +443,16 @@ By approximate solution, we mean a $b \in \mathbb R^k$ such that $X b$ is close
 
 The next theorem shows that a best approximation is well defined and unique.
 
-The proof uses the OPT.
+The proof uses the {prf:ref}`opt`.
 
-**Theorem** The unique minimizer of $\| y - X b \|$ over $b \in \mathbb R^K$ is
+```{prf:theorem} The unique minimizer of $\| y - X b \|$ over $b \in \mathbb R^K$ is
 
 $$
 \hat \beta := (X' X)^{-1} X' y
 $$
+```
 
-Proof: Note that
+```{prf:proof} Note that
 
 $$
 X \hat \beta = X (X' X)^{-1} X' y =
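
The closed form for $\hat \beta$ in this hunk can be compared with a library least-squares routine; the sketch below is editorial only and assumes NumPy with random illustrative data.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))   # regressors (illustrative)
y = rng.normal(size=20)        # response (illustrative)

# (X'X)^{-1} X' y, computed without forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Both minimize ||y - X b||
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```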
@@ -458,6 +474,7 @@ $$
 $$
 
 This is what we aimed to show.
+```
 
 ## Least Squares Regression
 
@@ -594,9 +611,9 @@ Here are some more standard definitions:
 
 > TSS = ESS + SSR
 
-We can prove this easily using the OPT.
+We can prove this easily using the {prf:ref}`opt`.
 
-From the OPT we have $y = \hat y + \hat u$ and $\hat u \perp \hat y$.
+From the {prf:ref}`opt` we have $y = \hat y + \hat u$ and $\hat u \perp \hat y$.
 
 Applying the Pythagorean law completes the proof.
 
@@ -611,7 +628,7 @@ The next section gives details.
 (gram_schmidt)=
 ### Gram-Schmidt Orthogonalization
 
-**Theorem** For each linearly independent set $\{x_1, \ldots, x_k\} \subset \mathbb R^n$, there exists an
+```{prf:theorem} For each linearly independent set $\{x_1, \ldots, x_k\} \subset \mathbb R^n$, there exists an
 orthonormal set $\{u_1, \ldots, u_k\}$ with
 
 $$
@@ -620,6 +637,7 @@ $$
 \quad \text{for} \quad
 i = 1, \ldots, k
 $$
+```
 
 The **Gram-Schmidt orthogonalization** procedure constructs an orthogonal set $\{ u_1, u_2, \ldots, u_n\}$.
 
@@ -639,12 +657,13 @@ In some exercises below, you are asked to implement this algorithm and test it u
 
 The following result uses the preceding algorithm to produce a useful decomposition.
 
-**Theorem** If $X$ is $n \times k$ with linearly independent columns, then there exists a factorization $X = Q R$ where
+```{prf:theorem} If $X$ is $n \times k$ with linearly independent columns, then there exists a factorization $X = Q R$ where
 
 * $R$ is $k \times k$, upper triangular, and nonsingular
 * $Q$ is $n \times k$ with orthonormal columns
+```
 
-Proof sketch: Let
+```{prf:proof} Let
 
 * $x_j := \col_j (X)$
 * $\{u_1, \ldots, u_k\}$ be orthonormal with the same span as $\{x_1, \ldots, x_k\}$ (to be constructed using Gram--Schmidt)
@@ -658,6 +677,7 @@ x_j = \sum_{i=1}^j \langle u_i, x_j \rangle u_i
 $$
 
 Some rearranging gives $X = Q R$.
+```
 
 ### Linear Regression via QR Decomposition
 
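To round off the last hunks: the factorization $X = QR$ and its use in regression can be illustrated as below. This is an editorial NumPy sketch, not commit content; it relies on the fact that with $X = QR$ the normal equations reduce to the triangular system $R \beta = Q' y$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

Q, R = np.linalg.qr(X)                    # Q: 20x3 orthonormal columns, R: 3x3 upper triangular
print(np.allclose(X, Q @ R))              # X = Q R
print(np.allclose(Q.T @ Q, np.eye(3)))    # Q'Q = I

# With X = Q R, the normal equations X'X b = X'y reduce to R b = Q'y
beta_qr = np.linalg.solve(R, Q.T @ y)
beta_direct = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(beta_qr, beta_direct))  # True
```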