-
Notifications
You must be signed in to change notification settings - Fork 13
/
Copy pathdot_outer.Rmd
200 lines (151 loc) · 4.64 KB
/
dot_outer.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
---
jupyter:
jupytext:
notebook_metadata_filter: all,-language_info
split_at_heading: true
text_representation:
extension: .Rmd
format_name: rmarkdown
format_version: '1.2'
jupytext_version: 1.13.7
kernelspec:
display_name: Python 3
language: python
name: python3
---
# Vector and matrix dot products, “np.outer”
Our standard imports to start:
```{python}
import numpy as np
import matplotlib.pyplot as plt
# Display array values to 6 digits of precision
np.set_printoptions(precision=6, suppress=True)
```
## Vector dot products
If I have two vectors $\vec{a}$ with elements $a_0, a_1, ...
a_{n-1}$, and $\vec{b}$ with elements $b_0, b_1, ... b_{n-1}$
then the [dot product](https://en.wikipedia.org/wiki/Dot_product) is
defined as:
$$
\vec{a} \cdot \vec{b} = \sum_{i=0}^{n-1} a_ib_i = a_0b_0 + a_1b_1 + \cdots + a_{n-1}b_{n-1}
$$
In code:
```{python}
a = np.arange(5)
b = np.arange(10, 15)
np.dot(a, b)
```
```{python}
# The same thing as
np.sum(a * b) # Elementwise multiplication
```
`dot` is also a *method* of the NumPy array object, and using the method can
be neater and easier to read:
```{python}
a.dot(b)
```
Better still, the matrix multiplication operator `@` implies a dot product between two vectors:
```{python}
a @ b
```
## Matrix dot products
Matrix multiplication operates by taking dot products of the rows of the first
array (matrix) with the columns of the second.
Let’s say I have a matrix $\mathbf{X}$, and $\vec{X_{i,:}}$ is row
$i$ in $\mathbf{X}$. I have a matrix $\mathbf{Y}$, and
$\vec{Y_{:,j}}$ is column $j$ in $\mathbf{Y}$. The output
matrix $\mathbf{Z} = \mathbf{X} \mathbf{Y}$ has entry $Z_{i,j} =
\vec{X_{i,:}} \cdot \vec{Y_{:, j}}$.
```{python}
X = np.array([[0, 1, 2], [3, 4, 5]])
X
```
```{python}
Y = np.array([[7, 8], [9, 10], [11, 12]])
Y
```
```{python}
X @ Y
```
```{python}
X[0, :] @ Y[:, 0]
```
```{python}
X[1, :] @ Y[:, 0]
```
## The outer product
We can use the rules of matrix multiplication for row vectors and column
vectors.
A row vector is a 2D vector where the first dimension is length 1.
```{python}
row_vector = np.array([[1, 3, 2]])
print(row_vector.shape)
row_vector
```
A column vector is a 2D vector where the second dimension is length 1.
```{python}
col_vector = np.array([[2], [0], [1]])
print(col_vector.shape)
col_vector
```
We know what will happen if we matrix multiply the row vector and the column
vector:
```{python}
row_vector @ col_vector
```
What happens when we matrix multiply the column vector by the row vector? We
know this will work because we are multiplying a 3 by 1 array by a 1 by 3
array, so this should generate a 3 by 3 array:
```{python}
col_vector @ row_vector
```
This arises from the rules of matrix multiplication, except there is only one
row \* column pair making up each of the output elements:
```{python}
print(col_vector[0] * row_vector)
print(col_vector[1] * row_vector)
print(col_vector[2] * row_vector)
```
This (M by 1) vector matrix multiply with a (1 by N) vector is also called the
*outer product* of two vectors. We can generate the same thing from 1D
vectors, by using the numpy `np.outer` function:
```{python}
np.outer(col_vector.ravel(), row_vector.ravel())
```
(dot-vectors-matrices)=
## Dot, vectors and matrices
Unlike MATLAB, Python has one-dimensional vectors. For example, if I slice a
column out of a 2D array of shape (M, N), I do not get a column vector, shape
(M, 1), I get a 1D vector, shape (M,):
```{python}
X = np.array([[0, 1, 2],
[3, 4, 5]])
v = X[:, 0]
v
```
Because the 1D vector has lost the idea of being a column rather than a row in
a matrix, it is no longer unambiguous what $v \cdot \mathbf{X}$ means. It
could mean a dot product of a row vector shape (1, M) with a matrix shape
(M, N), which is valid – or a dot product of a row vector (M, 1) with a
matrix shape (M, N), which is not valid.
If you pass a 1D vector into the `dot` function or method, or use the `@`
matrix multiplier, NumPy assumes you mean it to be a row vector on the left,
and a column vector on the right, which is nearly always what you intended:
```{python}
# 1D vector is row vector on the left hand side of dot / matrix multiply.
v @ X
```
```{python}
# 1D vector is column vector on the right hand side of dot / matrix multiply.
w = np.array([-1, 0, 1])
X @ w
```
Notice that, in both cases, `@` returns a 1D result.
It sometimes helps to make a 1D vector into a 2D row or column vector, to make
your intention explicit, and preserve the 2D shape of the output:
```{python}
# Turn 1D vector into explicit row vector
row_v = np.reshape(v, (1, 2))
# @ now returns a row vector rather than a 1D vector
row_v @ X
```