#r "nuget: DiffSharp-lite,1.0.7"
#r "nuget: SixLabors.ImageSharp,1.0.1"
(**
[Run in Colab](https://colab.research.google.com/github/DiffSharp/diffsharp.github.io/blob/master/quickstart.ipynb) 
[Run in Binder](https://mybinder.org/v2/gh/diffsharp/diffsharp.github.io/master?filepath=quickstart.ipynb) 
[Script](quickstart.fsx) 
[Notebook](quickstart.ipynb)
# Quickstart
Here we cover some key tasks involved in a typical machine learning pipeline and how these can be implemented with DiffSharp. A significant part of DiffSharp's design has been influenced by [PyTorch](https://pytorch.org/), so if you are familiar with PyTorch you will feel mostly at home.
## Datasets and Data Loaders
DiffSharp provides the [Dataset](https://diffsharp.github.io/reference/diffsharp-data-dataset.html) type that represents a data source and the [DataLoader](https://diffsharp.github.io/reference/diffsharp-data-dataloader.html) type that handles the loading of data from datasets and iterating over [minibatches](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Iterative_method) of data.
See the [DiffSharp.Data](/reference/diffsharp-data.html) namespace for the full API reference.
### Datasets
DiffSharp has ready-to-use types that cover the main datasets typically used in machine learning, such as [MNIST](https://diffsharp.github.io/reference/diffsharp-data-mnist.html), [CIFAR10](https://diffsharp.github.io/reference/diffsharp-data-cifar10.html), and [CIFAR100](https://diffsharp.github.io/reference/diffsharp-data-cifar100.html), as well as more generic dataset types such as [TensorDataset](https://diffsharp.github.io/reference/diffsharp-data-tensordataset.html) and [ImageDataset](https://diffsharp.github.io/reference/diffsharp-data-imagedataset.html).
The following loads the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and shows one image entry and the corresponding label.
*)
open DiffSharp
open DiffSharp.Data
// First ten images in MNIST training set
let dataset = MNIST("../data", train=true, transform=id, n=10)
// Inspect a single image and label
let data, label = dataset.[7]
// Save image to file
data.saveImage("test.png")(* output: a 28x28 grayscale image of the digit, saved to test.png *)
// Inspect data as ASCII and show label
printfn "Data: %A\nLabel: %A" (data.toImageString()) label(* output:
Data: "
~-}@#####Z
-j*W########J'
O############i
[##Mxxxxo####i
::^ 'W##Z
|&##f
(o###'
(q%###d.
"uaaa####8}:
_m########O
_*####@####?
"v<____f##?
`##?
|##?
?. 1&##?
iQ#: `)8##&!
p##txxxxb###o\
p#########MC.
+J#####wdt_
}B#Z}^
"
Label: tensor(3,dtype=Int32)*)
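(**
The generic dataset types work the same way for data you already hold in memory. Below is a minimal sketch, assuming a two-tensor `TensorDataset(data, targets)` constructor that pairs the tensors along their first dimension; the random tensors here are placeholders, not real data.
*)
let xs = dsharp.randn([100; 28; 28])    // 100 random 28x28 "images"
let ys = dsharp.randint(0, 10, [100])   // 100 random integer labels in [0, 10)
let inMemory = TensorDataset(xs, ys)
printfn "In-memory dataset length: %A" inMemory.length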
(**
### Data Loaders
A data loader handles tasks such as constructing minibatches from an underlying dataset on-the-fly, shuffling the data, and moving the data tensors between devices. In the example below we show a single batch of six MNIST images and their corresponding classification labels.
*)
let loader = DataLoader(dataset, shuffle=true, batchSize=6)
let batch, labels = loader.batch()
printfn "%A\nLabels: %A" (batch.toImageString()) labels(* output:
"
<J#J+
```uzO:Y@%u +8###8 u#@!
";)CX#####*Z#&h! <W##8M#> 1&$#!
+8########$)\\>~ .i*###b|##v u$$#!
`m#####k0%& U######1o#Y lB$p".
/C[##d' -C <8##ar#W_/@X iW$$)
' C#( +8##0'?v, #&+ C##o
j#a ~Y#Mp| ##Y ,#$Bl
'a#I O#8I,: ##h "d#$u
;&*J[ >##! ##h }$#Q`
\8##c^ k#a @#h /8$a^
_o##L: ?%#] ##t "###J
`)##o |#M^ z#o' L$$$~
$#$! |#* n#*I +m$$Z
_nQ##d |#f +Y#Z B$$h'
~tW###$0 |#* r8#U #$$(
^rm####b/ |#$t+:|O*#*Y> J@##"
^lq####k\ |###Ww###hn +W#%j.
`Xm####h/. :k#####Mf !$#m
>ZW####&x' ^n###j~ !$#m
z###qzx` ^a#m
lM~ '^{v
!\ c0~ ;Lp###t
uU Cp~ >tp##r|t> ~L&##*p#M~
mU :#U |M##op###X `L8###w"~##i
mU Q#u >&#a! '0##r 1####w: ~##i
_%U k#> 1##Q' )##*, 0##mY" /##i
c#U ^M#: x##t' w##/ :n>^ {##i
J#c U#w` u#%O. ./%#n J##i
J#l '|O$#( `M#O ;b##X' ]vvvb#h
J#| _+rfL&&B0&#~ ^##"^ck&##$/ .<0##@##W;
L#8ddd##$8kf(: M$ `M#####WW#M <W#&WX&##Mc
cOOOOO1> }#m >B#wz-^a#f /##ui `p####|
X#z !@#[ z##0` ,b#%nZ##Ql++
X#> I##, z#&[` <k#w! 'IU&##*
X#> ##, }#&(`?X&#u: (00~
X@) I##, M#%dw###u
X#1 [##, d####Or;
X#C _@#, ')fv^
X@C w#>
1#C 1#o-
'Q#X'
't#-
"
Labels: tensor([5., 0., 1., 4., 9., 2.])*)
(**
In practice a data loader is typically used to iterate over all minibatches in a given dataset, feeding each minibatch through a machine learning model. One full iteration over the dataset is called an "epoch", and you would typically perform multiple such epochs while training a model.
*)
for epoch = 1 to 10 do
    for i, data, labels in loader.epoch() do
        printfn "Epoch %A, minibatch %A" epoch (i+1)
        // Process the minibatch
        // ...
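(**
Inside such a loop, each minibatch is typically fed through a model to compute a loss, whose gradient then drives a parameter update. A minimal preview sketch of what the loop body might look like, assuming DiffSharp's `SGD` optimizer and `crossEntropyLoss`; the single-layer `classifier` is a hypothetical stand-in, and model construction proper is covered in the next section.
*)
open DiffSharp.Model
open DiffSharp.Optim

let classifier = Linear(28*28, 10)   // hypothetical stand-in model
let optimizer = SGD(classifier, lr=dsharp.tensor(0.01))
for _, data, targets in loader.epoch() do
    classifier.reverseDiff()         // prepare the model for reverse-mode differentiation
    let output = classifier.forward(data.view([-1; 28*28]))
    let loss = dsharp.crossEntropyLoss(output, targets)
    loss.reverse()                   // backpropagate the loss
    optimizer.step()                 // update the model parameters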
(**
## Models
Many machine learning models are differentiable functions whose parameters can be tuned via [gradient-based optimization](https://en.wikipedia.org/wiki/Gradient_descent), finding an optimum for an objective function that quantifies the fit of the model to a given set of data. These models are typically built as compositions of non-linear functions and ready-to-use building blocks such as linear, recurrent, and convolutional layers.
DiffSharp provides the most commonly used model building blocks, including convolution, transposed convolution, batch normalization, dropout, and recurrent layers, among other architectures.
See the [DiffSharp.Model](/reference/diffsharp-model.html) namespace for the full API reference.
### Constructing models, PyTorch style
If you have experience with [PyTorch](https://pytorch.org/), you will find the following way of defining a model familiar. Let's look at an example of a [generative adversarial network (GAN)](https://arxiv.org/abs/1406.2661) architecture.
*)
open DiffSharp.Model
open DiffSharp.Compose
// PyTorch style
// Define a model class inheriting the base
type Generator(nz: int) =
    inherit Model()
    let fc1 = Linear(nz, 256)
    let fc2 = Linear(256, 512)
    let fc3 = Linear(512, 1024)
    let fc4 = Linear(1024, 28*28)
    do base.addModel(fc1, fc2, fc3, fc4)
    override self.forward(x) =
        x
        |> dsharp.view([-1;nz])
        |> fc1.forward
        |> dsharp.leakyRelu(0.2)
        |> fc2.forward
        |> dsharp.leakyRelu(0.2)
        |> fc3.forward
        |> dsharp.leakyRelu(0.2)
        |> fc4.forward
        |> dsharp.tanh
// Define a model class inheriting the base
type Discriminator(nz: int) =
    inherit Model()
    let fc1 = Linear(28*28, 1024)
    let fc2 = Linear(1024, 512)
    let fc3 = Linear(512, 256)
    let fc4 = Linear(256, 1)
    do base.addModel(fc1, fc2, fc3, fc4)
    override self.forward(x) =
        x
        |> dsharp.view([-1;28*28])
        |> fc1.forward
        |> dsharp.leakyRelu(0.2)
        |> dsharp.dropout(0.3)
        |> fc2.forward
        |> dsharp.leakyRelu(0.2)
        |> dsharp.dropout(0.3)
        |> fc3.forward
        |> dsharp.leakyRelu(0.2)
        |> dsharp.dropout(0.3)
        |> fc4.forward
        |> dsharp.sigmoid
// Instantiate the defined classes
let nz = 128
let gen = Generator(nz)
let dis = Discriminator(nz)
print gen
print dis(* output:
Model(Linear(128, 256), Linear(256, 512), Linear(512, 1024), Linear(1024, 784))
Model(Linear(784, 1024), Linear(1024, 512), Linear(512, 256), Linear(256, 1))*)
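(**
Once instantiated, a model is driven with its `forward` method. For example, sampling a small batch of fake images from the generator; the shapes follow from the layer definitions above.
*)
let z = dsharp.randn([4; nz])
let fakeBatch = gen.forward(z)   // tanh output in [-1, 1], shape [4; 784]
printfn "Generated batch shape: %A" fakeBatch.shape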
(**
### Constructing models, DiffSharp style
A key advantage of DiffSharp lies in the [functional programming](https://en.wikipedia.org/wiki/Functional_programming) paradigm enabled by the F# language, where functions are first-class citizens, many algorithms can be constructed by applying and composing functions, and differentiation operations can be expressed as composable [higher-order functions](https://en.wikipedia.org/wiki/Higher-order_function). This allows very succinct (and beautiful) machine learning code to be expressed as a powerful combination of [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus) and [differential calculus](https://en.wikipedia.org/wiki/Differential_calculus).
For example, the following constructs the same GAN architecture (that we constructed in PyTorch style in the previous section) using DiffSharp's `-->` composition operator, which allows you to seamlessly compose `Model` instances and differentiable `Tensor->Tensor` functions.
*)
// DiffSharp style
// Model as a composition of models and Tensor->Tensor functions
let generator =
    dsharp.view([-1;nz])
    --> Linear(nz, 256)
    --> dsharp.leakyRelu(0.2)
    --> Linear(256, 512)
    --> dsharp.leakyRelu(0.2)
    --> Linear(512, 1024)
    --> dsharp.leakyRelu(0.2)
    --> Linear(1024, 28*28)
    --> dsharp.tanh
// Model as a composition of models and Tensor->Tensor functions
let discriminator =
    dsharp.view([-1; 28*28])
    --> Linear(28*28, 1024)
    --> dsharp.leakyRelu(0.2)
    --> dsharp.dropout(0.3)
    --> Linear(1024, 512)
    --> dsharp.leakyRelu(0.2)
    --> dsharp.dropout(0.3)
    --> Linear(512, 256)
    --> dsharp.leakyRelu(0.2)
    --> dsharp.dropout(0.3)
    --> Linear(256, 1)
    --> dsharp.sigmoid
print generator
print discriminator(* output:
Model(Linear(128, 256), Linear(256, 512), Linear(512, 1024), Linear(1024, 784))
Model(Linear(784, 1024), Linear(1024, 512), Linear(512, 256), Linear(256, 1))*)
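(**
The composed models behave like any other `Model`: as a quick sketch, we can sanity-check the pair by pushing a batch of noise through the generator and scoring the result with the discriminator. The shapes follow from the definitions above.
*)
let noise = dsharp.randn([6; nz])
let fakeImages = generator.forward(noise)        // tanh output in [-1, 1], shape [6; 784]
let scores = discriminator.forward(fakeImages)   // sigmoid output in [0, 1], shape [6; 1]
printfn "Images: %A, scores: %A" fakeImages.shape scores.shape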