```julia
    @variable(model, x[1:layer_size, 1:batch_size] >= 0)
    @objective(model, Min, x[:]'x[:] - 2y[:]'x[:])
    optimize!(model)
-   return value.(x)
+   return Float32.(value.(x))
end
```

```
matrix_relu (generic function with 1 method)
```
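A sanity check on the formulation (not part of the diff): the objective separates over the entries of `x`, and each scalar subproblem is exactly a ReLU,

```math
\min_{x \ge 0} \; x^2 - 2yx \;=\; \min_{x \ge 0} \; (x - y)^2 - y^2,
```

whose minimizer is `max(y, 0) = ReLU(y)`. The new `Float32` conversion of the return value matches the network's `Float32` weights; presumably this is why the "Layer with Float32 parameters got Float64 input" warning, still visible in the old training output below, disappears after this commit.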
Define the reverse differentiation rule for the function we defined above.

```julia
function ChainRulesCore.rrule(::typeof(matrix_relu), y::Matrix{T}) where {T}
    model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
    pv = matrix_relu(y; model = model)
```
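The hunk ends here; the diff omits the unchanged remainder of the `rrule`. For orientation only, a sketch of how such a pullback is typically completed with DiffOpt's reverse mode — the names `dl_dx`, `dl_dq`, `dl_dy`, the `model[:x]` lookup, and the `MOI` alias are assumptions, not the page's verbatim code:

```julia
    # Assumes `import MathOptInterface as MOI` in the page's preamble.
    function pullback_matrix_relu(dl_dx)
        x = model[:x] # the variable matrix created inside matrix_relu
        # Seed the reverse pass with the incoming cotangent dl/dx.
        MOI.set.(model, DiffOpt.ReverseVariablePrimal(), x, dl_dx)
        DiffOpt.reverse_differentiate!(model)
        # Recover dl/dq, the sensitivity w.r.t. the linear objective
        # coefficients q = -2y, then chain through dq/dy = -2.
        obj = MOI.get(model, DiffOpt.ReverseObjectiveFunction())
        dl_dq = JuMP.coefficient.(obj, x)
        dl_dy = -2 .* dl_dq
        return (ChainRulesCore.NoTangent(), dl_dy)
    end
    return pv, pullback_matrix_relu
end
```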
The next hunk shows the tail of the network definition:

```julia
    NNlib.softmax,
) # Total: 4 arrays, 7_960 parameters, 31.297 KiB.
```
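Only the end of the chain is visible above, but the parameter count pins down the rest: 7,960 = 784·10 + 10 + 10·10 + 10, i.e. two `Dense` layers of width 10 around the custom `matrix_relu`. The full definition is presumably along these lines (`inner = 10` is an assumption, consistent with the closing remark about the connectivity `inner`):

```julia
inner = 10 # connectivity of the hidden layer
m = Flux.Chain(
    Flux.Dense(28^2, inner), # 784 => 10: 7_840 weights + 10 biases
    matrix_relu,
    Flux.Dense(inner, 10),   # 10 => 10: 100 weights + 10 biases
    NNlib.softmax,
)
```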
## Prepare data

```julia
N = 1000 # batch size
# Preprocessing train data
-imgs = MLDatasets.MNIST.traintensor(1:N)
-labels = MLDatasets.MNIST.trainlabels(1:N)
+imgs = MLDatasets.MNIST(; split = :train).features[:, :, 1:N]
+labels = MLDatasets.MNIST(; split = :train).targets[1:N]
train_X = float.(reshape(imgs, size(imgs, 1) * size(imgs, 2), N)) # stack images
train_Y = Flux.onehotbatch(labels, 0:9);
# Preprocessing test data
-test_imgs = MLDatasets.MNIST.testtensor(1:N)
-test_labels = MLDatasets.MNIST.testlabels(1:N)
+test_imgs = MLDatasets.MNIST(; split = :test).features[:, :, 1:N]
+test_labels = MLDatasets.MNIST(; split = :test).targets[1:N];
test_X = float.(reshape(test_imgs, size(test_imgs, 1) * size(test_imgs, 2), N))
test_Y = Flux.onehotbatch(test_labels, 0:9);
```
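As a quick sanity check (not part of the diff, and assuming the standard 28×28 MNIST images), the preprocessing yields:

```julia
size(imgs)    # (28, 28, 1000): N raw images
size(train_X) # (784, 1000): one flattened image per column
size(train_Y) # (10, 1000): one-hot columns over the digits 0:9
```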
The deprecated accessors filled the old example output with warnings, which the new accessors silence:

```
┌ Warning: MNIST.traintensor() is deprecated, use `MNIST(split=:train).features` instead.
└ @ MLDatasets ~/.julia/packages/MLDatasets/0MkOE/src/datasets/vision/mnist.jl:157
┌ Warning: MNIST.trainlabels() is deprecated, use `MNIST(split=:train).targets` instead.
└ @ MLDatasets ~/.julia/packages/MLDatasets/0MkOE/src/datasets/vision/mnist.jl:173
┌ Warning: MNIST.testtensor() is deprecated, use `MNIST(split=:test).features` instead.
└ @ MLDatasets ~/.julia/packages/MLDatasets/0MkOE/src/datasets/vision/mnist.jl:165
┌ Warning: MNIST.testlabels() is deprecated, use `MNIST(split=:test).targets` instead.
└ @ MLDatasets ~/.julia/packages/MLDatasets/0MkOE/src/datasets/vision/mnist.jl:180
```
Define input data. The original data is repeated `epochs` times because `Flux.train!` only loops through the data set once.

```julia
epochs = 50 # ~1 minute (i7 8th gen with 16gb RAM)
# epochs = 100 # leads to 77.8% in about 2 minutes
dataset = repeated((train_X, train_Y), epochs);
```
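Here `repeated` is presumably `Base.Iterators.repeated` (imported in the page's preamble, outside this diff), so `dataset` is a lazy iterator that yields the identical `(train_X, train_Y)` tuple `epochs` times, and `Flux.train!` performs one update per tuple. A minimal illustration:

```julia
using Base.Iterators: repeated

batches = collect(repeated((:X, :Y), 3))
length(batches)           # 3
batches[1] === batches[2] # true: the very same tuple each time
```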
## Network training

Training loss function and Flux optimizer. Before the commit, the example used Flux's implicit-parameters API:

```julia
-custom_loss(x, y) = Flux.crossentropy(m(x), y)
-opt = Flux.Adam()
-evalcb = () -> @show(custom_loss(train_X, train_Y))
```

Old output:

```
#11 (generic function with 1 method)
```

Train to optimize network parameters.

```julia
-@time Flux.train!(
-    custom_loss,
-    Flux.params(m),
-    dataset,
-    opt,
-    cb = Flux.throttle(evalcb, 5),
-);
```

Old output:

```
┌ Warning: Layer with Float32 parameters got Float64 input.
│ The input will be converted, but any earlier layers may be very slow.
│ layer = Dense(10 => 10) # 110 parameters
│ summary(x) = "10×1000 Matrix{Float64}"
└ @ Flux ~/.julia/packages/Flux/hiqg1/src/layers/stateless.jl:60
custom_loss(train_X, train_Y) = 2.355365f0
custom_loss(train_X, train_Y) = 2.2240443f0
custom_loss(train_X, train_Y) = 2.1510334f0
custom_loss(train_X, train_Y) = 2.0600805f0
custom_loss(train_X, train_Y) = 1.9604436f0
custom_loss(train_X, train_Y) = 1.8702683f0
custom_loss(train_X, train_Y) = 1.7790897f0
custom_loss(train_X, train_Y) = 1.691865f0
custom_loss(train_X, train_Y) = 1.610134f0
custom_loss(train_X, train_Y) = 1.5316879f0
106.215850 seconds (76.76 M allocations: 4.763 GiB, 1.44% gc time, 0.71% compilation time)
```
After the commit, the same section uses Flux's explicit-parameter API:

```julia
+custom_loss(m, x, y) = Flux.crossentropy(m(x), y)
+opt = Flux.setup(Flux.Adam(), m)
```
New output:

```
(layers = ((weight = Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], (0.9, 0.999))), bias = Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], (0.9, 0.999))), σ = ()), (), (weight = Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], (0.9, 0.999))), bias = Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], (0.9, 0.999))), σ = ()), ()),)
```

Train to optimize network parameters.

```julia
+@time Flux.train!(custom_loss, m, dataset, opt);
```

New output:

```
103.973062 seconds (72.91 M allocations: 4.475 GiB, 1.44% gc time, 0.71% compilation time)
```
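For reference, with the explicit-parameter API the `train!` call above behaves roughly like this hand-written loop (a sketch, not code from the page):

```julia
for (x, y) in dataset
    # Differentiate the loss with respect to the model itself.
    grads = Flux.gradient(m -> custom_loss(m, x, y), m)
    # Update the optimizer state and the parameters in place.
    Flux.update!(opt, m, grads[1])
end
```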
Although our custom implementation takes time, it is able to reach similar accuracy as the usual ReLU function implementation.

## Accuracy results

Average of correct guesses.

```julia
accuracy(x, y) = Statistics.mean(Flux.onecold(m(x)) .== Flux.onecold(y));
```

Training accuracy:

```julia
accuracy(train_X, train_Y)
```

```
0.562
```

Test accuracy:

```julia
accuracy(test_X, test_Y)
```

```
0.478
```

Note that the accuracy is low due to simplified training. It is possible to increase the number of samples `N`, the number of epochs `epochs`, and the connectivity `inner`.

*This page was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*