1516: add Embedding layer r=DhairyaLGandhi a=CarloLucibello
Basic implementation.
This could perhaps be improved once FluxML/NNlib.jl#255 lands.
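A quick usage sketch, not taken from the PR itself: the dimensions are made up for illustration, and it assumes the `Embedding(in, out)` constructor and integer/vector indexing added here.

```julia
using Flux

# Hypothetical example: a vocabulary of 10 indices embedded in 4 dimensions.
emb = Embedding(10, 4)   # wraps a trainable 4×10 weight matrix

emb(3)          # embedding vector for index 3: a 4-element column
emb([1, 2, 3])  # batched lookup: a 4×3 matrix, one column per index
```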
### PR Checklist
- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).
Co-authored-by: Carlo Lucibello <[email protected]>
@@ -42,38 +42,38 @@ Normally, your training and test data come from real world observations, but thi

Now, build a model to make predictions with `1` input and `1` output:

-```
+```julia
julia> model = Dense(1, 1)
Dense(1, 1)

-julia> model.W
-1-element Array{Float64,1}:
- -0.99009055
+julia> model.weight
+1×1 Matrix{Float32}:
+ -1.4925033

-julia> model.b
-1-element Array{Float64,1}:
+julia> model.bias
+1-element Vector{Float32}:
 0.0
```

-Under the hood, a dense layer is a struct with fields `W` and `b`. `W` represents a weight and `b` represents a bias. There's another way to think about a model. In Flux, *models are conceptually predictive functions*:
+Under the hood, a dense layer is a struct with fields `weight` and `bias`. `weight` represents a weight matrix and `bias` represents a bias vector. There's another way to think about a model. In Flux, *models are conceptually predictive functions*:

-```
+```julia
julia> predict = Dense(1, 1)
```

`Dense(1, 1)` also implements the function `σ(Wx+b)` where `W` and `b` are the weights and biases. `σ` is an activation function (more on activations later). Our model has one weight and one bias, but typical models will have many more. Think of weights and biases as knobs and levers Flux can use to tune predictions. Activation functions are transformations that tailor models to your needs.

This model will already make predictions, though not accurate ones yet:

In order to make better predictions, you'll need to provide a *loss function* to tell Flux how to objectively *evaluate* the quality of a prediction. Loss functions compute the cumulative distance between actual values and predictions.

-```
+```julia
julia> loss(x, y) = Flux.Losses.mse(predict(x), y)
loss (generic function with 1 method)

@@ -87,7 +87,7 @@ More accurate predictions will yield a lower loss. You can write your own loss f

Under the hood, the Flux [`train!`](@ref) function uses *a loss function* and *training data* to improve the *parameters* of your model based on a pluggable [`optimiser`](../training/optimisers.md):

-```
+```julia
julia> using Flux: train!

julia> opt = Descent()
@@ -100,12 +100,12 @@ julia> data = [(x_train, y_train)]

Now, we have the optimiser and data we'll pass to `train!`. All that remains are the parameters of the model. Remember, each model is a Julia struct with a function and configurable parameters, and the dense layer has weights and biases that depend on the dimensions of the inputs and outputs:
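For context, the workflow this docs page walks through would run end to end roughly as below. This is a sketch, not part of the diff: the `x_train`/`y_train` definitions are stand-ins modelled on the earlier part of the page (not shown in these hunks), and the implicit-`params` form of `train!` matches Flux at the time of this PR.

```julia
using Flux
using Flux: train!

# Stand-in data: inputs 0..5 and targets from the true function y = 4x + 2.
x_train = hcat(0:5...)          # 1×6 matrix of inputs
y_train = 4 .* x_train .+ 2     # 1×6 matrix of targets

predict = Dense(1, 1)           # the model: σ(Wx + b) with identity σ
loss(x, y) = Flux.Losses.mse(predict(x), y)

opt = Descent()                 # plain gradient descent
data = [(x_train, y_train)]
parameters = Flux.params(predict)  # the layer's weight and bias

train!(loss, parameters, data, opt)  # one pass over `data`
loss(x_train, y_train)               # loss should have decreased
```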