-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Gibbskernel #374
Adding Gibbskernel #374
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this! It's a very nice well-sized PR.
The basic functionality all seems is present. The main thing that needs addressing is turning the GibbsKernel
function
into a struct
(see comment below) -- other than that it's just formatting-related things and other minor details that need addressing.
Co-authored-by: willtebbutt <[email protected]>
Co-authored-by: willtebbutt <[email protected]>
Co-authored-by: willtebbutt <[email protected]>
I have updated the unit test - it was a little more confusing and complicated that I originally thought! Hopefully you can follow what I've done through the comments. Let me know what you think, cheers! |
test/kernels/gibbskernel.jl
Outdated
k_gibbs = GibbsKernel(l_func) | ||
|
||
# check they are equal | ||
@test k_gibbs(x, y) == k(x, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the test it would be great if you could be as faithful as possible to the latex formula, it makes it easier to see if it's doing the right thing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure what you mean, could you elaborate or sketch out an idea of what you would like to see?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like
@test k_gibbs(x, y) == sqrt((2 * ell(x) * ell(y)) / (ell(x)^2 + ell(y)^2)) * exp( - norm(x - y)^2 / (ell(x)^2 + ell(y)^2))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
n.b. probably better to use ≈
here rather than ==
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I see what you mean. I've added your suggestion but actually this test is failing and I'm not sure why!
I don't think I fully understand the implementation in https://github.com/Cyberface/KernelFunctions.jl/blob/gibbskernel/src/kernels/gibbskernel.jl#L37 so will take a look a bit later again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the using norm(x-y)
is correct here? Calling this on
x = randn(2)
y = randn(2)
norm(x-y)
This returns a single number - shouldn't this be a 2x2 matrix?
Assuming the norm
is from LinearAlgebra
?
Sorry for my confusion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No no, this the L2 norm! So it's sqrt(sum((x_i - y_i)^2))
, this is what we would expect I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I get you now - I thought this was supposed to give the covariance matrix but that is the kernelmatrix
function!
This is just the covariance between two points in n-dimensions.
Hopefully I got this correct :)
…ivial function for ell
added a broken test to seek help about why it's broken
src/kernels/gibbskernel.jl
Outdated
l = hypot(lx, ly) | ||
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l = hypot(lx, ly) | |
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l) | |
l = hypot(lx, ly) / sqrt(2) | |
kernel = (sqrt(lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l) |
I think this should more correspond to the exact formula
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without promotion to Float64:
l = hypot(lx, ly) | |
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l) | |
l = invsqrt2 * hypot(lx, ly) | |
kernel = (sqrt(lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct though because l
is passed into with_lengthscale
- doesn't the SqExponentialKernel
already have the 1/2
in the exponential looking here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it be
l = hypot(lx, ly)
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), 1 / l^2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct though because l is passed into with_lengthscale - doesn't the SqExponentialKernel already have the 1/2 in the exponential looking here
Yes but in the latex formula the factor 2 is not there, so this way we cancel it out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the 2 disappears because we already have l(x) + l(y) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the 2 would be there when l(x) = l(y) = constant.
I think I finally understand most of the things here but I am confused by with_lengthscale
. Can you explain why we shouldn't square the argument to with_lengthscale? Looking at the with_lengthscale code it seems to just multiply by 1/l
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the 2 disappears because we already have l(x) + l(y) ?
I'm not sure what 2 you refer to (and there is no l(x) + l(y) term anywhere 🤔). The 2 in the square root outside of the squared exponential kernel disappears because we already scale the square root of the denominator by 1/sqrt(2).
why we shouldn't square the argument to
with_lengthscale
It will be squared by the squared exponential kernel. A squared exponential kernel with lengthscale l is of the form k(x, y) = exp(- (x - y)^2/(2*l^2))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the l(x) + l(y)
term I was refering to is the denominator in the exponential i.e.
exp(-(x - y)^2 / (l(x)^2 + l(y)^2)
And when l(x) = l(y) = l = constant
you get exp(-(x - y)^2 / (2l^2)
?
Also thanks for explaining to me about the with_lengthscale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, if they are constant you get exp(-(x-y)^2 / (2 * l^2))
. Very explicitly, in general the implementation yields
which is exactly the formula in the docstring (and the other issue).
test/kernels/gibbskernel.jl
Outdated
ell(x) = exp(sum(sin, x)) | ||
|
||
# this is the gibbs lengthscale function. | ||
# to check that we can recover the stationary SqExponentialKernel | ||
# with a constant lengthscale we just set this function | ||
# equal to a constant. | ||
l_func(x) = ell(1) | ||
|
||
# in order to compare to the Gibbs kernel we compute the | ||
# equivalent lengthscale l(x)^2 + l(y)^2. | ||
# See the denominator of the exponential term in the gibbs kernel. | ||
lengthscale = hypot(l_func(x), l_func(y)) | ||
|
||
# create a SqExponentialKernel with a lengthscale | ||
k = with_lengthscale(SqExponentialKernel(), lengthscale) | ||
|
||
# create a gibbs kernel with our constant lengthscale function | ||
k_gibbs = GibbsKernel(l_func) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ell(x) = exp(sum(sin, x)) | |
# this is the gibbs lengthscale function. | |
# to check that we can recover the stationary SqExponentialKernel | |
# with a constant lengthscale we just set this function | |
# equal to a constant. | |
l_func(x) = ell(1) | |
# in order to compare to the Gibbs kernel we compute the | |
# equivalent lengthscale l(x)^2 + l(y)^2. | |
# See the denominator of the exponential term in the gibbs kernel. | |
lengthscale = hypot(l_func(x), l_func(y)) | |
# create a SqExponentialKernel with a lengthscale | |
k = with_lengthscale(SqExponentialKernel(), lengthscale) | |
# create a gibbs kernel with our constant lengthscale function | |
k_gibbs = GibbsKernel(l_func) | |
# this is the gibbs lengthscale function. | |
ell(x) = exp(sum(sin, x)) | |
# create a gibbs kernel with our specific lengthscale function | |
k_gibbs = GibbsKernel(ell) |
That's more what I meant to use instead of a constant.
bugs ironed out hopefully with lots of help from the devs
OK I think this PR is basically done? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy -- once @theogf is happy, we can merge :)
remove space thanks @willtebbutt Co-authored-by: willtebbutt <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment for the docstring. And could you also add the kernel in the docs (just add the kernel name in the list in docs/kernels.md)?
src/kernels/gibbskernel.jl
Outdated
kernel. The lengthscale parameter ``l`` becomes a function of | ||
position ``l(x)``. | ||
|
||
``l(x) = l`` then you recover the standard Squared-Exponential kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``l(x) = l`` then you recover the standard Squared-Exponential kernel | |
For a constant function``l(x) = c``, one recovers the standard Squared-Exponential kernel with lengthscale `0.5 c` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great thanks and that answers my other question about the square root / squaring about with_lengthscale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lengthscale is just c
, it seems: In this case we get exp(-(x-y)^2/(2*c^2))
, both with our implementation and the formula it is based on which corresponds to a lengthscale of c
.
thanks @theogf Co-authored-by: Théo Galy-Fajou <[email protected]>
Yeah sure thing. Shall I add a new section in |
I'd just put it in alphabetical order in the base kernels list. Feels like the appropriate place to me (the base kernels aren't necessarily stationary) |
great thanks - done! |
thanks @devmotion Co-authored-by: David Widmann <[email protected]>
thanks @devmotion Co-authored-by: David Widmann <[email protected]>
The PR says there is a request to change something but I'm not sure what it's referring to. |
It was my request to fix the docstring. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor improvements to the docstring.
Great! All we need now is a version bump. In |
Thanks a lot @Cyberface! |
that's really great! Thanks so much for all your help with this! Next I'll work on getting the kernelmatrix working and eventually getting it so that the gibbs kernel length scale can be modelled with a separate GP and fit/optimise that! |
Summary
This PR begins to implement the non-stationary Gibbs kernel what is being discussed in issue: #372
This is a modification of the standard Squared-Exponential kernel where the lengthscale is no longer constant but is a function of the input parameters.
In this PR I have simply added a function
GibbsKernel
to evaluate the gibbs kernel as well as added a test to check that when the lengthscale function just returns a constant value of1.0
then it should be the same as theSqExponentialKernel()
.The implementation of the gibbs kernel is probably going to change as I learn more about how your kernels are defined/implemented.
This is my first PR here and I am only just starting to learn julia so I'm sorry for some basic mistakes!
And I appriciate any comments to make this better thanks!
Proposed changes
What alternatives have you considered?
A better implementation would be to use the kernel/transform infrastructure you have and I will work on this after this PR.
Breaking changes
None