Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Gibbskernel #374

Merged
merged 28 commits into from
Oct 1, 2021
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
d50f2b4
add gibbskernel.jl
Cyberface Sep 28, 2021
2ca62f4
include gibbskernel.jl
Cyberface Sep 28, 2021
2b83286
some modifications for gibbskernel and added a small test
Cyberface Sep 28, 2021
e4d1414
fix missing definition of ell function in test
Cyberface Sep 28, 2021
f9f066a
fix gibbskernel test
Cyberface Sep 29, 2021
d6b53b9
add gibbskernel to test/runtests.jl
Cyberface Sep 29, 2021
698c453
Update test/kernels/gibbskernel.jl
Cyberface Sep 29, 2021
3a98028
Update src/kernels/gibbskernel.jl
Cyberface Sep 29, 2021
0f757a3
Update src/kernels/gibbskernel.jl
Cyberface Sep 29, 2021
2e1df2d
implement changes based on PR discussion and update test
Cyberface Sep 29, 2021
b5c6faa
add brackets back in
Cyberface Sep 29, 2021
19e6665
fix unit test for gibbs kernel
Cyberface Sep 29, 2021
21de7d4
update gibbskernel test: implement theogf's idea about using a non-tr…
Cyberface Sep 30, 2021
4ddafbe
ensure ell returns positive value in test
Cyberface Sep 30, 2021
9fd1d14
gibbs kernel: fixed implementation and unittest
Cyberface Sep 30, 2021
5531851
gibbs kernel: fix docstring
Cyberface Sep 30, 2021
78faeac
Update src/kernels/gibbskernel.jl
Cyberface Oct 1, 2021
ae8b3d4
Update src/kernels/gibbskernel.jl
Cyberface Oct 1, 2021
6b91a23
add gibbskernel to docs
Cyberface Oct 1, 2021
a914957
Update src/kernels/gibbskernel.jl
Cyberface Oct 1, 2021
6eb4099
Update test/kernels/gibbskernel.jl
Cyberface Oct 1, 2021
79aa4f5
gibbskernel test: formatting
Cyberface Oct 1, 2021
0942447
Merge branch 'master' into gibbskernel
devmotion Oct 1, 2021
de3e535
Fix format (hopefully)
devmotion Oct 1, 2021
0caf2fd
Add newline character
devmotion Oct 1, 2021
cfbca3d
Apply suggestions from code review
devmotion Oct 1, 2021
1fb7903
Export invsqrt2
theogf Oct 1, 2021
f577700
Version bump
theogf Oct 1, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions src/KernelFunctions.jl
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ export PiecewisePolynomialKernel
export PeriodicKernel, NeuralNetworkKernel
export KernelSum, KernelProduct, KernelTensorProduct
export TransformedKernel, ScaledKernel, NormalizedKernel
export GibbsKernel

export Transform,
SelectTransform,
Expand Down Expand Up @@ -94,6 +95,7 @@ include("basekernels/rational.jl")
include("basekernels/sm.jl")
include("basekernels/wiener.jl")

include("kernels/gibbskernel.jl")
include("kernels/scaledkernel.jl")
include("kernels/normalizedkernel.jl")
include("matrix/kernelmatrix.jl")
Expand Down
44 changes: 44 additions & 0 deletions src/kernels/gibbskernel.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
"""
GibbsKernel(x, y)

# Definition

The Gibbs kernel is non-stationary generalisation of the Squared-Exponential
kernel. The lengthscale parameter ``l`` becomes a function of
position ``l(x)``.

``l(x) = 1.`` then you recover the standard Squared-Exponential kernel
with constant lengthscale.

```math
k(x, y) = \\sqrt{ \\left(\\frac{2 l(x) l(y)}{l(x)^2 + l(y)^2} \\right) }
\\quad \\rm{exp} \\left( - \\frac{(x - y)^2}{l(x)^2 + l(y)^2} \\right)
```

[1] - Mark N. Gibbs. "Bayesian Gaussian Processes for Regression and Classication."
PhD thesis, 1997

[2] - Christopher J. Paciorek and Mark J. Schervish. "Nonstationary Covariance
Functions for Gaussian Process Regression". NEURIPS, 2003

[3] - Sami Remes, Markus Heinonen, Samuel Kaski.
"Non-Stationary Spectral Kernels". arXiV:1705.08736, 2017

[4] - Sami Remes, Markus Heinonen, Samuel Kaski.
"Neural Non-Stationary Spectral Kernel". arXiv:1811.10978, 2018
"""

struct GibbsKernel{T} <: Kernel
lengthscale::T
end

GibbsKernel(; lengthscale) = GibbsKernel(lengthscale)

function (k::GibbsKernel)(x, y)
lengthscale = k.lengthscale
lx = lengthscale(x)
ly = lengthscale(y)
l = hypot(lx, ly)
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
l = hypot(lx, ly)
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l)
l = hypot(lx, ly) / sqrt(2)
kernel = (sqrt(lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l)

I think this should more correspond to the exact formula

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without promotion to Float64:

Suggested change
l = hypot(lx, ly)
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l)
l = invsqrt2 * hypot(lx, ly)
kernel = (sqrt(lx * ly) / l) * with_lengthscale(SqExponentialKernel(), l)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct though because l is passed into with_lengthscale - doesn't the SqExponentialKernel already have the 1/2 in the exponential looking here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be

l = hypot(lx, ly)
kernel = (sqrt(2 * lx * ly) / l) * with_lengthscale(SqExponentialKernel(), 1 / l^2)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct though because l is passed into with_lengthscale - doesn't the SqExponentialKernel already have the 1/2 in the exponential looking here

Yes but in the latex formula the factor 2 is not there, so this way we cancel it out!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the 2 disappears because we already have l(x) + l(y) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the 2 would be there when l(x) = l(y) = constant.

I think I finally understand most of the things here but I am confused by with_lengthscale. Can you explain why we shouldn't square the argument to with_lengthscale? Looking at the with_lengthscale code it seems to just multiply by 1/l.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the 2 disappears because we already have l(x) + l(y) ?

I'm not sure what 2 you refer to (and there is no l(x) + l(y) term anywhere 🤔). The 2 in the square root outside of the squared exponential kernel disappears because we already scale the square root of the denominator by 1/sqrt(2).

why we shouldn't square the argument to with_lengthscale

It will be squared by the squared exponential kernel. A squared exponential kernel with lengthscale l is of the form k(x, y) = exp(- (x - y)^2/(2*l^2)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the l(x) + l(y) term I was refering to is the denominator in the exponential i.e.
exp(-(x - y)^2 / (l(x)^2 + l(y)^2)

And when l(x) = l(y) = l = constant you get exp(-(x - y)^2 / (2l^2) ?

Also thanks for explaining to me about the with_lengthscale

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, if they are constant you get exp(-(x-y)^2 / (2 * l^2)). Very explicitly, in general the implementation yields

$$k(x, y) = sqrt(l(x) * l(y)) / (sqrt(l(x)^2 + l(y)^2) / sqrt(2)) * exp(- (x - y)^2 / (2 * (sqrt(l(x)^2 + l(y)^2) / sqrt(2))^2)) = sqrt(2 * l(x) * l(y) / (l(x)^2 + l(y)^2)) * exp(- (x - y)^2 / (l(x)^2 + l(y)^2))$$

which is exactly the formula in the docstring (and the other issue).

return kernel(x, y)
end
32 changes: 32 additions & 0 deletions test/kernels/gibbskernel.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
@testset "gibbskernel" begin
x = randn(2)
y = randn(2)

# generate random number of standard SqExponentialKernel lengthscale
# taking exp() to ensure ell is positive
ell(x) = exp(sum(sin, x))

# this is the gibbs lengthscale function.
# to check that we can recover the stationary SqExponentialKernel
# with a constant lengthscale we just set this function
# equal to a constant.
l_func(x) = ell(1)

# in order to compare to the Gibbs kernel we compute the
# equivalent lengthscale l(x)^2 + l(y)^2.
# See the denominator of the exponential term in the gibbs kernel.
lengthscale = hypot(l_func(x), l_func(y))

# create a SqExponentialKernel with a lengthscale
k = with_lengthscale(SqExponentialKernel(), lengthscale)

# create a gibbs kernel with our constant lengthscale function
k_gibbs = GibbsKernel(l_func)
Copy link
Member

@theogf theogf Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ell(x) = exp(sum(sin, x))
# this is the gibbs lengthscale function.
# to check that we can recover the stationary SqExponentialKernel
# with a constant lengthscale we just set this function
# equal to a constant.
l_func(x) = ell(1)
# in order to compare to the Gibbs kernel we compute the
# equivalent lengthscale l(x)^2 + l(y)^2.
# See the denominator of the exponential term in the gibbs kernel.
lengthscale = hypot(l_func(x), l_func(y))
# create a SqExponentialKernel with a lengthscale
k = with_lengthscale(SqExponentialKernel(), lengthscale)
# create a gibbs kernel with our constant lengthscale function
k_gibbs = GibbsKernel(l_func)
# this is the gibbs lengthscale function.
ell(x) = exp(sum(sin, x))
# create a gibbs kernel with our specific lengthscale function
k_gibbs = GibbsKernel(ell)

That's more what I meant to use instead of a constant.


# check they are equal
@test k_gibbs(x, y) == k(x, y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the test it would be great if you could be as faithful as possible to the latex formula, it makes it easier to see if it's doing the right thing!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure what you mean, could you elaborate or sketch out an idea of what you would like to see?

Thanks!

Copy link
Member

@theogf theogf Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like

@test k_gibbs(x, y) == sqrt((2 * ell(x) * ell(y)) / (ell(x)^2 + ell(y)^2)) * exp( - norm(x - y)^2 / (ell(x)^2 + ell(y)^2))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n.b. probably better to use here rather than ==.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see what you mean. I've added your suggestion but actually this test is failing and I'm not sure why!

I don't think I fully understand the implementation in https://github.com/Cyberface/KernelFunctions.jl/blob/gibbskernel/src/kernels/gibbskernel.jl#L37 so will take a look a bit later again.

Copy link
Contributor Author

@Cyberface Cyberface Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the using norm(x-y) is correct here? Calling this on

x = randn(2)
y = randn(2)
norm(x-y)

This returns a single number - shouldn't this be a 2x2 matrix?

Assuming the norm is from LinearAlgebra ?

Sorry for my confusion!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No no, this the L2 norm! So it's sqrt(sum((x_i - y_i)^2)), this is what we would expect I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I get you now - I thought this was supposed to give the covariance matrix but that is the kernelmatrix function!

This is just the covariance between two points in n-dimensions.

Hopefully I got this correct :)


k_gibbs = GibbsKernel(ell)
@test k_gibbs(x, y) ≈ sqrt((2 * ell(x) * ell(y)) / (ell(x)^2 + ell(y)^2)) * exp(- norm(x - y)^2 / (ell(x)^2 + ell(y)^2))

end
1 change: 1 addition & 0 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,7 @@ include("test_utils.jl")
include("kernels/transformedkernel.jl")
include("kernels/normalizedkernel.jl")
include("kernels/neuralkernelnetwork.jl")
include("kernels/gibbskernel.jl")
end
@info "Ran tests on Kernel"
end
Expand Down