-
Notifications
You must be signed in to change notification settings - Fork 12
StackOverFlow error with automatic differentiation #25
Comments
@DhairyaLGandhi Is this a zygote issue? |
The error isn't a Zygote specific issue, but will need something like FluxML/Zygote.jl#762 for sparse arrays. Also, are there mutations in the solve? |
I uncovered a few issues, and I think the best way to deal with this is to post some minimal examples of code in the solve that error out with Zygote: First: a = sparse(rand(10,10))
ml = smoothed_aggregation(a)
function gg(b)
res = zeros(10)
x = rand(10)
mul!(res, a, x)
res .= res .- b
norm(res)
end Zygote does not like both Second: a = sparse(rand(10, 10))
x = rand(10)
function g2(x)
sum(pinv(Matrix(a)) * x)
end
Zygote.gradient(g2, x) Shouldn't Zygote just return There are more issues in the ForwardDiff level as well. The stack overflow occurs on Julia v1.5 comes from Dual numbers being fed into If we can decide how to fix all these three issues, this example will work. |
Yeah we ideally don't want to be mutating here, is there a significant impact on runtime? If so, we can hide the mutation behind a dummy adjoint.to get the perf. In the second case, we return the shape of |
@DhairyaLGandhi Would it be possible to make a PR for this in GenericSVD.jl? |
@ViralBShah happy to! Although I would need to know where exactly in the package to make any changes. I think we can steal the adjoint of @ranjanan what all should we include here? The allocating version: julia> a = sparse(rand(10,10));
julia> ml = smoothed_aggregation(a);
julia> function gg(a, b)
x = rand(10)
tmp = a * x
res = tmp .- b
norm(res)
end
gg (generic function with 1 method)
julia> gradient(gg, a, rand(10))
([0.05043748387462807 0.29173903907441895 … 0.26446805170314763 0.1852669996547526; 0.04379342989124224 0.25330869370886433 … 0.22963007253740866 0.16086205609916615; … ; 0.04194103727965614 0.24259413780762254 … 0.2199170847485408 0.15405784631375644; 0.015444321299626654 0.08933259768289514 … 0.0809820246811391 0.05673009137406974], [-0.33098881300351585, -0.2873881538794712, -0.17353663654685716, -0.4018202140012206, -0.47086562303011126, -0.22966622556149668, -0.3696521183095285, -0.3451833506963632, -0.275232090875825, -0.10135116151541051]) For the sparse case: @adjoint Matrix(a::SparseMatrixCSC) = Matrix(a), Δ -> (Δ,) julia> gradient(g2, a, x)[1]
10×10 Array{Float64,2}:
0.297149 -4.15341 1.97924 2.14591 -2.02367 1.09381 -4.71775 2.83978 -0.272864 0.152132
-0.0981608 1.37204 -0.653825 -0.708883 0.668503 -0.36133 1.55847 -0.938097 0.0901384 -0.0502557
0.237394 -3.31817 1.58122 1.71437 -1.61672 0.873845 -3.76903 2.26871 -0.217992 0.121539
-0.825264 11.5351 -5.49688 -5.95976 5.62028 -3.03779 13.1025 -7.88683 0.757817 -0.422513
-0.180465 2.52246 -1.20204 -1.30326 1.22902 -0.664293 2.86519 -1.72466 0.165716 -0.0923934
0.378071 -5.28449 2.51824 2.7303 -2.57477 1.39168 -6.00253 3.61313 -0.347173 0.193562
0.446601 -6.24237 2.9747 3.2252 -3.04148 1.64394 -7.09056 4.26805 -0.410102 0.228648
-0.711433 9.94406 -4.73868 -5.13772 4.84506 -2.61878 11.2952 -6.79898 0.65329 -0.364235
1.32769 -18.5578 8.84343 9.58812 -9.04196 4.88723 -21.0794 12.6884 -1.21918 0.679742
0.463574 -6.4796 3.08775 3.34776 -3.15707 1.70641 -7.36002 4.43025 -0.425687 0.237337
Hmm, shouldn't it be something like |
if I make the matvecs allocating (RA/diff branch on AMG.jl), I am able to get the following to work. import AlgebraicMultigrid
import ForwardDiff
using GenericSVD
import LinearAlgebra
import LinearAlgebra: givensAlgorithm
import SparseArrays
import Zygote
import Zygote: @adjoint
@adjoint Matrix(a::SparseMatrixCSC) = Matrix(a), Δ -> (Δ,)
hessian_vector_product(f, x, v) = ForwardDiff.jacobian(s->Zygote.gradient(f, x + s[1] * v)[1], [0.0])[:]
n = 4
x = randn(2 * n + 1)
k = x[1:n + 1]
B = SparseArrays.spdiagm(0=>k[1:end - 1] + k[2:end], -1=>-k[2:end - 1], 1=>-k[2:end - 1])
function g(x)
ml = AlgebraicMultigrid.ruge_stuben(B)
return sum(AlgebraicMultigrid.solve(ml, x[n + 2:end], calculate_residual = false))
end
v = randn(2 * n + 1)
Zygote.gradient(g, x) returns
julia> hessian_vector_product(g, x, v)
9-element Vector{Float64}:
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0 does not seem right to me though. I'll check what's happening. |
Ah fixing ERROR: LoadError: Need an adjoint for constructor SparseMatrixCSC{Float64, Int64}. Gradient is of type Matrix{Float64}
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:33
[2] (::Zygote.Jnew{SparseMatrixCSC{Float64, Int64}, Nothing, false})(Δ::Matrix{Float64})
@ Zygote ~\.julia\packages\Zygote\CgsVi\src\lib\lib.jl:311
[3] (::Zygote.var"#1720#back#194"{Zygote.Jnew{SparseMatrixCSC{Float64, Int64}, Nothing, false}})(Δ::Matrix{Float64})
@ Zygote ~\.julia\packages\ZygoteRules\OjfTt\src\adjoint.jl:59
[4] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:32 [inlined]
[5] (::typeof(∂(SparseMatrixCSC{Float64, Int64})))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[6] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:45 [inlined]
[7] (::typeof(∂(SparseMatrixCSC)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[8] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:930 [inlined]
[9] (::typeof(∂(sparse!)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[10] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:767 [inlined]
[11] (::typeof(∂(sparse)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[12] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:956 [inlined]
[13] (::typeof(∂(sparse)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[14] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:3654 [inlined]
[15] (::typeof(∂(_spdiagm)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[16] (::Zygote.var"#178#179"{Tuple{Tuple{Nothing}, Tuple{Nothing, Nothing, Nothing}}, typeof(∂(_spdiagm))})(Δ::Matrix{Float64})
@ Zygote ~\.julia\packages\Zygote\CgsVi\src\lib\lib.jl:191
[17] #1686#back
@ ~\.julia\packages\ZygoteRules\OjfTt\src\adjoint.jl:59 [inlined]
[18] Pullback
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\SparseArrays\src\sparsematrix.jl:3616 [inlined]
[19] (::typeof(∂(spdiagm)))(Δ::Matrix{Float64})
@ Zygote .\compiler\interface2.jl:0
[20] Pullback
@ ~\Documents\Work\CS\AMGdiff\example.jl:28 [inlined]
[21] (::typeof(∂(g)))(Δ::Float64)
@ Zygote .\compiler\interface2.jl:0
[22] (::Zygote.var"#41#42"{typeof(∂(g))})(Δ::Float64)
@ Zygote ~\.julia\packages\Zygote\CgsVi\src\compiler\interface.jl:41
[23] gradient(f::Function, args::Vector{Float64})
@ Zygote ~\.julia\packages\Zygote\CgsVi\src\compiler\interface.jl:59
[24] top-level scope
@ ~\Documents\Work\CS\AMGdiff\example.jl:34
[25] include(fname::String)
@ Base.MainInclude .\client.jl:444
[26] top-level scope
@ REPL[72]:1
in expression starting at C:\Users\Ranjan Anantharaman\Documents\Work\CS\AMGdiff\example.jl:34 @DhairyaLGandhi what's the right adjoint here? |
Oh I thought we had added that constructor already - but I see it was specifically for the |
Is the hessian vector product doing the right thing otherwise? |
We could also add Zygote.@adjoint function LinearAlgebra.mul!(res, A, B)
LinearAlgebra.mul!(res, A, B), Δ -> (nothing, Δ * B', A' * Δ)
end if its not too much abuse. It would let you use the non-allocating
@adjoint function SparseArrays.spdiagm(x::Pair...)
ks = first.(x)
SparseArrays.spdiagm(x...), Δ -> begin
tuple((k => diag(Δ, k) for k in ks)...)
end
end |
@DhairyaLGandhi the spdiagm one worked, thank you! Could I bother you for one more adjoint? ERROR: LoadError: Need an adjoint for constructor Pair{Int64, Vector{Float64}}. Gradient is of type Pair{Int64, Vector{Float64}}
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:33
[2] (::Zygote.Jnew{Pair{Int64, Vector{Float64}}, Nothing, false})(Δ::Pair{Int64, Vector{Float64}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\lib\lib.jl:314
[3] (::Zygote.var"#1720#back#194"{Zygote.Jnew{Pair{Int64, Vector{Float64}}, Nothing, false}})(Δ::Pair{Int64, Vector{Float64}})
@ Zygote ~\.julia\packages\ZygoteRules\OjfTt\src\adjoint.jl:59
[4] Pullback
@ .\pair.jl:12 [inlined]
[5] Pullback
@ .\pair.jl:15 [inlined]
[6] (::typeof(∂(Pair)))(Δ::Pair{Int64, Vector{Float64}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[7] Pullback
@ ~\Documents\Work\CS\AMGdiff\example.jl:34 [inlined]
[8] (::typeof(∂(g)))(Δ::Float64)
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[9] (::Zygote.var"#41#42"{typeof(∂(g))})(Δ::Float64)
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface.jl:41
[10] gradient(f::Function, args::Vector{Float64})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface.jl:59
[11] top-level scope
@ ~\Documents\Work\CS\AMGdiff\example.jl:39
[12] include(fname::String)
@ Base.MainInclude .\client.jl:444
[13] top-level scope
@ REPL[1]:1 |
I'm assuming it would be something like @adjoint function Pair(x...)
Pair(x...), Δ -> (Δ,)
end I wonder if we need a special optimisation rule for it? Not sure. Could also return @adjoint function Pair(x, y)
Pair(x, y), Δ -> (nothing, Δ.second)
end |
Thanks Dhairya, the second adjoint worked. Now we need to resolve this one last issue with the ForwardDiff.jacobian. We need to get ┌ Warning: keyword `alg` ignored in generic svd!
└ @ GenericSVD C:\Users\Ranjan Anantharaman\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:12
ERROR: LoadError: MethodError: no method matching givensAlgorithm(::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, ::ForwardDiff.Dual{ForwardD
iff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1})
Closest candidates are:
givensAlgorithm(::T, ::T) where T at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:253
givensAlgorithm(::Any, ::Any) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:264
Stacktrace:
[1] givensAlgorithm(f::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, g::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Floa
t64}, Vector{Float64}}, Float64}, Float64, 1})
@ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:256
[2] givens(f::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, g::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vec
tor{Float64}}, Float64}, Float64, 1}, i1::Int64, i2::Int64)
@ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:291
[3] svd_gk!(B::Bidiagonal{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vect
or{Float64}, Vector{Float64}}, Float64}, Float64, 1}}}, U::Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}, Vt::Matrix{ForwardDiff.Dual{
ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}, n₁::Int64, n₂::Int64, shift::ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Fl
oat64}}, Float64}, Float64, 1})
@ GenericSVD ~\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:251
[4] svd_bidiag!(B::Bidiagonal{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g),
Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}}, U::Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}, Vt::Matrix{ForwardDiff.D
ual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}, ε::Float64)
@ GenericSVD ~\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:162
[5] svd_bidiag!
@ ~\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:107 [inlined]
[6] generic_svdfact!(X::Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}; sorted::Bool, full::Bool)
@ GenericSVD ~\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:32
[7] svd!(X::Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}; full::Bool, thin::Nothing, alg::LinearAlgebra.DivideAndConquer)
@ GenericSVD ~\.julia\packages\GenericSVD\cT5Cu\src\GenericSVD.jl:18
[8] #svd#85
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\svd.jl:157 [inlined]
[9] pinv(A::Matrix{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}; atol::Float64, rtol::Float64)
@ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\dense.jl:1385
[10] pinv
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\dense.jl:1364 [inlined]
[11] #adjoint#714
@ ~\.julia\packages\Zygote\pM10l\src\lib\array.jl:416 [inlined]
[12] adjoint
@ .\none:0 [inlined]
[13] _pullback
@ ~\.julia\packages\ZygoteRules\OjfTt\src\adjoint.jl:57 [inlined]
[14] _pullback
@ ~\.julia\dev\AlgebraicMultigrid\src\multilevel.jl:57 [inlined]
[15] _pullback(ctx::Zygote.Context, f::Type{AlgebraicMultigrid.Pinv{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}}, args::SparseMatrixCSC{Fo
rwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, Int64})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[16] _pullback
@ ~\.julia\dev\AlgebraicMultigrid\src\multilevel.jl:59 [inlined]
[17] _pullback(ctx::Zygote.Context, f::Type{AlgebraicMultigrid.Pinv}, args::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, In
t64})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[18] _pullback
@ ~\.julia\dev\AlgebraicMultigrid\src\classical.jl:44 [inlined]
[19] _pullback(::Zygote.Context, ::AlgebraicMultigrid.var"##ruge_stuben#13", ::AlgebraicMultigrid.Classical{Float64}, ::AlgebraicMultigrid.RS, ::AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSwee
p}, ::AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, ::Int64, ::Int64, ::Type{AlgebraicMultigrid.Pinv}, ::typeof(AlgebraicMultigrid.ruge_stuben), ::SparseMatrixCSC{ForwardDiff.Dual{ForwardD
iff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}, Int64}, ::Type{Val{1}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[20] _pullback
@ ~\.julia\dev\AlgebraicMultigrid\src\classical.jl:20 [inlined]
[21] _pullback(ctx::Zygote.Context, f::typeof(AlgebraicMultigrid.ruge_stuben), args::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float6
4, 1}, Int64})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[22] _pullback
@ ~\Documents\Work\CS\AMGdiff\example.jl:45 [inlined]
[23] _pullback(ctx::Zygote.Context, f::typeof(g), args::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface2.jl:0
[24] _pullback(f::Function, args::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface.jl:34
[25] pullback(f::Function, args::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface.jl:40
[26] gradient(f::Function, args::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}})
@ Zygote ~\.julia\packages\Zygote\pM10l\src\compiler\interface.jl:58
[27] (::var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}})(s::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}})
@ Main ~\Documents\Work\CS\AMGdiff\example.jl:38
[28] vector_mode_dual_eval
@ ~\.julia\packages\ForwardDiff\sqhTO\src\apiutils.jl:37 [inlined]
[29] vector_mode_jacobian(f::var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float6
4}}, Float64}, Float64, 1, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}})
@ ForwardDiff ~\.julia\packages\ForwardDiff\sqhTO\src\jacobian.jl:147
[30] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1, Vector{ForwardDiff.Dual{ForwardDi
ff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}}, ::Val{true})
@ ForwardDiff ~\.julia\packages\ForwardDiff\sqhTO\src\jacobian.jl:21
[31] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1, Vector{ForwardDiff.Dual{ForwardDi
ff.Tag{var"#15#16"{typeof(g), Vector{Float64}, Vector{Float64}}, Float64}, Float64, 1}}})
@ ForwardDiff ~\.julia\packages\ForwardDiff\sqhTO\src\jacobian.jl:19
[32] hessian_vector_product(f::Function, x::Vector{Float64}, v::Vector{Float64})
@ Main ~\Documents\Work\CS\AMGdiff\example.jl:38
[33] top-level scope
@ ~\Documents\Work\CS\AMGdiff\example.jl:50
[34] include(fname::String)
@ Base.MainInclude .\client.jl:444
[35] top-level scope
@ REPL[1]:1
in expression starting at C:\Users\Ranjan Anantharaman\Documents\Work\CS\AMGdiff\example.jl:50 Here's the MWE. You need to be on import AlgebraicMultigrid
import ForwardDiff
using GenericSVD
import LinearAlgebra
import LinearAlgebra: givensAlgorithm
import SparseArrays
import Zygote
import Zygote: @adjoint
using SparseArrays
using LinearAlgebra
@adjoint Matrix(a::SparseMatrixCSC) = Matrix(a), Δ -> (Δ,)
# include("sparse.jl")
# @adjoint Pair(a, b) = Pair(a, b), Δ -> ((first=nothing, second=Δ),)
@adjoint function Pair(x, y)
Pair(x, y), Δ -> (Δ.first, Δ.second)
end
@adjoint function SparseArrays.spdiagm(x::Pair...)
ks = first.(x)
SparseArrays.spdiagm(x...), Δ -> begin
tuple((k => diag(Δ, k) for k in ks)...)
end
end
hessian_vector_product(f, x, v) = ForwardDiff.jacobian(s->Zygote.gradient(f, x + s[1] * v)[1], [0.0])[:]
n = 4
x = randn(2 * n + 1)
function g(x)
k = x[1:n + 1]
B = SparseArrays.spdiagm(0=>k[1:end - 1] + k[2:end], -1=>-k[2:end - 1], 1=>-k[2:end - 1])
ml = AlgebraicMultigrid.ruge_stuben(B)
return sum(AlgebraicMultigrid.solve(ml, x[n + 2:end], calculate_residual = false))
end
v = randn(2 * n + 1)
Zygote.gradient(g, x)
hessian_vector_product(g, x, v)#stack overflow |
MWE for the ForwardDiff issue: import ForwardDiff: Dual
givensAlgorithm(Dual(1.,1.), Dual(1.,1.)) You get this error: ERROR: MethodError: no method matching givensAlgorithm(::Dual{Nothing, Float64, 1}, ::Dual{Nothing, Float64, 1})
Closest candidates are:
givensAlgorithm(::T, ::T) where T at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:253
givensAlgorithm(::Any, ::Any) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:264
Stacktrace:
[1] givensAlgorithm(f::Dual{Nothing, Float64, 1}, g::Dual{Nothing, Float64, 1})
@ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\givens.jl:256
[2] top-level scope
@ REPL[5]:1 |
Ok, now the following code works (it needs branch import AlgebraicMultigrid
import ForwardDiff
using GenericSVD
import LinearAlgebra
import LinearAlgebra: givensAlgorithm
import SparseArrays
import Zygote
import Zygote: @adjoint
using SparseArrays
using LinearAlgebra
using Random
import ForwardDiff: Dual
import LinearAlgebra: floatmin2
Random.seed!(123)
@adjoint Matrix(a::SparseMatrixCSC) = Matrix(a), Δ -> (Δ,)
@adjoint function Pair(x, y)
Pair(x, y), Δ -> (Δ.first, Δ.second)
end
Zygote.@adjoint function LinearAlgebra.mul!(res, A, B)
LinearAlgebra.mul!(res, A, B), Δ -> (nothing, Δ * B', A' * Δ)
end
@adjoint function SparseArrays.spdiagm(x::Pair...)
ks = first.(x)
SparseArrays.spdiagm(x...), Δ -> begin
tuple((k => diag(Δ, k) for k in ks)...)
end
end
function LinearAlgebra.givensAlgorithm(f::T, g::T) where T
fs = f / oneunit(T)
gs = g / oneunit(T)
c, s, r = _givensAlgorithm(fs, gs)
return c, s, r * oneunit(T)
end
Base.floatmin(x::ForwardDiff.Dual{T}) where T = Dual{T}(floatmin(x.value), floatmin(x.partials...))
Base.floatmin(::Type{ForwardDiff.Dual{T, S, V}}) where {T,S,V} = Dual{T}(floatmin(S), floatmin(S))
function _givensAlgorithm(f::T, g::T) where T
onepar = one(T)
twopar = 2one(T)
T0 = typeof(onepar) # dimensionless
zeropar = T0(zero(T)) # must be dimensionless
# need both dimensionful and dimensionless versions of these:
safmn2 = floatmin2(T0)
safmn2u = floatmin2(T)
safmx2 = one(T)/safmn2
safmx2u = oneunit(T)/safmn2
if g == 0
cs = onepar
sn = zeropar
r = f
elseif f == 0
cs = zeropar
sn = onepar
r = g
else
f1 = f
g1 = g
scalepar = max(abs(f1), abs(g1))
if scalepar >= safmx2u
count = 0
while true
count += 1
f1 *= safmn2
g1 *= safmn2
scalepar = max(abs(f1), abs(g1))
if scalepar < safmx2u break end
end
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
for i = 1:count
r *= safmx2
end
elseif scalepar <= safmn2u
count = 0
while true
count += 1
f1 *= safmx2
g1 *= safmx2
scalepar = max(abs(f1), abs(g1))
if scalepar > safmn2u break end
end
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
for i = 1:count
r *= safmn2
end
else
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
end
if abs(f) > abs(g) && cs < 0
cs = -cs
sn = -sn
r = -r
end
end
return cs, sn, r
end
hessian_vector_product(f, x, v) = ForwardDiff.jacobian(s->Zygote.gradient(f, x + s[1] * v)[1], [0.0])[:]
n = 4
x = randn(2 * n + 1)
function g(x)
k = x[1:n + 1]
B = SparseArrays.spdiagm(0=>k[1:end - 1] + k[2:end], -1=>-k[2:end - 1], 1=>-k[2:end - 1])
ml = AlgebraicMultigrid.ruge_stuben(B)
return sum(AlgebraicMultigrid.solve(ml, x[n + 2:end], calculate_residual = false))
end
v = randn(2 * n + 1)
Zygote.gradient(g, x)
hessian_vector_product(g, x, v) cc: @omalled @ViralBShah I think a PR may need to go to givens in base, because the dispatch there does not take Dual numbers. Also, @DhairyaLGandhi I tried using your |
I made the |
All the relevant adjoints are in When would this be needed in a release? |
Yeah, go ahead and add the Pair adjoint to your PR because you need it for the
Shouldn't it return the same thing as out of place matmul? |
For A and B, it does that, but for the buffer, it's just a bunch of setindex calls, so that's something like |
Btw, are we using zygote for the loops? That may not be optimal. |
Please note that this package may be deprecated in the near future (see #30). |
Awesome work getting this rolling, @ranjanan and @DhairyaLGandhi! I was able to get Ranjan's example working on my machine. However, I run into an issue for larger problems (the example is for
At
|
I ran into an issue getting automatically-differentiated Hessian-vector products to work with AlgebraicMultigrid. GenericSVD was proposed as a potential solution, but using GenericSVD resulted in a StackOverFlowError. Here's some code that reproduces the error:
Here is the output with the stack overflow:
Can this stack overflow be resolved?
The text was updated successfully, but these errors were encountered: