Check master #493
https://github.com/SciML/DiffEqSensitivity.jl/pull/493/checks?check_run_id=3687059298#step:6:295 Something happened, maybe in Zygote, that now makes it miss some adjoints? @DhairyaLGandhi @mcabbott this is a pretty big master failure and I hope we can identify a fix ASAP. I'll try bounding. |
Error is from
My guess from the stack trace is that what's new is
as I believe the rule for |
Yeah, this doesn't look good. Which rule does this miss specifically? It might also explain JuliaGaussianProcesses/KernelFunctions.jl#364 needing the ChainRules types in the adjoint. Maybe we can avoid checking for an |
Found the issue - it's |
So will there be a hotfix patch removing it or should I upper bound? |
This will need to be fixed, since otherwise we start leaking a lot of ChainRules types into the returned gradients:

```julia
Thunk(ProjectTo{AbstractArray}(element = ProjectTo{Float64}(), axes = (Base.OneTo(2),)) ∘ DiffEqSensitivity.var"#380#386"{0, ODESolution{Any, 2, Vector{Any}, Nothing, Nothing, Vector{Float64}, Nothing, ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, Vector{Float64}, ODEFunction{true, typeof(lv!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, ...
```
|
This still means there are ChainRules' types in the adjoints, so it would still error on unthunking. So it doesn't quite solve it. |
@ChrisRackauckas mind running the tests here with FluxML/Zygote.jl#1075 ? |
Seems like there are still failures |
Did a little bit of sleuthing. Turns out, RecursiveArrayTools has been broken as well. FluxML/Zygote.jl#1044 added restrictions on how the gradient may be represented. For example:

```julia
julia> va = RecursiveArrayTools.VectorOfArray([rand(3,3), rand(3,3)]);

julia> gradient(va) do va
           sum(va[1])
       end
ERROR: DimensionMismatch("variable with size(x) == (3, 3, 2) cannot have a gradient with size(dx) == (2,)")
Stacktrace:
 [1] (::ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}}})(dx::Vector{AbstractMatrix{Float64}})
   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/ChM7X/src/projection.jl:219
 [2] _project
   @ ~/Downloads/mwes/diffeqsensitivity/Zygote.jl/src/compiler/chainrules.jl:141 [inlined]
 [3] map(f::typeof(Zygote._project), t::Tuple{VectorOfArray{Float64, 3, Vector{Matrix{Float64}}}}, s::Tuple{Vector{AbstractMatrix{Float64}}})
   @ Base ./tuple.jl:232
 [4] gradient(f::Function, args::VectorOfArray{Float64, 3, Vector{Matrix{Float64}}})
   @ Zygote ~/Downloads/mwes/diffeqsensitivity/Zygote.jl/src/compiler/interface.jl:77
 [5] top-level scope
   @ REPL[30]:1
```

Also, rather than initialising gradients using

```julia
x = Any[ChainRulesCore.NoTangent(), ChainRulesCore.NoTangent(), ChainRulesCore.NoTangent(), ChainRulesCore.NoTangent(), ChainRulesCore.NoTangent(), [1.0, 0.0]]
```

which of course RAT doesn't know how to handle as the gradient of a

Also notice that we get a type instability here: earlier this was inferred type-stable, now it isn't. Since the array is initialised using the eltype from the partial, we should already be producing the correct types of arrays via |
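The typing concern above can be sketched with plain Base types (a hypothetical illustration, not DiffEqSensitivity's actual code): deriving the zero tangent from the primal's own nesting keeps the container concretely typed, unlike an `Any[]` of `NoTangent()`s.

```julia
# Hypothetical sketch (plain Base types, not the package's code):
# build the zero tangent from the primal's own nesting, instead of an
# Any[] container of NoTangent()s, so the eltype stays concrete.
u  = [rand(3, 3), rand(3, 3)]   # primal: a vector of matrices
du = map(zero, u)               # tangent with the same nesting and eltype
eltype(du)                      # Matrix{Float64}, concretely typed
```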
RecursiveArrayTools's problem is that it claims to be a 3-array of numbers, but when you iterate it, it is instead a list of matrices. I think that's going to break a lot of code that believes what it says about itself:
Zygote could special-case it one way or another, because it's well-established. But this is pretty surprising behaviour. |
Nothing about that breaks the iteration part of the AbstractArray interface, so it's not surprising to the interfaces, and it's something code should be compatible with if it's not making untrue assumptions. |
I wouldn't want to special case it in Zygote. Established or not, it obeys regular Julia semantics and implements the same array interface as any other array. |
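The shape-versus-iteration mismatch being discussed can be reproduced with a minimal stand-in type (hypothetical, pure Base Julia; not the real VectorOfArray implementation):

```julia
# Minimal stand-in for VectorOfArray (hypothetical, illustration only):
# reports the size of a 3-array, but iterates as a list of matrices.
struct FakeVoA <: AbstractArray{Float64,3}
    u::Vector{Matrix{Float64}}
end

Base.size(v::FakeVoA) = (size(v.u[1])..., length(v.u))
Base.getindex(v::FakeVoA, i::Int, j::Int, k::Int) = v.u[k][i, j]
# Custom iteration yields whole matrices, not scalar elements:
Base.iterate(v::FakeVoA, k::Int = 1) =
    k > length(v.u) ? nothing : (v.u[k], k + 1)

va = FakeVoA([ones(3, 3), 2 .* ones(3, 3)])
size(va)    # (3, 3, 2): looks like a 3-array of numbers
first(va)   # a 3×3 Matrix: iterates like a vector of matrices
```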
So what gradients are acceptable for this object? It's a requirement that the tangent vector can be added to the original, right?
So what Zygote was producing seems wrong. Must it produce a Tangent? What consumes this, if not Or maybe
Should And, should this work? The in-between indexing that's neither vector-of-arrays nor a 3-array:
|
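The addition requirement mentioned above can be sketched with plain nested Base arrays (a hypothetical illustration, not RecursiveArrayTools code): a usable tangent is one that can be combined elementwise with the primal.

```julia
# Hypothetical sketch with plain Base types: the tangent must mirror the
# primal's structure so that "primal + tangent" makes sense.
u  = [ones(2, 2), ones(2, 2)]        # primal: a vector of matrices
du = [fill(0.5, 2, 2), zeros(2, 2)]  # tangent with the same nesting
u_new = map(+, u, du)                # gradient-step-style update succeeds
# A flat 2×2×2 tangent would need reshaping before it could be added.
```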
What's your suggestion on what I should do then to make it not error? I would've assumed ProjectTo with just the first two overloads would be correct. |
This will break 3-array gradients like

Whether what's written will "make it not error" I don't know. This is a real question:
i.e. what happens downstream? Did it previously handle both 3-array and vector of array gradients, or only the latter? Does it just use iteration or will it expect indexing / broadcasting to work exactly like a vector of matrices? |
It broadcasts like an Array under the interpretation of its multidimensional indexing. It just uses the AbstractArray interface fallbacks for all of that. |
Sure. What I mean is, what broadcasting behaviour is expected of the gradient? Changing a vector of arrays into a VectorOfArrays (as the above projection does) will change this. |
It should broadcast like a VectorOfArray |
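The distinction at issue can be sketched with plain Base types (hypothetical illustration): the two representations broadcast differently, which is exactly what downstream code will notice when the gradient's container type changes.

```julia
# Sketch with plain Base types (hypothetical): the same numbers broadcast
# very differently as a vector of matrices versus a flat 3-array.
vom = [ones(2, 2), ones(2, 2)]   # vector of matrices
a3  = ones(2, 2, 2)              # flat 3-array with the same elements
vom2 = vom .* 2                  # still a Vector of 2x2 Matrices
a32  = a3 .* 2                   # still a 2x2x2 Array
```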