Gradient definitions & supertypes for Zygote, continued #169

Merged: 6 commits, Sep 30, 2021

Conversation

ChrisRackauckas
Member

Continues #168

@mcabbott

I didn't have write permissions for some reason, so I'm continuing it here.

src/zygote.jl Outdated
Comment on lines 46 to 52
# Define a new species of projection operator for this type:
ChainRulesCore.ProjectTo(x::VectorOfArray) = ChainRulesCore.ProjectTo{VectorOfArray}()

# Gradient from iteration will be e.g. Vector{Vector}, this makes it another AbstractMatrix
(::ChainRulesCore.ProjectTo{VectorOfArray})(dx::AbstractVector{<:AbstractArray}) = VectorOfArray(dx)
# Gradient from broadcasting will be another AbstractArray
(::ChainRulesCore.ProjectTo{VectorOfArray})(dx::AbstractArray) = dx
Contributor

I believe this may not be necessary at all. One thing I thought to test was whether iteration like this works without it, and it does, because it hits the @adjoint getindex rule:

julia> function iter(vofa)
       s = 0
       for a in vofa
         s += prod(a)
       end
       s
       end;

julia> gradient(iter, va)[1]
VectorOfArray{Float64,3}:
2-element Vector{Matrix{Float64}}:
 [0.007377548303139424 0.0014293014720444096 0.0004998127128840348; 0.0005414269337139141 0.0007721441834498009 0.0006559506948612249; 0.0042378737935180105 0.0006765914005991947 0.00045986425172967415]
 [0.002774992305796606 0.002978041675310144 0.004412709924140469; 0.0056425205408066285 0.005228088118453952 0.0036646150274027; 0.0036825199036535465 0.004902176341789764 0.045170987413739046]
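
As a sanity check on what such a gradient should contain, here is a minimal sketch in plain Julia, without Zygote or RecursiveArrayTools. It assumes a plain `Vector` of matrices stands in for `va`, and `expected_gradient` is a hypothetical helper name, not part of either package. Since `prod(a)` is linear in each entry, the derivative with respect to `a[k]` is `prod(a) / a[k]` for nonzero entries, which a finite difference can confirm.

```julia
# Same loop as in the comment above, applied to a plain Vector of matrices.
function iter(vofa)
    s = 0
    for a in vofa
        s += prod(a)
    end
    s
end

# Hypothetical helper: analytic gradient of `iter`, valid when no entry is zero.
# d/da[k] prod(a) = prod(a) / a[k]
expected_gradient(vofa) = [prod(a) ./ a for a in vofa]

# Stand-in data (an assumption; the original `va` is not shown in the thread).
u = [rand(3, 3) .+ 0.5, rand(3, 3) .+ 0.5]
g = expected_gradient(u)

# Forward finite-difference check of a single entry.
h = 1e-6
up = deepcopy(u)
up[1][2, 3] += h
fd = (iter(up) - iter(u)) / h

println(isapprox(fd, g[1][2, 3]; rtol = 1e-3))
```

The result has one gradient matrix per inner array, matching the shape of the `VectorOfArray` printed above.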

src/zygote.jl Outdated

# These rules duplicate the `rrule` methods above, because Zygote looks for an `@adjoint`
# definition first, and finds its own before finding those.

ZygoteRules.@adjoint function getindex(VA::AbstractVectorOfArray, i::Union{Int,AbstractArray{Int},CartesianIndex,Colon,BitArray,AbstractArray{Bool}})
    function AbstractVectorOfArray_getindex_adjoint(Δ)
        Δ′ = [ (i == j ? Δ : zero(x)) for (x,j) in zip(VA.u, 1:length(VA))]
Contributor

Unrelated to the PR, but this seems to allocate quite a bit when iterating a VectorOfArray. I guess that using Fill(0.0, size(Δ)) would often give Δ′ an abstract element type?

Member Author

Yes, it would make it an abstract type and sometimes hurt inference. Then we'd have to rely on union optimizations and pray.

Member Author

Relying on union optimizations might be the right idea here, though; I'll have to check.
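
The trade-off discussed in this thread can be illustrated with a minimal sketch in plain Julia. `LazyZeros` is a hypothetical stand-in for a lazy zero array such as `Fill(0.0, dims)` from FillArrays, defined here only so the example needs no packages; `u`, `Δ` and `i` are likewise assumed placeholders for `VA.u`, the incoming cotangent, and the index.

```julia
# Hypothetical lazy zero array: stores only its dimensions, never the zeros.
struct LazyZeros{T,N} <: AbstractArray{T,N}
    dims::NTuple{N,Int}
end
Base.size(z::LazyZeros) = z.dims
Base.getindex(z::LazyZeros{T,N}, I::Vararg{Int,N}) where {T,N} = zero(T)

u = [rand(3, 3), rand(3, 3)]   # plays the role of VA.u
Δ = ones(3, 3)                 # incoming cotangent for slot i
i = 1

# Dense version, as in the rule under review:
# allocates a full zero matrix for every slot other than i.
dense = [(i == j ? Δ : zero(x)) for (x, j) in zip(u, 1:length(u))]

# Lazy version: no zero matrices are allocated, but the entries now have
# two different types, so the element type of the result is widened.
lazy = [(i == j ? Δ : LazyZeros{Float64,2}(size(x))) for (x, j) in zip(u, 1:length(u))]

println(isconcretetype(eltype(dense)))
println(isconcretetype(eltype(lazy)))
```

The dense comprehension keeps a concrete element type at the cost of one allocation per slot, while the lazy one avoids the allocations but leaves downstream code dependent on how well the compiler handles the mixed element types.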

ChrisRackauckas merged commit 1d94f35 into master on Sep 30, 2021
ChrisRackauckas deleted the grad branch on September 30, 2021, 11:18
2 participants