-
-
Notifications
You must be signed in to change notification settings - Fork 613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse grads for getindex #589
Conversation
I realized the type signatures might be a little bit too strong here, so I'm adjusting.
this would suggest that |
Δ′ = zero(xs) | ||
Δ′[i...] = data(Δ) | ||
(nobacksies(:getindex, Δ′), map(_->nothing, i)...) | ||
checkbounds(xs, i...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed given that we already did xs[i...]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch:)
Base.similar(x::SparseGrad{T,N,S,P,O}) where {T,N,S,P,O} = similar(O, size(x)) | ||
|
||
#FIXME: Very slow getindex. | ||
function Base.getindex(x::SparseGrad, i...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to just implement the scalar version and have the rest fall back? Or is this need for the GPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, the scalar case is worth it only for a very small queries (the indexin()
s are expensive), hence I added the allocating version for the general case.
I like the general idea here but I don't think this should touch |
Gradient definition of |
Yes, touching things in |
This is a proposition that fixes #577. I basically delay the gradient computation to happen inplace in the
accum!()
call and not in the@grad
definition, which allows me to avoid the copy. On an extreme cornercase:Before, I got the following timings:
And after:
There is a downside:
getindex
ing into the sparse structure I use is very slow and I think there is somegetindex
ing happening in the jacobians (at least the jacobian tests were complaining).Please let me know what you think.