Adapt with grad, jvp, etc #19

Closed

willtebbutt opened this issue May 30, 2019 · 7 comments
@willtebbutt (Member)

Is it clear how to use the fancy new `adapt` kwarg with `grad`, `jvp`, and the like? Or do we need to extend them to cope with it?

@wesselb (Member) commented May 30, 2019

I think you can just pass `FDM.Central(5, 1; adapt=1)` (for example) as the `fdm` argument of, say, `jacobian`.
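
A minimal sketch of this usage (using the documented `central_fdm` constructor; the exact `jacobian` signature here is an assumption):

```julia
using FDM

# 5-point central method for the first derivative, with one round of
# adaptation enabled via the `adapt` keyword.
fdm = central_fdm(5, 1; adapt=1)

f(x) = sin.(x) .* x .^ 2
J = jacobian(fdm, f, randn(3))  # finite-difference Jacobian estimate
```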

On another note, I think jvps might be estimated more stably numerically in this way. What do you think?

@willtebbutt (Member, Author)

> I think you can just pass `FDM.Central(5, 1; adapt=1)` (for example) as the `fdm` argument of, say, `jacobian`.

Ah nice. Any chance we could document this new API? Or is it documented and I've just missed it?

> On another note, I think jvps might be estimated more stably numerically in this way. What do you think?

If I've understood correctly (that the Python implementation computes directional derivatives w.r.t. the full function, rather than estimating the Jacobian separately and then computing its product with the vector in question), then I agree. This is probably also quite a bit more performant. Maybe open a separate issue about it?
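
For contrast, a sketch of the "estimate the Jacobian separately, then multiply" approach described above (hypothetical helper; assumes `jacobian` returns a matrix):

```julia
# Naive jvp: one full finite-difference Jacobian estimate per call,
# followed by an ordinary matrix-vector product.
naive_jvp(fdm, f, x, v) = jacobian(fdm, f, x) * v
```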

@wesselb (Member) commented May 30, 2019

Yeah, the new API has not been documented yet. That still needs to be done. It was added in the recent PR that introduced adaptation.

> If I've understood correctly (that the Python implementation computes directional derivatives w.r.t. the full function, rather than estimating the Jacobian separately and then computing its product with the vector in question), then I agree. This is probably also quite a bit more performant. Maybe open a separate issue about it?

Actually, I just realised that a jvp is just a directional derivative... See this; super simple! That should indeed be much more performant, and probably more accurate, than what we're currently doing. Can we do j'vp in a similar way, do you think? We should indeed open an issue about this, because ChainRules uses jvp quite extensively.
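
A sketch of the directional-derivative formulation (assuming a method object `fdm` is callable as `fdm(g, x)` and handles vector-valued `g`):

```julia
# jvp as a single directional derivative: differentiate the scalar-argument
# function h -> f(x + h * v) at h = 0. One finite-difference estimate
# replaces an entire Jacobian computation.
directional_jvp(fdm, f, x, v) = fdm(h -> f(x .+ h .* v), zero(eltype(x)))
```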

@ararslan (Member) commented May 30, 2019

> Yeah, the new API has not been documented yet.

It is documented; see https://invenia.github.io/FDM.jl/latest/pages/api.html#FDM.central_fdm. EDIT: Except the link to `FDMethod` is broken... but the `fdm` docs describe the keyword arguments.

@wesselb (Member) commented May 31, 2019

Ah, sorry, you're completely right that it is documented! I meant that we might want to add an example usage to the README.

@oxinabox (Member) commented Feb 2, 2020

So `adapt` and `grad`/`jvp` etc. do not play particularly nicely together.

AFAICT, we regenerate the coefficients each time `adapt` is triggered, as well as at the start, which means it can happen once for every element of the input to `grad`.

I have been playing around with precomputing them to see if that would give a performance improvement, but I can't seem to trigger `adapt` anymore...
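
For illustration, a hypothetical sketch of the precomputation idea (the cache key layout and `compute_coefs` helper are made up):

```julia
# Cache coefficient sets by (grid size, derivative order, adapt level) so
# that adaptation on each element of `grad`'s input reuses earlier work
# instead of re-solving for the coefficients every time.
const COEFS_CACHE = Dict{Tuple{Int,Int,Int},Vector{Float64}}()

cached_coefs(p, q, adapt) =
    get!(() -> compute_coefs(p, q, adapt), COEFS_CACHE, (p, q, adapt))
```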

@oxinabox (Member)

Fixed enough now, I think, via memoization in #73.
