
Formalise AD integration status, rewrite AD page correspondingly #595


Open

penelopeysm wants to merge 6 commits into main from py/ad

Conversation

@penelopeysm (Member) commented Mar 28, 2025

This is an initial attempt to put down in words what @willtebbutt and I think our approach to integrating AD backends should be going forward.

@penelopeysm changed the title from "AD formalism" to "Formalise AD integration status, rewrite AD page correspondingly" on Mar 28, 2025
@penelopeysm force-pushed the py/ad branch 2 times, most recently from 23469c6 to e594515 on March 29, 2025 at 17:02
Contributor

Preview the changes: https://turinglang.org/docs/pr-previews/595
Please avoid using the search feature and navigation bar in PR previews!

@willtebbutt (Member) left a comment


Some thoughts.

Comment on lines +76 to +78
**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
If you submit an issue about using Turing with a Tier 3 library, we will actively try to make it work.
Realistically, this is only possible for AD backends that are actively maintained by somebody on the Turing team, such as Mooncake.
Member

What are the limits to how far we're willing to take this? Per our discussion yesterday, if someone does something really non-differentiable (e.g. a custom ccall), we're not going to try to add support for their use case.

Maybe "we will actively try to make it work" could be extended to say how we'll try to make it work? For example, if someone encounters a bug, we'll fix it; but if they're doing something unusual, we might instead suggest a more standard approach that avoids the problem they're seeing entirely.

Member

Maybe we could add a paragraph above or below the list of tiers explaining what it means for a backend to work with Turing? It's mostly relevant for Tier 3, but it could be a good clarification more generally too. Something like:

When we say that an AD backend works with Turing, we mean that it is able to differentiate any Turing model that depends only on Turing and some common Julia standard library modules such as LinearAlgebra. Note that a Turing model can include arbitrary Julia code, which can involve code dependencies on other packages such as differential equation solvers or external calls using ccall (should something else be added to the list of exclusions?). If a Tier 2 or 3 AD backend fails on a Turing model because of such features, we may still be able to help you out in some cases, but we may also consider the problem to be outside our control or scope.

Good to also keep in mind that while it's nice to be clear and explicit about our thinking, it's not a legal contract and we don't have to be suuuuper precise about our wording on what we commit to fixing. It's all still subject to the usual uncertainties of academic funding and time anyway.
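
To make the proposed scope (and the ccall point above) concrete, here is a hypothetical pair of models; the names and details are made up purely for illustration:

```{julia}
# Hypothetical illustration of the proposed scope (model names are made up).
# The first model depends only on Turing and LinearAlgebra, so it would be
# covered; the second wraps an external call via ccall, which AD backends
# generally cannot differentiate through, so it would fall outside the scope.
using Turing, LinearAlgebra

@model function in_scope(y)
    μ ~ MvNormal(zeros(2), I)
    y ~ MvNormal(μ, I)
end

# Call libm's cos directly, purely to stand in for "arbitrary external code".
c_cos(x::Float64) = ccall(:cos, Float64, (Float64,), x)

@model function out_of_scope(y)
    θ ~ Normal()
    y ~ Normal(c_cos(θ), 1.0)  # errors under AD: θ is not a plain Float64
end
```

An issue about the second model failing under a Tier 2/3 backend would then arguably be out of scope.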

Comment on lines +192 to +214
Firstly, you could broaden the type of the container:

```{julia}
@model function forwarddiff_working1()
    x = Real[0.0, 1.0]
    a ~ Normal()
    x[1] = a
    b ~ MvNormal(x, I)
end
sample(forwarddiff_working1(), NUTS(; adtype=AutoForwardDiff()), 10)
```

Or, you can pass a type as a parameter to the model:

```{julia}
@model function forwarddiff_working2(::Type{T}=Float64) where T
    x = T[0.0, 1.0]
    a ~ Normal()
    x[1] = a
    b ~ MvNormal(x, I)
end
sample(forwarddiff_working2(), NUTS(; adtype=AutoForwardDiff()), 10)
```
Member

Would it be helpful to make it clear to users that the second option here is generally much preferable to the first, and that the first should only be used if the second doesn't work for some reason?

Member Author

Yes, definitely, and a link to https://discourse.julialang.org/t/vector-real-vector-float64-methoderror/25926/5 might also be helpful (it helped me back in the day)
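
For context, the failing pattern that both workarounds address is presumably something along these lines (a hypothetical sketch; the model name is made up):

```{julia}
# Hypothetical sketch of the failing pattern: the container is a concretely
# typed Vector{Float64}, so ForwardDiff cannot store its Dual numbers in it
# and gradient evaluation errors out.
using Turing, LinearAlgebra

@model function forwarddiff_failing()
    x = [0.0, 1.0]   # Vector{Float64}
    a ~ Normal()
    x[1] = a         # fails under ForwardDiff: `a` is a Dual, not a Float64
    b ~ MvNormal(x, I)
end
```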

### Usable AD Backends

Turing.jl uses the functionality in [DifferentiationInterface.jl](https://github.com/JuliaDiff/DifferentiationInterface.jl) ('DI') to interface with AD libraries in a unified way.
Thus, in principle, any AD library that has integrations with DI can be used with Turing; you should consult the [DI documentation](https://juliadiff.org/DifferentiationInterface.jl/DifferentiationInterface/stable/) for an up-to-date list of compatible AD libraries.
Member

The plural in "integrations with DI" feels funny to me.
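
As a concrete illustration of the DI plumbing described above: switching backend is just a matter of passing the corresponding ADTypes object as `adtype`. A minimal sketch, using ReverseDiff as an assumed example backend and a made-up `demo` model:

```{julia}
# Minimal sketch: any DI-compatible backend is selected via its ADTypes object.
# ReverseDiff is used here only as an example of a non-default backend; the
# backend package itself must be loaded for DI to be able to use it.
using Turing
using ReverseDiff

@model function demo()
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
end

# AutoReverseDiff comes from ADTypes and is re-exported by recent Turing
# versions; adjust the import if your version differs.
chain = sample(demo(), NUTS(; adtype=AutoReverseDiff()), 100)
```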

| 1 | Yes | No | 'You're on your own' | Enzyme, Zygote |
| 0 | No | No | 'You can't use this' | |

**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing.
Member

Suggested change
**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing.
**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing, or is known to have serious enough issues when used with Turing to render it useless.

This would cover cases like what might happen with Zygote, where we know it won't work with any Turing models, so there's no point in even trying.

**Tier 1** means that the AD library is integrated with DI, and you can try to use it with Turing if you like; however, we provide no guarantee that it will work correctly.
If you submit an issue about using Turing with a Tier 1 library, it is unlikely that we will be able to help you, unless the issue is very simple to fix.

**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of DynamicPPL's continuous integration (CI) tests.
@mhauru (Member) commented on Apr 7, 2025

Suggested change
**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of DynamicPPL's continuous integration (CI) tests.
**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of Turing's continuous integration (CI) tests.

Since these are user-facing docs, I think we can't assume that the reader knows what DPPL is. If "Turing" sounds too much like the Turing.jl repo specifically, we could also be deliberately vaguer and say something like "our CI tests". It's also nice not to have to re-edit this if we just move some tests around.
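
As an aside on what such CI checks boil down to: DifferentiationInterface makes it easy to cross-check one backend's gradients against another's. A minimal sketch on a plain Julia function, not the actual test suite (ADTypes may need to be added to the environment):

```{julia}
# Minimal sketch: cross-check ReverseDiff against ForwardDiff on an ordinary
# Julia function via DifferentiationInterface. Illustrative only; the real CI
# tests exercise full Turing models.
import DifferentiationInterface as DI
using ForwardDiff, ReverseDiff                  # backend packages must be loaded
using ADTypes: AutoForwardDiff, AutoReverseDiff

f(x) = sum(abs2, x) / 2
x = randn(5)

g_fd = DI.gradient(f, AutoForwardDiff(), x)
g_rd = DI.gradient(f, AutoReverseDiff(), x)
isapprox(g_fd, g_rd; rtol=1e-8)                 # expected: true
```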

This may be either due to upstream bugs / limitations (which exist even for ForwardDiff), or simply because of time constraints.
However, if there are workarounds that can be implemented in Turing to make the backend work, we will try to do so.

**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
Member

Suggested change
**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
**Tier 3** is the same as Tier 2, but in addition to that, we also take responsibility for ensuring that the backend works with Turing models.

I felt like the word "formally" wasn't adding anything.


@penelopeysm self-assigned this on Apr 8, 2025