Formalise AD integration status, rewrite AD page correspondingly #595
Conversation
23469c6 to e594515
Preview the changes: https://turinglang.org/docs/pr-previews/595
Some thoughts.
**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
If you submit an issue about using Turing with a Tier 3 library, we will actively try to make it work.
Realistically, this is only possible for AD backends that are actively maintained by somebody on the Turing team, such as Mooncake.
What are the limits to how far we're willing to take this? Per our discussion yesterday, if someone does something really non-differentiable (e.g. a custom `ccall`), we're not going to try and add support for their proposal.
Maybe "we will actively try to make it work" could be extended to say how we'll try to make it work? e.g. if someone encounters a bug, we'll fix it, but if they're doing something unusual we might suggest a more standard way to go about it that avoids the problem they're seeing entirely.
Maybe we could add a paragraph above or below the list of tiers explaining what it means that a backend works with Turing? It's mostly relevant for Tier 3, but it could be a good clarification more generally too. Something like
When we say that an AD backend works with Turing, we mean that it is able to differentiate any Turing model that depends only on Turing and some common Julia standard library modules such as LinearAlgebra. Note that a Turing model can include arbitrary Julia code, which can involve code dependencies on other packages such as differential equation solvers or external calls using `ccall` (should something else be added to the list of exclusions?). If a Tier 2 or 3 AD backend fails on a Turing model because of such features, we may still be able to help you out in some cases, but we may also consider the problem to be outside our control or scope.
Good to also keep in mind that while it's nice to be clear and explicit about our thinking, it's not a legal contract and we don't have to be suuuuper precise about our wording on what we commit to fixing. It's all still subject to the usual uncertainties of academic funding and time anyway.
Firstly, you could broaden the type of the container:
```{julia}
@model function forwarddiff_working1()
    x = Real[0.0, 1.0]
    a ~ Normal()
    x[1] = a
    b ~ MvNormal(x, I)
end
sample(forwarddiff_working1(), NUTS(; adtype=AutoForwardDiff()), 10)
```
Or, you can pass a type as a parameter to the model:
```{julia}
@model function forwarddiff_working2(::Type{T}=Float64) where T
    x = T[0.0, 1.0]
    a ~ Normal()
    x[1] = a
    b ~ MvNormal(x, I)
end
sample(forwarddiff_working2(), NUTS(; adtype=AutoForwardDiff()), 10)
```
Would it be helpful for users to make it clear that the second option here is highly preferable to the first in general, and that the first should only be used if the second doesn't work for some reason?
Yes, definitely, and a link to https://discourse.julialang.org/t/vector-real-vector-float64-methoderror/25926/5 might also be helpful (it helped me back in the day)
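To make the difference concrete, here is a minimal sketch (plain Julia, no packages needed, variable names invented for illustration) of why the typed-parameter version is generally preferable: `Real[...]` creates a container with an abstract element type, which boxes every element and makes downstream code type-unstable, whereas a concrete element type (like the `T` passed to the model) keeps storage flat and fast. The abstract container is only a fallback for cases where a concrete `Vector{Float64}` cannot hold the dual/tracked number types that some AD backends need to insert.

```julia
# Hypothetical illustration of abstract vs concrete container element types;
# the variable names are made up for this example, not from the Turing docs.
x_abstract = Real[0.0, 1.0]      # eltype Real: boxed elements, type-unstable,
                                 # but can later hold e.g. a ForwardDiff.Dual
x_concrete = Float64[0.0, 1.0]   # eltype Float64: flat, fast storage,
                                 # but restricted to Float64 values

println(eltype(x_abstract))   # Real
println(eltype(x_concrete))   # Float64

# isconcretetype distinguishes the two situations the docs describe:
println(isconcretetype(eltype(x_abstract)))   # false
println(isconcretetype(eltype(x_concrete)))   # true
```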
Co-authored-by: Will Tebbutt <[email protected]>
### Usable AD Backends

Turing.jl uses the functionality in [DifferentiationInterface.jl](https://github.com/JuliaDiff/DifferentiationInterface.jl) ('DI') to interface with AD libraries in a unified way.
Thus, in principle, any AD library that has integrations with DI can be used with Turing; you should consult the [DI documentation](https://juliadiff.org/DifferentiationInterface.jl/DifferentiationInterface/stable/) for an up-to-date list of compatible AD libraries.
The plural in "integrations with DI" feels funny to me.
| 1 | Yes | No | 'You're on your own' | Enzyme, Zygote |
| 0 | No | No | 'You can't use this' | |

**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing.
**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing.
**Tier 0** means that the AD library is not integrated with DI, and thus will not work with Turing, or is known to have serious enough issues when used with Turing to render it useless.
To cover cases like what might happen with Zygote, where we know that it won't work with any Turing models, so we don't bother trying.
**Tier 1** means that the AD library is integrated with DI, and you can try to use it with Turing if you like; however, we provide no guarantee that it will work correctly.
If you submit an issue about using Turing with a Tier 1 library, it is unlikely that we will be able to help you, unless the issue is very simple to fix.

**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of DynamicPPL's continuous integration (CI) tests.
**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of DynamicPPL's continuous integration (CI) tests.
**Tier 2** indicates some level of confidence on our side that the AD library will work, because it is included as part of Turing's continuous integration (CI) tests.
Since these are user-facing docs, I think we can't assume that the reader knows what DPPL is. If Turing sounds too much like the Turing.jl repo then we could also be more ambiguous with something like "our CI tests". Also nice not to have to reedit this if we just move some tests around.
This may be either due to upstream bugs / limitations (which exist even for ForwardDiff), or simply because of time constraints.
However, if there are workarounds that can be implemented in Turing to make the backend work, we will try to do so.

**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
**Tier 3** is the same as Tier 2, but in addition to that, we formally also take responsibility for ensuring that the backend works with Turing models.
**Tier 3** is the same as Tier 2, but in addition to that, we also take responsibility for ensuring that the backend works with Turing models.
I felt like the word "formally" wasn't adding anything.
This is an initial attempt to put down in words what @willtebbutt and I think our approach to integrating AD backends should be going forward.