
Improve Robustness and Performance #130
Merged: 80 commits into master on Jan 10, 2021

Conversation

@wesselb (Member) commented Jan 2, 2021

master:

julia> m = central_fdm(6, 1); @benchmark $m(sin, 1)
BenchmarkTools.Trial:
  memory estimate:  1.02 KiB
  allocs estimate:  56
  --------------
  minimum time:     2.220 μs (0.00% GC)
  median time:      2.307 μs (0.00% GC)
  mean time:        2.491 μs (1.43% GC)
  maximum time:     182.601 μs (98.30% GC)
  --------------
  samples:          10000
  evals/sample:     9

This PR:

julia> m = central_fdm(6, 1); @benchmark $m(sin, 1)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     451.449 ns (0.00% GC)
  median time:      466.278 ns (0.00% GC)
  mean time:        479.135 ns (0.00% GC)
  maximum time:     1.387 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     198

The current implementation is suboptimal in several ways:

  1. The function f is evaluated at x and possibly again around x to estimate the round-off error. This can be avoided by reusing the function evaluations of the bound estimator.

  2. The bound estimator may be run twice if it returns zero identically, which requires even more function evaluations. This can be avoided by increasing the order of the bound estimator by one so that it can estimate the derivative in a neighbourhood of x instead of just at x.

  3. The functions estimate_magnitude and estimate_roundoff_error, whilst fine in principle, feel a little ad hoc and complicate the logic.

This PR gets rid of estimate_magnitude and estimate_roundoff_error entirely by more cleverly using the function evaluations of the bound estimator (_estimate_magnitudes). Moreover, the PR makes sure that unnecessary allocations are avoided where possible.
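For context on why reusing evaluations matters, here is a hedged illustration (in Python, not the package's actual code) of the trade-off that adaptive step-size estimation navigates: the truncation error of a central difference shrinks with the step, while round-off error grows as the step shrinks, so an intermediate step is best.

```python
# Illustration (Python, not FiniteDifferences.jl's code): a 2-point central
# difference for f'(x). Truncation error shrinks with the step h, but
# round-off error grows as h shrinks, so an intermediate h is best.
import math

def central_diff(f, x, h):
    # Second-order central difference: (f(x + h) - f(x - h)) / (2h).
    return (f(x + h) - f(x - h)) / (2 * h)

# For this stencil, the classical rule of thumb balances the two error
# sources at h ~ eps**(1/3).
eps = 2.0 ** -52
h_good = eps ** (1 / 3)

err_good = abs(central_diff(math.sin, 1.0, h_good) - math.cos(1.0))
err_tiny = abs(central_diff(math.sin, 1.0, 1e-13) - math.cos(1.0))

print(err_good < 1e-9)      # near-optimal step: very accurate
print(err_tiny > err_good)  # too-small step: round-off dominates
```

Estimating where this balance point lies requires extra function evaluations, which is exactly what the PR reuses from the bound estimator.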

Improvements:

  • Reduce number of allocations.
  • Reduce number of function evaluations and ensure that the number of function evaluations is always the same.
  • Improve robustness of adaptation.
  • Improve mechanism for step size capping.
  • Eliminate dependence on arbitrary thresholds and numbers.
  • Allow non-floats in extrapolate_fdm.
  • Automatically copy README.md to docs/src/index.md in docs/make.jl.

Additions:

  • New max_range keyword argument to limit how far from x the function may be evaluated, for example:

julia> m = central_fdm(5, 1, max_range=1e-6); grad(m, sum, ones(5))

Breaking changes:

  • Remove max_step keyword argument.

@oxinabox (Member) commented Jan 2, 2021

Maybe StaticArrays.jl, rather than tuples?

@wesselb (Member, Author) commented Jan 4, 2021

With the current spaghetti code, allocations are down to zero, which achieves a 4x speed-up compared to master:

julia> using FiniteDifferences, BenchmarkTools

julia> m = central_fdm(10, 2, adapt=2)
FiniteDifferenceMethod:
  order of method:       10
  order of derivative:   2
  grid:                  [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
  coefficients:          [-0.011298500881834215, 0.11119929453262786, -0.4830357142857143, 1.1558201058201059, -0.7726851851851851, -0.7726851851851851, 1.1558201058201059, -0.4830357142857143, 0.11119929453262786, -0.011298500881834215]

julia> m(sin, 1) + sin(1)
3.8624659026709196e-13

julia> @benchmark $m(sin, 1)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     916.167 ns (0.00% GC)
  median time:      926.521 ns (0.00% GC)
  mean time:        1.027 μs (0.00% GC)
  maximum time:     3.898 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     48

master:

julia> m(sin, 1) + sin(1)
2.97983859809392e-13

julia> @benchmark $m(sin, 1)
BenchmarkTools.Trial:
  memory estimate:  1.80 KiB
  allocs estimate:  92
  --------------
  minimum time:     3.824 μs (0.00% GC)
  median time:      3.942 μs (0.00% GC)
  mean time:        4.196 μs (1.39% GC)
  maximum time:     207.103 μs (97.42% GC)
  --------------
  samples:          10000
  evals/sample:     8

@oxinabox (Member) commented Jan 8, 2021

we need to drop Julia-Nightly from the CI (out of scope for this PR)

factor::Float64
max_range::Float64
∇f_magnitude_mult::Float64
f_error_mult::Float64
Review comment (Member):

Would it make sense if an AdaptedFiniteDifferenceMethod had just two fields:
a bound_estimator and an inner that is an UnadaptedFiniteDifferenceMethod?

Then a bunch of methods for the AdaptedFiniteDifferenceMethod would just delegate down to the inner, sometimes passing in a bound estimate.

@wesselb (Member, Author) commented Jan 9, 2021

Certainly a good alternative. The downside is that this would introduce a bunch of accessor methods like

_get_factor(m::UnadaptedFiniteDifferenceMethod) = m.factor
_get_factor(m::AdaptedFiniteDifferenceMethod) = _get_factor(m.inner)

which are not necessary right now. I also considered this, but decided to copy, because that seemed like the simpler option and the duplication appears fairly harmless.

Happy to go with whichever option you find better.

Review comment (Member):

Fair enough, let's merge as is. We can always change later.

src/methods.jl Outdated
Comment on lines 246 to 247
coefs = T.(coefs)
return sum(fs .* coefs) ./ T(step)^Q
Review comment (Member):

I have a MWE
JuliaLang/julia#39151

# For high precision on the `\`, we use `Rational`, and to prevent overflows we use
# `BigInt`. At the end we go to `Float64` for fast floating point math, rather than
# rational math.
C = [Rational{BigInt}(g^i) for i in 0:(p - 1), g in grid]
Review comment (Member):

was Int128 not doing it for us?

@wesselb (Member, Author):

No, Int128 unfortunately hit overflow for orders bigger than 12.
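The exact-arithmetic idea can be sketched in Python with `fractions.Fraction`, which, like `Rational{BigInt}`, never overflows: build the Vandermonde-style system from the grid, solve it exactly, and convert to floats only at the end. The helper `fd_coefficients` is illustrative, not the package's API.

```python
# Sketch (Python, illustrative only): compute finite-difference coefficients
# exactly, mirroring what the Julia code does with Rational{BigInt}, and
# convert to float only at the very end.
from fractions import Fraction
from math import factorial

def fd_coefficients(grid, q):
    # Solve sum_j c_j * g_j^i = q! * [i == q] for i = 0..p-1 exactly,
    # where p = len(grid), via Gauss-Jordan elimination over Fraction.
    p = len(grid)
    A = [[Fraction(g) ** i for g in grid] for i in range(p)]
    b = [Fraction(factorial(q)) if i == q else Fraction(0) for i in range(p)]
    for col in range(p):
        # Swap in a row with a nonzero entry in this column (the system is
        # nonsingular for distinct grid points, so one always exists).
        piv = next(r for r in range(col, p) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        # Eliminate this column from every other row, exactly.
        for r in range(p):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [float(b[i] / A[i][i]) for i in range(p)]

# Central difference for the first derivative on grid [-1, 0, 1]
# recovers the classic stencil.
print(fd_coefficients([-1, 0, 1], 1))  # [-0.5, 0.0, 0.5]
```

Because every intermediate quantity is an exact rational, the result is correctly rounded regardless of the order of the method.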

step_max = m.max_range / maximum(abs.(m.grid))
if step > step_max
step = step_max
acc = NaN
Review comment (Member):

Is it documented somewhere what this means?

Also, should we just stop computing the accuracy? It's never used nor exposed to the user, AFAICT.

@wesselb (Member, Author) commented Jan 9, 2021

The function estimate_step is exposed to the user, which promises to also estimate the accuracy. You're right that NaN isn't documented, which I'll add. (Edit: Added.)
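The step-capping logic quoted above can be mirrored in a few lines of Python (a sketch; `cap_step` is a hypothetical name, not the package's API): if the adaptively chosen step would push evaluations outside x ± max_range, clamp it and mark the accuracy estimate as unknown.

```python
# Sketch (Python) of the step-capping logic from the PR: the widest
# evaluation point is x + step * max|g|, so the largest admissible step
# is max_range / max|g|. When we clamp, the accuracy estimate no longer
# applies, so it is set to NaN.
import math

def cap_step(step, grid, max_range):
    step_max = max_range / max(abs(g) for g in grid)
    if step > step_max:
        return step_max, float("nan")  # accuracy no longer estimated
    return step, None

grid = [-2, -1, 1, 2]
step, acc = cap_step(0.1, grid, max_range=1e-2)
print(step)              # clamped to 1e-2 / 2 = 0.005
print(math.isnan(acc))   # True
```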

@wesselb (Member, Author) commented Jan 10, 2021

@oxinabox The ChainRules integration test fails because ChainRules and ChainRulesTestUtils both have a [compat] entry that does not allow version 0.12.0 of FiniteDifferences. I tried to fix this by removing the entry from the Project.toml in the cloned ChainRules repo in the GitHub workflow, but that fails because ChainRulesTestUtils also restricts FiniteDifferences. Is there a way to tell Julia to ignore the requirements for FiniteDifferences?

@oxinabox (Member) commented Jan 10, 2021

There is not, at least not an easy one that I know of. We don't need them to pass.

@wesselb (Member, Author) commented Jan 10, 2021

Fair enough, I guess we can just check this PR locally. Running the ChainRules tests locally, it appears that some tests fail due to singularities, but those failures are resolved by changing

const _fdm = central(5, 1)

to

const _fdm = central(5, 1; max_range=1e-2)

in src/ChainRulesTestUtils.jl. I'll make a PR.
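To illustrate why capping the range helps near a singularity (a hedged Python example, not the actual ChainRules test): with f = log near x = 0.1, a wide 5-point stencil steps onto or past the pole at 0, while restricting the evaluation range keeps every point inside the domain.

```python
# Illustration (Python): why limiting the evaluation range matters near a
# singularity. log has a pole at 0; a 5-point central stencil at x = 0.1
# with step 0.1 would evaluate log at 0.1 + (-2)*0.1 <= 0, outside the
# domain. Capping the range keeps every evaluation valid.
grid = [-2, -1, 0, 1, 2]
x = 0.1

def stencil_points(x, step, grid):
    # Hypothetical helper: the points at which f would be evaluated.
    return [x + g * step for g in grid]

wide = stencil_points(x, 0.1, grid)
print(any(p <= 0 for p in wide))    # True: log is evaluated at/past its pole

max_range = 1e-2                     # analogous to max_range=1e-2 above
step = max_range / max(abs(g) for g in grid)
narrow = stencil_points(x, step, grid)
print(all(p > 0 for p in narrow))   # True: all points stay in log's domain
```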

Are we happy to merge this?

@oxinabox (Member):

> Are we happy to merge this?

Yes.

@wesselb wesselb merged commit 8dc5edf into master Jan 10, 2021
@oxinabox mentioned this pull request on Mar 23, 2021.

Successfully merging this pull request may close these issues.

gradient of non-smooth functions
3 participants