
Adaptive explicit extrapolation #714

Merged
merged 31 commits into JuliaDiffEq:master from AlthausKonstantin:explicit_extrapolation on Apr 27, 2019

Conversation

AlthausKonstantin

These are two implementations of adaptive extrapolation of the explicit midpoint rule (similar to ODEX and DIFEX).
I developed the original code for my bachelor's thesis and would like to add it now to the JuliaDiffEq ecosystem.

This is the first working version of the algorithms and I would love to get some feedback:
Are my additions ok? Is there something I have to change? Are there some essential options from the common interface I should implement?

Since this is the first time I am contributing to an open source project, I would also be grateful for tips on which tests I should implement.

Thanks!

src/alg_utils.jl
@@ -339,16 +346,22 @@ beta2_default(alg::FunctionMap) = 0
beta2_default(alg::DP8) = 0//1
beta2_default(alg::DP5) = 4//100
beta2_default(alg::DP5Threaded) = 4//100
beta2_default(alg::ExtrapolationMidpointDeuflhard) = 0//1
Member

Is there an issue with PI adaptivity here?

subdividing_sequence::Array{BigInt,1} # subdividing_sequence[n] is used for the (n -1)th internal discretisation
stage_number::Array{Int,1} # Stage_number[n] contains information for extrapolation order (n + alg.n_min - 1)
# Weights and Scaling factors for extrapolation operators
extrapolation_weights::Array{Rational{BigInt},2}
@ChrisRackauckas (Member) Apr 6, 2019

Do these need to be Rational? Maybe it doesn't matter all that much, but Bigs are allocating, so it would be worthwhile to profile whether the calculations on these actually end up being costly, and whether just having them as Float64 may be numerically stable enough to get the job done. If Float64 is enough, then typing these to match t or u would make sense.

Author

I haven't played with Floats here so far.

The good thing about the BigInts is that the weights are converted in perform_step to the type of u. That makes it possible to adapt to the precision of the problem.
I'd suspect that if one uses Float64 weights and, say, a BigFloat representation for the problem, the approximation error will not get any better than eps(Float64).
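
For illustration, a tiny REPL sketch of the precision point (hypothetical snippet, not from the PR):

w = 1 // BigInt(3)     # an exact rational extrapolation weight
Float64(w)             # 0.3333333333333333, accurate only to about eps(Float64)
BigFloat(w)            # accurate to about eps(BigFloat), many more digits
# If the weights were stored as Float64 while u is BigFloat, converting them
# back to BigFloat could not recover the lost digits, so the approximation
# error would stall near eps(Float64).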

Member

What about just converting to u's type here?

Author

Ah I see - the caches are recreated for every call of solve, right? Yes, then the type of u should be an option.
I guess I will set up an experiment to check the stability of the weights!

BTW: Would it be possible to move the weights into the algorithm's struct? That way the computation of the tables on every solve call would be bypassed.

function alg_cache(alg::ExtrapolationMidpointDeuflhard,u,rate_prototype,uEltypeNoUnits,uBottomEltypeNoUnits,tTypeNoUnits,uprev,uprev2,f,t,dt,reltol,p,calck,::Type{Val{false}})
@unpack n_min, n_init, n_max = alg

QType = tTypeNoUnits <: Integer ? typeof(qmin_default(alg)) : tTypeNoUnits # Cf. DiffEqBase.__init in solve.jl
Member

peculiar that the type of this is needed. I'll see where this goes...


# Compute scaling factors for extrapolation operators
extrapolation_scalars = -nodes[1] * [BigInt(1); extrapolation_scalars_2]

Member

Because the amount of pre-computation is based on n_max, it actually matters in small enough settings. That is something that may be worth remembering to document, or at least test a little bit. BTW, at which n_max does overflow start to occur in the sequences?

Author

Ok, I will mention this in the documentation.

As far as I've tested the constructor, I have not been able to produce an overflow (for any sequence) when using BigInt. When I replaced BigInt by Int, I got an overflow for the Romberg sequence at n_max = 6.
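
A rough illustration of that overflow (hypothetical snippet, not the constructor's code; it assumes the tables involve products over squared step numbers):

romberg = [2^j for j in 1:8]       # first Romberg step numbers as Int64
prod(romberg .^ 2)                 # 2^72 does not fit in Int64 and silently wraps to 0
prod(BigInt.(romberg) .^ 2)        # exact with BigInt: 4722366482869645213696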

extrapolation_scalars_2::Array{Rational{BigInt},1}
end

function alg_cache(alg::ExtrapolationMidpointHairerWanner,u,rate_prototype,uEltypeNoUnits,uBottomEltypeNoUnits,tTypeNoUnits,uprev,uprev2,f,t,dt,reltol,p,calck,::Type{Val{false}})
Member

I feel like some refactoring can make this re-use a lot of code from the previous cache function. That's a second order issue that we can handle later though.

@AlthausKonstantin (Author) Apr 6, 2019

That's true. Am I "allowed" to define a function producing the weights etc. somewhere? If so where?

Member

Just do it in another part of this file, and then call it from here.

Author

Ok I will change that!

cc = alg_cache(alg,u,rate_prototype,uEltypeNoUnits,uBottomEltypeNoUnits,tTypeNoUnits,uprev,uprev2,f,t,dt,reltol,p,calck,Val{false})

ExtrapolationMidpointDeuflhardCache(u_tilde, u_temp1, u_temp2, tmp, T, res, fsalfirst, k,
cc.Q, cc.n_curr, cc.n_old, cc.subdividing_sequence, cc.stage_number, cc.extrapolation_weights, cc.extrapolation_scalars, cc.extrapolation_weights_2, cc.extrapolation_scalars_2)
Member

It might be easier to just stick the cc in here, using composition to improve code reuse. But as with the other comment, optimizing code simplicity and reuse can come in a later PR.

Author

Here I decided against that to make the call @unpack subdividing_sequence = cache and such possible for both caches.

Would you prefer an if-else statement before calling @unpack in order to check whether a cache is mutable or not?

integrator.opts.beta1 = typeof(integrator.opts.beta1)(1 // (2integrator.cache.n_curr + 1))
integrator.opts.gamma = typeof(integrator.opts.gamma)(1 // 4)^integrator.opts.beta1
# Compute new stepsize scaling
qtmp = integrator.EEst^integrator.opts.beta1 / integrator.opts.gamma
Member

I see why you got rid of beta2. Something that would be worthwhile later would be to try and make this use PI instead of P, since step stabilization will help a method based on explicit RKs take larger steps.
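
For reference, a hedged sketch of the textbook PI ("Lund-stabilized") update in DOPRI5 style (cf. Hairer, Nørsett and Wanner, Section II.4); the names are illustrative and this is not the controller implemented here:

# err, err_prev : scaled error estimates of the current and previous step
# k             : exponent order; the P controller above uses 1/k with k = 2n_curr + 1
beta  = 0.04                         # integral gain; cf. the DP5 default 4//100 in this diff
alpha = 1/k - 0.75 * beta            # proportional exponent
fac   = err^alpha / err_prev^beta    # step-size scaling: dt_new = dt / fac after safety clipping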

Member

But hmm, yes anything based on a work estimate is hard to stabilize.

else
# Initialize
@unpack t,EEst = integrator
tol = integrator.opts.internalnorm(integrator.opts.reltol,t) # Deuflhard's approach relies on EEstD ≈ ||relTol||
Member

This might have weird interactions with element-wise tolerances, but does work.

Author

That is a problem I have not figured out yet. The thing is that in Deuflhard's approach some notion of a relative tolerance (< 1!) is needed for the convergence monitor and stepsize prediction.

I guess this forces the user, in the case of vector-valued relative and absolute tolerances, to think of these vectors as scaling vectors multiplied by tol.

So far I haven't played with element-wise tolerances and don't know what the effects are...

@ChrisRackauckas (Member)

What's the reference you used for the stepping behaviors?

@AlthausKonstantin (Author)

What's the reference you used for the stepping behaviors?

What do you mean exactly - the literature for the adaptivity?

@ChrisRackauckas (Member)

What's the literature you used for the implementation? That's what I meant.

@AlthausKonstantin (Author)

What's the literature you used for the implementation? That's what I meant.

My two main sources have been the book by Hairer, Wanner and Nørsett (Solving Ordinary Differential Equations I)
and Deuflhard's paper (Order and stepsize control in extrapolation methods).
I also sketched their reasoning, as I understood it and as my code is based on it, in my thesis (I added it to my repository).
Does that help?

@AlthausKonstantin (Author) commented Apr 22, 2019

Switch broadcasts to @.. #716

Does this also apply to expressions like broadcast(+, A, B) or only to @.?

@AlthausKonstantin (Author)

Use a version of the problem where u0 is BigFloat so that way you are using arbitrary precision arithmetic, and test convergence of high orders with that. That is also a good test that the implementation isn't relying on Float64 anywhere.

Thank you!
I adapted the tests. But curiously enough, it seems to be vital for my methods that the time span is also given by two BigFloats if the subdividing sequence happens to be different from the Romberg sequence (see the convergence plots in the updated notebook).

Why this behavior depends on the chosen subdividing sequence is not really clear to me...
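
For context, a minimal sketch of such a test setup (hypothetical problem and options following the common interface; the actual test file differs):

using OrdinaryDiffEq
f(u, p, t) = -u                                    # linear test problem with known solution
u0 = BigFloat(1)
tspan = (BigFloat(0), BigFloat(1))                 # time span given in BigFloat as well
prob = ODEProblem(f, u0, tspan)
sol = solve(prob, ExtrapolationMidpointDeuflhard(), adaptive = false, dt = BigFloat(1) / 16)
abs(sol[end] - exp(-tspan[2]))                     # error should drop far below eps(Float64)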

@ChrisRackauckas (Member)

Does this also apply to expressions like broadcast(+, A, B) or only to @.?

All if possible. @.. is just a faster form of broadcast since it removes some aliasing assumptions when the broadcast is supposed to match a map.
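
For illustration, the in-place pattern that replaces an allocating broadcast (a sketch; @.. is provided by the package's internals):

# before: allocates a new array on every call
C = broadcast(+, A, B)
# after: fused update into a preallocated cache array
@.. C = A + B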

@ChrisRackauckas (Member)

I handled the merge conflicts. We may want to just merge the PR once I check that the convergence tests are all good, and then future updates can take place in follow-up PRs. It's much easier to keep doing small PRs than it is to keep a branch from going stale.

@codecov (bot) commented Apr 27, 2019

Codecov Report

Merging #714 into master will decrease coverage by 0.38%.
The diff coverage is 46.48%.


@@            Coverage Diff             @@
##           master     #714      +/-   ##
==========================================
- Coverage   72.01%   71.63%   -0.39%     
==========================================
  Files          93       93              
  Lines       28806    29247     +441     
==========================================
+ Hits        20746    20952     +206     
- Misses       8060     8295     +235
Impacted Files Coverage Δ
src/OrdinaryDiffEq.jl 100% <ø> (ø) ⬆️
src/integrators/controllers.jl 40.38% <0%> (-50.92%) ⬇️
src/caches/extrapolation_caches.jl 95.5% <100%> (+14.55%) ⬆️
src/perform_step/extrapolation_perform_step.jl 61.72% <45.66%> (-34.86%) ⬇️
src/alg_utils.jl 32.93% <50%> (+0.49%) ⬆️
src/algorithms.jl 96.41% <75%> (-1.86%) ⬇️
src/initdt.jl 94.52% <0%> (+1.36%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3b65159...ec087d7.

@coveralls commented Apr 27, 2019

Coverage Status

Coverage increased (+5.5%) to 75.053% when pulling ec087d7 on AlthausKonstantin:explicit_extrapolation into 3b65159 on JuliaDiffEq:master.

@ChrisRackauckas (Member)

My guess is that something odd happens in a spot like j_int = 2Int64(subdividing_sequence[i+1]), but yeah, that BigFloat subdividing sequence issue is a bit hard to track down. I'll open an issue on it. It's not the worst thing in the world since the majority of users use Float64, but it would be nice to figure it out sometime in the future.

The expression sum( broadcast(*, T[1:(i+1)], eltype(uprev).(extrapolation_weights[1:(i+1), (i+1)])) ) allocates a bit and should be changed to use generators instead of broadcast, but that's probably not a big deal, so it can be handled in a future PR.
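
For illustration, a generator-based form of that reduction (a sketch derived from the quoted expression):

# avoids the temporary arrays created by the two slices and the broadcast
sum(T[j] * eltype(uprev)(extrapolation_weights[j, i + 1]) for j in 1:(i + 1))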

I'll merge if tests pass and we can continue from there. Thanks! This is quite a nice PR, and we will want to continue down the path of https://arxiv.org/abs/1305.6165 . With Julia's new multithreading system this will be pretty exciting.

One contribution you may want to think about is the implicit methods. Since you have the sequences now, swapping out for the implicit parts shouldn't be too bad: just use the nonlinear solvers already in OrdinaryDiffEq. Doing this with parallelism was attempted in Python, but of course Python got in the way: numerical-mathematics/extrapolation#27 . @YingboMa and I have interesting test cases where BLAS multithreading doesn't speed up the calculation (and in fact slows it down), so taking better control of the parallelism can help on some very implicit systems of around 50 ODEs.

@ChrisRackauckas (Member)

This commit introduced a binary and needs to be deleted somehow: 6ddcfcb, otherwise it will sit in and clog the repo history.

@ChrisRackauckas force-pushed the explicit_extrapolation branch from f0a3cbf to e080e64 on April 27, 2019 14:29
@ChrisRackauckas (Member)

I cleaned it up and removed the binary from the history. If all went well, then tests will pass again and I'll merge.

@AlthausKonstantin (Author)

Okay, thank you so much for all the help!
After this merges I'd also like to contribute to the documentation. Should I just open another pull request in the DiffEqDocs with a reference to this PR?
Same for the benchmarking: should I add these algorithms to the NonStiffODE notebooks?

@ChrisRackauckas (Member)

After this merges I'd also like to contribute to the documentation. Should I just open another pull request in the DiffEqDocs with a reference to this PR?

Yes

Same for the Benchmarking, should I add this algorithms to the NonStiffODE notebooks?

Everything is moving away from notebooks. We're in the middle of a benchmark overhaul. See SciML/SciMLBenchmarks.jl#32

@ChrisRackauckas (Member)

ExtrapolationMidpointDeuflhard and ExtrapolationMidpointHairerWanner

ExtrapolationMidpointDeuflhard and ExtrapolationMidpointHairerWanner adapt order
and stepsize. Both extrapolate the explicit midpoint rule and differ only in the
implementation of the adaptive behavior.

Arguments for Initializing the Algorithms

Both methods accept the same set of arguments:

  • min_extrapolation_order and max_extrapolation_order
    restrict the range of values the adaptively controlled extrapolation
    order can attain.
    Since the order control mechanism relies on information from the previous time
    step, the order of extrapolation for the very first timestep is provided externally
    by init_extrapolation_order.

    • ExtrapolationMidpointDeuflhard uses by default an order range of 1:10 and
      starts with init_extrapolation_order = 5.
    • ExtrapolationMidpointHairerWanner works similarly but uses a default range of
      2:10 since it requires
      2 <= min_extrapolation_order <= init_extrapolation_order < max_extrapolation_order.
  • sequence_symbol specifies the step-number sequence, also called the subdividing
    sequence, used by both algorithms.
    Possible values for this flag are :harmonic, :romberg or :bulirsch.

To override the default values, use keyword arguments as shown above for the
AitkenNeville algorithm.

If you specify values that do not make sense, e.g.
init_extrapolation_order = 3, min_extrapolation_order = 4 or
sequence_symbol = :foo, the constructor will correct the input and warn you
it did so.
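
For example, a hedged usage sketch with the keyword names listed above (the exact constructor signature and defaults live in src/algorithms.jl):

alg = ExtrapolationMidpointDeuflhard(min_extrapolation_order = 2,
                                     init_extrapolation_order = 4,
                                     max_extrapolation_order = 8,
                                     sequence_symbol = :bulirsch)
sol = solve(prob, alg)   # prob is any ODEProblem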

Finally, note that the extrapolation order and the order of the method do not
coincide. For both algorithms an extrapolation order of n translates to an error
of order 2(n+1).

Details on the Implementation

As mentioned above, the explicit midpoint rule is the internal method which is
extrapolated. Since it is not a one-step method, it is started with one explicit
Euler step.
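
A textbook sketch of one such internal discretization with n_j substeps (Gragg's scheme; illustrative names, not the PR's perform_step! code):

function internal_discretization(f, u0, t0, dt, n_j, p)
    h = dt / n_j
    u_prev2 = u0
    u_prev1 = u0 + h * f(u0, p, t0)                             # one explicit Euler step to start
    for k in 2:n_j
        u_next = u_prev2 + 2h * f(u_prev1, p, t0 + (k - 1) * h) # explicit midpoint rule
        u_prev2, u_prev1 = u_prev1, u_next
    end
    return u_prev1                                              # approximation of u(t0 + dt)
end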

The actual extrapolation is not achieved by the algorithm of Aitken and Neville
but by use of the first barycentric formula (a short overview of barycentric
interpolation is given in the paper of Berrut and Trefethen).
This means that the scalars and weights necessary for carrying out the extrapolation are
precomputed and tabulated once at the beginning of the call to solve. From there, the
actual extrapolation performed in each timestep amounts to computing a linear
combination of the internal discretizations obtained by the explicit midpoint rule.
Thus, initializing the precomputed values requires time and storage of order
max_extrapolation_order^2, but computing the extrapolation in a given timestep
scales only linearly in max_extrapolation_order.
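
Schematically, the per-step extrapolation is then a weighted sum of this form (a sketch; field names follow the cache definition quoted earlier, the exact index conventions may differ):

# T[j] is the j-th internal discretization of the current step
u_extrapolated = extrapolation_scalars[n] * sum(extrapolation_weights[j, n] * T[j] for j in 1:n)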

Since the algorithms do not use the Aitken-Neville scheme for extrapolation, they
also use a different embedded approximation than other extrapolation methods, e.g.
ODEX, do.
Using notation common for the Aitken-Neville algorithm, let the internal
discretizations be given by T[1:n, 1]. Then ODEX would use the entry T[n,n] as
the new solution and estimate the local error with the approximation given by T[n,n-1]
(cf. equation (9.23) in
Solving Ordinary Differential Equations I
by Hairer, Wanner and Nørsett).
Since for the two algorithms presented here only the entries T[i,i] for i = 1:n
are accessible, they use the value T[n-1,n-1] for error estimation instead.
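
In code-like notation (illustrative only):

u_new      = T[n, n]           # accepted solution, extrapolation order n
u_embedded = T[n - 1, n - 1]   # one extrapolation order lower, used for the error estimate
# the local error estimate is built from u_new - u_embedded, scaled by the tolerances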
