Unexpected allocations with SVectors #518

SebastianM-C · 2018-10-07T17:30:44Z

I noticed that with out-of-place integration with SVectors the allocations increase with the integration time with save_everystep=false. MWE:

using DifferentialEquations
using StaticArrays
using BenchmarkTools

@inbounds @inline function ż(z, p, t)
    A, B, D = p
    p₀, p₂ = z[SVector{2}(1:2)]
    q₀, q₂ = z[SVector{2}(3:4)]

    return SVector{4}(
        -A * q₀ - 3 * B / √2 * (q₂^2 - q₀^2) - D * q₀ * (q₀^2 + q₂^2),
        -q₂ * (A + 3 * √2 * B * q₀ + D * (q₀^2 + q₂^2)),
        A * p₀,
        A * p₂
    )
end

@inbounds @inline function ṗ(p, q, params, t)
    A, B, D = params
    dp1 = -A * q[1] - 3 * B / √2 * (q[2]^2 - q[1]^2) - D * q[1] * (q[1]^2 + q[2]^2)
    dp2 = -q[2] * (A + 3 * √2 * B * q[1] + D * (q[1]^2 + q[2]^2))
    return SVector{2}(dp1, dp2)
end

@inbounds @inline function q̇(p, q, params, t)
    params.A * p
end

q0 = SVector{2}([0.0, -4.363920590485035])
p0 = SVector{2}([10.923918825236079, -5.393598858645495])
z0 = vcat(p0, q0)
p = (A=1,B=0.55,D=0.4)

tspan = (0., 10.)

prob1 = ODEProblem(ż, z0, tspan, p)
prob2 = DynamicalODEProblem(ṗ, q̇, p0, q0, tspan, p)

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10);

# 629.188 μs (39944 allocations: 1.55 MiB)
# 276.165 μs (17752 allocations: 534.27 KiB)
# 1.291 ms (110195 allocations: 3.05 MiB)

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false);

# 592.659 μs (38972 allocations: 1.20 MiB)
# 264.475 μs (17438 allocations: 462.94 KiB)
# 1.223 ms (108190 allocations: 2.76 MiB)

and increasing the integration time

tspan = (0., 100.)
prob1 = ODEProblem(ż, z0, tspan, p)
prob2 = DynamicalODEProblem(ṗ, q̇, p0, q0, tspan, p)

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10);

# 6.333 ms (395669 allocations: 15.43 MiB)
# 2.758 ms (177005 allocations: 4.80 MiB)
# 15.447 ms (1100082 allocations: 29.46 MiB)

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false);


@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false);

# 5.910 ms (386012 allocations: 11.80 MiB)
# 2.650 ms (173906 allocations: 4.43 MiB)
# 14.698 ms (1080082 allocations: 27.47 MiB)

I also included the timings for the full solution for comparison.
(See http://nbviewer.jupyter.org/github/SebastianM-C/Benchmarks/blob/master/parallel.ipynb?flush_cache=true for more details)

The text was updated successfully, but these errors were encountered:

SebastianM-C · 2018-10-08T21:46:45Z

I wrote another benchmark comparing the integrator interface with the solve one and if I didn't make any mistakes I have showed that the problem is also present with the integrator interface.
See: https://github.com/SebastianM-C/Benchmarks/blob/e808d3d6f3e049627f0fa117795fa1cf4aff1823/integ_vs_solve.ipynb

The relevant part from the above:

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false);
# 789.911 μs (38974 allocations: 1.21 MiB)
# 355.283 μs (17439 allocations: 463.05 KiB)
# 1.438 ms (108191 allocations: 2.76 MiB)

function integ_benchmark(prob; args...)
    integ = init(prob; args...)
    while integ.t < prob.tspan[2]
        step!(integ)
    end
end

@btime integ_benchmark($prob1, alg=Vern9(), abstol=1e-14, reltol=1e-14)
@btime integ_benchmark($prob2, alg=DPRKN12(), abstol=1e-14, reltol=1e-14)
@btime integ_benchmark($prob2, alg=KahanLi8(), dt=1e-2, maxiters=1e10)
# 897.680 μs (40428 allocations: 1.55 MiB)
# 379.758 μs (17909 allocations: 536.55 KiB)
# 1.563 ms (111196 allocations: 3.06 MiB)

tspan = (0., 100.)
prob1 = ODEProblem(ż, z0, tspan, p)
prob2 = DynamicalODEProblem(ṗ, q̇, p0, q0, tspan, p)

@btime solve($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@btime solve($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false);
# 7.940 ms (386014 allocations: 11.80 MiB)
# 3.480 ms (173907 allocations: 4.43 MiB)
# 17.513 ms (1080083 allocations: 27.47 MiB)

@btime integ_benchmark($prob1, alg=Vern9(), abstol=1e-14, reltol=1e-14)
@btime integ_benchmark($prob2, alg=DPRKN12(), abstol=1e-14, reltol=1e-14)
@btime integ_benchmark($prob2, alg=KahanLi8(), dt=1e-2, maxiters=1e10)
# 8.980 ms (400491 allocations: 15.50 MiB)
# 3.749 ms (178553 allocations: 4.83 MiB)
# 18.589 ms (1110082 allocations: 29.61 MiB)

Note that those benchmarks were done on a different (slower) machine compared with the first ones

julia> versioninfo()
Julia Version 1.0.1
Commit 0d713926f8 (2018-09-29 19:05 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, haswell)

I am not sure why the benchmarks for the integrator interface give slower timings than the ones for the solve interface. I hope that the benchmark function did not introduce any (grave) performance problems.

Edit: updated the link to point to the relevant file version. I tried some modifications (see master), but I am not sure if I got it right.

YingboMa · 2018-10-09T04:04:48Z

I opened Julia with --track-allocation=user --inline=no to analyze this issue, however, the only allocation that I found was produced by the derivative function itself.

julia> @timev ż(z0, p, 1.);
  0.000006 seconds (5 allocations: 208 bytes)
elapsed time (ns): 6038
bytes allocated:   208
pool allocs:       5

ChrisRackauckas · 2018-10-09T17:04:41Z

If changing the timespan doesn't change the allocations in init when doing save_everystep=false, then I don't know if there's anything we can point to? We should check this.

SebastianM-C · 2018-10-09T20:24:50Z

I added another set of benchmarks with the init and step! separated. I am not sure if I didn't make any mistakes since I get some strange results.
See https://github.com/SebastianM-C/Benchmarks/blob/694db1bef9340c12b292714691ce74d8425dac42/integ_vs_solve.ipynb

With tspan = (0.,10.) I get

@btime init($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false)
@btime step_integ(integ1, $tspan[2]) setup=(integ1=init($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false))
@btime init($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false)
@btime step_integ(integ2, $tspan[2]) setup=(integ2=init($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false))
@btime init($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false)
@btime step_integ(integ3, $tspan[2]) setup=(integ3=init($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false))
# 4.335 μs (88 allocations: 18.45 KiB)
# 3.354 μs (223 allocations: 6.98 KiB)
# 4.258 μs (95 allocations: 11.13 KiB)
# 1.114 μs (69 allocations: 1.80 KiB)
# 2.825 μs (79 allocations: 6.03 KiB)
# 120.168 μs (10810 allocations: 281.53 KiB)

and when I increase to tspan=(0.,100.)

@btime init($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false)
@btime step_integ(integ1, $tspan[2]) setup=(integ1=init($prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false))
@btime init($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false)
@btime step_integ(integ2, $tspan[2]) setup=(integ2=init($prob2, DPRKN12(), abstol=1e-14, reltol=1e-14, save_everystep=false))
@btime init($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false)
@btime step_integ(integ3, $tspan[2]) setup=(integ3=init($prob2, KahanLi8(), dt=1e-2, maxiters=1e10, save_everystep=false))
# 4.384 μs (86 allocations: 18.34 KiB)
# 5.858 ms (385920 allocations: 11.78 MiB)
# 4.300 μs (94 allocations: 11.03 KiB)
# 308.408 μs (19312 allocations: 502.92 KiB)
# 2.825 μs (78 allocations: 5.94 KiB)
# 14.674 ms (1080000 allocations: 27.47 MiB)

What I find strange is that in the first case the timings are suspiciously small compared with the rest of the benchmarks and in the second case they explode in the case of Vern9.

ChrisRackauckas · 2018-10-10T12:11:28Z

Okay, that rules out the possibility of something wrong in init. We'd need to check the stepping and the derivative function. There is a possibility that stepping has a problem like JuliaLang/julia#22255

YingboMa · 2018-10-28T00:37:35Z

using OrdinaryDiffEq
using StaticArrays
using BenchmarkTools
using Profile

@inline function ż(z, p, t)
    @inbounds begin
        @assert z isa SVector
        A, B, D = p
        p₀, p₂ = z[1], z[2]
        q₀, q₂ = z[3], z[4]

        return SVector{4}(
            -A * q₀ - 3 * B / √2 * (q₂^2 - q₀^2) - D * q₀ * (q₀^2 + q₂^2),
            -q₂ * (A + 3 * √2 * B * q₀ + D * (q₀^2 + q₂^2)),
            A * p₀,
            A * p₂
        )
    end
end

q0 = SVector{2}([0.0, -4.363920590485035])
p0 = SVector{2}([10.923918825236079, -5.393598858645495])
z0 = vcat(p0, q0)
p = (A=1,B=0.55,D=0.4)

tspan = (0., 1000.)

prob1 = ODEProblem(ż, z0, tspan, p)

solve(prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
@timev solve(prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
Profile.clear_malloc_data()
solve(prob1, Vern9(), abstol=1e-14, reltol=1e-14, save_everystep=false);
exit()

When I set tspan = (0., 100.) I got the allocation info

 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 276)
 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 277)
 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 281)
 Coverage.MallocInfo(32, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 278)
 Coverage.MallocInfo(80, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 174)
 Coverage.MallocInfo(80, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 404)
 Coverage.MallocInfo(96, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 378)
 Coverage.MallocInfo(240, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 147)
 Coverage.MallocInfo(320, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 116)
 Coverage.MallocInfo(336, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 178)
 Coverage.MallocInfo(400, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 6)
 Coverage.MallocInfo(480, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/integrators/type.jl.mem", 2)
 Coverage.MallocInfo(624, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/perform_step/verner_rk_perform_step.jl.mem", 640)
 Coverage.MallocInfo(848, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 62)
 Coverage.MallocInfo(3264, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/alg_utils.jl.mem", 361)
 Coverage.MallocInfo(3328, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/interp_func.jl.mem", 4)
 Coverage.MallocInfo(7232, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/integrators/type.jl.mem", 130)

, with tspan=(0., 1000.), I got

 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 276)
 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 277)
 Coverage.MallocInfo(16, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 281)
 Coverage.MallocInfo(32, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 278)
 Coverage.MallocInfo(80, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 174)
 Coverage.MallocInfo(80, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 404)
 Coverage.MallocInfo(96, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 378)
 Coverage.MallocInfo(240, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 147)
 Coverage.MallocInfo(320, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 116)
 Coverage.MallocInfo(336, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 178)
 Coverage.MallocInfo(400, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 6)
 Coverage.MallocInfo(480, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/integrators/type.jl.mem", 2)
 Coverage.MallocInfo(624, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/perform_step/verner_rk_perform_step.jl.mem", 640)
 Coverage.MallocInfo(848, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/solve.jl.mem", 62)
 Coverage.MallocInfo(3264, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/alg_utils.jl.mem", 361)
 Coverage.MallocInfo(3328, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/interp_func.jl.mem", 4)
 Coverage.MallocInfo(7232, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/integrators/type.jl.mem", 130)

They are exactly identical.

So, there isn't anything extra allocated with a longer time span within the OrdinaryDiffEq.jl package. Closing.

ChrisRackauckas · 2018-11-01T13:59:39Z

How are the allocations in Coverage.MallocInfo(624, "/home/scheme/.julia/dev/OrdinaryDiffEq/src/perform_step/verner_rk_perform_step.jl.mem", 640) not dependent on the number of steps?

ChrisRackauckas · 2018-11-08T20:17:02Z

@YingboMa , if there's allocations in perform_step!, how is it not dependent on the number of steps?

SebastianM-C · 2019-10-07T22:38:17Z

I started investigating this again and I observed something quite strange. For a simpler user function (lorenz) the problem disappears.
MWE:

using OrdinaryDiffEq
using BenchmarkTools
using StaticArrays

function lorenz(u,p,t)
  du1 = 10.0*(u[2]-u[1])
  du2 = u[1]*(28.0-u[3]) - u[2]
  du3 = u[1]*u[2] - (8/3)*u[3]

  return SVector{3}(du1,du2,du3)
end

@inbounds @inline function ż(z, p, t)
    A, B, D = p
    p₀, p₂ = z[SVector{2}(1:2)]
    q₀, q₂ = z[SVector{2}(3:4)]

    return SVector{4}(
        -A * q₀ - 3 * B / √2 * (q₂^2 - q₀^2) - D * q₀ * (q₀^2 + q₂^2),
        -q₂ * (A + 3 * √2 * B * q₀ + D * (q₀^2 + q₂^2)),
        A * p₀,
        A * p₂
    )
end

u0 = @SVector [1.0,0.0,0.0]
u = vcat(u0,u0)
p = (A=1,B=0.55,D=0.4)

q0 = SVector{2}([0.0, -4])
p0 = SVector{2}([10, -5])
z0 = vcat(p0, q0)

tspan1 = (0.0,10.0)
prob1_ok = ODEProblem(lorenz,u0,tspan1)
prob1_notok = ODEProblem(ż,z0,tspan1,p)
@btime solve($prob1_ok, Vern9(), save_everystep=false) # 49.999 μs (40 allocations: 9.89 KiB)
@btime solve($prob1_notok, Vern9(), save_everystep=false) # 58.199 μs (3170 allocations: 107.92 KiB)

tspan2 = (0.0,100.0)
prob2_ok = ODEProblem(lorenz,u0,tspan2)
prob2_notok = ODEProblem(ż,z0,tspan2,p)

@btime solve($prob2_ok, Vern9(), save_everystep=false) # 543.900 μs (40 allocations: 9.89 KiB)
@btime solve($prob2_notok, Vern9(), save_everystep=false) # 450.700 μs (25810 allocations: 815.42 KiB)

SebastianM-C · 2019-10-08T15:20:08Z

I tried a couple of other functions and it looks like it's somehow related to how complicated the user function is (number of operations?). The Henon system is not sufficient to trigger the problem

henon(z, p, t) = SVector(
        -z[3] * (1 + 2z[4]),
        -z[4] - (z[3]^2 - z[4]^2),
        z[1],
        z[2]
    )

but if I extend lorenz or henon by writing the equations twice, I can reproduce the problem. For extending I used something like this

function extend2(f, ::SVector{M}) where M
    @inbounds function eom(u,p,t)
        idx1 = SVector{M}(1:M)
        idx2 = SVector{M}(M+1:2M)
        vcat(f(u[idx1],p,t),
             f(u[idx2],p,t))
    end
end

and I don't think it introduces problems.

SebastianM-C · 2019-10-08T15:47:20Z

Indeed, extending doesn't impact allocations. Using

function lorenz2(u,p,t)
    du1 = 10.0*(u[2]-u[1])
    du2 = u[1]*(28.0-u[3]) - u[2]
    du3 = u[1]*u[2] - (8/3)*u[3]

    du4 = 10.0*(u[2+3]-u[1+3])
    du5 = u[1+3]*(28.0-u[3+3]) - u[2+3]
    du6 = u[1+3]*u[2+3] - (8/3)*u[3+3]
  
    return SVector{6}(du1,du2,du3,du4,du5,du6)
end

yields the same number of allocations (and reproduces the problem).

Since the henon system doesn't present the problem, but the one for ż does, I am inclined to say that the problem is not related to the number of equations (or length of the SVector), but to something like the number of operations.
I tried @inline and @noinline with lorenz2 and I doesn't influence allocations

u0 = @SVector [1.0,0.0,0.0]
u = vcat(u0,u0)
tspan1 = (0.0,10.0)
prob1_2lm = ODEProblem(lorenz2,u,tspan1)
@btime solve($prob1_2lm, Vern9(), save_everystep=false)
# @inline 49.900 μs (200 allocations: 16.69 KiB)
# @noinline 53.000 μs (200 allocations: 16.69 KiB)
tspan2 = (0.0,100.0)
prob2_2lm = ODEProblem(lorenz2,u,tspan2)
@btime solve($prob2_2lm, Vern9(), save_everystep=false)
# @inline 536.101 μs (1796 allocations: 79.03 KiB)
# @noinline 571.701 μs (1796 allocations: 79.03 KiB)

SebastianM-C · 2019-10-08T17:48:52Z

@YingboMa I updated your script above to

using StaticArrays
using Profile
using BenchmarkTools
using OrdinaryDiffEq

function lorenz(u,p,t)
  du1 = 10.0*(u[2]-u[1])
  du2 = u[1]*(28.0-u[3]) - u[2]
  du3 = u[1]*u[2] - (8/3)*u[3]

  return SVector{3}(du1,du2,du3)
end

function lorenz2(u,p,t)
    du1 = 10.0*(u[2]-u[1])
    du2 = u[1]*(28.0-u[3]) - u[2]
    du3 = u[1]*u[2] - (8/3)*u[3]

    du4 = 10.0*(u[2+3]-u[1+3])
    du5 = u[1+3]*(28.0-u[3+3]) - u[2+3]
    du6 = u[1+3]*u[2+3] - (8/3)*u[3+3]

    return SVector{6}(du1,du2,du3,du4,du5,du6)
end

const u0 = @SVector [1.0,0.0,0.0]
const u = vcat(u0,u0)
const tspan = (0.0,10.0)
# const prob = ODEProblem(lorenz,u0,tspan)
const prob = ODEProblem(lorenz2,u,tspan)

@time solve(prob, Tsit5(), save_everystep=false);

@time solve(prob, Tsit5(), save_everystep=false);
Profile.clear_malloc_data()
@timev solve(prob, Tsit5(), save_everystep=false);
exit()

and tried to debug with --track-allocations=user, but it seems quite tricky due to inlining.
I wrote a very detailed log here: https://gist.github.com/SebastianM-C/578e46a7ccff39dabd4aeeac964c099c

ChrisRackauckas · 2019-10-08T17:57:53Z

Linear increase of allocations is worrisome. Where are all of the other allocs? That gist only displays 144 of like 700

SebastianM-C · 2019-10-08T21:33:07Z

I used the Coverage script and checked the top 5 memory allocation spots and I think I found something. The error estimator has a linear increase in allocations.
I was confused at first that in solve! the perform_step! function showed 0 allocations, but actually in the .mem file you can see them.
I updated my findings here: https://github.com/SebastianM-C/Benchmarks/blob/7687fcddde85f908d238e9347b1e4c470e4ef3f1/debug_log_518.md

This contradicts @YingboMa 's earlier post, so I would appreciate if someone could try to replicate my findings in order to double check and make sure I didn't miss something.

TLDR:
with tspan = (0.0,10.0)

- @muladd function perform_step!(integrator, cache::Tsit5ConstantCache, repeat_step=false)
0   @unpack t,dt,uprev,u,f,p = integrator
0   @unpack c1,c2,c3,c4,c5,c6,a21,a31,a32,a41,a42,a43,a51,a52,a53,a54,a61,a62,a63,a64,a65,a71,a72,a73,a74,a75,a76,btilde1,btilde2,btilde3,btilde4,btilde5,btilde6,btilde7 = cache
0   k1 = integrator.fsalfirst
0   a = dt*a21
0   k2 = f(uprev+a*k1, p, t+c1*dt)
0   k3 = f(uprev+dt*(a31*k1+a32*k2), p, t+c2*dt)
0   k4 = f(uprev+dt*(a41*k1+a42*k2+a43*k3), p, t+c3*dt)
0   k5 = f(uprev+dt*(a51*k1+a52*k2+a53*k3+a54*k4), p, t+c4*dt)
0   g6 = uprev+dt*(a61*k1+a62*k2+a63*k3+a64*k4+a65*k5)
0   k6 = f(g6, p, t+dt)
0   u = uprev+dt*(a71*k1+a72*k2+a73*k3+a74*k4+a75*k5+a76*k6)
0   integrator.fsallast = f(u, p, t+dt); k7 = integrator.fsallast
0   integrator.destats.nf += 6
-   if typeof(integrator.alg) <: CompositeAlgorithm
-     g7 = u
-     # Hairer II, page 22
-     integrator.eigen_est = integrator.opts.internalnorm(k7 - k6,t)/integrator.opts.internalnorm(g7 - g6,t)
-   end
0   if integrator.opts.adaptive
0     utilde = dt*(btilde1*k1 + btilde2*k2 + btilde3*k3 + btilde4*k4 + btilde5*k5 + btilde6*k6 + btilde7*k7)
0     atmp = calculate_residuals(utilde, uprev, u, integrator.opts.abstol, integrator.opts.reltol,integrator.opts.internalnorm,t)
1648     integrator.EEst = integrator.opts.internalnorm(atmp,t)
-   end
0   integrator.k[1] = k1
0   integrator.k[2] = k2
0   integrator.k[3] = k3
0   integrator.k[4] = k4
0   integrator.k[5] = k5
0   integrator.k[6] = k6
0   integrator.k[7] = k7
0   integrator.u = u
- end

with tspan = (0.0,20.0)

- @muladd function perform_step!(integrator, cache::Tsit5ConstantCache, repeat_step=false)
0   @unpack t,dt,uprev,u,f,p = integrator
0   @unpack c1,c2,c3,c4,c5,c6,a21,a31,a32,a41,a42,a43,a51,a52,a53,a54,a61,a62,a63,a64,a65,a71,a72,a73,a74,a75,a76,btilde1,btilde2,btilde3,btilde4,btilde5,btilde6,btilde7 = cache
0   k1 = integrator.fsalfirst
0   a = dt*a21
0   k2 = f(uprev+a*k1, p, t+c1*dt)
0   k3 = f(uprev+dt*(a31*k1+a32*k2), p, t+c2*dt)
0   k4 = f(uprev+dt*(a41*k1+a42*k2+a43*k3), p, t+c3*dt)
0   k5 = f(uprev+dt*(a51*k1+a52*k2+a53*k3+a54*k4), p, t+c4*dt)
0   g6 = uprev+dt*(a61*k1+a62*k2+a63*k3+a64*k4+a65*k5)
0   k6 = f(g6, p, t+dt)
0   u = uprev+dt*(a71*k1+a72*k2+a73*k3+a74*k4+a75*k5+a76*k6)
0   integrator.fsallast = f(u, p, t+dt); k7 = integrator.fsallast
0   integrator.destats.nf += 6
-   if typeof(integrator.alg) <: CompositeAlgorithm
-     g7 = u
-     # Hairer II, page 22
-     integrator.eigen_est = integrator.opts.internalnorm(k7 - k6,t)/integrator.opts.internalnorm(g7 - g6,t)
-   end
0   if integrator.opts.adaptive
0     utilde = dt*(btilde1*k1 + btilde2*k2 + btilde3*k3 + btilde4*k4 + btilde5*k5 + btilde6*k6 + btilde7*k7)
0     atmp = calculate_residuals(utilde, uprev, u, integrator.opts.abstol, integrator.opts.reltol,integrator.opts.internalnorm,t)
3616     integrator.EEst = integrator.opts.internalnorm(atmp,t)
-   end
0   integrator.k[1] = k1
0   integrator.k[2] = k2
0   integrator.k[3] = k3
0   integrator.k[4] = k4
0   integrator.k[5] = k5
0   integrator.k[6] = k6
0   integrator.k[7] = k7
0   integrator.u = u
- end

ChrisRackauckas · 2019-10-08T23:29:01Z

Looks like it's fixed by SciML/DiffEqBase.jl#348

YingboMa closed this as completed Oct 28, 2018

ChrisRackauckas reopened this Nov 8, 2018

ChrisRackauckas mentioned this issue Oct 8, 2019

faster default norms SciML/DiffEqBase.jl#348

Merged

ChrisRackauckas closed this as completed in SciML/DiffEqBase.jl#348 Oct 8, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unexpected allocations with SVectors #518

Unexpected allocations with SVectors #518

SebastianM-C commented Oct 7, 2018

SebastianM-C commented Oct 8, 2018 •

edited

Loading

YingboMa commented Oct 9, 2018

ChrisRackauckas commented Oct 9, 2018

SebastianM-C commented Oct 9, 2018

ChrisRackauckas commented Oct 10, 2018 •

edited

Loading

YingboMa commented Oct 28, 2018

ChrisRackauckas commented Nov 1, 2018

ChrisRackauckas commented Nov 8, 2018

SebastianM-C commented Oct 7, 2019 •

edited

Loading

SebastianM-C commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

ChrisRackauckas commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

ChrisRackauckas commented Oct 8, 2019

Unexpected allocations with SVectors #518

Unexpected allocations with SVectors #518

Comments

SebastianM-C commented Oct 7, 2018

SebastianM-C commented Oct 8, 2018 • edited Loading

YingboMa commented Oct 9, 2018

ChrisRackauckas commented Oct 9, 2018

SebastianM-C commented Oct 9, 2018

ChrisRackauckas commented Oct 10, 2018 • edited Loading

YingboMa commented Oct 28, 2018

ChrisRackauckas commented Nov 1, 2018

ChrisRackauckas commented Nov 8, 2018

SebastianM-C commented Oct 7, 2019 • edited Loading

SebastianM-C commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

ChrisRackauckas commented Oct 8, 2019

SebastianM-C commented Oct 8, 2019

ChrisRackauckas commented Oct 8, 2019

SebastianM-C commented Oct 8, 2018 •

edited

Loading

ChrisRackauckas commented Oct 10, 2018 •

edited

Loading

SebastianM-C commented Oct 7, 2019 •

edited

Loading