Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative interface to expv! #6

Merged
merged 8 commits into from
Oct 6, 2018
Merged

Conversation

jagot
Copy link
Contributor

@jagot jagot commented Oct 2, 2018

As discussed in #1, I've implemented an alternative interface to exvp! for the Hermitian case, where the Krylov subspace is successively built up, but the iteration is terminated, if a stopping criterion based on an exponentiation error estimate is fulfilled. For time-stepping, where the error of the exponentiation, and not the condition number of the Krylov subspace, is important, I have found this to be a more reliable indicator of when to terminate the iteration.

  • This PR follows Diagonalize before multiplying by time-step in the case of Hermitian operators #5, but does not really depend on it, so before this is merged, it should be rebased on whatever state master is in at that time.
  • The Krylov iterations previously used a separate cache vector, which I skipped. Instead I use the "next" vector in the subspace, i.e. V[:,j+1], where cache after being orthogonalized and scaled would end up anyway. I have not, however, removed it from the call signature, since I still need to clean everywhere in krylov_phiv_adaptive.jl, which was not as fun to do.
  • I've seen different broadcast mechanisms being used, but I am not sure about the subtler implications, i.e. which one is best where. Please check and tell me!
    • @. V[:, 1] = 1 / beta
    • lmul!(1/beta, @view(V[:, 1])
    • For loop:
       @inbounds for i = 1:size(V,1)
           V[i,1] = 1/beta
       end
  • OK to keep verbose flag? I think it is nice to have when debugging time-stepping algorithms.
  • As can be seen in the benchmarks below, there are still some allocations going on, which I'd like to get rid of, or at least minimize. Pointers would be great!
  • It is not fantastically nice to have a LAPACK wrapper directly in the code, but I am not sure what else can be done at the moment, without a giant overhaul of LinearAlgebra.
  • Something similar should be implemented for Arnoldi.
  • Are there better unit tests that can be done?

As an example of usage and its performance, see the following code (which also doubles as the test):

n = 3000
m = 30
A = Hermitian(rand(n,n))
Ks = KrylovSubspace{ComplexF64,Float64}(n,m)
sc = StegrCache(ComplexF64, n)

b = rand(ComplexF64, n)
w = similar(b)
w′ = similar(b)

dt = 0.1

atol=1e-10
rtol=1e-10
expv!(w, -im, dt*A, b, Ks, sc, atol=atol, rtol=rtol, verbose=true)

println("Benchmarking Lanczos expv!")
@btime expv!(w, -im, dt*A, b, Ks, sc, atol=atol, rtol=rtol, verbose=false)

function fullexp!(w, A, v)
    eA = exp(A)
    mul!(w, eA, v)
    w
end
println("Benchmarking full exponentiation")
@btime fullexp!(w′, -im*dt*A, b)

norm(w-w′)

n=300

Initial norm: β₀ 1.380667e+01, stopping threshold: 1.480667e-09
iter 1, α[1] 1.087384e+01, β[1] 6.692534e+00, σ 9.240161e+01
iter 2, α[2] 4.099573e+00, β[2] 5.765774e-01, σ 6.664466e+00
iter 3, α[3] 1.026875e-02, β[3] 4.875859e-01, σ 1.600913e+00
iter 4, α[4] -3.137509e-04, β[4] 5.099652e-01, σ 4.215033e-01
iter 5, α[5] -2.819237e-02, β[5] 5.078813e-01, σ 7.169712e-02
iter 6, α[6] 1.549987e-03, β[6] 4.963334e-01, σ 8.827376e-03
iter 7, α[7] -2.843072e-03, β[7] 5.057275e-01, σ 8.815623e-04
iter 8, α[8] 3.516485e-02, β[8] 5.074857e-01, σ 7.329762e-05
iter 9, α[9] -1.400483e-02, β[9] 5.057665e-01, σ 5.180701e-06
iter 10, α[10] -1.756147e-02, β[10] 4.907045e-01, σ 3.096524e-07
iter 11, α[11] 2.551781e-02, β[11] 4.859925e-01, σ 1.624763e-08
iter 12, α[12] 1.383772e-02, β[12] 6.395702e-01, σ 1.007198e-09
Krylov subspace size: 12
Benchmarking Lanczos expv!
  1.089 ms (1359 allocations: 765.83 KiB)
Benchmarking full exponentiation
  29.685 ms (1320 allocations: 44.01 MiB)
7.58943925351473e-11

n=3000

Initial norm: β₀ 4.522136e+01, stopping threshold: 4.622136e-09
iter 1, α[1] 1.138220e+02, β[1] 6.424715e+01, σ 2.905344e+03
iter 2, α[2] 3.624997e+01, β[2] 1.814199e+00, σ 2.416432e+01
iter 3, α[3] -1.562011e-02, β[3] 1.595978e+00, σ 3.571388e+01
iter 4, α[4] -4.007526e-03, β[4] 1.594921e+00, σ 2.880494e+01
iter 5, α[5] 7.710502e-03, β[5] 1.589851e+00, σ 1.602004e+01
iter 6, α[6] 1.563939e-02, β[6] 1.567023e+00, σ 6.572510e+00
iter 7, α[7] 6.127848e-03, β[7] 1.601089e+00, σ 2.190977e+00
iter 8, α[8] 1.206389e-02, β[8] 1.598282e+00, σ 6.015621e-01
iter 9, α[9] 7.070683e-02, β[9] 1.622612e+00, σ 1.430245e-01
iter 10, α[10] 1.103534e+01, β[10] 3.919011e+01, σ 4.440673e-01
iter 11, α[11] 1.388096e+02, β[11] 5.896185e+00, σ 2.817613e-02
iter 12, α[12] 2.579454e-01, β[12] 1.598403e+00, σ 5.083859e-03
iter 13, α[13] -1.817458e-02, β[13] 1.567401e+00, σ 8.086115e-04
iter 14, α[14] -2.814222e-02, β[14] 1.568095e+00, σ 1.167917e-04
iter 15, α[15] -1.184318e-02, β[15] 1.563882e+00, σ 1.539143e-05
iter 16, α[16] 2.292472e-02, β[16] 1.547416e+00, σ 1.849834e-06
iter 17, α[17] -2.205499e-02, β[17] 1.583346e+00, σ 2.109877e-07
iter 18, α[18] 1.480475e-02, β[18] 1.672977e+00, σ 2.369455e-08
iter 19, α[19] 1.511450e+01, β[19] 4.513727e+01, σ 5.071813e-08
iter 20, α[20] 1.348318e+02, β[20] 4.894348e+00, σ 2.187540e-09
Krylov subspace size: 20
Benchmarking Lanczos expv!
  173.878 ms (12255 allocations: 69.22 MiB)
Benchmarking full exponentiation
  25.138 s (12128 allocations: 4.69 GiB)
1.293094021783796e-10

@codecov
Copy link

codecov bot commented Oct 2, 2018

Codecov Report

Merging #6 into master will decrease coverage by 0.01%.
The diff coverage is 96.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master       #6      +/-   ##
==========================================
- Coverage   96.87%   96.86%   -0.02%     
==========================================
  Files           4        6       +2     
  Lines         192      223      +31     
==========================================
+ Hits          186      216      +30     
- Misses          6        7       +1
Impacted Files Coverage Δ
src/krylov_phiv_adaptive.jl 98.73% <100%> (ø) ⬆️
src/arnoldi.jl 95.55% <100%> (+1.8%) ⬆️
src/stegr_cache.jl 100% <100%> (ø)
src/krylov_expv.jl 90% <90%> (ø)
src/krylov_phiv.jl 92.1% <90%> (-1.23%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d40ff06...e340a4b. Read the comment docs.

@ChrisRackauckas
Copy link
Member

Summary

basically, this alternative method sounds interesting, we should definitely have it in ExponentialUtilities, it might not work with the higher order semilinear and first-order ODE exponential integrators, but it might be something we want to interface with in the splitting methods, so we should try to keep the interface the same as much as possible, but that might not be possible since it's adapting the Krylov subspace size

if m > Ks.maxiter
resize!(Ks, m)
else
Ks.m = m # might change if happy-breakdown occurs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy-breakdown occurs -> convergence criterion met

@MSeeker1340
Copy link
Contributor

but that might not be possible since it's adapting the Krylov subspace size

To be more specific, the issue lies with extending this alternative method to phiv_timestep because of the adaptation. On the other hand, for phiv! we can probably apply similar changes as the error estimate for exp should more or less still work for higher order phis, especially if the Krylov dimension is large compared to the phi order.

@MSeeker1340
Copy link
Contributor

Nicely done. To complete things, can you add a keyword argument to expv in krylov_phiv.jl that allows it to use the new version (this should also make your testing script simpler) and add a one-liner comment to krylov_expv.jl describing its purpose?

@jagot
Copy link
Contributor Author

jagot commented Oct 6, 2018

I have addressed the comments above (I believe). There still remains the issue of allocation, but it is not huge, so can maybe rest for now. A similar approach for Arnoldi is possible using hseqr, which computes the Schur factorization of a Hessenberg matrix. What then remains, is the computation of the matrix exponential of an upper triangular matrix.

@jagot
Copy link
Contributor Author

jagot commented Oct 6, 2018

References for upper triangular matrix exponentials:

@MSeeker1340 MSeeker1340 changed the title WIP: Alternative interface to expv! Alternative interface to expv! Oct 6, 2018
@MSeeker1340 MSeeker1340 merged commit 16ed363 into SciML:master Oct 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants