
Understanding Expokit  

Expokit addresses the computation of solutions of systems of linear ordinary differential equations

    w'(t) = Aw(t) + u,   w(0) = v,   t >= 0,   (1)

whose explicit solution is

    w(t) = exp(tA)v + t φ(tA)u,   where φ(z) = (e^z − 1)/z.   (2)

When u = 0, the solution is simply w(t) = exp(tA)v. Expokit provides user-friendly routines (in Fortran 77 and Matlab) for addressing either situation (the case u = 0 and the case u ≠ 0). Routines for computing small matrix exponentials in full are provided as well. The backbone of the sparse routines consists of Krylov subspace projection methods (the Arnoldi and Lanczos processes), which is why the toolkit can cope with sparse matrices of very large dimension. The package handles real and complex matrices and provides specific routines for symmetric and Hermitian matrices. When dealing with Markov chains, the computation of the matrix exponential is subject to probabilistic constraints. In addition to addressing general matrix exponentials, particular attention is given to the computation of transient states of Markov chains.
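The closed-form solution (2) can be sanity-checked in the scalar case with a few lines of pure Python (an illustrative sketch, not part of Expokit; the particular numbers and the finite-difference check are assumptions of this example):

```python
import math

# Scalar instance of w'(t) = a*w(t) + u, w(0) = v.
# Closed-form solution (2): w(t) = exp(t*a)*v + t*phi(t*a)*u,
# with phi(z) = (exp(z) - 1)/z.
def phi(z):
    return (math.exp(z) - 1.0) / z

def w(t, a, v, u):
    return math.exp(t * a) * v + t * phi(t * a) * u

a, v, u, t = -0.7, 2.0, 0.3, 1.5

# Check the ODE w'(t) = a*w(t) + u via a central finite difference.
h = 1e-6
w_prime = (w(t + h, a, v, u) - w(t - h, a, v, u)) / (2 * h)
residual = abs(w_prime - (a * w(t, a, v, u) + u))
print(residual)  # small residual: (2) satisfies (1)
```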
The purpose of this cursory introduction is to let the reader catch a glimpse of the fundamental elements that make Expokit so fast and robust. The large sparse matrix exponential situation (case u = 0) is taken as the basis of the exposition.
To compute w(t) = exp(tA)v, the Krylov-based algorithm of Expokit purposely sets out to compute the matrix exponential times a vector rather than the matrix exponential in isolation.
Given an integer m, the Taylor expansion of the exponential,

    exp(tA)v = v + (tA)v + (1/2!)(tA)^2 v + (1/3!)(tA)^3 v + ...,   (3)

shows that exp(tA)v can be approximated by a linear combination of the form

    c_0 v + c_1 (tA)v + c_2 (tA)^2 v + ... + c_{m−1} (tA)^{m−1} v,   (4)

i.e., by an element of the Krylov subspace K_m(tA, v) = Span{v, (tA)v, ..., (tA)^{m−1} v}.
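For concreteness, the truncated series (4) with Taylor coefficients c_k = t^k/k! can be evaluated with m − 1 matrix-vector products. The following is a minimal pure-Python sketch (not Expokit code; the helper name and example matrix are illustrative assumptions):

```python
import math

def taylor_expv(A, v, t, m):
    # Truncated Taylor approximation to exp(tA) v:
    #   v + (tA)v + ... + (tA)^{m-1} v / (m-1)!
    # i.e. an element of the Krylov subspace K_m(tA, v), built from
    # m - 1 matrix-vector products.
    matvec = lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]
    result, term = list(v), list(v)
    for k in range(1, m):
        term = [t * s / k for s in matvec(term)]   # (tA)^k v / k!
        result = [r + s for r, s in zip(result, term)]
    return result

# Tiny example with a diagonal A, where exp(tA) v is known exactly.
A = [[0.5, 0.0], [0.0, -1.0]]
v = [1.0, 2.0]
approx = taylor_expv(A, v, 1.0, 20)
exact = [math.exp(0.5) * 1.0, math.exp(-1.0) * 2.0]
err = max(abs(a - b) for a, b in zip(approx, exact))
print(err)  # tiny truncation error
```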
The Arnoldi procedure, [V, Hbar] = arnoldi(A, v, m), builds an orthonormal basis of the Krylov subspace:

    β := ||v||_2 ; v_1 := v/β ;
    for j := 1:m do
        p := A v_j ;
        for i := 1:j do
            h_ij := v_i^T p ;
            p := p − h_ij v_i ;
        end
        h_{j+1,j} := ||p||_2 ;
        v_{j+1} := p/h_{j+1,j} ;
    end
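The pseudocode above translates almost line for line into a runnable sketch (pure Python for illustration, not Expokit's Fortran/Matlab routines; the example matrix and the matvec-callback interface are assumptions):

```python
import math

def arnoldi(matvec, v, m):
    """Arnoldi process as outlined above: build an orthonormal basis
    V = [v_1, ..., v_{m+1}] of the Krylov subspace and the
    (m+1) x m upper Hessenberg matrix Hbar of recurrence coefficients."""
    beta = math.sqrt(sum(x * x for x in v))
    V = [[x / beta for x in v]]                      # v_1 = v / ||v||_2
    H = [[0.0] * m for _ in range(m + 1)]            # Hbar, (m+1) x m
    for j in range(m):
        p = matvec(V[j])                             # p = A v_j
        for i in range(j + 1):                       # modified Gram-Schmidt
            h = sum(a * b for a, b in zip(V[i], p))  # h_ij = v_i^T p
            H[i][j] = h
            p = [pk - h * vk for pk, vk in zip(p, V[i])]
        H[j + 1][j] = math.sqrt(sum(x * x for x in p))
        V.append([x / H[j + 1][j] for x in p])       # v_{j+1}
    return V, H, beta

# Example: a small 4x4 matrix applied through a matvec callback.
A = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 5.0]]
matvec = lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]
V, H, beta = arnoldi(matvec, [1.0, 0.0, 0.0, 0.0], 3)
# The basis vectors are orthonormal: v_1^T v_2 should be ~0.
print(sum(a * b for a, b in zip(V[0], V[1])))
```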
From the starting vector v, if the Arnoldi procedure is applied simply with A (rather than tA), then upon completion it builds two matrices

    V_{m+1} = [v_1, ..., v_{m+1}] ∈ R^{n×(m+1)}  and  Hbar = [h_ij] ∈ R^{(m+1)×m}.

The matrices Hbar and H_m have a special form: Hbar is upper Hessenberg (h_ij = 0 whenever i > j+1), and H_m ∈ R^{m×m} is the matrix obtained by deleting the last row of Hbar. They satisfy the relations

    A V_m = V_{m+1} Hbar,   (5)
    A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T,   (6)

where V_m = [v_1, ..., v_m] has orthonormal columns spanning K_m(A, v) and e_m is the mth column of the identity matrix.

For any arbitrary scalar τ, K_m(τA, v) = K_m(A, v), and in fact, multiplying (5) and (6) by τ shows that it is straightforward to extend these relations to τA using τHbar and τH_m with V_{m+1} unchanged. Hence there is no loss of generality when applying the Arnoldi procedure directly to A.
These key observations are vital to the overall Krylov solution technique.
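The scaling property can be verified numerically: running the Arnoldi process on τA (for τ > 0, so the normalizations coincide) reproduces the same basis with the Hessenberg matrix scaled by τ. A self-contained pure-Python check (the small matrix and τ = 0.25 are illustrative assumptions, and the `arnoldi` helper is the same minimal sketch as above, not Expokit code):

```python
import math

def arnoldi(matvec, v, m):
    # Minimal Arnoldi sketch (an illustration, not Expokit code).
    beta = math.sqrt(sum(x * x for x in v))
    V = [[x / beta for x in v]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        p = matvec(V[j])
        for i in range(j + 1):
            H[i][j] = sum(a * b for a, b in zip(V[i], p))
            p = [pk - H[i][j] * vk for pk, vk in zip(p, V[i])]
        H[j + 1][j] = math.sqrt(sum(x * x for x in p))
        V.append([x / H[j + 1][j] for x in p])
    return V, H

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
tau = 0.25                                         # tau > 0 here
mv     = lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]
mv_tau = lambda x: [tau * s for s in mv(x)]        # matvec with tau*A

V1, H1 = arnoldi(mv, [1.0, 0.0, 0.0], 2)
V2, H2 = arnoldi(mv_tau, [1.0, 0.0, 0.0], 2)

# Same basis vectors; Hessenberg matrix scaled by tau.
v_diff = max(abs(a - b) for va, vb in zip(V1, V2) for a, b in zip(va, vb))
h_diff = max(abs(tau * h1 - h2) for r1, r2 in zip(H1, H2) for h1, h2 in zip(r1, r2))
print(v_diff, h_diff)  # both ~0
```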
Let w_opt be the optimal Krylov approximation in the least squares sense to w(τ) = exp(τA)v. Since V_m is a basis of the Krylov subspace, w_opt = V_m y_opt with y_opt ∈ R^m. Thus by definition we have


    w_opt = V_m y_opt,   y_opt = argmin_{y ∈ R^m} || exp(τA)v − V_m y ||_2.   (7)

This optimal approximation is not computable in practice, however, since it involves the unknown vector exp(τA)v itself. Expokit uses instead the corrected Krylov approximation

    exp(τA)v ≈ β V_{m+1} exp(τ Hbar_{m+1}) e_1,   (8)

where β = ||v||_2, e_1 is the first column of the identity matrix, and

    Hbar_{m+1} = [ Hbar  0 ] ∈ R^{(m+1)×(m+1)},

i.e., Hbar augmented with a zero column. Moreover, this is a genuine polynomial approximation:

    β V_{m+1} exp(τ Hbar_{m+1}) e_1 = p_m(τA)v

for a polynomial p_m of degree m.
It is known from the theory of matrix functions that exp(τA)v = p_{n−1}(τA)v, where p_{n−1} is the Hermite interpolation polynomial of the exponential function at Eig(τA). From the theory of Krylov subspaces, Eig(H_m) not only approximates a larger subset of Eig(A) as m increases but also approximates it more accurately. These results further provide a stimulus for using these approximations. In our implementation, we use the corrected approximation (8) exclusively.
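The corrected approximation (8) can be assembled from the Arnoldi output and compared against the exact product on a small instance. The sketch below is pure Python for illustration (not Expokit code): the test matrix, τ, and m are assumptions, and a plain Taylor series stands in for a proper small-matrix exponential, which is adequate here because the matrices are tiny and of modest norm.

```python
import math

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def expm_times(M, x, terms=60):
    # exp(M) @ x via the Taylor series -- only for the tiny,
    # modest-norm matrices of this illustration.
    result, term = list(x), list(x)
    for k in range(1, terms):
        term = [t / k for t in matvec(M, term)]
        result = [r + t for r, t in zip(result, term)]
    return result

def arnoldi(M, v, m):
    # Minimal Arnoldi sketch (an illustration, not Expokit code).
    beta = math.sqrt(sum(x * x for x in v))
    V = [[x / beta for x in v]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        p = matvec(M, V[j])
        for i in range(j + 1):
            H[i][j] = sum(a * b for a, b in zip(V[i], p))
            p = [pk - H[i][j] * vk for pk, vk in zip(p, V[i])]
        H[j + 1][j] = math.sqrt(sum(x * x for x in p))
        V.append([x / H[j + 1][j] for x in p])
    return V, H, beta

def krylov_expv(A, v, tau, m):
    # Corrected Krylov approximation (8):
    #   exp(tau*A) v  ~=  beta * V_{m+1} exp(tau*Hbar_{m+1}) e_1,
    # with Hbar_{m+1} = [Hbar 0] of order m+1.
    V, H, beta = arnoldi(A, v, m)
    Hm1 = [[tau * H[i][j] if j < m else 0.0 for j in range(m + 1)]
           for i in range(m + 1)]
    e1 = [1.0] + [0.0] * m
    y = expm_times(Hm1, e1)                  # exp(tau*Hbar_{m+1}) e_1
    n = len(v)
    return [beta * sum(y[i] * V[i][k] for i in range(m + 1)) for k in range(n)]

A = [[-2.0, 1.0, 0.0, 0.0],
     [1.0, -3.0, 1.0, 0.0],
     [0.0, 1.0, -4.0, 1.0],
     [0.0, 0.0, 1.0, -5.0]]
v = [1.0, 1.0, 0.0, 0.0]
tau = 0.1
exact = expm_times([[tau * a for a in row] for row in A], v)
approx = krylov_expv(A, v, tau, 3)
err = max(abs(a - b) for a, b in zip(exact, approx))
print(err)  # small: m = 3 already approximates the n = 4 problem well
```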
[Figures: the spectrum Eig(A) of the test matrix, and the approximation error curves.]
For a random v and A of order 500 whose spectrum Eig(A) is given in the first figure, the second figure shows three error curves corresponding to ||w_exact − w_mth-approx||_2, with w_exact = exp(A)v, where w_mth-approx is either the mth degree optimal approximation, the mth degree Krylov approximation in (8), or the mth degree truncated Taylor approximation. As m gradually increases from 1 to 30, the optimal error and the Krylov error decrease significantly from 10^6 down to 10^{−6}.
As illustrated graphically, this relatively cheap Krylov approximation is not the optimal one, but as m increases it quickly draws close to the optimal, and it is better than the m-term Taylor expansion, which highlights that for the same number of matrix-vector products, the Krylov approximation is better than the Taylor approximation. However, the Krylov approach needs more room (especially to store V ∈ R^{n×(m+1)}) and involves more local computation (notably the Arnoldi sweeps needed to construct V and Hbar). Hochbruck and Lubich have shown that the error in the Krylov approximation behaves like O(e^{||A||_2} (e||A||_2/m)^m) when m ≥ 2||A||_2.
In practice, Expokit evaluates exp(tA)v by a time-stepping scheme in which each step applies the Krylov approximation:

    w := v ;
    for k := 0:s do
        w := Krylov approximation of exp(τ_k A)w ;
    end   (9)

where, for k = 0, 1, ..., s, τ_k = t_{k+1} − t_k and 0 = t_0 < t_1 < ... < t_s < t_{s+1} = t.
Consequently, in the course of the integration, one can output intermediate discrete observations (if they are needed) at no extra cost. The {τ_k} are step sizes that are selected automatically within the code to ensure stability and accuracy. This time-stepping selection is done in conjunction with error estimation. Nevertheless, it remains clear from (9) that the crux of the problem at each stage is an operation of the form exp(τA)v, albeit with different τ's and v's. The selection of a specific step size is made so that exp(τA)v is effectively approximated by (8), using information from the current Arnoldi process for the stage. Following the practice of ODE solvers, an a posteriori error control is carried out to ensure that the intermediate approximation is acceptable with respect to expectations on the global error. Further information and references may be obtained from the Expokit documentation.
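The splitting identity behind (9), exp(tA)v = exp(τ_s A) ··· exp(τ_0 A)v, can be checked on a small example. In the pure-Python sketch below (not Expokit code), a Taylor series stands in for the Krylov kernel at each step, and the uniform step sizes τ_k = t/10 are an assumption for illustration:

```python
import math

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def expm_times(M, x, terms=60):
    # exp(M) @ x via a Taylor series; in Expokit each such product
    # would instead be the Krylov approximation (8).
    result, term = list(x), list(x)
    for k in range(1, terms):
        term = [v / k for v in matvec(M, term)]
        result = [r + v for r, v in zip(result, term)]
    return result

A = [[-1.0, 0.5], [0.5, -2.0]]
v = [1.0, 2.0]
t = 3.0

# One shot: w(t) = exp(tA) v.
one_shot = expm_times([[t * a for a in row] for row in A], v)

# Time-stepped as in (9): w := v; for k = 0..s do w := exp(tau_k A) w end,
# here with s + 1 = 10 uniform steps tau_k = t / 10.
w = list(v)
tau = t / 10
A_tau = [[tau * a for a in row] for row in A]
for _ in range(10):
    w = expm_times(A_tau, w)

diff = max(abs(a - b) for a, b in zip(one_shot, w))
print(diff)  # the two evaluations agree to high accuracy
```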