
The Arnoldi process (unless it terminates prematurely with a solution) produces matrices {V_k} with orthonormal columns such that
(3.8)                          A V_k = V_{k+1} H_k .

Hence, for some y^k ∈ R^k,

        r_k = b - A x_k = r_0 - A(x_k - x_0) = V_{k+1}(β e_1 - H_k y^k).

Hence x_k = x_0 + V_k y^k, where y^k minimizes ‖β e_1 - H_k y‖_2 over R^k. Note that
when y^k has been computed, the norm of r_k can be found without explicitly
forming x_k and computing r_k = b - A x_k. We have, using the orthogonality of
V_{k+1},

(3.9)          ‖r_k‖_2 = ‖V_{k+1}(β e_1 - H_k y^k)‖_2 = ‖β e_1 - H_k y^k‖_{R^{k+1}} .
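As a quick numerical check of (3.8) and (3.9), one can run a few steps of the Arnoldi process in MATLAB and compare the least squares residual with the true residual norm. This is only an illustrative sketch: the random test problem, the dimensions n and k, and the variable names are our own choices, not part of the text.

    % Build V_{k+1} and H_k with k Arnoldi steps and check (3.8) and (3.9).
    n = 20;  k = 5;  rng(1);
    A = randn(n);  b = randn(n,1);  x0 = zeros(n,1);
    r0 = b - A*x0;  beta = norm(r0);
    V = zeros(n,k+1);  H = zeros(k+1,k);  V(:,1) = r0/beta;
    for j = 1:k
        w = A*V(:,j);
        for i = 1:j                          % modified Gram-Schmidt
            H(i,j) = w'*V(:,i);
            w = w - H(i,j)*V(:,i);
        end
        H(j+1,j) = norm(w);
        V(:,j+1) = w/H(j+1,j);
    end
    disp(norm(A*V(:,1:k) - V*H))             % (3.8) holds to rounding
    e1 = zeros(k+1,1);  e1(1) = 1;
    y  = H \ (beta*e1);                      % least squares solution
    xk = x0 + V(:,1:k)*y;
    disp(abs(norm(b - A*xk) - norm(beta*e1 - H*y)))   % (3.9)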
The goal of the iteration is to find, for a given ε, a vector x so that

        ‖b - Ax‖_2 ≤ ε ‖b‖_2 .

The input is the initial iterate, x, the right-hand side b, and a map that
computes the action of A on a vector. We limit the number of iterations
to kmax and return the solution, which overwrites the initial iterate x, and
the residual norm.




Algorithm 3.4.2. gmresa(x, b, A, ε, kmax, ρ)

1. r = b - Ax, v_1 = r/‖r‖_2, ρ = ‖r‖_2, β = ρ, k = 0

2. While ρ > ε ‖b‖_2 and k < kmax do

   (a) k = k + 1
   (b) for j = 1, . . . , k
           h_{jk} = (A v_k)^T v_j
   (c) v_{k+1} = A v_k - Σ_{j=1}^{k} h_{jk} v_j
   (d) h_{k+1,k} = ‖v_{k+1}‖_2
   (e) v_{k+1} = v_{k+1}/‖v_{k+1}‖_2
   (f) e_1 = (1, 0, . . . , 0)^T ∈ R^{k+1}
       Minimize ‖β e_1 - H_k y^k‖_{R^{k+1}} over R^k to obtain y^k.
   (g) ρ = ‖β e_1 - H_k y^k‖_{R^{k+1}}.

3. x_k = x_0 + V_k y^k.

Note that x_k is only computed upon termination and is not needed within
the iteration. It is an important property of GMRES that the basis for the
Krylov space must be stored as the iteration progresses. This means that in
order to perform k GMRES iterations one must store k vectors of length N.
For very large problems this becomes prohibitive and the iteration is restarted
when the available room for basis vectors is exhausted. One way to implement
this is to set kmax to the maximum number m of vectors that one can store,
call GMRES and explicitly test the residual b - A x_k if k = m upon termination.
If the norm of the residual is larger than ε ‖b‖_2, call GMRES again with
x_0 = x_k, the result from the previous call. This restarted version of the
algorithm is called GMRES(m) in [167]. There is no general convergence
theorem for the restarted algorithm and restarting will slow the convergence
down. However, when it works it can significantly reduce the storage costs of
the iteration. We discuss implementation of GMRES(m) later in this section.
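A restart loop along these lines might be sketched in MATLAB as follows. This is only a sketch: the inner routine gmresb(x, b, A, errtol, kmax) is assumed to perform at most kmax GMRES iterations and return the new iterate and its residual norm, in the spirit of Algorithm gmresb below; the argument names and the maxrestart safeguard are our own, and A is taken to be a matrix rather than a map for simplicity.

    function [x, rho] = gmresm(x, b, A, errtol, m, maxrestart)
    % Sketch of GMRES(m): run at most m GMRES iterations, test the true
    % residual, and restart from the current iterate while it is too large.
    % gmresb is assumed to behave like Algorithm gmresb; maxrestart is an
    % illustrative safeguard, not part of the text.
    tol = errtol*norm(b);
    for i = 1:maxrestart
        [x, rho] = gmresb(x, b, A, errtol, m);   % kmax = m inner iterations
        rho = norm(b - A*x);                     % explicit test of b - A*x_k at k = m
        if rho <= tol
            return                               % converged
        end                                      % otherwise restart with x_0 = x_k
    end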
Algorithm gmresa can be implemented very straightforwardly in MATLAB.
Step 2f can be done with a single MATLAB command, the backward
division operator, at a cost of O(k^3) floating-point operations. There are more
efficient ways to solve the least squares problem in step 2f, [167], [197], and we
use the method of [167] in the collection of MATLAB codes. The savings are
slight if k is small relative to N, which is often the case for large problems, and
the simple one-line MATLAB approach can be efficient for such problems.
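For example, with H holding the (k+1) × k matrix H_k and beta = ‖r_0‖_2, steps 2f and 2g might be written as follows (a sketch; the variable names are our own):

    % Steps 2f-2g of gmresa with the backward division operator; H holds
    % the (k+1) x k upper Hessenberg matrix H_k and beta = ||r_0||_2.
    e1  = zeros(k+1,1);  e1(1) = 1;
    y   = H(1:k+1,1:k) \ (beta*e1);        % least squares solution, O(k^3) work
    rho = norm(beta*e1 - H(1:k+1,1:k)*y);  % new residual norm, by (3.9)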
A more serious problem with the implementation proposed in Algorithm
gmresa is that the vectors v_j may become nonorthogonal as a result of
cancellation errors. If this happens, (3.9), which depends on this orthogonality,
will not hold and the residual and approximate solution could be inaccurate. A
partial remedy is to replace the classical Gram–Schmidt orthogonalization in
Algorithm gmresa with modified Gram–Schmidt orthogonalization. We replace




the loop in step 2c of Algorithm gmresa with

    v_{k+1} = A v_k
    for j = 1, . . . , k
        v_{k+1} = v_{k+1} - (v_{k+1}^T v_j) v_j.

While modified Gram–Schmidt and classical Gram–Schmidt are equivalent in
infinite precision, the modified form is much more likely in practice to maintain
orthogonality of the basis.
We illustrate this point with a simple example from [128], doing the
computations in MATLAB. Let δ = 10^{-7} and define
        ( 1   1   1 )
    A = ( δ   δ   0 ) .
        ( δ   0   δ )

We orthogonalize the columns of A with classical Gram–Schmidt to obtain

        ( 1.0000e+00    1.0436e-07    9.9715e-08 )
    V = ( 1.0000e-07    1.0456e-14   -9.9905e-01 ) .
        ( 1.0000e-07   -1.0000e+00    4.3568e-02 )

The columns of V are not orthogonal at all. In fact v_2^T v_3 ≈ -.004. For
modified Gram–Schmidt
        ( 1.0000e+00    1.0436e-07    1.0436e-07 )
    V = ( 1.0000e-07    1.0456e-14   -1.0000e+00 ) .
        ( 1.0000e-07   -1.0000e+00    4.3565e-16 )

Here |v_i^T v_j - δ_{ij}| ≤ 10^{-8} for all i, j.
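The experiment is easy to repeat. The sketch below applies both orthogonalization schemes to the columns of A and prints the deviation of V^T V from the identity; the loop structure and variable names are our own, and only the loss of orthogonality, not the exact entries of V above, is meant to be reproduced.

    % Classical vs. modified Gram-Schmidt on the 3 x 3 example from [128].
    delta = 1.e-7;
    A = [1 1 1; delta delta 0; delta 0 delta];
    for variant = 1:2                  % 1 = classical, 2 = modified
        V = A;
        for k = 1:3
            for j = 1:k-1
                if variant == 1        % classical: project the original column
                    V(:,k) = V(:,k) - (A(:,k)'*V(:,j))*V(:,j);
                else                   % modified: project the updated column
                    V(:,k) = V(:,k) - (V(:,k)'*V(:,j))*V(:,j);
                end
            end
            V(:,k) = V(:,k)/norm(V(:,k));
        end
        disp(V'*V - eye(3))            % deviation from orthogonality
    end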

The versions we implement in the collection of MATLAB codes use modified
Gram–Schmidt. The outline of our implementation is Algorithm gmresb.
This implementation solves the upper Hessenberg least squares problem using
the MATLAB backward division operator, and is not particularly efficient. We
present a better implementation in Algorithm gmres. However, this version is
very simple and illustrates some important ideas. First, we see that x_k need
only be computed after termination as the least squares residual ρ can be used
to approximate the norm of the residual (they are identical in exact arithmetic).
Second, there is an opportunity to compensate for a loss of orthogonality in
the basis vectors for the Krylov space. One can take a second pass through the
modified Gram–Schmidt process and restore lost orthogonality [147], [160].
Algorithm 3.4.3. gmresb(x, b, A, ε, kmax, ρ)

1. r = b - Ax, v_1 = r/‖r‖_2, ρ = ‖r‖_2, β = ρ, k = 0

2. While ρ > ε ‖b‖_2 and k < kmax do

   (a) k = k + 1




   (b) v_{k+1} = A v_k
       for j = 1, . . . , k
           i.  h_{jk} = v_{k+1}^T v_j
           ii. v_{k+1} = v_{k+1} - h_{jk} v_j
   (c) h_{k+1,k} = ‖v_{k+1}‖_2
   (d) v_{k+1} = v_{k+1}/‖v_{k+1}‖_2
   (e) e_1 = (1, 0, . . . , 0)^T ∈ R^{k+1}
       Minimize ‖β e_1 - H_k y^k‖_{R^{k+1}} to obtain y^k ∈ R^k.
   (f) ρ = ‖β e_1 - H_k y^k‖_{R^{k+1}}.

3. x_k = x_0 + V_k y^k.
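A bare-bones MATLAB transcription of Algorithm gmresb might look like the sketch below. It is not the implementation distributed with the collection of codes: it treats A as a matrix rather than a map, stores V and H as full arrays, omits the breakdown test h_{k+1,k} = 0, and solves the least squares problem by backward division as in the text; the argument name errtol stands for the tolerance ε.

    function [x, rho] = gmresb(x, b, A, errtol, kmax)
    % Sketch of Algorithm gmresb: GMRES with modified Gram-Schmidt
    % orthogonalization and a brute-force least squares solve.
    r = b - A*x;  rho = norm(r);  beta = rho;  tol = errtol*norm(b);
    n = length(b);
    V = zeros(n, kmax+1);  H = zeros(kmax+1, kmax);
    V(:,1) = r/rho;  k = 0;
    while rho > tol && k < kmax
        k = k + 1;                                % step 2(a)
        V(:,k+1) = A*V(:,k);                      % step 2(b)
        for j = 1:k                               % modified Gram-Schmidt
            H(j,k) = V(:,k+1)'*V(:,j);            %   2(b)i
            V(:,k+1) = V(:,k+1) - H(j,k)*V(:,j);  %   2(b)ii
        end
        H(k+1,k) = norm(V(:,k+1));                % step 2(c)
        V(:,k+1) = V(:,k+1)/H(k+1,k);             % step 2(d)
        e1 = zeros(k+1,1);  e1(1) = 1;            % step 2(e)
        y = H(1:k+1,1:k) \ (beta*e1);
        rho = norm(beta*e1 - H(1:k+1,1:k)*y);     % step 2(f)
    end
    if k > 0
        x = x + V(:,1:k)*y;                       % step 3: x_k = x_0 + V_k y^k
    end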

Even if modified Gram–Schmidt orthogonalization is used, one can still
lose orthogonality in the columns of V. One can test for loss of orthogonality
[22], [147], and reorthogonalize if needed or use a more stable means to create
the matrix V [195]. These more complex implementations are necessary if A is
ill conditioned or many iterations will be taken. For example, one can augment
the modified Gram–Schmidt process

• v_{k+1} = A v_k
  for j = 1, . . . , k
      h_{jk} = v_{k+1}^T v_j
      v_{k+1} = v_{k+1} - h_{jk} v_j

• h_{k+1,k} = ‖v_{k+1}‖_2

• v_{k+1} = v_{k+1}/‖v_{k+1}‖_2

with a second pass (reorthogonalization). One can reorthogonalize in every
iteration or only if a test [147] detects a loss of orthogonality. There is nothing
to be gained by reorthogonalizing more than once [147].
The modified Gram–Schmidt process with reorthogonalization looks like

• v_{k+1} = A v_k
  for j = 1, . . . , k
      h_{jk} = v_{k+1}^T v_j
      v_{k+1} = v_{k+1} - h_{jk} v_j

• h_{k+1,k} = ‖v_{k+1}‖_2

• If loss of orthogonality is detected
  For j = 1, . . . , k
      h_{tmp} = v_{k+1}^T v_j
      h_{jk} = h_{jk} + h_{tmp}
      v_{k+1} = v_{k+1} - h_{tmp} v_j

• h_{k+1,k} = ‖v_{k+1}‖_2

• v_{k+1} = v_{k+1}/‖v_{k+1}‖_2





One approach to reorthogonalization is to reorthogonalize in every step.
This doubles the cost of the computation of V and is usually unnecessary.
More efficient and equally effective approaches are based on other ideas. A
variation on a method from [147] is used in [22]. Reorthogonalization is done
after the Gram–Schmidt loop and before v_{k+1} is normalized if

(3.10)          ‖A v_k‖_2 + δ ‖v_{k+1}‖_2 = ‖A v_k‖_2

to working precision. The idea is that if the new vector is very small relative
to A v_k then information may have been lost and a second pass through
the modified Gram–Schmidt process is needed. We employ this test in the
MATLAB code gmres with δ = 10^{-3}.
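In the notation of the gmresb sketch above, the test (3.10) and the second Gram–Schmidt pass might be inserted after the orthogonalization loop and before the normalization, roughly as follows. This is a sketch only; the distributed codes save ‖A v_k‖_2 at the time A v_k is formed rather than recomputing it, and the variable names are our own.

    % After the modified Gram-Schmidt loop has produced V(:,k+1) from A*V(:,k),
    % reorthogonalize if (3.10) holds to working precision.
    delta = 1.e-3;
    normav = norm(A*V(:,k));                   % ||A v_k||_2, saved earlier in practice
    if normav + delta*norm(V(:,k+1)) == normav
        for j = 1:k                            % second Gram-Schmidt pass
            hr = V(:,k+1)'*V(:,j);
            H(j,k) = H(j,k) + hr;
            V(:,k+1) = V(:,k+1) - hr*V(:,j);
        end
    end
    H(k+1,k) = norm(V(:,k+1));                 % then normalize as before
    V(:,k+1) = V(:,k+1)/H(k+1,k);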
To illustrate the effects of loss of orthogonality and those of reorthogonalization
we apply GMRES to the diagonal system Ax = b where b = (1, 1, 1)^T,
x_0 = (0, 0, 0)^T, and

              ( .001    0       0   )
(3.11)    A = (  0     .0011    0   ) .
              (  0      0      10^4 )

While in infinite precision arithmetic only three iterations are needed to solve
the system exactly, we find in the MATLAB environment that a solution to full
precision requires more than three iterations unless reorthogonalization is applied
after every iteration. In Table 3.1 we tabulate relative residuals as a function
of the iteration counter for classical Gram–Schmidt without reorthogonalization
(CGM), modified Gram–Schmidt without reorthogonalization (MGM),
reorthogonalization based on the test (3.10) (MGM-PO), and reorthogonalization
in every iteration (MGM-FO). While classical Gram–Schmidt fails, the
reorthogonalization strategy based on (3.10) is almost as effective as the much
more expensive approach of reorthogonalizing in every step. The method based
on (3.10) is the default in the MATLAB code gmres.
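The data for this experiment are easy to set up; only the problem from (3.11) is shown here, since the calls to the individual variants depend on how the orthogonalization option is exposed in the codes.

    % Diagonal test problem (3.11) used for Table 3.1.
    A  = diag([1.e-3, 1.1e-3, 1.e4]);
    b  = ones(3,1);                   % b = (1,1,1)^T
    x0 = zeros(3,1);                  % x_0 = (0,0,0)^T
    % Relative residuals ||b - A*x_k||_2/||b||_2 for the four variants are
    % then recorded as in Table 3.1.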
The kth GMRES iteration requires a matrix-vector product, k scalar
products, and the solution of the Hessenberg least squares problem in step 2e.
The k scalar products require O(kN) floating-point operations and the cost
of a solution of the Hessenberg least squares problem, by QR factorization or
the MATLAB backward division operator, say, in step 2e of gmresb is O(k^3)
floating-point operations. Hence the total cost of the m GMRES iterations is
m matrix-vector products and O(m^4 + m^2 N) floating-point operations. When
k is not too large and the cost of matrix-vector products is high, a brute-force
solution to the least squares problem using the MATLAB backward
division operator is not terribly inefficient. We provide an implementation
of Algorithm gmresb in the collection of MATLAB codes. This is an appealing
algorithm, especially when implemented in an environment like MATLAB,
because of its simplicity. For large k, however, the brute-force method can be
very costly.




Table 3.1
Effects of reorthogonalization.

k CGM MGM MGM-PO MGM-FO
0 1.00e+00 1.00e+00 1.00e+00 1.00e+00
1 8.16e-01 8.16e-01 8.16e-01 8.16e-01
2 3.88e-02 3.88e-02 3.88e-02 3.88e-02
3 6.69e-05 6.42e-08 6.42e-08 6.34e-34
4 4.74e-05 3.70e-08 5.04e-24
5 3.87e-05 3.04e-18
6 3.35e-05
7 3.00e-05
8 2.74e-05
9 2.53e-05
10 2.37e-05




3.5. Implementation: Givens rotations
If k is large, implementations using Givens rotations [167], [22], Householder
reflections [195], or a shifted Arnoldi process [197] are much more efficient
than the brute-force approach in Algorithm gmresb. The implementation in
Algorithm gmres and in the MATLAB code collection is from [167]. This
implementation maintains the QR factorization of H_k in a clever way so that
the cost for a single GMRES iteration is O(Nk) floating-point operations. The
O(k^2) cost of the triangular solve and the O(kN) cost of the construction of
x_k are incurred after termination.
A 2 × 2 Givens rotation is a matrix of the form

(3.12)          G = ( c   -s )
                    ( s    c ) ,

where c = cos(θ), s = sin(θ) for θ ∈ [-π, π]. The orthogonal matrix G rotates
the vector (c, -s)^T, which makes an angle of -θ with the x-axis, through an
angle θ so that it overlaps the x-axis:

        G ( c  ) = ( 1 ) .
          ( -s )   ( 0 )
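A short MATLAB check of this rotation property (the angle is an arbitrary choice):

    theta = 0.3;                     % any angle in [-pi, pi]
    c = cos(theta);  s = sin(theta);
    G = [c -s; s c];                 % the rotation (3.12)
    disp(G*[c; -s])                  % returns (1, 0)^T up to rounding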


An N × N Givens rotation replaces a 2 × 2 block on the diagonal of the



N × N identity matrix with a 2 × 2 Givens rotation.
              ( 1                              0 )
              (   ⋱                              )
              (     1                            )
              (        c  -s                     )
(3.13)    G = (        s   c                     ) .
              (               1                  )
              (                  ⋱               )
              ( 0                          1     )

Our notation is that G_j(c, s) is an N × N Givens rotation of the form (3.13)
with a 2 × 2 Givens rotation in rows and columns j and j + 1.
Givens rotations are used to annihilate single nonzero elements of matrices
in reduction to triangular form [89]. They are of particular value in reducing
Hessenberg matrices to triangular form and thereby solving Hessenberg least
squares problems such as the ones that arise in GMRES. This reduction can be
accomplished in O(N) floating-point operations and hence is far more efficient
than a solution by a singular value decomposition or a reduction based on
Householder transformations. This method is also used in the QR algorithm
for computing eigenvalues [89], [184].
Let H be an N × M (N ≥ M) upper Hessenberg matrix with rank M.
We reduce H to triangular form by first multiplying the matrix by a Givens
rotation that annihilates h_{21} (and, of course, changes h_{11} and the subsequent
columns). We define G_1 = G_1(c_1, s_1) by

(3.14)    c_1 = h_{11}/√(h_{11}^2 + h_{21}^2)  and  s_1 = -h_{21}/√(h_{11}^2 + h_{21}^2).

If we replace H by G_1 H, then the first column of H now has only a single
nonzero element h_{11}. Similarly, we can now apply G_2(c_2, s_2) to H, where

(3.15)    c_2 = h_{22}/√(h_{22}^2 + h_{32}^2)  and  s_2 = -h_{32}/√(h_{22}^2 + h_{32}^2),

and annihilate h_{32}. Note that G_2 does not affect the first column of H.
Continuing in this way and setting

        Q = G_N . . . G_1

we see that QH = R is upper triangular.
A straightforward application of these ideas to Algorithm gmres would
solve the least squares problem by computing the product of k Givens rotations
Q, setting g = β Q e_1, and noting that

        ‖β e_1 - H_k y^k‖_{R^{k+1}} = ‖Q(β e_1 - H_k y^k)‖_{R^{k+1}} = ‖g - R_k y^k‖_{R^{k+1}} ,

where R_k is the (k + 1) × k triangular factor of the QR factorization of H_k.
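The sketch below carries out this reduction and the resulting least squares solve for a small, randomly generated upper Hessenberg matrix and compares the answer with backward division. It forms each G_j explicitly instead of updating the factorization one column at a time as the actual implementation does, so it is illustrative only; the dimensions and data are arbitrary choices.

    % Reduce a (k+1) x k upper Hessenberg H with Givens rotations and solve
    % the least squares problem min ||beta*e1 - H*y||_2.
    k = 5;
    H = triu(randn(k+1,k), -1);              % random upper Hessenberg matrix
    beta = 2;
    g = zeros(k+1,1);  g(1) = beta;          % g will become Q*(beta*e1)
    R = H;
    for j = 1:k                              % annihilate R(j+1,j), cf. (3.14)-(3.15)
        nu = sqrt(R(j,j)^2 + R(j+1,j)^2);
        c = R(j,j)/nu;  s = -R(j+1,j)/nu;
        Gj = eye(k+1);  Gj(j:j+1,j:j+1) = [c -s; s c];
        R = Gj*R;  g = Gj*g;                 % apply G_j(c_j, s_j)
    end
    y = R(1:k,1:k) \ g(1:k);                 % triangular solve for y^k
    rho = abs(g(k+1));                       % least squares residual norm
    disp(norm(y - H \ (beta*eye(k+1,1))))    % agrees with backward division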


