
(1.5)    x = (I − A)x + b,
and to define the Richardson iteration

(1.6)    x_{k+1} = (I − A)x_k + b.

We will discuss more general methods in which {xk} is given by

(1.7)    x_{k+1} = M x_k + c.

In (1.7) M is an N × N matrix called the iteration matrix. Iterative methods of
this form are called stationary iterative methods because the transition from xk
to xk+1 does not depend on the history of the iteration. The Krylov methods
discussed in Chapters 2 and 3 are not stationary iterative methods.
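For illustration, here is a minimal NumPy sketch of the generic stationary iteration (1.7). The function name, test matrix, and stopping rule are illustrative choices and not part of the text; Richardson iteration (1.6) is recovered by taking M = I − A and c = b.

import numpy as np

def stationary_iteration(M, c, x0, tol=1e-10, maxit=1000):
    """Iterate x_{k+1} = M x_k + c until successive iterates stop changing."""
    x = x0.copy()
    for k in range(maxit):
        x_new = M @ x + c
        if np.linalg.norm(x_new - x) <= tol:
            return x_new, k + 1
        x = x_new
    return x, maxit

# Richardson iteration (1.6): M = I - A, c = b.
A = np.array([[1.2, 0.1], [0.2, 0.9]])
b = np.array([1.0, 2.0])
x, its = stationary_iteration(np.eye(2) - A, b, np.zeros(2))
print(its, np.linalg.norm(A @ x - b))   # small residual after a few dozen steps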
All our results are based on the following lemma.
Lemma 1.2.1. If M is an N × N matrix with ‖M‖ < 1 then I − M is
nonsingular and

(1.8)    ‖(I − M)^{-1}‖ ≤ 1/(1 − ‖M‖).
Proof. We will show that I − M is nonsingular and that (1.8) holds by
showing that the series

    Σ_{l=0}^{∞} M^l = (I − M)^{-1}.

The partial sums

    S_k = Σ_{l=0}^{k} M^l

form a Cauchy sequence in R^{N×N}. To see this note that for all m > k

    ‖S_k − S_m‖ ≤ Σ_{l=k+1}^{m} ‖M^l‖.

Now, ‖M^l‖ ≤ ‖M‖^l because ‖·‖ is a matrix norm that is induced by a vector
norm. Hence

    ‖S_k − S_m‖ ≤ Σ_{l=k+1}^{m} ‖M‖^l = ‖M‖^{k+1} (1 − ‖M‖^{m−k})/(1 − ‖M‖) → 0

as m, k → ∞. Hence the sequence S_k converges, say to S. Since M S_k + I =
S_{k+1}, we must have M S + I = S and hence (I − M)S = I. This proves that
I − M is nonsingular and that S = (I − M)^{-1}.




Noting that

    ‖(I − M)^{-1}‖ ≤ Σ_{l=0}^{∞} ‖M‖^l = (1 − ‖M‖)^{-1}

proves (1.8) and completes the proof.
The following corollary is a direct consequence of Lemma 1.2.1.
Corollary 1.2.1. If ‖M‖ < 1 then the iteration (1.7) converges to
x = (I − M)^{-1} c for all initial iterates x0.
A consequence of Corollary 1.2.1 is that Richardson iteration (1.6) will
converge if ‖I − A‖ < 1. It is sometimes possible to precondition a linear
equation by multiplying both sides of (1.1) by a matrix B

BAx = Bb

so that convergence of iterative methods is improved. In the context of
Richardson iteration, the matrices B that allow us to apply the Banach lemma
and its corollary are called approximate inverses.
Definition 1.2.1. B is an approximate inverse of A if ‖I − BA‖ < 1.
The following theorem is often referred to as the Banach Lemma.
Theorem 1.2.1. If A and B are N × N matrices and B is an approximate
inverse of A, then A and B are both nonsingular and

(1.9)     ‖A^{-1}‖ ≤ ‖B‖/(1 − ‖I − BA‖),    ‖B^{-1}‖ ≤ ‖A‖/(1 − ‖I − BA‖),

and

(1.10)    ‖A^{-1} − B‖ ≤ ‖B‖ ‖I − BA‖/(1 − ‖I − BA‖),    ‖A − B^{-1}‖ ≤ ‖A‖ ‖I − BA‖/(1 − ‖I − BA‖).
Proof. Let M = I − BA. By Lemma 1.2.1, I − M = I − (I − BA) = BA is
nonsingular. Hence both A and B are nonsingular. By (1.8)

(1.11)    ‖A^{-1} B^{-1}‖ = ‖(I − M)^{-1}‖ ≤ 1/(1 − ‖M‖) = 1/(1 − ‖I − BA‖).

Since A^{-1} = (I − M)^{-1} B, inequality (1.11) implies the first part of (1.9). The
second part follows in a similar way from B^{-1} = A(I − M)^{-1}.
To complete the proof note that

    A^{-1} − B = (I − BA) A^{-1},    B^{-1} − A = B^{-1}(I − BA),

and use (1.9).
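The bounds (1.9) and (1.10) are easy to check numerically. In the sketch below the test matrix, the size of the perturbation, and the use of the 2-norm are illustrative assumptions; B is built by perturbing the exact inverse so that ‖I − BA‖ < 1.

import numpy as np

rng = np.random.default_rng(0)
N = 5
A = np.eye(N) + 0.1 * rng.standard_normal((N, N))           # well-conditioned test matrix
B = np.linalg.inv(A) + 0.01 * rng.standard_normal((N, N))   # perturbed exact inverse

E = np.linalg.norm(np.eye(N) - B @ A, 2)                     # ||I - BA||, here < 1
Ainv = np.linalg.inv(A)

# First inequality in (1.9): ||A^{-1}|| <= ||B|| / (1 - ||I - BA||)
print(np.linalg.norm(Ainv, 2), "<=", np.linalg.norm(B, 2) / (1 - E))
# First inequality in (1.10): ||A^{-1} - B|| <= ||B|| ||I - BA|| / (1 - ||I - BA||)
print(np.linalg.norm(Ainv - B, 2), "<=", np.linalg.norm(B, 2) * E / (1 - E))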
Richardson iteration, preconditioned with approximate inversion, has the
form
(1.12)    x_{k+1} = (I − BA) x_k + B b.

If the norm of I − BA is small, then not only will the iteration converge
rapidly, but, as Lemma 1.1.1 indicates, termination decisions based on the




preconditioned residual Bb − BAx will better reflect the actual error. This
method is a very effective technique for solving differential equations, integral
equations, and related problems [15], [6], [100], [117], [111]. Multigrid methods
[19], [99], [126], can also be interpreted in this light. We mention one other
approach, polynomial preconditioning, which tries to approximate A^{-1} by a
polynomial in A [123], [179], [169].
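As an illustration of (1.12), the following minimal sketch runs preconditioned Richardson iteration and terminates on the preconditioned residual. The function name, the tridiagonal test matrix, and the choice of B as the inverse of the diagonal of A are assumptions made only for this example.

import numpy as np

def richardson(A, b, B, x0=None, tol=1e-10, maxit=500):
    """Preconditioned Richardson iteration x_{k+1} = (I - BA) x_k + B b."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(maxit):
        pres = B @ (b - A @ x)        # preconditioned residual Bb - BAx
        if np.linalg.norm(pres) <= tol:
            return x, k
        x = x + pres                  # same as x = (I - BA)x + Bb
    return x, maxit

A = np.array([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
B = np.diag(1.0 / np.diag(A))         # here ||I - BA|| is well below 1
x, its = richardson(A, b, B)
print(its, np.linalg.norm(b - A @ x))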

1.3. The spectral radius
The analysis in § 1.2 related convergence of the iteration (1.7) to the norm of
the matrix M. However, the norm of M could be small in some norms and
quite large in others. Hence the performance of the iteration is not completely
described by ‖M‖. The concept of spectral radius allows us to make a complete
description.
We let σ(A) denote the set of eigenvalues of A.
Definition 1.3.1. The spectral radius of an N × N matrix A is

(1.13)    ρ(A) = max_{λ∈σ(A)} |λ| = lim_{n→∞} ‖A^n‖^{1/n}.

The term on the right-hand side of the second equality in (1.13) is the limit
used by the radical test for convergence of the series Σ A^n.
The spectral radius of M is independent of any particular matrix norm of
M. It is clear, in fact, that

(1.14)    ρ(A) ≤ ‖A‖

for any induced matrix norm. The inequality (1.14) has a partial converse that
allows us to completely describe the performance of iteration (1.7) in terms of
spectral radius. We state that converse as a theorem and refer to [105] for a
proof.
Theorem 1.3.1. Let A be an N × N matrix. Then for any ε > 0 there is
a norm ‖·‖ on R^N such that

    ρ(A) > ‖A‖ − ε.

A consequence of Theorem 1.3.1, Lemma 1.2.1, and Exercise 1.5.1 is a
characterization of convergent stationary iterative methods. The proof is left
as an exercise.
Theorem 1.3.2. Let M be an N × N matrix. The iteration (1.7) converges
for all c ∈ R^N if and only if ρ(M) < 1.
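A small numerical illustration of Theorem 1.3.2 follows; the matrix is an arbitrary choice, picked so that ρ(M) < 1 even though ‖M‖∞ > 1, which is exactly the situation in which the spectral radius rather than any one norm tells the story.

import numpy as np

M = np.array([[0.5, 2.0], [0.0, 0.5]])      # ||M||_inf = 2.5 but rho(M) = 0.5
c = np.array([1.0, 1.0])
print("spectral radius:", max(abs(np.linalg.eigvals(M))))

x = np.zeros(2)
for _ in range(200):                         # iteration (1.7)
    x = M @ x + c
x_star = np.linalg.solve(np.eye(2) - M, c)   # fixed point (I - M)^{-1} c
print(np.linalg.norm(x - x_star))            # converges because rho(M) < 1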

1.4. Matrix splittings and classical stationary iterative methods
There are ways to convert Ax = b to a linear fixed-point iteration that are
different from (1.5). Methods such as Jacobi, Gauss–Seidel, and successive
overrelaxation (SOR) iteration are based on splittings of A of the form

A = A1 + A2 ,




where A1 is a nonsingular matrix constructed so that equations with A1 as
coefficient matrix are easy to solve. Then Ax = b is converted to the fixed-
point problem

    x = A_1^{-1} (b − A_2 x).

The analysis of the method is based on an estimation of the spectral radius of
the iteration matrix M = −A_1^{-1} A_2.
For a detailed description of the classical stationary iterative methods the
reader may consult [89], [105], [144], [193], or [200]. These methods are usually
less efficient than the Krylov methods discussed in Chapters 2 and 3 or the more
modern stationary methods based on multigrid ideas. However the classical
methods have a role as preconditioners. The limited description in this section
is intended as a review that will set some notation to be used later.
As a first example we consider the Jacobi iteration that uses the splitting

A1 = D, A2 = L + U,

where D is the diagonal of A and L and U are the (strict) lower and upper
triangular parts. This leads to the iteration matrix

MJAC = −D^{-1} (L + U).

Letting (xk)i denote the ith component of the kth iterate we can express Jacobi
iteration concretely as

(1.15)    (x_{k+1})_i = a_{ii}^{-1} ( b_i − Σ_{j≠i} a_{ij} (x_k)_j ).

Note that A1 is diagonal and hence trivial to invert.
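A minimal NumPy sketch of the Jacobi iteration (1.15) follows. The function name and tolerance are illustrative; the test matrix is strictly diagonally dominant, so Theorem 1.4.1 below guarantees convergence.

import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, maxit=500):
    """Jacobi iteration (1.15): x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D = np.diag(A)                       # diagonal of A, stored as a vector
    R = A - np.diag(D)                   # L + U, the off-diagonal part
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(maxit):
        x_new = (b - R @ x) / D          # inverting A1 = D is trivial
        if np.linalg.norm(x_new - x) <= tol:
            return x_new, k + 1
        x = x_new
    return x, maxit

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([1.0, 5.0, 0.0])
x, its = jacobi(A, b)
print(its, np.linalg.norm(b - A @ x))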
We present only one convergence result for the classical stationary iterative
methods.
Theorem 1.4.1. Let A be an N × N matrix and assume that for all
1 ≤ i ≤ N

(1.16)    0 < Σ_{j≠i} |a_{ij}| < |a_{ii}|.

Then A is nonsingular and the Jacobi iteration (1.15) converges to x* = A^{-1} b
for all b.
Proof. Note that the ith row sum of M = MJAC satisfies

    Σ_{j=1}^{N} |m_{ij}| = ( Σ_{j≠i} |a_{ij}| ) / |a_{ii}| < 1.

Hence ‖MJAC‖∞ < 1 and the iteration converges to the unique solution of
x = M x + D^{-1} b. Also I − M = D^{-1} A is nonsingular and therefore A is
nonsingular.




Gauss–Seidel iteration overwrites the approximate solution with the new
value as soon as it is computed. This results in the iteration

    (x_{k+1})_i = a_{ii}^{-1} ( b_i − Σ_{j<i} a_{ij} (x_{k+1})_j − Σ_{j>i} a_{ij} (x_k)_j ),

the splitting
A1 = D + L, A2 = U,
and iteration matrix
MGS = −(D + L)^{-1} U.

Note that A1 is lower triangular, and hence A_1^{-1} y is easy to compute for vectors
y. Note also that, unlike Jacobi iteration, the iteration depends on the ordering
of the unknowns. Backward Gauss–Seidel begins the update of x with the Nth
coordinate rather than the first, resulting in the splitting

A1 = D + U, A2 = L,

and iteration matrix
MBGS = −(D + U)^{-1} L.
A symmetric Gauss–Seidel iteration is a forward Gauss–Seidel iteration
followed by a backward Gauss–Seidel iteration. This leads to the iteration
matrix

    MSGS = MBGS MGS = (D + U)^{-1} L (D + L)^{-1} U.
If A is symmetric then U = L^T. In that event

    MSGS = (D + U)^{-1} L (D + L)^{-1} U = (D + L^T)^{-1} L (D + L)^{-1} L^T.
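The splittings above translate directly into a few lines of NumPy. The sketch below (the symmetric test matrix is an arbitrary choice) forms the forward, backward, and symmetric Gauss–Seidel iteration matrices and reports their spectral radii.

import numpy as np

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
D = np.diag(np.diag(A))
L = np.tril(A, -1)                       # strict lower triangle
U = np.triu(A, 1)                        # strict upper triangle

M_GS  = -np.linalg.solve(D + L, U)       # forward Gauss-Seidel
M_BGS = -np.linalg.solve(D + U, L)       # backward Gauss-Seidel
M_SGS = M_BGS @ M_GS                     # symmetric Gauss-Seidel

for name, M in [("GS", M_GS), ("BGS", M_BGS), ("SGS", M_SGS)]:
    print(name, max(abs(np.linalg.eigvals(M))))   # all spectral radii < 1 here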

From the point of view of preconditioning, one wants to write the stationary
method as a preconditioned Richardson iteration. That means that one wants
to find B such that M = I − BA and then use B as an approximate inverse.
For the Jacobi iteration,

(1.17)    BJAC = D^{-1}.
For symmetric Gauss–Seidel

(1.18)    BSGS = (D + L^T)^{-1} D (D + L)^{-1}.
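A quick numerical check (the test matrix and variable names are illustrative assumptions) that (1.17) and (1.18) satisfy M = I − BA for the Jacobi and symmetric Gauss–Seidel iterations:

import numpy as np

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])  # symmetric, so U = L^T
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

B_JAC = np.linalg.inv(D)                                    # (1.17)
B_SGS = np.linalg.inv(D + L.T) @ D @ np.linalg.inv(D + L)   # (1.18)

M_JAC = -np.linalg.inv(D) @ (L + U)
M_SGS = np.linalg.inv(D + U) @ L @ np.linalg.inv(D + L) @ U

print(np.allclose(M_JAC, np.eye(3) - B_JAC @ A))   # True
print(np.allclose(M_SGS, np.eye(3) - B_SGS @ A))   # True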

The successive overrelaxation iteration modifies Gauss–Seidel by adding a
relaxation parameter ω to construct an iteration with iteration matrix

    MSOR = (D + ωL)^{-1} ((1 − ω)D − ωU).

The performance can be dramatically improved with a good choice of ω but
still is not competitive with Krylov methods. A further disadvantage is that
the choice of ω is often difficult to make. References [200], [89], [193], [8], and
the papers cited therein provide additional reading on this topic.
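The following small experiment (the matrix and the grid of ω values are arbitrary illustrative choices) shows how ρ(MSOR) depends on the relaxation parameter; ω = 1 recovers Gauss–Seidel.

import numpy as np

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

for omega in [0.8, 1.0, 1.2, 1.4, 1.6]:
    M_SOR = np.linalg.solve(D + omega * L, (1 - omega) * D - omega * U)
    print(omega, max(abs(np.linalg.eigvals(M_SOR))))
# A good omega makes the spectral radius, and hence the convergence rate, much smaller.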




1.5. Exercises on stationary iterative methods
1.5.1. Show that if ρ(M ) ≥ 1 then there are x0 and c such that the iteration
(1.7) fails to converge.

1.5.2. Prove Theorem 1.3.2.

1.5.3. Verify equality (1.18).

1.5.4. Show that if A is symmetric and positive definite (that is, A^T = A and
x^T Ax > 0 for all x ≠ 0) then BSGS is also symmetric and positive definite.








Chapter 2

Conjugate Gradient Iteration




2.1. Krylov methods and the minimization property
In the following two chapters we describe some of the Krylov space methods
for linear equations. Unlike the stationary iterative methods, Krylov methods
do not have an iteration matrix. The two such methods that we'll discuss in
depth, conjugate gradient and GMRES, minimize, at the kth iteration, some
measure of error over the affine space

x0 + Kk ,

where x0 is the initial iterate and the kth Krylov subspace Kk is

Kk = span(r0, Ar0, . . . , A^{k−1} r0)

for k ≥ 1.
The residual is

    r = b − Ax.

So {rk}k≥0 will denote the sequence of residuals

    rk = b − Axk.

As in Chapter 1, we assume that A is a nonsingular N × N matrix and let

    x* = A^{-1} b.
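For concreteness, the sketch below builds the monomial basis r0, Ar0, . . . , A^{k−1} r0 whose span is Kk. Practical Krylov methods never form this basis explicitly (it becomes ill conditioned very quickly), so the code and the names in it are purely illustrative.

import numpy as np

def krylov_basis(A, r0, k):
    """Return the N x k matrix whose columns are r0, A r0, ..., A^{k-1} r0."""
    V = np.empty((r0.size, k))
    v = r0.copy()
    for j in range(k):
        V[:, j] = v
        v = A @ v
    return V

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
r0 = b - A @ np.zeros(2)          # residual at the initial iterate x0 = 0
print(krylov_basis(A, r0, 2))     # columns span K_2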

There are other Krylov methods that are not as well understood as CG or
GMRES. Brief descriptions of several of these methods and their properties
are in § 3.6, [12], and [78].
The conjugate gradient (CG) iteration was invented in the 1950s [103] as a
direct method. It has come into wide use over the last 15 years as an iterative
method and has generally superseded the Jacobi–Gauss–Seidel–SOR family of
methods.
CG is intended to solve symmetric positive definite (spd) systems. Recall
that A is symmetric if A = A^T and positive definite if

    x^T Ax > 0 for all x ≠ 0.

In this section we assume that A is spd. Since A is spd we may define a norm
(you should check that this is a norm) by

(2.1)    ‖x‖_A = √(x^T A x).


‖·‖_A is called the A-norm. The development in these notes is different from
the classical work and more like the analysis for GMRES and CGNR in [134].
In this section, and in the section on GMRES that follows, we begin with a
description of what the algorithm does and the consequences of the minimiza-
tion property of the iterates. After that we describe termination criteria,
performance, preconditioning, and at the very end, the implementation.
The kth iterate xk of CG minimizes

(2.2)    φ(x) = (1/2) x^T A x − x^T b

over x0 + Kk.
Note that if φ(x̃) is the minimal value (in R^N) then

    ∇φ(x̃) = A x̃ − b = 0.
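The following numerical illustration (the spd test matrix and all names are assumptions made for the example) confirms that the unconstrained minimizer of φ is x* = A^{-1} b, and exhibits the identity φ(x) − φ(x*) = (1/2)‖x − x*‖_A², which is what connects minimizing φ to minimizing the A-norm of the error.

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])      # spd test matrix
b = np.array([1.0, 2.0])
x_star = np.linalg.solve(A, b)

def phi(x):
    return 0.5 * x @ A @ x - x @ b

def a_norm(x):
    return np.sqrt(x @ A @ x)

print(np.linalg.norm(A @ x_star - b))        # gradient A x - b vanishes at x_star
x = np.random.default_rng(1).standard_normal(2)
print(phi(x) - phi(x_star), 0.5 * a_norm(x - x_star) ** 2)   # the two numbers agree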
