From (1.23) and Theorem 1.5 it follows that a sufficient condition for convergence to hold is that ‖B‖ < 1, for any matrix norm. It is reasonable to expect that the convergence is faster when ρ(B) is smaller, so that an estimate of ρ(B) might provide a sound indication of the convergence of the algorithm. Other remarkable quantities in convergence analysis are contained in the following definition.

Definition 4.2 Let B be the iteration matrix. We call:

1. ‖B^m‖ the convergence factor after m steps of the iteration;

2. ‖B^m‖^{1/m} the average convergence factor after m steps;

3. R_m(B) = −(1/m) log ‖B^m‖ the average convergence rate after m steps.

These quantities are too expensive to compute since they require evaluating B^m. Therefore, it is usually preferred to estimate the asymptotic convergence rate, which is defined as

R(B) = lim_{k→∞} R_k(B) = −log ρ(B),   (4.5)

where Property 1.13 has been accounted for. In particular, if B were symmetric, we would have

R_m(B) = −(1/m) log ‖B^m‖_2 = −log ρ(B).

In the case of nonsymmetric matrices, ρ(B) sometimes provides an overoptimistic estimate of ‖B^m‖^{1/m} (see [Axe94], Section 5.1). Indeed, although ρ(B) < 1, the convergence to zero of the sequence ‖B^m‖ might be nonmonotone (see Exercise 1). We finally notice that, due to (4.5), ρ(B) is the asymptotic convergence factor. Criteria for estimating the quantities defined so far will be addressed in Section 4.6.
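These quantities can be illustrated numerically. The sketch below uses a small iteration matrix of my own choosing (not one from the text): since B is triangular, ρ(B) can be read off its diagonal, and the average convergence factor ‖B^m‖^{1/m} is seen to approach the asymptotic factor ρ(B) as m grows.

```python
import numpy as np

# Hypothetical 2x2 iteration matrix; triangular, so rho(B) = 0.5.
B = np.array([[0.5, 0.1],
              [0.0, 0.4]])

rho = max(abs(np.linalg.eigvals(B)))   # spectral radius rho(B)
R_asympt = -np.log(rho)                # asymptotic convergence rate (4.5)

def avg_factor(B, m):
    """Average convergence factor ||B^m||^(1/m) after m steps (2-norm)."""
    Bm = np.linalg.matrix_power(B, m)
    return np.linalg.norm(Bm, 2) ** (1.0 / m)

# The average factor decreases toward rho(B) as m grows.
factors = [avg_factor(B, m) for m in (1, 10, 50)]
```

For this B the average factor starts at ‖B‖_2 ≈ 0.52 and drifts down toward ρ(B) = 0.5, illustrating why ρ(B) is called the asymptotic convergence factor.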

126 4. Iterative Methods for Solving Linear Systems

Remark 4.1 The iterations introduced in (4.2) are a special instance of iterative methods of the form

x^{(0)} = f_0(A, b),
x^{(n+1)} = f_{n+1}(x^{(n)}, x^{(n−1)}, . . . , x^{(n−m)}, A, b),   for n ≥ m,

where f_i and x^{(m)}, . . . , x^{(1)} are given functions and vectors, respectively.

The number of steps which the current iteration depends on is called the order of the method. If the functions f_i are independent of the step index i, the method is called stationary, otherwise it is nonstationary. Finally, if f_i depends linearly on x^{(0)}, . . . , x^{(m)}, the method is called linear, otherwise it is nonlinear.

In the light of these definitions, the methods considered so far are therefore stationary linear iterative methods of first order. In Section 4.3, examples of nonstationary linear methods will be provided.

4.2 Linear Iterative Methods

A general technique to devise consistent linear iterative methods is based on an additive splitting of the matrix A of the form A = P − N, where P and N are two suitable matrices and P is nonsingular. For reasons that will be clear in the later sections, P is called the preconditioning matrix or preconditioner.

Precisely, given x^{(0)}, one can compute x^{(k)} for k ≥ 1 by solving the systems

P x^{(k+1)} = N x^{(k)} + b,   k ≥ 0.   (4.6)

The iteration matrix of method (4.6) is B = P^{−1}N, while f = P^{−1}b. Alternatively, (4.6) can be written in the form

x^{(k+1)} = x^{(k)} + P^{−1} r^{(k)},   (4.7)

where

r^{(k)} = b − A x^{(k)}   (4.8)

denotes the residual vector at step k. Relation (4.7) outlines the fact that a linear system, with coefficient matrix P, must be solved to update the solution at step k + 1. Thus P, besides being nonsingular, ought to be easily invertible, in order to keep the overall computational cost low. (Notice that, if P were equal to A and N = 0, method (4.7) would converge in one iteration, but at the same cost as a direct method.)
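The update (4.7) can be sketched directly in code (the test matrix and the choice P = D below are illustrative assumptions of mine, not prescriptions of the text): at each step one solves a system with matrix P rather than forming P^{−1}.

```python
import numpy as np

def splitting_iteration(A, b, P, x0, nsteps):
    """Run x^(k+1) = x^(k) + P^{-1} r^(k), with r^(k) = b - A x^(k), cf. (4.7)-(4.8)."""
    x = x0.astype(float).copy()
    for _ in range(nsteps):
        r = b - A @ x                    # residual (4.8)
        x = x + np.linalg.solve(P, r)    # solve P z = r instead of inverting P
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
P = np.diag(np.diag(A))                  # illustrative choice: P = diagonal of A
x = splitting_iteration(A, b, P, np.zeros(2), 60)
```

Solving with P at every step, rather than inverting it once, is the standard practice when P is chosen so that such solves are cheap (diagonal, triangular, ...).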

Let us mention two results that ensure convergence of the iteration (4.7),

provided suitable conditions on the splitting of A are ful¬lled (for their

proof, we refer to [Hac94]).

Property 4.1 Let A = P − N, with A and P symmetric and positive definite. If the matrix 2P − A is positive definite, then the iterative method defined in (4.7) is convergent for any choice of the initial datum x^{(0)} and

ρ(B) = ‖B‖_A = ‖B‖_P < 1.

Moreover, the convergence of the iteration is monotone with respect to the norms ‖·‖_P and ‖·‖_A (i.e., ‖e^{(k+1)}‖_P < ‖e^{(k)}‖_P and ‖e^{(k+1)}‖_A < ‖e^{(k)}‖_A, k = 0, 1, . . . ).

Property 4.2 Let A = P − N with A symmetric and positive definite. If the matrix P + P^T − A is positive definite, then P is invertible, the iterative method defined in (4.7) is monotonically convergent with respect to the norm ‖·‖_A, and ρ(B) ≤ ‖B‖_A < 1.

4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods

In this section we consider some classical linear iterative methods.

If the diagonal entries of A are nonzero, we can single out in each equation the corresponding unknown, obtaining the equivalent linear system

x_i = (1/a_ii) [ b_i − Σ_{j=1, j≠i}^{n} a_ij x_j ],   i = 1, . . . , n.   (4.9)

In the Jacobi method, once an arbitrary initial guess x^{(0)} has been chosen, x^{(k+1)} is computed by the formulae

x_i^{(k+1)} = (1/a_ii) [ b_i − Σ_{j=1, j≠i}^{n} a_ij x_j^{(k)} ],   i = 1, . . . , n.   (4.10)

This amounts to performing the following splitting for A:

P = D,   N = D − A = E + F,

where D is the diagonal matrix of the diagonal entries of A, E is the lower triangular matrix of entries e_ij = −a_ij if i > j, e_ij = 0 if i ≤ j, and F is the upper triangular matrix of entries f_ij = −a_ij if j > i, f_ij = 0 if j ≤ i. As a consequence, A = D − (E + F).

The iteration matrix of the Jacobi method is thus given by

B_J = D^{−1}(E + F) = I − D^{−1}A.   (4.11)
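A minimal Jacobi sketch (test matrix of my choosing, assumed strictly diagonally dominant so that the method converges): the componentwise update (4.10) is applied, and the identity B_J = I − D^{−1}A of (4.11) is checked numerically.

```python
import numpy as np

def jacobi(A, b, x0, nsteps):
    """Componentwise Jacobi update (4.10)."""
    d = np.diag(A)
    x = x0.astype(float).copy()
    for _ in range(nsteps):
        x = (b - (A @ x - d * x)) / d    # A@x - d*x is the off-diagonal part only
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])

D = np.diag(np.diag(A))
BJ = np.eye(3) - np.linalg.solve(D, A)   # B_J = I - D^{-1} A  (4.11)
x = jacobi(A, b, np.zeros(3), 80)
```

Note that all components of x^{(k+1)} are computed from x^{(k)} only, so the sweep is trivially parallel across i.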

A generalization of the Jacobi method is the over-relaxation method (or JOR), in which, having introduced a relaxation parameter ω, (4.10) is replaced by

x_i^{(k+1)} = (ω/a_ii) [ b_i − Σ_{j=1, j≠i}^{n} a_ij x_j^{(k)} ] + (1 − ω) x_i^{(k)},   i = 1, . . . , n.


The corresponding iteration matrix is

B_{Jω} = ωB_J + (1 − ω)I.   (4.12)

In the form (4.7), the JOR method corresponds to

x^{(k+1)} = x^{(k)} + ωD^{−1} r^{(k)}.

This method is consistent for any ω ≠ 0, and for ω = 1 it coincides with the Jacobi method.
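A quick numerical check (example matrix assumed, not taken from the text) that the residual form x^{(k+1)} = x^{(k)} + ωD^{−1}r^{(k)} has iteration matrix B_{Jω} = ωB_J + (1 − ω)I, and that it reduces to the Jacobi method at ω = 1:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 5.0]])
D = np.diag(np.diag(A))
BJ = np.eye(2) - np.linalg.solve(D, A)   # Jacobi iteration matrix (4.11)

def jor_matrix(omega):
    """Iteration matrix of x^(k+1) = x^(k) + omega * D^{-1} r^(k)."""
    return np.eye(2) - omega * np.linalg.solve(D, A)

omega = 0.7
B_jor = jor_matrix(omega)                # should equal omega*BJ + (1-omega)*I, cf. (4.12)
```
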

The Gauss-Seidel method differs from the Jacobi method in the fact that at the (k + 1)-th step the available values of x_i^{(k+1)} are used to update the solution, so that, instead of (4.10), one has

x_i^{(k+1)} = (1/a_ii) [ b_i − Σ_{j=1}^{i−1} a_ij x_j^{(k+1)} − Σ_{j=i+1}^{n} a_ij x_j^{(k)} ],   i = 1, . . . , n.   (4.13)

This method amounts to performing the following splitting for A:

P = D − E,   N = F,

and the associated iteration matrix is

B_GS = (D − E)^{−1} F.   (4.14)
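A sketch of the Gauss-Seidel sweep (4.13), on an assumed SPD test matrix of my choosing: within each sweep the freshly computed components x_j^{(k+1)}, j < i, are used immediately, which is why the update is written in place.

```python
import numpy as np

def gauss_seidel(A, b, x0, nsteps):
    """In-place sweep: new values x_j, j < i, are used as soon as available (4.13)."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(nsteps):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])
x = gauss_seidel(A, b, np.zeros(3), 40)

# Iteration matrix (4.14): B_GS = (D - E)^{-1} F, built from the splitting.
E = -np.tril(A, -1)
F = -np.triu(A, 1)
D = np.diag(np.diag(A))
BGS = np.linalg.solve(D - E, F)
```

Unlike Jacobi, the sweep is inherently sequential in i, since x[i] depends on the already-updated x[:i].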

Starting from the Gauss-Seidel method, in analogy to what was done for the Jacobi iterations, we introduce the successive over-relaxation method (or SOR method)

x_i^{(k+1)} = (ω/a_ii) [ b_i − Σ_{j=1}^{i−1} a_ij x_j^{(k+1)} − Σ_{j=i+1}^{n} a_ij x_j^{(k)} ] + (1 − ω) x_i^{(k)},   (4.15)

for i = 1, . . . , n. The method (4.15) can be written in vector form as

(I − ωD^{−1}E) x^{(k+1)} = [(1 − ω)I + ωD^{−1}F] x^{(k)} + ωD^{−1} b,   (4.16)

from which the iteration matrix is

B(ω) = (I − ωD^{−1}E)^{−1} [(1 − ω)I + ωD^{−1}F].   (4.17)

Multiplying by D both sides of (4.16) and recalling that A = D − (E + F) yields the following form (4.7) of the SOR method:

x^{(k+1)} = x^{(k)} + ( (1/ω) D − E )^{−1} r^{(k)}.

It is consistent for any ω ≠ 0, and for ω = 1 it coincides with the Gauss-Seidel method. In particular, if ω ∈ (0, 1) the method is called under-relaxation, while if ω > 1 it is called over-relaxation.
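An SOR sketch following (4.15), on the same kind of assumed SPD test problem as above; at ω = 1 the relaxed update reduces to the Gauss-Seidel sweep.

```python
import numpy as np

def sor(A, b, x0, omega, nsteps):
    """Relaxed sweep (4.15): Gauss-Seidel update blended with the old value."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(nsteps):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = omega * (b[i] - s) / A[i, i] + (1.0 - omega) * x[i]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])
x_sor = sor(A, b, np.zeros(3), 1.2, 60)   # over-relaxation, omega = 1.2
```

Since this A is symmetric positive definite, Property 4.3 below guarantees convergence for any ω in (0, 2).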


4.2.2 Convergence Results for Jacobi and Gauss-Seidel Methods

There exist special classes of matrices for which it is possible to state a priori some convergence results for the methods examined in the previous section. The first result in this direction is the following.

Theorem 4.2 If A is a strictly diagonally dominant matrix by rows, the Jacobi and Gauss-Seidel methods are convergent.

Proof. Let us prove the part of the theorem concerning the Jacobi method, while for the Gauss-Seidel method we refer to [Axe94]. Since A is strictly diagonally dominant by rows, |a_ii| > Σ_{j=1, j≠i}^{n} |a_ij| for i = 1, . . . , n. As a consequence,

‖B_J‖_∞ = max_{i=1,...,n} Σ_{j=1, j≠i}^{n} |a_ij| / |a_ii| < 1,

so that the Jacobi method is convergent. □
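The bound in the proof is easy to check numerically on an assumed strictly row-diagonally-dominant matrix of my choosing: ‖B_J‖_∞ equals the largest row sum Σ_{j≠i} |a_ij|/|a_ii| and sits below one.

```python
import numpy as np

A = np.array([[5.0, 2.0, 1.0],
              [1.0, 4.0, 2.0],
              [2.0, 1.0, 6.0]])

# Row ratios sum_{j != i} |a_ij| / |a_ii|; all < 1 means strict dominance by rows.
off = np.abs(A) - np.diag(np.abs(np.diag(A)))
row_ratios = off.sum(axis=1) / np.abs(np.diag(A))
assert (row_ratios < 1).all()

D = np.diag(np.diag(A))
BJ = np.eye(3) - np.linalg.solve(D, A)
norm_inf = np.abs(BJ).sum(axis=1).max()   # ||B_J||_inf, the maximum absolute row sum
```
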

Theorem 4.3 If A and 2D − A are symmetric and positive definite matrices, then the Jacobi method is convergent and ρ(B_J) = ‖B_J‖_A = ‖B_J‖_D.

Proof. The theorem follows from Property 4.1 taking P = D. □

In the case of the JOR method, the assumption on 2D − A can be removed, yielding the following result.

Theorem 4.4 If A is symmetric positive definite, then the JOR method is convergent if 0 < ω < 2/ρ(D^{−1}A).

Proof. The result immediately follows from (4.12), noting that A has real positive eigenvalues. □

Concerning the Gauss-Seidel method, the following result holds.

Theorem 4.5 If A is symmetric positive definite, the Gauss-Seidel method is monotonically convergent with respect to the norm ‖·‖_A.

Proof. We can apply Property 4.2 to the matrix P = D − E, upon checking that P + P^T − A is positive definite. Indeed

P + P^T − A = 2D − E − F − A = D,

having observed that (D − E)^T = D − F. We conclude by noticing that D is positive definite, since it is the diagonal of A. □

Finally, if A is positive definite and tridiagonal, it can be shown that the Jacobi method is also convergent and

ρ(B_GS) = ρ²(B_J).   (4.18)

In this case, the Gauss-Seidel method is more rapidly convergent than the Jacobi method. Relation (4.18) holds even if A enjoys the following A-property.

Definition 4.3 A consistently ordered matrix M ∈ R^{n×n} (that is, a matrix such that αD^{−1}E + α^{−1}D^{−1}F, for α ≠ 0, has eigenvalues that do not depend on α, where M = D − E − F, D = diag(m_11, . . . , m_nn), and E and F are strictly lower and upper triangular matrices, respectively) enjoys the A-property if it can be partitioned in the 2 × 2 block form

M = [ D̃1  M12 ]
    [ M21  D̃2 ],

where D̃1 and D̃2 are diagonal matrices.
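Relation (4.18) can be checked numerically on an assumed SPD tridiagonal matrix, here the classical tridiag(−1, 2, −1) (my choice of test matrix, not one from the text):

```python
import numpy as np

n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1), SPD

D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)

BJ = np.linalg.solve(D, E + F)           # Jacobi iteration matrix (4.11)
BGS = np.linalg.solve(D - E, F)          # Gauss-Seidel iteration matrix (4.14)

rho_j = max(abs(np.linalg.eigvals(BJ)))
rho_gs = max(abs(np.linalg.eigvals(BGS)))
# For this matrix rho(B_J) = cos(pi/(n+1)), and (4.18) gives rho(B_GS) = rho(B_J)^2.
```
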

When dealing with general matrices, no a priori conclusions on the convergence properties of the Jacobi and Gauss-Seidel methods can be drawn, as shown in Example 4.2.

Example 4.2 Consider the 3 × 3 linear systems of the form A_i x = b_i, where b_i is always taken in such a way that the solution of the system is the unit vector, and the matrices A_i are

A1 = [ 3 0 4; 7 4 2; −1 1 2 ],   A2 = [ −3 3 −6; −4 7 −8; 5 7 −9 ],

A3 = [ 4 1 1; 2 −9 0; 0 −8 −6 ],   A4 = [ 7 6 9; 4 5 −4; −7 −3 8 ].

It can be checked that the Jacobi method fails to converge for A1 (ρ(B_J) = 1.33), while the Gauss-Seidel scheme is convergent. Conversely, in the case of A2, the Jacobi method is convergent, while the Gauss-Seidel method fails to converge (ρ(B_GS) = 1.1̄). In the remaining two cases, the Jacobi method is more slowly convergent than the Gauss-Seidel method for matrix A3 (ρ(B_J) = 0.44 against ρ(B_GS) = 0.018), and the converse is true for A4 (ρ(B_J) = 0.64 while ρ(B_GS) = 0.77). •

We conclude the section with the following result.

Theorem 4.6 If the Jacobi method is convergent, then the JOR method converges if 0 < ω ≤ 1.

Proof. From (4.12) we obtain that the eigenvalues of B_{Jω} are

μ_k = ωλ_k + 1 − ω,   k = 1, . . . , n,

where λ_k are the eigenvalues of B_J. Then, recalling the Euler formula for the representation of a complex number, we let λ_k = r_k e^{iθ_k} and get

|μ_k|² = ω²r_k² + 2ωr_k cos(θ_k)(1 − ω) + (1 − ω)² ≤ (ωr_k + 1 − ω)²,

which is less than 1 if 0 < ω ≤ 1. □

4.2.3 Convergence Results for the Relaxation Method

The following result provides a necessary condition on ω in order for the SOR method to be convergent.

Theorem 4.7 For any ω ∈ R we have ρ(B(ω)) ≥ |ω − 1|; therefore, the SOR method fails to converge if ω ≤ 0 or ω ≥ 2.

Proof. If {λ_i} denote the eigenvalues of the SOR iteration matrix, then

|∏_{i=1}^{n} λ_i| = |det[ (1 − ω)I + ωD^{−1}F ]| = |1 − ω|^n,

since det(I − ωD^{−1}E) = 1. Therefore, at least one eigenvalue λ_i must exist such that |λ_i| ≥ |1 − ω| and thus, in order for convergence to hold, we must have |1 − ω| < 1, that is 0 < ω < 2. □
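The lower bound of Theorem 4.7 is easy to observe numerically (test matrix assumed, chosen SPD): for a range of ω, ρ(B(ω)) never drops below |ω − 1|.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])
D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)
Dinv = np.linalg.inv(D)

def sor_matrix(omega):
    """B(omega) = (I - omega D^{-1}E)^{-1} [(1 - omega) I + omega D^{-1}F]  (4.17)"""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - omega * Dinv @ E,
                           (1.0 - omega) * np.eye(n) + omega * Dinv @ F)

# The spectral radius stays above |omega - 1| for every omega tried.
for omega in (0.2, 0.8, 1.0, 1.5, 1.9):
    rho = max(abs(np.linalg.eigvals(sor_matrix(omega))))
    assert rho >= abs(omega - 1.0) - 1e-9
```
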

Assuming that A is symmetric and positive definite, the condition 0 < ω < 2, besides being necessary, becomes also sufficient for convergence. Indeed, the following result holds (for the proof, see [Hac94]).

Property 4.3 (Ostrowski) If A is symmetric and positive definite, then the SOR method is convergent iff 0 < ω < 2. Moreover, its convergence is monotone with respect to ‖·‖_A.

Finally, if A is strictly diagonally dominant by rows, the SOR method converges if 0 < ω ≤ 1.

The results above show that the SOR method is more or less rapidly convergent, depending on the choice of the relaxation parameter ω. The question of how to determine the value ω_opt for which the convergence rate is the highest possible can be given a satisfactory answer only in special cases (see, for instance, [Axe94], [You71], [Var62] or [Wac66]). Here we limit ourselves to quoting the following result (whose proof is in [Axe94]).

Property 4.4 If the matrix A enjoys the A-property and if B_J has real eigenvalues, then the SOR method converges for any choice of x^{(0)} iff ρ(B_J) < 1 and 0 < ω < 2. Moreover,

ω_opt = 2 / ( 1 + √(1 − ρ(B_J)²) ),   (4.19)

and the corresponding asymptotic convergence factor is

ρ(B(ω_opt)) = ( 1 − √(1 − ρ(B_J)²) ) / ( 1 + √(1 − ρ(B_J)²) ).
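For the model matrix tridiag(−1, 2, −1) (my choice; it is consistently ordered and B_J has real eigenvalues), formula (4.19) gives ω_opt = 4/3 and predicts ρ(B(ω_opt)) = 1/3, which can be confirmed numerically:

```python
import numpy as np

n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)
Dinv = np.linalg.inv(D)

rho_j = max(abs(np.linalg.eigvals(np.linalg.solve(D, E + F))))   # = cos(pi/6)
omega_opt = 2.0 / (1.0 + np.sqrt(1.0 - rho_j**2))                # formula (4.19)

# SOR iteration matrix (4.17) at omega_opt and its spectral radius.
B = np.linalg.solve(np.eye(n) - omega_opt * Dinv @ E,
                    (1.0 - omega_opt) * np.eye(n) + omega_opt * Dinv @ F)
rho_opt = max(abs(np.linalg.eigvals(B)))
```

At ω_opt the relevant eigenvalues coalesce, so the computed spectral radius should be compared with 1/3 only up to a loose tolerance.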

4.2.4 A priori Forward Analysis

In the previous analysis we have neglected the rounding errors. However, as shown in the following example (taken from [HW76]), they can dramatically affect the convergence rate of the iterative method.

Example 4.3 Let A be a lower bidiagonal matrix of order 100 with entries a_ii = 1.5 and a_{i,i−1} = 1, and let b ∈ R^100 be the right-hand side with b_i = 2.5. The exact solution of the system Ax = b has components x_i = 1 − (−2/3)^i. The SOR method with ω = 1.5 should be convergent, working in exact arithmetic, since ρ(B(1.5)) = 0.5 (far below one). However, running Program 16 with x^{(0)} = fl(x) + ε_M, which is extremely close to the exact value, the sequence x^{(k)} diverges and after 100 iterations the algorithm yields a solution with ‖x^{(100)}‖_∞ ≃ 10^{13}. The flaw is due to rounding error propagation and must not be ascribed to a possible ill-conditioning of the matrix, since K_∞(A) ≃ 5. •
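The matrix of Example 4.3 can be inspected directly. The small sketch below (an editorial illustration, not the text's Program 16) confirms ρ(B(1.5)) = 0.5 and hints at why rounding errors get amplified: B(1.5) is strongly nonnormal, with ‖B‖_∞ far larger than ρ(B).

```python
import numpy as np

n = 100
A = 1.5 * np.eye(n) + np.eye(n, k=-1)     # lower bidiagonal: a_ii = 1.5, a_{i,i-1} = 1
D = np.diag(np.diag(A))
E = -np.tril(A, -1)                        # F = 0, since A is lower triangular
omega = 1.5

# SOR iteration matrix (4.17) with F = 0: B = (I - omega D^{-1}E)^{-1} (1 - omega) I.
Dinv = np.linalg.inv(D)
B = np.linalg.solve(np.eye(n) - omega * Dinv @ E, (1.0 - omega) * np.eye(n))

rho = max(abs(np.diag(B)))                 # B is lower triangular: eigenvalues on diagonal
norm_inf = np.abs(B).sum(axis=1).max()     # much larger than rho: strong nonnormality
```

Here every entry of the lower triangle of B has magnitude 0.5, so ‖B‖_∞ = 50 while ρ(B) = 0.5: powers of B undergo a huge transient growth before decaying, which is the mechanism that amplifies the rounding errors in the example.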

To account for rounding errors, let us denote by x^{(k)} the solution (in finite arithmetic) generated by an iterative method of the form (4.6) after k steps. Due to rounding errors, x^{(k)} can be regarded as the exact solution to the problem

P x^{(k+1)} = N x^{(k)} + b − ζ_k,   (4.20)

with

ζ_k = δP_{k+1} x^{(k+1)} − g_k.

The matrix δP_{k+1} accounts for the rounding errors in the solution of (4.6), while the vector g_k includes the errors made in the evaluation of N x^{(k)} + b. From (4.20), we obtain

x^{(k+1)} = B^{k+1} x^{(0)} + Σ_{j=0}^{k} B^j P^{−1} (b − ζ_{k−j}),

and for the absolute error e^{(k+1)} = x − x^{(k+1)},

e^{(k+1)} = B^{k+1} e^{(0)} + Σ_{j=0}^{k} B^j P^{−1} ζ_{k−j}.