is also valid in general for quasi-uniform triangulations.

5.1.4 SOR and Block-Iteration Methods

We assume again that A is a general nonsingular matrix. For the relaxation
of the Gauss–Seidel method we use it in the form

\[ D x^{(k+1)} = -L x^{(k+1)} - R x^{(k)} + b , \]

instead of the resolved form (5.20).

The relaxed method is then

\[ D x^{(k+1)} = \omega\bigl(-L x^{(k+1)} - R x^{(k)} + b\bigr) + (1-\omega)\, D x^{(k)} \tag{5.36} \]

with a relaxation parameter ω > 0. This is equivalent to

\[ (D + \omega L)\, x^{(k+1)} = \bigl(-\omega R + (1-\omega) D\bigr)\, x^{(k)} + \omega b . \tag{5.37} \]

Hence

\[ M_\omega := (D + \omega L)^{-1}\bigl(-\omega R + (1-\omega) D\bigr) , \qquad N_\omega := (D + \omega L)^{-1}\,\omega . \]

In the application to discretizations of boundary value problems we normally
choose ω > 1, which means overrelaxation. This explains the name of the
SOR method as an abbreviation of successive overrelaxation. The effort to
execute an iteration step is hardly higher than for the Gauss–Seidel method:
although we have to add 3m operations for the evaluation of the right-hand
side of (5.36), the forward substitution to solve the auxiliary system of
equations in (5.37) is already part of the form (5.36).
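The componentwise form of one step (5.36)/(5.37) can be sketched as follows; a minimal dense NumPy sketch for illustration only, on hypothetical data (a practical implementation would exploit the sparsity of A and visit only its nonzero entries):

```python
import numpy as np

def sor_sweep(A, b, x, omega):
    """One SOR step in the componentwise form of (5.36)/(5.37).

    Dense sketch for illustration only; a practical implementation
    would exploit the sparsity of A.
    """
    x_new = x.copy()
    for i in range(len(b)):
        # entries j < i come from the new iterate (L-part, forward
        # substitution), entries j > i from the old one (R-part)
        sigma = A[i, :i] @ x_new[:i] + A[i, i + 1:] @ x[i + 1:]
        gs_value = (b[i] - sigma) / A[i, i]        # Gauss-Seidel value
        x_new[i] = omega * gs_value + (1 - omega) * x[i]
    return x_new

# hypothetical 2x2 test problem; omega = 1 recovers Gauss-Seidel
A = np.array([[4.0, -1.0], [-1.0, 4.0]])
b = np.array([3.0, 3.0])
x = np.zeros(2)
for _ in range(25):
    x = sor_sweep(A, b, x, omega=1.05)
print(x)  # approximately (1, 1), the exact solution
```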

The calculation of the optimal ω is more difficult here, because M_ω depends
nonlinearly on ω. Only for special classes of matrices can the optimal ω
minimizing ϱ(M_ω) be calculated explicitly in dependence on ϱ(M_1), the
convergence rate of the (nonrelaxed) Gauss–Seidel method. Before we
sketch this, we will look at some further variants of this procedure:

The matrix N_ω is nonsymmetric even for symmetric A. One gets a sym-
metric N_ω if after one SOR step another one is performed in which the
indices are run through in reverse order m, m−1, ..., 2, 1, which means
that L and R are exchanged. The two half steps

\[ D x^{(k+1/2)} = \omega\bigl(-L x^{(k+1/2)} - R x^{(k)} + b\bigr) + (1-\omega)\, D x^{(k)} , \]
\[ D x^{(k+1)} = \omega\bigl(-L x^{(k+1/2)} - R x^{(k+1)} + b\bigr) + (1-\omega)\, D x^{(k+1/2)} , \]


make up one step of the symmetric SOR, the SSOR method for short. A

special case is the symmetric Gauss–Seidel method for ω = 1.

We write down the procedure for symmetric A, i.e., R = L^T in the form
(5.6), in which the symmetry of N becomes obvious:

\[ M = \bigl(D + \omega L^T\bigr)^{-1}\bigl((1-\omega)D - \omega L\bigr)\,(D + \omega L)^{-1}\bigl((1-\omega)D - \omega L^T\bigr) , \]
\[ N = \omega(2-\omega)\,\bigl(D + \omega L^T\bigr)^{-1}\, D\,(D + \omega L)^{-1} . \tag{5.38} \]

The effort for SSOR is only slightly higher than for SOR if the vectors
already calculated in the half steps are stored and used again, as for
example L x^{(k+1/2)}.
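One SSOR step, i.e., the two half steps above in sequence, might look as follows; again a minimal dense sketch (the saving from reusing L x^{(k+1/2)} just mentioned is omitted for clarity):

```python
import numpy as np

def ssor_step(A, b, x, omega):
    """One SSOR step: a forward SOR sweep followed by a backward sweep
    (indices in reverse order, i.e., L and R exchanged). Dense sketch.
    """
    x = x.copy()
    m = len(b)
    for i in range(m):                    # forward half step
        sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] += omega * ((b[i] - sigma) / A[i, i] - x[i])
    for i in reversed(range(m)):          # backward half step
        sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] += omega * ((b[i] - sigma) / A[i, i] - x[i])
    return x
```

Because the sweeps update x in place, the forward loop uses new values for the already-visited indices j < i and the backward loop for j > i, which is exactly the exchange of L and R described above.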

Other variants of these procedures are created if the procedures are not
applied to the matrix itself but to a block partitioning

\[ A = (A_{ij})_{i,j} \quad\text{with } A_{ij} \in \mathbb{R}^{m_i, m_j} , \quad i, j = 1, \ldots, p , \tag{5.39} \]

with \sum_{i=1}^{p} m_i = m. As an example we get the block-Jacobi method, which
is analogous to (5.19) and has the form

\[ \xi_i^{(k+1)} = A_{ii}^{-1}\Bigl(-\sum_{j=1}^{i-1} A_{ij}\,\xi_j^{(k)} - \sum_{j=i+1}^{p} A_{ij}\,\xi_j^{(k)} + \beta_i\Bigr) \quad\text{for all } i = 1, \ldots, p . \tag{5.40} \]

Here x = (ξ_1^T, ..., ξ_p^T)^T and b = (β_1^T, ..., β_p^T)^T, respectively, are
corresponding partitions of the vectors. By exchanging ξ_j^{(k)} with ξ_j^{(k+1)}
in the first sum one gets the block-Gauss–Seidel method and then in the
same way the relaxed variants. The iteration (5.40) includes p vector
equations. For

each of them we have to solve a system of equations with system matrix
A_ii. To get an advantage compared to the pointwise method, a much lower
effort should be necessary than for the solution of the total system. This
can require (if at all possible) a rearranging of the variables and equations.
The necessary permutations will not be noted explicitly here. Such methods
are applied in finite difference methods or other methods with structured
grids (see Section 4.1) if an ordering of nodes is possible such that the
matrices A_ii are diagonal or tridiagonal and therefore the systems of
equations are solvable with O(m_i) operations.
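A sketch of one block-Jacobi step (5.40), with the blocks given as index sets; here each block system is solved by a generic dense solver, whereas in the applications just described the A_ii would be diagonal or tridiagonal and solvable in O(m_i) operations:

```python
import numpy as np

def block_jacobi_step(A, b, x, blocks):
    """One block-Jacobi step (5.40); `blocks` is a list of index arrays
    S_1, ..., S_p partitioning {0, ..., m-1}. Dense illustrative sketch.
    """
    m = len(b)
    x_new = x.copy()
    for S in blocks:
        S = np.asarray(S)
        others = np.setdiff1d(np.arange(m), S)
        # couplings to all other blocks use the old iterate x
        rhs = b[S] - A[np.ix_(S, others)] @ x[others]
        x_new[S] = np.linalg.solve(A[np.ix_(S, S)], rhs)
    return x_new
```

Using the current iterate x_new instead of x in the coupling term turns this into the block-Gauss–Seidel method, corresponding to the exchange of ξ_j^{(k)} with ξ_j^{(k+1)} described above.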

As an example we again discuss the five-point stencil discretization of
the Poisson equation on a square with n + 1 nodes per space dimension.
The matrix A then has the form (1.14) with l = m = n. If the nodes are
numbered rowwise and we choose one block for each line, which means
p = n − 1 and m_i = n − 1 for all i = 1, ..., p, then the matrices A_ii are
tridiagonal. On the other hand, if one chooses a partition of the indices of
the nodes in subsets S_i such that a node with index in S_i has neighbours
only in other index sets, then for such a selection and arbitrary ordering
within the index sets the matrices A_ii become diagonal. Neighbours here
denote the nodes within a difference stencil or, more generally, those nodes


\[ \begin{pmatrix}
 4 & -1 &  0 & -1 &  0 &  0 &  0 &  0 &  0 \\
-1 &  4 & -1 &  0 & -1 &  0 &  0 &  0 &  0 \\
 0 & -1 &  4 &  0 &  0 & -1 &  0 &  0 &  0 \\
-1 &  0 &  0 &  4 & -1 &  0 & -1 &  0 &  0 \\
 0 & -1 &  0 & -1 &  4 & -1 &  0 & -1 &  0 \\
 0 &  0 & -1 &  0 & -1 &  4 &  0 &  0 & -1 \\
 0 &  0 &  0 & -1 &  0 &  0 &  4 & -1 &  0 \\
 0 &  0 &  0 &  0 & -1 &  0 & -1 &  4 & -1 \\
 0 &  0 &  0 &  0 &  0 & -1 &  0 & -1 &  4
\end{pmatrix} \]

m = 3 × 3: rowwise ordering.

\[ \begin{pmatrix}
 4 &  0 &  0 &  0 &  0 & -1 & -1 &  0 &  0 \\
 0 &  4 &  0 &  0 &  0 & -1 &  0 & -1 &  0 \\
 0 &  0 &  4 &  0 &  0 & -1 & -1 & -1 & -1 \\
 0 &  0 &  0 &  4 &  0 &  0 & -1 &  0 & -1 \\
 0 &  0 &  0 &  0 &  4 &  0 &  0 & -1 & -1 \\
-1 & -1 & -1 &  0 &  0 &  4 &  0 &  0 &  0 \\
-1 &  0 & -1 & -1 &  0 &  0 &  4 &  0 &  0 \\
 0 & -1 & -1 &  0 & -1 &  0 &  0 &  4 &  0 \\
 0 &  0 & -1 & -1 & -1 &  0 &  0 &  0 &  4
\end{pmatrix} \]

red-black ordering:
red: nodes 1, 3, 5, 7, 9 from the rowwise ordering;
black: nodes 2, 4, 6, 8 from the rowwise ordering.

Figure 5.2. Comparison of orderings.

that contribute to the corresponding row of the discretization matrix. In
the example of the five-point stencil, starting with rowwise numbering, one
can combine all odd indices to a block S_1 (the "red" nodes) and all even
indices to a block S_2 (the "black" nodes). Here we have p = 2. We call this
a red-black ordering (see Figure 5.2). If two "colours" are not sufficient, one
can choose p > 2.
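For the five-point stencil the red-black partition can be generated directly from the grid coordinates; a small sketch (grid and rowwise numbering as in Figure 5.2, with 0-based indices shifted to 1-based in the printout):

```python
import numpy as np

def red_black(n):
    """Red-black partition of the (n-1) x (n-1) grid of unknowns of the
    five-point stencil, numbered rowwise: a node is red if the sum of
    its two grid coordinates is even, black otherwise.
    """
    idx = np.arange((n - 1) ** 2)
    i, j = np.divmod(idx, n - 1)        # grid coordinates of each node
    return idx[(i + j) % 2 == 0], idx[(i + j) % 2 == 1]

red, black = red_black(4)               # the 3 x 3 grid of Figure 5.2
print(red + 1, black + 1)               # [1 3 5 7 9] [2 4 6 8]
```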

We return to the SOR method and its convergence: In the following the
iteration matrix will be denoted by M_SOR(ω) with the relaxation parameter
ω. Likewise, M_J and M_GS are the iteration matrices of Jacobi's and the
Gauss–Seidel method, respectively. General propositions are summarized
in the following theorem:


Theorem 5.6 (of Kahan; Ostrowski and Reich)

(1) ϱ(M_SOR(ω)) ≥ |1 − ω| for ω ≠ 0.

(2) If A is symmetric and positive definite, then

    ϱ(M_SOR(ω)) < 1    for ω ∈ (0, 2) .

Proof: See [16, pp. 91 f.]. □

Therefore, we use only ω ∈ (0, 2). For a useful procedure we need more
information about the optimal relaxation parameter ω_opt, given by

\[ \varrho\bigl(M_{SOR(\omega_{\mathrm{opt}})}\bigr) = \min_{0 < \omega < 2} \varrho\bigl(M_{SOR(\omega)}\bigr) , \]

and about the size of the contraction number. This is possible only if the
ordering of equations and unknowns has certain properties:

Definition 5.7 A matrix A ∈ R^{m,m} is consistently ordered if for the
partition (5.18), D is nonsingular and

\[ C(\alpha) := \alpha^{-1} D^{-1} L + \alpha\, D^{-1} R \]

has eigenvalues independent of α for α ∈ ℂ \ {0}.

There is a connection to the possibility of a multi-colour ordering, because
a matrix in the block form (5.39) is consistently ordered if it is block-
tridiagonal (i.e., A_ij = 0 for |i − j| > 1) and the diagonal blocks A_ii are
nonsingular diagonal matrices (see [28, pp. 114 f.]).
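For small examples Definition 5.7 can be checked numerically. The following sketch builds the 9 × 9 matrix of Figure 5.2, permutes it to red-black form (block-tridiagonal with nonsingular diagonal blocks A_ii, hence consistently ordered by the criterion just stated), and compares the spectra of C(α) for several α:

```python
import numpy as np

def five_point(n):
    """Five-point stencil matrix for the (n-1) x (n-1) grid, rowwise."""
    N = (n - 1) ** 2
    A = np.zeros((N, N))
    for i in range(N):
        A[i, i] = 4.0
        if i % (n - 1) != 0:       A[i, i - 1] = -1.0        # left
        if (i + 1) % (n - 1) != 0: A[i, i + 1] = -1.0        # right
        if i >= n - 1:             A[i, i - (n - 1)] = -1.0  # below
        if i < N - (n - 1):        A[i, i + (n - 1)] = -1.0  # above
    return A

A = five_point(4)                        # the 9x9 matrix of Figure 5.2
perm = [0, 2, 4, 6, 8, 1, 3, 5, 7]       # red-black permutation
A = A[np.ix_(perm, perm)]                # block-tridiagonal, A_ii diagonal

D, L, R = np.diag(np.diag(A)), np.tril(A, -1), np.triu(A, 1)
for alpha in (0.5, 2.0, 1.0 + 1.0j):
    C = np.linalg.solve(D, L) / alpha + alpha * np.linalg.solve(D, R)
    print(np.sort(np.linalg.eigvals(C).real.round(10)))
# the same spectrum is printed for every alpha
```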

In the case of a consistently ordered matrix one can prove a relation
between the eigenvalues of M_J, M_GS, and M_SOR(ω). From this we can see
how much faster the Gauss–Seidel method converges than Jacobi's method:

Theorem 5.8 If A is consistently ordered, then

    ϱ(M_J)² = ϱ(M_GS) .

Proof: For a special case see Remark 5.5.2 in [16]. □

Due to (5.4) we can expect a halving of the number of iteration steps,
but this does not change the asymptotic statement (5.27).
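A quick numerical confirmation of Theorem 5.8 on a small consistently ordered example, the tridiagonal three-point stencil matrix (block-tridiagonal with 1 × 1 diagonal blocks):

```python
import numpy as np

m = 20
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
D, L, R = np.diag(np.diag(A)), np.tril(A, -1), np.triu(A, 1)

M_J = np.linalg.solve(D, -(L + R))       # Jacobi iteration matrix
M_GS = np.linalg.solve(D + L, -R)        # Gauss-Seidel iteration matrix

rho = lambda M: max(abs(np.linalg.eigvals(M)))
print(rho(M_J) ** 2, rho(M_GS))          # both ~0.9778 for m = 20
```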

Finally, in the case that Jacobi's method converges the following theorem
holds:

Theorem 5.9 Let A be consistently ordered with nonsingular diagonal ma-
trix D, the eigenvalues of M_J being real and β := ϱ(M_J) < 1. Then we have
for the SOR method:

\[ (1)\quad \omega_{\mathrm{opt}} = \frac{2}{1 + (1 - \beta^2)^{1/2}} , \]


\[ (2)\quad \varrho\bigl(M_{SOR(\omega)}\bigr) = \begin{cases} 1 - \omega + \tfrac{1}{2}\omega^2\beta^2 + \omega\beta\bigl(1 - \omega + \tfrac{1}{4}\omega^2\beta^2\bigr)^{1/2} & \text{for } 0 < \omega < \omega_{\mathrm{opt}} , \\[2pt] \omega - 1 & \text{for } \omega_{\mathrm{opt}} \le \omega < 2 , \end{cases} \]

\[ (3)\quad \varrho\bigl(M_{SOR(\omega_{\mathrm{opt}})}\bigr) = \frac{\beta^2}{\bigl(1 + (1 - \beta^2)^{1/2}\bigr)^2} . \]

Proof: See [18, p. 216]. □

Figure 5.3. Dependence of ϱ(M_SOR(ω)) on ω.

If ϱ(M_J) is known for Jacobi's method, then ω_opt can be calculated. This
is the case in the example of the five-point stencil discretization on a square:
From (5.26) and Theorem 5.9 it follows that

\[ \varrho(M_{GS}) = \cos^2\frac{\pi}{n} = 1 - \frac{\pi^2}{n^2} + O(n^{-4}) ; \]

hence

\[ \omega_{\mathrm{opt}} = 2 \Big/ \Bigl(1 + \sin\frac{\pi}{n}\Bigr) , \qquad \varrho\bigl(M_{SOR(\omega_{\mathrm{opt}})}\bigr) = \omega_{\mathrm{opt}} - 1 = 1 - \frac{2\pi}{n} + O(n^{-2}) . \]
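These quantities are easily evaluated; the following sketch also confirms that 2/(1 + sin(π/n)) agrees with the formula of Theorem 5.9 (1) for β = cos(π/n):

```python
import math

n = 32
beta = math.cos(math.pi / n)                   # rho(M_J), from (5.26)
omega_opt = 2.0 / (1.0 + math.sin(math.pi / n))
# agrees with Theorem 5.9 (1), since sin(pi/n) = (1 - beta^2)^(1/2):
assert abs(omega_opt - 2.0 / (1.0 + math.sqrt(1.0 - beta**2))) < 1e-14
print(beta**2, omega_opt, omega_opt - 1.0)
# -> ~0.9904, ~1.8215, ~0.8215
#    (rho(M_GS), omega_opt, rho(M_SOR(omega_opt)) for n = 32)
```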

Therefore, the optimal SOR method has a lower complexity than all

methods described up to now.

Correspondingly, the number of operations to reach the relative error
level ε > 0 is reduced to ln(1/ε) O(m^{3/2}) operations, in comparison to
ln(1/ε) O(m²) operations for the previous procedures.

Table 5.1 gives an impression of the convergence for the model problem.
It displays the theoretically expected numbers of iterations of the
Gauss–Seidel method (m_GS) as well as of the SOR method with optimal
relaxation parameter (m_SOR). Here we use the very moderate termination
criterion ε = 10^{-3}, measured in the Euclidean norm.

  n    m_GS   m_SOR
  8      43       8
 16     178      17
 32     715      35
 64    2865      70
128   11466     140
256   45867     281

Table 5.1. Gauss–Seidel and optimal SOR method for the model problem.
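The numbers in Table 5.1 can be reproduced (up to rounding) from the contraction numbers via the standard estimate m ≈ ln(1/ε)/(−ln ϱ) (cf. (5.4)); a small sketch:

```python
import math

eps = 1e-3                               # termination level from the text
print(" n     m_GS  m_SOR")
for n in (8, 16, 32, 64, 128, 256):
    rho_gs = math.cos(math.pi / n) ** 2                   # Theorem 5.8
    rho_sor = 2.0 / (1.0 + math.sin(math.pi / n)) - 1.0   # Theorem 5.9
    m_gs = math.log(1.0 / eps) / -math.log(rho_gs)
    m_sor = math.log(1.0 / eps) / -math.log(rho_sor)
    print(f"{n:4d} {m_gs:8.0f} {m_sor:6.0f}")
```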

The optimal SOR method is superior, even if we take into account the
almost doubled effort per iteration step. But generally, ω_opt is not known
explicitly. Figure 5.3 shows that it is probably better to overestimate ω_opt
than to underestimate it. More generally, one can try to improve the
relaxation parameter during the iteration:

If ϱ(M_J) is a simple eigenvalue, then this also holds true for the spectral
radius ϱ(M_SOR(ω)). The spectral radius can thus be approximated by the
power method on the basis of the iterates. By Theorem 5.9 (2) one can
approximate ϱ(M_J), and by Theorem 5.9 (1) then also ω_opt.

This basic principle can be extended to an algorithm (see, for example,
[18, Section 9.5]), but the resulting overall procedure is no longer a linear
stationary method.
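The following sketch illustrates the basic principle. Given an estimate λ of ϱ(M_SOR(ω)) for the current ω < ω_opt (obtained, e.g., from ratios of successive difference norms of the iterates), solving the first case of Theorem 5.9 (2) for β yields β = (λ + ω − 1)/(ω√λ), and Theorem 5.9 (1) then gives the improved parameter; this is a sketch of the principle only, not the full algorithm of [18, Section 9.5]:

```python
import math

def improve_omega(lam, omega):
    """Recover beta = rho(M_J) from an estimate lam of rho(M_SOR(omega))
    with omega < omega_opt (first case of Theorem 5.9 (2), solved for
    beta) and return the improved parameter of Theorem 5.9 (1)."""
    beta = (lam + omega - 1.0) / (omega * math.sqrt(lam))
    return 2.0 / (1.0 + math.sqrt(1.0 - beta**2))

# example: n = 32 model problem, started with the safe guess omega = 1,
# where rho(M_SOR(1)) = rho(M_GS) = cos(pi/32)**2
print(improve_omega(math.cos(math.pi / 32) ** 2, 1.0))   # ~1.8215
```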

5.1.5 Extrapolation Methods

Another possibility for an extension of the linear stationary methods,
related to the adaptation of the relaxation parameter, is the following:
Starting with a linear stationary basic iteration x̃^{(k+1)} := Φ(x̃^{(k)})
we define a new iteration by

\[ x^{(k+1)} := \omega_k\, \Phi\bigl(x^{(k)}\bigr) + (1 - \omega_k)\, x^{(k)} , \tag{5.41} \]

with extrapolation factors ω_k to be chosen.
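A minimal sketch of (5.41), with the basic iteration passed in as a function; shown here with a Jacobi basic step and constant factors ω_k on hypothetical 2 × 2 data:

```python
import numpy as np

def extrapolated_iteration(phi, x0, omegas):
    """The extrapolation (5.41) of a basic iteration x -> phi(x),
    with one factor omega_k per step. Minimal sketch."""
    x = x0
    for w in omegas:
        x = w * phi(x) + (1.0 - w) * x
    return x

# usage with a Jacobi basic step for a hypothetical test problem
A = np.array([[4.0, -1.0], [-1.0, 4.0]])
b = np.array([3.0, 3.0])
phi = lambda x: x + (b - A @ x) / np.diag(A)   # one Jacobi step
print(extrapolated_iteration(phi, np.zeros(2), [1.1] * 30))  # ~(1, 1)
```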

A generalization of this definition is to start with the iterates of the basic
iteration x̃^{(0)}, x̃^{(1)}, .... The iterates of the new method are to be
determined by

\[ x^{(k)} := \sum_{j=0}^{k} \alpha_{kj}\, \tilde{x}^{(j)} , \]

with α_kj defined by a polynomial p_k ∈ P_k with the property
p_k(t) = \sum_{j=0}^{k} \alpha_{kj} t^j and p_k(1) = 1. For an appropriate definition
of such extrapolation or semi-iterative methods we need to know the spectrum
of the basic iteration matrix M, since the error e^{(k)} = x^{(k)} − x satisfies

\[ e^{(k)} = p_k(M)\, e^{(0)} , \]


where M is the iteration matrix of the basic iteration. This matrix should
be normal, for example, such that

\[ \| p_k(M) \|_2 = \varrho\bigl(p_k(M)\bigr) \]

holds. Then we have the obvious estimate

\[ \| e^{(k)} \|_2 \le \| p_k(M)\, e^{(0)} \|_2 \le \| p_k(M) \|_2\, \| e^{(0)} \|_2 \le \varrho\bigl(p_k(M)\bigr)\, \| e^{(0)} \|_2 . \tag{5.42} \]

If the method is to be defined in such a way that

\[ \varrho\bigl(p_k(M)\bigr) = \max\bigl\{ |p_k(\lambda)| : \lambda \in \sigma(M) \bigr\} \]

is minimized by choosing p_k, then the knowledge of the spectrum σ(M) is
necessary. Generally, instead of this, we assume that suitable supersets are
known: If σ(M) is real and

\[ a \le \lambda \le b \quad\text{for all } \lambda \in \sigma(M) , \]

then, due to

\[ \| e^{(k)} \|_2 \le \max_{\lambda \in [a,b]} |p_k(\lambda)|\, \| e^{(0)} \|_2 , \]

it makes sense to determine the polynomials p_k as a solution of the
minimization problem on [a, b],

\[ \max_{\lambda \in [a,b]} |p_k(\lambda)| \to \min \quad\text{over all } p \in P_k \text{ with } p(1) = 1 . \tag{5.43} \]

In the following sections we will introduce methods with an analogous

convergence behaviour, without control parameters necessary for their

definition.

For further information on semi-iterative methods see, for example, [16,

Chapter 7].

Exercises

5.1 Investigate Jacobi's method and the Gauss–Seidel method for solving
the linear system of equations Ax = b with respect to their convergence if
we have the following system matrices:

\[ \text{(a)}\ A = \begin{pmatrix} 1 & 2 & -2 \\ 1 & 1 & 1 \\ 2 & 2 & 1 \end{pmatrix} , \qquad \text{(b)}\ A = \frac{1}{2} \begin{pmatrix} 2 & -1 & 1 \\ 2 & 2 & 2 \\ -1 & -1 & 2 \end{pmatrix} . \]

5.2 Prove the consistency of the SOR method.

5.3 Prove Theorem 5.6, (1).


5.2 Gradient and Conjugate Gradient Methods

In this section let A ∈ R^{m,m} be symmetric and positive definite. Then the
system of equations Ax = b is equivalent to the problem

\[ \text{Minimize } f(x) := \tfrac{1}{2}\, x^T A x - b^T x \quad\text{for } x \in \mathbb{R}^m , \tag{5.44} \]

since for such a functional the minima and stationary points coincide, where
a stationary point is an x satisfying

\[ 0 = \nabla f(x) = Ax - b . \tag{5.45} \]

In contrast to the notation x · y for the "short" space vectors x, y ∈ R^d we
write here the Euclidean scalar product as matrix product x^T y.

For the finite element discretization this corresponds to the equivalence
of the Galerkin method (2.23) with the Ritz method (2.24) if A is the
stiffness matrix and b the load vector (see (2.34) and (2.35)). More generally,
Lemma 2.3 implies the equivalence of (5.44) and (5.45), if as bilinear form
the so-called energy scalar product

\[ \langle x, y \rangle_A := x^T A y \tag{5.46} \]

is chosen.
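A short numerical illustration of this equivalence on hypothetical 2 × 2 data: the solution of Ax = b is a stationary point of f, and f grows in every direction away from it:

```python
import numpy as np

# hypothetical symmetric positive definite test data
A = np.array([[4.0, -1.0], [-1.0, 4.0]])
b = np.array([3.0, 3.0])

f = lambda x: 0.5 * x @ A @ x - b @ x          # the functional (5.44)
grad_f = lambda x: A @ x - b                   # its gradient (5.45)

x_star = np.linalg.solve(A, b)                 # solution of Ax = b
print(grad_f(x_star))                          # ~(0, 0): stationary point
# f increases in every coordinate direction away from x_star
for d in np.eye(2):
    assert f(x_star + 0.1 * d) > f(x_star)
    assert f(x_star - 0.1 * d) > f(x_star)
```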

A general iterative method to solve (5.44) has the following structure:
Define a search direction d^{(k)}.