
remains to be discussed. It is known from (2.56) that certain conditions are also necessary for the existence of an LU factorization, for example the M-matrix property. This property is even sufficient for an ILU factorization.

Theorem 5.17 Let $A \in \mathbb{R}^{m,m}$ be an M-matrix. Then for a given pattern $E$ that satisfies (5.80), an ILU factorization exists. The hereby defined decomposition of $A$ as $A = LU - R$ is regular in the following sense:

$(LU)^{-1}_{ij} \ge 0\,, \quad (R)_{ij} \ge 0 \quad$ for all $i, j = 1, \dots, m\,.$



Proof: See [16, p. 235].                                                      □


An ILU (or IC) factorization can be defined by solving the equations (5.78) for $l_{ij}$ and $u_{ij}$ in an appropriate order. Alternatively, the elimination or Cholesky method can be used in its original form on the pattern $E$.
An improvement of the eigenvalue distribution of $C^{-1}A$ is sometimes possible by using an MIC factorization (modified incomplete Cholesky factorization) instead of an IC factorization. In contrast to (5.79), the updates in the elimination method for positions outside the pattern are not ignored here but are instead applied to the corresponding diagonal element.
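To make this construction concrete, the following is a minimal NumPy sketch of an incomplete factorization restricted to a given pattern, with an optional MIC-style diagonal compensation as just described. The dense loops, the function name, and the assumption that the pattern contains the diagonal and all nonzero positions of A are illustrative choices, not taken from the text.

    import numpy as np

    def incomplete_lu(A, pattern, modified=False):
        """Incomplete LU factorization of A restricted to the index set `pattern`.

        `pattern` is a set of pairs (i, j) (the pattern E); it is assumed to
        contain the diagonal and all nonzero positions of A.  With
        modified=False, updates at positions outside the pattern are dropped
        (ILU/IC); with modified=True they are applied to the diagonal element
        of the same row instead (MIC-style compensation).  L (unit lower
        triangular, its diagonal not stored) and U are returned in one matrix.
        """
        m = A.shape[0]
        LU = A.astype(float).copy()
        for k in range(m - 1):
            for i in range(k + 1, m):
                if (i, k) not in pattern:
                    continue
                LU[i, k] /= LU[k, k]                  # multiplier l_ik
                for j in range(k + 1, m):
                    if LU[k, j] == 0.0:
                        continue
                    update = LU[i, k] * LU[k, j]
                    if (i, j) in pattern:
                        LU[i, j] -= update            # ordinary elimination step
                    elif modified:
                        LU[i, i] -= update            # shift the dropped update to the diagonal
                    # otherwise the update is dropped; it contributes to the remainder R
        return LU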
Concerning the reduction of the condition number by the ILU or IC preconditioning for the model problem, we have the same situation as for the SSOR preconditioning. In particular, (5.77) holds, too.
The auxiliary system of equations with $C = N^{-1}$, which means that

$h^{(k+1)} = N g^{(k+1)}\,,$

can also be interpreted as one step of the iteration method defined by $N$ with initial value $z^{(0)} = 0$ and right-hand side $g^{(k+1)}$. An extension of the preconditioning possibilities discussed so far is therefore obtained by using a fixed number of iteration steps instead of only one.
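As an illustration (not from the text), the following sketch applies such a preconditioner: the action of $C^{-1}$ on $g$ is replaced by a fixed number of steps of the basic iteration defined by $N$, started at $z^{(0)} = 0$; with num_steps=1 it reduces to $h = N g$. All names are illustrative.

    import numpy as np

    def apply_iteration_preconditioner(N_apply, A, g, num_steps=3):
        """Approximate C^{-1} g by num_steps steps of the linear iteration
        z <- z + N (g - A z), starting from z = 0.

        N_apply evaluates z -> N z for the approximate inverse N defining the
        basic iteration (for example one SSOR sweep or one forward/backward
        solve with an incomplete factorization).
        """
        z = np.zeros_like(g)
        for _ in range(num_steps):
            z = z + N_apply(g - A @ z)   # one step of the iteration defined by N
        return z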

Exercises
5.7 Let $A_1, A_2, \dots, A_k, C_1, C_2, \dots, C_k \in \mathbb{R}^{m,m}$ be symmetric positive semidefinite matrices with the property

$a\, x^T C_i x \le x^T A_i x \le b\, x^T C_i x \quad$ for $x \in \mathbb{R}^m$, $i = 1, \dots, k$ and $0 < a \le b\,.$

Prove: If $A := \sum_{i=1}^{k} A_i$ and $C := \sum_{i=1}^{k} C_i$ are positive definite, then the spectral condition number $\kappa$ of $C^{-1}A$ satisfies

$\kappa(C^{-1}A) \le \dfrac{b}{a}\,.$

5.8 Show that the matrix

$A := \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$

is positive definite and that its spectral condition number is 4.
Hint: Consider the associated quadratic form.

5.9 Investigate the convergence of the (P)CG method on the basis of
Theorem 3.45 and distinguish between d = 2 and d = 3.



5.4 Krylov Subspace Methods
for Nonsymmetric Systems of Equations
With the different variants of the PCG method we have methods that are quite appropriate, regarding their complexity, for those systems of equations that arise from the discretization of boundary value problems. However, this holds only under the assumption that the system matrix is symmetric and positive definite, which restricts the range of application, for example to finite element discretizations of purely diffusive processes without a convective transport mechanism (see (3.23)). Exceptions for time-dependent problems are only the (semi-)explicit time discretization (compare (7.72)) and the Lagrange–Galerkin method (see Section 9.4). In all other cases the systems of equations that arise are nonsymmetric and positive real, which means that the system matrix $A$ satisfies

$A + A^T$ is positive definite.

It is desirable to generalize the (P)CG methods to such matrices. The CG method is characterized by two properties:

• The iterate $x^{(k)}$ minimizes $f(\cdot) = \|\cdot - x\|_A$ on $x^{(0)} + K_k(A; g^{(0)})$, where $x = A^{-1}b$.

• The basis vectors $d^{(i)}$, $i = 0, \dots, k-1$, of $K_k(A; g^{(0)})$ do not have to be calculated in advance (and stored in the computer), but are calculated by a three-term recursion (5.61) during the iteration. An analogous relation holds by definition for $x^{(k)}$ (see (5.48)).
The first property can be preserved in the following, although the norm in which the error or the residual is minimized varies from method to method. The second property is partially lost, because in general all basis vectors $d^{(0)}, \dots, d^{(k-1)}$ are necessary for the calculation of $x^{(k)}$. This results in memory problems for large $k$. As for the CG method, preconditioning is necessary for an acceptable convergence of the methods. The conditions on the preconditioning matrices are the same as for the CG method, with the exception of symmetry and positive definiteness. All three methods of preconditioning are in principle possible. Therefore, preconditioning will not be discussed in the following; we refer to Section 5.3.
The simplest approach is the application of the CG method to a system of equations with a symmetric positive definite matrix that is equivalent to (5.1). This is the case for the normal equations

$A^T A x = A^T b\,.$                                              (5.81)

The approach is called CGNR (Conjugate Gradient Normal Residual), because here the iterate $x^{(k)}$ minimizes the Euclidean norm of the residual on $x^{(0)} + K_k(A^T A; g^{(0)})$ with $g^{(0)} = A^T(Ax^{(0)} - b)$. This follows from the equation

$\|y - x\|^2_{A^T A} = (Ay - b)^T (Ay - b) = |Ay - b|_2^2$                    (5.82)

for any $y \in \mathbb{R}^m$ and the solution $x = A^{-1}b$.
All advantages of the CG method are preserved, although in (5.53) and (5.65) $Ad^{(k)}$ has to be replaced by $A^T A d^{(k)}$. In addition to doubling the number of operations, this may be a disadvantage if $\kappa_2(A)$ is large, since $\kappa_2(A^T A) = \kappa_2(A)^2$ can lead to problems of stability and convergence. Due to (5.34) this is to be expected for a large number of degrees of freedom. Furthermore, in the case of list-based storage one of the operations $Ay$ and $A^T y$ is always very expensive due to searching. It is even possible that we do not know the matrix $A$ explicitly but that only the mapping $y \mapsto Ay$ can be evaluated, which then disqualifies this method completely (see Exercise 8.6).
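A minimal sketch of the CGNR approach is given below (not taken from the text): it is the CG method applied to (5.81), formulated so that only the mappings $y \mapsto Ay$ and $y \mapsto A^T y$ are required; the stopping rule and all names are illustrative.

    import numpy as np

    def cgnr(A_mul, AT_mul, b, x0, tol=1e-10, max_iter=200):
        """CG applied to the normal equations A^T A x = A^T b (CGNR sketch)."""
        x = x0.astype(float).copy()
        r = AT_mul(b - A_mul(x))          # residual of the normal equations
        d = r.copy()
        rho = r @ r
        for _ in range(max_iter):
            Ad = A_mul(d)
            alpha = rho / (Ad @ Ad)       # = rho / (d^T A^T A d)
            x += alpha * d
            r -= alpha * AT_mul(Ad)
            rho_new = r @ r
            if np.sqrt(rho_new) < tol:
                break
            d = r + (rho_new / rho) * d
            rho = rho_new
        return x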
The same drawback occurs if

$A A^T \tilde{x} = b$                                              (5.83)

with the solution $\tilde{x} = A^{-T} x$ is taken instead of (5.81). If $\tilde{x}^{(k)}$ is the $k$th iterate of the CG method applied to (5.83), then $x^{(k)} := A^T \tilde{x}^{(k)}$ minimizes the error in the Euclidean norm on $x^{(0)} + A^T K_k(A A^T; \tilde{g}^{(0)})$: Note that

$\|\tilde{y} - \tilde{x}\|^2_{A A^T} = \left(A^T \tilde{y} - x\right)^T \left(A^T \tilde{y} - x\right) = \left|A^T \tilde{y} - x\right|_2^2$


Let $g^{(0)} \in \mathbb{R}^m$, $g^{(0)} \ne 0$, be given. Set
    $v_1 := g^{(0)} / |g^{(0)}|_2\,.$
For $j = 1, \dots, k$ calculate
    $h_{ij} := v_i^T A v_j$ for $i = 1, \dots, j\,,$
    $w_j := A v_j - \sum_{i=1}^{j} h_{ij} v_i\,,$
    $h_{j+1,j} := |w_j|_2\,.$
    If $h_{j+1,j} = 0$, termination; otherwise, set
    $v_{j+1} := w_j / h_{j+1,j}\,.$

Table 5.4. Arnoldi algorithm.

holds for any $\tilde{y} \in \mathbb{R}^m$ and $\tilde{g}^{(0)} = Ax^{(0)} - b$. This explains the terminology CGNE (with E for Error).
Whether a method minimizes the error or the residual obviously depends on the norm used. For a symmetric positive definite $B \in \mathbb{R}^{m,m}$, any $y \in \mathbb{R}^m$, and $x = A^{-1}b$, we have

$\|Ay - b\|_B = \|y - x\|_{A^T B A}\,,$

since $Ay - b = A(y - x)$.
For $B = A^{-T}$ and a symmetric positive definite $A$ we get the situation of the CG method:

$\|Ay - b\|_{A^{-T}} = \|y - x\|_A\,.$

For $B = I$ we again get (5.82):

$|Ay - b|_2 = \|y - x\|_{A^T A}\,.$

The minimization of this functional on $x^{(0)} + K_k(A; g^{(0)})$ (not $K_k(A^T A; g^{(0)})$) leads to the GMRES method (Generalized Minimum RESidual).
This method (and others) is founded algorithmically on the recursive construction of orthonormal bases of $K_k(A; g^{(0)})$ by Arnoldi's method. This method combines the generation of a basis according to (5.61) with Schmidt's orthonormalization (see Table 5.4).
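A minimal NumPy sketch of the algorithm of Table 5.4 (in its classical Gram-Schmidt form) might look as follows; the names and the early-return convention at breakdown are illustrative choices, not from the text.

    import numpy as np

    def arnoldi(A, g0, k):
        """Arnoldi's method: orthonormal Krylov basis V and the (k+1) x k
        upper Hessenberg matrix H_bar with A V_k = V_{k+1} H_bar."""
        m = g0.shape[0]
        V = np.zeros((m, k + 1))
        H = np.zeros((k + 1, k))
        V[:, 0] = g0 / np.linalg.norm(g0)                 # v_1
        for j in range(k):
            Av = A @ V[:, j]
            for i in range(j + 1):
                H[i, j] = V[:, i] @ Av                    # h_ij = v_i^T A v_j
            w = Av - V[:, : j + 1] @ H[: j + 1, j]        # w_j = A v_j - sum h_ij v_i
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] == 0.0:                        # termination (breakdown)
                return V[:, : j + 1], H[: j + 2, : j + 1]
            V[:, j + 1] = w / H[j + 1, j]                 # v_{j+1}
        return V, H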
If Arnoldi's method can be performed up to the index $k$, then define

$h_{ij} := 0 \quad$ for $j = 1, \dots, k$, $i = j+2, \dots, k+1\,,$
$H_k := (h_{ij})_{ij} \in \mathbb{R}^{k,k}\,, \qquad \bar{H}_k := (h_{ij})_{ij} \in \mathbb{R}^{k+1,k}\,, \qquad V_{k+1} := (v_1, \dots, v_{k+1}) \in \mathbb{R}^{m,k+1}\,.$

The matrix $H_k$ is an upper Hessenberg matrix (see Appendix A.3). The basis for the GMRES method is the following theorem:

Theorem 5.18 If Arnoldi's method can be performed up to the index $k$, then:

(1) $v_1, \dots, v_{k+1}$ form an orthonormal basis of $K_{k+1}(A; g^{(0)})$.

(2) With $e_k = (0, \dots, 0, 1)^T \in \mathbb{R}^k$,

$A V_k = V_k H_k + w_k e_k^T = V_{k+1} \bar{H}_k\,,$                          (5.84)
$V_k^T A V_k = H_k\,.$                                              (5.85)

(3) The problem

Minimize $|Ay - b|_2$ for $y \in x^{(0)} + K_k(A; g^{(0)})$

with minimum $x^{(k)}$ is equivalent to

Minimize $\left|\bar{H}_k \xi - \beta e_1\right|_2$ for $\xi \in \mathbb{R}^k$                        (5.86)

with $\beta := -\left|g^{(0)}\right|_2$ and minimum $\xi^{(k)}$, and we have

$x^{(k)} = x^{(0)} + V_k \xi^{(k)}\,.$

If Arnoldi's method terminates at the index $k$, then $x^{(k)} = x = A^{-1} b$.

Proof: (1): The vectors $v_1, \dots, v_{k+1}$ are orthonormal by construction; hence we only have to prove $v_i \in K_{k+1}(A; g^{(0)})$ for $i = 1, \dots, k+1$. This follows from the representation

$v_i = q_{i-1}(A) v_1 \quad$ with polynomials $q_{i-1} \in P_{i-1}\,.$

In this form we can prove the statement by induction with respect to $k$. For $k = 0$ the assertion is trivial. Let the statement hold for $k - 1$. The validity for $k$ then follows from

$h_{k+1,k} v_{k+1} = A v_k - \sum_{i=1}^{k} h_{ik} v_i = \Bigl( A q_{k-1}(A) - \sum_{i=1}^{k} h_{ik}\, q_{i-1}(A) \Bigr) v_1\,.$

(2): Relation (5.85) follows from (5.84) by multiplication by $V_k^T$, since $V_k^T V_k = I$ and $V_k^T w_k = h_{k+1,k} V_k^T v_{k+1} = 0$ due to the orthonormality of the $v_i$.
The relation in (5.84) is the matrix representation of

$A v_j = \sum_{i=1}^{j} h_{ij} v_i + w_j = \sum_{i=1}^{j+1} h_{ij} v_i \quad$ for $j = 1, \dots, k\,.$

(3): Due to (1), the space $x^{(0)} + K_k(A; g^{(0)})$ has the parametrization

$y = x^{(0)} + V_k \xi \quad$ with $\xi \in \mathbb{R}^k\,.$                          (5.87)

The assertion is a consequence of the identity

$Ay - b = A\left(x^{(0)} + V_k \xi\right) - b = A V_k \xi + g^{(0)} = V_{k+1} \bar{H}_k \xi - \beta v_1 = V_{k+1}\left(\bar{H}_k \xi - \beta e_1\right)\,,$

which follows from (2), since it implies

$|Ay - b|_2 = \left|V_{k+1}\left(\bar{H}_k \xi - \beta e_1\right)\right|_2 = \left|\bar{H}_k \xi - \beta e_1\right|_2$

due to the orthogonality of $V_{k+1}$. The last assertion finally can be seen in this way: If Arnoldi's method breaks down at the index $k$, then relation (2) becomes

$A V_k = V_k H_k\,,$

and

$A V_k = V_{k+1} \bar{H}_k$

continues to hold with $v_{k+1}$ chosen arbitrarily (due to $h_{k+1,k} = 0$). Since $A$ is nonsingular, this also holds for $H_k$. Hence the choice

$\xi := H_k^{-1}(\beta e_1)\,,$

which satisfies

$\left|\bar{H}_k \xi - \beta e_1\right|_2 = |H_k \xi - \beta e_1|_2 = 0\,,$

is possible. Hence the corresponding $y \in \mathbb{R}^m$ defined by (5.87) fulfills $y = x^{(k)} = x$.                                    □
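
Based on part (3) of Theorem 5.18, one cycle of the GMRES method in its basic form can be sketched as follows. This is an illustrative NumPy sketch, not the text's formulation: the Arnoldi basis is built with modified Gram-Schmidt, and the small least-squares problem (5.86) is solved here simply with numpy.linalg.lstsq instead of the Givens-rotation update usually employed in practice.

    import numpy as np

    def gmres_basic(A, b, x0, k):
        """One GMRES cycle of at most k Arnoldi steps (basic form, no restart)."""
        g0 = A @ x0 - b                                # initial residual g^(0)
        beta = np.linalg.norm(g0)
        if beta == 0.0:
            return x0
        m = b.shape[0]
        V = np.zeros((m, k + 1))
        H = np.zeros((k + 1, k))
        V[:, 0] = g0 / beta
        for j in range(k):
            w = A @ V[:, j]
            for i in range(j + 1):                     # modified Gram-Schmidt
                H[i, j] = V[:, i] @ w
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] == 0.0:                     # breakdown: exact solution is reached
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        rhs = np.zeros(k + 1)
        rhs[0] = -beta                                 # beta * e_1 with beta = -|g^(0)|_2
        xi, *_ = np.linalg.lstsq(H[: k + 1, : k], rhs, rcond=None)
        return x0 + V[:, : k] @ xi                     # x^(k) = x^(0) + V_k xi^(k)

A restarted variant, as described below, simply calls this function repeatedly with x0 replaced by the returned iterate.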

One problem of Arnoldi's method is that the orthogonality of the $v_i$ is easily lost due to rounding errors. If one substitutes the assignment

$w_j := A v_j - \sum_{i=1}^{j} h_{ij} v_i$

in Table 5.4 by the operations
    $w_j := A v_j\,,$
    for $i = 1, \dots, j$ calculate
        $h_{ij} := w_j^T v_i\,,$
        $w_j := w_j - h_{ij} v_i\,,$
which define the same vector, one obtains the modified Arnoldi method.
From this relation and from (5.86) the GMRES method is constructed in its basic form. Alternatively, Schmidt's orthonormalization can be replaced by the Householder method (see [28, pp. 159 ff.]). With exact arithmetic the GMRES algorithm terminates only after reaching the exact solution (with $h_{k+1,k} = 0$). This is not always the case for alternative methods of the same class. For an increasing iteration index $k$ and large problem dimensions $m$ there may not be enough memory for the storage of
the basis vectors $v_1, \dots, v_k$. A remedy is offered by working with a fixed number $n$ of iterations and then restarting the algorithm with $x^{(0)} := x^{(n)}$ and $g^{(0)} := Ax^{(0)} - b$, until finally the convergence criterion is fulfilled (GMRES method with restart). There is also a truncated version of the GMRES method, in which only the last $n$ basis vectors are used.
The minimization of the error in the energy norm (on the vector space $K$), as with the CG method, makes sense only for symmetric positive definite matrices $A$. But the variational equation

$(Ay - b)^T z = 0 \quad$ for all $z \in K\,,$

which characterizes this minimum in the general case, can be taken as the defining condition for $y$. Further variants of Krylov subspace methods rely on this.
Another large class of such methods is founded on the Lanczos biorthogonalization, in which, apart from a basis $v_1, \dots, v_k$ of $K_k(A; v_1)$, another basis $w_1, \dots, w_k$ of $K_k(A^T; w_1)$ is constructed such that

$v_j^T w_i = \delta_{ij} \quad$ for $i, j = 1, \dots, k\,.$

The best-known representative of this class is the BICGSTAB method. For further discussion of this topic see, for example, [28].


Exercises
5.10 Consider the linear system $Ax = b$, where $A = \alpha Q$ for some $\alpha \in \mathbb{R} \setminus \{0\}$ and some orthogonal matrix $Q$. Show that, for an arbitrary initial iterate $x^{(0)}$, the CGNE method terminates after one step with the exact solution.

5.11 Provided that Arnoldi's method can be performed up to the index $k$, show that it is possible to incorporate a convergence test into the GMRES method without computing the approximate solution explicitly, i.e., prove the following formulas:

$g^{(k)} := A x^{(k)} - b = h_{k+1,k}\, \bigl(e_k^T \xi^{(k)}\bigr)\, v_{k+1}\,,$
$\left|g^{(k)}\right|_2 = h_{k+1,k} \left|e_k^T \xi^{(k)}\right|\,.$



5.5 The Multigrid Method
5.5.1 The Idea of the Multigrid Method
We discuss again the model problem of the five-point stencil discretization for the Poisson equation on the square and use the relaxed Jacobi method. Then due to (5.31) the iteration matrix is

$M = \omega M_J + (1 - \omega) I = I - \frac{\omega}{4} A\,,$
with $A$ being the stiffness matrix according to (1.14). For $\tilde{\omega} = \omega/4$ this coincides with the relaxed Richardson method, which according to (5.35) has the poor convergence behaviour of Jacobi's method, even for an optimal choice of the parameter. Nevertheless, for a suitable $\omega$ the method has useful properties. Due to (5.25) the eigenvalues of $M$ are

$\lambda_{k,l} = 1 - \omega + \frac{\omega}{2}\left(\cos\frac{k\pi}{n} + \cos\frac{l\pi}{n}\right)\,, \quad 1 \le k, l \le n - 1\,.$
This shows that there is a relation between the size of the eigenvalues and the frequencies of the associated eigenfunctions, depending on the choice of $\omega$: For $\omega = 1$, which is Jacobi's method, $\varrho(M) = \lambda_{1,1} = -\lambda_{n-1,n-1}$. Thus the eigenvalues are large in absolute value if $k$ and $l$ are close to 1 or $n$. Hence there are large eigenvalues for eigenfunctions with low frequency as well as for eigenfunctions with high frequency. For $\omega = \frac{1}{2}$, however, we have $\varrho(M) = \lambda_{1,1}$, and the eigenvalues are large only in the case that $k$ and $l$ are close to 1, which means that the eigenfunctions have low frequency.
In general, if the error of the iterate $e^{(k)}$ had a representation in terms of orthonormal eigenvectors $z_\nu$ with small eigenvalues, as for example $|\lambda_\nu| \le \frac{1}{2}$,

$e^{(k)} = \sum_{\nu : |\lambda_\nu| \le 1/2} c_\nu z_\nu\,,$

then according to (5.11) it would follow for the error measured in the Euclidean vector norm $|\cdot|_2$ that

$\left|e^{(k+1)}\right|_2 = \Bigl|\sum_{\nu : |\lambda_\nu| \le 1/2} \lambda_\nu c_\nu z_\nu\Bigr|_2 = \Bigl(\sum_{\nu : |\lambda_\nu| \le 1/2} \lambda_\nu^2 c_\nu^2\Bigr)^{1/2} \le \frac{1}{2}\Bigl(\sum_{\nu : |\lambda_\nu| \le 1/2} c_\nu^2\Bigr)^{1/2} = \frac{1}{2}\left|e^{(k)}\right|_2\,,$

if the eigenvectors are chosen orthonormal with respect to the Euclidean scalar product (compare (5.67)). For such an initial error and with exact arithmetic the method would thus have a "small" contraction number independent of the discretization.
For Jacobi's method damped by $\omega = \frac{1}{2}$ this means: if the initial error consists of functions of high frequency only (in the sense of an eigenvector expansion involving only eigenvectors with $k$ or $l$ distant from 1), then the above considerations apply. But already due to rounding errors there will always be low-frequency components in the error, so that the above convergence statement does not in fact hold; instead, the smoothing property of the damped Jacobi method is valid: A few steps lead to only a small reduction of the error, but they smooth the error in the sense that its high-frequency parts are reduced considerably.
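The smoothing property can be observed numerically. The following small experiment (an illustration, not from the text) uses a one-dimensional analogue of the model problem for brevity; the two-dimensional five-point case behaves analogously. It applies a few damped Jacobi steps to an error consisting of one smooth and one highly oscillatory eigenvector and reports by which factor each component has been reduced.

    import numpy as np

    def damped_jacobi_smoothing_demo(n=64, omega=0.5, steps=3):
        """Damped Jacobi on the 1D model matrix: iterate the error e <- M e
        with M = I - (omega/2) A (the diagonal of A is 2) and measure the
        damping of a low- and a high-frequency eigenvector component."""
        A = 2 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)
        M = np.eye(n - 1) - (omega / 2.0) * A
        x = np.arange(1, n) / n
        z_low = np.sin(np.pi * x)                      # smooth eigenvector (k = 1)
        z_high = np.sin((n - 1) * np.pi * x)           # oscillatory eigenvector (k = n - 1)
        e = z_low + z_high                             # initial error with both components
        for _ in range(steps):
            e = M @ e
        print("low-frequency component factor :", abs(e @ z_low) / (z_low @ z_low))
        print("high-frequency component factor:", abs(e @ z_high) / (z_high @ z_high))

    damped_jacobi_smoothing_demo()

After three steps the oscillatory component is damped by many orders of magnitude, while the smooth component is reduced only marginally, in agreement with the discussion above.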
