$$\big\|p(A)\,(x^{(0)} - x)\big\|_A \;=\; \Big(\sum_{j=1}^{m} \lambda_j\, \big|c_j\, p(\lambda_j)\big|^2\Big)^{1/2} \;\le\; \max_{i=1,\dots,m} |p(\lambda_i)|\; \big\|x^{(0)} - x\big\|_A\,. \qquad (5.68)$$
Relations (5.66), (5.68) imply the following theorem:
Theorem 5.15  For the CG method and any $p \in P_k$ satisfying $p(0) = 1$, we have
$$\|x^{(k)} - x\|_A \;\le\; \max_{i=1,\dots,m} |p(\lambda_i)|\; \|x^{(0)} - x\|_A\,,$$
with the eigenvalues $\lambda_1, \dots, \lambda_m$ of $A$.

If the eigenvalues of $A$ are not known, but their location is, i.e., if one knows $a, b \in \mathbb{R}$ such that
$$a \le \lambda_1, \dots, \lambda_m \le b\,, \qquad (5.69)$$
then only the following weaker estimate can be used:
$$\|x^{(k)} - x\|_A \;\le\; \max_{\lambda \in [a,b]} |p(\lambda)|\; \|x^{(0)} - x\|_A\,. \qquad (5.70)$$

Therefore, we have to find $p \in P_k$ with $p(0) = 1$ that minimizes
$$\max\{\,|p(\lambda)| \;:\; \lambda \in [a,b]\,\}\,.$$
This approximation problem in the maximum norm appeared already in (5.43), because there is a bijection between the sets $\{p \in P_k \mid p(1) = 1\}$ and $\{p \in P_k \mid p(0) = 1\}$ through
$$p \mapsto \tilde p\,, \qquad \tilde p(t) := p(1 - t)\,. \qquad (5.71)$$
Its solution can be represented using the Chebyshev polynomials of the first kind (see, for example, [38, p. 302]). They are defined recursively by
$$T_0(x) := 1\,, \quad T_1(x) := x\,, \quad T_{k+1}(x) := 2x\,T_k(x) - T_{k-1}(x) \quad \text{for } x \in \mathbb{R}$$
and have the representation
$$T_k(x) = \cos\big(k \arccos(x)\big)$$
for $|x| \le 1$. This immediately implies
$$|T_k(x)| \le 1 \quad \text{for } |x| \le 1\,.$$
A further representation, valid for $x \in \mathbb{R}$, is
$$T_k(x) = \frac{1}{2}\Big[\big(x + (x^2 - 1)^{1/2}\big)^k + \big(x - (x^2 - 1)^{1/2}\big)^k\Big]\,. \qquad (5.72)$$
The optimal polynomial in (5.70) is then defined by
$$p(z) := \frac{T_k\big((b + a - 2z)/(b - a)\big)}{T_k\big((b + a)/(b - a)\big)} \quad \text{for } z \in \mathbb{R}\,.$$
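As a quick numerical illustration (a sketch in Python with NumPy; the function names cheb and p as well as the values of a, b, and k are ours, chosen only as an example), the recursion above can be used to evaluate this polynomial and to confirm that $p(0) = 1$ and that its maximum absolute value on $[a, b]$ equals $1/T_k\big((b + a)/(b - a)\big)$, the constant appearing in the theorem below.

```python
import numpy as np

def cheb(k, x):
    """Evaluate the Chebyshev polynomial T_k(x) via T_{j+1} = 2x T_j - T_{j-1}."""
    x = np.asarray(x, dtype=float)
    t_prev, t_curr = np.ones_like(x), x
    if k == 0:
        return t_prev
    for _ in range(k - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

# Example values (not from the text): eigenvalue bounds a, b and degree k.
a, b, k = 1.0, 100.0, 10

def p(z):
    """p(z) = T_k((b + a - 2z)/(b - a)) / T_k((b + a)/(b - a))."""
    return cheb(k, (b + a - 2.0 * np.asarray(z)) / (b - a)) / cheb(k, (b + a) / (b - a))

z = np.linspace(a, b, 100001)
print(p(0.0))                              # normalization p(0) = 1
print(np.abs(p(z)).max())                  # max over [a, b] ...
print(1.0 / cheb(k, (b + a) / (b - a)))    # ... equals 1 / T_k((b + a)/(b - a))
```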
This choice of p implies the following result:
Theorem 5.16  Let $\kappa$ be the spectral condition number of $A$ and assume $\kappa > 1$. Then
$$\|x^{(k)} - x\|_A \;\le\; \frac{1}{T_k\big(\tfrac{\kappa+1}{\kappa-1}\big)}\, \|x^{(0)} - x\|_A \;\le\; 2\left(\frac{\kappa^{1/2} - 1}{\kappa^{1/2} + 1}\right)^{k} \|x^{(0)} - x\|_A\,. \qquad (5.73)$$


Proof: Choose $a$ as the smallest eigenvalue $\lambda_{\min}$ and $b$ as the largest one, $\lambda_{\max}$. The first inequality follows immediately from (5.70) and $\kappa = b/a$. For the second inequality note that due to $(\kappa+1)/(\kappa-1) = 1 + 2/(\kappa-1) =: 1 + 2\eta \ge 1$, (5.72) implies
$$T_k\Big(\frac{\kappa+1}{\kappa-1}\Big) \;\ge\; \frac{1}{2}\Big(1 + 2\eta + \big((1 + 2\eta)^2 - 1\big)^{1/2}\Big)^{k} \;=\; \frac{1}{2}\Big(1 + 2\eta + 2\big(\eta(\eta+1)\big)^{1/2}\Big)^{k}.$$
Finally,
$$1 + 2\eta + 2\big(\eta(\eta+1)\big)^{1/2} \;=\; \big(\eta^{1/2} + (\eta+1)^{1/2}\big)^2 \;=\; \frac{(\eta+1)^{1/2} + \eta^{1/2}}{(\eta+1)^{1/2} - \eta^{1/2}} \;=\; \frac{(1 + 1/\eta)^{1/2} + 1}{(1 + 1/\eta)^{1/2} - 1}\,,$$
which concludes the proof because of $1 + 1/\eta = \kappa$. $\Box$

For large $\kappa$ we have again
$$\frac{\kappa^{1/2} - 1}{\kappa^{1/2} + 1} \;\approx\; 1 - \frac{2}{\kappa^{1/2}}\,.$$
Compared with (5.58), $\kappa$ has been improved to $\kappa^{1/2}$.
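To see what the bound (5.73) means quantitatively, the following short sketch (Python; the helper name cg_steps is ours, and the condition numbers and tolerance are arbitrary example values) computes the smallest $k$ for which the factor $2\big((\kappa^{1/2}-1)/(\kappa^{1/2}+1)\big)^k$ falls below a prescribed $\varepsilon$.

```python
import math

def cg_steps(kappa, eps):
    """Smallest k with 2 * ((sqrt(kappa)-1)/(sqrt(kappa)+1))**k <= eps, from (5.73)."""
    q = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return math.ceil(math.log(eps / 2.0) / math.log(q))

for kappa in (1e2, 1e4, 1e6):       # example condition numbers
    print(int(kappa), cg_steps(kappa, eps=1e-6))
# The output grows roughly by a factor of 10 per factor 100 in kappa,
# i.e., proportionally to kappa**(1/2).
```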
From (5.4) and (5.34) the complexity of the five-point stencil discretization of the Poisson equation on the square results in
$$O\big(\kappa^{1/2}\big)\,\ln\frac{1}{\varepsilon}\; O(m) \;=\; O(n)\,\ln\frac{1}{\varepsilon}\; O(m) \;=\; O\big(m^{3/2}\big)\,\ln\frac{1}{\varepsilon}\,.$$
This is the same behaviour as that of the SOR method with optimal relaxation parameter. The advantage of the above method lies in the fact that the determination of parameters is not necessary for applying the CG method. For quasi-uniform triangulations, Theorem 3.45 implies an analogous general statement.
A relation to the semi-iterative methods follows from (5.71): The estimate (5.66) can also be expressed as
$$\|e^{(k)}\|_A \;\le\; \|p(I - A)\, e^{(0)}\|_A \qquad (5.74)$$
for any $p \in P_k$ with $p(1) = 1$.
This is the same estimate as (5.42) for the Richardson iteration (5.28) as basis method, with the Euclidean norm $|\cdot|_2$ replaced by the energy norm $\|\cdot\|_A$. While the semi-iterative methods are defined by minimization of upper bounds in (5.42), the CG method is optimal in the sense of (5.74), without knowledge of the spectrum $\sigma(I - A)$. In this manner the CG method can be seen as an (optimal) acceleration method for the Richardson iteration.



Exercises
5.4 Let $A \in \mathbb{R}^{m,m}$ be a symmetric positive definite matrix.
(a) Show that for $x, y$ with $x^T y = 0$ we have
$$\frac{\langle x, y\rangle_A}{\|x\|_A\, \|y\|_A} \;\le\; \frac{\kappa - 1}{\kappa + 1}\,,$$
where $\kappa$ denotes the spectral condition number of $A$.
Hint: Represent $x, y$ in terms of an orthonormal basis consisting of eigenvectors of $A$.

(b) Show using the example $m = 2$ that this estimate is sharp. To this end, look for a positive definite symmetric matrix $A \in \mathbb{R}^{2,2}$ as well as vectors $x, y \in \mathbb{R}^2$ with $x^T y = 0$ and
$$\frac{\langle x, y\rangle_A}{\|x\|_A\, \|y\|_A} \;=\; \frac{\kappa - 1}{\kappa + 1}\,.$$


5.5 Prove that the computation of the conjugate direction in the CG method in the general step $k \ge 2$ is equivalent to the three-term recursion formula
$$d^{(k+1)} = \big[\alpha_k A + (\beta_k + 1) I\big]\, d^{(k)} - \beta_{k-1}\, d^{(k-1)}\,.$$

5.6 Let $A \in \mathbb{R}^{m,m}$ be a symmetric positive definite matrix with spectral condition number $\kappa$. Suppose that the spectrum $\sigma(A)$ of the matrix $A$ satisfies $a_0 \in \sigma(A)$ as well as $\sigma(A) \setminus \{a_0\} \subset [a, b]$ with $0 < a_0 < a \le b$.
Show that this yields the following convergence estimate for the CG method:
$$\|x^{(k)} - x\|_A \;\le\; 2\,\frac{b - a_0}{a_0} \left(\frac{\sqrt{\hat\kappa} - 1}{\sqrt{\hat\kappa} + 1}\right)^{k-1} \|x^{(0)} - x\|_A\,,$$
where $\hat\kappa := b/a\ (<\kappa)$.



5.3 Preconditioned Conjugate Gradient Method
Due to Theorem 5.16, $\kappa(A)$ should be small or only weakly growing in $m$, which is not true for a finite element stiffness matrix.
The technique of preconditioning, as already discussed in Section 5.1, is used to transform the system of equations in such a way that the condition number of the system matrix is reduced without increasing the effort in the evaluation of the matrix-vector product too much.
In a preconditioning from the left the system of equations is transformed to
$$C^{-1} A x = C^{-1} b$$
with a preconditioner $C$; in a preconditioning from the right it is transformed to
$$A C^{-1} y = b\,,$$
such that $x = C^{-1} y$ is the solution of (5.1). Since the matrices are generally sparse, this always has to be interpreted as the solution of the system of equations $C x = y$.
If $A$ is symmetric and positive definite, then this property is generally violated by the transformed matrix for both variants, even for a symmetric positive definite $C$. We assume for the moment that we have a decomposition of $C$ with a nonsingular matrix $W$ as
$$C = W W^T.$$
Then $Ax = b$ can be transformed to $W^{-1} A W^{-T} W^T x = W^{-1} b$, i.e., to
$$B y = c \quad \text{with} \quad B = W^{-1} A W^{-T}\,, \quad c = W^{-1} b\,. \qquad (5.75)$$
The matrix $B$ is symmetric and positive definite. The solution $x$ is then given by $x = W^{-T} y$. This procedure is called split preconditioning.
Due to $W^{-T} B W^{T} = C^{-1} A$ and $W B W^{-1} = A C^{-1}$, the matrices $B$, $C^{-1}A$, and $AC^{-1}$ have the same eigenvalues, and therefore also the same spectral condition number $\kappa$. Therefore, $C$ should be "close" to $A$ in order to reduce the condition number. The CG method, applied to (5.75) and then transformed back, leads to the preconditioned conjugate gradient method (PCG): The terms of the CG method applied to (5.75) will all be marked by $\tilde{\ }$, with the exception of $\alpha_k$ and $\beta_k$.
Due to the back transformation
$$x = W^{-T} \tilde x$$
the algorithm has the search direction
$$d^{(k)} := W^{-T} \tilde d^{(k)}$$
for the transformed iterate
$$x^{(k)} := W^{-T} \tilde x^{(k)}\,. \qquad (5.76)$$
The gradient $g^{(k)}$ of (5.44) in $x^{(k)}$ is given by
$$g^{(k)} := A x^{(k)} - b = W\big(B \tilde x^{(k)} - c\big) = W \tilde g^{(k)}\,,$$
and hence
$$g^{(k+1)} = g^{(k)} + \alpha_k W B \tilde d^{(k)} = g^{(k)} + \alpha_k A d^{(k)}\,,$$
so that this formula remains unchanged compared with the CG method, with a new interpretation of the search direction. The search directions are updated by
$$d^{(k+1)} = -W^{-T} W^{-1} g^{(k+1)} + \beta_k d^{(k)} = -C^{-1} g^{(k+1)} + \beta_k d^{(k)}\,,$$
so that in each iteration step additionally the system of equations $C h^{(k+1)} = g^{(k+1)}$ has to be solved.
Finally, we have
$$\tilde g^{(k)T} \tilde g^{(k)} = g^{(k)T} C^{-1} g^{(k)} = g^{(k)T} h^{(k)}$$
and
$$\tilde d^{(k)T} B \tilde d^{(k)} = d^{(k)T} A d^{(k)}\,,$$
so that the algorithm takes the form of Table 5.3.

Choose any $x^{(0)} \in \mathbb{R}^m$ and calculate
$$g^{(0)} = A x^{(0)} - b\,, \qquad d^{(0)} := -h^{(0)} := -C^{-1} g^{(0)}\,.$$
For $k = 0, 1, \dots$ put
$$\alpha_k = \frac{g^{(k)T} h^{(k)}}{d^{(k)T} A d^{(k)}}\,,$$
$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}\,,$$
$$g^{(k+1)} = g^{(k)} + \alpha_k A d^{(k)}\,,$$
$$h^{(k+1)} = C^{-1} g^{(k+1)}\,,$$
$$\beta_k = \frac{g^{(k+1)T} h^{(k+1)}}{g^{(k)T} h^{(k)}}\,,$$
$$d^{(k+1)} = -h^{(k+1)} + \beta_k d^{(k)}\,,$$
up to the termination criterion ("$|g^{(k+1)}|_2 = 0$").

Table 5.3. PCG method.
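The following is a compact rendering of Table 5.3 in Python/NumPy (an illustrative sketch, not part of the original text): the preconditioner enters only through a routine solve_C that returns $C^{-1}r$, so that no decomposition $C = WW^T$ is ever needed, and the idealized criterion "$|g^{(k+1)}|_2 = 0$" is replaced by a relative residual tolerance; the names pcg, solve_C, and tol are our choices.

```python
import numpy as np

def pcg(A, b, solve_C, x0=None, tol=1e-8, maxiter=1000):
    """Preconditioned CG as in Table 5.3; solve_C(r) must return C^{-1} r."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    g = A @ x - b                      # gradient g^(0) = A x^(0) - b
    h = solve_C(g)                     # h^(0) = C^{-1} g^(0)
    d = -h
    gh = g @ h                         # g^(k)^T h^(k)
    for _ in range(maxiter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            break                      # practical stand-in for "|g|_2 = 0"
        Ad = A @ d
        alpha = gh / (d @ Ad)
        x = x + alpha * d
        g = g + alpha * Ad
        h = solve_C(g)
        gh_next = g @ h
        beta = gh_next / gh
        d = -h + beta * d
        gh = gh_next
    return x
```

Passing solve_C = lambda r: r reproduces the unpreconditioned CG method; the preconditioners discussed below (diagonal scaling, SSOR, ILU) supply nontrivial choices of solve_C.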


The solution of the additional systems of equations for sparse matrices should have the complexity $O(m)$, in order not to worsen the complexity of an iteration. It is not necessary to know a decomposition $C = W W^T$.
Alternatively, the PCG method can be established by noting that $C^{-1} A$ is self-adjoint and positive definite with respect to the energy scalar product $\langle \cdot, \cdot \rangle_C$ defined by $C$:
$$\langle C^{-1} A x, y \rangle_C = \big(C^{-1} A x\big)^T C y = x^T A y = x^T C \big(C^{-1} A y\big) = \langle x, C^{-1} A y \rangle_C\,,$$
and hence also $\langle C^{-1} A x, x \rangle_C > 0$ for $x \ne 0$.
Choosing the CG method for (5.75) with respect to $\langle \cdot, \cdot \rangle_C$, we obtain precisely the above method.
In case the termination criterion "$\|g^{(k+1)}\|_2 = 0$" is used for the iteration, the scalar product must be additionally calculated. Alternatively, we may use "$g^{(k+1)T} h^{(k+1)} = 0$". Then the residual is measured in the norm $\|\cdot\|_{C^{-1}}$.
Following the reasoning at the end of Section 5.2, the PCG method can be interpreted as an acceleration of a linear stationary method with iteration matrix
$$M = I - C^{-1} A\,.$$
For a consistent method, we have $N = C^{-1}$ or, in the formulation (5.10), $W = C$. This observation can be extended in such a way that the CG method can be used for the acceleration of iteration methods, for example also for the multigrid method, which will be introduced in Section 5.5. Due to the deduction of the preconditioned CG method and the identity
$$\|x^{(k)} - x\|_A = \|\tilde x^{(k)} - \tilde x\|_B\,,$$
which results from the transformation (5.76), the approximation properties for the CG method also hold for the PCG method if the spectral condition number $\kappa(A)$ is replaced by $\kappa(B) = \kappa(C^{-1} A)$. Therefore,
$$\|x^{(k)} - x\|_A \;\le\; 2\left(\frac{\kappa^{1/2} - 1}{\kappa^{1/2} + 1}\right)^{k} \|x^{(0)} - x\|_A$$
with $\kappa = \kappa(C^{-1} A)$.
There is a close relation between those preconditioning matrices $C$ that keep $\kappa(C^{-1} A)$ small and well-convergent linear stationary iteration methods with $N = C^{-1}$ (and $M = I - C^{-1} A$) if $N$ is symmetric and positive definite. Indeed,
$$\kappa(C^{-1} A) \;\le\; \big(1 + \varrho(M)\big)\big/\big(1 - \varrho(M)\big)$$
if the method defined by $M$ and $N$ is convergent and $N$ is symmetric for symmetric $A$ (see Exercise 5.7).
From the linear stationary methods considered so far, because of the required symmetry, we may take the following:
• Jacobi's method:
This corresponds exactly to the diagonal scaling, which means the division of each equation by its diagonal element. Indeed, from the decomposition (5.18) we have $C = N^{-1} = D$, and the PCG method is equivalent to the preconditioning from the left by the matrix $C^{-1}$ in combination with the usage of the energy scalar product $\langle \cdot, \cdot \rangle_C$.
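In terms of the PCG sketch after Table 5.3, diagonal scaling is the simplest possible choice of the routine solve_C (a sketch; the helper name jacobi_preconditioner is ours, and A is assumed to be available as a dense NumPy array):

```python
import numpy as np

def jacobi_preconditioner(A):
    """solve_C for C = D = diag(A): applying C^{-1} is one componentwise division."""
    d = np.diag(A).copy()
    return lambda g: g / d

# usage with the PCG sketch above: x = pcg(A, b, solve_C=jacobi_preconditioner(A))
```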
• The SSOR method:
According to (5.38) we have
$$C = \omega^{-1} (2 - \omega)^{-1} (D + \omega L)\, D^{-1} (D + \omega L^T)\,.$$
Hence $C$ is symmetric and positive definite. The solution of the auxiliary systems of equations needs only forward and backward substitutions with the same structure of the matrix as for the system matrix, so that the requirement of lower complexity is also fulfilled. An exact estimation of $\kappa(C^{-1} A)$ shows (see [3, pp. 328 ff.]) that under certain requirements for $A$, which reflect properties of the boundary value problem and the discretization, we find a considerable improvement of the conditioning by using an estimate of the type
$$\kappa(C^{-1} A) \;\le\; \mathrm{const}\,\big(\kappa(A)^{1/2} + 1\big)\,.$$
The choice of the relaxation parameter $\omega$ is not critical. Instead of trying to choose an optimal one for the contraction number of the SSOR method, we can minimize an estimate of $\kappa(C^{-1} A)$ (see [3, p. 337]), which recommends a choice of $\omega$ in $[1.2, 1.6]$.
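A dense-matrix sketch of the corresponding routine solve_C is shown below (for illustration only; it assumes the splitting $A = L + D + L^T$ with $L$ the strictly lower triangular part of $A$, uses $\omega = 1.5$ merely as an example value from the recommended range, and relies on SciPy's triangular solver for the forward and backward substitutions; a practical version would store the triangular factors in sparse format):

```python
import numpy as np
from scipy.linalg import solve_triangular

def ssor_preconditioner(A, omega=1.5):
    """solve_C for C = omega^{-1} (2 - omega)^{-1} (D + omega L) D^{-1} (D + omega L^T),
    assuming A = L + D + L^T with L the strictly lower triangular part of A."""
    d = np.diag(A).copy()
    lower = omega * np.tril(A, k=-1) + np.diag(d)        # D + omega L
    upper = lower.T                                       # D + omega L^T
    scale = omega * (2.0 - omega)

    def solve_C(g):
        y = solve_triangular(lower, g, lower=True)        # (D + omega L) y = g
        h = solve_triangular(upper, d * y, lower=False)   # (D + omega L^T) h = D y
        return scale * h

    return solve_C

# usage with the PCG sketch above: x = pcg(A, b, solve_C=ssor_preconditioner(A))
```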
For the five-point stencil discretization of the Poisson equation on the square we have, according to (5.34), $\kappa(A) = O(n^2)$, and the above conditions are fulfilled (see [3, pp. 330 f.]). By SSOR preconditioning this is improved to $\kappa(C^{-1} A) = O(n)$, and therefore the complexity of the method is
$$O\big(\kappa^{1/2}\big)\,\ln\frac{1}{\varepsilon}\; O(m) \;=\; O\big(n^{1/2}\big)\,\ln\frac{1}{\varepsilon}\; O(m) \;=\; O\big(m^{5/4}\big)\,\ln\frac{1}{\varepsilon}\,. \qquad (5.77)$$
As discussed in Section 2.5, direct elimination methods are not suitable in conjunction with the discretization of boundary value problems with large node numbers, because in general fill-in occurs. In the notation of Section 2.5, $L = (l_{ij})$ denotes a lower triangular matrix with $l_{ii} = 1$ for all $i = 1, \dots, m$ (the dimension is denoted there by the number of degrees of freedom $M$) and $U = (u_{ij})$ an upper triangular matrix. The idea of the incomplete LU factorization, or ILU factorization, is to allow only a certain pattern $E \subset \{1, \dots, m\}^2$ for the entries of $L$ and $U$, and instead of $A = LU$, in general we can require only
$$A = L U - R\,.$$
Here the remainder $R = (r_{ij}) \in \mathbb{R}^{m,m}$ has to satisfy
$$r_{ij} = 0 \quad \text{for } (i, j) \in E\,. \qquad (5.78)$$
The requirements
$$a_{ij} = \sum_{k=1}^{m} l_{ik} u_{kj} \quad \text{for } (i, j) \in E \qquad (5.79)$$
mean $|E|$ equations for the $|E|$ entries of the matrices $L$ and $U$. (Notice that $l_{ii} = 1$ for all $i$.) The existence of such factorizations will be discussed later.
Analogously to the close connection between the LU factorization and an $LDL^T$ or $LL^T$ factorization for symmetric or symmetric positive definite matrices, as defined in Section 2.5, we can use the IC factorization (incomplete Cholesky factorization) for such matrices. The IC factorization needs a representation of the following form:
$$A = L L^T - R\,.$$
Based on an ILU factorization, a linear stationary method is defined by $N = (LU)^{-1}$ (and $M = I - N A$), the ILU iteration. We thus have an expansion of the old method of iterative refinement.
Using $C = N^{-1} = LU$ for the preconditioning, the complexity of the auxiliary systems depends on the choice of the matrix pattern $E$. In general, $E$ is required to contain the positions of all nonzero entries of $A$ as well as all diagonal positions:
$$\big\{(i, j) \;\big|\; a_{ij} \ne 0\,,\ i, j = 1, \dots, m\big\} \subset E\,, \qquad \big\{(i, i) \;\big|\; i = 1, \dots, m\big\} \subset E\,. \qquad (5.80)$$
Most often $E$ is required to consist exactly of these positions. Then, and also in the case of fixed enlargements of this pattern, it is ensured that for a sequence of systems of equations with a matrix $A$ that is sparse in the strict sense, this will also hold for $L$ and $U$. All in all, only $O(m)$ operations are necessary, including the calculation of $L$ and $U$, as in the case of the SSOR preconditioning for the auxiliary system of equations. On the other hand, the remainder $R$ should be rather small in order to ensure a good convergence of the ILU iteration and also to ensure a small spectral condition number $\kappa(C^{-1} A)$. Possible matrix patterns $E$ are shown, for example, in [28, pp. 275 ff.], where a more specific structure of $L$ and $U$ is discussed if the matrix $A$ is created by a discretization on a structured grid, for example by a finite difference method.
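The following sketch computes an ILU factorization for the most common choice described above, where the pattern $E$ consists exactly of the positions of the nonzero entries of $A$ together with the diagonal (often called ILU(0)); it operates on dense arrays purely for readability, a practical version would exploit sparse storage, and it does not address possible breakdown, which belongs to the existence question raised next.

```python
import numpy as np
from scipy.linalg import solve_triangular

def ilu0(A):
    """Incomplete LU for the pattern E = {(i,j): a_ij != 0} plus the diagonal:
    entries of L and U outside this pattern are simply never created."""
    m = A.shape[0]
    pattern = (A != 0) | np.eye(m, dtype=bool)
    LU = np.array(A, dtype=float)        # unit-lower L and U stored in one array
    for i in range(1, m):
        for k in range(i):               # eliminate position (i, k) ...
            if not pattern[i, k]:
                continue                 # ... but only inside the pattern
            LU[i, k] /= LU[k, k]
            for j in range(k + 1, m):
                if pattern[i, j]:
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, k=-1) + np.eye(m)
    U = np.triu(LU)
    return L, U

def ilu_preconditioner(A):
    """solve_C for C = LU in the sense of the PCG sketch above."""
    L, U = ilu0(A)
    return lambda g: solve_triangular(U, solve_triangular(L, g, lower=True), lower=False)
```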
The question of the existence (and stability) of an ILU factorization
