A more complete and detailed discussion of preconditioners is in [8] and a
concise survey with many pointers to the literature is in [12].
Some effective preconditioners are based on deep insight into the structure
of the problem. See [124] for an example in the context of partial differential
equations, where it is shown that certain discretized second-order elliptic
problems on simple geometries can be very well preconditioned with fast
Poisson solvers [99], [188], and [187]. Similar performance can be obtained from
multigrid [99], domain decomposition [38], [39], [40], and alternating direction
preconditioners [8], [149], [193], [194]. We use a Poisson solver preconditioner
in the examples in § 2.7 and § 3.7 as well as for nonlinear problems in § 6.4.2
and § 8.4.2.
One commonly used and easily implemented preconditioner is Jacobi
preconditioning, where M is the inverse of the diagonal part of A. One can also
use other preconditioners based on the classical stationary iterative methods,
such as the symmetric Gauss-Seidel preconditioner (1.18). For applications to
partial differential equations, these preconditioners may be somewhat useful,
but should not be expected to have dramatic effects.
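To make this concrete, here is a minimal sketch (ours, not code from the book's
collection) of the action z = M r of these two preconditioners on a residual r.
The symmetric Gauss-Seidel form shown is the common one built from the splitting
of A into diagonal, strictly lower, and strictly upper parts; compare (1.18).

    % Sketch: stationary-method preconditioners for an spd matrix A, applied as z = M*r.
    function z = jacobi_prec(A, r)
        z = r ./ diag(A);                     % M = D^{-1}, D = diagonal part of A
    end

    function z = sgs_prec(A, r)
        % Symmetric Gauss-Seidel: M = (D+U)^{-1} D (D+L)^{-1},
        % so that M^{-1} = (D+L) D^{-1} (D+U) approximates A.
        D = diag(diag(A)); L = tril(A, -1); U = triu(A, 1);
        z = (D + U) \ (D * ((D + L) \ r));
    end

In a matrix-free setting one would, of course, store only the diagonal (or the
triangular parts) rather than pass A itself.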
Another approach is to apply a sparse Cholesky factorization to the
matrix A (thereby giving up a fully matrix-free formulation) and discard
small elements of the factors and/or allow only a fixed amount of storage
for the factors. Such preconditioners are called incomplete factorization
preconditioners. So if A = LL^T + E, where E is small, the preconditioner
is (LL^T)^{-1} and its action on a vector is done by two sparse triangular solves.
We refer the reader to [8], [127], and [44] for more detail.
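In MATLAB, for instance, an incomplete Cholesky factor can be computed with the
built-in function ichol and the preconditioner applied with two sparse triangular
solves; the sketch below is only illustrative, and the drop tolerance 1e-3 is an
arbitrary choice.

    % Sketch: incomplete Cholesky preconditioning of a sparse spd matrix A.
    L = ichol(A, struct('type', 'ict', 'droptol', 1e-3));   % A ~ L*L'
    prec = @(r) L' \ (L \ r);   % action of (L*L')^{-1}: two triangular solves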
One could also attempt to estimate the spectrum of A, find a polynomial
p such that 1 - zp(z) is small on the approximate spectrum, and use p(A) as a
preconditioner. This is called polynomial preconditioning. The preconditioned
system is

\[ p(A)Ax = p(A)b \]

and we would expect the spectrum of p(A)A to be more clustered near z = 1
than that of A. If an interval containing the spectrum can be found, the
residual polynomial q(z) = 1 - zp(z) of smallest L-infinity norm on that interval
can be expressed in terms of Chebyshev [161] polynomials. Alternatively
q can be selected to solve a least squares minimization problem [5], [163].
The preconditioning polynomial p can be directly recovered from q and convergence rate
estimates made. This technique is used to prove the estimate (2.15), for
example. The cost of such a preconditioner, if a polynomial of degree K is
used, is K matrix-vector products for each application of the preconditioner
[5]. The performance gains can be very significant and the implementation is
matrix-free.
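Given the coefficients of p, the product p(A)v needed for each preconditioner
application can be accumulated with Horner's rule using only matrix-vector
products with A. The sketch below is ours (not from the book's codes); atv is a
function handle applying A to a vector, and the coefficient layout is an
assumption stated in the comments.

    % Sketch: apply the polynomial preconditioner p(A) to a vector v by Horner's
    % rule.  Coefficient layout (ours): c(1) = c_K, ..., c(K+1) = c_0 for
    % p(z) = c_K z^K + ... + c_1 z + c_0, so the loop costs K products with A.
    function w = poly_prec(atv, c, v)
        w = c(1) * v;
        for k = 2:length(c)
            w = atv(w) + c(k) * v;      % Horner step: w <- A*w + c_k*v
        end
    end

The product p(A)Av for the preconditioned system is then poly_prec(atv, c, atv(v)),
at a cost of K + 1 matrix-vector products in all.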




2.6. CGNR and CGNE
If A is nonsingular and nonsymmetric, one might consider solving Ax = b by
applying CG to the normal equations

\[ A^T A x = A^T b. \tag{2.32} \]

This approach [103] is called CGNR [71], [78], [134]. The reason for this name
is that the minimization property of CG as applied to (2.32) asserts that

\[ \|x^* - x\|_{A^T A}^2 = (x^* - x)^T A^T A (x^* - x)
   = (Ax^* - Ax)^T (Ax^* - Ax) = (b - Ax)^T (b - Ax) = \|r\|_2^2 \]

is minimized over x_0 + K_k at each iterate. Hence the name Conjugate Gradient
on the Normal equations to minimize the Residual.
Alternatively, one could solve

\[ AA^T y = b \tag{2.33} \]
and then set x = A^T y to solve Ax = b. This approach [46] is now called CGNE
[78], [134]. The reason for this name is that the minimization property of CG
as applied to (2.33) asserts that if y^* is the solution to (2.33) then

\[ \|y^* - y\|_{AA^T}^2 = (y^* - y)^T (AA^T)(y^* - y)
   = (A^T y^* - A^T y)^T (A^T y^* - A^T y) = \|x^* - x\|_2^2 \]

is minimized over y_0 + K_k at each iterate. Hence the name Conjugate Gradient
on the Normal equations to minimize the Error.
The advantages of this approach are that all the theory for CG carries over
and the simple implementation for both CG and PCG can be used. There
are three disadvantages that may or may not be serious. The first is that the
condition number of the coefficient matrix A^T A is the square of that of A.
The second is that two matrix-vector products are needed for each CG iterate
since w = A^T A p = A^T(Ap) in CGNR and w = AA^T p = A(A^T p) in CGNE.
The third, more important, disadvantage is that one must compute the action
of A^T on a vector as part of the matrix-vector product involving A^T A. As we
will see in the chapter on nonlinear problems, there are situations where this
is not possible.
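A minimal CGNR sketch (ours, not the book's pcgsol) makes the second and third
points concrete: each iteration needs one product with A and one with A^T, and
plain CG is simply run on the normal equations. Here atv and atvt are function
handles applying A and A^T; terminating on the normal-equations residual is a
simplification (a practical code would monitor b - Ax instead).

    % Sketch: CGNR = CG applied to A'*A*x = A'*b, with A and A' as handles.
    function x = cgnr_sketch(x, b, atv, atvt, tol, maxit)
        r = atvt(b - atv(x));          % normal-equations residual A'*(b - A*x)
        p = r; rho = r' * r;
        for k = 1:maxit
            if sqrt(rho) <= tol, break, end
            w = atvt(atv(p));          % A'*(A*p): two matrix-vector products
            alpha = rho / (p' * w);
            x = x + alpha * p;
            r = r - alpha * w;
            rhonew = r' * r;
            p = r + (rhonew / rho) * p;
            rho = rhonew;
        end
    end

CGNE is the same loop with the roles of atv and atvt exchanged, which computes y;
one then sets x = A^T y.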
The analysis with residual polynomials is similar to that for CG. We
consider the case for CGNR; the analysis for CGNE is essentially the same.
As above, when we consider the A^T A norm of the error, we have

\[ \|x^* - x\|_{A^T A}^2 = (x^* - x)^T A^T A (x^* - x) = \|A(x^* - x)\|_2^2 = \|r\|_2^2. \]

Hence, for any residual polynomial \bar{p}_k \in \mathcal{P}_k,

\[ \|r_k\|_2 \le \|\bar{p}_k(A^T A) r_0\|_2 \le \|r_0\|_2 \max_{z \in \sigma(A^T A)} |\bar{p}_k(z)|. \tag{2.34} \]





There are two major differences between (2.34) and (2.7). The estimate is
in terms of the l^2 norm of the residual, which corresponds exactly to the
termination criterion; hence we need not prove a result like Lemma 2.3.2. Most
significantly, the residual polynomial is to be maximized over the eigenvalues
of A^T A, which is the set of the squares of the singular values of A. Hence the
performance of CGNR and CGNE is determined by the distribution of singular
values.

2.7. Examples for preconditioned conjugate iteration
In the collection of MATLAB codes we provide a code for preconditioned
conjugate gradient iteration. The inputs, described in the comment lines,
are the initial iterate x_0, the right hand side vector b, MATLAB functions for
the matrix-vector product and (optionally) the preconditioner, and iteration
parameters to specify the maximum number of iterations and the termination
criterion. On return the code supplies the approximate solution x and the
history of the iteration as the vector of residual norms.
We consider the discretization of the partial differential equation

\[ -\nabla \cdot (a(x,y)\nabla u) = f(x,y) \tag{2.35} \]

on 0 < x, y < 1 subject to homogeneous Dirichlet boundary conditions

\[ u(x,0) = u(x,1) = u(0,y) = u(1,y) = 0, \quad 0 < x, y < 1. \]

One can verify [105] that the differential operator is positive definite in the
Hilbert space sense and that the five-point discretization described below is
positive definite if a > 0 for all 0 <= x, y <= 1 (Exercise 2.8.10).
We discretize with a five-point centered difference scheme with n^2 points
and mesh width h = 1/(n + 1). The unknowns are

\[ u_{ij} \approx u(x_i, x_j) \]

where x_i = ih for 1 <= i <= n. We set

\[ u_{0j} = u_{(n+1)j} = u_{i0} = u_{i(n+1)} = 0, \]

to reflect the boundary conditions, and define

\[ \alpha_{ij} = -a(x_i, x_j)\, h^{-2}/2. \]

We express the discrete matrix-vector product as

\[
(Au)_{ij} = (\alpha_{ij} + \alpha_{(i+1)j})(u_{(i+1)j} - u_{ij})
 - (\alpha_{(i-1)j} + \alpha_{ij})(u_{ij} - u_{(i-1)j})
 + (\alpha_{i(j+1)} + \alpha_{ij})(u_{i(j+1)} - u_{ij})
 - (\alpha_{ij} + \alpha_{i(j-1)})(u_{ij} - u_{i(j-1)})
\tag{2.36}
\]

for 1 <= i, j <= n.
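A direct MATLAB transcription of (2.36) can pad u and alpha with their boundary
values and apply the stencil with array operations. The sketch below is ours,
written along these lines but not copied from the book's codes; a_fun is a handle
evaluating the coefficient a elementwise.

    % Sketch: apply the five-point operator (2.36) to u, an n x n array of
    % interior values (u = 0 on the boundary).  Padded index k corresponds to
    % grid index k-1, so alpha is available for grid indices 0..n+1.
    function au = fivepoint_matvec(u, a_fun)
        n = size(u, 1); h = 1/(n + 1);
        [X, Y] = ndgrid((0:n+1) * h);             % grid including boundary points
        ap = -a_fun(X, Y) / (2 * h^2);            % alpha_ij = -a(x_i, x_j) h^{-2}/2
        up = zeros(n + 2); up(2:n+1, 2:n+1) = u;  % homogeneous boundary values
        i = 2:n+1; j = 2:n+1;
        au =  (ap(i, j) + ap(i+1, j)) .* (up(i+1, j) - up(i, j)) ...
            - (ap(i-1, j) + ap(i, j)) .* (up(i, j) - up(i-1, j)) ...
            + (ap(i, j+1) + ap(i, j)) .* (up(i, j+1) - up(i, j)) ...
            - (ap(i, j) + ap(i, j-1)) .* (up(i, j) - up(i, j-1));
    end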
For the MATLAB implementation we convert freely between the representation
of u as a two-dimensional array (with the boundary conditions added), which is
useful for computing the action of A on u and applying fast solvers, and the
representation as a one-dimensional array, which is what pcgsol expects to see.
See the routine fish2d in the collection of MATLAB codes for an example of how
to do this in MATLAB.
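The conversion itself is one line in each direction, since MATLAB stores arrays
by columns; the generic sketch below is not the book's fish2d, which shows how
this is handled in context.

    % Sketch: converting between the two representations of u on an n x n grid.
    u1 = u2(:);               % n x n array  ->  n^2 x 1 vector (what pcgsol sees)
    u2 = reshape(u1, n, n);   % n^2 x 1 vector  ->  n x n array (for the stencil)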
For the computations reported in this section we took a(x, y) = cos(x) and
took the right hand side so that the exact solution was the discretization of

\[ 10\, x y (1 - x)(1 - y) \exp(x^{4.5}). \]

The initial iterate was u_0 = 0.
In the results reported here we took n = 31, resulting in a system with
N = n^2 = 961 unknowns. We expect second-order accuracy from the method
and accordingly we set the termination parameter to h^2 = 1/1024. We allowed
up to 100 CG iterations. The initial iterate was the zero vector. We will report
our results graphically, plotting ||r_k||_2 / ||b||_2 on a semi-log scale.
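Up to details of the CG code used, the computation can be reproduced with a short
driver. The sketch below reuses the hypothetical fivepoint_matvec helper sketched
above and calls MATLAB's built-in pcg rather than the book's pcgsol (whose calling
sequence is documented in its own comment lines), so iteration counts may differ
slightly from those reported here.

    % Sketch of the experiment: unpreconditioned CG on the discretized problem.
    n = 31; h = 1/(n + 1); N = n^2;
    a_fun = @(x, y) cos(x);
    [X, Y] = ndgrid((1:n) * h);                        % interior grid points
    uexact = 10 * X .* Y .* (1 - X) .* (1 - Y) .* exp(X.^4.5);
    atv = @(v) reshape(fivepoint_matvec(reshape(v, n, n), a_fun), N, 1);
    b = atv(uexact(:));                                % rhs so uexact is the solution
    [u, flag, relres, iter] = pcg(atv, b, h^2, 100);   % tolerance h^2, at most 100 iterations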
In Fig. 2.1 the solid line is a plot of ||r_k||_2/||b||_2 and the dashed line a plot of
||u^* - u_k||_A/||u^* - u_0||_A. Note that the reduction in ||r|| is not monotone. This is
consistent with the theory, which predicts decrease in ||e||_A but not necessarily
in ||r|| as the iteration progresses. Note that the unpreconditioned iteration is
slowly convergent. This can be explained by the fact that the eigenvalues are
not clustered and

\[ \kappa(A) = O(1/h^2) = O(n^2) = O(N), \]

and hence (2.15) indicates that convergence will be slow. The reader is asked
to quantify this in terms of execution times in Exercise 2.8.9. This example
illustrates the importance of a good preconditioner. Even the unpreconditioned
iteration, however, is more efficient than the classical stationary iterative
methods.
For a preconditioner we use a Poisson solver. By this we mean an operator
G such that v = Gw is the solution of the discrete form of

\[ -v_{xx} - v_{yy} = w, \]

subject to homogeneous Dirichlet boundary conditions. The effectiveness of
such a preconditioner has been analyzed in [124] and some of the many ways
to implement the solver efficiently are discussed in [99], [188], [186], and [187].
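A solver of this type can be written compactly with the discrete sine transform,
realized here through the built-in fft applied to an odd extension. The sketch
below is ours, not the book's fish2d, and it operates on the n x n array form of
the residual; to use it with a CG code that works on vectors, wrap it in the
reshaping described earlier, e.g. prec = @(r) reshape(poisson_solve_sketch(reshape(r, n, n)), n^2, 1).

    % Sketch: fast Poisson solver for the n x n five-point Laplacian with
    % homogeneous Dirichlet conditions, via the discrete sine transform.
    function v = poisson_solve_sketch(w)
        n = size(w, 1); h = 1/(n + 1);
        lam = (2 - 2*cos(pi*(1:n)'*h)) / h^2;       % 1-D eigenvalues of -d^2/dx^2
        what = dstcol(dstcol(w')');                 % 2-D sine transform of w
        vhat = what ./ (lam*ones(1,n) + ones(n,1)*lam');
        v = (2/(n + 1))^2 * dstcol(dstcol(vhat')'); % inverse transform (scaled forward)
    end

    function y = dstcol(x)
    % Columnwise discrete sine transform via the fft of an odd extension.
        [n, m] = size(x);
        z = [zeros(1, m); x; zeros(1, m); -x(n:-1:1, :)];
        zf = fft(z);
        y = -imag(zf(2:n+1, :)) / 2;
    end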
The properties of CG on the preconditioned problem in the continuous
case have been analyzed in [48]. For many types of domains and boundary
conditions, Poisson solvers can be designed to take advantage of vector and/or
parallel architectures or, in the case of the MATLAB environment used in
this book, designed to take advantage of fast MATLAB built-in functions.
Because of this, their execution time is less than a simple count of floating-point
operations would indicate. The fast Poisson solver in the collection of codes,
fish2d, is based on the MATLAB fast Fourier transform, the built-in function fft.

Fig. 2.1. CG for 2-D elliptic equation. (Semi-log plot of relative residual and A-norm of error against iterations.)

Fig. 2.2. PCG for 2-D elliptic equation. (Semi-log plot of relative residual against iterations.)
In Fig. 2.2 the solid line is the graph of ||r_k||_2/||b||_2 for the preconditioned
iteration and the dashed line for the unpreconditioned. The preconditioned
iteration required 5 iterations for convergence and the unpreconditioned
iteration 52. Not only does the preconditioned iteration converge more
rapidly, but the number of iterations required to reduce the relative residual
by a given amount is independent of the mesh spacing [124]. We caution
the reader that the preconditioned iteration is not as much faster than the
unpreconditioned one as the iteration count would suggest. The MATLAB
flops command indicates that the unpreconditioned iteration required roughly
1.2 million floating-point operations while the preconditioned iteration required
0.87 million floating-point operations. Hence, the cost of the preconditioner is
considerable. In the MATLAB environment we used, the execution time of
the preconditioned iteration was about 60% of that of the unpreconditioned.
As we remarked above, this speed is a result of the efficiency of the MATLAB
fast Fourier transform. In Exercise 2.8.11 you are asked to compare execution
times for your own environment.
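For such a comparison, wall-clock times of the two runs can be collected with tic
and toc (the flops command is not available in recent MATLAB releases). The sketch
below assumes the atv handle from the driver sketched earlier and a handle prec
applying the Poisson-solver preconditioner to a vector.

    % Sketch: timing the two iterations for Exercise 2.8.11.
    tic; pcg(atv, b, h^2, 100);        tu = toc;   % unpreconditioned
    tic; pcg(atv, b, h^2, 100, prec);  tp = toc;   % preconditioned
    fprintf('unpreconditioned %.2f s, preconditioned %.2f s\n', tu, tp);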









2.8. Exercises on conjugate gradient
2.8.1. Let {x_k} be the conjugate gradient iterates. Prove that r_l ∈ K_k for all
l < k.

2.8.2. Let A be spd. Show that there is a spd B such that B^2 = A. Is B
unique?

2.8.3. Let Λ be a diagonal matrix with Λ_ii = λ_i and let p be a polynomial.
Prove that ||p(Λ)|| = max_i |p(λ_i)| where || · || is any induced matrix norm.

2.8.4. Prove Theorem 2.2.3.

2.8.5. Assume that A is spd and that

\[ \sigma(A) \subset (1, 1.1) \cup (2, 2.2). \]

Give upper estimates based on (2.6) for the number of CG iterations
required to reduce the A norm of the error by a factor of 10^{-3} and for
the number of CG iterations required to reduce the residual by a factor
of 10^{-3}.

2.8.6. For the matrix A in problem 5, assume that the cost of a matrix-vector
multiply is 4N floating-point multiplies. Estimate the number of floating-
point operations required to reduce the A norm of the error by a factor of 10^{-3}
using CG iteration.

2.8.7. Let A be a nonsingular matrix with all singular values in the interval
(1, 2). Estimate the number of CGNR/CGNE iterations required to
reduce the relative residual by a factor of 10^{-4}.

2.8.8. Show that if A has constant diagonal then PCG with Jacobi precondi-
tioning produces the same iterates as CG with no preconditioning.

2.8.9. Assume that A is N × N, nonsingular, and spd. If κ(A) = O(N), give
a rough estimate of the number of CG iterates required to reduce the
relative residual to O(1/N).

2.8.10. Prove that the linear transformation given by (2.36) is symmetric and
positive definite on R^{n^2} if a(x, y) > 0 for all 0 <= x, y <= 1.

2.8.11. Duplicate the results in § 2.7, for example in MATLAB, by writing the
matrix-vector product routines and using the MATLAB codes pcgsol
and fish2d. What happens as N is increased? How are the performance