
The specific formulation of the sequences $\{y_m\}$ and $\{w_m\}$ depends on the algorithm.
Returning to Algorithm cgs we define sequences

$y_m = \begin{cases} u_{k-1} & \text{if } m = 2k - 1 \text{ is odd}, \\ q_k & \text{if } m = 2k \text{ is even}, \end{cases}$

and

$w_m = \begin{cases} (\bar p_{k-1}(A))^2 r_0 & \text{if } m = 2k - 1 \text{ is odd}, \\ \bar p_k(A)\,\bar p_{k-1}(A)\, r_0 & \text{if } m = 2k \text{ is even}. \end{cases}$

Here the sequences $\{\bar p_k\}$ and $\{\bar q_k\}$ are given by (3.20) and (3.21). Hence, the CGS residual satisfies

$r_k^{CGS} = w_{2k+1}.$
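To see where (3.23) below comes from, recall the CGS residual update; we quote it here in its standard form, as an orientation for the reader:

$r_k^{CGS} = r_{k-1}^{CGS} - \alpha_{k-1} A(u_{k-1} + q_k).$

With the definitions above this gives

$w_{2k+1} - w_{2k-1} = r_k^{CGS} - r_{k-1}^{CGS} = -\alpha_{k-1} A(u_{k-1} + q_k) = -\alpha_{k-1} A(y_{2k-1} + y_{2k}),$

so each CGS step is split into the two half-steps recorded in (3.23).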
We assume that CGS does not break down and, therefore, $\alpha_k \ne 0$ for all
$k \ge 0$. If we let $\lfloor r \rfloor$ denote the nearest integer less than or equal to a real $r$, we
have

(3.23)  $Ay_m = (w_m - w_{m+1})/\alpha_{\lfloor (m-1)/2 \rfloor},$

where the denominator is nonzero by assumption. We express (3.23) in matrix
form as
(3.24)  $AY_m = W_{m+1} B_m,$

where $Y_m$ is the $N \times m$ matrix with columns $\{y_j\}_{j=1}^m$, $W_{m+1}$ the $N \times (m+1)$
matrix with columns $\{w_j\}_{j=1}^{m+1}$, and $B_m$ is the $(m+1) \times m$ matrix given by

$B_m = \begin{pmatrix} 1 & 0 & \dots & 0 \\ -1 & 1 & \ddots & \vdots \\ 0 & -1 & \ddots & 0 \\ \vdots & \ddots & \ddots & 1 \\ 0 & \dots & 0 & -1 \end{pmatrix} \operatorname{diag}(\alpha_0, \alpha_0, \ldots, \alpha_{\lfloor (m-1)/2 \rfloor})^{-1}.$
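As a concrete instance, worked out directly from the definition rather than taken from the text, for $m = 3$ we have $\lfloor (m-1)/2 \rfloor = 1$ and

$B_3 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{pmatrix} \operatorname{diag}(\alpha_0, \alpha_0, \alpha_1)^{-1} = \begin{pmatrix} 1/\alpha_0 & 0 & 0 \\ -1/\alpha_0 & 1/\alpha_0 & 0 \\ 0 & -1/\alpha_0 & 1/\alpha_1 \\ 0 & 0 & -1/\alpha_1 \end{pmatrix},$

so that column $j$ of $AY_3 = W_4 B_3$ reproduces (3.23) for $m = j$.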

Our assumptions that no breakdown takes place imply that $\mathcal{K}_m$ is the span
of $\{y_j\}_{j=1}^m$ and hence

(3.25)  $x_m = x_0 + Y_m z$

for some $z \in R^m$. Therefore,

(3.26)  $r_m = r_0 - AY_m z = W_{m+1}(e_1 - B_m z),$




where, as in our discussion of GMRES, $e_1 = (1, 0, \ldots, 0)^T \in R^{m+1}$. In a sense,
the conditioning of the matrix $W_{m+1}$ can be improved by scaling. Let

(3.27)  $\Omega_{m+1} = \operatorname{diag}(\omega_1, \omega_2, \ldots, \omega_{m+1})$

and rewrite (3.26) as

(3.28)  $r_m = r_0 - AY_m z = W_{m+1}\Omega_{m+1}^{-1}(f_{m+1} - H_m z),$

where $f_{m+1} = \Omega_{m+1} e_1 = \omega_1 e_1$ and $H_m = \Omega_{m+1} B_m$ is bidiagonal and hence
upper Hessenberg. The diagonal entries in the matrix $\Omega_{m+1}$ are called weights.
The weights make it possible to derive straightforward estimates of the residual
in terms of easily computed terms in the iteration.
If $W_{m+1}\Omega_{m+1}^{-1}$ were an orthogonal matrix, we could minimize $\|r_m\|_2$ by solving
the least squares problem

(3.29)  $\min_{z \in R^m} \|f_{m+1} - H_m z\|_2.$

The quasi-minimization idea is to solve (3.29) (thereby quasi-minimizing the
residual) to obtain $z_m$ and then set

(3.30)  $x_m = x_0 + Y_m z_m.$

Note that if $k$ corresponds to the iteration counter for CGS, each $k$ produces
two approximate solutions (corresponding to $m = 2k - 1$ and $m = 2k$).
The least squares problem (3.29) can be solved by using Givens rotations
to update the factorization of the upper Hessenberg matrix $H_m$ [77], and this
is reflected in the implementation description below. As one can see from that
description, approximate residuals are not computed in Algorithm tfqmr.
Instead we have the quasi-residual norm

$\tau_m = \|f_{m+1} - H_m z_m\|_2.$
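Because $H_m$ has only a diagonal and a subdiagonal, each new column requires a single new Givens rotation, and the scalar recursions in steps iii and iv of Algorithm tfqmr below can be read as exactly this update. In outline (our paraphrase of those steps, with subscripts added),

$\theta_m = \|w_{m+1}\|_2/\tau_{m-1}, \qquad c_m = 1/\sqrt{1 + \theta_m^2}, \qquad \tau_m = \tau_{m-1}\theta_m c_m,$

so the quasi-residual norm can be accumulated from scalars alone, without forming $H_m$, $z_m$, or any residual vector.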

Now if we pick the weights $\omega_i = \|w_i\|_2$, then the columns of $W_{m+1}\Omega_{m+1}^{-1}$ have
norm one. Since a matrix with $m + 1$ columns of norm one has 2-norm at most
$\sqrt{m+1}$, we have

(3.31)  $\|r_m\|_2 = \|W_{m+1}\Omega_{m+1}^{-1}(f_{m+1} - H_m z_m)\|_2 \le \tau_m\sqrt{m+1}.$

We can, therefore, base our termination criterion on $\tau_m$, and we do this in
Algorithm tfqmr.
One can also use the quasi-minimization condition to estimate the TFQMR
residual in terms of residual polynomials [80], [78].
Theorem 3.6.1. Let $A$ be an $N \times N$ matrix and let $x_0, b \in R^N$ be given.
Assume that the TFQMR iteration does not break down or terminate with
the exact solution for $k$ iterations and let $1 \le m \le 2k$. Let $r_m^{GMRES}$ be the
residual for the $m$th GMRES iteration. Let $\xi_m$ be the smallest singular value
of $W_{m+1}\Omega_{m+1}^{-1}$. Then if $\xi_m > 0$

(3.32)  $\tau_m \le \|r_m^{GMRES}\|_2/\xi_m.$





Proof. We may write

$r_m^{GMRES} = r_0 - AY_m z_m^{GMRES}$

and obtain

$\|r_m^{GMRES}\|_2 = \|W_{m+1}\Omega_{m+1}^{-1}(f_{m+1} - H_m z_m^{GMRES})\|_2 \ge \xi_m \|f_{m+1} - H_m z_m^{GMRES}\|_2.$

Hence, by the quasi-minimization property

$\|r_m^{GMRES}\|_2 \ge \xi_m \tau_m$

as desired.
As a corollary we have a finite termination result and a convergence
estimate for diagonalizable matrices similar to Theorem 3.1.3.
Corollary 3.6.1. Let $A$ be an $N \times N$ matrix and let $x_0, b \in R^N$ be given.
Then within $(N + 1)/2$ iterations the TFQMR iteration will either break down
or terminate with the solution.
We apply Theorem 3.1.3 to (3.32) and use (3.31) to obtain the following
result.
Theorem 3.6.2. Let $A = V\Lambda V^{-1}$ be a nonsingular diagonalizable matrix.
Assume that TFQMR does not break down or terminate with the solution for $k$
iterations. For $1 \le m \le 2k$ let $\xi_m$ be the smallest singular value of $W_{m+1}\Omega_{m+1}^{-1}$
and let $x_m$ be given by (3.30). Then, if $\xi_m > 0$,

$\dfrac{\|r_m\|_2}{\|r_0\|_2} \le \sqrt{m+1}\,\xi_m^{-1}\,\kappa(V) \max_{z \in \sigma(A)} |\phi(z)|$

for all $\phi \in \mathcal{P}_m$.
The implementation below follows [77] with $\hat r_0 = r_0$ used throughout.

Algorithm 3.6.4. tfqmr($x, b, A, \epsilon, kmax$)
1. $k = 0$, $w_1 = y_1 = r_0 = b - Ax$, $u_1 = v = Ay_1$, $d = 0$,
   $\rho_0 = r_0^T r_0$, $\tau = \|r\|_2$, $\theta = 0$, $\eta = 0$

2. Do While $k <$ kmax
   (a) $k = k + 1$
   (b) $\sigma_{k-1} = r_0^T v$, $\alpha_{k-1} = \rho_{k-1}/\sigma_{k-1}$, $y_2 = y_1 - \alpha_{k-1} v$, $u_2 = Ay_2$
   (c) For $j = 1, 2$ ($m = 2k - 2 + j$)
       i.   $w = w - \alpha_{k-1} u_j$
       ii.  $d = y_j + (\theta^2 \eta / \alpha_{k-1}) d$
       iii. $\theta = \|w\|_2/\tau$, $c = 1/\sqrt{1 + \theta^2}$
       iv.  $\tau = \tau \theta c$, $\eta = c^2 \alpha_{k-1}$
       v.   $x = x + \eta d$
       vi.  If $\tau \sqrt{m+1} \le \epsilon \|b\|_2$ terminate successfully
   (d) $\rho_k = r_0^T w$, $\beta = \rho_k/\rho_{k-1}$




   (e) $y_1 = w + \beta y_2$, $u_1 = Ay_1$
   (f) $v = u_1 + \beta(u_2 + \beta v)$

Note that y2 and u2 = Ay2 need only be computed if the loop in step 2c
does not terminate when j = 1. We take advantage of this in the MATLAB
code tfqmr to potentially reduce the matrix-vector product cost by one.
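To make the data flow concrete, here is a minimal MATLAB sketch of Algorithm 3.6.4. It is our illustration, not the tfqmr code from the book's collection: the function name tfqmr_sketch, the breakdown test, and the returned history are our own choices, and atv is assumed to be a function handle computing the matrix-vector product. Unlike the book's code, this sketch forms u2 eagerly, as in step 2(b) of the algorithm.

function [x, tau_hist] = tfqmr_sketch(x, b, atv, errtol, kmax)
% Minimal sketch of Algorithm 3.6.4 (tfqmr); illustration only.
% atv is a handle for v -> A*v; tau_hist records the quasi-residual norms.
r0 = b - atv(x); w = r0; y1 = r0; u1 = atv(y1); v = u1;
d = zeros(size(b)); rho = r0'*r0; tau = norm(r0); theta = 0; eta = 0;
tau_hist = tau; k = 0;
while k < kmax
    k = k + 1;
    sigma = r0'*v;
    if sigma == 0, error('TFQMR breakdown, sigma = 0'); end
    alpha = rho/sigma;                             % alpha_{k-1}
    y2 = y1 - alpha*v; u2 = atv(y2);               % step 2(b)
    for j = 1:2
        m = 2*k - 2 + j;
        if j == 1, y = y1; u = u1; else, y = y2; u = u2; end
        w = w - alpha*u;                           % step i
        d = y + (theta^2*eta/alpha)*d;             % step ii
        theta = norm(w)/tau; c = 1/sqrt(1 + theta^2);  % step iii
        tau = tau*theta*c; eta = c^2*alpha;        % step iv
        x = x + eta*d;                             % step v
        tau_hist = [tau_hist, tau];                %#ok<AGROW>
        if tau*sqrt(m + 1) <= errtol*norm(b)       % step vi
            return
        end
    end
    rho_old = rho; rho = r0'*w; beta = rho/rho_old;    % step (d)
    y1 = w + beta*y2; u1 = atv(y1);                % step (e)
    v = u1 + beta*(u2 + beta*v);                   % step (f)
end
end

A call might look like [x, tauh] = tfqmr_sketch(zeros(n,1), b, @(v) L*v, 1.e-6, 100) for an explicitly stored L.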

3.7. Examples for GMRES iteration
These examples use the code gmres from the collection of MATLAB codes. The
inputs, described in the comment lines, are the initial iterate x0 , the right-hand
side vector b, a MATLAB function for the matrix-vector product, and iteration
parameters to specify the maximum number of iterations and the termination
criterion. We do not pass the preconditioner to the GMRES code, rather
we pass the preconditioned problem. In all the GMRES examples, we limit
the number of GMRES iterations to 60. For methods like GMRES(m), Bi-
CGSTAB, CGNR, and TFQMR, whose storage requirements do not increase
with the number of iterations, we allow for more iterations.
We consider the discretization of the partial differential equation

(3.33)  $(Lu)(x, y) = -(u_{xx}(x, y) + u_{yy}(x, y)) + a_1(x, y)u_x(x, y) + a_2(x, y)u_y(x, y) + a_3(x, y)u(x, y) = f(x, y)$

on $0 < x, y < 1$ subject to homogeneous Dirichlet boundary conditions

$u(x, 0) = u(x, 1) = u(0, y) = u(1, y) = 0, \quad 0 < x, y < 1.$

For general coefficients $\{a_j\}$ the operator $L$ is not self-adjoint and its discretiza-
tion is not symmetric. As in § 2.7 we discretize with a five-point centered dif-
ference scheme with $n^2$ points and mesh width $h = 1/(n + 1)$. The unknowns
are

$u_{ij} \approx u(x_i, x_j)$

where $x_i = ih$ for $1 \le i \le n$. We compute $Lu$ in a matrix-free manner
as we did in § 2.7. We let $n = 31$ to create a system with 961 unknowns.
As before, we expect second-order accuracy and terminate the iteration when
$\|r_k\|_2 \le h^2 \|b\|_2 \approx 9.8 \times 10^{-4} \|b\|_2$.
As a preconditioner we use the fast Poisson solver fish2d described in § 2.7.
The motivation for this choice is similar to that for the conjugate gradient
computations and has been partially explained theoretically in [33], [34], [121],
and [139]. If we let Gu denote the action of the Poisson solver on u, the
preconditioned system is GLu = Gf .
For the computations reported in this section we took

$a_1(x, y) = 1, \quad a_2(x, y) = 20y, \quad \text{and} \quad a_3(x, y) = 1.$




$u_0 = 0$ was the initial iterate. As in § 2.7 we took the right-hand side so that
the exact solution was the discretization of

$10xy(1 - x)(1 - y)\exp(x^{4.5}).$
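Putting the pieces of this example together, a matrix-free preconditioned product $w = GLu$ might be sketched in MATLAB as follows. This is our illustration, not one of the book's codes: the function name pcgl and the orientation conventions (first array index is the $x$ direction, built with ndgrid) are ours, and fish2d is assumed to act as the Poisson solver $G$ on a vector of interior grid values; its actual input format should be taken from its comment lines.

function w = pcgl(u, a1, a2, a3, n, h)
% Sketch: matrix-free w = G*L*u for (3.33) on the n-by-n interior grid.
% Usage sketch (first array index = x):
%   n = 31; h = 1/(n + 1); [X, Y] = ndgrid(h*(1:n));
%   a1 = ones(n); a2 = 20*Y; a3 = ones(n);
%   atv = @(u) pcgl(u, a1, a2, a3, n, h);
U = zeros(n + 2);                  % zero padding imposes the Dirichlet data
U(2:n+1, 2:n+1) = reshape(u, n, n);
i = 2:n+1; j = 2:n+1;
lap = (4*U(i,j) - U(i-1,j) - U(i+1,j) - U(i,j-1) - U(i,j+1))/h^2;
ux  = (U(i+1,j) - U(i-1,j))/(2*h); % centered difference in x
uy  = (U(i,j+1) - U(i,j-1))/(2*h); % centered difference in y
Lu  = lap + a1.*ux + a2.*uy + a3.*U(i,j);
w   = fish2d(Lu(:));               % apply the Poisson solver G
end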

In Figs. 3.1 and 3.2 we plot iteration histories corresponding to precondi-
tioned or unpreconditioned GMRES. For the restarted iteration, we restarted
the algorithm after three iterations, GMRES(3) in the terminology of § 3.4.
Note that the plots are semilog plots of $\|r\|_2/\|b\|_2$ and that the right-hand
sides in the preconditioned and unpreconditioned cases are not the same. In
both figures the solid line is the history of the preconditioned iteration and the
dashed line that of the unpreconditioned. Note the importance of precondi-
tioning. The MATLAB flops command indicates that the unpreconditioned
iteration required 7.6 million floating point operations and converged in 56
iterations. The preconditioned iteration required 1.3 million floating point op-
erations and terminated after 8 iterations. Restarting after 3 iterations is more
costly in terms of the iteration count than not restarting. However, even in
the unpreconditioned case, the restarted iteration terminated successfully. In
this case the MATLAB flops command indicates that the unpreconditioned
iteration terminated after 223 iterations and 9.7 million floating point opera-
tions and the preconditioned iteration required 13 iterations and 2.6 million
floating point operations.
The execution times in our environment indicate that the preconditioned
iterations are even faster than the difference in operation counts suggests.
The reason for this is that the FFT routine is a MATLAB intrinsic and is not
interpreted MATLAB code. In other computing environments, preconditioners
which vectorize or parallelize particularly well might perform in the same way.

3.8. Examples for CGNR, Bi-CGSTAB, and TFQMR iteration
When a transpose is available, CGNR (or CGNE) is an alternative to GMRES
that does not require storage of the iteration history. For elliptic problems like
(3.33), the transpose (adjoint) of $L$ is given by

$L^* u = -u_{xx} - u_{yy} - (a_1 u)_x - (a_2 u)_y + a_3 u,$

as one can derive by integration by parts. To apply CGNR to the precondi-
tioned problem, we require the transpose of $GL$, which is given by

$(GL)^* u = L^* G^* u = L^* G u.$
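In the same spirit as the pcgl sketch above, a matrix-free $L^*$ can be coded by differencing the products $a_i u$. The sketch below is again our illustration (same names and conventions); with the coefficients padded by zero, the convection terms below are the exact transposes of their centered-difference counterparts in the forward operator.

function w = applyLstar(u, a1, a2, a3, n, h)
% Sketch: matrix-free w = L'*u for
%   L*u = -u_xx - u_yy - (a1 u)_x - (a2 u)_y + a3 u.
U  = zeros(n + 2); U(2:n+1, 2:n+1) = reshape(u, n, n);
A1 = zeros(n + 2); A1(2:n+1, 2:n+1) = a1;  % pad coefficients with zero so
A2 = zeros(n + 2); A2(2:n+1, 2:n+1) = a2;  % that (a_i u) vanishes on the boundary
i = 2:n+1; j = 2:n+1;
lap = (4*U(i,j) - U(i-1,j) - U(i+1,j) - U(i,j-1) - U(i,j+1))/h^2;
dax = (A1(i+1,j).*U(i+1,j) - A1(i-1,j).*U(i-1,j))/(2*h);  % (a1 u)_x
day = (A2(i,j+1).*U(i,j+1) - A2(i,j-1).*U(i,j-1))/(2*h);  % (a2 u)_y
W = lap - dax - day + a3.*U(i,j);
w = W(:);
end

The transpose of the preconditioned operator is then available as $(GL)^* v = L^* G v$, for example @(v) applyLstar(fish2d(v), a1, a2, a3, n, h).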

The cost of the application of the transpose of $L$ or $GL$ is the same as that
of the application of $L$ or $GL$ itself. Hence, in the case where the cost of the
matrix-vector product dominates the computation, a single CGNR iteration
costs roughly the same as two GMRES iterations.
However, CGNR has the effect of squaring the condition number; this effect
has serious consequences, as Fig. 3.3 shows. The dashed line corresponds to





[Fig. 3.1. GMRES for 2-D elliptic equation. Semilog plot of relative residual versus iterations (0 to 60).]

[Fig. 3.2. GMRES(3) for 2-D elliptic equation. Semilog plot of relative residual versus iterations (0 to 250).]

