Before discussing convergence of fixed-point iteration we make two definitions.

Definition 4.2.1. Let $\Omega \subset \mathbb{R}^N$ and let $G : \Omega \to \mathbb{R}^M$. $G$ is Lipschitz continuous on $\Omega$ with Lipschitz constant $\gamma$ if
\[
  \| G(x) - G(y) \| \le \gamma \| x - y \|
\]
for all $x, y \in \Omega$.

Definition 4.2.2. Let $\Omega \subset \mathbb{R}^N$. $K : \Omega \to \mathbb{R}^N$ is a contraction mapping on $\Omega$ if $K$ is Lipschitz continuous on $\Omega$ with Lipschitz constant $\gamma < 1$.

The standard result for fixed-point iteration is the Contraction Mapping Theorem [9]. Compare the proof to that of Lemma 1.2.1 and Corollary 1.2.1.

Theorem 4.2.1. Let $\Omega$ be a closed subset of $\mathbb{R}^N$ and let $K$ be a contraction mapping on $\Omega$ with Lipschitz constant $\gamma < 1$ such that $K(x) \in \Omega$ for all $x \in \Omega$. Then there is a unique fixed point of $K$, $x^* \in \Omega$, and the iteration defined by (4.3) converges q-linearly to $x^*$ with q-factor $\gamma$ for all initial iterates $x_0 \in \Omega$.

Proof. Let $x_0 \in \Omega$. Note that $\{ x_n \} \subset \Omega$ because $x_0 \in \Omega$ and $K(x) \in \Omega$ whenever $x \in \Omega$. The sequence $\{ x_n \}$ remains bounded since for all $i \ge 1$
\[
  \| x_{i+1} - x_i \| = \| K(x_i) - K(x_{i-1}) \| \le \gamma \| x_i - x_{i-1} \| \le \dots \le \gamma^i \| x_1 - x_0 \|,
\]
and therefore
\[
  \| x_n - x_0 \| = \Bigl\| \sum_{i=0}^{n-1} (x_{i+1} - x_i) \Bigr\|
  \le \sum_{i=0}^{n-1} \| x_{i+1} - x_i \|
  \le \| x_1 - x_0 \| \sum_{i=0}^{n-1} \gamma^i
  \le \| x_1 - x_0 \| / (1 - \gamma).
\]
Now, for all $n, k \ge 0$,
\[
\begin{aligned}
  \| x_{n+k} - x_n \| &= \| K(x_{n+k-1}) - K(x_{n-1}) \| \\
  &\le \gamma \| x_{n+k-1} - x_{n-1} \| \\
  &\le \gamma \| K(x_{n+k-2}) - K(x_{n-2}) \| \\
  &\le \gamma^2 \| x_{n+k-2} - x_{n-2} \| \le \dots \le \gamma^n \| x_k - x_0 \| \\
  &\le \gamma^n \| x_1 - x_0 \| / (1 - \gamma).
\end{aligned}
\]

Hence
\[
  \lim_{n, k \to \infty} \| x_{n+k} - x_n \| = 0
\]
and therefore the sequence $\{ x_n \}$ is a Cauchy sequence and has a limit $x^*$ [162]. If $K$ has two fixed points $x^*$ and $y^*$ in $\Omega$, then
\[
  \| x^* - y^* \| = \| K(x^*) - K(y^*) \| \le \gamma \| x^* - y^* \|.
\]
Since $\gamma < 1$ this can only hold if $\| x^* - y^* \| = 0$, i.e., $x^* = y^*$. Hence the fixed point is unique.

Finally we note that
\[
  \| x_{n+1} - x^* \| = \| K(x_n) - K(x^*) \| \le \gamma \| x_n - x^* \|,
\]
which completes the proof.
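
The q-linear estimate is easy to observe numerically. The following MATLAB fragment is a minimal sketch, not one of the codes from the collection; the map $K(x) = \cos(x)$ is a hypothetical example, a contraction on $\Omega = [0.5, 1]$ with $\gamma = \sin(1) \approx 0.84$ that maps $\Omega$ into itself.

% Fixed-point iteration x_{n+1} = K(x_n) for the contraction K(x) = cos(x)
% on Omega = [0.5, 1], where gamma = sin(1) ~ 0.84.  A sketch for
% illustration, not a code from the collection.
K = @(x) cos(x);
gamma = sin(1);                  % Lipschitz constant of K on Omega
x = 1.0;                         % initial iterate x_0 in Omega
xstar = 0.7390851332151607;      % fixed point of cos, for error reporting
for n = 1:10
    xnew = K(x);
    fprintf('n = %2d  error = %.3e  ratio = %.3f\n', ...
            n, abs(xnew - xstar), abs(xnew - xstar)/abs(x - xstar));
    x = xnew;
end
% Each printed ratio is at most gamma, as the q-linear estimate predicts.
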
One simple application of this theorem is to numerical integration of ordinary differential equations by implicit methods. For example, consider the initial value problem
\[
  y' = f(y), \quad y(t_0) = y_0,
\]
and its solution by the backward Euler method with time step $h$,
\[
  y^{n+1} - y^n = h f(y^{n+1}). \tag{4.4}
\]
In (4.4) $y^k$ denotes the approximation of $y(t_0 + kh)$. Advancing in time from $y^n$ to $y^{n+1}$ requires solution of the fixed-point problem
\[
  y = K(y) = y^n + h f(y).
\]
For simplicity we assume that $f$ is bounded and Lipschitz continuous, say
\[
  |f(y)| \le M \quad \text{and} \quad |f(x) - f(y)| \le M |x - y|
\]
for all $x, y$.

If $hM < 1$, we may apply the Contraction Mapping Theorem with $\Omega = \mathbb{R}$, since for all $x, y$
\[
  |K(x) - K(y)| = h |f(x) - f(y)| \le hM |x - y|.
\]

From this we conclude that for $h$ sufficiently small we can integrate forward in time and that the time step $h$ can be chosen independently of $y$. This time step may well be too small to be practical, and the methods based on Newton's method which we present in the subsequent chapters are used much more often [83]. This type of analysis was used by Picard to prove existence of solutions of initial value problems [152]. Nonlinear variations of the classical stationary iterative methods such as Gauss–Seidel have also been used; see [145] for a more complete discussion of these algorithms.
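
As a concrete illustration, the following MATLAB fragment advances (4.4) by solving the fixed-point problem at each time step. It is a sketch under the assumptions above, not code from the collection; the right-hand side $f(y) = -\sin(y)$ is a hypothetical example with $|f(y)| \le M = 1$ and Lipschitz constant $M = 1$, so $K$ is a contraction whenever $h < 1$.

% Backward Euler by fixed-point iteration: each step solves
% y = K(y) = y^n + h*f(y).  Here f(y) = -sin(y) is a hypothetical
% example with |f| <= M = 1 and Lipschitz constant M = 1, so the
% contraction constant is gamma = h*M = 0.1.
f = @(y) -sin(y);
h = 0.1;                         % time step; h*M = 0.1 < 1
y = 1.0;                         % initial value y(t_0) = y_0
for n = 1:50                     % take 50 backward Euler steps
    yn = y;                      % y^n, frozen during the inner iteration
    for k = 1:100                % fixed-point iteration for y^{n+1}
        ynew = yn + h*f(y);      % y <- K(y)
        if abs(ynew - y) < 1e-12
            y = ynew;
            break
        end
        y = ynew;
    end
end
fprintf('approximate solution at t_0 + 5: %.6f\n', y);

Since $\gamma = hM = 0.1$ here, each inner iteration reduces the error by about a factor of ten, so roughly a dozen inner iterations suffice for full double-precision accuracy.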

4.3. The standard assumptions

We will make the standard assumptions on $F$.

Assumption 4.3.1.

1. Equation (4.1) has a solution $x^*$.

2. $F' : \Omega \to \mathbb{R}^{N \times N}$ is Lipschitz continuous with Lipschitz constant $\gamma$.

3. $F'(x^*)$ is nonsingular.

These assumptions can be weakened [108] without sacrificing convergence of the methods we consider here. However, the classical result on quadratic convergence of Newton's method requires them and we make them throughout. Exercise 5.7.1 illustrates one way to weaken the standard assumptions.

Throughout this part, we will always denote a root of $F$ by $x^*$. We let $B(r)$ denote the ball of radius $r$ about $x^*$,
\[
  B(r) = \{ x \mid \| e \| < r \},
\]
where
\[
  e = x - x^*.
\]
The notation introduced above will be used consistently. If $x_n$ is the $n$th iterate of a sequence, $e_n = x_n - x^*$ is the error in that iterate.

Lemma 4.3.1 is an important consequence of the standard assumptions.

Lemma 4.3.1. Assume that the standard assumptions hold. Then there is $\delta > 0$ so that for all $x \in B(\delta)$
\[
  \| F'(x) \| \le 2 \| F'(x^*) \|, \tag{4.5}
\]
\[
  \| F'(x)^{-1} \| \le 2 \| F'(x^*)^{-1} \|, \tag{4.6}
\]
and
\[
  \| F'(x^*)^{-1} \|^{-1} \| e \| / 2 \le \| F(x) \| \le 2 \| F'(x^*) \| \| e \|. \tag{4.7}
\]

Proof. Let $\delta$ be small enough so that $B(\delta) \subset \Omega$. For all $x \in B(\delta)$ we have
\[
  \| F'(x) \| \le \| F'(x^*) \| + \gamma \| e \|.
\]
Hence (4.5) holds if $\gamma \delta < \| F'(x^*) \|$.

The next result (4.6) is a direct consequence of the Banach Lemma if
\[
  \| I - F'(x^*)^{-1} F'(x) \| < 1/2.
\]
This will follow from
\[
  \delta < \frac{\| F'(x^*)^{-1} \|^{-1}}{2 \gamma}
\]
since then
\[
  \| I - F'(x^*)^{-1} F'(x) \| = \| F'(x^*)^{-1} ( F'(x^*) - F'(x) ) \|
  \le \gamma \| F'(x^*)^{-1} \| \| e \| \le \gamma \delta \| F'(x^*)^{-1} \| < 1/2. \tag{4.8}
\]

To prove the final inequality (4.7), we note that if $x \in B(\delta)$ then $x^* + t e \in B(\delta)$ for all $0 \le t \le 1$. We use (4.5) and Theorem 4.0.1 to obtain
\[
  \| F(x) \| \le \int_0^1 \| F'(x^* + t e) \| \| e \| \, dt \le 2 \| F'(x^*) \| \| e \|,
\]
which is the right inequality in (4.7).

To prove the left inequality in (4.7) note that
\[
  F'(x^*)^{-1} F(x) = \int_0^1 F'(x^*)^{-1} F'(x^* + t e) e \, dt
  = e - \int_0^1 \bigl( I - F'(x^*)^{-1} F'(x^* + t e) \bigr) e \, dt,
\]
and hence, by (4.8),
\[
  \| F'(x^*)^{-1} F(x) \| \ge \| e \| \Bigl( 1 - \int_0^1 \| I - F'(x^*)^{-1} F'(x^* + t e) \| \, dt \Bigr) \ge \| e \| / 2.
\]
Therefore
\[
  \| e \| / 2 \le \| F'(x^*)^{-1} F(x) \| \le \| F'(x^*)^{-1} \| \| F(x) \|,
\]
which completes the proof.
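
Inequality (4.7) is the reason the norm of the nonlinear residual can serve as an error indicator, and it is easy to observe numerically. The following MATLAB fragment is a sketch using the hypothetical scalar example $F(x) = x^3 - 1$, for which $x^* = 1$ and $F'(x^*) = 3$.

% A numerical look at (4.7): near the root, |F(x)| is bounded above and
% below by multiples of the error |e|.  F(x) = x^3 - 1 is a hypothetical
% example with x* = 1 and F'(x*) = 3.
F = @(x) x^3 - 1;
xstar = 1;
for e = [1e-1, 1e-2, 1e-3]       % errors of decreasing size
    x = xstar + e;
    fprintf('|e| = %.0e   |F(x)|/|e| = %.3f\n', e, abs(F(x))/e);
end
% The printed ratios (about 3.3, 3.0, 3.0) lie between
% |F'(x*)^{-1}|^{-1}/2 = 1.5 and 2|F'(x*)| = 6, as (4.7) guarantees.
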
Chapter 5

Newton's Method

5.1. Local convergence of Newton's method

We will describe iterative methods for nonlinear equations in terms of the transition from a current iterate $x_c$ to a new iterate $x_+$. In this language, Newton's method is
\[
  x_+ = x_c - F'(x_c)^{-1} F(x_c). \tag{5.1}
\]
We may also view $x_+$ as the root of the two-term Taylor expansion or linear model of $F$ about $x_c$,
\[
  M_c(x) = F(x_c) + F'(x_c)(x - x_c).
\]
In the context of single systems this method appeared in the 17th century [140], [156], [13], [137].

The convergence result on Newton's method follows from Lemma 4.3.1.

Theorem 5.1.1. Let the standard assumptions hold. Then there are $K > 0$ and $\delta > 0$ such that if $x_c \in B(\delta)$ the Newton iterate from $x_c$ given by (5.1) satisfies
\[
  \| e_+ \| \le K \| e_c \|^2. \tag{5.2}
\]

Proof. Let $\delta$ be small enough so that the conclusions of Lemma 4.3.1 hold. By Theorem 4.0.1,
\[
  e_+ = e_c - F'(x_c)^{-1} F(x_c) = F'(x_c)^{-1} \int_0^1 \bigl( F'(x_c) - F'(x^* + t e_c) \bigr) e_c \, dt.
\]
By Lemma 4.3.1 and the Lipschitz continuity of $F'$,
\[
  \| e_+ \| \le \bigl( 2 \| F'(x^*)^{-1} \| \bigr) \gamma \| e_c \|^2 / 2.
\]
This completes the proof of (5.2) with $K = \gamma \| F'(x^*)^{-1} \|$.

The proof of convergence of the complete Newton iteration is completed by reducing $\delta$, if needed, so that $K \delta < 1$.

Theorem 5.1.2. Let the standard assumptions hold. Then there is $\delta$ such that if $x_0 \in B(\delta)$ the Newton iteration
\[
  x_{n+1} = x_n - F'(x_n)^{-1} F(x_n)
\]
converges q-quadratically to $x^*$.

Proof. Let $\delta$ be small enough so that the conclusions of Theorem 5.1.1 hold. Reduce $\delta$ if needed so that $K \delta = \eta < 1$. Then if $n \ge 0$ and $x_n \in B(\delta)$, Theorem 5.1.1 implies that
\[
  \| e_{n+1} \| \le K \| e_n \|^2 \le \eta \| e_n \| < \| e_n \| \tag{5.3}
\]
and hence $x_{n+1} \in B(\eta \delta) \subset B(\delta)$. Therefore if $x_n \in B(\delta)$ we may continue the iteration. Since $x_0 \in B(\delta)$ by assumption, the entire sequence $\{ x_n \} \subset B(\delta)$. (5.3) then implies that $x_n \to x^*$ q-quadratically.
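
Q-quadratic convergence means that the number of correct digits roughly doubles at each iteration once the iterate is near $x^*$. The following MATLAB fragment shows this on the hypothetical scalar example $F(x) = x^2 - 2$, whose root is $x^* = \sqrt{2}$; it is a sketch for illustration, not one of the codes from the collection.

% Newton's method on F(x) = x^2 - 2 = 0, a hypothetical scalar example.
% The error is roughly squared at each step, as Theorem 5.1.2 predicts.
F  = @(x) x^2 - 2;
Fp = @(x) 2*x;                   % F'(x), computed analytically here
x  = 2.0;                        % initial iterate x_0
for n = 1:6
    x = x - F(x)/Fp(x);          % x_{n+1} = x_n - F'(x_n)^{-1} F(x_n)
    fprintf('n = %d  error = %.3e\n', n, abs(x - sqrt(2)));
end
% Typical output: errors near 8.6e-2, 2.5e-3, 2.1e-6, 1.6e-12, and then
% rounding error; each error is about 0.35 times the square of the last.
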
The assumption that the initial iterate be "sufficiently near" the solution ($x_0 \in B(\delta)$) may seem artificial at first look. There are, however, many situations in which the initial iterate is very near the root. Two examples are implicit integration of ordinary differential equations and differential algebraic equations, [105], [83], [16], where the initial iterate is derived from the solution at the previous time step, and solution of discretizations of partial differential equations, where the initial iterate is an interpolation of a solution from a coarser computational mesh [99], [126]. Moreover, when the initial iterate is far from the root and methods such as those discussed in Chapter 8 are needed, local convergence results like those in this and the following two chapters describe the terminal phase of the iteration.

5.2. Termination of the iteration

We can base our termination decision on Lemma 4.3.1. If $x \in B(\delta)$, where $\delta$ is small enough so that the conclusions of Lemma 4.3.1 hold, then if $F'(x^*)$ is well conditioned, we may terminate the iteration when the relative nonlinear residual $\| F(x) \| / \| F(x_0) \|$ is small. We have, as a direct consequence of applying Lemma 4.3.1 twice, a nonlinear form of Lemma 1.1.1.

Lemma 5.2.1. Assume that the standard assumptions hold. Let $\delta > 0$ be small enough so that the conclusions of Lemma 4.3.1 hold for $x \in B(\delta)$. Then for all $x \in B(\delta)$
\[
  \frac{\| e \|}{4 \| e_0 \| \, \kappa(F'(x^*))} \le \frac{\| F(x) \|}{\| F(x_0) \|} \le \frac{4 \, \kappa(F'(x^*)) \, \| e \|}{\| e_0 \|},
\]
where $\kappa(F'(x^*)) = \| F'(x^*) \| \, \| F'(x^*)^{-1} \|$ is the condition number of $F'(x^*)$ relative to the norm $\| \cdot \|$.

From Lemma 5.2.1 we conclude that if $F'(x^*)$ is well conditioned, the size of the relative nonlinear residual is a good indicator of the size of the error. However, if there is error in the evaluation of $F$ or the initial iterate is near a root, a termination decision based on the relative residual may be made too late in the iteration or, as we will see in the next section, the iteration may not terminate at all. We raised this issue in the context of linear equations in Chapter 1 when we compared (1.2) and (1.4). In the nonlinear case, there is no "right-hand side" to use as a scaling factor and we must balance the relative and absolute size of the nonlinear residuals in some other way.
In all of our MATLAB codes and algorithms for nonlinear equations our termination criterion is to stop the iteration if
\[
  \| F(x) \| \le \tau_r \| F(x_0) \| + \tau_a, \tag{5.4}
\]
where the relative error tolerance $\tau_r$ and absolute error tolerance $\tau_a$ are input to the algorithm. Combinations of relative and absolute error tolerances are commonly used in numerical methods for ordinary differential equations or differential algebraic equations [16], [21], [23], [151].

Another way to decide whether to terminate is to look at the Newton step
\[
  s = -F'(x_c)^{-1} F(x_c) = x_+ - x_c,
\]
and terminate the iteration when $\| s \|$ is sufficiently small. This criterion is based on Theorem 5.1.1, which implies that
\[
  e_c = -s + O(\| e_c \|^2).
\]
Hence, near the solution $s$ and $e_c$ are essentially the same size.

For methods other than Newton's method, the relation between the step and the error is less clear. If the iteration is q-linearly convergent, say, then
\[
  \| e_+ \| \le \sigma \| e_c \|
\]
implies (since $s = e_+ - e_c$) that
\[
  (1 - \sigma) \| e_c \| \le \| s \| \le (1 + \sigma) \| e_c \|.
\]
Hence the step is a reliable indicator of the error provided $\sigma$ is not too near 1. The differential equations codes discussed in [16], [21], [23], and [151] make use of this in the special case of the chord method.

However, in order to estimate $\| e_c \|$ this way, one must do most of the work needed to compute $x_+$, whereas by terminating on small relative residuals or on a condition like (5.4) the decision to stop the iteration can be made before computation and factorization of $F'(x_c)$. If computation and factorization of the Jacobian are inexpensive, then termination on small steps becomes more attractive.
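
The following MATLAB fragment sketches how the two tests fit into a Newton loop: the residual condition (5.4) decides whether to take another step, and a small step gives a second reason to stop. The system $F$, its Jacobian, and the tolerances are hypothetical stand-ins for illustration, not the nsol code from the collection.

% Newton iteration terminated by the residual test (5.4) and, as a
% backup, by a small-step test.  F and Fp are hypothetical stand-ins.
F  = @(x) [x(1) + x(2) - 3; x(1)^2 + x(2)^2 - 9];  % example system
Fp = @(x) [1, 1; 2*x(1), 2*x(2)];                  % its Jacobian F'(x)
tau_r = 1e-6; tau_a = 1e-9;      % relative and absolute tolerances
x  = [1; 5];                     % initial iterate x_0
r0 = norm(F(x));                 % ||F(x_0)||, fixed at the start
while norm(F(x)) > tau_r*r0 + tau_a   % stop when (5.4) holds
    s = -(Fp(x) \ F(x));         % Newton step s = -F'(x)^{-1} F(x)
    x = x + s;
    if norm(s) < tau_a           % step-based test: ||s|| ~ ||e_c||
        break
    end
end

Note that the residual test is evaluated before the Jacobian is computed and factored, which is the advantage discussed above.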

5.3. Implementation of Newton's method

In this section we assume that direct methods will be used to solve the linear equation for the Newton step. Our examples and discussions of operation counts make the implicit assumption that the Jacobian is dense. However, the issues for sparse Jacobians when direct methods are used to solve linear systems are very similar. In Chapter 6 we consider algorithms in which iterative methods are used to solve the equation for the Newton step.

In order to compute the Newton iterate $x_+$ from a current point $x_c$ one must first evaluate $F(x_c)$ and decide whether to terminate the iteration. If one decides to continue, the Jacobian $F'(x_c)$ must be computed and factored.
Then the step is computed as the solution of $F'(x_c) s = -F(x_c)$ and the iterate is updated $x_+ = x_c + s$. Of these steps, the evaluation and factorization of $F'$ are the most costly. Factorization of $F'$ in the dense case costs $O(N^3)$ floating-point operations. Evaluation of $F'$ by finite differences should be expected to cost $N$ times the cost of an evaluation of $F$ because each column in $F'$ requires an evaluation of $F$ to form the difference approximation. Hence the cost of a Newton step may be roughly estimated as $N + 1$ evaluations of $F$ and $O(N^3)$ floating-point operations. In many cases $F'$ can be computed more efficiently, accurately, and directly than with differences and the analysis above for the cost of a Newton iterate is very pessimistic. See Exercise 5.7.21 for an example. For general nonlinearities or general purpose codes such as the MATLAB code nsol from the collection, there may be little alternative to difference Jacobians.
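
The following MATLAB function is a sketch of a single Newton step organized as just described, with a forward-difference Jacobian; it makes the $N + 1$ evaluations of $F$ and the $O(N^3)$ dense solve explicit. It is an illustration of the cost structure only, not the nsol code itself.

function x = newton_step_fd(F, x)
% NEWTON_STEP_FD  One Newton step with a forward-difference Jacobian.
% A sketch of the cost structure described above: N+1 evaluations of F
% and one dense O(N^3) factorization.  Not the collection's nsol code.
% (Save as newton_step_fd.m.)
N  = length(x);
fx = F(x);                       % one evaluation of F
J  = zeros(N, N);
h  = sqrt(eps);                  % forward-difference increment
for j = 1:N                      % N further evaluations of F
    xh = x;
    xh(j) = xh(j) + h;
    J(:, j) = (F(xh) - fx)/h;    % j-th column of the difference Jacobian
end
x = x - J\fx;                    % solve F'(x_c) s = -F(x_c); x_+ = x_c + s
end
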
In the future, automatic differentiation (see [94] for a collection of articles on