exists a closed convex neighbourhood $\tilde U$ of $\tilde x$ with

$$\|Df(x)\| \le L < 1 \quad \text{for } x \in \tilde U$$

and, for example, $L = \|Df(\tilde x)\| + \tfrac{1}{2}\bigl(1 - \|Df(\tilde x)\|\bigr)$, guaranteeing the contractivity of $f$ in $\tilde U$.

The unique existence of a fixed point and the convergence of (8.7) are guaranteed if the set $U$ on which $f$ is a contraction is mapped into itself:

Theorem 8.4 (Banach's fixed-point theorem) Let $U \subset \mathbb{R}^m$, $U \ne \emptyset$, and $U$ be closed. Let $f: U \to \mathbb{R}^m$ be contractive with Lipschitz constant $L < 1$ and $f[U] \subset U$. Then we have:

(1) There exists one and only one fixed point $x \in U$ of $f$.

(2) For arbitrary $x^{(0)} \in U$ the fixed-point iteration (8.7) converges to $x$, and we have

$$\|x^{(k)} - x\| \le \frac{L}{1-L}\,\|x^{(k)} - x^{(k-1)}\| \quad \text{(a posteriori error estimate)}$$

$$\le \frac{L^k}{1-L}\,\|x^{(1)} - x^{(0)}\| \quad \text{(a priori error estimate).}$$

Proof: The sequence $x^{(k+1)} := f(x^{(k)})$ is well-defined because of $f[U] \subset U$. We prove that $(x^{(k)})_k$ is a Cauchy sequence (see Appendix A.4):

$$\|x^{(k+1)} - x^{(k)}\| = \|f(x^{(k)}) - f(x^{(k-1)})\| \le L\,\|x^{(k)} - x^{(k-1)}\| \le L^2\,\|x^{(k-1)} - x^{(k-2)}\| \le \cdots \le L^k\,\|x^{(1)} - x^{(0)}\|, \quad (8.8)$$

so that for any $k, l \in \mathbb{N}$,

$$\begin{aligned}
\|x^{(k+l)} - x^{(k)}\| &\le \|x^{(k+l)} - x^{(k+l-1)}\| + \|x^{(k+l-1)} - x^{(k+l-2)}\| + \cdots + \|x^{(k+1)} - x^{(k)}\| \\
&\le \bigl(L^{k+l-1} + L^{k+l-2} + \cdots + L^k\bigr)\,\|x^{(1)} - x^{(0)}\| \\
&= L^k \bigl(1 + L + \cdots + L^{l-1}\bigr)\,\|x^{(1)} - x^{(0)}\| \\
&\le L^k \sum_{l=0}^{\infty} L^l\,\|x^{(1)} - x^{(0)}\| = \frac{L^k}{1-L}\,\|x^{(1)} - x^{(0)}\|\,.
\end{aligned}$$

346 8. Iterative Methods for Nonlinear Equations

Thus we have $\|x^{(k+l)} - x^{(k)}\| \to 0$ for $k \to \infty$; i.e., $(x^{(k)})_k$ is a Cauchy sequence and thus converges to some $x \in \mathbb{R}^m$ because of the completeness of $\mathbb{R}^m$. Due to the closedness of $U$ we conclude that $x \in U$. Since we have

$$x^{(k+1)} \to x\,, \quad f(x^{(k)}) \to f(x) \quad \text{for } k \to \infty\,,$$

$x$ is also a fixed point of $f$.

The fixed point is unique, because for fixed points $x, \bar x$,

$$\|x - \bar x\| = \|f(x) - f(\bar x)\| \le L\,\|x - \bar x\|\,,$$

which immediately implies $x = \bar x$ because of $L < 1$. Moreover, we have

$$\|x^{(k)} - x\| = \|f(x^{(k-1)}) - f(x)\| \le L\,\|x^{(k-1)} - x\| \le L\bigl(\|x^{(k-1)} - x^{(k)}\| + \|x^{(k)} - x\|\bigr)\,,$$

and thus from (8.8),

$$\|x^{(k)} - x\| \le \frac{L}{1-L}\,\|x^{(k)} - x^{(k-1)}\| \le \frac{L}{1-L}\,L^{k-1}\,\|x^{(1)} - x^{(0)}\|\,. \qquad \Box$$
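Both error estimates can be observed numerically. The following sketch (in Python; the contraction $g$, its Lipschitz constant, and the number of steps are illustrative choices, not part of the text) runs the fixed-point iteration (8.7) and checks that the a posteriori bound is sharper than the a priori bound:

```python
import math

# Illustrative contraction (our choice): g(x) = cos(x)/2 maps U = [0, 1]
# into itself and has Lipschitz constant L = 1/2, since |g'(x)| <= 1/2.
def g(x):
    return math.cos(x) / 2.0

L = 0.5
xs = [0.0]                          # starting value x^(0) in U
for _ in range(25):                 # fixed-point iteration (8.7)
    xs.append(g(xs[-1]))

first_step = abs(xs[1] - xs[0])     # ||x^(1) - x^(0)||
for k in range(1, len(xs)):
    err = abs(xs[k] - xs[-1])       # distance to the (numerical) fixed point
    a_posteriori = L / (1 - L) * abs(xs[k] - xs[k - 1])
    a_priori = L**k / (1 - L) * first_step
    # the true error is bounded by the a posteriori estimate, which in
    # turn never exceeds the a priori estimate
    assert err <= a_posteriori + 1e-12 <= a_priori + 1e-10
```

In practice the a posteriori estimate is the more useful one: it is computable from two successive iterates and is typically much sharper than the a priori bound.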

Remark 8.5 The theorem can be generalized: Since we used only the completeness of $\mathbb{R}^m$, the proposition holds even in a Banach space $(X, \|\cdot\|)$, where $U \subset X$ is a closed subset.

This enables us to define iterative schemes directly in the function space for nonlinear boundary value problems, which means that the resulting (linear) problems in the iteration step are to be discretized. So instead of proceeding in the order discretization-iteration, we can apply the sequence iteration-discretization. This leads in general to different schemes, even if the approaches have been the same. We will always refer to the first strategy.

According to Lemma 8.3 we can often construct a closed $U$ such that $f$ is contractive on $U$. It remains to verify that $f[U] \subset U$. For this, the following lemma is helpful:

Lemma 8.6 Let $U \subset \mathbb{R}^m$, $f: U \to \mathbb{R}^m$. If there exist a $y \in U$ and an $r > 0$ with

$$\bar B_r(y) \subset U\,,$$

such that $f$ is contractive on $\bar B_r(y)$ with Lipschitz constant $L < 1$ and

$$\|y - f(y)\| \le r(1 - L)\,,$$

then $f$ has one and only one fixed point in $\bar B_r(y)$, and (8.7) converges.

Proof: Exercise 8.2. $\qquad \Box$


In the setting of Theorem 8.4 the fixed-point iteration is thus globally convergent in $U$. In the setting of Lemma 8.6 it is locally convergent in $U$ (globally in $\bar B_r(y)$). We see that in the situation of Theorem 8.4 the sequence $(x^{(k)})$ has, because of

$$\|x^{(k+1)} - x\| = \|f(x^{(k)}) - f(x)\| \le L\,\|x^{(k)} - x\|\,,$$

a linear order of convergence (and in general not better).

A sufficient condition for local convergence of the corresponding order is given by the following theorem:

Theorem 8.7 Let $U \subset \mathbb{R}^m$ be open, $\Phi: U \to U$ continuous, and the sequence $(x^{(k)})$ defined by $x^{(k+1)} := \Phi(x^{(k)})$ for a given $x^{(0)} \in U$. If there exist some $\bar x \in U$, an open $V \subset U$ with $\bar x \in V$, and constants $C, p$ with $p \ge 1$, $C \ge 0$, and $C < 1$ for $p = 1$, such that for all $x \in V$,

$$\|\Phi(x) - \bar x\| \le C\,\|x - \bar x\|^p$$

holds, then the iteration defined by $\Phi$ converges locally to $\bar x$ of order at least $p$, and $\bar x$ is a fixed point of $\Phi$.

Proof: Choose $W = B_r(\bar x) \subset V$, with $r > 0$ sufficiently small, such that $\bar W \subset V$ and

$$C r^{p-1} =: L < 1\,.$$

If $x^{(k)} \in W$, then we conclude because of

$$\|x^{(k+1)} - \bar x\| = \|\Phi(x^{(k)}) - \bar x\| \le C\,\|x^{(k)} - \bar x\|^p < C r^p < r$$

that $x^{(k+1)} \in W$, too. This means that for $x^{(0)} \in W$ we have that $x^{(k)} \in W$ for all $k \in \mathbb{N}$. Furthermore, we have

$$\|x^{(k+1)} - \bar x\| \le C\,\|x^{(k)} - \bar x\|^p < C r^{p-1}\,\|x^{(k)} - \bar x\| = L\,\|x^{(k)} - \bar x\|\,,$$

i.e.,

$$x^{(k)} \to \bar x \quad \text{for } k \to \infty\,,$$

and consequently,

$$\bar x = \lim_{k \to \infty} x^{(k+1)} = \lim_{k \to \infty} \Phi(x^{(k)}) = \Phi(\bar x)\,. \qquad \Box$$

The special case of a scalar equation shows that we can expect at most linear convergence for $\Phi = f$:

Corollary 8.8 Let $U \subset \mathbb{R}$ be an open subset, $\Phi$ $p$-times continuously differentiable on $U$, and $\bar x \in U$ a fixed point of $\Phi$.

If $\Phi'(\bar x) \ne 0$, $|\Phi'(\bar x)| < 1$ for $p = 1$, and $\Phi'(\bar x) = \cdots = \Phi^{(p-1)}(\bar x) = 0$, $\Phi^{(p)}(\bar x) \ne 0$ for $p > 1$, then the iteration defined by $\Phi$ is locally convergent to $\bar x$ with order of convergence $p$, but not better.


Proof: Taylor's expansion of $\Phi$ at $\bar x$ gives, for $x \in U$,

$$\Phi(x) = \Phi(\bar x) + \frac{\Phi^{(p)}(\xi)}{p!}\,(x - \bar x)^p \quad \text{with } \xi \in (x, \bar x)\,,$$

and in the case $p = 1$ we have $|\Phi'(\xi)| < 1$ for sufficiently small $|x - \bar x|$. Thus, there exists a neighbourhood $V$ of $\bar x$ such that $|\Phi(x) - \bar x| \le C\,|x - \bar x|^p$ for all $x \in V$ and $C < 1$ for $p = 1$. Theorem 8.7 implies order of convergence $p$. The example $\Phi(x) = L x^p$ with $L < 1$ for $p = 1$ with the fixed point $\bar x = 0$ shows that no improvement is possible. $\qquad \Box$
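The borderline example $\Phi(x) = L x^p$ from the proof can be checked numerically. In the following sketch (Python; the values of $L$, $p$, and the starting point are illustrative choices) the errors satisfy $e_{k+1} = L\,e_k^p$ exactly, so the quotient $e_{k+1}/e_k^p$ is constant while $e_{k+1}/e_k \to 0$ for $p > 1$; i.e., the order is exactly $p$ and not better:

```python
# Sketch for Phi(x) = L * x**p with fixed point 0 (illustrative parameters).
def error_quotients(L, p, x0, steps=6):
    x, quotients = x0, []
    for _ in range(steps):
        x_new = L * x**p          # one iteration step; e_{k+1} = L * e_k**p
        quotients.append(x_new / x**p)
        x = x_new
    return quotients

# p = 2: the order-2 quotient e_{k+1}/e_k^2 is constant (= L), so the
# convergence is quadratic and no better.
q = error_quotients(L=0.7, p=2, x0=0.5)
assert all(abs(quotient - 0.7) < 1e-12 for quotient in q)
```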

Exercises

8.1 Prove Lemma 8.3 with the help of the mean value theorem.

8.2 Prove Lemma 8.6.

8.2 Newton's Method and Its Variants

8.2.1 The Standard Form of Newton's Method

In the following we want to study the formulation stated in (8.2), i.e., the problem of finding the solutions of

$$f(x) = 0\,.$$

The simplest method of Chapter 5, the Richardson iteration (cf. (5.28)), suggests the direct application of the fixed-point iteration for, e.g., $\Phi(x) := -f(x) + x$. This approach succeeds only if, in the case of a differentiable $f$, the Jacobian $I - Df(x)$ is small in the sense of Lemma 8.3 close to the solution. Here we denote by $Df(x) = (\partial_j f_i(x))_{ij}$ the Jacobi or functional matrix of $f$. A relaxation method similar to (5.30) leads to the damped variants, which will be treated later.

The method in its corrector formulation, analogously to (5.10) with $\delta^{(k)} := x^{(k+1)} - x^{(k)}$, is

$$\delta^{(k)} = -f(x^{(k)})\,, \quad (8.9)$$

or in its relaxation formulation with relaxation parameter $\omega > 0$,

$$\delta^{(k)} = -\omega f(x^{(k)})\,.$$
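A minimal sketch of the relaxed iteration (Python; the scalar function $f$, the parameter $\omega$, and the starting value are our own illustrative choices, not from the text): damping with a suitable $\omega$ makes $\Phi(x) = x - \omega f(x)$ contractive near the solution.

```python
# Illustrative choice: f(x) = x**3 - 2 with the zero 2**(1/3) ~ 1.26.
# Phi(x) = x - omega*f(x) has Phi'(x) = 1 - 3*omega*x**2; for omega = 0.2
# and x near the zero, |Phi'(x)| ~ |1 - 0.2 * 4.76| ~ 0.05 < 1.
f = lambda x: x**3 - 2.0
omega = 0.2

x = 1.0
for _ in range(100):
    x += -omega * f(x)        # delta^(k) = -omega * f(x^(k))

assert abs(f(x)) < 1e-12      # the iterate has converged to the zero
```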

Now we want to introduce another approach to define $\Phi$:

Let $x^{(0)}$ be an approximation of a zero. An improved approximation is probably given by the following:


• Replace $f$ by a simple function $g$ that approximates $f$ near $x^{(0)}$ and whose zero is to be determined.

• Find $x^{(1)}$ as the solution of $g(x) = 0$.

Newton's method needs the differentiability of $f$, and one chooses the approximating affine-linear function given by $Df(x^{(0)})$, i.e.,

$$g(x) = f(x^{(0)}) + Df(x^{(0)})\,(x - x^{(0)})\,.$$

Under the assumption that $Df(x^{(0)})$ is nonsingular, the new iterate $x^{(1)}$ is determined by solving the system of linear equations

$$Df(x^{(0)})\,(x^{(1)} - x^{(0)}) = -f(x^{(0)})\,, \quad (8.10)$$

or formally by

$$x^{(1)} := x^{(0)} - Df(x^{(0)})^{-1} f(x^{(0)})\,.$$

This suggests the following definition:

$$\Phi(f)(x) = x - Df(x)^{-1} f(x)\,. \quad (8.11)$$

Here $\Phi$ is well-defined only if $Df(x)$ is nonsingular. Then $x \in \mathbb{R}^m$ is a zero of $f$ if and only if $x$ is a fixed point of $\Phi$. When executing the iteration, we do not calculate $Df(x^{(k)})^{-1}$ but only solve a system of equations similar to (8.10).

Thus, the $k$th iteration of Newton's method reads as follows: Solve

$$Df(x^{(k)})\,\delta^{(k)} = -f(x^{(k)}) \quad (8.12)$$

and set

$$x^{(k+1)} := x^{(k)} + \delta^{(k)}\,. \quad (8.13)$$

Equation (8.13) has the same form as (5.10) with $W = Df(x^{(k)})$, with the residual at $x^{(k)}$

$$d^{(k)} := f(x^{(k)})\,.$$

Thus the subproblem of the $k$th iteration is easier in the sense that it consists of a system of linear equations (with the same structure of dependence as $f$; see Exercise 8.6). In the same sense the system of equations (5.10) in the case of linear stationary methods is "easier" to solve than the original problem of the same type. Furthermore, $W$ is in general different for different $k$.

An application of (8.12), (8.13) to $Ax = b$, i.e., $Df(x) = A$ for all $x \in \mathbb{R}^m$, results in (5.10) with $W = A$, a method converging in one step, which just reformulates the original problem:

$$A\,(x - x^{(0)}) = -(A x^{(0)} - b)\,.$$
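The steps (8.12), (8.13) can be sketched directly (Python with NumPy; the test system $f$, its Jacobi matrix, and the starting point are illustrative choices, not from the text): each iteration performs one linear solve with the current Jacobi matrix.

```python
import numpy as np

def newton(f, Df, x0, tol=1e-12, maxit=50):
    """Newton's method: solve Df(x^(k)) delta^(k) = -f(x^(k))   (8.12)
    and update x^(k+1) := x^(k) + delta^(k).                    (8.13)"""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        delta = np.linalg.solve(Df(x), -f(x))   # linear subproblem (8.12)
        x = x + delta                           # update (8.13)
        if np.linalg.norm(delta) < tol:
            break
    return x

# Illustrative system: intersect the circle x1^2 + x2^2 = 4 with the
# hyperbola x1*x2 = 1 (one zero lies near (1.93, 0.52)).
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
Df = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]],
                         [x[1], x[0]]])
x = newton(f, Df, x0=[2.0, 0.5])
```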


The range of convergence of the iteration may be very small, as can be shown already by one-dimensional examples. But in this neighbourhood of the solution we have, e.g., for $m = 1$, the following:

Corollary 8.9 Let $f \in C^3(\mathbb{R})$ and let $\bar x$ be a simple zero of $f$ (i.e., $f'(\bar x) \ne 0$). Then Newton's method converges locally to $\bar x$, of order at least 2.

Proof: There exists an open neighbourhood $V$ of $\bar x$ such that $f'(x) \ne 0$ for all $x \in V$; i.e., $\Phi$ is well-defined by (8.11), continuous on $V$, and $\bar x$ is a fixed point of $\Phi$. According to Corollary 8.8 it suffices to show that $\Phi'(\bar x) = 0$:

$$\Phi'(x) = 1 - \frac{f'(x)^2 - f(x) f''(x)}{f'(x)^2} = f(x)\,\frac{f''(x)}{f'(x)^2} = 0 \quad \text{for } x = \bar x\,,$$

and $\Phi''$ exists continuously, because $f \in C^3(\mathbb{R})$. $\qquad \Box$
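Quadratic convergence means that the number of correct digits roughly doubles in each step. A short check (Python; the function $f(x) = x^2 - 2$ with the simple zero $\sqrt 2$ is an illustrative choice): the order-2 quotients $|e_{k+1}|/|e_k|^2$ remain bounded, here near $f''(\bar x)/(2 f'(\bar x)) = 1/(2\sqrt 2)$.

```python
import math

f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
xbar = math.sqrt(2.0)             # the simple zero; f'(xbar) = 2*sqrt(2) != 0

x, errors = 2.0, []
for _ in range(5):
    x = x - f(x) / df(x)          # Newton step: x - f(x)/f'(x)
    errors.append(abs(x - xbar))

# e_{k+1}/e_k^2 stays near 1/(2*sqrt(2)) ~ 0.354: order (at least) 2
quotients = [errors[k + 1] / errors[k]**2 for k in range(3)]
assert all(0.3 < q < 0.4 for q in quotients)
```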

In the following we want to develop a general local convergence theorem for Newton's method (according to L.V. Kantorovich). It necessitates only the Lipschitz continuity of $Df$ and ensures the existence of a zero, too. Here we always suppose a fixed norm on $\mathbb{R}^m$ and consider a compatible norm on $\mathbb{R}^{m,m}$. As a prerequisite we need the following lemma:

Lemma 8.10 Let $C_0 \subset \mathbb{R}^m$ be convex and open, $f: C_0 \to \mathbb{R}^m$ differentiable, and suppose there exists $\gamma > 0$ such that

$$\|Df(x) - Df(y)\| \le \gamma\,\|x - y\| \quad \text{for all } x, y \in C_0\,. \quad (8.14)$$

Then for all $x, y \in C_0$,

$$\|f(x) - f(y) - Df(y)(x - y)\| \le \frac{\gamma}{2}\,\|x - y\|^2\,.$$

Proof: Let $\varphi: [0,1] \to \mathbb{R}^m$ be defined by $\varphi(t) := f(y + t(x - y))$, for arbitrary, fixed $x, y \in C_0$. Then $\varphi$ is differentiable on $[0,1]$ and

$$\varphi'(t) = Df(y + t(x - y))\,(x - y)\,.$$

Thus for $t \in [0,1]$ we have

$$\|\varphi'(t) - \varphi'(0)\| = \|\bigl(Df(y + t(x - y)) - Df(y)\bigr)(x - y)\| \le \|Df(y + t(x - y)) - Df(y)\|\,\|x - y\| \le \gamma t\,\|x - y\|^2\,.$$

For

$$\Delta := f(x) - f(y) - Df(y)(x - y) = \varphi(1) - \varphi(0) - \varphi'(0) = \int_0^1 \bigl(\varphi'(t) - \varphi'(0)\bigr)\,dt$$

we also get

$$\|\Delta\| \le \int_0^1 \|\varphi'(t) - \varphi'(0)\|\,dt \le \gamma\,\|x - y\|^2 \int_0^1 t\,dt = \frac{\gamma}{2}\,\|x - y\|^2\,. \qquad \Box$$


Now we are able to conclude local, quadratic convergence:

Theorem 8.11 Let $C \subset \mathbb{R}^m$ be convex and open, and $f: C \to \mathbb{R}^m$ differentiable. For $x^{(0)} \in C$ let there exist $\alpha, \beta, \gamma > 0$ such that

$$h := \alpha \beta \gamma / 2 < 1\,, \quad r := \alpha/(1 - h)\,, \quad \bar B_r(x^{(0)}) \subset C\,.$$

Furthermore, we require:

(i) $Df$ is Lipschitz continuous on $C_0 = B_{r+\mu}(x^{(0)})$ for some $\mu > 0$ with constant $\gamma$ in the sense of (8.14).

(ii) For all $x \in B_r(x^{(0)})$ there exists $Df(x)^{-1}$ and $\|Df(x)^{-1}\| \le \beta$.

(iii) $\|Df(x^{(0)})^{-1} f(x^{(0)})\| \le \alpha$.

Then:

(1) The Newton iteration

$$x^{(k+1)} := x^{(k)} - Df(x^{(k)})^{-1} f(x^{(k)})$$

is well-defined and $x^{(k)} \in B_r(x^{(0)})$ for all $k \in \mathbb{N}$.

(2) $x^{(k)} \to \bar x$ for $k \to \infty$ and $f(\bar x) = 0$.

(3) $\displaystyle \|x^{(k+1)} - \bar x\| \le \frac{\beta\gamma}{2}\,\|x^{(k)} - \bar x\|^2$ and $\displaystyle \|x^{(k)} - \bar x\| \le \alpha\,\frac{h^{2^k - 1}}{1 - h^{2^k}}$ for $k \in \mathbb{N}$.

Proof: (1): To show that $x^{(k+1)}$ is well-defined, it is sufficient to verify

$$x^{(k)} \in B_r(x^{(0)}) \ (\subset C) \quad \text{for all } k \in \mathbb{N}\,.$$

By induction we prove the extended proposition

$$x^{(k)} \in B_r(x^{(0)}) \quad \text{and} \quad \|x^{(k)} - x^{(k-1)}\| \le \alpha\,h^{2^{k-1} - 1} \quad \text{for all } k \in \mathbb{N}\,. \quad (8.15)$$

The proposition (8.15) holds for $k = 1$, because according to (iii),

$$\|x^{(1)} - x^{(0)}\| = \|Df(x^{(0)})^{-1} f(x^{(0)})\| \le \alpha < r\,.$$

Let (8.15) be valid for $l = 1, \ldots, k$. Then $x^{(k+1)}$ is well-defined, and by the application of the Newton iteration for $k - 1$ we get

$$\begin{aligned}
\|x^{(k+1)} - x^{(k)}\| &= \|Df(x^{(k)})^{-1} f(x^{(k)})\| \le \beta\,\|f(x^{(k)})\| \\
&= \beta\,\|f(x^{(k)}) - f(x^{(k-1)}) - Df(x^{(k-1)})\,(x^{(k)} - x^{(k-1)})\| \\
&\le \frac{\beta\gamma}{2}\,\|x^{(k)} - x^{(k-1)}\|^2
\end{aligned}$$

according to Lemma 8.10 with $C_0 = B_r(x^{(0)})$, and

$$\|x^{(k+1)} - x^{(k)}\| \le \frac{\beta\gamma}{2}\,\|x^{(k)} - x^{(k-1)}\|^2 \le \frac{\beta\gamma}{2}\,\alpha^2 h^{2^k - 2} = \alpha\,h^{2^k - 1}\,.$$

Thus the second part of (8.15) holds for $k + 1$, and also

$$\begin{aligned}
\|x^{(k+1)} - x^{(0)}\| &\le \|x^{(k+1)} - x^{(k)}\| + \|x^{(k)} - x^{(k-1)}\| + \cdots + \|x^{(1)} - x^{(0)}\| \\
&\le \alpha\,\bigl(h^{2^k - 1} + h^{2^{k-1} - 1} + \cdots + h^7 + h^3 + h + 1\bigr) \\
&< \alpha/(1 - h) = r\,.
\end{aligned}$$

Hence (8.15) holds for $k + 1$.

(2): Using (8.15) we are able to verify that $(x^{(k)})_k$ is a Cauchy sequence, because for $l \ge k$ we have

$$\begin{aligned}
\|x^{(l+1)} - x^{(k)}\| &\le \|x^{(l+1)} - x^{(l)}\| + \|x^{(l)} - x^{(l-1)}\| + \cdots + \|x^{(k+1)} - x^{(k)}\| \\
&\le \alpha\,h^{2^k - 1}\Bigl(1 + h^{2^k} + \bigl(h^{2^k}\bigr)^3 + \cdots\Bigr) \\
&< \frac{\alpha\,h^{2^k - 1}}{1 - h^{2^k}} \to 0 \quad \text{for } k \to \infty\,,
\end{aligned} \quad (8.16)$$

since $h < 1$. Hence there exists $\bar x = \lim_{k \to \infty} x^{(k)}$ and $\bar x \in \bar B_r(x^{(0)})$. Furthermore, $f(\bar x) = 0$, because we can conclude from $x^{(k)} \in B_r(x^{(0)})$ that

$$\|Df(x^{(k)}) - Df(x^{(0)})\| \le \gamma\,\|x^{(k)} - x^{(0)}\| < \gamma r\,;$$

thus

$$\|Df(x^{(k)})\| \le \gamma r + \|Df(x^{(0)})\| =: K\,,$$

and from $f(x^{(k)}) = -Df(x^{(k)})\,(x^{(k+1)} - x^{(k)})$ we obtain

$$\|f(x^{(k)})\| \le K\,\|x^{(k+1)} - x^{(k)}\| \to 0$$

for $k \to \infty$. Thus we also have

$$f(\bar x) = \lim_{k \to \infty} f(x^{(k)}) = 0\,.$$

(3): With $l \to \infty$ in (8.16) we can prove the second part of (3); the first part follows from

$$\begin{aligned}
x^{(k+1)} - \bar x &= x^{(k)} - Df(x^{(k)})^{-1} f(x^{(k)}) - \bar x \\
&= x^{(k)} - \bar x - Df(x^{(k)})^{-1}\bigl(f(x^{(k)}) - f(\bar x)\bigr) \\
&= Df(x^{(k)})^{-1}\Bigl(f(\bar x) - f(x^{(k)}) - Df(x^{(k)})\,(\bar x - x^{(k)})\Bigr),
\end{aligned}$$

which implies, according to Lemma 8.10 with $C_0 = B_{r+\mu}(x^{(0)}) \subset C$,

$$\|x^{(k+1)} - \bar x\| \le \beta\,\frac{\gamma}{2}\,\|x^{(k)} - \bar x\|^2\,. \qquad \Box$$
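For a concrete scalar function the constants of Theorem 8.11 can be estimated by hand and the a priori bound of part (3) checked numerically. The following sketch (Python) uses our own illustrative choices $f(x) = x^2 - 2$, $x^{(0)} = 1.5$, and crude estimates for $\alpha$, $\beta$, $\gamma$; it is not part of the text:

```python
import math

# f(x) = x**2 - 2, Df(x) = 2x. Since |Df(x) - Df(y)| = 2|x - y|, we may
# take gamma = 2 in (8.14). All constants below are illustrative estimates.
x0 = 1.5
alpha = abs((x0**2 - 2.0) / (2.0 * x0))   # |Df(x0)^{-1} f(x0)| = 1/12   (iii)
beta = 1.0 / (2.0 * 1.4)                  # |Df(x)^{-1}| <= beta, x > 1.4 (ii)
gamma = 2.0                               # Lipschitz constant of Df      (i)

h = alpha * beta * gamma / 2.0
r = alpha / (1.0 - h)
# consistency: h < 1 and B_r(x0) lies inside (1.4, 1.6), where beta is valid
assert h < 1.0 and 1.4 < x0 - r and x0 + r < 1.6

# Newton iteration versus the a priori bound from part (3)
x, xbar = x0, math.sqrt(2.0)
for k in range(1, 4):
    x = x - (x * x - 2.0) / (2.0 * x)
    bound = alpha * h ** (2**k - 1) / (1.0 - h ** (2**k))
    assert abs(x - xbar) <= bound         # the bound shrinks quadratically
```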


The termination criterion (5.15), which is based on the residual, may also be used for the nonlinear problem (and not just for the Newton iteration). This can be deduced in analogy to (5.16):

Theorem 8.12 Let the following be valid: