and, by (7.26),

$$E_{n+1}^T \phi = (I - \theta_n P_n) E_n^T \phi,$$

we may invoke Lemma 7.2.1 with

$$\eta_n = s_n/\|s_n\|_2, \quad \psi_n = E_n^T \phi, \quad \text{and} \quad \zeta_n = 0$$

to conclude that

$$(7.30) \qquad \eta_n^T \psi_n = \frac{(E_n^T \phi)^T s_n}{\|s_n\|_2} = \frac{\phi^T E_n s_n}{\|s_n\|_2} \to 0.$$

This completes the proof.

The result here is not a local convergence result. It is global in the sense

that q-superlinear convergence takes place independently of the initial choices

of x and B. It is known [29], [82], [84], [143], that the Broyden iteration will

terminate in 2N steps in the linear case.
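Although the analysis is the main point here, the iteration itself is short. The following Python sketch (illustrative only, and not the implementation advocated in § 7.3) applies Broyden's method with full steps to a small linear problem F(x) = Ax − b. The update is written in the general secant form B_+ = B_c + (y − B_c s)s^T/(s^T s); since y − B_c s = F(x_+) for the full step, this coincides with (7.2).

```python
import numpy as np

def broyden(F, x, B, maxit=25, tol=1e-12):
    """Broyden's method with full steps (theta_n = 1).

    The update is the secant form B_+ = B_c + (y - B_c s) s^T / (s^T s);
    for the full step, y - B_c s = F(x_+), so this is update (7.2)."""
    for _ in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        s = np.linalg.solve(B, -Fx)      # quasi-Newton step s = -B^{-1} F(x)
        x = x + s
        y = F(x) - Fx
        B = B + np.outer(y - B @ s, s) / (s @ s)
    return x

# A linear test problem F(x) = A x - b with A near I, so that B_0 = I
# is a good initial approximate Jacobian.
rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
b = rng.standard_normal(3)
F = lambda x: A @ x - b
x = broyden(F, np.zeros(3), np.eye(3))
```

On a well-conditioned linear problem such as this, the residual drops to roundoff level in well under the 2N-step bound quoted above.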

7.2.2. Nonlinear problems. Our analysis of the nonlinear case is different from the classical work [28], [63]. As indicated in the preface, we provide a proof based on ideas from [97], [115], and [104], that does not use the Frobenius norm and extends to various infinite-dimensional settings.

For nonlinear problems our analysis is local and we set θ_n = 1. However, we must show that the sequence {B_n} of Broyden updates exists and remains nonsingular. We make a definition that allows us to compactly express this fact.

Definition 7.2.1. We say that the Broyden sequence ({x_n}, {B_n}) for the data (F, x_0, B_0) exists if F is defined at x_n for all n, B_n is nonsingular for all n, and x_{n+1} and B_{n+1} can be computed from x_n and B_n using (7.1) and (7.2).

Buy this book from SIAM at http://www.ec-securehost.com/SIAM/FR16.html.

Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.


BROYDEN'S METHOD

We will show that if the standard assumptions hold and x_0 and B_0 are sufficiently good approximations to x^* and F'(x^*), then the Broyden sequence exists for the data (F, x_0, B_0) and the Broyden iterates converge q-superlinearly to x^*. The development of the proof is complicated. We first prove a bounded deterioration result like Lemma 7.2.2. However, in the nonlinear case the errors in the Jacobian approximation can grow as the iteration progresses. We then show that if the initial data is sufficiently good, this growth can be limited and therefore the Broyden sequence exists and the Broyden iterates converge to x^* at least q-linearly. Finally, we verify the Dennis–Moré condition (7.6), which, together with the q-linear convergence proved earlier, completes the proof of local q-superlinear convergence.

Bounded deterioration. Our first task is to show that bounded deterioration holds. Unlike the linear case, the sequence {‖E_n‖_2} need not be monotone non-increasing. However, the possible increase can be bounded in terms of the errors in the current and new iterates.

Theorem 7.2.2. Let the standard assumptions hold. Let x_c ∈ Ω and a nonsingular matrix B_c be given. Assume that

$$x_+ = x_c - B_c^{-1} F(x_c) = x_c + s \in \Omega$$

and B_+ is given by (7.2). Then

$$(7.31) \qquad \|E_+\|_2 \le \|E_c\|_2 + \gamma(\|e_c\|_2 + \|e_+\|_2)/2.$$

Proof. Note that (7.7) implies that

$$(7.32) \qquad E_c s = -F(x_+) + (F(x_+) - F(x_c) - F'(x^*)s) = -F(x_+) + \int_0^1 (F'(x_c + ts) - F'(x^*)) s \, dt.$$

We begin by writing (7.32) as

$$F(x_+) = -E_c s + \int_0^1 (F'(x_c + ts) - F'(x^*)) s \, dt$$

and then using (7.2) to obtain

$$E_+ = E_c (I - P_s) + \frac{(\Delta_c s) s^T}{\|s\|_2^2},$$

where P_s is given by (7.27) and

$$\Delta_c = \int_0^1 (F'(x_c + ts) - F'(x^*)) \, dt.$$


122 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS

Hence, by the standard assumptions,

$$\|\Delta_c\|_2 \le \gamma(\|e_c\|_2 + \|e_+\|_2)/2.$$

Just as in the proof of Lemma 7.2.2 for the linear case,

$$\|E_+\|_2 \le \|E_c\|_2 + \|\Delta_c\|_2,$$

and the proof is complete.

Local q-linear convergence. Theorem 7.2.3 states that Broyden's method is locally q-linearly convergent. The proof is more complex than similar results in Chapters 5 and 6. We must explore the relation between the size of ‖E_0‖_2 needed to enforce q-linear convergence and the q-factor.

Theorem 7.2.3. Let the standard assumptions hold. Let r ∈ (0, 1) be given. Then there are δ and δ_B such that if x_0 ∈ B(δ) and ‖E_0‖_2 < δ_B, the Broyden sequence for the data (F, x_0, B_0) exists and x_n → x^* q-linearly with q-factor at most r.

Proof. Let δ and δ_1 be small enough so that the conclusions of Theorem 5.4.3 hold. Then, reducing δ and δ_1 further if needed, the q-factor is at most

$$K_A(\delta + \delta_1) \le r,$$

where K_A is from the statement of Theorem 5.4.3.

We will choose δ_B, reducing δ if necessary, so that ‖E_0‖_2 ≤ δ_B implies ‖E_n‖_2 ≤ δ_1 for all n; this will prove the result. To do this, reduce δ if needed so that

$$(7.33) \qquad \delta_2 = \frac{\gamma(1+r)\delta}{2(1-r)} < \delta_1$$

and set

$$(7.34) \qquad \delta_B = \delta_1 - \delta_2.$$

We show that ‖E_n‖_2 ≤ δ_1 by induction. Since ‖E_0‖_2 < δ_B < δ_1 we may begin the induction. Now assume that ‖E_k‖_2 < δ_1 for all 0 ≤ k ≤ n. By Theorem 7.2.2,

$$\|E_{n+1}\|_2 \le \|E_n\|_2 + \gamma(\|e_{n+1}\|_2 + \|e_n\|_2)/2.$$

Since ‖E_n‖_2 < δ_1, ‖e_{n+1}‖_2 ≤ r ‖e_n‖_2 and therefore

$$\|E_{n+1}\|_2 \le \|E_n\|_2 + \gamma(1+r)\|e_n\|_2/2 \le \|E_n\|_2 + \gamma(1+r) r^n \delta/2.$$

We can estimate ‖E_n‖_2, ‖E_{n−1}‖_2, … in the same way to obtain

$$\|E_{n+1}\|_2 \le \|E_0\|_2 + \frac{(1+r)\gamma\delta}{2} \sum_{j=0}^{n} r^j \le \delta_B + \frac{(1+r)\gamma\delta}{2(1-r)} \le \delta_1,$$

which completes the proof.


Verification of the Dennis–Moré condition. The final task is to verify the Dennis–Moré condition.

Theorem 7.2.4. Let the standard assumptions hold. Then there are δ and δ_B such that if x_0 ∈ B(δ) and ‖E_0‖_2 < δ_B, the Broyden sequence for the data (F, x_0, B_0) exists and x_n → x^* q-superlinearly.

Proof. Let δ and δ_B be such that the conclusions of Theorem 7.2.3 hold. By Theorem 7.1.1 we need only show that the Dennis–Moré condition (7.6) holds to complete the proof.

As in the proof of Theorem 7.2.1, let

$$P_n = \frac{s_n s_n^T}{\|s_n\|_2^2}.$$

Set

$$\Delta_n = \int_0^1 (F'(x_n + t s_n) - F'(x^*)) \, dt.$$

Let φ ∈ R^N be arbitrary. Note that

$$E_{n+1}^T \phi = (I - P_n) E_n^T \phi + P_n \Delta_n^T \phi.$$

We wish to apply Lemma 7.2.1 with

$$\psi_n = E_n^T \phi, \quad \eta_n = s_n/\|s_n\|_2, \quad \text{and} \quad \zeta_n = P_n \Delta_n^T \phi.$$

The hypothesis in the lemma that

$$\sum_n \|\zeta_n\|_2 < \infty$$

holds since Theorem 7.2.3 and the standard assumptions imply

$$\|\Delta_n\|_2 \le \gamma(1+r) r^n \delta/2.$$

Hence Lemma 7.2.1 implies that

$$\eta_n^T \psi_n = \frac{(E_n^T \phi)^T s_n}{\|s_n\|_2} = \frac{\phi^T E_n s_n}{\|s_n\|_2} \to 0.$$

This, as in the proof of Theorem 7.2.1, implies the Dennis–Moré condition and completes the proof.

7.3. Implementation of Broyden's method

The implementation of Broyden's method that we advocate here is related to one that has been used in computational fluid mechanics [73]. Some such methods are called limited memory formulations in the optimization literature [138], [142], [31]. In these methods, after the storage available for the iteration is exhausted, the oldest of the stored vectors is replaced by the most recent. Another approach, restarting, clears the storage and starts over.


This latter approach is similar to the restarted GMRES iteration. Our basic implementation is a nonlinear version of the implementation for linear problems in [67]. We use a restarted approach, though a limited memory approach could be used as well.

We begin by showing how the initial approximation to F'(x^*), B_0, may be regarded as a preconditioner and incorporated into the definition of F just as preconditioners were in the implementation of Newton-GMRES in § 6.4.

Lemma 7.3.1. Assume that the Broyden sequence ({x_n}, {B_n}) for the data (F, x_0, B_0) exists. Then the Broyden sequence ({y_n}, {C_n}) exists for the data (B_0^{-1} F, x_0, I) and

$$(7.35) \qquad x_n = y_n, \quad B_n = B_0 C_n$$

for all n.

Proof. We proceed by induction. Equation (7.35) holds by definition when n = 0. If (7.35) holds for a given n, then

$$(7.36) \qquad y_{n+1} = y_n - C_n^{-1} B_0^{-1} F(y_n) = x_n - (B_0 C_n)^{-1} F(x_n) = x_n - B_n^{-1} F(x_n) = x_{n+1}.$$

Now s_n = x_{n+1} − x_n = y_{n+1} − y_n by (7.36). Hence

$$B_{n+1} = B_n + \frac{F(x_{n+1}) s_n^T}{\|s_n\|_2^2} = B_0 C_n + B_0 \frac{B_0^{-1} F(y_{n+1}) s_n^T}{\|s_n\|_2^2} = B_0 C_{n+1}.$$

This completes the proof.
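Lemma 7.3.1 is easy to check numerically. The sketch below (illustrative only; the test function F and the choice of B_0 are arbitrary data, not from the text) runs Broyden's method on the data (F, x_0, B_0) and on the preconditioned data (B_0^{-1}F, x_0, I) and confirms that the iterates coincide, as (7.35) asserts.

```python
import numpy as np

def broyden_seq(F, x, B, nsteps):
    """First nsteps+1 Broyden iterates for the data (F, x, B): full steps
    and the update B_+ = B_c + F(x_+) s^T / ||s||_2^2 from (7.2)."""
    xs = [x.copy()]
    for _ in range(nsteps):
        s = np.linalg.solve(B, -F(x))
        x = x + s
        B = B + np.outer(F(x), s) / (s @ s)
        xs.append(x.copy())
    return xs

rng = np.random.default_rng(1)
n = 3
F = lambda x: x + 0.1 * np.sin(x) - 1.0    # mildly nonlinear test function
B0 = np.diag(1.0 + 0.2 * rng.random(n))    # a nontrivial initial B_0
x0 = np.zeros(n)

# Lemma 7.3.1: the data (F, x0, B0) and (B0^{-1} F, x0, I) generate
# the same iterates, x_n = y_n.
G = lambda x: np.linalg.solve(B0, F(x))
xs = broyden_seq(F, x0, B0, 4)
ys = broyden_seq(G, x0, np.eye(n), 4)
```

The two sequences agree to roundoff, which is what lets the implementation fold B_0 into the function evaluation.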

The consequence of this proof for implementation is that we can assume that B_0 = I. If a better choice for B_0 exists, we can incorporate that into the function evaluation. Our plan will be to store B_n^{-1} in a compact and easy-to-apply form. The basis for this is the following simple formula [68], [176], [177], [11].

Proposition 7.3.1. Let B be a nonsingular N × N matrix and let u, v ∈ R^N. Then B + uv^T is invertible if and only if 1 + v^T B^{-1} u ≠ 0. In this case,

$$(7.37) \qquad (B + uv^T)^{-1} = \left( I - \frac{(B^{-1} u) v^T}{1 + v^T B^{-1} u} \right) B^{-1}.$$

The expression (7.37) is called the Sherman–Morrison formula. We leave the proof to Exercise 7.5.2.
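A quick numerical check of (7.37), with random test data (not part of the text's development):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
B = np.eye(N) + 0.1 * rng.standard_normal((N, N))  # nonsingular test matrix
u = 0.1 * rng.standard_normal(N)
v = 0.1 * rng.standard_normal(N)

Binv = np.linalg.inv(B)
denom = 1.0 + v @ Binv @ u    # invertibility condition: denom != 0

# (7.37): (B + u v^T)^{-1} = (I - (B^{-1} u) v^T / (1 + v^T B^{-1} u)) B^{-1}
lhs = np.linalg.inv(B + np.outer(u, v))
rhs = (np.eye(N) - np.outer(Binv @ u, v) / denom) @ Binv
```

In production code one would never form these inverses explicitly; the point of §7.3 is precisely to apply (7.37) as a sequence of cheap rank-one corrections.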

In the context of a sequence of Broyden updates {B_n} we have, for n ≥ 0,

$$B_{n+1} = B_n + u_n v_n^T,$$


where

$$u_n = F(x_{n+1})/\|s_n\|_2 \quad \text{and} \quad v_n = s_n/\|s_n\|_2.$$

Setting

$$w_n = (B_n^{-1} u_n)/(1 + v_n^T B_n^{-1} u_n),$$

we see that, if B_0 = I,

$$(7.38) \qquad B_n^{-1} = (I - w_{n-1} v_{n-1}^T)(I - w_{n-2} v_{n-2}^T) \cdots (I - w_0 v_0^T) = \prod_{j=0}^{n-1} (I - w_j v_j^T).$$

Since the empty matrix product is the identity, (7.38) is valid for n ≥ 0. Hence the action of B_n^{-1} on F(x_n) (i.e., the computation of the Broyden step) can be computed from the 2n vectors {w_j, v_j}_{j=0}^{n-1} at a cost of O(Nn) floating-point operations. Moreover, the Broyden step for the following iteration is

$$(7.39) \qquad s_n = -B_n^{-1} F(x_n) = -\prod_{j=0}^{n-1} (I - w_j v_j^T) F(x_n).$$
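The product form (7.38) translates directly into an O(Nn) routine. In the sketch below (the helper name apply_Binv is ours, and the rank-one updates are arbitrary test data rather than Broyden updates), the w_j are accumulated by the Sherman–Morrison recursion with B_0 = I, and the product is checked against a dense solve.

```python
import numpy as np

def apply_Binv(g, ws, vs):
    """Apply B_n^{-1} = prod_{j=0}^{n-1} (I - w_j v_j^T) to g, per (7.38).

    The factors are applied right-to-left (j = 0 first); the cost is
    O(N n) floating-point operations for n stored pairs (w_j, v_j)."""
    for w, v in zip(ws, vs):
        g = g - w * (v @ g)
    return g

# Build B_n from rank-one updates with B_0 = I, accumulating the
# vectors w_j = B_j^{-1} u_j / (1 + v_j^T B_j^{-1} u_j).
rng = np.random.default_rng(3)
N, n = 5, 3
B = np.eye(N)
ws, vs = [], []
for _ in range(n):
    u = 0.2 * rng.standard_normal(N)
    v = 0.2 * rng.standard_normal(N)
    z = np.linalg.solve(B, u)
    ws.append(z / (1.0 + v @ z))
    vs.append(v)
    B = B + np.outer(u, v)

g = rng.standard_normal(N)
```

Only the 2n vectors are stored; no N × N matrix is ever formed in the loop that applies B_n^{-1}.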

Since the product

$$\prod_{j=0}^{n-2} (I - w_j v_j^T) F(x_n)$$

must also be computed as part of the computation of w_{n-1}, we can combine the computation of w_{n-1} and s_n as follows:

$$(7.40) \qquad \begin{aligned} w &= \prod_{j=0}^{n-2} (I - w_j v_j^T) F(x_n), \\ w_{n-1} &= C_w w \quad \text{where } C_w = (\|s_{n-1}\|_2 + v_{n-1}^T w)^{-1}, \\ s_n &= -(I - w_{n-1} v_{n-1}^T) w. \end{aligned}$$

One may carry this one important step further [67] and eliminate the need to store the sequence {w_n}. Note that (7.40) implies that for n ≥ 1

$$s_n = -w + C_w w (v_{n-1}^T w) = w(-1 + C_w (C_w^{-1} - \|s_{n-1}\|_2)) = -C_w w \|s_{n-1}\|_2 = -\|s_{n-1}\|_2 w_{n-1}.$$

Hence, for n ≥ 0,

$$(7.41) \qquad w_n = -s_{n+1}/\|s_n\|_2.$$

Therefore, one need only store the steps {s_n} and their norms to construct the sequence {w_n}. In fact, we can write (7.38) as

$$(7.42) \qquad B_n^{-1} = \prod_{j=0}^{n-1} \left( I + \frac{s_{j+1} s_j^T}{\|s_j\|_2^2} \right).$$
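As a consistency check of (7.41)–(7.42) (illustrative only; the linear test problem is arbitrary), the sketch below runs Broyden's method with an explicitly stored B_n, records the steps s_j, and verifies that the product in (7.42), built from the stored steps alone, reproduces the action of B_n^{-1} on an arbitrary vector.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
A = np.eye(N) + 0.1 * rng.standard_normal((N, N))
b = rng.standard_normal(N)
F = lambda x: A @ x - b          # arbitrary linear test problem

# Broyden with an explicitly stored B: B_0 = I, full steps, update (7.2).
x, B = np.zeros(N), np.eye(N)
steps, Bs = [], [np.eye(N)]
for _ in range(4):
    s = np.linalg.solve(B, -F(x))
    x = x + s
    B = B + np.outer(F(x), s) / (s @ s)
    steps.append(s)
    Bs.append(B.copy())

# (7.42): B_n^{-1} g = prod_{j=0}^{n-1} (I + s_{j+1} s_j^T / ||s_j||_2^2) g,
# using only the stored steps s_0, ..., s_n (rightmost factor, j = 0, first).
n = 3
g = rng.standard_normal(N)
y = g.copy()
for j in range(n):
    y = y + steps[j + 1] * (steps[j] @ y) / (steps[j] @ steps[j])
```

Note that applying B_n^{-1} to an arbitrary vector g this way is unambiguous; the subtlety discussed next arises only when the vector being produced, s_{n+1}, also appears inside the product.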


We cannot use (7.42) and (7.39) directly to compute s_{n+1} because s_{n+1} appears on both sides of the equation

$$s_{n+1} = -\left( I + \frac{s_{n+1} s_n^T}{\|s_n\|_2^2} \right) \prod_{j=0}^{n-1} \left( I + \frac{s_{j+1} s_j^T}{\|s_j\|_2^2} \right) F(x_{n+1})$$