
$w \in \mathbb{R}^N$. We have

(5.15)
$$
D_h F(x : w) =
\begin{cases}
0, & w = 0, \\[4pt]
\|w\|\,\dfrac{F(x + h\|x\|\,w/\|w\|) - F(x)}{h\|x\|}, & w, x \neq 0, \\[4pt]
\|w\|\,\dfrac{F(h\,w/\|w\|) - F(x)}{h}, & x = 0,\ w \neq 0.
\end{cases}
$$
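Read as an algorithm, (5.15) can be transcribed directly. The following is a minimal MATLAB sketch (the name fddir and its argument list are ours for illustration, not part of the book's code collection) that evaluates the scaled difference quotient for a function handle f:

```matlab
function d = fddir(f, x, w, h)
% FDDIR  Scaled forward difference directional derivative, following (5.15).
if norm(w) == 0
    d = zeros(size(x));                       % first case: w = 0
elseif norm(x) ~= 0
    xs = x + h*norm(x)*w/norm(w);             % step of size h*||x|| in direction w
    d = norm(w)*(f(xs) - f(x))/(h*norm(x));   % second case: w, x ~= 0
else
    d = norm(w)*(f(h*w/norm(w)) - f(x))/h;    % third case: x = 0, w ~= 0
end
```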
The standard assumptions and the Banach Lemma imply the following
result.
Lemma 5.4.1. Let the standard assumptions hold. Then there are $\bar\delta$, $\bar\epsilon$, $\bar h > 0$ such that if $x \in B(\bar\delta)$, $h \le \bar h$, and $\|\epsilon(x)\| \le \bar\epsilon$ for all $x \in B(\bar\delta)$, then $\nabla_h (F + \epsilon)(x)$ is nonsingular for all $x \in B(\bar\delta)$ and there is $M_F > 0$ such that

$$\|F'(x) - \nabla_h (F + \epsilon)(x)\| \le M_F (h + \bar\epsilon/h).$$

If we assume that the forward difference step has been selected so that $h = O(\sqrt{\bar\epsilon})$, then $\Delta(x) = O(\sqrt{\bar\epsilon})$ by Lemma 5.4.1. Theorem 5.4.1 implies the following result.
Theorem 5.4.6. Let the standard assumptions hold. Then there are positive $\delta$, $\bar\epsilon$, and $K_D$ such that if $x_c \in B(\delta)$, $\|\epsilon(x)\| \le \bar\epsilon$ for all $x \in B(\delta)$, and there is $M_- > 0$ such that

$$h_c > M_- \sqrt{\bar\epsilon},$$

then
$$x_+ = x_c - \nabla_{h_c} (F + \epsilon)(x_c)^{-1} (F(x_c) + \epsilon(x_c))$$
satisfies
$$\|e_+\| \le K_D \left(\bar\epsilon + (\|e_c\| + h_c)\|e_c\|\right).$$
Note that Theorem 5.4.6 does not imply that an iteration will converge to $x^*$. Even if $\epsilon$ is entirely due to floating-point roundoff, there is no guarantee that $\epsilon(x_n) \to 0$ as $x_n \to x^*$. Hence, one should expect the sequence $\{\|e_n\|\}$ to stagnate and cease to decrease once $\|e_n\| \approx \bar\epsilon$. Therefore in the early phases of the iteration, while $\|e_n\| \gg \sqrt{\bar\epsilon}$, the progress of the iteration will be indistinguishable from Newton's method as implemented with an exact Jacobian. When $\|e_n\| \le \sqrt{\bar\epsilon}$, Theorem 5.4.6 says that $\|e_{n+1}\| = O(\bar\epsilon)$. Hence one will see quadratic convergence until the error in $F$ admits no further reduction.
A difference approximation to a full Jacobian does not use any special structure of the Jacobian and will probably be less efficient than a hand coded analytic Jacobian. If information on the sparsity pattern of the Jacobian is available it is sometimes possible [47], [42], to evaluate $\nabla_h F$ with far fewer than $N$ evaluations of $F$. If the sparsity pattern is such that $F$ can be reduced to
a block triangular form, the entire solution process can take advantage of this
structure [59].
For problems in which the Lipschitz constant of $F'$ is large, the error in the difference approximation will be large, too. Moreover, $D_h F(x : w)$ is not, in general, a linear function of $w$ and it is usually the case that

$$(\nabla_h F(x))\, w \neq D_h F(x : w).$$

If $F'(x^*)$ is ill conditioned or the Lipschitz constant of $F'$ is large, the difference between them may be significant and a difference approximation to a directional derivative, which uses the size of $w$, is likely to be more accurate than $\nabla_h F(x)\, w$.
While we used full matrix approximations to numerical Jacobians in all
the examples reported here, they should be used with caution.
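One concrete (and deliberately unoptimized) way to assemble a dense difference Jacobian is column by column, applying the fddir sketch above to the coordinate directions at a cost of $N$ extra evaluations of $F$. The name fdjac_sketch is ours; it stands in for, but is not, the book's diffjac. With $h = 10^{-7} \approx \sqrt{\text{eps}}$ this is consistent with the choice $h = O(\sqrt{\bar\epsilon})$ above.

```matlab
function jac = fdjac_sketch(f, x, h)
% FDJAC_SKETCH  Dense forward difference Jacobian; column j is D_h F(x : e_j).
n = length(x);  jac = zeros(n, n);
for j = 1:n
    ej = zeros(n, 1);  ej(j) = 1;      % j-th coordinate direction
    jac(:, j) = fddir(f, x, ej, h);    % one extra evaluation of F per column
end
```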

5.4.5. The secant method. The results in this section are for single equations $f(x) = 0$, where $f$ is a real-valued function of one real variable. We assume for the sake of simplicity that there are no errors in the evaluation of $f$, i.e., $\epsilon(x) = 0$. The standard assumptions, then, imply that $f'(x^*) \neq 0$. There is no reason to use a difference approximation to $f'$ for equations in a single variable because one can use previously computed data to approximate $f'(x_n)$ by

$$a_n = \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.$$

The resulting method is called the secant method and is due to Newton [140], [137]. We will prove that the secant method is locally q-superlinearly convergent if the standard assumptions hold.
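Before turning to the theory, a minimal MATLAB sketch of the iteration may be helpful. The function name, argument list, and the termination test on $|f(x)|$ are illustrative choices of ours, not the book's codes:

```matlab
function x = secant_sketch(f, xm, x0, tol, maxit)
% SECANT_SKETCH  Minimal secant iteration for a scalar equation f(x) = 0.
% xm and x0 play the roles of the starting approximations x_{-1} and x_0.
fm = f(xm);  x = x0;  fc = f(x);
for n = 1:maxit
    a = (fc - fm)/(x - xm);   % a_n, the difference approximation to f'(x_n)
    xm = x;  fm = fc;         % save x_n and f(x_n) for the next step
    x = x - fc/a;             % x_{n+1} = x_n - f(x_n)/a_n
    fc = f(x);
    if abs(fc) <= tol, return; end
end
```

For example, secant_sketch(@(t) t^2 - 2, 1, 2, 1e-12, 20) recovers $\sqrt{2}$ to full precision in a handful of iterations.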
In order to start, one needs two approximations to $x^*$, $x_0$ and $x_{-1}$. The local convergence theory can be easily described by Theorem 5.4.1. Let $x_c$ be the current iterate and $x_-$ the previous iterate. We have $\epsilon(x_c) = 0$ and

(5.16)
$$\Delta(x_c) = \frac{f(x_c) - f(x_-)}{x_c - x_-} - f'(x_c).$$
We have the following theorem.
Theorem 5.4.7. Let the standard assumptions hold with $N = 1$. Then there is $\delta > 0$ such that if $x_0, x_{-1} \in B(\delta)$ and $x_0 \neq x_{-1}$ then the secant iterates converge q-superlinearly to $x^*$.
Proof. Let $\delta$ be small enough so that $B(\delta) \subset \Omega$ and the conclusions of Theorem 5.4.1 hold. Assume that $x_{-1}, x_0, \ldots, x_n \in B(\delta)$. If $x_n = x^*$ for some finite $n$, we are done. Otherwise $x_n \neq x_{n-1}$. If we set $s = x_n - x_{n-1}$ we have by (5.16) and the fundamental theorem of calculus

$$\Delta(x_n) = \int_0^1 \left(f'(x_{n-1} + ts) - f'(x_n)\right) dt$$
and hence
$$|\Delta(x_n)| \le \gamma |x_{n-1} - x_n|/2 \le \frac{\gamma(|e_n| + |e_{n-1}|)}{2}.$$
Applying Theorem 5.4.1 implies

(5.17)
$$|e_{n+1}| \le \bar K\left((1 + \gamma/2)|e_n|^2 + \gamma |e_n|\, |e_{n-1}|/2\right).$$

If we reduce $\delta$ so that
$$\bar K(1 + \gamma)\delta = \eta < 1$$
then $x_n \to x^*$ q-linearly. Therefore

$$\frac{|e_{n+1}|}{|e_n|} \le \bar K\left((1 + \gamma/2)|e_n| + \gamma |e_{n-1}|/2\right) \to 0$$

as $n \to \infty$. This completes the proof.
The secant method and the classical bisection method (see Exercise 5.7.8) are the basis for the popular method of Brent [17] for the solution of single equations.

5.5. The Kantorovich Theorem
In this section we present the result of Kantorovich [107], [106], which states that if the standard assumptions “almost” hold at a point $x_0$, then there is a root and the Newton iteration converges r-quadratically to that root. This result is of use, for example, in proving that discretizations of nonlinear differential and integral equations have solutions that are near to that of the continuous problem.
We require the following assumption.
Assumption 5.5.1. There are $\beta$, $\eta$, $\bar r$, and $\gamma$ with $\beta\eta\gamma \le 1/2$ and $x_0 \in \mathbb{R}^N$ such that
1. $F$ is differentiable at $x_0$,

$$\|F'(x_0)^{-1}\| \le \beta, \quad \text{and} \quad \|F'(x_0)^{-1} F(x_0)\| \le \eta.$$

2. $F'$ is Lipschitz continuous with Lipschitz constant $\gamma$ in a ball of radius $\bar r \ge r_-$ about $x_0$ where

$$r_- = \frac{1 - \sqrt{1 - 2\beta\eta\gamma}}{\beta\gamma}.$$

We will let $B_0$ be the closed ball of radius $r_-$ about $x_0$:

$$B_0 = \{x \mid \|x - x_0\| \le r_-\}.$$

We do not prove the result in its full generality and refer to [106], [145], [57], and [58] for a complete proof and for discussions of related results. However, we give a simplified version of a Kantorovich-like result for the chord
method to show how existence of a solution can be derived from estimates
on the function and Jacobian. Unlike the chord method results in [106],
[145], and [57], Theorem 5.5.1 assumes a bit more than Assumption 5.5.1 but
has a simple proof that uses only the contraction mapping theorem and the
fundamental theorem of calculus and obtains q-linear convergence instead of
r-linear convergence.
Theorem 5.5.1. Let Assumption 5.5.1 hold and let

(5.18)
$$\beta\gamma\eta < 1/2.$$

Then there is a unique root $x^*$ of $F$ in $B_0$, the chord iteration with $x_0$ as the initial iterate converges to $x^*$ q-linearly with q-factor $\beta\gamma r_-$, and $x_n \in B_0$ for all $n$. Moreover $x^*$ is the unique root of $F$ in the ball of radius

$$\min\left(\bar r,\ \frac{1 + \sqrt{1 - 2\beta\eta\gamma}}{\beta\gamma}\right)$$

about $x_0$.
Proof. We will show that the map

$$\phi(x) = x - F'(x_0)^{-1} F(x)$$

is a contraction on the set

$$S = \{x \mid \|x - x_0\| \le r_-\}.$$

By the fundamental theorem of calculus and Assumption 5.5.1 we have for all $x \in S$

(5.19)
$$
\begin{aligned}
F'(x_0)^{-1} F(x) &= F'(x_0)^{-1} F(x_0) + F'(x_0)^{-1} \int_0^1 F'(x_0 + t(x - x_0))(x - x_0)\, dt \\
&= F'(x_0)^{-1} F(x_0) + x - x_0 \\
&\qquad + F'(x_0)^{-1} \int_0^1 \left(F'(x_0 + t(x - x_0)) - F'(x_0)\right)(x - x_0)\, dt
\end{aligned}
$$

and

$$\phi(x) - x_0 = -F'(x_0)^{-1} F(x_0) - F'(x_0)^{-1} \int_0^1 \left(F'(x_0 + t(x - x_0)) - F'(x_0)\right)(x - x_0)\, dt.$$

Hence,

$$\|\phi(x) - x_0\| \le \eta + \beta\gamma r_-^2/2 = r_-,$$

where the last equality holds because $r_-$ is the smaller root of $(\beta\gamma/2) r^2 - r + \eta = 0$,
and φ maps S into itself.
To show that $\phi$ is a contraction on $S$ and to prove the q-linear convergence we will show that

$$\|\phi(x) - \phi(y)\| \le \beta\gamma r_- \|x - y\|$$
for all $x, y \in S$. If we do this, the contraction mapping theorem (Theorem 4.2.1) will imply that there is a unique fixed point $x^*$ of $\phi$ in $S$ and that $x_n \to x^*$ q-linearly with q-factor $\beta\gamma r_-$. Clearly $x^*$ is a root of $F$ because it is a fixed point of $\phi$. We know that $\beta\gamma r_- < 1$ by our assumption that $\beta\eta\gamma < 1/2$.
Note that for all $x \in B_0$

$$\phi'(x) = I - F'(x_0)^{-1} F'(x) = F'(x_0)^{-1}\left(F'(x_0) - F'(x)\right)$$

and hence

$$\|\phi'(x)\| \le \beta\gamma \|x - x_0\| \le \beta\gamma r_- < 1.$$
Therefore, for all $x, y \in B_0$,

$$\|\phi(x) - \phi(y)\| = \left\|\int_0^1 \phi'(y + t(x - y))(x - y)\, dt\right\| \le \beta\gamma r_- \|x - y\|.$$
This proves the convergence result and the uniqueness of $x^*$ in $B_0$.
To prove the remainder of the uniqueness assertion let

$$r_+ = \frac{1 + \sqrt{1 - 2\beta\eta\gamma}}{\beta\gamma}$$

and note that $r_+ > r_-$ because $\beta\eta\gamma < 1/2$. If $\bar r = r_-$, then we are done by the uniqueness of $x^*$ in $B_0$. Hence we may assume that $\bar r > r_-$. We must show that if $x$ is such that

$$r_- < \|x - x_0\| < \min(\bar r, r_+)$$

then $F(x) \neq 0$. Letting $r = \|x - x_0\|$ we have by (5.19)

$$\|F'(x_0)^{-1} F(x)\| \ge r - \eta - \beta\gamma r^2/2 > 0$$

because $r_- < r < r_+$. This completes the proof.
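To see the q-linear rate in Theorem 5.5.1 concretely, one can run the map $\phi$ on a small example. The scalar equation below is our illustration, not one from the text; the loop is exactly the chord iteration $x_{n+1} = \phi(x_n)$:

```matlab
% Chord iteration for the illustrative scalar equation F(x) = x - cos(x).
F   = @(x) x - cos(x);
x0  = 0.5;
Fp0 = 1 + sin(x0);            % F'(x_0), computed once and reused
x   = x0;
for n = 1:20
    x = x - F(x)/Fp0;         % x_{n+1} = phi(x_n) = x_n - F'(x_0)^{-1} F(x_n)
end
% Successive error ratios |x_{n+1} - x*| / |x_n - x*| settle near a constant
% below 1, the q-linear behavior that the theorem bounds by beta*gamma*r_-.
```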
The Kantorovich theorem is more precise than Theorem 5.5.1 and has a very different proof. We state the theorem in the standard way [145], [106], [63], using the r-quadratic estimates from [106].
Theorem 5.5.2. Let Assumption 5.5.1 hold. Then there is a unique root $x^*$ of $F$ in $B_0$, the Newton iteration with $x_0$ as the initial iterate converges to $x^*$, and $x_n \in B_0$ for all $n$. $x^*$ is the unique root of $F$ in the ball of radius

$$\min\left(\bar r,\ \frac{1 + \sqrt{1 - 2\beta\eta\gamma}}{\beta\gamma}\right)$$

about $x_0$ and the errors satisfy

(5.20)
$$\|e_n\| \le \frac{(2\beta\eta\gamma)^{2^n}}{2^n \beta\gamma}.$$
If $\beta\eta\gamma < 1/2$ then (5.20) implies that the convergence is r-quadratic (see Exercise 5.7.16).

5.6. Examples for Newton's method
In the collection of MATLAB codes we provide an implementation nsol of Newton's method, the chord method, the Shamanskii method, or a combination of all of them based on the reduction in the nonlinear residual.
This latter option decides if the Jacobian should be recomputed based on the ratio of successive residuals. If

$$\|F(x_+)\| / \|F(x_c)\|$$

is below a given threshold, the Jacobian is not recomputed. If the ratio is too large, however, we compute and factor $F'(x_+)$ for use in the subsequent chord steps. In addition to the inputs of Algorithm sham, a threshold for the ratio of successive residuals is also input. The Jacobian is recomputed and factored if either the ratio of successive residuals exceeds the threshold $0 < \rho < 1$ or the number of iterations without an update exceeds $m$. The output of the MATLAB implementation of nsol is the solution, the history of the iteration stored as the vector of $l^\infty$ norms of the nonlinear residuals, and an error flag.
nsol uses diffjac to approximate the Jacobian by $\nabla_h F$ with $h = 10^{-7}$.
While our examples have dense Jacobians, the ideas in the implementation
of nsol also apply to situations in which the Jacobian is sparse and direct
methods for solving the equation for the Newton step are necessary. One would
still want to minimize the number of Jacobian computations (by the methods
of [47] or [42], say) and sparse matrix factorizations. In Exercise 5.7.25, which
is not easy, the reader is asked to create a sparse form of nsol.
Algorithm 5.6.1. nsol$(x, F, \tau, m, \rho)$
1. $r_c = r_0 = \|F(x)\|$

2. Do while $\|F(x)\| > \tau_r r_0 + \tau_a$
(a) Compute $\nabla_h F(x) \approx F'(x)$
(b) Factor $\nabla_h F(x) = LU$.
(c) $is = 0$, $\sigma = 0$
(d) Do While $is < m$ and $\sigma \le \rho$
i. $is = is + 1$
ii. Solve $LU s = -F(x)$
iii. $x = x + s$
iv. Evaluate $F(x)$
v. $r_+ = \|F(x)\|$, $\sigma = r_+/r_c$, $r_c = r_+$
vi. If $\|F(x)\| \le \tau_r r_0 + \tau_a$ exit.
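A condensed MATLAB transcription of this control flow may clarify the interplay between the outer (Newton) and inner (chord) loops. This is a sketch only: the distributed nsol also tracks the residual history and error flag, and our fdjac_sketch from earlier stands in for diffjac.

```matlab
function x = nsol_sketch(x, f, tau_r, tau_a, m, rho)
% NSOL_SKETCH  Newton/chord hybrid, following Algorithm 5.6.1 (control flow only).
fx = f(x);  r0 = norm(fx, inf);  rc = r0;
while norm(fx, inf) > tau_r*r0 + tau_a
    [L, U] = lu(fdjac_sketch(f, x, 1e-7));   % recompute and factor the Jacobian
    is = 0;  sigma = 0;
    while is < m && sigma <= rho
        is = is + 1;
        s = U \ (L \ (-fx));                 % solve LU s = -F(x)
        x = x + s;
        fx = f(x);                           % evaluate F at the new point
        rp = norm(fx, inf);
        sigma = rp/rc;  rc = rp;             % ratio of successive residuals
        if rp <= tau_r*r0 + tau_a, return; end
    end
end
```

The termination on $\sigma \ge 1$ discussed next is omitted here for brevity.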

In the MATLAB code the iteration is terminated with an error condition
if σ ≥ 1 at any point. This indicates that the local convergence analysis in this

section is not applicable. In Chapter 8 we discuss how to recover from this.
nsol is based on the assumption that a difference Jacobian will be used. In Exercise 5.7.21 you are asked to modify the code to accept an analytic Jacobian. Many times the Jacobian can be computed efficiently by reusing results that are already available from the function evaluation. The parameters $m$ and $\rho$ are assigned the default values of 1000 and .5. With these default parameters the decision to update the Jacobian is made based entirely on the reduction in the norm of the nonlinear residual. nsol allows the user to specify a maximum number of iterations; the default is 40.

The Chandrasekhar H-equation. The Chandrasekhar H-equation, [41], [30],

(5.21)
$$F(H)(\mu) = H(\mu) - \left(1 - \frac{c}{2} \int_0^1 \frac{\mu\, H(\nu)\, d\nu}{\mu + \nu}\right)^{-1} = 0,$$