with respect to d (if G : ℝ^n → ℝ^m, G'(d) will be the Jacobian matrix of G evaluated at the vector d), a Taylor expansion of G truncated at first order ensures that

    G(d + δd) − G(d) = G'(d) δd + o(‖δd‖)    for δd → 0,

where ‖·‖ is a suitable norm for δd and o(·) is the classical infinitesimal symbol denoting an infinitesimal term of higher order with respect to its argument. Neglecting the infinitesimal of higher order with respect to ‖δd‖, from (2.4) and (2.5) we respectively deduce that

    K(d) ≃ ‖G'(d)‖ ‖d‖ / ‖G(d)‖,    K_abs(d) ≃ ‖G'(d)‖,    (2.7)

the symbol ‖·‖ denoting the matrix norm associated with the vector norm (defined in (1.19)). The estimates in (2.7) are of great practical usefulness in the analysis of problems in the form (2.6), as shown in the forthcoming examples.
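As a concrete illustration (not from the text), the estimates in (2.7) can be evaluated numerically. The sketch below, in Python with NumPy, approximates the Jacobian G'(d) by forward differences and forms K(d) ≃ ‖G'(d)‖ ‖d‖ / ‖G(d)‖; the function name and the step size h are our own illustrative choices.

```python
import numpy as np

def relative_condition(G, d, h=1e-7):
    """First-order estimate K(d) ~ ||G'(d)|| ||d|| / ||G(d)||,
    with the Jacobian approximated by forward differences."""
    d = np.asarray(d, dtype=float)
    Gd = np.atleast_1d(G(d))
    J = np.empty((Gd.size, d.size))
    for j in range(d.size):
        dp = d.copy()
        dp[j] += h
        J[:, j] = (np.atleast_1d(G(dp)) - Gd) / h
    # matrix 2-norm, i.e. the norm induced by the Euclidean vector norm
    return np.linalg.norm(J, 2) * np.linalg.norm(d) / np.linalg.norm(Gd)

# For G(d) = d^2 the exact relative condition number is |G'(d)| |d| / |G(d)| = 2
K = relative_condition(lambda d: d**2, np.array([3.0]))
```

For the scalar map G(d) = d² the estimate reproduces the exact value 2 up to the finite-difference error.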

Example 2.2 (Algebraic equations of second degree) The solutions to the algebraic equation x² − 2px + 1 = 0, with p ≥ 1, are x± = p ± √(p² − 1). In this case F(x, p) = x² − 2px + 1, the datum d is the coefficient p, while x is the vector of components {x₊, x₋}.

36 2. Principles of Numerical Mathematics

As for the condition number, we notice that (2.6) holds by taking G : ℝ → ℝ², G(p) = {x₊, x₋}. Letting G±(p) = x±, it follows that G'±(p) = 1 ± p/√(p² − 1). Using (2.7) with ‖·‖ = ‖·‖₂ we get

    K(p) ≃ |p| / √(p² − 1),    p > 1.    (2.8)

From (2.8) it turns out that in the case of separated roots (say, if p ≥ √2) problem F(x, p) = 0 is well conditioned. The behavior dramatically changes in the case of multiple roots, that is when p = 1. First of all, one notices that the function G±(p) = p ± √(p² − 1) is no longer differentiable for p = 1, which makes (2.8) meaningless. On the other hand, equation (2.8) shows that, for p close to 1, the problem at hand is ill conditioned. However, the problem is not ill posed. Indeed, following Remark 2.1, it is possible to reformulate it in an equivalent manner as F(x, t) = x² − ((1 + t²)/t)x + 1 = 0, with t = p + √(p² − 1), whose roots x₊ = t and x₋ = 1/t coincide for t = 1. The change of parameter thus removes the singularity that is present in the former representation of the roots as functions of p. The two roots x₋ = x₋(t) and x₊ = x₊(t) are now indeed regular functions of t in the neighborhood of t = 1 and evaluating the condition number by (2.7) yields K(t) ≃ 1 for any value of t. The transformed problem is thus well conditioned. •
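A small numerical check of (2.8) can be sketched as follows (Python; function names are ours): the relative condition number of the root x₋(p), estimated by central differences, stays moderate for well separated roots and blows up as p approaches 1.

```python
import math

def x_minus(p):
    # smaller root of x^2 - 2px + 1 = 0 for p >= 1
    return p - math.sqrt(p * p - 1.0)

def rel_cond(f, p, h=1e-6):
    # relative condition number |f'(p)| |p| / |f(p)| via central differences
    der = (f(p + h) - f(p - h)) / (2.0 * h)
    return abs(der) * abs(p) / abs(f(p))

K_far = rel_cond(x_minus, 2.0)      # separated roots: K(2) = 2/sqrt(3) ~ 1.15
K_near = rel_cond(x_minus, 1.0001)  # nearly multiple roots: K ~ 70
```

Both values agree with the closed form |p|/√(p² − 1) of (2.8).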

Example 2.3 (Systems of linear equations) Consider the linear system Ax = b, where x and b are two vectors in ℝ^n, while A is the n × n matrix of the real coefficients of the system. Suppose that A is nonsingular; in such a case the unknown is the solution x, while the data d are the right-hand side b and the matrix A, that is, d = {b_i, a_ij, 1 ≤ i, j ≤ n}.

Suppose now that we perturb only the right-hand side b. We have d = b, x = G(b) = A⁻¹b, so that G'(b) = A⁻¹, and (2.7) yields

    K(d) ≃ ‖A⁻¹‖ ‖b‖ / ‖A⁻¹b‖ = ‖A⁻¹‖ ‖Ax‖ / ‖x‖ ≤ ‖A‖ ‖A⁻¹‖ = K(A),    (2.9)

where K(A) is the condition number of matrix A (see Section 3.1.1) and the use of a consistent matrix norm is understood. Therefore, if A is well conditioned, solving the linear system Ax = b is a stable problem with respect to perturbations of the right-hand side b. Stability with respect to perturbations on the entries of A will be analyzed in Section 3.10. •
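The bound (2.9) is easy to observe in practice. The following sketch (Python/NumPy, with an arbitrarily chosen small system) perturbs only the right-hand side and compares the error amplification with K(A):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)

K_A = np.linalg.cond(A, 2)          # K(A) = ||A|| ||A^{-1}|| in the 2-norm

db = np.array([1e-8, -1e-8])        # perturbation of the right-hand side only
x_pert = np.linalg.solve(A, b + db)

rel_err_x = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
rel_err_b = np.linalg.norm(db) / np.linalg.norm(b)
amplification = rel_err_x / rel_err_b   # bounded by K(A), as in (2.9)
```

For this matrix K(A) ≈ 15; the observed amplification depends on the direction of the perturbation but never exceeds K(A).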

Example 2.4 (Nonlinear equations) Let f : ℝ → ℝ be a function of class C¹ and consider the nonlinear equation

    F(x, d) = f(x) = φ(x) − d = 0,

where φ : ℝ → ℝ is a suitable function and d ∈ ℝ a datum (possibly equal to zero). The problem is well defined only if φ is invertible in a neighborhood of d: in such a case, indeed, x = φ⁻¹(d) and the resolvent is G = φ⁻¹. Since (φ⁻¹)'(d) = [φ'(x)]⁻¹, the first relation in (2.7) yields, for d ≠ 0,

    K(d) ≃ (|d| / |x|) |[φ'(x)]⁻¹|,    (2.10)

while if d = 0 or x = 0 we have

    K_abs(d) ≃ |[φ'(x)]⁻¹|.    (2.11)

The problem is thus ill posed if x is a multiple root of φ(x) − d; it is ill conditioned when φ'(x) is "small", well conditioned when φ'(x) is "large". We shall further address this subject in Section 6.1. •
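A sketch of (2.10) for an assumed sample function φ(x) = x³ (our own choice, invertible on (0, ∞)): with d = 8 one has x = 2 and φ'(x) = 12, hence K(d) ≃ (8/2)·(1/12) = 1/3; a finite-difference evaluation of the resolvent G = φ⁻¹ agrees.

```python
phi_inv = lambda d: d ** (1.0 / 3.0)    # resolvent of phi(x) = x^3 on (0, inf)

d = 8.0
x = phi_inv(d)                          # x = 2
dphi = 3.0 * x * x                      # phi'(x) = 12

K_formula = (abs(d) / abs(x)) / abs(dphi)   # (2.10): K(d) ~ 1/3

h = 1e-6                                # finite-difference check of |G'(d)| |d| / |x|
K_numeric = abs((phi_inv(d + h) - phi_inv(d - h)) / (2.0 * h)) * abs(d) / abs(x)
```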

In view of (2.7), the quantity ‖G'(d)‖ is an approximation of K_abs(d) and is sometimes called the first order absolute condition number. This latter represents the limit of the Lipschitz constant of G (see Section 11.1) as the perturbation on the data tends to zero.

Such a number does not always provide a sound estimate of the condition number K_abs(d). This happens, for instance, when G' vanishes at a point whilst G is non null in a neighborhood of the same point. For example, take x = G(d) = cos(d) − 1 for d ∈ (−π/2, π/2); we have G'(0) = 0, while K_abs(0) = 2/π.
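This last example can be checked directly; in the sketch below (Python, scaffolding ours) the centered difference confirms G'(0) = 0, while sampling the difference quotient near the endpoint of (−π/2, π/2) recovers K_abs(0) = 2/π.

```python
import math

G = lambda d: math.cos(d) - 1.0

h = 1e-6
G_prime_0 = (G(h) - G(-h)) / (2.0 * h)      # first-order estimate: 0

# true K_abs(0): sup of |G(dd) - G(0)| / |dd| over admissible perturbations;
# on (-pi/2, pi/2) the supremum is approached as dd -> pi/2, with value 2/pi
dd = math.pi / 2.0 - 1e-9
K_abs_0 = abs(G(dd) - G(0.0)) / abs(dd)
```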

2.2 Stability of Numerical Methods

We shall henceforth suppose the problem (2.1) to be well posed. A numerical method for the approximate solution of (2.1) will consist, in general, of a sequence of approximate problems

    F_n(x_n, d_n) = 0,    n ≥ 1,    (2.12)

depending on a certain parameter n (to be defined case by case). The understood expectation is that x_n → x as n → ∞, i.e. that the numerical solution converges to the exact solution. For that, it is necessary that d_n → d and that F_n "approximates" F as n → ∞. Precisely, if the datum d of problem (2.1) is admissible for F_n, we say that (2.12) is consistent if

    F_n(x, d) = F_n(x, d) − F(x, d) → 0    for n → ∞,    (2.13)

where x is the solution to problem (2.1) corresponding to the datum d. The meaning of this definition will be made precise in the next chapters for any single class of considered problems.

A method is said to be strongly consistent if F_n(x, d) = 0 for any value of n and not only for n → ∞.

In some cases (e.g., when iterative methods are used) problem (2.12) could take the following form

    F_n(x_n, x_{n-1}, . . . , x_{n-q}, d_n) = 0,    n ≥ q,    (2.14)

where x_0, x_1, . . . , x_{q-1} are given. In such a case, the property of strong consistency becomes F_n(x, x, . . . , x, d) = 0 for all n ≥ q.


Example 2.5 Let us consider the following iterative method (known as Newton's method and discussed in Section 6.2.2) for approximating a simple root α of a function f : ℝ → ℝ:

    given x_0,    x_n = x_{n-1} − f(x_{n-1}) / f'(x_{n-1}),    n ≥ 1.    (2.15)

The method (2.15) can be written in the form (2.14) by setting F_n(x_n, x_{n-1}, f) = x_n − x_{n-1} + f(x_{n-1})/f'(x_{n-1}) and is strongly consistent since F_n(α, α, f) = 0 for all n ≥ 1.

Consider now the following numerical method (known as the composite midpoint rule, discussed in Section 9.2) for approximating x = ∫_a^b f(t) dt:

    x_n = H Σ_{k=1}^{n} f((t_k + t_{k+1}) / 2),    n ≥ 1,

where H = (b − a)/n and t_k = a + (k − 1)H, k = 1, . . . , n + 1. This method is consistent; it is also strongly consistent provided that f is a piecewise linear polynomial.

More generally, all numerical methods obtained from the mathematical problem by truncation of limit operations (such as integrals, derivatives, series, . . . ) are not strongly consistent. •
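Both methods of Example 2.5 can be exercised in a few lines (Python sketch; function names ours). The residual F_n(α, α, f) vanishes at a root for Newton's method, and the composite midpoint rule is exact, hence strongly consistent, on a linear integrand.

```python
def newton_step(x_prev, f, df):
    # one step of (2.15)
    return x_prev - f(x_prev) / df(x_prev)

# strong consistency of (2.15): F_n(alpha, alpha, f) = 0 at a simple root alpha
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
alpha = 2.0 ** 0.5
residual = alpha - newton_step(alpha, f, df)

def midpoint_rule(f, a, b, n):
    # composite midpoint rule with n subintervals of width H
    H = (b - a) / n
    return H * sum(f(a + (k + 0.5) * H) for k in range(n))

# exact for linear integrands: int_0^1 (3t + 1) dt = 5/2 for every n
I_lin = midpoint_rule(lambda t: 3.0 * t + 1.0, 0.0, 1.0, 7)
```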

Recalling what has been previously stated about problem (2.1), in order for the numerical method to be well posed (or stable) we require that, for any fixed n, there exists a unique solution x_n corresponding to the datum d_n, that the computation of x_n as a function of d_n is unique and, furthermore, that x_n depends continuously on the data, i.e.

    ∀η > 0, ∃K_n(η, d_n) : ‖δd_n‖ < η ⇒ ‖δx_n‖ ≤ K_n(η, d_n) ‖δd_n‖.    (2.16)

As done in (2.4), we introduce for each problem in the sequence (2.12) the quantities

    K_n(d_n) = sup_{δd_n ∈ D_n} (‖δx_n‖/‖x_n‖) / (‖δd_n‖/‖d_n‖),    K_abs,n(d_n) = sup_{δd_n ∈ D_n} ‖δx_n‖ / ‖δd_n‖,    (2.17)

and then define

    K^num(d_n) = lim_{k→∞} sup_{n≥k} K_n(d_n),    K^num_abs(d_n) = lim_{k→∞} sup_{n≥k} K_abs,n(d_n).

We call K^num(d_n) the relative asymptotic condition number of the numerical method (2.12) and K^num_abs(d_n) the absolute asymptotic condition number, corresponding to the datum d_n.

The numerical method is said to be well conditioned if K^num is "small" for any admissible datum d_n, ill conditioned otherwise. As in (2.6), let us consider the case where, for each n, the functional relation (2.12) defines a mapping G_n between the sets of the numerical data and the solutions,

    x_n = G_n(d_n),    that is    F_n(G_n(d_n), d_n) = 0.    (2.18)

Assuming that G_n is differentiable, we can obtain from (2.17)

    K_n(d_n) ≃ ‖G_n'(d_n)‖ ‖d_n‖ / ‖G_n(d_n)‖,    K_abs,n(d_n) ≃ ‖G_n'(d_n)‖.    (2.19)

Example 2.6 (Sum and subtraction) The function f : ℝ² → ℝ, f(a, b) = a + b, is a linear mapping whose gradient is the vector ∇f(a, b) = (1, 1)^T. Using the vector norm ‖·‖₁ defined in (1.13) yields K(a, b) ≃ (|a| + |b|)/|a + b|, from which it follows that summing two numbers of the same sign is a well conditioned operation, being K(a, b) ≃ 1. On the other hand, subtracting two numbers that are almost equal is ill conditioned, since |a + b| ≪ |a| + |b|. This fact, already pointed out in Example 2.2, leads to the cancellation of significant digits whenever numbers can be represented using only a finite number of digits (as in floating-point arithmetic, see Section 2.5). •
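A two-line illustration of Example 2.6 (Python, with arbitrarily chosen numbers): the factor (|a| + |b|)/|a + b| is ≃ 1 for same-sign operands and huge for nearly opposite ones.

```python
a, b = 1.2345678, 8.7654321             # same sign: well conditioned, K ~ 1
K_sum = (abs(a) + abs(b)) / abs(a + b)

c, d = 1.00000001, -1.0                 # nearly equal magnitudes: cancellation
K_diff = (abs(c) + abs(d)) / abs(c + d) # of order 2e8 here
```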

Example 2.7 Consider again the problem of computing the roots of a polynomial of second degree analyzed in Example 2.2. When p > 1 (separated roots), such a problem is well conditioned. However, we generate an unstable algorithm if we evaluate the root x₋ by the formula x₋ = p − √(p² − 1). This formula is indeed subject to errors due to numerical cancellation of significant digits (see Section 2.4) that are introduced by the finite arithmetic of the computer. A possible remedy to this trouble consists of computing x₊ = p + √(p² − 1) at first, then x₋ = 1/x₊. Alternatively, one can solve F(x, p) = x² − 2px + 1 = 0 using Newton's method (proposed in Example 2.5):

    x_n = x_{n-1} − (x_{n-1}² − 2p x_{n-1} + 1) / (2x_{n-1} − 2p) = f_n(p),    n ≥ 1, x_0 given.

Applying (2.19) for p > 1 yields K_n(p) ≃ |p| / |x_n − p|. To compute K^num(p) we notice that, in the case when the algorithm converges, the solution x_n would converge to one of the roots x₊ or x₋; therefore, |x_n − p| → √(p² − 1) and thus K_n(p) → K^num(p) ≃ |p| / √(p² − 1), in perfect agreement with the value (2.8) of the condition number of the exact problem.

We can conclude that Newton's method for the search of simple roots of a second order algebraic equation is ill conditioned if |p| is very close to 1, while it is well conditioned in the other cases. •
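The instability discussed in Example 2.7 is visible even when the problem itself is well conditioned; a Python sketch (p chosen by us so the effect is total in double precision):

```python
import math

p = 1e9                      # roots are well separated, so the problem itself
s = math.sqrt(p * p - 1.0)   # is well conditioned; only the algorithm is bad

# naive formula: p*p - 1 rounds to p*p in double precision, so s == p
# and every significant digit of x_- is lost to cancellation
x_minus_naive = p - s

# stable variant: compute x_+ first, then x_- = 1/x_+
x_minus_stable = 1.0 / (p + s)
```

Here the naive subtraction returns 0 (a 100% relative error), while the reciprocal formula yields x₋ ≈ 5·10⁻¹⁰ to full precision.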

The final goal of numerical approximation is, of course, to build, through numerical problems of the type (2.12), solutions x_n that "get closer" to the solution of problem (2.1) as n gets larger. This concept is made precise in the next definition.

Definition 2.2 The numerical method (2.12) is convergent iff

    ∀ε > 0, ∃n_0(ε), ∃δ(n_0, ε) > 0 :
    ∀n > n_0(ε), ∀ ‖δd_n‖ < δ(n_0, ε) ⇒ ‖x(d) − x_n(d + δd_n)‖ ≤ ε,    (2.20)

where d is an admissible datum for the problem (2.1), x(d) is the corresponding solution and x_n(d + δd_n) is the solution of the numerical problem (2.12) with datum d + δd_n.

To verify the implication (2.20) it suffices to check that, under the same assumptions,

    ‖x(d + δd_n) − x_n(d + δd_n)‖ ≤ ε/2.    (2.21)

Indeed, thanks to (2.3) we have

    ‖x(d) − x_n(d + δd_n)‖ ≤ ‖x(d) − x(d + δd_n)‖ + ‖x(d + δd_n) − x_n(d + δd_n)‖
                            ≤ K(δ(n_0, ε), d) ‖δd_n‖ + ε/2.

Choosing δd_n such that K(δ(n_0, ε), d) ‖δd_n‖ < ε/2, one obtains (2.20).

Measures of the convergence of x_n to x are given by the absolute error or the relative error, respectively defined as

    E(x_n) = |x − x_n|,    E_rel(x_n) = |x − x_n| / |x|    (if x ≠ 0).    (2.22)

In the cases where x and x_n are matrix or vector quantities, in addition to the definitions in (2.22) (where the absolute values are substituted by suitable norms) it is sometimes useful to introduce the error by component, defined as

    E^c_rel(x_n) = max_{i,j} |(x − x_n)_{ij}| / |x_{ij}|.    (2.23)
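The error measures (2.22)–(2.23) translate directly into code; a NumPy sketch with made-up data:

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])          # "exact" solution
x_n = np.array([[1.001, 1.999], [3.0, 4.002]])  # computed approximation

abs_err = np.linalg.norm(x - x_n)                # E(x_n), with a matrix norm
rel_err = abs_err / np.linalg.norm(x)            # E_rel(x_n)
rel_err_c = np.max(np.abs(x - x_n) / np.abs(x))  # error by component, (2.23)
```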

2.2.1 Relations between Stability and Convergence

The concepts of stability and convergence are strongly connected.

First of all, if problem (2.1) is well posed, a necessary condition in order for the numerical problem (2.12) to be convergent is that it is stable.

Let us thus assume that the method is convergent, and prove that it is stable by finding a bound for ‖δx_n‖. We have

    ‖δx_n‖ = ‖x_n(d + δd_n) − x_n(d)‖ ≤ ‖x_n(d) − x(d)‖
             + ‖x(d) − x(d + δd_n)‖ + ‖x(d + δd_n) − x_n(d + δd_n)‖    (2.24)
             ≤ K(δ(n_0, ε), d) ‖δd_n‖ + ε,

having used (2.3) and (2.21) twice. From (2.24) we can conclude that, for n sufficiently large, ‖δx_n‖/‖δd_n‖ can be bounded by a constant of the order of K(δ(n_0, ε), d), so that the method is stable. Thus, we are interested in stable numerical methods since only these can be convergent.

The stability of a numerical method becomes a sufficient condition for the numerical problem (2.12) to converge if this latter is also consistent with problem (2.1). Indeed, under these assumptions we have

    ‖x(d + δd_n) − x_n(d + δd_n)‖ ≤ ‖x(d + δd_n) − x(d)‖
            + ‖x(d) − x_n(d)‖ + ‖x_n(d) − x_n(d + δd_n)‖.

Thanks to (2.3), the first term at the right-hand side can be bounded by ‖δd_n‖ (up to a multiplicative constant independent of δd_n). A similar bound holds for the third term, due to the stability property (2.16). Finally, concerning the remaining term, if F_n is differentiable with respect to the variable x, an expansion in a Taylor series gives

    F_n(x(d), d) − F_n(x_n(d), d) = (∂F_n/∂x)|_(x̄,d) (x(d) − x_n(d)),

for a suitable x̄ "between" x(d) and x_n(d). Assuming also that ∂F_n/∂x is invertible, we get

    x(d) − x_n(d) = [(∂F_n/∂x)|_(x̄,d)]⁻¹ [F_n(x(d), d) − F_n(x_n(d), d)].    (2.25)

On the other hand, replacing F_n(x_n(d), d) with F(x(d), d) (since both terms are equal to zero) and passing to the norms, we find

    ‖x(d) − x_n(d)‖ ≤ ‖[(∂F_n/∂x)|_(x̄,d)]⁻¹‖ ‖F_n(x(d), d) − F(x(d), d)‖.

Thanks to (2.13) we can thus conclude that ‖x(d) − x_n(d)‖ → 0 for n → ∞.

The result that has just been proved, although stated in qualitative terms, is a milestone in numerical analysis, known as the equivalence theorem (or Lax-Richtmyer theorem): "for a consistent numerical method, stability is equivalent to convergence". A rigorous proof of this theorem is available in [Dah56] for the case of linear Cauchy problems, or in [Lax65] and in [RM67] for linear well-posed initial value problems.

2.3 A priori and a posteriori Analysis

The stability analysis of a numerical method can be carried out following different strategies:

1. forward analysis, which provides a bound to the variations ‖δx_n‖ on the solution due both to perturbations in the data and to errors that are intrinsic to the numerical method;

2. backward analysis, which aims at estimating the perturbations that should be "impressed" to the data of a given problem in order to obtain the results actually computed under the assumption of working in exact arithmetic. Equivalently, given a certain computed solution x_n, backward analysis looks for the perturbations δd_n on the data such that F_n(x_n, d_n + δd_n) = 0. Notice that, when performing such an estimate, no account at all is taken of the way x_n has been obtained (that is, of which method has been employed to generate it).

Forward and backward analyses are two different instances of the so-called a priori analysis. This latter can be applied to investigate not only the stability of a numerical method, but also its convergence. In this case it is referred to as a priori error analysis, which can again be performed using either a forward or a backward technique.

A priori error analysis is distinct from the so-called a posteriori error analysis, which aims at producing an estimate of the error on the grounds of quantities that are actually computed by a specific numerical method. Typically, denoting by x_n the computed numerical solution, approximation to the solution x of problem (2.1), the a posteriori error analysis aims at evaluating the error x − x_n as a function of the residual r_n = F(x_n, d) by means of constants that are called stability factors (see [EEHJ96]).

Example 2.8 For the sake of illustration, consider the problem of finding the zeros α_1, . . . , α_n of a polynomial p_n(x) = Σ_{k=0}^{n} a_k x^k of degree n.

Denoting by p̃_n(x) = Σ_{k=0}^{n} ã_k x^k a perturbed polynomial whose zeros are α̃_i, forward analysis aims at estimating the error between two corresponding zeros α_i and α̃_i, in terms of the variations on the coefficients a_k − ã_k, k = 0, 1, . . . , n.

On the other hand, let {α̂_i} be the approximate zeros of p_n (computed somehow). Backward analysis provides an estimate of the perturbations δa_k which should be impressed to the coefficients so that Σ_{k=0}^{n} (a_k + δa_k) α̂_i^k = 0, for a fixed α̂_i. The goal of a posteriori error analysis would rather be to provide an estimate of the error α_i − α̂_i as a function of the residual value p_n(α̂_i). This analysis will be carried out in Section 6.1. •
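The viewpoints of Example 2.8 can be mimicked numerically for a toy polynomial (Python/NumPy sketch; the polynomial and the perturbation size are our own choices): the a posteriori quantity is the residual at a computed zero, while forward analysis observes how much the zeros move under a coefficient perturbation.

```python
import numpy as np

coeffs = [1.0, -3.0, 2.0]          # p(x) = x^2 - 3x + 2, zeros 1 and 2
zeros = np.sort(np.roots(coeffs))

# a posteriori quantity: the residual p(alpha_hat) at a computed zero
alpha_hat = zeros[1]
residual = np.polyval(coeffs, alpha_hat)

# forward analysis: perturb one coefficient and watch the zeros move
coeffs_pert = [1.0, -3.0 + 1e-8, 2.0]
shift = np.max(np.abs(np.sort(np.roots(coeffs_pert)) - zeros))
```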

Example 2.9 Consider the linear system Ax = b, where A ∈ ℝ^{n×n} is a nonsingular matrix.