K_2(A) = ‖A‖_2 ‖A⁻¹‖_2 = ρ(A)ρ(A⁻¹) = λ_max/λ_min,   (3.5)

where λ_max and λ_min are the maximum and minimum eigenvalues of A.
To check (3.5), notice that

‖A‖_2² = ρ(AᵀA) = ρ(A²) = λ_max², so that ‖A‖_2 = λ_max.

Moreover, since λ(A⁻¹) = 1/λ(A), one gets ‖A⁻¹‖_2 = 1/λ_min, from which
(3.5) follows. For that reason, K_2(A) is called the spectral condition number.

Remark 3.1 Define the relative distance of A ∈ C^{n×n} from the set of
singular matrices with respect to the p-norm by

dist_p(A) = min { ‖δA‖_p/‖A‖_p : A + δA is singular }.

It can then be shown that ([Kah66], [Gas83])

dist_p(A) = 1/K_p(A).   (3.6)

60 3. Direct Methods for the Solution of Linear Systems

Equation (3.6) suggests that a matrix A with a high condition number
can behave like a singular matrix of the form A + δA. In other words, null
perturbations in the right-hand side do not necessarily yield non vanishing
changes in the solution since, if A + δA is singular, the homogeneous system
(A + δA)z = 0 no longer admits only the null solution. From (3.6) it
also follows that A + δA is nonsingular if

‖δA‖_p ‖A⁻¹‖_p < 1.   (3.7)

Relation (3.6) seems to suggest that a natural candidate for measuring
the ill-conditioning of a matrix is its determinant, since from (3.3) one is
prompted to conclude that small determinants mean nearly singular matrices.
However, this conclusion is wrong, as there exist examples of matrices
with small (respectively, high) determinants and small (respectively, high)
condition numbers (see Exercise 2).
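To illustrate this point, here is a minimal Python/NumPy sketch (not part of the original text): a matrix can have an arbitrarily small determinant and a perfect condition number, and vice versa.

```python
import numpy as np

# det(0.1*I) = 10^(-10) for n = 10, yet K2 = 1: perfectly conditioned.
n = 10
A = 0.1 * np.eye(n)

# det(B) = 1, yet K2(B) is about 10^16: nearly singular in practice.
B = np.array([[1.0, 1.0e8],
              [0.0, 1.0]])

print(np.linalg.det(A), np.linalg.cond(A, 2))
print(np.linalg.det(B), np.linalg.cond(B, 2))
```

The determinant therefore carries no reliable information about conditioning.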

3.1.2 Forward a priori Analysis
In this section we introduce a measure of the sensitivity of the system to
changes in the data. These changes will be interpreted in Section 3.10 as
being the effects of rounding errors induced by the numerical method used
to solve the system. For a more comprehensive analysis of the subject we
refer to [Dat95], [GL89], [Ste73] and [Var62].
Due to rounding errors, a numerical method for solving (3.2) does not
provide the exact solution but only an approximate one, which satisfies a
perturbed system. In other words, a numerical method yields an (exact)
solution x + δx of the perturbed system

(A + δA)(x + δx) = b + δb. (3.8)

The next result provides an estimate of δx in terms of δA and δb.

Theorem 3.1 Let A ∈ R^{n×n} be a nonsingular matrix and δA ∈ R^{n×n} be
such that (3.7) is satisfied for a matrix norm ‖·‖. Then, if x ∈ R^n is the
solution of Ax = b with b ∈ R^n (b ≠ 0) and δx ∈ R^n satisfies (3.8) for
δb ∈ R^n,

‖δx‖/‖x‖ ≤ K(A)/(1 − K(A)‖δA‖/‖A‖) (‖δb‖/‖b‖ + ‖δA‖/‖A‖).   (3.9)
Proof. From (3.7) it follows that the matrix A⁻¹δA has norm less than 1. Then,
due to Theorem 1.5, I + A⁻¹δA is invertible and from (1.26) it follows that

‖(I + A⁻¹δA)⁻¹‖ ≤ 1/(1 − ‖A⁻¹δA‖) ≤ 1/(1 − ‖A⁻¹‖ ‖δA‖).   (3.10)

On the other hand, solving for δx in (3.8) and recalling that Ax = b, one gets

δx = (I + A⁻¹δA)⁻¹ A⁻¹ (δb − δA x),
3.1 Stability Analysis of Linear Systems 61

from which, passing to the norms and using (3.10), it follows that

‖δx‖ ≤ ‖A⁻¹‖/(1 − ‖A⁻¹‖ ‖δA‖) (‖δb‖ + ‖δA‖ ‖x‖).

Finally, dividing both sides by ‖x‖ (which is nonzero since b ≠ 0 and A is
nonsingular) and noticing that ‖x‖ ≥ ‖b‖/‖A‖, the result follows. □

Well-conditioning alone is not enough to yield an accurate solution of the
linear system. It is indeed crucial, as pointed out in Chapter 2, to resort to
stable algorithms. Conversely, ill-conditioning does not necessarily exclude
that for particular choices of the right side b the overall conditioning of the
system is good (see Exercise 4).
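As a sanity check of the bound (3.9), here is a hypothetical Python/NumPy sketch (not part of the original text) that perturbs both A and b of a small system and compares the actual relative error, in the 2-norm, with the right-hand side of (3.9):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # a moderately conditioned test matrix
b = rng.standard_normal(n)
x = np.linalg.solve(A, b)

dA = 1e-8 * rng.standard_normal((n, n))           # small perturbation of A
db = 1e-8 * rng.standard_normal(n)                # small perturbation of b
dx = np.linalg.solve(A + dA, b + db) - x

K = np.linalg.cond(A, 2)
rA = np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)  # ||dA||/||A||
rb = np.linalg.norm(db) / np.linalg.norm(b)        # ||db||/||b||
assert K * rA < 1                                  # hypothesis (3.7)

bound = K / (1 - K * rA) * (rb + rA)               # right-hand side of (3.9)
rel_err = np.linalg.norm(dx) / np.linalg.norm(x)
assert rel_err <= bound
```

As expected, the actual relative error stays (usually well) below the a priori bound.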
A particular case of Theorem 3.1 is the following.

Theorem 3.2 Assume that the conditions of Theorem 3.1 hold and let
δA = 0. Then

(1/K(A)) ‖δb‖/‖b‖ ≤ ‖δx‖/‖x‖ ≤ K(A) ‖δb‖/‖b‖.   (3.11)

Proof. We will prove only the first inequality since the second one directly
follows from (3.9). The relation δx = A⁻¹δb yields ‖δb‖ ≤ ‖A‖ ‖δx‖. Multiplying
both sides by ‖x‖ and recalling that ‖x‖ ≤ ‖A⁻¹‖ ‖b‖, it follows that ‖x‖ ‖δb‖ ≤
K(A) ‖b‖ ‖δx‖, which is the desired inequality. □
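When only the right-hand side is perturbed, both inequalities of (3.11) can be verified directly; a small Python/NumPy sketch (not from the original text):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])        # symmetric positive definite
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)

db = np.array([1.0e-6, -2.0e-6])  # perturb b only (dA = 0)
dx = np.linalg.solve(A, b + db) - x

K = np.linalg.cond(A, 2)
rb = np.linalg.norm(db) / np.linalg.norm(b)
rx = np.linalg.norm(dx) / np.linalg.norm(x)

# (1/K)||db||/||b||  <=  ||dx||/||x||  <=  K ||db||/||b||
assert rb / K <= rx <= K * rb
```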

In order to employ the inequalities (3.10) and (3.11) in the analysis of
propagation of rounding errors in the case of direct methods, ‖δA‖ and
‖δb‖ should be bounded in terms of the dimension of the system and of
the characteristics of the floating-point arithmetic that is being used.
It is indeed reasonable to expect that the perturbations induced by a
method for solving a linear system are such that ‖δA‖ ≤ γ‖A‖ and ‖δb‖ ≤
γ‖b‖, γ being a positive number that depends on the roundoff unit u (for
example, we shall assume henceforth that γ = β^{1−t}, where β is the base
and t is the number of digits of the mantissa of the floating-point system
F). In such a case (3.9) can be completed by the following theorem.

Theorem 3.3 Assume that ‖δA‖ ≤ γ‖A‖ and ‖δb‖ ≤ γ‖b‖, with γ ∈ R⁺ and
δA ∈ R^{n×n}, δb ∈ R^n. Then, if γK(A) < 1, the following inequalities hold

‖x + δx‖/‖x‖ ≤ (1 + γK(A))/(1 − γK(A)),   (3.12)

‖δx‖/‖x‖ ≤ 2γ K(A)/(1 − γK(A)).   (3.13)

Proof. From (3.8) it follows that (I + A⁻¹δA)(x + δx) = x + A⁻¹δb. Moreover,
since γK(A) < 1 and ‖δA‖ ≤ γ‖A‖, it turns out that I + A⁻¹δA is nonsingular.
Taking the inverse of such a matrix and passing to the norms, we get ‖x + δx‖ ≤
‖(I + A⁻¹δA)⁻¹‖ (‖x‖ + γ‖A⁻¹‖ ‖b‖). From Theorem 1.5 it then follows that

‖x + δx‖ ≤ (‖x‖ + γ‖A⁻¹‖ ‖b‖)/(1 − ‖A⁻¹δA‖),

which implies (3.12), since ‖A⁻¹δA‖ ≤ γK(A) and ‖b‖ ≤ ‖A‖ ‖x‖.
Let us prove (3.13). Subtracting (3.2) from (3.8) it follows that

Aδx = −δA(x + δx) + δb.

Inverting A and passing to the norms, the following inequality is obtained

‖δx‖ ≤ ‖A⁻¹‖ ‖δA‖ ‖x + δx‖ + ‖A⁻¹‖ ‖δb‖ ≤ γK(A) ‖x + δx‖ + γ ‖A⁻¹‖ ‖b‖.

Dividing both sides by ‖x‖ and using the triangle inequality ‖x + δx‖ ≤ ‖δx‖ +
‖x‖, we finally get (3.13). □
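The γ-dependent estimates (3.12)-(3.13) can also be checked numerically; the following Python/NumPy sketch (not part of the original text) uses perturbations satisfying ‖δA‖ ≤ γ‖A‖ and ‖δb‖ ≤ γ‖b‖ by construction:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, -1.0])
x = np.linalg.solve(A, b)

gamma = 1.0e-10
K = np.linalg.cond(A, 2)
assert gamma * K < 1                       # hypothesis of Theorem 3.3

# ||dA||_2 <= gamma*||A||_2 and ||db||_2 <= gamma*||b||_2, since the
# fixed matrix/vector below have 2-norm smaller than 1.
dA = gamma * np.linalg.norm(A, 2) * np.array([[0.3, -0.1],
                                              [0.2,  0.4]])
db = gamma * np.linalg.norm(b) * np.array([0.5, 0.5])
dx = np.linalg.solve(A + dA, b + db) - x

lhs12 = np.linalg.norm(x + dx) / np.linalg.norm(x)
lhs13 = np.linalg.norm(dx) / np.linalg.norm(x)

assert lhs12 <= (1 + gamma * K) / (1 - gamma * K)   # (3.12)
assert lhs13 <= 2 * gamma * K / (1 - gamma * K)     # (3.13)
```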

Remarkable instances of perturbations δA and δb are those for which
|δA| ≤ γ|A| and |δb| ≤ γ|b| with γ ≥ 0. Hereafter, the absolute value
notation B = |A| denotes the n × n matrix of entries b_{ij} = |a_{ij}|,
i, j = 1, …, n, while the inequality C ≤ D, with C, D ∈ R^{m×n}, has the
following meaning

c_{ij} ≤ d_{ij} for i = 1, …, m, j = 1, …, n.

If the ‖·‖_∞ norm is considered, from (3.14) it follows that

‖δx‖_∞/‖x‖_∞ ≤ γ ‖ |A⁻¹| |A| |x| + |A⁻¹| |b| ‖_∞ / [(1 − γ ‖ |A⁻¹| |A| ‖_∞) ‖x‖_∞]
            ≤ 2γ ‖ |A⁻¹| |A| ‖_∞ / (1 − γ ‖ |A⁻¹| |A| ‖_∞).   (3.15)

Estimate (3.15) is generally too pessimistic; however, the following compo-
nentwise error estimates of δx can be derived from (3.15)

|δx_i| ≤ γ |r_(i)^T| |A| |x + δx|,  i = 1, …, n,  if δb = 0,

|δx_i|/|x_i| ≤ γ |r_(i)^T| |b| / |r_(i)^T b|,  i = 1, …, n,  if δA = 0,   (3.16)

r_(i)^T being the row vector e_i^T A⁻¹. Estimates (3.16) are more stringent than
(3.15), as can be seen in Example 3.1. The first inequality in (3.16) can be
used when the perturbed solution x + δx is known, x + δx being henceforth
the solution computed by a numerical method.

In the case where |A⁻¹| |b| = |x|, the amplification factor of γ in the second
estimate of (3.16) is equal to 1. For such systems, the components of the
solution are insensitive to perturbations of the right side. A slightly worse
situation occurs when A is a triangular M-matrix and b has positive entries:
in that case the amplification factor is bounded by 2n − 1, since

|r_(i)^T| |A| |x| ≤ (2n − 1) |x_i|.

For further details on the subject we refer to [Ske79], [CI95] and [Hig89].
Results linking componentwise estimates to normwise estimates through
the so-called hypernorms can be found in [ADR92].

Example 3.1 Consider the linear system Ax = b with

A = [ α  1/α ; 0  α ],   b = (α² + 1/α, α)^T,

which has solution x^T = (α, 1), where 0 < α < 1. Let us compare the results
obtained using (3.15) and (3.16). From

|A⁻¹| |A| |x| = |A⁻¹| |b| = (α + 2/α², 1)^T,   (3.17)

it follows that the supremum of (3.17) is unbounded as α → 0, exactly as
happens in the case of ‖A‖_∞. On the other hand, the amplification factor of
the error in (3.16) is bounded. Indeed, the component of the maximum absolute
value, x₂, of the solution satisfies |r_(2)^T| |A| |x|/|x₂| = 1. •
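Assuming the matrix of Example 3.1 as written above, a quick numerical check (Python/NumPy sketch, not part of the original text) reproduces both behaviors for a small α:

```python
import numpy as np

alpha = 1.0e-2
A = np.array([[alpha, 1.0 / alpha],
              [0.0,   alpha]])
x = np.array([alpha, 1.0])
b = A @ x

Ainv = np.linalg.inv(A)
# Normwise quantity (3.17): first component ~ alpha + 2/alpha^2, unbounded.
v = np.abs(Ainv) @ np.abs(A) @ np.abs(x)
# Componentwise amplification for the dominant component x2: stays 1.
amp2 = (np.abs(Ainv[1]) @ (np.abs(A) @ np.abs(x))) / abs(x[1])

print(v)      # first entry of order 10^4 already for alpha = 10^-2
print(amp2)
```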

3.1.3 Backward a priori Analysis
The numerical methods that we have considered thus far do not require the
explicit computation of the inverse of A to solve Ax = b. However, we can
always assume that they yield an approximate solution of the form x = Cb,
where the matrix C, due to rounding errors, is an approximation of A⁻¹.
In practice, C is very seldom constructed; in case this should happen, the
following result yields an estimate of the error that is made substituting C
for A⁻¹ (see [IK66], Chapter 2, Theorem 7).

Property 3.1 Let R = AC − I; if ‖R‖ < 1, then A and C are nonsingular and

‖A⁻¹‖ ≤ ‖C‖/(1 − ‖R‖),   ‖C − A⁻¹‖ ≤ ‖C‖ ‖R‖/(1 − ‖R‖).   (3.18)

In the frame of backward a priori analysis we can interpret C as being the
inverse of A + δA (for a suitable unknown δA). We are thus assuming that
C(A + δA) = I. This yields

δA = C⁻¹ − A = −(AC − I)C⁻¹ = −RC⁻¹

and, as a consequence, if ‖R‖ < 1 it turns out that

‖δA‖ ≤ ‖A‖ ‖R‖/(1 − ‖R‖),   (3.19)

having used the first inequality in (3.18), where A is assumed to be an
approximation of the inverse of C (notice that the roles of C and A can be
interchanged).
3.1.4 A posteriori Analysis
Approximating the inverse of A by a matrix C yields, in turn, an
approximation of the solution of the linear system (3.2). Let us denote by
y a known approximate solution. The aim of the a posteriori analysis is to
relate the (unknown) error e = y − x to quantities that can be computed
using y and C.
The starting point of the analysis relies on the fact that the residual
vector r = b − Ay is in general nonzero, since y is just an approximation
to the unknown exact solution. The residual can be related to the error
through Property 3.1 as follows. We have e = A⁻¹(Ay − b) = −A⁻¹r and
thus, if ‖R‖ < 1, then

‖e‖ ≤ ‖r‖ ‖C‖/(1 − ‖R‖).   (3.20)
Notice that the estimate does not necessarily require y to coincide with
the solution x = Cb of the backward a priori analysis. One could therefore
think of computing C only for the purpose of using the estimate (3.20) (for
instance, in the case where (3.2) is solved through the Gauss elimination
method, one can compute C a posteriori using the LU factorization of A,
see Sections 3.3 and 3.3.1).
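In Python (NumPy; an illustrative sketch, not from the original text), the a posteriori bound (3.20) is computed entirely from quantities available in practice (y, C, r), while the true error is used only to check it:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 4.0]])
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)            # exact solution (unknown in practice)

y = x + np.array([1.0e-5, -1.0e-5])  # a given approximate solution
C = np.round(np.linalg.inv(A), 2)    # approximate inverse, as in Section 3.1.3

r = b - A @ y                        # computable residual
R = A @ C - np.eye(2)
nR = np.linalg.norm(R, 2)
assert nR < 1

err = np.linalg.norm(y - x)          # ||e||, unknown in practice
bound = np.linalg.norm(C, 2) * np.linalg.norm(r) / (1 - nR)   # (3.20)
assert err <= bound
```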
We conclude by noticing that if δb is interpreted in (3.11) as being the
residual of the computed solution y = x + δx, it also follows that

‖e‖/‖x‖ ≤ K(A) ‖r‖/‖b‖.   (3.21)
The estimate (3.21) is not used in practice since the computed residual
is affected by rounding errors. A more significant estimate (in the ‖·‖_∞
norm) is obtained letting r̂ = fl(b − Ay) and assuming that r̂ = r + δr
with |δr| ≤ γ_{n+1}(|A| |y| + |b|), where γ_{n+1} = (n + 1)u/(1 − (n + 1)u) > 0,
from which we have

‖e‖_∞/‖y‖_∞ ≤ ‖ |A⁻¹| (|r̂| + γ_{n+1}(|A| |y| + |b|)) ‖_∞ / ‖y‖_∞.

Formulae like this last one are implemented in the linear algebra library
LAPACK (see [ABB+ 92]).

3.2 Solution of Triangular Systems
Consider the nonsingular 3 × 3 lower triangular system

[ l11   0    0  ] [x1]   [b1]
[ l21  l22   0  ] [x2] = [b2]
[ l31  l32  l33 ] [x3]   [b3].

Since the matrix is nonsingular, its diagonal entries lii, i = 1, 2, 3, are
non vanishing, hence we can solve sequentially for the unknown values
xi, i = 1, 2, 3, as follows

x1 = b1/l11,
x2 = (b2 − l21 x1)/l22,
x3 = (b3 − l31 x1 − l32 x2)/l33.

This algorithm can be extended to n × n systems and is called forward
substitution. In the case of a system Lx = b, with L being a nonsingular
lower triangular matrix of order n (n ≥ 2), the method takes the form

x1 = b1/l11,

xi = (1/lii) (bi − Σ_{j=1}^{i−1} lij xj),  i = 2, …, n.   (3.22)

The number of multiplications and divisions needed to execute the algorithm is
equal to n(n+1)/2, while the number of sums and subtractions is n(n−1)/2.
The global operation count for (3.22) is thus n² flops.
Similar conclusions can be drawn for a linear system Ux=b, where U
is a nonsingular upper triangular matrix of order n (n ≥ 2). In this case
the algorithm is called backward substitution and in the general case can
be written as
xn = bn/unn,

xi = (1/uii) (bi − Σ_{j=i+1}^{n} uij xj),  i = n − 1, …, 1.   (3.23)

Its computational cost is still n² flops.
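For reference alongside the MATLAB programs given below, here is a Python/NumPy transcription of the substitution formulas (a sketch, not part of the original text):

```python
import numpy as np

def forward_substitution(L, b):
    # Row-oriented implementation of (3.22): x_i uses x_1, ..., x_{i-1}.
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

def backward_substitution(U, b):
    # Row-oriented implementation of (3.23): x_i uses x_{i+1}, ..., x_n.
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [1.0, 1.0, 4.0]])
b = np.array([2.0, 5.0, 10.0])
print(forward_substitution(L, b))   # solves Lx = b
```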

3.2.1 Implementation of Substitution Methods
Each step i of algorithm (3.22) requires the scalar product
between the row vector L(i, 1:i−1) (this notation denotes the vector
extracted from matrix L taking the elements of the i-th row from the first
to the (i−1)-th column) and the column vector x(1:i−1). The access to
matrix L is thus by row; for that reason, the forward substitution algorithm,
when implemented in the form above, is called row-oriented.
Its coding is reported in Program 1 (the function mat_square that is
called by forward_row merely checks that L is a square matrix).

Program 1 - forward_row : Forward substitution: row-oriented version

function [x]=forward_row(L,b)
[n]=mat_square(L); x(1) = b(1)/L(1,1);
for i = 2:n, x(i) = (b(i)-L(i,1:i-1)*(x(1:i-1))')/L(i,i); end

To obtain a column-oriented version of the same algorithm, we take ad-
vantage of the fact that the i-th component of the vector x, once computed,
can be conveniently eliminated from the system.
An implementation of such a procedure, in which the solution x is over-
written on the right-hand side vector b, is reported in Program 2.
Program 2 - forward_col : Forward substitution: column-oriented version

function [b]=forward_col(L,b)
[n]=mat_square(L);
for j=1:n-1,
b(j)= b(j)/L(j,j); b(j+1:n)=b(j+1:n)-b(j)*L(j+1:n,j);
end; b(n) = b(n)/L(n,n);

Implementing the same algorithm in a row-oriented rather than a column-
oriented fashion may dramatically change its performance (but, of course,
not the solution). The choice of the form of implementation must therefore
be subordinated to the specific hardware that is used.
Similar considerations hold for the backward substitution method, presented
in (3.23) in its row-oriented version.
In Program 3 only the column-oriented version of the algorithm is coded.
As usual, the vector x is overwritten on b.
Program 3 - backward_col : Backward substitution: column-oriented version

function [b]=backward_col(U,b)
[n]=mat_square(U);
