118 3. Direct Methods for the Solution of Linear Systems

crease of nz with m can be clearly appreciated, at the expense of a dramatic fill-in that grows with m if no reordering is performed (dashed line).


FIGURE 3.12. Number of nonzero entries in the L-factor of A as a function of

the size m of the matrix, with (solid line) and without (dashed line) reordering

3.14.2 Regularization of a Triangular Grid

The numerical solution of a problem in a two-dimensional domain D of polygonal form, for instance by finite element or finite difference methods, very often requires that D be decomposed into smaller subdomains, usually of triangular form (see for instance Section 9.9.2).

Suppose that D = ∪_{T∈Th} T, where Th is the considered triangulation (also called computational grid) and h is a positive parameter which characterizes

the triangulation. Typically, h denotes the maximum length of the triangle

edges. We shall also assume that two triangles of the grid, T1 and T2 , have

either null intersection or share a vertex or a side.

The geometrical properties of the computational grid can heavily affect the quality of the approximate numerical solution. It is therefore convenient to devise a sufficiently regular triangulation, such that, for any T ∈ Th, the ratio between the maximum length of the sides of T (the diameter of T) and the diameter of the circle inscribed within T (the sphericity of T) is bounded by a constant independent of T. This latter requirement can be satisfied by employing a regularization procedure applied to an existing grid. We refer to [Ver96] for further details on this subject.
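The shape requirement just described is easy to check numerically. Below is a minimal sketch (plain Python; the function name `shape_ratio` is our own, not from the text) that computes the ratio between a triangle's diameter and the diameter of its inscribed circle: the equilateral triangle attains the minimum value √3, while a thin triangle scores much worse.

```python
import math

def shape_ratio(p1, p2, p3):
    """Ratio between the longest edge of the triangle (its diameter)
    and the diameter of its inscribed circle."""
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    s = 0.5 * (a + b + c)                               # semiperimeter
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    inradius = area / s
    return max(a, b, c) / (2.0 * inradius)

# Equilateral triangle of side 1: the ratio attains its minimum, sqrt(3)
eq = shape_ratio((0, 0), (1, 0), (0.5, math.sqrt(3) / 2))
# A thin triangle gives a much larger ratio (a badly shaped element)
thin = shape_ratio((0, 0), (1, 0), (0.5, 0.05))
```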

Let us assume that Th contains NT triangles and N vertices, of which Nb, lying on the boundary ∂D of D, are kept fixed and have coordinates xi^(∂D) = (xi^(∂D), yi^(∂D)). We denote by Nh the set of grid nodes, excluding the boundary nodes, and for each node xi = (xi, yi)^T ∈ Nh, let Pi and Zi respectively be the set of triangles T ∈ Th sharing xi (called the patch of



FIGURE 3.13. An example of a decomposition into triangles of a polygonal domain D (left), and the effect of the barycentric regularization on a patch of triangles (right). The newly generated grid is plotted with a dashed line

xi ) and the set of nodes of Pi except node xi itself (see Figure 3.13, right).

We let ni = dim(Zi ).

The regularization procedure consists of moving the generic node xi to

a new position which is determined by the center of gravity of the polygon

generated by joining the nodes of Zi , and for that reason it is called a

barycentric regularization. The effect of such a procedure is to force all the

triangles that belong to the interior of the domain to assume a shape that

is as regular as possible (in the limit, each triangle should be equilateral).

In practice, we let

    xi = ( Σ_{xj∈Zi} xj ) / ni ,   ∀xi ∈ Nh ;      xi = xi^(∂D) if xi ∈ ∂D.

Two systems must then be solved, one for the x-components {xi } and the

other for the y-components {yi }. Denoting by zi the generic unknown, the

i-th row of the system, in the case of internal nodes, reads

    ni zi − Σ_{zj∈Zi} zj = 0,   ∀i ∈ Nh,                             (3.78)

while for the boundary nodes the identities zi = zi^(∂D) hold. Equations

(3.78) yield a system of the form Az = b, where A is a symmetric and positive definite matrix of order N − Nb which can be shown to be an M-matrix (see Section 1.12). This property ensures that the new grid coordinates satisfy discrete minimum and maximum principles, that is, they take a value which is between the minimum and the maximum values attained on the boundary.
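To make the construction of system (3.78) concrete, here is a small sketch on a hypothetical two-interior-node toy connectivity (the node sets and names are our own, not from the text): the diagonal of A carries ni, interior neighbours contribute −1 off-diagonal entries, and the fixed boundary neighbours move to the right hand side.

```python
import numpy as np

# Hypothetical toy connectivity: boundary nodes b1..b4 of the unit square
# with fixed coordinates; interior nodes 0 and 1 with neighbour sets Z_i.
boundary = {"b1": (0.0, 0.0), "b2": (1.0, 0.0),
            "b3": (1.0, 1.0), "b4": (0.0, 1.0)}
neighbours = {0: ["b1", "b2", 1], 1: ["b3", "b4", 0]}  # Z_0 and Z_1

n = len(neighbours)
A = np.zeros((n, n))
bx = np.zeros(n)
by = np.zeros(n)
for i, Zi in neighbours.items():
    A[i, i] = len(Zi)               # diagonal entry n_i = dim(Z_i)
    for z in Zi:
        if isinstance(z, int):      # interior neighbour: -1 off-diagonal
            A[i, z] -= 1.0
        else:                       # boundary neighbour: goes to the rhs
            bx[i] += boundary[z][0]
            by[i] += boundary[z][1]

x = np.linalg.solve(A, bx)          # system (3.78) for the x-components
y = np.linalg.solve(A, by)          # ... and for the y-components
```

For this toy patch the two interior nodes end up at (0.5, 0.25) and (0.5, 0.75), i.e. at the barycenters of their neighbour polygons.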

Let us apply the regularization technique to the triangulation of the unit square in Figure 3.14, which is affected by a severe nonuniformity of the triangle size. The grid consists of NT = 112 triangles and N = 73 vertices,



FIGURE 3.14. Triangulation before (left) and after (right) the regularization

of which Nb = 32 are on the boundary. The size of each of the two linear systems (3.78) is thus equal to 41, and their solution is carried out by the LU factorization of matrix A, both in its original form and in its sparse format, the latter obtained using the inverse Cuthill-McKee reordering algorithm described in Section 3.9.1.

In Figure 3.15 the sparsity patterns of A are displayed, without and with reordering; the integer nz = 237 denotes the number of nonzero entries in the matrix. Notice that in the second case there is a decrease in the bandwidth of the matrix, to which corresponds a large reduction in the operation count, from 61623 to 5552. The final configuration of the grid is displayed in Figure 3.14 (right), which clearly shows the effectiveness of the regularization procedure.
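The bandwidth reduction obtained by reordering can be reproduced in a few lines. The sketch below uses SciPy's `reverse_cuthill_mckee` on a small synthetic matrix (our own example, not the grid matrix of Figure 3.14): a banded symmetric matrix is scrambled by a random permutation, and the reordering recovers a much smaller bandwidth.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(M):
    """Largest distance of a nonzero entry from the main diagonal."""
    i, j = np.nonzero(M)
    return int(np.max(np.abs(i - j)))

# Symmetric pentadiagonal matrix whose ordering is scrambled by a
# random permutation (seeded for reproducibility)
n = 40
B = np.diag(np.full(n, 4.0))
for k in (1, 2):
    B += np.diag(np.full(n - k, -1.0), k) + np.diag(np.full(n - k, -1.0), -k)
rng = np.random.default_rng(0)
p = rng.permutation(n)
A = B[np.ix_(p, p)]                 # scrambled matrix: large bandwidth

perm = reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)
A_rcm = A[np.ix_(perm, perm)]       # reordered matrix: small bandwidth
```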


FIGURE 3.15. Sparsity patterns of matrix A without and with reordering (left

and right, respectively)


3.15 Exercises

1. For any square matrix A ∈ R^(n×n), prove the following relations:

    (1/n) K2(A) ≤ K1(A) ≤ n K2(A),    (1/n) K∞(A) ≤ K2(A) ≤ n K∞(A),

    (1/n^2) K1(A) ≤ K∞(A) ≤ n^2 K1(A).

They allow us to conclude that if a matrix is ill-conditioned in a certain

norm it remains so even in another norm, up to a factor depending on n.

2. Check that the matrix B ∈ R^(n×n) with bii = 1, bij = −1 if i < j and bij = 0 if i > j, has determinant equal to 1, yet K∞(B) is large (equal to n 2^(n−1)).

3. Prove that K(AB) ≤ K(A)K(B), for any two square nonsingular matrices A, B ∈ R^(n×n).

4. Given the matrix A ∈ R^(2×2) with a11 = a22 = 1, a12 = γ, a21 = 0, check that for γ ≥ 0, K∞(A) = K1(A) = (1 + γ)^2. Next, consider the linear system Ax = b where b is such that x = (1 − γ, 1)^T is the solution. Find a bound for ‖δx‖∞/‖x‖∞ in terms of ‖δb‖∞/‖b‖∞ when δb = (δ1, δ2)^T. Is the problem well- or ill-conditioned?

5. Consider the matrix A ∈ R^(n×n) with entries aij = 1 if i = j or j = n, aij = −1 if i > j, and zero otherwise. Show that A admits an LU factorization with |lij| ≤ 1 and unn = 2^(n−1).
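The growth of unn can be observed numerically. The sketch below (our own, using SciPy's `lu` with n = 6) factors this matrix; for it, partial pivoting performs no row exchanges, so the largest entry of U is 2^(n−1) = 32 while all multipliers stay bounded by 1.

```python
import numpy as np
from scipy.linalg import lu

n = 6
# a_ij = 1 if i = j or j = n, -1 if i > j, 0 otherwise
A = -np.tril(np.ones((n, n)), k=-1) + np.eye(n)
A[:, -1] = 1.0                      # last column of ones

P, L, U = lu(A)                     # LU with partial pivoting: A = P L U
growth = np.abs(U).max()            # element growth 2**(n-1)
```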

6. Consider matrix (3.68) in Example 3.7. Prove that the matrices L and U have entries very large in modulus. Check that using GEM with complete pivoting yields the exact solution.

7. Devise a variant of GEM that transforms a nonsingular matrix A ∈ R^(n×n) directly into a diagonal matrix D. This process is commonly known as the Gauss-Jordan method. Find the Gauss-Jordan transformation matrices Gi, i = 1, ..., n, such that Gn ... G1 A = D.

8. Let A be a sparse matrix of order n. Prove that the computational cost of the LU factorization of A is given by (3.61). Prove also that it is always less than

    (1/2) Σ_{k=1}^{n} mk(A) (mk(A) + 3).

9. Prove that, if A is a symmetric and positive definite matrix, solving the linear system Ax = b amounts to computing x = Σ_{i=1}^{n} (ci/λi) vi, where λi are the eigenvalues of A, vi are the corresponding eigenvectors, and the ci are the components of b with respect to the basis {vi}.

10. (From [JM92]). Consider the following linear system

    [ 1001 1000 ] [ x1 ]   [ b1 ]
    [ 1000 1001 ] [ x2 ] = [ b2 ].

Using Exercise 9, explain why, when b = (2001, 2001)^T, a small change δb = (1, 0)^T produces large variations in the solution, while, conversely,


when b = (1, −1)^T, a small variation δx = (0.001, 0)^T in the solution induces a large change in b.

[Hint: expand the right hand side on the basis of the eigenvectors of the matrix.]

11. Characterize the fill-in for a matrix A ∈ R^(n×n) having nonzero entries only on the main diagonal and on the first column and last row. Propose a permutation that minimizes the fill-in.
[Hint: it suffices to exchange the first row and the first column with the last row and the last column, respectively.]

12. Consider the linear system Hn x = b, where Hn is the Hilbert matrix of order n. Estimate, as a function of n, the maximum number of significant digits that can be expected when solving the system by GEM.
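As a rough numerical illustration (the cutoff of about 16 digits for double precision and the rule of thumb "digits lost ≈ log10 K" are our own working assumptions, not claims from the text), one can inspect how fast K(Hn) grows:

```python
import numpy as np
from scipy.linalg import hilbert   # SciPy's Hilbert-matrix helper

# 2-norm condition numbers of H_n for a few sizes
conds = {n: np.linalg.cond(hilbert(n)) for n in (4, 8, 12)}
# Crude estimate of the significant digits surviving in double precision
digits = {n: max(0, 16 - int(np.log10(K))) for n, K in conds.items()}
```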

13. Given the vectors

    v1 = [1, 1, 1, −1]^T ,   v2 = [2, −1, −1, 1]^T ,
    v3 = [0, 3, 3, −3]^T ,   v4 = [−1, 2, 2, 1]^T ,

generate an orthonormal system using the Gram-Schmidt algorithm, in both its standard and modified versions, and compare the obtained results. What is the dimension of the space generated by the given vectors?
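A sketch of both variants (the function and the drop tolerance 1e-10 are our own choices) applied to these vectors; since v3 = 2 v1 − v2, one column collapses to zero and the spanned space has dimension 3:

```python
import numpy as np

def gram_schmidt(V, modified=False):
    """Orthonormalize the columns of V; near-zero norms flag dependence."""
    V = V.astype(float)
    m = V.shape[1]
    Q = np.zeros_like(V)
    for j in range(m):
        v = V[:, j].copy()
        for i in range(j):
            # classical GS projects the *original* column,
            # modified GS the partially orthogonalized one
            coeff = Q[:, i] @ (v if modified else V[:, j])
            v = v - coeff * Q[:, i]
        nrm = np.linalg.norm(v)
        Q[:, j] = v / nrm if nrm > 1e-10 else 0.0
    return Q

V = np.array([[1,  2,  0, -1],
              [1, -1,  3,  2],
              [1, -1,  3,  2],
              [-1, 1, -3,  1]], dtype=float)   # columns v1..v4
Qc = gram_schmidt(V)                  # classical version
Qm = gram_schmidt(V, modified=True)   # modified version
```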

14. Prove that if A = QR then

    (1/n) K1(A) ≤ K1(R) ≤ n K1(A),

while K2(A) = K2(R).

15. Let A ∈ R^(n×n) be a nonsingular matrix. Determine the conditions under which the ratio ‖y‖2/‖x‖2, with x and y as in (3.70), approximates ‖A^(−1)‖2.
[Solution: let UΣV^T be the singular value decomposition of A. Denote by ui, vi the column vectors of U and V, respectively, and expand the vector d in (3.70) on the basis spanned by {vi}. Then d = Σ_{i=1}^{n} d̃i vi and, from (3.70), x = Σ_{i=1}^{n} (d̃i/σi) ui, y = Σ_{i=1}^{n} (d̃i/σi^2) vi, having denoted the singular values of A by σ1, ..., σn.
The ratio

    ‖y‖2/‖x‖2 = [ Σ_{i=1}^{n} (d̃i/σi^2)^2 / Σ_{i=1}^{n} (d̃i/σi)^2 ]^(1/2)

is about equal to σn^(−1) = ‖A^(−1)‖2 if: (i) y has a relevant component in the direction of vn (i.e., if d̃n is not excessively small), and (ii) the ratio d̃n/σn is not negligible with respect to the ratios d̃i/σi for i = 1, ..., n − 1. This last circumstance certainly occurs if A is ill-conditioned in the ‖·‖2-norm, since then σn ≪ σ1.]

4

Iterative Methods for Solving Linear

Systems

Iterative methods formally yield the solution x of a linear system after an

in¬nite number of steps. At each step they require the computation of the

residual of the system. In the case of a full matrix, their computational

cost is therefore of the order of n2 operations for each iteration, to be

compared with an overall cost of the order of 2 n3 operations needed by

3

direct methods. Iterative methods can therefore become competitive with

direct methods provided the number of iterations that are required to con-

verge (within a prescribed tolerance) is either independent of n or scales

sublinearly with respect to n.

In the case of large sparse matrices, as discussed in Section 3.9, direct methods may be inconvenient due to the dramatic fill-in, although extremely efficient direct solvers can be devised for sparse matrices featuring special structures like, for example, those encountered in the approximation of partial differential equations (see Chapters 12 and 13).

Finally, we notice that, when A is ill-conditioned, a combined use of direct

and iterative methods is made possible by preconditioning techniques that

will be addressed in Section 4.3.2.

4.1 On the Convergence of Iterative Methods

The basic idea of iterative methods is to construct a sequence of vectors

x(k) that enjoy the property of convergence

    x = lim_{k→∞} x^(k),                                              (4.1)


where x is the solution to (3.2). In practice, the iterative process is stopped at the minimum value of n such that ‖x^(n) − x‖ < ε, where ε is a fixed tolerance and ‖·‖ is any convenient vector norm. However, since the exact solution is obviously not available, it is necessary to introduce suitable stopping criteria to monitor the convergence of the iteration (see Section 4.6).

To start with, we consider iterative methods of the form

    x^(0) given,    x^(k+1) = B x^(k) + f,    k ≥ 0,                  (4.2)

having denoted by B an n × n square matrix called the iteration matrix and by f a vector that is obtained from the right hand side b.

Definition 4.1 An iterative method of the form (4.2) is said to be consistent with (3.2) if f and B are such that x = Bx + f. Equivalently, f = (I − B)A^(−1)b.
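Consistency is easy to verify numerically. The sketch below (the matrices and all sizes are our own arbitrary choices) builds f = (I − B)A^(−1)b for a random iteration matrix B scaled to have spectral radius below 1, and iterates (4.2); anticipating the convergence result proved below, the iterates indeed approach the solution of Ax = b.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)   # safely nonsingular
b = rng.standard_normal(n)
x_exact = np.linalg.solve(A, b)

B = 0.4 * rng.standard_normal((n, n))             # arbitrary iteration matrix
B *= 0.9 / max(np.abs(np.linalg.eigvals(B)).max(), 1.0)  # force rho(B) < 1
f = (np.eye(n) - B) @ x_exact                     # consistency: f = (I-B)A^{-1}b

x = np.zeros(n)                                   # initial guess x^(0) = 0
for _ in range(200):
    x = B @ x + f                                 # iteration (4.2)
```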

Having denoted by

    e^(k) = x^(k) − x                                                 (4.3)

the error at the k-th step of the iteration, the condition for convergence (4.1) amounts to requiring that lim_{k→∞} e^(k) = 0 for any choice of the initial datum x^(0) (often called the initial guess).

Consistency alone does not suffice to ensure the convergence of the iterative method (4.2), as shown in the following example.

Example 4.1 To solve the linear system 2Ix = b, consider the iterative method

    x^(k+1) = −x^(k) + b,

which is obviously consistent. This scheme is not convergent for any choice of the initial guess. If, for instance, x^(0) = 0, the method generates the sequence x^(2k) = 0, x^(2k+1) = b, k = 0, 1, .... On the other hand, if x^(0) = (1/2) b the method is convergent. •
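The oscillation in Example 4.1 can be reproduced directly (a small NumPy sketch; the particular right hand side is our own choice):

```python
import numpy as np

b = np.array([1.0, 2.0])
x = np.zeros(2)               # initial guess x^(0) = 0
seen = []
for k in range(4):
    x = -x + b                # x^(k+1) = -x^(k) + b, consistent for 2Ix = b
    seen.append(x.copy())
# the iterates alternate between b and 0, never approaching x = b/2

y = b / 2                     # the special initial guess x^(0) = b/2 ...
y_next = -y + b               # ... is a fixed point of the iteration
```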

Theorem 4.1 Let (4.2) be a consistent method. Then, the sequence of vectors x^(k) converges to the solution of (3.2) for any choice of x^(0) iff ρ(B) < 1.

Proof. From (4.3) and the consistency assumption, the recursive relation e^(k+1) = B e^(k) is obtained. Therefore,

    e^(k) = B^k e^(0),    ∀k = 0, 1, ...                              (4.4)


Thus, thanks to Theorem 1.5, it follows that lim_{k→∞} B^k e^(0) = 0 for any e^(0) iff ρ(B) < 1.
Conversely, suppose that ρ(B) ≥ 1; then there exists at least one eigenvalue λ(B) with modulus greater than or equal to 1. Let e^(0) be an eigenvector associated with λ; then Be^(0) = λe^(0) and, therefore, e^(k) = λ^k e^(0). As a consequence, e^(k) cannot