enforcing the identity A = LDM^T entry by entry.

3.4.2 Symmetric and Positive Definite Matrices: The Cholesky Factorization

As already pointed out, the factorization LDM^T simplifies considerably when A is symmetric, because in such a case M = L, yielding the so-called LDL^T factorization. The computational cost halves, with respect to the LU factorization, to about n^3/3 flops.

As an example, the Hilbert matrix of order 3 admits the following LDL^T factorization:

$$
H_3 = \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \\ \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ \tfrac{1}{2} & 1 & 0 \\ \tfrac{1}{3} & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{12} & 0 \\ 0 & 0 & \tfrac{1}{180} \end{bmatrix}
\begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}.
$$
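This factorization can be checked directly; the following MATLAB lines are a small verification sketch (hilb is the built-in Hilbert matrix generator):

L = [1 0 0; 1/2 1 0; 1/3 1 1];
D = diag([1, 1/12, 1/180]);
err = norm(hilb(3) - L*D*L', inf)   % of the order of machine precision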

In the case that A is also positive definite, the diagonal entries of D in the LDL^T factorization are positive. Moreover, we have the following result.

Theorem 3.6 Let A ∈ R^{n×n} be a symmetric and positive definite matrix. Then, there exists a unique upper triangular matrix H with positive diagonal entries such that

$$
A = H^T H. \tag{3.44}
$$

This factorization is called the Cholesky factorization, and the entries h_{ij} of H^T can be computed as follows: h_{11} = \sqrt{a_{11}} and, for i = 2, ..., n,

$$
h_{ij} = \Bigl( a_{ij} - \sum_{k=1}^{j-1} h_{ik} h_{jk} \Bigr) \Big/ h_{jj}, \quad j = 1, \ldots, i-1,
\qquad
h_{ii} = \Bigl( a_{ii} - \sum_{k=1}^{i-1} h_{ik}^2 \Bigr)^{1/2}. \tag{3.45}
$$


Proof. Let us prove the theorem proceeding by induction on the size i of the matrix (as done in Theorem 3.4), recalling that if A_i ∈ R^{i×i} is symmetric positive definite, then all its principal submatrices enjoy the same property.

For i = 1 the result is obviously true. Thus, suppose that it holds for i − 1 and prove that it also holds for i. There exists an upper triangular matrix H_{i−1} such that A_{i−1} = H_{i−1}^T H_{i−1}. Let us partition A_i as

$$
A_i = \begin{bmatrix} A_{i-1} & v \\ v^T & \alpha \end{bmatrix},
$$

with α ∈ R^+, v ∈ R^{i−1}, and look for a factorization of A_i of the form

$$
A_i = H_i^T H_i = \begin{bmatrix} H_{i-1}^T & 0 \\ h^T & \beta \end{bmatrix}
\begin{bmatrix} H_{i-1} & h \\ 0^T & \beta \end{bmatrix}.
$$

Enforcing the equality with the entries of A_i yields the equations H_{i−1}^T h = v and h^T h + β^2 = α. The vector h is thus uniquely determined, since H_{i−1}^T is nonsingular. As for β, due to the properties of determinants,

$$
0 < \det(A_i) = \det(H_i^T)\,\det(H_i) = \beta^2 \left(\det(H_{i-1})\right)^2,
$$

we can conclude that it must be a real number. As a result, β = \sqrt{α − h^T h} is the desired diagonal entry and this concludes the inductive argument.

Let us now prove formulae (3.45). The fact that h_{11} = \sqrt{a_{11}} is an immediate consequence of the induction argument for i = 1. In the case of a generic i, relations (3.45)_1 are the forward substitution formulae for the solution of the linear system H_{i−1}^T h = v = (a_{1i}, a_{2i}, ..., a_{i−1,i})^T, while formulae (3.45)_2 state that β = \sqrt{α − h^T h}, where α = a_{ii}. ∎

The algorithm which implements (3.45) requires about n^3/3 flops, and it turns out to be stable with respect to the propagation of rounding errors. It can indeed be shown that the computed upper triangular factor H̃ satisfies H̃^T H̃ = A + δA, where δA is a perturbation matrix such that ‖δA‖_2 ≤ 8n(n+1)u‖A‖_2, when the rounding errors are considered and assuming that 2n(n+1)u ≤ 1 − (n+1)u (see [Wil68]).
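As an illustration (not part of the original analysis), the bound can be compared with the backward error actually observed for the built-in chol; the test matrix below is only a sample, and u is taken as the unit roundoff eps/2:

% Hedged sketch: observed backward error of Cholesky vs. the bound above
n = 50; B = rand(n); A = B'*B + n*eye(n);  % a sample s.p.d. test matrix
H = chol(A);                               % upper triangular, A ~ H'*H
u = eps/2;                                 % unit roundoff
observed = norm(H'*H - A, 2)               % the perturbation ||dA||_2 in practice
bound = 8*n*(n+1)*u*norm(A, 2)             % theoretical bound from [Wil68]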

Also, for the Cholesky factorization it is possible to overwrite the matrix H^T in the lower triangular portion of A, without any further memory storage. By doing so, both A and the factorization are preserved, noting that A is stored in the upper triangular section since it is symmetric and that its diagonal entries can be computed as

$$
a_{11} = h_{11}^2, \qquad a_{ii} = h_{ii}^2 + \sum_{k=1}^{i-1} h_{ik}^2, \quad i = 2, \ldots, n.
$$

An implementation of the Cholesky factorization is coded in Program 7.

Program 7 - chol2 : Cholesky factorization

function [A] = chol2(A)
% CHOL2 Cholesky factorization: on exit, the lower triangular part of A
% contains H^T, while the strictly upper triangular part is preserved.
[n,n]=size(A);
for k=1:n-1
   A(k,k)=sqrt(A(k,k)); A(k+1:n,k)=A(k+1:n,k)/A(k,k);
   % update of the trailing lower triangle, one column at a time
   for j=k+1:n, A(j:n,j)=A(j:n,j)-A(j:n,k)*A(j,k); end
end
A(n,n)=sqrt(A(n,n));
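As a usage sketch (the test matrix is illustrative), the factor can be extracted from the lower triangle of the output and checked against A:

A = [4 2 1; 2 5 3; 1 3 6];      % a sample symmetric positive definite matrix
B = chol2(A);
HT = tril(B);                   % H^T is stored in the lower triangular part
err = norm(HT*HT' - A, inf)     % of the order of machine precision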

3.4.3 Rectangular Matrices: The QR Factorization

Definition 3.1 A matrix A ∈ R^{m×n}, with m ≥ n, admits a QR factorization if there exist an orthogonal matrix Q ∈ R^{m×m} and an upper trapezoidal matrix R ∈ R^{m×n} with null rows from the (n+1)-th one on, such that

$$
A = QR. \tag{3.46}
$$

This factorization can be constructed either using suitable transformation

matrices (Givens or Householder matrices, see Section 5.6.1) or using the

Gram-Schmidt orthogonalization algorithm discussed below.

It is also possible to generate a reduced version of the QR factorization

(3.46), as stated in the following result.

Property 3.3 Let A ∈ R^{m×n} be a matrix of rank n for which a QR factorization is known. Then there exists a unique factorization of A of the form

$$
A = \tilde{Q}\tilde{R}, \tag{3.47}
$$

where Q̃ and R̃ are submatrices of Q and R given respectively by

$$
\tilde{Q} = Q(1:m, 1:n), \qquad \tilde{R} = R(1:n, 1:n). \tag{3.48}
$$

Moreover, Q̃ has orthonormal column vectors and R̃ is upper triangular and coincides with the Cholesky factor H of the symmetric positive definite matrix A^T A, that is, A^T A = R̃^T R̃.

If A has rank n (i.e., full rank), then the column vectors of Q̃ form an orthonormal basis for the vector space range(A) (defined in (1.5)). As a consequence, constructing the QR factorization can also be interpreted as a procedure for generating an orthonormal basis for a given set of vectors.

If A has rank r < n, the QR factorization does not necessarily yield an orthonormal basis for range(A). However, one can obtain a factorization of the form

$$
Q^T A P = \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix},
$$

where Q is orthogonal, P is a permutation matrix and R_{11} is a nonsingular upper triangular matrix of order r.

[FIGURE 3.1. The reduced factorization. The matrices of the full QR factorization are drawn in dashed lines.]
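In MATLAB such a factorization can be obtained with the three-output form of the intrinsic function qr, which performs column pivoting; a minimal sketch on a rank-deficient matrix:

A = [1 2 3; 4 5 6; 7 8 9; 10 11 12];   % rank(A) = 2 < n = 3
[Q,R,P] = qr(A);                       % column pivoting: A*P = Q*R
r = rank(A);
R11 = R(1:r,1:r)                       % nonsingular upper triangular, order r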

In general, when using the QR factorization, we shall always refer to its reduced form (3.47), as it finds a remarkable application in the solution of overdetermined systems (see Section 3.13).

The matrix factors Q̃ and R̃ in (3.47) can be computed using the Gram-Schmidt orthogonalization. Starting from a set of linearly independent vectors x_1, ..., x_n, this algorithm generates a new set of mutually orthogonal vectors q_1, ..., q_n, given by

$$
q_1 = x_1, \qquad
q_{k+1} = x_{k+1} - \sum_{i=1}^{k} \frac{(q_i, x_{k+1})}{(q_i, q_i)}\, q_i, \quad k = 1, \ldots, n-1. \tag{3.49}
$$

Denoting by a_1, ..., a_n the column vectors of A, we set q̃_1 = a_1/‖a_1‖_2 and, for k = 1, ..., n−1, compute the column vectors of Q̃ as

$$
\tilde{q}_{k+1} = q_{k+1} / \|q_{k+1}\|_2,
\qquad \text{where} \qquad
q_{k+1} = a_{k+1} - \sum_{j=1}^{k} (\tilde{q}_j, a_{k+1})\, \tilde{q}_j.
$$

Next, imposing that A = Q̃R̃ and exploiting the fact that Q̃ has orthonormal columns (that is, Q̃^T Q̃ = I_n), the entries of R̃ can easily be computed. The overall computational cost of the algorithm is of the order of mn^2 flops.
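For comparison with Program 8 below, a direct (and, as discussed shortly, numerically fragile) implementation of these formulae might read as follows; the function name grams is illustrative:

function [Q,R] = grams(A)
% Standard Gram-Schmidt QR: for illustration only (see Example 3.4)
[m,n] = size(A);
Q = zeros(m,n); R = zeros(n);
for k = 1:n
   v = A(1:m,k);
   for j = 1:k-1
      R(j,k) = Q(1:m,j)'*A(1:m,k);   % projection on the *original* column
      v = v - R(j,k)*Q(1:m,j);
   end
   R(k,k) = norm(v); Q(1:m,k) = v/R(k,k);
end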

It is also worth noting that if A has full rank, the matrix A^T A is symmetric and positive definite (see Section 1.9) and thus it admits a unique Cholesky factorization of the form H^T H. On the other hand, since the orthogonality of Q̃ implies

$$
H^T H = A^T A = \tilde{R}^T \tilde{Q}^T \tilde{Q} \tilde{R} = \tilde{R}^T \tilde{R},
$$

we conclude that R̃ is actually the Cholesky factor H of A^T A. Thus, the diagonal entries of R̃ are all nonzero only if A has full rank.
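This identity can be verified numerically; up to the signs of the rows (MATLAB's qr does not enforce a positive diagonal for R), the reduced triangular factor matches chol(A'*A). A small sketch with an arbitrary full-rank matrix:

A = rand(8,4);                  % a sample full-rank matrix, m > n
[Q,R] = qr(A,0);                % reduced QR factorization (3.47)
H = chol(A'*A);                 % Cholesky factor of A'*A
S = diag(sign(diag(R)));        % fix the row-sign ambiguity
err = norm(S*R - H, inf)        % small up to rounding errors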

The Gram-Schmidt method is of little practical use since the generated vectors lose their linear independence due to rounding errors. Indeed, in floating-point arithmetic the algorithm produces very small values of ‖q_{k+1}‖_2 and r̃_{kk}, with a consequent numerical instability and loss of orthogonality for the matrix Q̃ (see Example 3.4).

These drawbacks suggest employing a more stable version, known as the modified Gram-Schmidt method. At the beginning of the (k+1)-th step, the projections of the vector a_{k+1} along the vectors q̃_1, ..., q̃_k are progressively subtracted from a_{k+1}. On the resulting vector, the orthogonalization step is then carried out. In practice, after computing (q̃_1, a_{k+1}) q̃_1 at the (k+1)-th step, this vector is immediately subtracted from a_{k+1}. As an example, one lets

$$
a_{k+1}^{(1)} = a_{k+1} - (\tilde{q}_1, a_{k+1})\, \tilde{q}_1.
$$

This new vector a_{k+1}^{(1)} is projected along the direction of q̃_2 and the obtained projection is subtracted from a_{k+1}^{(1)}, yielding

$$
a_{k+1}^{(2)} = a_{k+1}^{(1)} - (\tilde{q}_2, a_{k+1}^{(1)})\, \tilde{q}_2,
$$

and so on, until a_{k+1}^{(k)} is computed.

It can be checked that a_{k+1}^{(k)} coincides with the corresponding vector q_{k+1} in the standard Gram-Schmidt process, since, due to the orthogonality of the vectors q̃_1, q̃_2, ..., q̃_k,

$$
a_{k+1}^{(k)} = a_{k+1} - (\tilde{q}_1, a_{k+1})\, \tilde{q}_1
- \bigl(\tilde{q}_2,\, a_{k+1} - (\tilde{q}_1, a_{k+1})\, \tilde{q}_1\bigr)\, \tilde{q}_2 - \ldots
= a_{k+1} - \sum_{j=1}^{k} (\tilde{q}_j, a_{k+1})\, \tilde{q}_j.
$$

Program 8 implements the modified Gram-Schmidt method. Notice that it is not possible to overwrite the computed QR factorization on the matrix A. In general, the matrix R̃ is overwritten on A, whilst Q̃ is stored separately. The computational cost of the modified Gram-Schmidt method is of the order of 2mn^2 flops.

Program 8 - mod_grams : Modified Gram-Schmidt method

function [Q,R] = mod_grams(A)
[m,n] = size(A);
Q = zeros(m,n); Q(1:m,1) = A(1:m,1); R = zeros(n); R(1,1) = 1;
for k = 1:n
   % normalize the current (already orthogonalized) column
   R(k,k) = norm(A(1:m,k)); Q(1:m,k) = A(1:m,k)/R(k,k);
   for j = k+1:n
      % subtract the projection on q_k from each remaining column
      R(k,j) = Q(1:m,k)' * A(1:m,j);
      A(1:m,j) = A(1:m,j) - Q(1:m,k)*R(k,j);
   end
end

Example 3.4 Let us consider the Hilbert matrix H_4 of order 4 (see (3.32)). The matrix Q̃ generated by the standard Gram-Schmidt algorithm is orthogonal up to the order of 10^{−10}, being

$$
I - \tilde{Q}^T \tilde{Q} = 10^{-10}
\begin{bmatrix}
0.0000 & -0.0000 & 0.0001 & -0.0041 \\
-0.0000 & 0 & 0.0004 & -0.0099 \\
0.0001 & 0.0004 & 0 & -0.4785 \\
-0.0041 & -0.0099 & -0.4785 & 0
\end{bmatrix}
$$

and ‖I − Q̃^T Q̃‖_∞ = 4.9247 · 10^{−11}. Using the modified Gram-Schmidt method, we would obtain

$$
I - \tilde{Q}^T \tilde{Q} = 10^{-12}
\begin{bmatrix}
0.0001 & -0.0005 & 0.0069 & -0.2853 \\
-0.0005 & 0 & -0.0023 & 0.0213 \\
0.0069 & -0.0023 & 0.0002 & -0.0103 \\
-0.2853 & 0.0213 & -0.0103 & 0
\end{bmatrix}
$$

and this time ‖I − Q̃^T Q̃‖_∞ = 3.1686 · 10^{−13}.

An improved result can be obtained using, instead of Program 8, the intrinsic function qr of MATLAB. This function can be properly employed to generate both the factorization (3.46) as well as its reduced version (3.47). •
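For instance, a usage sketch reproducing the comparison above (the numbers are those reported in the example, up to machine-dependent variations):

H4 = hilb(4);
[Q1,R1] = mod_grams(H4);             % Program 8
[Q2,R2] = qr(H4);                    % intrinsic function, factorization (3.46)
d1 = norm(eye(4) - Q1'*Q1, inf)      % about 1e-13, as in the example
d2 = norm(eye(4) - Q2'*Q2, inf)      % of the order of eps
% the reduced version (3.47) is obtained as [Q,R] = qr(A,0)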

3.5 Pivoting

As previously pointed out, the GEM process breaks down as soon as a zero

pivotal entry is computed. In such an event, one needs to resort to the so-

called pivoting technique, which amounts to exchanging rows (or columns)

of the system in such a way that nonvanishing pivots are obtained.

Example 3.5 Let us go back to the matrix (3.33), for which GEM furnishes a zero pivotal element at the second step. By simply exchanging the second row with the third one, we can execute one step further of the elimination method, finding a nonzero pivot. The generated system is equivalent to the original one, and it can be noticed that it is already in upper triangular form. Indeed,

$$
A^{(2)} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -12 \\ 0 & 0 & -1 \end{bmatrix} = U,
$$

while the transformation matrices are given by

$$
M^{(1)} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -7 & 0 & 1 \end{bmatrix}, \qquad
M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$

From an algebraic standpoint, a permutation of the rows of A has been performed. In fact, it now no longer holds that A = M_1^{-1} M_2^{-1} U, but rather A = M_1^{-1} P M_2^{-1} U, P being the permutation matrix

$$
P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}. \tag{3.50}
$$
•
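A quick numerical check of this relation (a sketch: the matrix (3.33) is not reproduced here, so A below is reconstructed from the multipliers in M^{(1)} and the factor U above):

A  = [1 2 3; 2 4 5; 7 8 9];        % reconstructed candidate for (3.33)
M1 = [1 0 0; -2 1 0; -7 0 1];
U  = [1 2 3; 0 -6 -12; 0 0 -1];
P  = [1 0 0; 0 0 1; 0 1 0];        % permutation matrix (3.50)
err = norm(A - inv(M1)*P*U, inf)   % zero, since M2 is the identity here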

The pivoting strategy adopted in Example 3.5 can be generalized by looking, at each step k of the elimination procedure, for a nonzero pivotal entry by searching within the entries of the subcolumn A^{(k)}(k:n, k). For that reason, it is called partial pivoting (by rows).

From (3.30) it can be seen that a large value of m_{ik} (generated, for example, by a small value of the pivot a_{kk}^{(k)}) might amplify the rounding errors affecting the entries a_{kj}^{(k)}. Therefore, in order to ensure better stability, the pivotal element is chosen as the largest entry (in modulus) of the column A^{(k)}(k:n, k), and partial pivoting is generally performed at every step of the elimination procedure, even if not strictly necessary (that is, even if nonzero pivotal entries are found).

Alternatively, the searching process could be extended to the whole submatrix A^{(k)}(k:n, k:n), ending up with complete pivoting (see Figure 3.2). Notice, however, that while partial pivoting requires an additional cost of about n^2 searches, complete pivoting needs about 2n^3/3, with a considerable increase of the computational cost of GEM.
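A minimal sketch of GEM with partial pivoting by rows is given below (the function name lu_pivot is illustrative; the built-in lu of MATLAB implements the same strategy):

function [L,U,P] = lu_pivot(A)
% Gaussian elimination with partial pivoting by rows: P*A = L*U
n = size(A,1); P = eye(n);
for k = 1:n-1
   [~,r] = max(abs(A(k:n,k)));        % largest entry in modulus of A(k:n,k)
   r = r + k - 1;
   A([k r],:) = A([r k],:);           % exchange rows k and r
   P([k r],:) = P([r k],:);
   A(k+1:n,k) = A(k+1:n,k)/A(k,k);    % multipliers m_ik
   A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k)*A(k,k+1:n);
end
L = tril(A,-1) + eye(n); U = triu(A);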

[FIGURE 3.2: the entries searched for the pivotal element, i.e. the subcolumn A^{(k)}(k:n,k) for partial pivoting and the submatrix A^{(k)}(k:n,k:n) for complete pivoting.]