5.9.2 Generalized Real Schur Form of Symmetric-Definite Pencils

A remarkable situation occurs when both A and B are symmetric, and one of them, say B, is also positive definite. In such a case, the pair (A,B) forms a symmetric-definite pencil, for which the following result holds.

Theorem 5.7 The symmetric-definite pencil (A,B) has real eigenvalues and linearly independent eigenvectors. Moreover, the matrices A and B can be simultaneously diagonalized. Precisely, there exists a nonsingular matrix X ∈ R^(n×n) such that

X^T A X = Λ = diag(λ_1, λ_2, ..., λ_n),   X^T B X = I_n,

where, for i = 1, ..., n, λ_i are the eigenvalues of the pencil (A,B).

Proof. Since B is symmetric positive definite, it admits a unique Cholesky factorization B = H^T H, where H is upper triangular (see Section 3.4.2). From (5.58) we deduce that Cz = λz, with C = H^(-T) A H^(-1) and z = Hx, where (λ, x) is an eigenvalue/eigenvector pair of (A,B).

The matrix C is symmetric; therefore, its eigenvalues are real and a set of orthonormal eigenvectors (y_1, ..., y_n) = Y exists. As a consequence, letting X = H^(-1) Y allows for simultaneously diagonalizing both A and B, since

X^T A X = Y^T H^(-T) A H^(-1) Y = Y^T C Y = Λ = diag(λ_1, ..., λ_n),
X^T B X = Y^T H^(-T) B H^(-1) Y = Y^T Y = I_n.   □

The following QR-Cholesky algorithm computes the eigenvalues λ_i and the corresponding eigenvectors x_i of a symmetric-definite pencil (A,B), for i = 1, ..., n (for more details, see [GL89], Section 8.7, and [Dat95], Section 9.5):

1. compute the Cholesky factorization B = H^T H;

2. compute C = H^(-T) A H^(-1);

3. for i = 1, ..., n, compute the eigenvalues λ_i and eigenvectors z_i of the symmetric matrix C using the QR iteration. Then construct from the set {z_i} an orthonormal set of eigenvectors {y_i} (using, for instance, the modified Gram-Schmidt procedure of Section 3.4.3);

4. for i = 1, ..., n, compute the eigenvectors x_i of the pencil (A,B) by solving the systems Hx_i = y_i.
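The four steps above can be sketched as follows. This is an illustrative Python translation, not the book's code: it uses NumPy's symmetric eigensolver `eigh` in place of an explicit QR iteration for step 3 (the eigenvectors it returns are already orthonormal, so the Gram-Schmidt step is not needed here).

```python
import numpy as np

def qr_cholesky_pencil(A, B):
    """Eigenvalues/eigenvectors of the symmetric-definite pencil (A, B)."""
    # Step 1: Cholesky factorization B = H^T H. np.linalg.cholesky returns
    # a lower-triangular L with B = L L^T, so H = L^T is upper triangular.
    H = np.linalg.cholesky(B).T
    # Step 2: C = H^{-T} A H^{-1}, formed by triangular solves rather than
    # an explicit inverse of B.
    Hinv = np.linalg.solve(H, np.eye(H.shape[0]))
    C = Hinv.T @ A @ Hinv
    # Step 3: eigenpairs of the symmetric matrix C (orthonormal columns Y).
    lam, Y = np.linalg.eigh(C)
    # Step 4: recover the pencil eigenvectors by solving H x_i = y_i.
    X = np.linalg.solve(H, Y)
    return lam, X

# Usage: check the simultaneous diagonalization of Theorem 5.7 on a
# randomly generated symmetric-definite pencil.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M + M.T                    # symmetric
B = M @ M.T + 4 * np.eye(4)    # symmetric positive definite
lam, X = qr_cholesky_pencil(A, B)
assert np.allclose(X.T @ B @ X, np.eye(4))
assert np.allclose(X.T @ A @ X, np.diag(lam))
```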


This algorithm requires on the order of 14n^3 flops, and it can be shown (see [GL89], p. 464) that, if λ̂ is a computed eigenvalue, then

λ̂ ∈ σ(H^(-T) A H^(-1) + E),   with ‖E‖_2 ≃ u ‖A‖_2 ‖B^(-1)‖_2.

Thus, the generalized eigenvalue problem in the symmetric-definite case may become unstable with respect to rounding error propagation if B is ill-conditioned. For a stabilized version of the QR-Cholesky method, see [GL89], p. 464 and the references cited therein.

5.10 Methods for Eigenvalues of Symmetric Matrices

In this section we deal with the computation of the eigenvalues of a symmetric matrix A ∈ R^(n×n). Besides the QR method previously examined, specific algorithms which take advantage of the symmetry of A are available.

Among these, we first consider the Jacobi method, which generates a sequence of matrices orthogonally similar to A and converging to the diagonal Schur form of A. Then the Sturm sequence and Lanczos procedures are presented, for handling the case of tridiagonal matrices and of large sparse matrices, respectively.

5.10.1 The Jacobi Method

The Jacobi method generates a sequence of matrices A^(k) that are orthogonally similar to the matrix A and converge to a diagonal matrix whose entries are the eigenvalues of A. This is done using the Givens similarity transformations (5.43) as follows.

Given A^(0) = A, for any k = 1, 2, ..., a pair of indices p and q is fixed, with 1 ≤ p < q ≤ n. Next, letting G_pq = G(p, q, θ), the matrix A^(k) = (G_pq)^T A^(k-1) G_pq, orthogonally similar to A, is constructed in such a way that

a_ij^(k) = 0   if (i, j) = (p, q).   (5.59)

Letting c = cos θ and s = sin θ, the procedure for computing the entries of A^(k) that are changed with respect to those of A^(k-1) can be written as

[ a_pp^(k)  a_pq^(k) ]   [  c  s ]^T [ a_pp^(k-1)  a_pq^(k-1) ] [  c  s ]
[                    ] = [        ]  [                        ] [        ].   (5.60)
[ a_pq^(k)  a_qq^(k) ]   [ -s  c ]   [ a_pq^(k-1)  a_qq^(k-1) ] [ -s  c ]

If a_pq^(k-1) = 0, we can satisfy (5.59) by taking c = 1 and s = 0. If a_pq^(k-1) ≠ 0, letting t = s/c, (5.60) requires the solution of the following algebraic equation

t^2 + 2ηt − 1 = 0,   η = (a_qq^(k-1) − a_pp^(k-1)) / (2 a_pq^(k-1)).   (5.61)

The root t = 1/(η + √(1+η^2)) is chosen in (5.61) if η ≥ 0; otherwise, we take t = −1/(−η + √(1+η^2)). Next, we let

c = 1/√(1+t^2),   s = ct.   (5.62)
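As a quick check that (5.61)-(5.62) do what is claimed, the following Python sketch (a hypothetical translation of the logic of Program 39 below; the names app, aqq, apq mirror the entries in (5.60)) computes (c, s) and verifies that the rotated off-diagonal entry vanishes.

```python
import math

def symschur(app, aqq, apq):
    """Compute (c, s) per (5.61)-(5.62) so that the Givens similarity
    transformation in (5.60) annihilates the off-diagonal entry a_pq."""
    if apq == 0.0:
        return 1.0, 0.0
    eta = (aqq - app) / (2.0 * apq)
    if eta >= 0.0:
        t = 1.0 / (eta + math.sqrt(1.0 + eta * eta))
    else:
        t = -1.0 / (-eta + math.sqrt(1.0 + eta * eta))
    c = 1.0 / math.sqrt(1.0 + t * t)
    return c, c * t

# Expanding the 2x2 product in (5.60), the new (p,q) entry equals
# (c^2 - s^2) a_pq + c s (a_pp - a_qq); it should be zero.
app, apq, aqq = 2.0, 1.0, 3.0
c, s = symschur(app, aqq, apq)
new_apq = (c * c - s * s) * apq + c * s * (app - aqq)
assert abs(new_apq) < 1e-12
```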

To examine the rate at which the off-diagonal entries of A^(k) tend to zero, it is convenient to introduce, for any matrix M ∈ R^(n×n), the nonnegative quantity

Ψ(M) = ( Σ_{i,j=1, i≠j}^n m_ij^2 )^(1/2) = ( ‖M‖_F^2 − Σ_{i=1}^n m_ii^2 )^(1/2).   (5.63)

The Jacobi method ensures that Ψ(A^(k)) ≤ Ψ(A^(k-1)) for any k ≥ 1. Indeed, the computation of (5.63) for the matrix A^(k) yields

(Ψ(A^(k)))^2 = (Ψ(A^(k-1)))^2 − 2 (a_pq^(k-1))^2 ≤ (Ψ(A^(k-1)))^2.   (5.64)

The estimate (5.64) suggests that, at each step k, the optimal choice of the indices p and q is the one corresponding to the entry of A^(k-1) such that

|a_pq^(k-1)| = max_{i≠j} |a_ij^(k-1)|.

The computational cost of this strategy is of the order of n^2 flops for the search of the maximum modulus entry, while the updating step A^(k) = (G_pq)^T A^(k-1) G_pq requires only a cost of the order of n flops, as already noticed in Section 5.6.5. It is thus convenient to resort to the so-called row cyclic Jacobi method, in which the choice of the indices p and q is done by a row sweep of the matrix A^(k) according to the following algorithm: for any k = 1, 2, ... and for any i-th row of A^(k) (i = 1, ..., n−1), we set p = i and q = (i+1), ..., n. Each complete sweep requires N = n(n−1)/2 Jacobi transformations. Assuming that |λ_i − λ_j| ≥ δ for i ≠ j, it can be shown that the cyclic Jacobi method converges quadratically, that is (see [Wil65], [Wil62])

Ψ(A^(k+N)) ≤ (1/(δ√2)) (Ψ(A^(k)))^2,   k = 1, 2, ...

For further details of the algorithm, we refer to [GL89], Section 8.4.
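A complete row-cyclic sweep can be sketched in Python as follows (an illustrative translation of the procedure above, with assumed helper names; cf. the MATLAB coding in Program 40, which instead applies the rotations via gacol/garow). The diagonal of the limit matrix carries the eigenvalues.

```python
import numpy as np

def cyclic_jacobi(A, tol=1e-12, max_sweeps=50):
    """Row-cyclic Jacobi method: sweep the strictly upper triangle,
    annihilating each (p, q) entry in turn via (5.61)-(5.62)."""
    D = A.astype(float).copy()
    n = D.shape[0]
    eps = tol * np.linalg.norm(D, 'fro')
    for sweep in range(max_sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if D[p, q] == 0.0:
                    continue
                eta = (D[q, q] - D[p, p]) / (2.0 * D[p, q])
                # t = 1/(eta + sqrt(1+eta^2)) if eta >= 0, else the other root
                t = np.sign(eta if eta != 0 else 1.0) / (abs(eta) + np.hypot(1.0, eta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                G = np.eye(n)
                G[p, p] = G[q, q] = c
                G[p, q] = s
                G[q, p] = -s
                D = G.T @ D @ G
        # Psi(D) from (5.63): off-diagonal Frobenius mass
        off = np.sqrt(np.sum(D**2) - np.sum(np.diag(D)**2))
        if off <= eps:
            break
    return np.sort(np.diag(D)), sweep + 1

# The diagonal converges to the eigenvalues of A.
A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.25], [0.5, 0.25, 2.0]])
lam, sweeps = cyclic_jacobi(A)
assert np.allclose(lam, np.sort(np.linalg.eigvalsh(A)))
```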


Example 5.15 Let us apply the cyclic Jacobi method to the Hilbert matrix H_4, whose eigenvalues read (to five significant figures) λ_1 = 1.5002, λ_2 = 1.6914·10^-1, λ_3 = 6.7383·10^-3 and λ_4 = 9.6702·10^-5. Running Program 40 with toll = 10^-15, the method converges in 3 sweeps to a matrix whose diagonal entries coincide with the eigenvalues of H_4 up to 4.4409·10^-16. As for the off-diagonal entries, the values attained by Ψ(H_4^(k)) are reported in Table 5.3.

Sweep   Ψ(H_4^(k))
1       5.262·10^-2
2       3.824·10^-5
3       5.313·10^-16

TABLE 5.3. Convergence of the cyclic Jacobi algorithm

Formulae (5.63) and (5.62) are implemented in Programs 38 and 39.

Program 38 - psinorm : Evaluation of Ψ(A)

function [psi]=psinorm(A)
n=max(size(A)); psi=0;
for i=1:(n-1), for j=(i+1):n
  psi=psi+A(i,j)^2+A(j,i)^2;
end; end
psi=sqrt(psi);

Program 39 - symschur : Evaluation of c and s

function [c,s]=symschur(A,p,q)
if (A(p,q)==0), c=1; s=0;
else
  eta=(A(q,q)-A(p,p))/(2*A(p,q));
  if (eta >= 0), t=1/(eta+sqrt(1+eta^2));
  else, t=-1/(-eta+sqrt(1+eta^2)); end
  c=1/sqrt(1+t^2); s=c*t;
end

A coding of the cyclic Jacobi method is implemented in Program 40. This program takes as input parameters the symmetric matrix A ∈ R^(n×n) and a tolerance toll. The program returns a matrix D = G^T A G, G being orthogonal, such that Ψ(D) ≤ toll·‖A‖_F, the value of Ψ(D) and the number of sweeps needed to achieve convergence.

Program 40 - cycjacobi : Cyclic Jacobi method for symmetric matrices

function [D,sweep,psi]=cycjacobi(A,toll)
n=max(size(A)); D=A; psiD=norm(A,'fro');
epsi=toll*psiD; psiD=psinorm(D); psi=psiD; sweep=0;
while (psiD > epsi), sweep=sweep+1;
  for p=1:(n-1), for q=(p+1):n
    [c,s]=symschur(D,p,q);
    [D]=gacol(D,c,s,1,n,p,q); [D]=garow(D,c,s,p,q,1,n);
  end; end
  psiD=psinorm(D); psi=[psi; psiD];
end


5.10.2 The Method of Sturm Sequences

In this section we deal with the calculation of the eigenvalues of a real, tridiagonal and symmetric matrix T. Typical instances of such a problem arise when applying the Householder transformation to a given symmetric matrix A (see Section 5.6.2) or when solving boundary value problems in one spatial dimension (see Section 5.12.1 for an example).

We analyze the method of Sturm sequences, or Givens method, introduced in [Giv54]. For i = 1, ..., n, we denote by d_i the diagonal entries of T and by b_i, i = 1, ..., n−1, the elements of the upper and lower subdiagonals of T. We shall assume that b_i ≠ 0 for any i; otherwise, the computation reduces to problems of lower complexity.

Letting T_i be the principal minor of order i of the matrix T, and p_0(x) = 1, we define for i = 1, ..., n the following sequence of polynomials p_i(x) = det(T_i − xI_i):

p_1(x) = d_1 − x,
p_i(x) = (d_i − x) p_{i−1}(x) − b_{i−1}^2 p_{i−2}(x),   i = 2, ..., n.   (5.65)

It can be checked that p_n is the characteristic polynomial of T; the computational cost of its evaluation at a point x is of the order of 2n flops. The sequence (5.65) is called the Sturm sequence, owing to the following result, for whose proof we refer to [Wil65], Chapter 2, Section 47 and Chapter 5, Section 37.

Property 5.11 (of the Sturm sequence) For i = 2, ..., n, the eigenvalues of T_{i−1} strictly separate those of T_i, that is

λ_i(T_i) < λ_{i−1}(T_{i−1}) < λ_{i−1}(T_i) < ... < λ_2(T_i) < λ_1(T_{i−1}) < λ_1(T_i).

Moreover, letting for any real number μ

S_μ = {p_0(μ), p_1(μ), ..., p_n(μ)},

the number s(μ) of sign changes in S_μ yields the number of eigenvalues of T that are strictly less than μ, with the convention that p_i(μ) has sign opposite to that of p_{i−1}(μ) if p_i(μ) = 0 (two consecutive elements in the sequence cannot vanish at the same value of μ).
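The recurrence (5.65) and the sign-change count of Property 5.11 can be sketched together as follows; this is an illustrative Python counterpart of Programs 41 and 43 below, not the book's code (a zero entry is handled by the stated convention, so it contributes exactly one sign change).

```python
def sturm(dd, bb, x):
    """Evaluate the Sturm sequence (5.65) at x for the tridiagonal T
    with diagonal dd and off-diagonal bb."""
    p = [1.0, dd[0] - x]
    for i in range(1, len(dd)):
        p.append((dd[i] - x) * p[-1] - bb[i - 1] ** 2 * p[-2])
    return p

def sign_changes(dd, bb, x):
    """s(x): number of eigenvalues of T strictly less than x.
    A zero entry takes the sign opposite to its predecessor, hence
    counts as one change (Property 5.11)."""
    p = sturm(dd, bb, x)
    count = 0
    for prev, cur in zip(p, p[1:]):
        if cur == 0.0 or prev * cur < 0.0:
            count += 1
    return count

# T = tridiag_4(-1, 2, -1) has eigenvalues {0.38, 1.38, 2.62, 3.62};
# its Sturm sequence at x = 3 is {1, -1, 0, 1, -1}: three sign changes.
dd, bb = [2.0] * 4, [-1.0] * 3
assert sign_changes(dd, bb, 3.0) == 3
assert sign_changes(dd, bb, 0.0) == 0
```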

Example 5.16 Let T be the tridiagonal part of the Hilbert matrix H_4 ∈ R^(4×4), having entries h_ij = 1/(i+j−1). The eigenvalues of T are (to five significant figures) λ_1 = 1.2813, λ_2 = 0.4205, λ_3 = −0.1417 and λ_4 = 0.1161. Taking μ = 0, Program 41 computes the following Sturm sequence

S_0 = {p_0(0), p_1(0), p_2(0), p_3(0), p_4(0)} = {1, 1, 0.0833, −0.0458, −0.0089},

from which, applying Property 5.11, one concludes that the matrix T has one eigenvalue less than 0. In the case of the matrix T = tridiag_4(−1, 2, −1), with eigenvalues {0.38, 1.38, 2.62, 3.62} (to three significant figures), taking μ = 3 we get

{p_0(3), p_1(3), p_2(3), p_3(3), p_4(3)} = {1, −1, 0, 1, −1},

which shows that the matrix T has three eigenvalues less than 3, since three sign changes occur.

The Givens method for the calculation of the eigenvalues of T proceeds as follows. Letting b_0 = b_n = 0, Theorem 5.2 yields the interval J = [α, β] which contains the spectrum of T, where

α = min_{1≤i≤n} [d_i − (|b_{i−1}| + |b_i|)],   β = max_{1≤i≤n} [d_i + (|b_{i−1}| + |b_i|)].

The set J is used as an initial guess in the search for the generic eigenvalue λ_i of the matrix T, for i = 1, ..., n, using the bisection method (see Chapter 6).

Precisely, given a^(0) = α and b^(0) = β, we let c^(0) = (α + β)/2 and compute s(c^(0)); then, recalling Property 5.11, we let b^(1) = c^(0) if s(c^(0)) > (n − i), otherwise we set a^(1) = c^(0). After r iterations, the value c^(r) = (a^(r) + b^(r))/2 provides an approximation of λ_i within (|α| + |β|)·2^(−(r+1)), as is shown in (6.9).
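The bisection just described can be sketched as follows (a hypothetical Python analogue of Programs 42 and 44 below; ind = 1 selects the largest eigenvalue, matching the s(c) > n − ind test above).

```python
def sturm_count(dd, bb, x):
    """Number of eigenvalues of the tridiagonal T strictly below x,
    via sign changes in the Sturm sequence (5.65)."""
    n = len(dd)
    count, p_prev, p = 0, 1.0, dd[0] - x
    for i in range(1, n + 1):
        if p == 0.0 or p_prev * p < 0.0:
            count += 1
        if i < n:
            p_prev, p = p, (dd[i] - x) * p - bb[i - 1] ** 2 * p_prev
    return count

def givens_eigenvalue(dd, bb, ind, tol=1e-10):
    """Bisection on J = [alpha, beta] for the ind-th eigenvalue."""
    n = len(dd)
    b_ext = [0.0] + [abs(b) for b in bb] + [0.0]   # b_0 = b_n = 0
    a = min(dd[i] - (b_ext[i] + b_ext[i + 1]) for i in range(n))
    b = max(dd[i] + (b_ext[i] + b_ext[i + 1]) for i in range(n))
    while b - a > tol * (abs(a) + abs(b)):
        c = (a + b) / 2.0
        if sturm_count(dd, bb, c) > n - ind:
            b = c
        else:
            a = c
    return (a + b) / 2.0

# lambda_2 of tridiag_4(-1, 2, -1) is 2 + 2 cos(2 pi / 5) ~ 2.618
dd, bb = [2.0] * 4, [-1.0] * 3
lam2 = givens_eigenvalue(dd, bb, 2)
assert abs(lam2 - 2.6180) < 1e-3
```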

A systematic procedure can be set up to store any information about the position within the interval J of the eigenvalues of T that are being computed by the Givens method. The resulting algorithm generates a sequence of neighboring subintervals [a_j^(r), b_j^(r)], for j = 1, ..., n, each one of arbitrarily small length and containing one eigenvalue λ_j of T (for further details, see [BMW67]).

Example 5.17 Let us employ the Givens method to compute the eigenvalue λ_2 ≃ 2.62 of the matrix T considered in Example 5.16. Letting toll = 10^-4 in Program 42, we obtain the results reported in Table 5.4, which demonstrate the convergence of the sequence c^(k) to the desired eigenvalue in 13 iterations. We have denoted, for brevity, s^(k) = s(c^(k)). Similar results are obtained by running Program 42 to compute the remaining eigenvalues of T.

k   a^(k)    b^(k)   c^(k)    s^(k)    k    a^(k)    b^(k)   c^(k)    s^(k)
0   0        4.000   2.0000   2        7    2.5938   2.625   2.6094   2
1   2.0000   4.000   3.0000   3        8    2.6094   2.625   2.6172   2
2   2.0000   3.000   2.5000   2        9    2.6094   2.625   2.6172   2
3   2.5000   3.000   2.7500   3        10   2.6172   2.625   2.6211   3
4   2.5000   2.750   2.6250   3        11   2.6172   2.621   2.6191   3
5   2.5000   2.625   2.5625   2        12   2.6172   2.619   2.6182   3
6   2.5625   2.625   2.5938   2        13   2.6172   2.618   2.6177   2

TABLE 5.4. Convergence of the Givens method for the calculation of the eigenvalue λ_2 of the matrix T in Example 5.16


An implementation of the polynomial evaluation (5.65) is given in Program 41. This program receives in input the vectors dd and bb containing the main diagonal and the upper subdiagonal of T. The output values p_i(x) are stored, for i = 0, ..., n, in the vector p.

Program 41 - sturm : Sturm sequence evaluation

function [p]=sturm(dd,bb,x)
n=length(dd); p(1)=1; p(2)=dd(1)-x;
for i=2:n, p(i+1)=(dd(i)-x)*p(i)-bb(i-1)^2*p(i-1); end

A basic implementation of the Givens method is provided in Program

42. In input, ind is the pointer to the searched eigenvalue, while the other

parameters are similar to those in Program 41. In output the values of

the elements of sequences a(k) , b(k) and c(k) are returned, together with

the required number of iterations niter and the sequence of sign changes

s(c(k) ).

Program 42 - givsturm : Givens method using the Sturm sequence

function [ak,bk,ck,nch,niter]=givsturm(dd,bb,ind,toll)
[a,b]=bound(dd,bb); dist=abs(b-a); s=abs(b)+abs(a);
n=length(dd); niter=0; nch=[];
while (dist > (toll*s)),
  niter=niter+1; c=(b+a)/2;
  ak(niter)=a; bk(niter)=b; ck(niter)=c;
  nch(niter)=chcksign(dd,bb,c);
  if (nch(niter) > (n-ind)), b=c;
  else, a=c; end
  dist=abs(b-a); s=abs(b)+abs(a);
end

Program 43 - chcksign : Sign changes in the Sturm sequence

function nch=chcksign(dd,bb,x)

[p]=sturm(dd,bb,x); n=length(dd); nch=0; s=0;

for i=2:(n+1),

if ((p(i)*p(i-1)) <= 0), nch=nch+1; end

if (p(i)==0), s=s+1; end

end

nch=nch-s;

Program 44 - bound : Calculation of the interval J = [α, β]

function [alfa,beta]=bound(dd,bb)
n=length(dd); alfa=dd(1)-abs(bb(1)); temp=dd(n)-abs(bb(n-1));
if (temp < alfa), alfa=temp; end
for i=2:(n-1),
  temp=dd(i)-abs(bb(i-1))-abs(bb(i));
  if (temp < alfa), alfa=temp; end
end
beta=dd(1)+abs(bb(1)); temp=dd(n)+abs(bb(n-1));
if (temp > beta), beta=temp; end
for i=2:(n-1),
  temp=dd(i)+abs(bb(i-1))+abs(bb(i));
  if (temp > beta), beta=temp; end
end

5.11 The Lanczos Method

Let A ∈ R^(n×n) be a symmetric sparse matrix, whose (real) eigenvalues are ordered as

λ_1 ≥ λ_2 ≥ ... ≥ λ_{n−1} ≥ λ_n.   (5.66)

When n is very large, the Lanczos method [Lan50] described in Section 4.4.3 can be applied to approximate the extremal eigenvalues λ_n and λ_1. It generates a sequence of tridiagonal matrices H_m whose extremal eigenvalues rapidly converge to the extremal eigenvalues of A.

To estimate the convergence of the tridiagonalization process, we introduce the Rayleigh quotient r(x) = (x^T A x)/(x^T x) associated with a nonnull vector x ∈ R^n. The following result, known as the Courant-Fischer theorem, holds (for the proof see [GL89], p. 394):

λ_1(A) = max_{x∈R^n, x≠0} r(x),   λ_n(A) = min_{x∈R^n, x≠0} r(x).

Its application to the matrix H_m = V_m^T A V_m yields

λ_1(H_m) = max_{x∈R^m, x≠0} ((V_m x)^T A (V_m x))/(x^T x) = max_{‖x‖_2=1} r(V_m x) ≤ λ_1(A),
                                                                                             (5.67)
λ_m(H_m) = min_{x∈R^m, x≠0} ((V_m x)^T A (V_m x))/(x^T x) = min_{‖x‖_2=1} r(V_m x) ≥ λ_n(A).

At each step of the Lanczos method, the estimates (5.67) provide a lower

and upper bound for the extremal eigenvalues of A. The convergence of the