In MATLAB, the generalized eigenvalue problem for the pair (A,B) can be solved using the command eig(A,B).


5.9.2 Generalized Real Schur Form of Symmetric-Definite Pencils

A remarkable situation occurs when both A and B are symmetric, and one
of them, say B, is also positive definite. In such a case, the pair (A,B) forms
a symmetric-definite pencil for which the following result holds.

Theorem 5.7 The symmetric-definite pencil (A,B) has real eigenvalues
and linearly independent eigenvectors. Moreover, the matrices A and B can
be simultaneously diagonalized. Precisely, there exists a nonsingular matrix
X ∈ R^{n×n} such that

   X^T A X = Λ = diag(λ_1, λ_2, . . . , λ_n),   X^T B X = I_n,

where for i = 1, . . . , n, λ_i are the eigenvalues of the pencil (A,B).
Proof. Since B is symmetric positive definite, it admits a unique Cholesky fac-
torization B = H^T H, where H is upper triangular (see Section 3.4.2). From (5.58)
we deduce that Cz = λz with C = H^{-T} A H^{-1}, z = Hx, where (λ, x) is an
eigenvalue/eigenvector pair of (A,B).
The matrix C is symmetric; therefore, its eigenvalues are real and a set of
orthonormal eigenvectors (y_1, . . . , y_n) = Y exists. As a consequence, letting X =
H^{-1} Y allows for simultaneously diagonalizing both A and B, since

   X^T A X = Y^T H^{-T} A H^{-1} Y = Y^T C Y = Λ = diag(λ_1, . . . , λ_n),

   X^T B X = Y^T H^{-T} B H^{-1} Y = Y^T Y = I_n.   □

The following QR-Cholesky algorithm computes the eigenvalues λ_i and
the corresponding eigenvectors x_i of a symmetric-definite pencil (A,B), for
i = 1, . . . , n (see for more details [GL89], Section 8.7, [Dat95], Section 9.5):
1. compute the Cholesky factorization B = H^T H;
2. compute C = H^{-T} A H^{-1};
3. for i = 1, . . . , n, compute the eigenvalues λ_i and eigenvectors z_i of the
symmetric matrix C using the QR iteration. Then construct from the
set {z_i} an orthonormal set of eigenvectors {y_i} (using, for instance,
the modified Gram-Schmidt procedure of Section 3.4.3);
4. for i = 1, . . . , n, compute the eigenvectors x_i of the pencil (A,B) by
solving the systems Hx_i = y_i.

This algorithm requires an order of 14n³ flops and it can be shown (see
[GL89], p. 464) that, if λ̂ is a computed eigenvalue, then

   λ̂ ∈ σ(H^{-T} A H^{-1} + E),   with ‖E‖_2 ≃ u ‖A‖_2 ‖B^{-1}‖_2,

u being the roundoff unit.

Thus, the generalized eigenvalue problem in the symmetric-definite case
may become unstable with respect to the propagation of rounding errors if
B is ill-conditioned. For a stabilized version of the QR-Cholesky method,
see [GL89], p. 464 and the references cited therein.
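In MATLAB, the algorithm can be condensed into a few lines. The following is a minimal sketch (the function name is illustrative, the built-in eig replaces a hand-coded QR iteration on C, and this is not the stabilized variant mentioned above):

function [lambda,X]=qrcholpencil(A,B)
% QR-Cholesky sketch for a symmetric-definite pencil (A,B):
% A symmetric and B symmetric positive definite are assumed.
H=chol(B);           % step 1: Cholesky factorization B = H'*H, H upper triangular
C=(H')\A/H;          % step 2: C = H^{-T} A H^{-1}
C=(C+C')/2;          % enforce symmetry against roundoff
[Y,L]=eig(C);        % step 3: eigenpairs of the symmetric matrix C
lambda=diag(L);      % eigenvalues of the pencil (A,B)
X=H\Y;               % step 4: eigenvectors, solving H*x_i = y_i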


5.10 Methods for Eigenvalues of Symmetric Matrices
In this section we deal with the computation of the eigenvalues of a sym-
metric matrix A ∈ R^{n×n}. Besides the QR method previously examined,
specific algorithms which take advantage of the symmetry of A are avail-
able.
Among these, we first consider the Jacobi method, which generates a
sequence of matrices orthogonally similar to A and converging to the diag-
onal Schur form of A. Then, the Sturm sequence and Lanczos procedures
are presented, for handling the case of tridiagonal matrices and of large
sparse matrices, respectively.


5.10.1 The Jacobi Method
The Jacobi method generates a sequence of matrices A^(k) that are orthog-
onally similar to the matrix A and converge to a diagonal matrix whose
entries are the eigenvalues of A. This is done using the Givens similarity
transformations (5.43) as follows.
Given A^(0) = A, for any k = 1, 2, . . . , a pair of indices p and q is fixed,
with 1 ≤ p < q ≤ n. Next, letting G_pq = G(p, q, θ), the matrix A^(k) =
(G_pq)^T A^(k-1) G_pq, orthogonally similar to A, is constructed in such a way
that

   a_ij^(k) = 0   if (i, j) = (p, q).   (5.59)
Letting c = cos θ and s = sin θ, the procedure for computing the entries of
A^(k) that are changed with respect to those of A^(k-1) can be written as

   [ a_pp^(k)  a_pq^(k) ]   [  c  s ]^T [ a_pp^(k-1)  a_pq^(k-1) ] [  c  s ]
   [ a_pq^(k)  a_qq^(k) ] = [ -s  c ]   [ a_pq^(k-1)  a_qq^(k-1) ] [ -s  c ].   (5.60)

If a_pq^(k-1) = 0, we can satisfy (5.59) by taking c = 1 and s = 0. If
a_pq^(k-1) ≠ 0, letting t = s/c, (5.60) requires the solution of the following
algebraic equation

   t² + 2ηt − 1 = 0,   η = (a_qq^(k-1) − a_pp^(k-1)) / (2 a_pq^(k-1)).   (5.61)

The root t = 1/(η + √(1 + η²)) is chosen in (5.61) if η ≥ 0, otherwise we
take t = −1/(−η + √(1 + η²)); in both cases this is the root of smaller
modulus, corresponding to a rotation angle |θ| ≤ π/4. Next, we let

   c = 1/√(1 + t²),   s = ct.   (5.62)

To examine the rate at which the off-diagonal entries of A^(k) tend to zero,
it is convenient to introduce, for any matrix M ∈ R^{n×n}, the nonnegative
quantity

   Ψ(M) = ( Σ_{i,j=1, i≠j}^n  m_ij² )^{1/2} = ( ‖M‖_F² − Σ_{i=1}^n m_ii² )^{1/2}.   (5.63)

The Jacobi method ensures that Ψ(A^(k)) ≤ Ψ(A^(k-1)) for any k ≥ 1.
Indeed, the computation of (5.63) for the matrix A^(k) yields

   (Ψ(A^(k)))² = (Ψ(A^(k-1)))² − 2 (a_pq^(k-1))² ≤ (Ψ(A^(k-1)))².   (5.64)


The estimate (5.64) suggests that, at each step k, the optimal choice of the
indices p and q is the one corresponding to the entry of A^(k-1) such that

   |a_pq^(k-1)| = max_{i≠j} |a_ij^(k-1)|.

The computational cost of this strategy is of the order of n² flops for the
search of the entry of maximum modulus, while the updating step A^(k) =
(G_pq)^T A^(k-1) G_pq only requires a cost of the order of n flops, as already
noticed in Section 5.6.5. It is thus convenient to resort to the so-called row
cyclic Jacobi method, in which the choice of the indices p and q is done by
a row-sweeping of the matrix A^(k) according to the following algorithm: for
any k = 1, 2, . . . and for any i-th row of A^(k) (i = 1, . . . , n − 1), we set
p = i and q = (i + 1), . . . , n. Each complete sweep requires N = n(n − 1)/2
Jacobi transformations. Assuming that |λ_i − λ_j| ≥ δ for i ≠ j, it can be
shown that the cyclic Jacobi method converges quadratically, that is (see
[Wil65], [Wil62])

   Ψ(A^(k+N)) ≤ (1/(√2 δ)) (Ψ(A^(k)))²,   k = 1, 2, . . .

For further details on the algorithm, we refer to [GL89], Section 8.4.

Example 5.15 Let us apply the cyclic Jacobi method to the Hilbert matrix H_4,
whose eigenvalues read (to five significant figures) λ_1 = 1.5002, λ_2 = 1.6914·10^{-1},
λ_3 = 6.7383·10^{-3} and λ_4 = 9.6702·10^{-5}. Running Program 40 with toll =
10^{-15}, the method converges in 3 sweeps to a matrix whose diagonal entries
coincide with the eigenvalues of H_4 up to 4.4409·10^{-16}. As for the off-diagonal
entries, the values attained by Ψ(H_4^(k)) are reported in Table 5.3.

   Sweep    Ψ(H_4^(k))
     1      5.262·10^{-2}
     2      3.824·10^{-5}
     3      5.313·10^{-16}

TABLE 5.3. Convergence of the cyclic Jacobi algorithm


Formulae (5.63) and (5.62) are implemented in Programs 38 and 39.

Program 38 - psinorm : Evaluation of Ψ(A)

function [psi]=psinorm(A)
n=max(size(A)); psi=0;
for i=1:(n-1), for j=(i+1):n, psi=psi+A(i,j)^2+A(j,i)^2; end; end; psi=sqrt(psi);


Program 39 - symschur : Evaluation of c and s

function [c,s]=symschur(A,p,q)
if (A(p,q)==0), c=1; s=0; else,
eta=(A(q,q)-A(p,p))/(2*A(p,q));
if (eta >= 0), t=1/(eta+sqrt(1+eta^2));
else, t=-1/(-eta+sqrt(1+eta^2)); end; c=1/sqrt(1+t^2); s=c*t;
end
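As a quick check (a usage sketch with an arbitrary 3×3 matrix and an arbitrary pair of indices), one can verify that the rotation built from the values returned by symschur annihilates the selected off-diagonal entry:

A=hilb(3); p=1; q=2;
[c,s]=symschur(A,p,q);
G=eye(3); G([p q],[p q])=[c s; -s c];   % the Givens rotation G(p,q,theta) of (5.43)
B=G'*A*G;                               % B(p,q) vanishes up to roundoff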

A coding of the cyclic Jacobi method is implemented in Program 40.
This program gets as input parameters the symmetric matrix A ∈ R^{n×n}
and a tolerance toll. The program returns a matrix D = G^T A G, G be-
ing orthogonal, such that Ψ(D) ≤ toll·‖A‖_F, the value of Ψ(D) and the
number of sweeps needed to achieve convergence.

Program 40 - cycjacobi : Cyclic Jacobi method for symmetric matrices

function [D,sweep,psi]=cycjacobi(A,toll)
n=max(size(A)); D=A; psiD=norm(A,'fro');
epsi=toll*psiD; psiD=psinorm(D); psi=psiD; sweep=0;
while (psiD > epsi), sweep=sweep+1;
for p=1:(n-1), for q=(p+1):n
[c,s]=symschur(D,p,q); D=gacol(D,c,s,1,n,p,q); D=garow(D,c,s,p,q,1,n);
end; end; psiD=psinorm(D); psi=[psi; psiD];
end
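For instance, Example 5.15 can be reproduced as follows (a usage sketch, assuming the functions gacol and garow of Section 5.6.5 are available):

A=hilb(4);                      % the Hilbert matrix H_4
[D,sweep,psi]=cycjacobi(A,1e-15);
sort(diag(D),'descend')         % approximations of the eigenvalues of H_4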

5.10.2 The Method of Sturm Sequences
In this section we deal with the calculation of the eigenvalues of a real,
tridiagonal and symmetric matrix T. Typical instances of such a problem
arise when applying the Householder transformation to a given symmetric
matrix A (see Section 5.6.2) or when solving boundary value problems in
one spatial dimension (see Section 5.12.1 for an example).
We analyze the method of Sturm sequences, or Givens method, introduced
in [Giv54]. For i = 1, . . . , n, we denote by d_i the diagonal entries of T and by
b_i, i = 1, . . . , n − 1, the elements of the upper and lower subdiagonals of T.
We shall assume that b_i ≠ 0 for any i; otherwise, indeed, the computation
reduces to problems of lower complexity.
Letting T_i be the principal minor of order i of the matrix T and p_0(x) = 1,
we define for i = 1, . . . , n the following sequence of polynomials p_i(x) =
det(T_i − xI_i):

   p_1(x) = d_1 − x,
   p_i(x) = (d_i − x) p_{i-1}(x) − b_{i-1}² p_{i-2}(x),   i = 2, . . . , n.   (5.65)

It can be checked that p_n is the characteristic polynomial of T; the com-
putational cost of its evaluation at the point x is of the order of 2n flops.
The sequence (5.65) is called the Sturm sequence, owing to the following
result, for whose proof we refer to [Wil65], Chapter 2, Section 47 and
Chapter 5, Section 37.

Property 5.11 (of the Sturm sequence) For i = 2, . . . , n, the eigenvalues
of T_{i-1} strictly separate those of T_i, that is

   λ_i(T_i) < λ_{i-1}(T_{i-1}) < λ_{i-1}(T_i) < . . . < λ_2(T_i) < λ_1(T_{i-1}) < λ_1(T_i).

Moreover, letting for any real number µ

   S_µ = {p_0(µ), p_1(µ), . . . , p_n(µ)},

the number s(µ) of sign changes in S_µ yields the number of eigenvalues of T
that are strictly less than µ, with the convention that p_i(µ) has opposite sign
to p_{i-1}(µ) if p_i(µ) = 0 (two consecutive elements in the sequence cannot
vanish at the same value of µ).

Example 5.16 Let T be the tridiagonal part of the Hilbert matrix H_4 ∈ R^{4×4},
having entries h_ij = 1/(i + j − 1). The eigenvalues of T are (to five significant
figures) λ_1 = 1.2813, λ_2 = 0.4205, λ_3 = 0.1161 and λ_4 = −0.1417. Taking µ = 0,
Program 41 computes the following Sturm sequence

   S_0 = {p_0(0), p_1(0), p_2(0), p_3(0), p_4(0)} = {1, 1, 0.0833, −0.0458, −0.0089},

from which, applying Property 5.11, one concludes that the matrix T has one eigen-
value less than 0. In the case of the matrix T = tridiag_4(−1, 2, −1), with eigenvalues
{0.38, 1.38, 2.62, 3.62} (to three significant figures), we get, taking µ = 3,

   {p_0(3), p_1(3), p_2(3), p_3(3), p_4(3)} = {1, −1, 0, 1, −1},

which shows that the matrix T has three eigenvalues less than 3, since three sign
changes occur.

The Givens method for the calculation of the eigenvalues of T proceeds as
follows. Letting b_0 = b_n = 0, Theorem 5.2 yields the interval J = [α, β]
which contains the spectrum of T, where

   α = min_{1≤i≤n} [d_i − (|b_{i-1}| + |b_i|)],   β = max_{1≤i≤n} [d_i + (|b_{i-1}| + |b_i|)].

The set J is used as an initial guess in the search for the generic eigenvalue
λ_i of the matrix T, for i = 1, . . . , n, using the bisection method (see Chapter
6).
Precisely, given a^(0) = α and b^(0) = β, we let c^(0) = (α + β)/2 and
compute s(c^(0)); then, recalling Property 5.11, we let b^(1) = c^(0) if s(c^(0)) >
(n − i), otherwise we set a^(1) = c^(0). After r iterations, the value c^(r) =
(a^(r) + b^(r))/2 provides an approximation of λ_i within (|α| + |β|)·2^{-(r+1)},
as is shown in (6.9).
A systematic procedure can be set up to store any information about
the position within the interval J of the eigenvalues of T that are being
computed by the Givens method. The resulting algorithm generates a se-
quence of neighboring subintervals [a_j^(r), b_j^(r)], for j = 1, . . . , n, each one of
arbitrarily small length and containing one eigenvalue λ_j of T (for further
details, see [BMW67]).

Example 5.17 Let us employ the Givens method to compute the eigenvalue
λ_2 ≃ 2.62 of the matrix T = tridiag_4(−1, 2, −1) considered in Example 5.16.
Letting toll = 10^{-4} in Program 42, we obtain the results reported in Table 5.4,
which demonstrate the convergence of the sequence c^(k) to the desired eigenvalue
in 13 iterations. We have denoted, for brevity, s^(k) = s(c^(k)). Similar results are
obtained by running Program 42 to compute the remaining eigenvalues of T.

    k    a^(k)     b^(k)    c^(k)    s^(k)       k    a^(k)     b^(k)    c^(k)    s^(k)
    0    0         4.000    2.0000   2           7    2.5938    2.625    2.6094   2
    1    2.0000    4.000    3.0000   3           8    2.6094    2.625    2.6172   2
    2    2.0000    3.000    2.5000   2           9    2.6094    2.625    2.6172   2
    3    2.5000    3.000    2.7500   3          10    2.6172    2.625    2.6211   3
    4    2.5000    2.750    2.6250   3          11    2.6172    2.621    2.6191   3
    5    2.5000    2.625    2.5625   2          12    2.6172    2.619    2.6182   3
    6    2.5625    2.625    2.5938   2          13    2.6172    2.618    2.6177   2

TABLE 5.4. Convergence of the Givens method for the calculation of the eigen-
value λ_2 of the matrix T in Example 5.16

An implementation of the evaluation of the polynomials (5.65) is given in
Program 41. This program receives as input the vectors dd and bb containing
the main diagonal and the superdiagonal of T. The output values p_i(x) are
stored, for i = 0, . . . , n, in the vector p.

Program 41 - sturm : Sturm sequence evaluation
function [p]=sturm(dd,bb,x)
n=length(dd); p(1)=1; p(2)=dd(1)-x;
for i=2:n, p(i+1)=(dd(i)-x)*p(i)-bb(i-1)^2*p(i-1); end
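As a usage sketch, the Sturm sequence of Example 5.16 is obtained as follows:

H=hilb(4);
dd=diag(H);          % main diagonal of the tridiagonal part of H_4
bb=diag(H,1);        % superdiagonal
p=sturm(dd,bb,0)     % returns {1, 1, 0.0833, -0.0458, -0.0089}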

A basic implementation of the Givens method is provided in Program
42. On input, ind is the index of the sought eigenvalue, while the other
parameters are analogous to those of Program 41. On output, the elements
of the sequences a^(k), b^(k) and c^(k) are returned, together with the
required number of iterations niter and the sequence of sign changes
s(c^(k)).

Program 42 - givsturm : Givens method using the Sturm sequence
function [ak,bk,ck,nch,niter]=givsturm(dd,bb,ind,toll)
[a,b]=bound(dd,bb); dist=abs(b-a); s=abs(b)+abs(a);
n=length(dd); niter=0; nch=[];
while (dist > (toll*s)),
niter=niter+1; c=(b+a)/2;
ak(niter)=a; bk(niter)=b; ck(niter)=c;
nch(niter)=chcksign(dd,bb,c);
if (nch(niter) > (n-ind)), b=c;
else, a=c; end; dist=abs(b-a); s=abs(b)+abs(a);
end
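For instance, the computation of Example 5.17 corresponds to the following call (a usage sketch; Programs 41, 43 and 44 must be available):

dd=2*ones(4,1); bb=-ones(3,1);               % T = tridiag_4(-1,2,-1)
[ak,bk,ck,nch,niter]=givsturm(dd,bb,2,1e-4);
ck(niter)                                    % approximates lambda_2 = 2.618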


Program 43 - chcksign : Sign changes in the Sturm sequence
function nch=chcksign(dd,bb,x)
[p]=sturm(dd,bb,x); n=length(dd); nch=0; s=0;
for i=2:(n+1),
if ((p(i)*p(i-1)) <= 0), nch=nch+1; end
if (p(i)==0), s=s+1; end
end
nch=nch-s;


Program 44 - bound : Calculation of the interval J = [±, β]
function [alfa,beta]=bound(dd,bb)
n=length(dd); alfa=dd(1)-abs(bb(1)); temp=dd(n)-abs(bb(n-1));
if (temp < alfa), alfa=temp; end;
for i=2:(n-1),
temp=dd(i)-abs(bb(i-1))-abs(bb(i));

if (temp < alfa), alfa=temp; end;
end
beta=dd(1)+abs(bb(1)); temp=dd(n)+abs(bb(n-1));
if (temp > beta), beta=temp; end;
for i=2:(n-1),
temp=dd(i)+abs(bb(i-1))+abs(bb(i));
if (temp > beta), beta=temp; end;
end



5.11 The Lanczos Method
Let A ∈ R^{n×n} be a symmetric sparse matrix, whose (real) eigenvalues are
ordered as

   λ_1 ≥ λ_2 ≥ . . . ≥ λ_{n-1} ≥ λ_n.   (5.66)

When n is very large, the Lanczos method [Lan50] described in Section
4.4.3 can be applied to approximate the extremal eigenvalues λ_n and λ_1. It
generates a sequence of tridiagonal matrices H_m whose extremal eigenvalues
rapidly converge to the extremal eigenvalues of A.
To estimate the convergence of the tridiagonalization process, we intro-
duce the Rayleigh quotient r(x) = (x^T A x)/(x^T x) associated with a nonnull
vector x ∈ R^n. The following result, known as the Courant-Fischer Theorem,
holds (for the proof see [GL89], p. 394):

   λ_1(A) = max_{x∈R^n, x≠0} r(x),   λ_n(A) = min_{x∈R^n, x≠0} r(x).

Its application to the matrix H_m = V_m^T A V_m yields

   λ_1(H_m) = max_{x≠0} ((V_m x)^T A (V_m x))/(x^T x) = max_{‖x‖_2=1} r(V_m x) ≤ λ_1(A),
                                                                                (5.67)
   λ_m(H_m) = min_{x≠0} ((V_m x)^T A (V_m x))/(x^T x) = min_{‖x‖_2=1} r(V_m x) ≥ λ_n(A).
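A minimal MATLAB sketch of this procedure follows (illustrative only: it assumes A symmetric, uses a random starting vector, performs no reorthogonalization and assumes no breakdown, i.e. beta(j) ≠ 0; the function name is hypothetical):

function [lam1,lamm]=lanczosbounds(A,m)
% Build H_m = V_m^T A V_m by m Lanczos steps (Section 4.4.3) and return
% its extremal eigenvalues, which by (5.67) satisfy
% lam1 <= lambda_1(A) and lamm >= lambda_n(A).
n=size(A,1); V=zeros(n,m);
v=randn(n,1); V(:,1)=v/norm(v);            % random unit starting vector
alpha=zeros(m,1); beta=zeros(m,1);
for j=1:m
  w=A*V(:,j);
  if j>1, w=w-beta(j-1)*V(:,j-1); end
  alpha(j)=V(:,j)'*w;
  w=w-alpha(j)*V(:,j);
  if j<m
    beta(j)=norm(w);                       % assumed nonzero (no breakdown)
    V(:,j+1)=w/beta(j);
  end
end
Hm=diag(alpha)+diag(beta(1:m-1),1)+diag(beta(1:m-1),-1);
ev=sort(eig(Hm)); lamm=ev(1); lam1=ev(end);

A call such as [l1,lm]=lanczosbounds(sprandsym(1000,0.01),20) then returns a lower bound l1 for λ_1(A) and an upper bound lm for λ_n(A), which improve as m grows.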


At each step of the Lanczos method, the estimates (5.67) provide a lower
and upper bound for the extremal eigenvalues of A. The convergence of the
