
Theorem 5.5 Let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix and let $(\hat{\lambda}, \hat{x})$ be the computed approximations of an eigenvalue/eigenvector pair $(\lambda, x)$ of A. Defining the residual as

$$r = A\hat{x} - \hat{\lambda}\hat{x}, \qquad \hat{x} \neq 0,$$

it then follows that

$$\min_{\lambda_i \in \sigma(A)} |\hat{\lambda} - \lambda_i| \le \frac{\|r\|_2}{\|\hat{x}\|_2}. \tag{5.13}$$

Proof. Since A is Hermitian, it admits a system of orthonormal eigenvectors $\{u_k\}$ which can be taken as a basis of $\mathbb{C}^n$. In particular, $\hat{x} = \sum_{i=1}^{n} \alpha_i u_i$ with $\alpha_i = u_i^H \hat{x}$, and thus $r = \sum_{i=1}^{n} \alpha_i (\lambda_i - \hat{\lambda}) u_i$. As a consequence

$$\frac{\|r\|_2^2}{\|\hat{x}\|_2^2} = \sum_{i=1}^{n} \beta_i (\lambda_i - \hat{\lambda})^2, \qquad \text{with } \beta_i = |\alpha_i|^2 \Big/ \sum_{j=1}^{n} |\alpha_j|^2. \tag{5.14}$$

Since $\sum_{i=1}^{n} \beta_i = 1$, the inequality (5.13) immediately follows from (5.14). $\Box$

The estimate (5.13) ensures that to a small relative residual there corresponds a small absolute error in the approximation of the eigenvalue of the matrix A which is closest to $\hat{\lambda}$.
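
As a quick numerical check of (5.13), the following MATLAB fragment (the matrix size and perturbation levels are arbitrary choices) perturbs an exact eigenpair of a random Hermitian matrix and compares the eigenvalue error with the residual bound:

```matlab
% Verify the a posteriori bound (5.13) on a random Hermitian matrix
n = 50;
B = randn(n) + 1i*randn(n);
A = (B + B')/2;                      % Hermitian by construction
[V, D] = eig(A);
lam_hat = D(1,1) + 1e-6;             % perturbed eigenvalue
x_hat   = V(:,1) + 1e-6*randn(n,1);  % perturbed eigenvector
r = A*x_hat - lam_hat*x_hat;         % residual
err   = min(abs(lam_hat - diag(D))); % left-hand side of (5.13)
bound = norm(r)/norm(x_hat);         % right-hand side of (5.13)
fprintf('error = %8.2e <= bound = %8.2e\n', err, bound);
```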
Let us now consider the following a posteriori estimate for the eigenvector $\hat{x}$ (for the proof, see [IK66], pp. 142-143).

Property 5.6 Under the same assumptions of Theorem 5.5, suppose that $|\lambda_i - \hat{\lambda}| \le \|r\|_2$ for $i = 1, \ldots, m$ and that $|\lambda_i - \hat{\lambda}| \ge \delta > 0$ for $i = m+1, \ldots, n$. Then

$$d(\hat{x}, U_m) \le \frac{\|r\|_2}{\delta}, \tag{5.15}$$

where $d(\hat{x}, U_m)$ is the Euclidean distance between $\hat{x}$ and the space $U_m$ generated by the eigenvectors $u_i$, $i = 1, \ldots, m$, associated with the eigenvalues $\lambda_i$ of A.

Notice that the a posteriori estimate (5.15) ensures that a small residual corresponds to a small absolute error in the approximation of the eigenvector associated with the eigenvalue of A that is closest to $\hat{\lambda}$, provided that the eigenvalues of A are well-separated (that is, if $\delta$ is sufficiently large).
In the general case of a non-Hermitian matrix A, an a posteriori estimate can be given for the eigenvalue $\hat{\lambda}$ only when the matrix of the eigenvectors of A is available. We have the following result (for the proof, we refer to [IK66], p. 146).

Property 5.7 Let $A \in \mathbb{C}^{n\times n}$ be a diagonalizable matrix, with matrix of eigenvectors $X = [x_1, \ldots, x_n]$. If, for some $\varepsilon > 0$,

$$\|r\|_2 \le \varepsilon \|\hat{x}\|_2,$$

then

$$\min_{\lambda_i \in \sigma(A)} |\hat{\lambda} - \lambda_i| \le \varepsilon \|X^{-1}\|_2 \|X\|_2.$$

This estimate is of little practical use, since it requires the knowledge of all
the eigenvectors of A. Examples of a posteriori estimates that can actually
be implemented in a numerical algorithm will be provided in Sections 5.3.1
and 5.3.2.


5.3 The Power Method
The power method is very good at approximating the extremal eigenvalues of the matrix, that is, the eigenvalues having largest and smallest modulus, denoted by $\lambda_1$ and $\lambda_n$ respectively, as well as their associated eigenvectors. Solving such a problem is of great interest in several real-life applications (geoseismics, machine and structural vibrations, electric network analysis, quantum mechanics, ...) where the computation of $\lambda_n$ (and its associated eigenvector $x_n$) arises in the determination of the proper frequency (and the corresponding fundamental mode) of a given physical system. We shall come back to this point in Section 5.12.
Having approximations of $\lambda_1$ and $\lambda_n$ can also be useful in the analysis of numerical methods. For instance, if A is symmetric and positive definite, one can compute the optimal value of the acceleration parameter of the Richardson method and estimate its error reducing factor (see Chapter 4), as well as perform the stability analysis of discretization methods for systems of ordinary differential equations (see Chapter 11).
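
For reference, recalling the stationary Richardson analysis of Chapter 4 (with the notation of this section, where $\lambda_1$ and $\lambda_n$ are the extremal eigenvalues of the symmetric positive definite matrix A), the optimal acceleration parameter and the corresponding error reduction factor can be written as

$$\alpha_{\mathrm{opt}} = \frac{2}{\lambda_n + \lambda_1}, \qquad \rho_{\mathrm{opt}} = \frac{\lambda_1 - \lambda_n}{\lambda_1 + \lambda_n},$$

so that approximations of $\lambda_1$ and $\lambda_n$ directly yield approximations of both quantities.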


5.3.1 Approximation of the Eigenvalue of Largest Module
Let $A \in \mathbb{C}^{n\times n}$ be a diagonalizable matrix and let $X \in \mathbb{C}^{n\times n}$ be the matrix of its eigenvectors $x_i$, for $i = 1, \ldots, n$. Let us also suppose that the eigenvalues of A are ordered as

$$|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \ldots \ge |\lambda_n|, \tag{5.16}$$

where $\lambda_1$ has algebraic multiplicity equal to 1. Under these assumptions, $\lambda_1$ is called the dominant eigenvalue of the matrix A.
Given an arbitrary initial vector $q^{(0)} \in \mathbb{C}^n$ of unit Euclidean norm, consider for $k = 1, 2, \ldots$ the following iteration based on the computation of powers of matrices, commonly known as the power method:

$$
\begin{aligned}
z^{(k)} &= A q^{(k-1)}, \\
q^{(k)} &= z^{(k)} / \|z^{(k)}\|_2, \\
\nu^{(k)} &= (q^{(k)})^H A q^{(k)}.
\end{aligned} \tag{5.17}
$$

Let us analyze the convergence properties of method (5.17). By induction on k one can check that

$$q^{(k)} = \frac{A^k q^{(0)}}{\|A^k q^{(0)}\|_2}, \qquad k \ge 1. \tag{5.18}$$

This relation explains the role played by the powers of A in the method.
Because A is diagonalizable, its eigenvectors $x_i$ form a basis of $\mathbb{C}^n$; it is thus possible to represent $q^{(0)}$ as

$$q^{(0)} = \sum_{i=1}^{n} \alpha_i x_i, \qquad \alpha_i \in \mathbb{C}, \quad i = 1, \ldots, n. \tag{5.19}$$

Moreover, since $A x_i = \lambda_i x_i$, we have

$$A^k q^{(0)} = \alpha_1 \lambda_1^k \left( x_1 + \sum_{i=2}^{n} \frac{\alpha_i}{\alpha_1} \left( \frac{\lambda_i}{\lambda_1} \right)^k x_i \right), \qquad k = 1, 2, \ldots \tag{5.20}$$


Since $|\lambda_i/\lambda_1| < 1$ for $i = 2, \ldots, n$, as k increases the vector $A^k q^{(0)}$ (and thus also $q^{(k)}$, due to (5.18)) tends to assume an increasingly significant component in the direction of the eigenvector $x_1$, while its components in the other directions $x_j$ decrease. Using (5.18) and (5.20), we get

$$q^{(k)} = \frac{\alpha_1 \lambda_1^k (x_1 + y^{(k)})}{\|\alpha_1 \lambda_1^k (x_1 + y^{(k)})\|_2} = \mu_k \frac{x_1 + y^{(k)}}{\|x_1 + y^{(k)}\|_2},$$

where $\mu_k$ is the sign of $\alpha_1 \lambda_1^k$ and $y^{(k)}$ denotes a vector that vanishes as $k \to \infty$.
As $k \to \infty$, the vector $q^{(k)}$ thus aligns itself along the direction of the eigenvector $x_1$, and the following error estimate holds at each step k.

Theorem 5.6 Let $A \in \mathbb{C}^{n\times n}$ be a diagonalizable matrix whose eigenvalues satisfy (5.16). Assuming that $\alpha_1 \neq 0$, there exists a constant $C > 0$ such that

$$\|\tilde{q}^{(k)} - x_1\|_2 \le C \left| \frac{\lambda_2}{\lambda_1} \right|^k, \qquad k \ge 1, \tag{5.21}$$

where

$$\tilde{q}^{(k)} = \frac{q^{(k)} \|A^k q^{(0)}\|_2}{\alpha_1 \lambda_1^k} = x_1 + \sum_{i=2}^{n} \frac{\alpha_i}{\alpha_1} \left( \frac{\lambda_i}{\lambda_1} \right)^k x_i, \qquad k = 1, 2, \ldots \tag{5.22}$$

Proof. Since A is diagonalizable, without losing generality, we can pick the nonsingular matrix X in such a way that its columns have unit Euclidean length, that is, $\|x_i\|_2 = 1$ for $i = 1, \ldots, n$. From (5.20) it thus follows that

$$\left\| x_1 + \sum_{i=2}^{n} \frac{\alpha_i}{\alpha_1} \left( \frac{\lambda_i}{\lambda_1} \right)^k x_i - x_1 \right\|_2 = \left\| \sum_{i=2}^{n} \frac{\alpha_i}{\alpha_1} \left( \frac{\lambda_i}{\lambda_1} \right)^k x_i \right\|_2 \le \left[ \sum_{i=2}^{n} \left( \frac{\alpha_i}{\alpha_1} \right)^2 \left( \frac{\lambda_i}{\lambda_1} \right)^{2k} \right]^{1/2} \le \left| \frac{\lambda_2}{\lambda_1} \right|^k \left[ \sum_{i=2}^{n} \left( \frac{\alpha_i}{\alpha_1} \right)^2 \right]^{1/2},$$

that is, (5.21) with $C = \left( \sum_{i=2}^{n} (\alpha_i/\alpha_1)^2 \right)^{1/2}$. $\Box$

Estimate (5.21) expresses the convergence of the sequence $\tilde{q}^{(k)}$ towards $x_1$. Since $q^{(k)}$ and $\tilde{q}^{(k)}$ differ only by a scalar factor, the sequence of Rayleigh quotients

$$\frac{(\tilde{q}^{(k)})^H A \tilde{q}^{(k)}}{\|\tilde{q}^{(k)}\|_2^2} = (q^{(k)})^H A q^{(k)} = \nu^{(k)}$$

will converge to $\lambda_1$ as well. As a consequence, $\lim_{k\to\infty} \nu^{(k)} = \lambda_1$, and the convergence will be faster when the ratio $|\lambda_2/\lambda_1|$ is smaller.
If the matrix A is real and symmetric it can be proved, always assuming that $\alpha_1 \neq 0$, that (see [GL89], pp. 406-407)

$$|\lambda_1 - \nu^{(k)}| \le |\lambda_1 - \lambda_n| \tan^2(\theta_0) \left| \frac{\lambda_2}{\lambda_1} \right|^{2k}, \tag{5.23}$$

where $\cos(\theta_0) = |x_1^T q^{(0)}| \neq 0$. Inequality (5.23) shows that the convergence of the sequence $\nu^{(k)}$ to $\lambda_1$ is quadratic with respect to the ratio $|\lambda_2/\lambda_1|$ (we refer to Section 5.3.3 for numerical results).
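
As a simple numerical illustration of this quadratic behavior (ahead of the results in Section 5.3.3), the following MATLAB fragment runs (5.17) on an arbitrary symmetric test matrix with known spectrum and monitors the scaled error $|\lambda_1 - \nu^{(k)}| / |\lambda_2/\lambda_1|^{2k}$, which should remain bounded:

```matlab
% Observe the convergence rate (5.23) on a symmetric test matrix
A = diag([10 8 4 1]);              % eigenvalues known exactly
lambda1 = 10; ratio = 8/10;        % |lambda_2/lambda_1|
q = ones(4,1); q = q/norm(q);      % alpha_1 ~= 0 by construction
for k = 1:25
    z = A*q;
    q = z/norm(z);
    nu = q'*A*q;
    fprintf('k=%2d  err=%9.2e  err/ratio^(2k)=%9.3f\n', ...
            k, abs(lambda1-nu), abs(lambda1-nu)/ratio^(2*k));
end
```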
We conclude the section by providing a stopping criterion for the iteration (5.17). For this purpose, let us introduce the residual at step k

$$r^{(k)} = A q^{(k)} - \nu^{(k)} q^{(k)}, \qquad k \ge 1,$$

and, for $\varepsilon > 0$, the matrix $\varepsilon E^{(k)} = -r^{(k)} (q^{(k)})^H \in \mathbb{C}^{n\times n}$ with $\|E^{(k)}\|_2 = 1$. Since

$$\varepsilon E^{(k)} q^{(k)} = -r^{(k)}, \qquad k \ge 1, \tag{5.24}$$

we obtain $(A + \varepsilon E^{(k)}) q^{(k)} = \nu^{(k)} q^{(k)}$. As a result, at each step of the power method $\nu^{(k)}$ is an eigenvalue of the perturbed matrix $A + \varepsilon E^{(k)}$. From (5.24) and from definition (1.20) it also follows that $\varepsilon = \|r^{(k)}\|_2$ for $k = 1, 2, \ldots$. Plugging this identity back into (5.10) and approximating the partial derivative in (5.10) by the incremental ratio $|\lambda_1 - \nu^{(k)}|/\varepsilon$, we get

$$|\lambda_1 - \nu^{(k)}| \lesssim \frac{\|r^{(k)}\|_2}{|\cos(\theta_\lambda)|}, \qquad k \ge 1, \tag{5.25}$$

where $\theta_\lambda$ is the angle between the right and the left eigenvectors, $x_1$ and $y_1$, associated with $\lambda_1$. Notice that, if A is a Hermitian matrix, then $\cos(\theta_\lambda) = 1$, so that (5.25) yields an estimate which is analogous to (5.13).
In practice, in order to employ the estimate (5.25) it is necessary at each step k to replace $|\cos(\theta_\lambda)|$ with the modulus of the scalar product between two approximations $q^{(k)}$ and $w^{(k)}$ of $x_1$ and $y_1$, computed by the power method. The following a posteriori estimate is thus obtained:

$$|\lambda_1 - \nu^{(k)}| \lesssim \frac{\|r^{(k)}\|_2}{|(w^{(k)})^H q^{(k)}|}, \qquad k \ge 1. \tag{5.26}$$

Examples of applications of (5.26) will be provided in Section 5.3.3.


5.3.2 Inverse Iteration
In this section we look for an approximation of the eigenvalue of a matrix $A \in \mathbb{C}^{n\times n}$ which is closest to a given number $\mu \in \mathbb{C}$, where $\mu \notin \sigma(A)$. For this, the power iteration (5.17) can be applied to the matrix $M_\mu^{-1} = (A - \mu I)^{-1}$, yielding the so-called inverse iteration or inverse power method. The number $\mu$ is called a shift.
The eigenvalues of $M_\mu^{-1}$ are $\xi_i = (\lambda_i - \mu)^{-1}$; let us assume that there exists an integer m such that

$$|\lambda_m - \mu| < |\lambda_i - \mu|, \qquad \forall i = 1, \ldots, n, \ i \neq m. \tag{5.27}$$

This amounts to requiring that the eigenvalue $\lambda_m$ which is closest to $\mu$ has multiplicity equal to 1. Moreover, (5.27) shows that $\xi_m$ is the eigenvalue of $M_\mu^{-1}$ with largest modulus; in particular, if $\mu = 0$, $\lambda_m$ turns out to be the eigenvalue of A with smallest modulus.
Given an arbitrary initial vector $q^{(0)} \in \mathbb{C}^n$ of unit Euclidean norm, for $k = 1, 2, \ldots$ the following sequence is constructed:

$$
\begin{aligned}
(A - \mu I) z^{(k)} &= q^{(k-1)}, \\
q^{(k)} &= z^{(k)} / \|z^{(k)}\|_2, \\
\sigma^{(k)} &= (q^{(k)})^H A q^{(k)}.
\end{aligned} \tag{5.28}
$$

Notice that the eigenvectors of $M_\mu$ are the same as those of A, since $M_\mu = X (\Lambda - \mu I_n) X^{-1}$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. For this reason, the Rayleigh quotient in (5.28) is computed directly on the matrix A (and not on $M_\mu^{-1}$).
The main difference with respect to (5.17) is that at each step k a linear system with coefficient matrix $M_\mu = A - \mu I$ must be solved. For numerical convenience, the LU factorization of $M_\mu$ is computed once and for all at $k = 1$, so that at each step only two triangular systems are to be solved, at a cost of the order of $n^2$ flops.
Although more computationally expensive than the power method (5.17), the inverse iteration has the advantage that it can converge to any desired eigenvalue of A (namely, the one closest to the shift $\mu$). Inverse iteration is thus ideally suited for refining an initial estimate $\mu$ of an eigenvalue of A, which can be obtained, for instance, by applying the localization techniques introduced in Section 5.1. Inverse iteration can also be effectively employed to compute the eigenvector associated with a given (approximate) eigenvalue, as described in Section 5.8.1.
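
For illustration, a minimal MATLAB sketch of the iteration (5.28) could read as follows; the function name and the simplified residual-based stopping test are our own choices, not the book's Program:

```matlab
% Inverse iteration (5.28): LU of A - mu*I computed once, reused at each step
function [sigma, q, k] = invpower_sketch(A, mu, q0, toll, nmax)
  [L, U, P] = lu(A - mu*eye(size(A)));   % factor M_mu once and for all
  q = q0/norm(q0);
  for k = 1:nmax
      z = U \ (L \ (P*q));               % two triangular solves per step
      q = z/norm(z);
      sigma = q'*A*q;                    % Rayleigh quotient on A, not M_mu^{-1}
      if norm(A*q - sigma*q) <= toll, return, end
  end
end
```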

In view of the convergence analysis of the iteration (5.28) we assume that A is diagonalizable, so that $q^{(0)}$ can be represented in the form (5.19). Proceeding in the same way as in the power method, we let

$$\tilde{q}^{(k)} = x_m + \sum_{i=1, i \neq m}^{n} \frac{\alpha_i}{\alpha_m} \left( \frac{\xi_i}{\xi_m} \right)^k x_i,$$

where $x_i$ are the eigenvectors of $M_\mu^{-1}$ (and thus also of A), while $\alpha_i$ are as in (5.19). As a consequence, recalling the definition of $\xi_i$ and using (5.27), we get

$$\lim_{k\to\infty} \tilde{q}^{(k)} = x_m, \qquad \lim_{k\to\infty} \sigma^{(k)} = \lambda_m.$$
Convergence will be faster when $\mu$ is closer to $\lambda_m$. Under the same assumptions made for proving (5.26), the following a posteriori estimate can be obtained for the approximation error on $\lambda_m$:

$$|\lambda_m - \sigma^{(k)}| \lesssim \frac{\|r^{(k)}\|_2}{|(w^{(k)})^H q^{(k)}|}, \qquad k \ge 1, \tag{5.29}$$

where $r^{(k)} = A q^{(k)} - \sigma^{(k)} q^{(k)}$ and $w^{(k)}$ is the k-th iterate of the inverse power method to approximate the left eigenvector associated with $\lambda_m$.


5.3.3 Implementation Issues
The convergence analysis of Section 5.3.1 shows that the effectiveness of the power method strongly depends on the dominant eigenvalues being well-separated (that is, $|\lambda_2|/|\lambda_1| \ll 1$). Let us now analyze the behavior of iteration (5.17) when two dominant eigenvalues of equal modulus exist (that is, $|\lambda_2| = |\lambda_1|$). Three cases must be distinguished:

1. $\lambda_2 = \lambda_1$: the two dominant eigenvalues are coincident. The method is still convergent, since for k sufficiently large (5.20) yields

$$A^k q^{(0)} \simeq \lambda_1^k (\alpha_1 x_1 + \alpha_2 x_2),$$

which is an eigenvector of A. For $k \to \infty$, the sequence $\tilde{q}^{(k)}$ (after a suitable redefinition) converges to a vector lying in the subspace spanned by the eigenvectors $x_1$ and $x_2$, while the sequence $\nu^{(k)}$ still converges to $\lambda_1$.

2. $\lambda_2 = -\lambda_1$: the two dominant eigenvalues are opposite. In this case the eigenvalue of largest modulus can be approximated by applying the power method to the matrix $A^2$ (see the sketch after this list). Indeed, for $i = 1, \ldots, n$, $\lambda_i(A^2) = [\lambda_i(A)]^2$, so that $\lambda_1^2 = \lambda_2^2$ and the analysis falls into the previous case, where the matrix is now $A^2$.

3. $\lambda_2 = \bar{\lambda}_1$: the two dominant eigenvalues are complex conjugate. Here, undamped oscillations arise in the sequence of vectors $q^{(k)}$ and the power method is not convergent (see [Wil65], Chapter 9, Section 12).
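
To illustrate case 2, the following fragment applies the power method to $A^2$ for a small test matrix of our choosing with dominant eigenvalues $\pm 3$; the iterate recovers $\lambda_1^2$, from which $|\lambda_1|$ follows:

```matlab
% Case 2 above: lambda_2 = -lambda_1. The power method on A stalls,
% but applied to A^2 it recovers lambda_1^2 (illustrative test matrix).
A  = diag([3 -3 1]);                 % dominant pair 3, -3
A2 = A*A;                            % eigenvalues 9, 9, 1
q  = [1; 2; 1]; q = q/norm(q);
for k = 1:40
    z = A2*q;
    q = z/norm(z);
end
nu = q'*A2*q;                        % approximates lambda_1^2 = 9
fprintf('nu = %g, |lambda_1| = %g\n', nu, sqrt(nu));
```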


As for the computer implementation of (5.17), it is worth noting that normalizing the vector $q^{(k)}$ to 1 keeps away from overflow (when $|\lambda_1| > 1$) or underflow (when $|\lambda_1| < 1$) in (5.20). We also point out that the requirement $\alpha_1 \neq 0$ (which is a priori impossible to fulfil when no information about the eigenvector $x_1$ is available) is not essential for the actual convergence of the algorithm.
Indeed, although it can be proved that, working in exact arithmetic, the sequence (5.17) converges to the pair $(\lambda_2, x_2)$ if $\alpha_1 = 0$ (see Exercise 10), the arising of (unavoidable) rounding errors ensures that in practice the vector $q^{(k)}$ contains a non-null component also in the direction of $x_1$. This allows the eigenvalue $\lambda_1$ to "show up" and the power method to quickly converge to it.
An implementation of the power method is given in Program 26. Here and in the following algorithm, the convergence check is based on the a posteriori estimate (5.26).
Here and in the remainder of the chapter, the input data z0, toll and nmax are the initial vector, the tolerance for the stopping test and the maximum admissible number of iterations, respectively. In output, the vectors nu1 and err contain the sequences $\{\nu^{(k)}\}$ and $\{\|r^{(k)}\|_2 / |\cos(\theta_\lambda)|\}$ (see (5.26)), whilst x1 and niter are the approximation of the eigenvector $x_1$ and the number of iterations taken by the algorithm to converge, respectively.
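
The listing of Program 26 itself is not reproduced in this excerpt. A minimal sketch consistent with the interface just described is given below; here the left-eigenvector approximation $w^{(k)}$ needed in (5.26) is generated by a simultaneous power iteration on $A^H$, which is an assumption of this sketch rather than a statement about the book's implementation:

```matlab
% Power method sketch; the stopping test follows the estimate (5.26),
% with w^(k) obtained by a power iteration on A' (our assumption)
function [nu1, x1, niter, err] = powerm_sketch(A, z0, toll, nmax)
  q = z0/norm(z0);  w = q;            % w approximates the left eigenvector
  nu1 = []; err = [];
  for niter = 1:nmax
      q = A*q;  q = q/norm(q);        % power step on A
      w = A'*w; w = w/norm(w);        % power step on A^H for y_1
      nu = q'*A*q;                    % Rayleigh quotient nu^(k)
      r  = A*q - nu*q;                % residual r^(k)
      e  = norm(r)/abs(w'*q);         % a posteriori estimate (5.26)
      nu1 = [nu1, nu]; err = [err, e];
      if e <= toll, break, end
  end
  x1 = q;
end
```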
