from the real Schur form of A, with an extra amount of work, as described in Section 5.8.2.

Finally, some ad hoc methods for dealing effectively with the special case where A is a symmetric (n × n) matrix are considered in Section 5.10.



5.1 Geometrical Location of the Eigenvalues
Since the eigenvalues of A are the roots of the characteristic polynomial p_A(λ) (see Section 1.7), iterative methods must be used for their approximation when n ≥ 5. Knowledge of eigenvalue location in the complex plane can thus be helpful in accelerating the convergence of the process.

A first estimate is provided by Theorem 1.4,

|λ| ≤ ‖A‖,   ∀λ ∈ σ(A),   (5.1)

for any consistent matrix norm ‖·‖. Inequality (5.1), which is often quite rough, states that all the eigenvalues of A are contained in a circle of radius R_A = ‖A‖ centered at the origin of the Gauss plane.
Another result is obtained by extending the Decomposition Property 1.23
to complex-valued matrices.

Theorem 5.1 If A ∈ C^(n×n), let

H = (A + A^H)/2   and   iS = (A − A^H)/2

be the hermitian and skew-hermitian parts of A, respectively, i being the imaginary unit. For any λ ∈ σ(A)

λ_min(H) ≤ Re(λ) ≤ λ_max(H),   λ_min(S) ≤ Im(λ) ≤ λ_max(S).   (5.2)

Proof. From the definition of H and S it follows that A = H + iS. Let u ∈ C^n, ‖u‖_2 = 1, be the eigenvector associated with the eigenvalue λ; the Rayleigh quotient (introduced in Section 1.7) reads

λ = u^H A u = u^H H u + i u^H S u.   (5.3)

Notice that both H and S are hermitian matrices, whilst iS is skew-hermitian. Matrices H and S are thus unitarily similar to a real diagonal matrix (see Section 1.7), and therefore their eigenvalues are real. In such a case, (5.3) yields

Re(λ) = u^H H u,   Im(λ) = u^H S u,

from which (5.2) follows. □
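The bounds (5.2) are straightforward to check numerically. Below is a minimal MATLAB sketch (ours, not one of the book's programs; the sample matrix anticipates Example 5.1):

% Rectangle in the complex plane containing sigma(A), from (5.2)
A = [10 2 3; -1 2 -1; 0 1 3];   % sample matrix, reused in Example 5.1
H = (A + A')/2;                 % hermitian part (A' is the conjugate transpose)
S = (A - A')/(2i);              % then A = H + i*S, with S hermitian
fprintf('Re(lambda) in [%g, %g]\n', min(eig(H)), max(eig(H)));
fprintf('Im(lambda) in [%g, %g]\n', min(eig(S)), max(eig(S)));
eig(A)                          % the eigenvalues, to verify the inclusion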

An a priori bound for the eigenvalues of A is given by the following result.

Theorem 5.2 (of the Gershgorin circles) Let A ∈ C^(n×n). Then

σ(A) ⊆ S_R = ∪_{i=1}^n R_i,   R_i = { z ∈ C : |z − a_ii| ≤ Σ_{j=1, j≠i}^n |a_ij| }.   (5.4)

The sets R_i are called Gershgorin circles.
Proof. Let us decompose A as A = D + E, where D is the diagonal part of A, whilst e_ii = 0 for i = 1, . . . , n. For λ ∈ σ(A) (with λ ≠ a_ii, i = 1, . . . , n), let us introduce the matrix B_λ = A − λI = (D − λI) + E. Since B_λ is singular, there exists a non-null vector x ∈ C^n such that B_λ x = 0. This means that ((D − λI) + E) x = 0, that is, passing to the ‖·‖_∞ norm,

x = −(D − λI)^-1 E x,   ‖x‖_∞ ≤ ‖(D − λI)^-1 E‖_∞ ‖x‖_∞,

and thus

1 ≤ ‖(D − λI)^-1 E‖_∞ = Σ_{j=1}^n |e_kj| / |a_kk − λ| = Σ_{j=1, j≠k}^n |a_kj| / |a_kk − λ|   (5.5)

for a certain k, 1 ≤ k ≤ n. Inequality (5.5) implies λ ∈ R_k and thus (5.4). □

The bounds (5.4) ensure that any eigenvalue of A lies within the union of the circles R_i. Moreover, since A and A^T share the same spectrum, Theorem 5.2 also holds in the form

σ(A) ⊆ S_C = ∪_{j=1}^n C_j,   C_j = { z ∈ C : |z − a_jj| ≤ Σ_{i=1, i≠j}^n |a_ij| }.   (5.6)

The circles R_i in the complex plane are called row circles, and C_j column circles. The immediate consequence of (5.4) and (5.6) is the following.

Property 5.1 (First Gershgorin theorem) For a given matrix A ∈ C^(n×n),

∀λ ∈ σ(A),   λ ∈ S_R ∩ S_C.   (5.7)
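The centers and radii of the row and column circles come straight from the entries of A. A minimal MATLAB sketch (the function name gershgorin is ours) returns them:

function [c, r_row, r_col] = gershgorin(A)
% GERSHGORIN  centers and radii of the circles in (5.4) and (5.6)
c     = diag(A);                   % centers a_ii
r_row = sum(abs(A), 2) - abs(c);   % row radii: sum over j ~= i of |a_ij|
r_col = sum(abs(A), 1)' - abs(c);  % column radii: sum over i ~= j of |a_ij|
end

By (5.7), every eigenvalue lies in the union of the row circles and also in the union of the column circles, hence in S_R ∩ S_C.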

The following two location theorems can also be proved (see [Atk89], pp.
588-590 and [Hou75], pp. 66-67).

Property 5.2 (Second Gershgorin theorem) Let

S_1 = ∪_{i=1}^m R_i,   S_2 = ∪_{i=m+1}^n R_i.

If S_1 ∩ S_2 = ∅, then S_1 contains exactly m eigenvalues of A, each one being accounted for with its algebraic multiplicity, while the remaining eigenvalues are contained in S_2.

Remark 5.1 Properties 5.1 and 5.2 do not exclude the possibility that
there exist circles containing no eigenvalues, as happens for the matrix in
Exercise 1.

Definition 5.1 A matrix A ∈ C^(n×n) is called reducible if there exists a permutation matrix P such that

PAP^T = [ B_11  B_12
            0   B_22 ],

where B_11 and B_22 are square matrices; A is irreducible if it is not reducible.

To check if a matrix is reducible, the oriented graph of the matrix can be conveniently employed. Recall from Section 3.9 that the oriented graph of a real matrix A is obtained by joining n points (called vertices of the graph) P_1, . . . , P_n through a line oriented from P_i to P_j if the corresponding matrix entry a_ij ≠ 0. An oriented graph is strongly connected if for any pair of distinct vertices P_i and P_j there exists an oriented path from P_i to P_j. The following result holds (see [Var62] for the proof).

Property 5.3 A matrix A ∈ R^(n×n) is irreducible iff its oriented graph is strongly connected.
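Strong connectivity, and hence irreducibility, can be tested on the sparsity pattern alone: once self-loops are added, every vertex must be reachable from every other one in at most n − 1 steps. A minimal MATLAB sketch of this test (the function name isirreducible is ours):

function irr = isirreducible(A)
% ISIRREDUCIBLE  reducibility test based on Property 5.3
n = size(A, 1);
G = double(abs(A) > 0 | eye(n) > 0);  % adjacency pattern plus self-loops
R = G;
for k = 1:n-1
    R = double(R*G > 0);              % vertices reachable in at most k+1 steps
end
irr = all(R(:) > 0);                  % strongly connected iff R has no zeros
end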

Property 5.4 (Third Gershgorin theorem) Let A ∈ C^(n×n) be an irreducible matrix. An eigenvalue λ ∈ σ(A) cannot lie on the boundary of S_R unless it belongs to the boundary of every circle R_i, for i = 1, . . . , n.

Example 5.1 Let us consider the matrix

A = [ 10   2    3
      -1   2   -1
       0   1    3 ]

whose spectrum is (to four significant figures) σ(A) = {9.687, 2.656 ± i0.693}. The following values of the norm of A: ‖A‖_1 = 11, ‖A‖_2 = 10.72, ‖A‖_∞ = 15 and ‖A‖_F = 11.36 can be used in the estimate (5.1). Estimate (5.2) provides instead 1.96 ≤ Re(λ(A)) ≤ 10.34, −2.34 ≤ Im(λ(A)) ≤ 2.34, while the row and column circles are given respectively by R_1 = {z ∈ C : |z − 10| ≤ 5}, R_2 = {z ∈ C : |z − 2| ≤ 2}, R_3 = {z ∈ C : |z − 3| ≤ 1} and C_1 = {z ∈ C : |z − 10| ≤ 1}, C_2 = {z ∈ C : |z − 2| ≤ 3}, C_3 = {z ∈ C : |z − 3| ≤ 4}.

In Figure 5.1, for i = 1, 2, 3, the R_i and C_i circles and the intersection S_R ∩ S_C (shaded areas) are drawn. In agreement with Property 5.2, we notice that an eigenvalue is contained in C_1, which is disjoint from C_2 and C_3, while the remaining eigenvalues, thanks to Property 5.1, lie within the set R_2 ∪ (C_3 ∩ R_1). •
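The figures quoted in this example are easy to reproduce; a minimal sketch, reusing the gershgorin function given after Property 5.1:

A = [10 2 3; -1 2 -1; 0 1 3];
eig(A)                                % about 9.687 and 2.656 +/- 0.693i
[norm(A,1), norm(A,2), norm(A,inf), norm(A,'fro')]
[c, r_row, r_col] = gershgorin(A)     % centers 10, 2, 3 and the radii above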




5.2 Stability and Conditioning Analysis
In this section we introduce some a priori and a posteriori estimates that
are relevant in the stability analysis of the matrix eigenvalue and eigenvec-
tor problem. The presentation follows the guidelines that have been traced
in Chapter 2.


5.2.1 A priori Estimates
[FIGURE 5.1. Row and column circles for matrix A in Example 5.1]

Assume that A ∈ C^(n×n) is a diagonalizable matrix and denote by X = (x_1, . . . , x_n) ∈ C^(n×n) the matrix of its right eigenvectors, where x_k ∈ C^n for k = 1, . . . , n, such that D = X^-1 AX = diag(λ_1, . . . , λ_n), λ_i being the eigenvalues of A, i = 1, . . . , n. Moreover, let E ∈ C^(n×n) be a perturbation of A. The following theorem holds.

Theorem 5.3 (Bauer-Fike) Let µ be an eigenvalue of the matrix A + E ∈ C^(n×n); then

min_{λ∈σ(A)} |λ − µ| ≤ K_p(X) ‖E‖_p,   (5.8)

where ‖·‖_p is any matrix p-norm and K_p(X) = ‖X‖_p ‖X^-1‖_p is called the condition number of the eigenvalue problem for matrix A.
Proof. We first notice that if µ ∈ σ(A) then (5.8) is trivially verified, since ‖X‖_p ‖X^-1‖_p ‖E‖_p ≥ 0. Let us thus assume henceforth that µ ∉ σ(A). From the definition of eigenvalue it follows that the matrix (A + E − µI) is singular, which means that, since X is invertible, the matrix X^-1(A + E − µI)X = D + X^-1 EX − µI is singular. Therefore, there exists a non-null vector x ∈ C^n such that

( (D − µI) + X^-1 EX ) x = 0.

Since µ ∉ σ(A), the diagonal matrix (D − µI) is invertible and the previous equation can be written in the form

( I + (D − µI)^-1 (X^-1 EX) ) x = 0.

Passing to the ‖·‖_p norm and proceeding as in the proof of Theorem 5.2, we get

1 ≤ ‖(D − µI)^-1‖_p K_p(X) ‖E‖_p,

from which the estimate (5.8) follows, since

‖(D − µI)^-1‖_p = ( min_{λ∈σ(A)} |λ − µ| )^-1.   □

If A is a normal matrix, from the Schur decomposition theorem (see Section 1.8) it follows that the similarity transformation matrix X is unitary, so that K_2(X) = 1. This implies that

∀µ ∈ σ(A + E),   min_{λ∈σ(A)} |λ − µ| ≤ ‖E‖_2,   (5.9)

hence the eigenvalue problem is well-conditioned with respect to the absolute error. This, however, does not prevent the matrix eigenvalue problem from being affected by significant relative errors, especially when A has a widely spread spectrum.

Example 5.2 Let us consider, for 1 ≤ n ≤ 10, the calculation of the eigenvalues of the Hilbert matrix H_n ∈ R^(n×n) (see Example 3.2, Chapter 3). It is symmetric (thus, in particular, normal) and exhibits, for n ≥ 4, a very large condition number. Let E_n ∈ R^(n×n) be a matrix having constant entries equal to η = 10^-3. We show in Table 5.1 the results of the computation of the minimum in (5.9), taking p = 2 (that is, ‖E_n‖_2 = nη). Notice how the absolute error is decreasing, since the eigenvalue of minimum module tends to zero, whilst the relative error is increasing as the size n of the matrix increases, due to the higher sensitivity of "small" eigenvalues with respect to rounding errors. •


  n   Abs. Err.        Rel. Err.       ‖E_n‖_2    K_2(H_n)      K_2(H_n + E_n)
  1   1·10^-3          1·10^-3         1·10^-3    1             1
  2   1.677·10^-4      1.446·10^-3     2·10^-3    19.28         19.26
  4   5.080·10^-7      2.207·10^-3     4·10^-3    1.551·10^4    1.547·10^4
  8   1.156·10^-12     3.496·10^-3     8·10^-3    1.526·10^10   1.515·10^10
 10   1.355·10^-15     4.078·10^-3     1·10^-2    1.603·10^13   1.589·10^13

TABLE 5.1. Absolute and relative errors in the calculation of the eigenvalues of the Hilbert matrix (using the MATLAB intrinsic function eig). "Abs. Err." and "Rel. Err." denote respectively the absolute and relative errors (with respect to λ)
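The experiment behind Table 5.1 can be repeated in a few lines of MATLAB. The sketch below is our reconstruction (in particular, reporting for each n the smallest of the minima in (5.9) is our guess), so the printed values need not match the table to all digits:

eta = 1e-3;
for n = [1 2 4 8 10]
    Hn = hilb(n);
    l  = eig(Hn);                              % unperturbed eigenvalues
    mu = eig(Hn + eta*ones(n));                % eigenvalues of Hn + En
    d  = arrayfun(@(m) min(abs(l - m)), mu);   % minimum in (5.9) for each mu
    fprintf('n = %2d  abs.err = %9.3e  ||En||_2 = %7.1e  K2(Hn) = %9.3e\n', ...
            n, min(d), n*eta, cond(Hn));
end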



The Bauer-Fike theorem states that the matrix eigenvalue problem is well-
conditioned if A is a normal matrix. Failure to fulfil this property, however,
does not necessarily imply that A must exhibit a “strong” numerical sen-
sitivity to the computation of every one of its eigenvalues. In this respect,
the following result holds, which can be regarded as an a priori estimate of
the conditioning of the calculation of a particular eigenvalue of a matrix.

Theorem 5.4 Let A ∈ C^(n×n) be a diagonalizable matrix; let λ, x and y be a simple eigenvalue of A and its associated right and left eigenvectors, respectively, with ‖x‖_2 = ‖y‖_2 = 1. Moreover, for ε > 0, let A(ε) = A + εE, with E ∈ C^(n×n) such that ‖E‖_2 = 1. Denoting by λ(ε) and x(ε) the eigenvalue and the corresponding eigenvector of A(ε), such that λ(0) = λ and x(0) = x,

|∂λ/∂ε (0)| ≤ 1/|y^H x|.   (5.10)

Proof. Let us first prove that y^H x ≠ 0. Setting Y = (y_1, . . . , y_n) = (X^H)^-1, with y_k ∈ C^n for k = 1, . . . , n, it follows that y_k^H A = λ_k y_k^H, i.e., the rows of X^-1 = Y^H are left eigenvectors of A. Then, since Y^H X = I, y_i^H x_j = δ_ij for i, j = 1, . . . , n, δ_ij being the Kronecker symbol. This result is equivalent to saying that the eigenvectors {x_k} of A and the eigenvectors {y_k} of A^H form a bi-orthogonal set (see (4.67)).

Let us now prove (5.10). Since the roots of the characteristic equation are continuous functions of the coefficients of the characteristic polynomial associated with A(ε), it follows that the eigenvalues of A(ε) are continuous functions of ε (see, for instance, [Hen74], p. 281). Therefore, in a neighborhood of ε = 0,

(A + εE) x(ε) = λ(ε) x(ε).

Differentiating the previous equation with respect to ε and setting ε = 0 yields

A ∂x/∂ε(0) + Ex = ∂λ/∂ε(0) x + λ ∂x/∂ε(0),

from which, left-multiplying both sides by y^H and recalling that y^H is a left eigenvector of A,

∂λ/∂ε(0) = (y^H Ex)/(y^H x).

Using the Cauchy-Schwarz inequality gives the desired estimate (5.10). □

Notice that |y^H x| = |cos(θ_λ)|, where θ_λ is the angle between the eigenvectors y and x (both having unit Euclidean norm). Therefore, if these two vectors are almost orthogonal the computation of the eigenvalue λ turns out to be ill-conditioned. The quantity

κ(λ) = 1/|y^H x| = 1/|cos(θ_λ)|   (5.11)

can thus be taken as the condition number of the eigenvalue λ. Obviously, κ(λ) ≥ 1; when A is a normal matrix, since it is unitarily similar to a diagonal matrix, the left and right eigenvectors y and x coincide, yielding κ(λ) = 1/‖x‖_2^2 = 1.
Inequality (5.10) can be roughly interpreted as stating that perturbations of the order of δε in the entries of matrix A induce changes of the order of δλ = δε/|cos(θ_λ)| in the eigenvalue λ. If normal matrices are considered, the calculation of λ is a well-conditioned problem; the case of a generic nonsymmetric matrix A can be conveniently dealt with using methods based on similarity transformations, as will be seen in later sections.
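The numbers (5.11) can be computed exactly as in the proof of Theorem 5.4: if X collects the right eigenvectors, the columns of Y = (X^H)^-1 are left eigenvectors. A minimal MATLAB sketch, assuming A is diagonalizable with distinct eigenvalues (the intrinsic function condeig returns the same quantities):

A = gallery('grcar', 8);        % a classical nonnormal test matrix
[X, D] = eig(A);                % right eigenvectors: A*X = X*D
Y = inv(X');                    % columns are left eigenvectors (see the proof)
kappa = zeros(8, 1);
for k = 1:8
    x = X(:,k)/norm(X(:,k));    % unit right eigenvector
    y = Y(:,k)/norm(Y(:,k));    % unit left eigenvector
    kappa(k) = 1/abs(y'*x);     % condition number (5.11)
end
[kappa, condeig(A)]             % the two columns should agree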

It is interesting to check that the conditioning of the matrix eigenvalue problem remains unchanged if the transformation matrices are unitary. To this end, let U ∈ C^(n×n) be a unitary matrix and let Ã = U^H AU. Also let λ_j be an eigenvalue of A and denote by κ_j the condition number (5.11). Moreover, let κ̃_j be the condition number of λ_j when it is regarded as an eigenvalue of Ã. Finally, let {x_k}, {y_k} be the right and left eigenvectors of A respectively. Clearly, {U^H x_k}, {U^H y_k} are the right and left eigenvectors of Ã. Thus, for any j = 1, . . . , n,

κ̃_j = |y_j^H U U^H x_j|^-1 = κ_j,

from which it follows that the stability of the computation of λ_j is not affected by performing similarity transformations using unitary matrices.
It can also be checked that unitary transformation matrices do not change the Euclidean length and the angles between vectors in C^n. Moreover, the following a priori estimate holds (see [GL89], p. 317)

fl(X^-1 AX) = X^-1 AX + E,   with ‖E‖_2 ≈ u K_2(X) ‖A‖_2,   (5.12)

where fl(M) is the machine representation of matrix M and u is the roundoff unit (see Section 2.5). From (5.12) it follows that using nonunitary transformation matrices in the eigenvalue computation can lead to an unstable process with respect to rounding errors.
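The invariance under unitary similarity is easy to observe numerically; a small sketch (using condeig, or the kappa loop above):

A = gallery('grcar', 8);
[Q, ~] = qr(randn(8));          % a random orthogonal (hence unitary) matrix Q
max(condeig(A))                 % worst eigenvalue condition number of A
max(condeig(Q'*A*Q))            % essentially the same, up to roundoff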
We conclude this section with a stability result for the approximation of the eigenvector associated with a simple eigenvalue. Under the same assumptions of Theorem 5.4, the following result holds (for the proof, see [Atk89], Problem 6, pp. 649-650).

Property 5.5 The eigenvectors x_k and x_k(ε) of the matrices A and A(ε) = A + εE, with ‖x_k(ε)‖_2 = ‖x_k‖_2 = 1 for k = 1, . . . , n, satisfy

‖x_k(ε) − x_k‖_2 ≤ ( ε / min_{j≠k} |λ_k − λ_j| ) ‖E‖_2 + O(ε^2),   ∀k = 1, . . . , n.

Analogous to (5.11), the quantity

κ(x_k) = 1 / min_{j≠k} |λ_k − λ_j|

can be regarded as being the condition number of the eigenvector x_k. Computing x_k might be an ill-conditioned operation if some eigenvalues λ_j are "very close" to the eigenvalue λ_k associated with x_k.
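In practice κ(x_k) reduces to the gaps between the computed eigenvalues; a minimal sketch (assuming distinct eigenvalues):

A = [10 2 3; -1 2 -1; 0 1 3];   % the matrix of Example 5.1
lam = eig(A);
n   = numel(lam);
kx  = zeros(n, 1);
for k = 1:n
    gaps  = abs(lam(k) - lam([1:k-1, k+1:n]));  % distances to the other eigenvalues
    kx(k) = 1/min(gaps);        % large when eigenvalues cluster
end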


5.2.2 A posteriori Estimates
The a priori estimates examined in the previous section characterize the stability properties of the matrix eigenvalue and eigenvector problem. From the implementation standpoint, it is also important to have at one's disposal a posteriori estimates that allow for a run-time control of the quality of the approximation that is being constructed. Since the methods that will be considered later are iterative processes, the results of this section can be usefully employed to devise reliable stopping criteria for them.
