in Section 5.8.2.

Finally, some ad hoc methods for dealing effectively with the special case where A is a symmetric $(n \times n)$ matrix are considered in Section 5.10.

5.1 Geometrical Location of the Eigenvalues

Since the eigenvalues of A are the roots of the characteristic polynomial $p_A(\lambda)$ (see Section 1.7), iterative methods must be used for their approximation when $n \geq 5$. Knowledge of the location of the eigenvalues in the complex plane can thus be helpful in accelerating the convergence of the process.

184 5. Approximation of Eigenvalues and Eigenvectors

A first estimate is provided by Theorem 1.4,

$$|\lambda| \leq \|A\|, \quad \forall \lambda \in \sigma(A), \qquad (5.1)$$

for any consistent matrix norm $\|\cdot\|$. Inequality (5.1), which is often quite rough, states that all the eigenvalues of A are contained in a circle of radius $R_A = \|A\|$ centered at the origin of the Gauss plane.

Another result is obtained by extending the Decomposition Property 1.23

to complex-valued matrices.

Theorem 5.1 If $A \in \mathbb{C}^{n \times n}$, let

$$H = \frac{A + A^H}{2} \quad \text{and} \quad iS = \frac{A - A^H}{2}$$

be the hermitian and skew-hermitian parts of A, respectively, $i$ being the imaginary unit. For any $\lambda \in \sigma(A)$,

$$\lambda_{\min}(H) \leq \operatorname{Re}(\lambda) \leq \lambda_{\max}(H), \qquad \lambda_{\min}(S) \leq \operatorname{Im}(\lambda) \leq \lambda_{\max}(S). \qquad (5.2)$$

Proof. From the definition of H and S it follows that $A = H + iS$. Let $u \in \mathbb{C}^n$, $\|u\|_2 = 1$, be the eigenvector associated with the eigenvalue $\lambda$; the Rayleigh quotient (introduced in Section 1.7) reads

$$\lambda = u^H A u = u^H H u + i\, u^H S u. \qquad (5.3)$$

Notice that both H and S are hermitian matrices, whilst iS is skew-hermitian. Matrices H and S are thus unitarily similar to a real diagonal matrix (see Section 1.7), and therefore their eigenvalues are real. In such a case, (5.3) yields

$$\operatorname{Re}(\lambda) = u^H H u, \qquad \operatorname{Im}(\lambda) = u^H S u,$$

from which (5.2) follows. $\square$
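Estimate (5.2) is easy to verify numerically. The following sketch (Python with NumPy, purely illustrative and not part of the text) forms H and S from a random complex matrix and checks that the real and imaginary parts of every eigenvalue obey the bounds:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# A = H + iS, with H = (A + A^H)/2 and S = (A - A^H)/(2i) both hermitian
H = (A + A.conj().T) / 2
S = (A - A.conj().T) / 2j

eigs = np.linalg.eigvals(A)
h = np.linalg.eigvalsh(H)   # real eigenvalues, ascending order
s = np.linalg.eigvalsh(S)

# (5.2): Re(lambda) in [lambda_min(H), lambda_max(H)],
#        Im(lambda) in [lambda_min(S), lambda_max(S)]
assert np.all((h[0] - 1e-10 <= eigs.real) & (eigs.real <= h[-1] + 1e-10))
assert np.all((s[0] - 1e-10 <= eigs.imag) & (eigs.imag <= s[-1] + 1e-10))
```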

An a priori bound for the eigenvalues of A is given by the following result.

Theorem 5.2 (of the Gershgorin circles) Let $A \in \mathbb{C}^{n \times n}$. Then

$$\sigma(A) \subseteq S_R = \bigcup_{i=1}^{n} R_i, \qquad R_i = \Big\{ z \in \mathbb{C} : |z - a_{ii}| \leq \sum_{j=1,\, j \neq i}^{n} |a_{ij}| \Big\}. \qquad (5.4)$$

The sets $R_i$ are called Gershgorin circles.

Proof. Let us decompose A as $A = D + E$, where D is the diagonal part of A, whilst $e_{ii} = 0$ for $i = 1, \ldots, n$. For $\lambda \in \sigma(A)$ (with $\lambda \neq a_{ii}$, $i = 1, \ldots, n$), let us introduce the matrix $B_\lambda = A - \lambda I = (D - \lambda I) + E$. Since $B_\lambda$ is singular, there exists a non-null vector $x \in \mathbb{C}^n$ such that $B_\lambda x = 0$. This means that $((D - \lambda I) + E)\, x = 0$, that is, passing to the $\|\cdot\|_\infty$ norm,

$$x = -(D - \lambda I)^{-1} E x, \qquad \|x\|_\infty \leq \|(D - \lambda I)^{-1} E\|_\infty \|x\|_\infty,$$

and thus

$$1 \leq \|(D - \lambda I)^{-1} E\|_\infty = \sum_{j=1}^{n} \frac{|e_{kj}|}{|a_{kk} - \lambda|} = \sum_{j=1,\, j \neq k}^{n} \frac{|a_{kj}|}{|a_{kk} - \lambda|} \qquad (5.5)$$

for a certain $k$, $1 \leq k \leq n$. Inequality (5.5) implies $\lambda \in R_k$ and thus (5.4). $\square$
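The row circles of (5.4) are cheap to compute: the center of $R_i$ is $a_{ii}$ and its radius is the off-diagonal absolute row sum. A short NumPy sketch (illustrative; the function name is our own) computes them and checks the inclusion $\sigma(A) \subseteq S_R$:

```python
import numpy as np

def gershgorin_row_circles(A):
    """Centers and radii of the row circles R_i in (5.4)."""
    A = np.asarray(A)
    centers = np.diag(A)
    # off-diagonal absolute row sums
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return centers, radii

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
centers, radii = gershgorin_row_circles(A)

# Theorem 5.2: every eigenvalue lies in at least one row circle
for lam in np.linalg.eigvals(A):
    assert np.any(np.abs(lam - centers) <= radii + 1e-10)
```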

The bounds (5.4) ensure that any eigenvalue of A lies within the union of the circles $R_i$. Moreover, since A and $A^T$ share the same spectrum, Theorem 5.2 also holds in the form

$$\sigma(A) \subseteq S_C = \bigcup_{j=1}^{n} C_j, \qquad C_j = \Big\{ z \in \mathbb{C} : |z - a_{jj}| \leq \sum_{i=1,\, i \neq j}^{n} |a_{ij}| \Big\}. \qquad (5.6)$$

The circles $R_i$ in the complex plane are called row circles, and the circles $C_j$ column circles. The immediate consequence of (5.4) and (5.6) is the following.

Property 5.1 (First Gershgorin theorem) For a given matrix $A \in \mathbb{C}^{n \times n}$,

$$\forall \lambda \in \sigma(A), \quad \lambda \in S_R \cap S_C. \qquad (5.7)$$
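The first Gershgorin theorem can be checked directly: each eigenvalue must belong both to the union of the row circles and to the union of the column circles. A minimal NumPy sketch (illustrative; matrix and helper chosen by us):

```python
import numpy as np

def in_union(z, centers, radii):
    """True if the complex number z lies in the union of the given circles."""
    return bool(np.any(np.abs(z - centers) <= radii + 1e-10))

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))

centers = np.diag(A)
row_radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # row circles R_i
col_radii = np.sum(np.abs(A), axis=0) - np.abs(centers)  # column circles C_j

# Property 5.1: every eigenvalue lies in S_R and in S_C
for lam in np.linalg.eigvals(A):
    assert in_union(lam, centers, row_radii)
    assert in_union(lam, centers, col_radii)
```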

The following two location theorems can also be proved (see [Atk89], pp.

588-590 and [Hou75], pp. 66-67).

Property 5.2 (Second Gershgorin theorem) Let

$$S_1 = \bigcup_{i=1}^{m} R_i, \qquad S_2 = \bigcup_{i=m+1}^{n} R_i.$$

If $S_1 \cap S_2 = \emptyset$, then $S_1$ contains exactly $m$ eigenvalues of A, each one being accounted for with its algebraic multiplicity, while the remaining eigenvalues are contained in $S_2$.

Remark 5.1 Properties 5.1 and 5.2 do not exclude the possibility that

there exist circles containing no eigenvalues, as happens for the matrix in

Exercise 1.

Definition 5.1 A matrix $A \in \mathbb{C}^{n \times n}$ is called reducible if there exists a permutation matrix P such that

$$P A P^T = \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \end{bmatrix},$$

where $B_{11}$ and $B_{22}$ are square matrices; A is irreducible if it is not reducible.


To check if a matrix is reducible, the oriented graph of the matrix can be conveniently employed. Recall from Section 3.9 that the oriented graph of a real matrix A is obtained by joining $n$ points (called vertices of the graph) $P_1, \ldots, P_n$ through a line oriented from $P_i$ to $P_j$ if the corresponding matrix entry $a_{ij} \neq 0$. An oriented graph is strongly connected if for any pair of distinct vertices $P_i$ and $P_j$ there exists an oriented path from $P_i$ to $P_j$. The following result holds (see [Var62] for the proof).

Property 5.3 A matrix $A \in \mathbb{R}^{n \times n}$ is irreducible iff its oriented graph is strongly connected.
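Property 5.3 suggests a direct algorithmic test: build the oriented graph with an edge $i \to j$ whenever $a_{ij} \neq 0$ and check strong connectivity, e.g. by a breadth-first search from every vertex. A Python sketch (illustrative; the function name is our own):

```python
import numpy as np
from collections import deque

def is_irreducible(A, tol=0.0):
    """Irreducibility test via Property 5.3: the oriented graph with an
    edge i -> j whenever |a_ij| > tol must be strongly connected."""
    A = np.asarray(A)
    n = A.shape[0]
    adj = [np.nonzero(np.abs(A[i]) > tol)[0] for i in range(n)]
    for start in range(n):
        seen = {start}
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in adj[i]:
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
        if len(seen) < n:   # some vertex is unreachable from `start`
            return False
    return True

# A tridiagonal matrix is irreducible; a block upper-triangular one is not.
T = np.diag([2., 2., 2.]) + np.diag([1., 1.], 1) + np.diag([1., 1.], -1)
B = np.array([[1., 1.],
              [0., 2.]])
assert is_irreducible(T) and not is_irreducible(B)
```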

Property 5.4 (Third Gershgorin theorem) Let $A \in \mathbb{C}^{n \times n}$ be an irreducible matrix. An eigenvalue $\lambda \in \sigma(A)$ cannot lie on the boundary of $S_R$ unless it belongs to the boundary of every circle $R_i$, for $i = 1, \ldots, n$.

Example 5.1 Let us consider the matrix

$$A = \begin{bmatrix} 10 & 2 & 3 \\ -1 & 2 & -1 \\ 0 & 1 & 3 \end{bmatrix},$$

whose spectrum is (to four significant figures) $\sigma(A) = \{9.687,\ 2.656 \pm i\,0.693\}$. The following values of the norm of A: $\|A\|_1 = 11$, $\|A\|_2 = 10.72$, $\|A\|_\infty = 15$ and $\|A\|_F = 11.36$ can be used in the estimate (5.1). Estimate (5.2) provides instead $1.96 \leq \operatorname{Re}(\lambda(A)) \leq 10.34$, $-2.34 \leq \operatorname{Im}(\lambda(A)) \leq 2.34$, while the row and column circles are given respectively by $R_1 = \{z \in \mathbb{C} : |z - 10| \leq 5\}$, $R_2 = \{z \in \mathbb{C} : |z - 2| \leq 2\}$, $R_3 = \{z \in \mathbb{C} : |z - 3| \leq 1\}$ and $C_1 = \{z \in \mathbb{C} : |z - 10| \leq 1\}$, $C_2 = \{z \in \mathbb{C} : |z - 2| \leq 3\}$, $C_3 = \{z \in \mathbb{C} : |z - 3| \leq 4\}$.

In Figure 5.1, for $i = 1, 2, 3$ the $R_i$ and $C_i$ circles and the intersection $S_R \cap S_C$ (shaded areas) are drawn. In agreement with Property 5.2, we notice that an eigenvalue is contained in $C_1$, which is disjoint from $C_2$ and $C_3$, while the remaining eigenvalues, thanks to Property 5.1, lie within the set $R_2 \cup \{C_3 \cap R_1\}$. •
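The norm values quoted in Example 5.1 are easy to reproduce; the sketch below (illustrative NumPy code, not part of the original text) also re-checks the basic estimate (5.1) for each norm:

```python
import numpy as np

A = np.array([[10., 2., 3.],
              [-1., 2., -1.],
              [0., 1., 3.]])

# Norms usable in estimate (5.1)
n1 = np.linalg.norm(A, 1)         # max absolute column sum  -> 11
ninf = np.linalg.norm(A, np.inf)  # max absolute row sum     -> 15
nF = np.linalg.norm(A, 'fro')     # sqrt(129), about 11.36
n2 = np.linalg.norm(A, 2)         # spectral norm, about 10.72

eigs = np.linalg.eigvals(A)
# (5.1): |lambda| <= ||A|| for every consistent matrix norm
for norm in (n1, n2, ninf, nF):
    assert np.all(np.abs(eigs) <= norm + 1e-10)
```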

5.2 Stability and Conditioning Analysis

In this section we introduce some a priori and a posteriori estimates that are relevant in the stability analysis of the matrix eigenvalue and eigenvector problem. The presentation follows the guidelines traced in Chapter 2.

[FIGURE 5.1. Row and column circles for matrix A in Example 5.1; axes $\operatorname{Re}(z)$ and $\operatorname{Im}(z)$.]

5.2.1 A priori Estimates

Assume that $A \in \mathbb{C}^{n \times n}$ is a diagonalizable matrix and denote by $X = (x_1, \ldots, x_n) \in \mathbb{C}^{n \times n}$ the matrix of its right eigenvectors $x_k \in \mathbb{C}^n$, $k = 1, \ldots, n$, such that $D = X^{-1} A X = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, $\lambda_i$ being the

eigenvalues of A, $i = 1, \ldots, n$. Moreover, let $E \in \mathbb{C}^{n \times n}$ be a perturbation of A. The following theorem holds.

Theorem 5.3 (Bauer-Fike) Let $\mu$ be an eigenvalue of the matrix $A + E \in \mathbb{C}^{n \times n}$; then

$$\min_{\lambda \in \sigma(A)} |\lambda - \mu| \leq K_p(X) \|E\|_p, \qquad (5.8)$$

where $\|\cdot\|_p$ is any matrix p-norm and $K_p(X) = \|X\|_p \|X^{-1}\|_p$ is called the condition number of the eigenvalue problem for matrix A.

Proof. We first notice that if $\mu \in \sigma(A)$ then (5.8) is trivially verified, since $\|X\|_p \|X^{-1}\|_p \|E\|_p \geq 0$. Let us thus assume henceforth that $\mu \notin \sigma(A)$. From the definition of eigenvalue it follows that the matrix $(A + E - \mu I)$ is singular, which means that, since X is invertible, the matrix $X^{-1}(A + E - \mu I)X = D + X^{-1} E X - \mu I$ is singular. Therefore, there exists a non-null vector $x \in \mathbb{C}^n$ such that

$$\big( (D - \mu I) + X^{-1} E X \big)\, x = 0.$$

Since $\mu \notin \sigma(A)$, the diagonal matrix $(D - \mu I)$ is invertible and the previous equation can be written in the form

$$\big( I + (D - \mu I)^{-1} (X^{-1} E X) \big)\, x = 0.$$

Passing to the $\|\cdot\|_p$ norm and proceeding as in the proof of Theorem 5.2, we get

$$1 \leq \|(D - \mu I)^{-1}\|_p\, K_p(X) \|E\|_p,$$

from which the estimate (5.8) follows, since

$$\|(D - \mu I)^{-1}\|_p = \Big( \min_{\lambda \in \sigma(A)} |\lambda - \mu| \Big)^{-1}. \qquad \square$$
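The Bauer-Fike bound (5.8) can be verified numerically. The sketch below (illustrative NumPy code; the random matrix is diagonalizable with probability one, and X is the eigenvector matrix returned by `eig`) checks that every eigenvalue of the perturbed matrix stays within $K_2(X)\|E\|_2$ of the original spectrum:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
E = 1e-6 * rng.standard_normal((6, 6))

lam, X = np.linalg.eig(A)      # A = X diag(lam) X^{-1}
K2 = np.linalg.cond(X, 2)      # K_2(X) = ||X||_2 ||X^{-1}||_2
bound = K2 * np.linalg.norm(E, 2)

# (5.8): each eigenvalue mu of A+E is within K_2(X)||E||_2 of sigma(A)
for mu in np.linalg.eigvals(A + E):
    assert np.min(np.abs(lam - mu)) <= bound + 1e-12
```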


If A is a normal matrix, from the Schur decomposition theorem (see Section 1.8) it follows that the similarity transformation matrix X is unitary, so that $K_2(X) = 1$. This implies that

$$\forall \mu \in \sigma(A + E), \quad \min_{\lambda \in \sigma(A)} |\lambda - \mu| \leq \|E\|_2, \qquad (5.9)$$

hence the eigenvalue problem is well-conditioned with respect to the absolute error. This, however, does not prevent the matrix eigenvalue problem from being affected by significant relative errors, especially when A has a widely spread spectrum.

Example 5.2 Let us consider, for $1 \leq n \leq 10$, the calculation of the eigenvalues of the Hilbert matrix $H_n \in \mathbb{R}^{n \times n}$ (see Example 3.2, Chapter 3). It is symmetric (thus, in particular, normal) and exhibits, for $n \geq 4$, a very large condition number. Let $E_n \in \mathbb{R}^{n \times n}$ be a matrix having constant entries equal to $\eta = 10^{-3}$. We show in Table 5.1 the results of the computation of the minimum in (5.9), taking $p = 2$ (that is, $\|E_n\|_2 = n\eta$). Notice how the absolute error is decreasing, since the eigenvalue of minimum module tends to zero, whilst the relative error is increasing as the size $n$ of the matrix increases, due to the higher sensitivity of "small" eigenvalues with respect to rounding errors. •

n    Abs. Err.      Rel. Err.     $\|E_n\|_2$   $K_2(H_n)$    $K_2(H_n + E_n)$
1    1·10^-3        1·10^-3       1·10^-3       1             1
2    1.677·10^-4    1.446·10^-3   2·10^-3       19.28         19.26
4    5.080·10^-7    2.207·10^-3   4·10^-3       1.551·10^4    1.547·10^4
8    1.156·10^-12   3.496·10^-3   8·10^-3       1.526·10^10   1.515·10^10
10   1.355·10^-15   4.078·10^-3   1·10^-2       1.603·10^13   1.589·10^13

TABLE 5.1. Relative and absolute errors in the calculation of the eigenvalues of the Hilbert matrix (using the MATLAB intrinsic function eig). "Abs. Err." and "Rel. Err." denote respectively the absolute and relative errors (with respect to $\lambda$)
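The behavior reported in Table 5.1 can be reproduced qualitatively in a few lines (an illustrative NumPy sketch, not the MATLAB code used for the table). Since a matrix with constant entries $\eta$ has rank one, $\|E_n\|_2 = n\eta$ exactly, and (5.9) applies because $H_n$ is normal:

```python
import numpy as np

eta = 1e-3
for n in (2, 4, 8):
    i = np.arange(n)
    H = 1.0 / (i[:, None] + i[None, :] + 1)   # Hilbert matrix H_n
    E = np.full((n, n), eta)                  # constant perturbation, ||E||_2 = n*eta
    lam = np.linalg.eigvalsh(H)
    mu = np.linalg.eigvalsh(H + E)
    # (5.9): H is normal, so each perturbed eigenvalue lies within
    # ||E_n||_2 = n*eta of the spectrum of H in the absolute sense.
    for m in mu:
        assert np.min(np.abs(lam - m)) <= n * eta + 1e-12
```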

The Bauer-Fike theorem states that the matrix eigenvalue problem is well-conditioned if A is a normal matrix. Failure to fulfil this property, however, does not necessarily imply that A must exhibit a "strong" numerical sensitivity to the computation of every one of its eigenvalues. In this respect, the following result holds, which can be regarded as an a priori estimate of the conditioning of the calculation of a particular eigenvalue of a matrix.

Theorem 5.4 Let $A \in \mathbb{C}^{n \times n}$ be a diagonalizable matrix; let $\lambda$, $x$ and $y$ be a simple eigenvalue of A and its associated right and left eigenvectors, respectively, with $\|x\|_2 = \|y\|_2 = 1$. Moreover, for $\varepsilon > 0$, let $A(\varepsilon) = A + \varepsilon E$, with $E \in \mathbb{C}^{n \times n}$ such that $\|E\|_2 = 1$. Denoting by $\lambda(\varepsilon)$ and $x(\varepsilon)$ the eigenvalue and the corresponding eigenvector of $A(\varepsilon)$, such that $\lambda(0) = \lambda$ and $x(0) = x$,

$$\Big| \frac{\partial \lambda}{\partial \varepsilon}(0) \Big| \leq \frac{1}{|y^H x|}. \qquad (5.10)$$

Proof. Let us first prove that $y^H x \neq 0$. Setting $Y = (y_1, \ldots, y_n) = (X^H)^{-1}$, with $y_k \in \mathbb{C}^n$ for $k = 1, \ldots, n$, it follows that $y_k^H A = \lambda_k y_k^H$, i.e., the rows of $X^{-1} = Y^H$ are left eigenvectors of A. Then, since $Y^H X = I$, $y_i^H x_j = \delta_{ij}$ for $i, j = 1, \ldots, n$, $\delta_{ij}$ being the Kronecker symbol. This result is equivalent to saying that the eigenvectors $\{x\}$ of A and the eigenvectors $\{y\}$ of $A^H$ form a bi-orthogonal set (see (4.67)).

Let us now prove (5.10). Since the roots of the characteristic equation are continuous functions of the coefficients of the characteristic polynomial associated with $A(\varepsilon)$, it follows that the eigenvalues of $A(\varepsilon)$ are continuous functions of $\varepsilon$ (see, for instance, [Hen74], p. 281). Therefore, in a neighborhood of $\varepsilon = 0$,

$$(A + \varepsilon E)\, x(\varepsilon) = \lambda(\varepsilon)\, x(\varepsilon).$$

Differentiating the previous equation with respect to $\varepsilon$ and setting $\varepsilon = 0$ yields

$$A \frac{\partial x}{\partial \varepsilon}(0) + E x = \frac{\partial \lambda}{\partial \varepsilon}(0)\, x + \lambda \frac{\partial x}{\partial \varepsilon}(0),$$

from which, left-multiplying both sides by $y^H$ and recalling that $y^H$ is a left eigenvector of A,

$$\frac{\partial \lambda}{\partial \varepsilon}(0) = \frac{y^H E x}{y^H x}.$$

Using the Cauchy-Schwarz inequality gives the desired estimate (5.10). $\square$

Notice that $|y^H x| = |\cos(\theta_\lambda)|$, where $\theta_\lambda$ is the angle between the eigenvectors $y$ and $x$ (both having unit Euclidean norm). Therefore, if these two vectors are almost orthogonal the computation of the eigenvalue $\lambda$ turns out to be ill-conditioned. The quantity

$$\kappa(\lambda) = \frac{1}{|y^H x|} = \frac{1}{|\cos(\theta_\lambda)|} \qquad (5.11)$$

can thus be taken as the condition number of the eigenvalue $\lambda$. Obviously, $\kappa(\lambda) \geq 1$; when A is a normal matrix, since it is unitarily similar to a diagonal matrix, the left and right eigenvectors $y$ and $x$ coincide, yielding $\kappa(\lambda) = 1/\|x\|_2^2 = 1$.
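The condition number (5.11) can be computed from approximate left and right eigenvectors. In the NumPy sketch below (illustrative; the helper name and the eigenvalue-matching strategy are our own, valid for simple, well-separated eigenvalues) the left eigenvectors are obtained as eigenvectors of $A^H$ for the conjugate eigenvalues:

```python
import numpy as np

def eig_condition_numbers(A):
    """kappa(lambda_i) = 1/|y_i^H x_i| as in (5.11), per eigenvalue."""
    lam, X = np.linalg.eig(A)            # right eigenvectors
    mu, Y = np.linalg.eig(A.conj().T)    # eigenvectors of A^H
    kappa = np.empty(len(lam))
    for i in range(len(lam)):
        # y_i solves A^H y = conj(lambda_i) y; match by nearest eigenvalue
        j = np.argmin(np.abs(mu - lam[i].conj()))
        y = Y[:, j] / np.linalg.norm(Y[:, j])
        x = X[:, i] / np.linalg.norm(X[:, i])
        kappa[i] = 1.0 / abs(np.vdot(y, x))   # vdot conjugates y: y^H x
    return kappa

# For a symmetric (hence normal) matrix every kappa equals 1 ...
S = np.array([[2., 1.], [1., 3.]])
assert np.allclose(eig_condition_numbers(S), 1.0)
# ... while a nearly defective matrix has very large condition numbers.
N = np.array([[1., 1e3], [0., 1.001]])
assert np.all(eig_condition_numbers(N) > 1e3)
```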

Inequality (5.10) can be roughly interpreted as stating that perturbations of the order of $\delta\varepsilon$ in the entries of matrix A induce changes of the order of $\delta\lambda = \delta\varepsilon / |\cos(\theta_\lambda)|$ in the eigenvalue $\lambda$. If normal matrices are considered, the calculation of $\lambda$ is a well-conditioned problem; the case of a generic non-symmetric matrix A can be conveniently dealt with using methods based on similarity transformations, as will be seen in later sections.


It is interesting to check that the conditioning of the matrix eigenvalue problem remains unchanged if the transformation matrices are unitary. To this end, let $U \in \mathbb{C}^{n \times n}$ be a unitary matrix and let $\widetilde{A} = U^H A U$. Also let $\lambda_j$ be an eigenvalue of A and denote by $\kappa_j$ the condition number (5.11). Moreover, let $\widetilde{\kappa}_j$ be the condition number of $\lambda_j$ when it is regarded as an eigenvalue of $\widetilde{A}$. Finally, let $\{x_k\}$, $\{y_k\}$ be the right and left eigenvectors of A respectively. Clearly, $\{U^H x_k\}$, $\{U^H y_k\}$ are the right and left eigenvectors of $\widetilde{A}$. Thus, for any $j = 1, \ldots, n$,

$$\widetilde{\kappa}_j = \big| y_j^H U U^H x_j \big|^{-1} = \kappa_j,$$

from which it follows that the stability of the computation of $\lambda_j$ is not affected by performing similarity transformations using unitary matrices.
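The invariance of $\kappa(\lambda_j)$ under unitary similarity can be checked numerically. The sketch below (illustrative; the orthogonal matrix comes from a QR factorization, and eigenvalues are matched by sorting, which assumes they are distinct) compares the condition numbers before and after the transformation:

```python
import numpy as np

def kappas(M):
    """Eigenvalues and condition numbers kappa = 1/|y^H x| of (5.11)."""
    lam, X = np.linalg.eig(M)
    mu, Y = np.linalg.eig(M.conj().T)   # left eigenvectors via A^H
    out = []
    for i in range(len(lam)):
        j = np.argmin(np.abs(mu - lam[i].conj()))
        y = Y[:, j] / np.linalg.norm(Y[:, j])
        x = X[:, i] / np.linalg.norm(X[:, i])
        out.append(1.0 / abs(np.vdot(y, x)))
    return lam, np.array(out)

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal U
At = Q.T @ A @ Q                                  # unitary similarity

lam_A, k_A = kappas(A)
lam_T, k_T = kappas(At)
order = np.argsort(lam_A)       # match eigenvalues by sorting
order_T = np.argsort(lam_T)
assert np.allclose(lam_A[order], lam_T[order_T])
assert np.allclose(k_A[order], k_T[order_T], rtol=1e-6)
```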

It can also be checked that unitary transformation matrices do not change the Euclidean length and the angles between vectors in $\mathbb{C}^n$. Moreover, the following a priori estimate holds (see [GL89], p. 317)

$$fl\big( X^{-1} A X \big) = X^{-1} A X + E, \quad \text{with } \|E\|_2 \simeq u\, K_2(X) \|A\|_2, \qquad (5.12)$$

where $fl(\mathrm{M})$ is the machine representation of matrix M and $u$ is the roundoff unit (see Section 2.5). From (5.12) it follows that using nonunitary transformation matrices in the eigenvalue computation can lead to an unstable process with respect to rounding errors.

We conclude this section with a stability result for the approximation of the eigenvector associated with a simple eigenvalue. Under the same assumptions as in Theorem 5.4, the following result holds (see [Atk89], Problem 6, pp. 649-650, for the proof).

Property 5.5 The eigenvectors $x_k$ and $x_k(\varepsilon)$ of the matrices A and $A(\varepsilon) = A + \varepsilon E$, with $\|x_k(\varepsilon)\|_2 = \|x_k\|_2 = 1$ for $k = 1, \ldots, n$, satisfy

$$\|x_k(\varepsilon) - x_k\|_2 \leq \frac{\varepsilon \|E\|_2}{\min_{j \neq k} |\lambda_k - \lambda_j|} + O(\varepsilon^2), \qquad \forall k = 1, \ldots, n.$$

Analogous to (5.11), the quantity

$$\kappa(x_k) = \frac{1}{\min_{j \neq k} |\lambda_k - \lambda_j|}$$

can be regarded as the condition number of the eigenvector $x_k$. Computing $x_k$ might be an ill-conditioned operation if some eigenvalues $\lambda_j$ are "very close" to the eigenvalue $\lambda_k$ associated with $x_k$.
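The role of the eigenvalue gap in $\kappa(x_k)$ is easy to see numerically. Below, an illustrative sketch (matrices and perturbation chosen by us for the demonstration) compares the eigenvector shift for a well-separated spectrum against a clustered one under the same perturbation $\varepsilon E$ with $\|E\|_2 = 1$:

```python
import numpy as np

# Two symmetric matrices: well-separated vs. clustered eigenvalues
well = np.diag([1., 5., 9.])      # smallest gap: 4
close = np.diag([1., 1.001, 9.])  # smallest gap: 0.001

E = np.ones((3, 3))
E /= np.linalg.norm(E, 2)         # symmetric, ||E||_2 = 1
eps = 1e-5

def first_evec_shift(A):
    """||x_1(eps) - x_1||_2 for the eigenvector of the smallest eigenvalue."""
    _, V0 = np.linalg.eigh(A)
    _, V1 = np.linalg.eigh(A + eps * E)
    v0, v1 = V0[:, 0], V1[:, 0]
    if np.dot(v0, v1) < 0:        # fix the sign ambiguity of eigenvectors
        v1 = -v1
    return np.linalg.norm(v1 - v0)

# The clustered spectrum amplifies the perturbation by roughly
# 1/min|lambda_k - lambda_j|: here about 1/0.001 versus 1/4.
assert first_evec_shift(close) > 50 * first_evec_shift(well)
```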

5.2.2 A posteriori Estimates

The a priori estimates examined in the previous section characterize the stability properties of the matrix eigenvalue and eigenvector problem. From the implementation standpoint, it is also important to have at our disposal a posteriori estimates that allow for a run-time control of the quality of the approximation that is being constructed. Since the methods that will be considered later are iterative processes, the results of this section can be usefully employed to devise reliable stopping criteria for these methods.