Ax = λx,    y^H A = λ y^H .

The eigenvalue λ corresponding to the eigenvector x can be determined by computing the Rayleigh quotient λ = x^H Ax/(x^H x). The number λ is a solution of the characteristic equation

pA(λ) = det(A − λI) = 0,

where pA(λ) is the characteristic polynomial. Since this latter is a polynomial of degree n with respect to λ, there certainly exist n eigenvalues of A, not necessarily distinct. The following properties can be proved

det(A) = ∏_{i=1}^n λi ,    tr(A) = ∑_{i=1}^n λi ,    (1.6)

and since det(A^T − λI) = det((A − λI)^T) = det(A − λI), one concludes that σ(A) = σ(A^T) and, in an analogous way, that σ(A^H) = σ̄(A), where σ̄(A) denotes the set of the complex conjugates of the elements of σ(A).

From the first relation in (1.6) it can be concluded that a matrix is singular iff it has at least one null eigenvalue, since pA(0) = det(A) = ∏_{i=1}^n λi.

Secondly, if A has real entries, pA(λ) turns out to be a real-coefficient polynomial, so that the complex eigenvalues of A necessarily occur in complex conjugate pairs.
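As a quick numerical illustration (the matrix below is our own example, not from the text), the two relations in (1.6) and the conjugate-pair property can be checked with NumPy:

```python
# Numerical check of (1.6) and of the conjugate-pair property.
# A is real with eigenvalues i, -i and 2.
import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 2.0]])

lam = np.linalg.eigvals(A)

det_ok = np.isclose(np.prod(lam), np.linalg.det(A))   # det(A) = product of eigenvalues
tr_ok = np.isclose(np.sum(lam), np.trace(A))          # tr(A) = sum of eigenvalues
# complex eigenvalues of a real matrix come in conjugate pairs:
# the spectrum equals its own conjugate as a multiset
pairs_ok = np.allclose(np.sort_complex(lam), np.sort_complex(lam.conj()))
```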

1.7 Eigenvalues and Eigenvectors 13

Finally, due to the Cayley-Hamilton Theorem, if pA(λ) is the characteristic polynomial of A, then pA(A) = 0, where pA(A) denotes a matrix polynomial (for the proof see, e.g., [Axe94], p. 51).

The maximum modulus of the eigenvalues of A is called the spectral radius of A and is denoted by

ρ(A) = max_{λ∈σ(A)} |λ|.    (1.7)

Characterizing the eigenvalues of a matrix as the roots of a polynomial implies in particular that λ is an eigenvalue of A ∈ C^{n×n} iff λ̄ is an eigenvalue of A^H. An immediate consequence is that ρ(A) = ρ(A^H). Moreover, ∀A ∈ C^{n×n}, ∀α ∈ C, ρ(αA) = |α| ρ(A), and ρ(A^k) = [ρ(A)]^k ∀k ∈ N.
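These properties of the spectral radius can be verified directly; a minimal sketch with NumPy, on a made-up triangular matrix whose spectrum is read off the diagonal:

```python
# Sketch of the spectral radius (1.7) and of the properties
# rho(alpha*A) = |alpha| * rho(A) and rho(A^k) = rho(A)^k.
import numpy as np

def rho(A):
    """Spectral radius: the maximum modulus of the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

A = np.array([[2.0,  1.0],
              [0.0, -3.0]])     # triangular, so sigma(A) = {2, -3} and rho(A) = 3
alpha, k = -0.5, 3

scaling_ok = np.isclose(rho(alpha * A), abs(alpha) * rho(A))
power_ok = np.isclose(rho(np.linalg.matrix_power(A, k)), rho(A) ** k)
```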

Finally, assume that A is a block triangular matrix

    A = [ A11  A12  ...  A1k
           0   A22  ...  A2k
           :         ..    :
           0   ...    0  Akk ] .

As pA(λ) = pA11(λ) pA22(λ) · · · pAkk(λ), the spectrum of A is given by the union of the spectra of each single diagonal block. As a consequence, if A is triangular, the eigenvalues of A are its diagonal entries.
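This union property is easy to check numerically; a small NumPy sketch with made-up blocks:

```python
# The spectrum of a block upper triangular matrix is the union of the
# spectra of its diagonal blocks.
import numpy as np

A11 = np.array([[1.0, 5.0],
                [0.0, 2.0]])
A22 = np.array([[4.0]])
A12 = np.array([[7.0],
                [8.0]])
A = np.block([[A11, A12],
              [np.zeros((1, 2)), A22]])

spec_A = np.sort_complex(np.linalg.eigvals(A))
spec_blocks = np.sort_complex(np.concatenate([np.linalg.eigvals(A11),
                                              np.linalg.eigvals(A22)]))
union_ok = np.allclose(spec_A, spec_blocks)
```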

For each eigenvalue λ of a matrix A, the set of the eigenvectors associated with λ, together with the null vector, identifies a subspace of C^n which is called the eigenspace associated with λ and corresponds by definition to ker(A − λI). The dimension of the eigenspace is

dim [ker(A − λI)] = n − rank(A − λI),

and is called the geometric multiplicity of the eigenvalue λ. It can never be greater than the algebraic multiplicity of λ, which is the multiplicity of λ as a root of the characteristic polynomial. Eigenvalues having geometric multiplicity strictly less than the algebraic one are called defective. A matrix having at least one defective eigenvalue is called defective.

The eigenspace associated with an eigenvalue of a matrix A is invariant with respect to A in the sense of the following definition.

Definition 1.13 A subspace S in C^n is called invariant with respect to a square matrix A if AS ⊂ S, where AS is the image of S under A.
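The formula dim[ker(A − λI)] = n − rank(A − λI) gives a direct way to detect a defective eigenvalue; a sketch on the standard 2×2 Jordan block:

```python
# Geometric multiplicity computed as n - rank(A - lam*I). The matrix below
# has lam = 2 with algebraic multiplicity 2 but geometric multiplicity 1,
# hence it is defective.
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0
n = A.shape[0]

geom_mult = n - np.linalg.matrix_rank(A - lam * np.eye(n))
```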

14 1. Foundations of Matrix Analysis

1.8 Similarity Transformations

Definition 1.14 Let C be a square nonsingular matrix having the same order as the matrix A. We say that the matrices A and C^{-1}AC are similar, and the transformation from A to C^{-1}AC is called a similarity transformation. Moreover, we say that the two matrices are unitarily similar if C is unitary.

Two similar matrices share the same spectrum and the same characteristic polynomial. Indeed, it is easy to check that if (λ, x) is an eigenvalue-eigenvector pair of A, then (λ, C^{-1}x) is an eigenvalue-eigenvector pair of C^{-1}AC, since

(C^{-1}AC)(C^{-1}x) = C^{-1}Ax = λ C^{-1}x.

We notice in particular that the product matrices AB and BA, with A ∈ C^{n×m} and B ∈ C^{m×n}, are not similar but satisfy the following property (see [Hac94], p. 18, Theorem 2.4.6)

σ(AB) \ {0} = σ(BA) \ {0},

that is, AB and BA share the same spectrum apart from null eigenvalues, so that ρ(AB) = ρ(BA).
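This can be observed numerically on random rectangular matrices (our own illustration): BA has the eigenvalues of AB plus extra zeros.

```python
# AB and BA (A: n x m, B: m x n) share the nonzero part of their spectra,
# so rho(AB) = rho(BA).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))
B = rng.standard_normal((4, 2))

eig_AB = np.linalg.eigvals(A @ B)      # 2 eigenvalues
eig_BA = np.linalg.eigvals(B @ A)      # 4 eigenvalues: the same 2, plus zeros

def nonzero(w, tol=1e-10):
    """Keep only the eigenvalues that are not numerically zero."""
    return np.sort_complex(w[np.abs(w) > tol])

spectra_ok = np.allclose(nonzero(eig_AB), nonzero(eig_BA))
rho_ok = np.isclose(np.max(np.abs(eig_AB)), np.max(np.abs(eig_BA)))
```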

The use of similarity transformations aims at reducing the complexity

of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given

matrix could be transformed into a similar matrix in diagonal or triangular

form, the computation of the eigenvalues would be immediate. The main

result in this direction is the following theorem (for the proof, see [Dem97],

Theorem 4.2).

Property 1.5 (Schur decomposition) Given A ∈ C^{n×n}, there exists a unitary matrix U such that

    U^{-1}AU = U^H AU = [ λ1  b12  ...  b1n
                           0   λ2  ...  b2n
                           :        ..    :
                           0  ...    0   λn ] = T,

where λi are the eigenvalues of A.

It thus turns out that every matrix A is unitarily similar to an upper

triangular matrix. The matrices T and U are not necessarily unique [Hac94].
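A Schur decomposition can be computed in practice with SciPy (this sketch requires SciPy; the matrix is a made-up example). With `output='complex'`, `schur` returns an upper triangular T and a unitary U with A = U T U^H:

```python
# Schur decomposition via scipy.linalg.schur: A = U T U^H, with T upper
# triangular carrying the eigenvalues of A on its diagonal.
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [0.0, 0.0, 4.0]])

T, U = schur(A, output='complex')

triangular_ok = np.allclose(T, np.triu(T))
unitary_ok = np.allclose(U.conj().T @ U, np.eye(3))
reconstruct_ok = np.allclose(U @ T @ U.conj().T, A)
eigs_ok = np.allclose(np.sort_complex(np.diag(T)),
                      np.sort_complex(np.linalg.eigvals(A)))
```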

The Schur decomposition theorem gives rise to several important results;

among them, we recall:

1. every hermitian matrix is unitarily similar to a diagonal real matrix, that is, when A is hermitian every Schur decomposition of A is diagonal. In such an event, since

U^{-1}AU = Λ = diag(λ1 , . . . , λn ),


it turns out that AU = UΛ, that is, Aui = λi ui for i = 1, . . . , n, so that the column vectors of U are the eigenvectors of A. Moreover, since the eigenvectors are pairwise orthogonal, it turns out that a hermitian matrix has a system of orthonormal eigenvectors that generates the whole space C^n. Finally, it can be shown that a matrix A of order n is similar to a diagonal matrix D iff the eigenvectors of A form a basis for C^n [Axe94];

2. a matrix A ∈ C^{n×n} is normal iff it is unitarily similar to a diagonal matrix. As a consequence, a normal matrix A ∈ C^{n×n} admits the following spectral decomposition: A = UΛU^H = ∑_{i=1}^n λi ui ui^H, U being unitary and Λ diagonal [SS90];

3. let A and B be two normal and commuting matrices; then, the generic eigenvalue µi of A + B is given by the sum λi + ξi, where λi and ξi are the eigenvalues of A and B associated with the same eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal

matrices, but these are not unitarily similar (see, e.g., Exercise 7).
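Point 1 above can be checked with NumPy's `eigh` (the hermitian matrix below is our own example): the eigenvalues come out real, and the columns of U form an orthonormal system that diagonalizes A.

```python
# A hermitian matrix has real eigenvalues and an orthonormal system of
# eigenvectors (the columns of U), with U^H A U diagonal.
import numpy as np

A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])      # A = A^H

lam, U = np.linalg.eigh(A)             # lam is returned as a real array

orthonormal_ok = np.allclose(U.conj().T @ U, np.eye(2))
diagonal_ok = np.allclose(U.conj().T @ A @ U, np.diag(lam))
```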

The Schur decomposition can be improved as follows (for the proof see,

e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix. Then there exists a nonsingular matrix X which transforms A into a block diagonal matrix J such that

X^{-1}AX = J = diag (J_{k1}(λ1), J_{k2}(λ2), . . . , J_{kl}(λl)) ,

which is called the canonical Jordan form, λj being the eigenvalues of A and Jk(λ) ∈ C^{k×k} a Jordan block of the form J1(λ) = λ if k = 1 and

    Jk(λ) = [ λ  1  0  ...  0
              0  λ  1  ...  :
              :     .   .   0
              :         λ   1
              0  ... ... 0  λ ] ,  for k > 1.

If an eigenvalue is defective, the size of the corresponding Jordan block is greater than one. Therefore, the canonical Jordan form tells us that a matrix can be diagonalized by a similarity transformation iff it is nondefective. For this reason, the nondefective matrices are called diagonalizable. In particular, normal matrices are diagonalizable.


Partitioning X by columns, X = (x1, . . . , xn), it can be seen that the ki vectors associated with the Jordan block J_{ki}(λi) satisfy the following recursive relation

Ax_l = λi x_l ,    l = ∑_{j=1}^{i-1} mj + 1,
Ax_j = λi x_j + x_{j-1} ,    j = l + 1, . . . , l − 1 + ki ,  if ki ≠ 1.    (1.8)

The vectors xi are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix

    A = [  7/4   3/4  -1/4  -1/4  -1/4   1/4
            0     2     0     0     0     0
          -1/2  -1/2   5/2   1/2  -1/2   1/2
          -1/2  -1/2  -1/2   5/2   1/2   1/2
          -1/4  -1/4  -1/4  -1/4  11/4   1/4
          -3/2  -1/2  -1/2   1/2   1/2   7/2 ] .

The Jordan canonical form of A and its associated matrix X are given by

    J = [ 2 1 0 0 0 0          X = [ 1 0 0 0 0 1
          0 2 0 0 0 0                0 1 0 0 0 1
          0 0 3 1 0 0                0 0 1 0 0 1
          0 0 0 3 1 0 ,              0 0 0 1 0 1 .
          0 0 0 0 3 0                0 0 0 0 1 1
          0 0 0 0 0 2 ]              1 1 1 1 1 1 ]

Notice that two different Jordan blocks are related to the same eigenvalue (λ = 2). It is easy to check property (1.8). Consider, for example, the Jordan block associated with the eigenvalue λ2 = 3; we have

Ax3 = [0 0 3 0 0 3]^T = 3 [0 0 1 0 0 1]^T = λ2 x3 ,
Ax4 = [0 0 1 3 0 4]^T = 3 [0 0 0 1 0 1]^T + [0 0 1 0 0 1]^T = λ2 x4 + x3 ,
Ax5 = [0 0 0 1 3 4]^T = 3 [0 0 0 0 1 1]^T + [0 0 0 1 0 1]^T = λ2 x5 + x4 .
•
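The factorization of Example 1.6 can also be verified numerically; a NumPy sketch restating A, X and J from the example (A is scaled by 4 to keep integer entries):

```python
# Verification of Example 1.6: X^{-1} A X = J, i.e. A X = X J.
import numpy as np

A = np.array([[ 7,  3, -1, -1, -1,  1],
              [ 0,  8,  0,  0,  0,  0],
              [-2, -2, 10,  2, -2,  2],
              [-2, -2, -2, 10,  2,  2],
              [-1, -1, -1, -1, 11,  1],
              [-6, -2, -2,  2,  2, 14]], dtype=float) / 4.0

X = np.array([[1, 0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [0, 0, 1, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 0, 1, 1],
              [1, 1, 1, 1, 1, 1]], dtype=float)

J = np.array([[2, 1, 0, 0, 0, 0],
              [0, 2, 0, 0, 0, 0],
              [0, 0, 3, 1, 0, 0],
              [0, 0, 0, 3, 1, 0],
              [0, 0, 0, 0, 3, 0],
              [0, 0, 0, 0, 0, 2]], dtype=float)

jordan_ok = np.allclose(np.linalg.solve(X, A @ X), J)
```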

1.9 The Singular Value Decomposition (SVD)

Any matrix can be reduced in diagonal form by a suitable pre and post-

multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let A∈ Cm—n . There exist two unitary matrices U∈ Cm—m

and V∈ Cn—n such that

UH AV = Σ = diag(σ1 , . . . , σp ) ∈ Cm—n with p = min(m, n) (1.9)

and σ1 ≥ . . . ≥ σp ≥ 0. Formula (1.9) is called Singular Value Decompo-

sition or (SVD) of A and the numbers σi (or σi (A)) are called singular

values of A.


If A is a real-valued matrix, U and V will also be real-valued and in (1.9) U^T must be written instead of U^H. The following characterization of the singular values holds

σi(A) = [λi(A^H A)]^{1/2} ,    i = 1, . . . , n.    (1.10)

Indeed, from (1.9) it follows that A = UΣV^H, A^H = VΣU^H so that, U and V being unitary, A^H A = VΣ^2 V^H, that is, λi(A^H A) = λi(Σ^2) = (σi(A))^2.

Since AA^H and A^H A are hermitian matrices, the columns of U, called the left singular vectors of A, turn out to be the eigenvectors of AA^H (see Section 1.8) and, therefore, they are not uniquely defined. The same holds for the columns of V, which are the right singular vectors of A.

Relation (1.10) implies that if A ∈ C^{n×n} is hermitian with eigenvalues given by λ1, λ2, . . . , λn, then the singular values of A coincide with the moduli of the eigenvalues of A. Indeed, because AA^H = A^2, σi = (λi^2)^{1/2} = |λi| for i = 1, . . . , n. As far as the rank is concerned, if

σ1 ≥ . . . ≥ σr > σ_{r+1} = . . . = σp = 0,

then the rank of A is r, the kernel of A is the span of the column vectors of V, {v_{r+1}, . . . , vn}, and the range of A is the span of the column vectors of U, {u1, . . . , ur}.
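These rank, kernel and range characterizations are easy to check with NumPy's SVD (the rank-2 matrix below is our own example):

```python
# SVD-based rank, kernel and range: with singular values sorted decreasingly,
# the last n - r columns of V span ker(A) and the first r columns of U span
# range(A).
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],      # = 2 * first row, so rank(A) = 2
              [1.0, 0.0, 1.0]])

U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))          # numerical rank from the singular values

rank_ok = (r == np.linalg.matrix_rank(A))
kernel_ok = np.allclose(A @ Vh[r:].T, 0)            # A v_j = 0 for j > r
# every column of A lies in span{u_1, ..., u_r}
range_ok = np.allclose(U[:, :r] @ (U[:, :r].T @ A), A)
```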

Definition 1.15 Suppose that A ∈ C^{m×n} has rank equal to r and that it admits an SVD of the type U^H AV = Σ. The matrix A† = VΣ†U^H is called the Moore-Penrose pseudo-inverse matrix, being

Σ† = diag(1/σ1 , . . . , 1/σr , 0, . . . , 0).    (1.11)

The matrix A† is also called the generalized inverse of A (see Exercise 13). Indeed, if rank(A) = n < m, then A† = (A^T A)^{-1}A^T, while if n = m = rank(A), A† = A^{-1}. For further properties of A†, see also Exercise 12.
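A sketch of the construction (1.11) in NumPy, on a made-up full-column-rank matrix, compared with the formula (A^T A)^{-1} A^T and with `numpy.linalg.pinv`:

```python
# Moore-Penrose pseudo-inverse built from the SVD as in (1.11).
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])          # rank(A) = n = 2 < m = 3

U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

Sigma_dag = np.zeros((A.shape[1], A.shape[0]))   # n x m
Sigma_dag[:r, :r] = np.diag(1.0 / s[:r])         # diag(1/s_1, ..., 1/s_r, 0, ...)
A_dag = Vh.T @ Sigma_dag @ U.T                   # A† = V Σ† U^H (real case)

formula_ok = np.allclose(A_dag, np.linalg.inv(A.T @ A) @ A.T)
pinv_ok = np.allclose(A_dag, np.linalg.pinv(A))
```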

1.10 Scalar Product and Norms in Vector Spaces

Very often, to quantify errors or measure distances one needs to compute

the magnitude of a vector or a matrix. For that purpose we introduce in

this section the concept of a vector norm and, in the following one, of a

matrix norm. We refer the reader to [Ste73], [SS90] and [Axe94] for the

proofs of the properties that are reported hereafter.


Definition 1.16 A scalar product on a vector space V defined over K is any map (·, ·) acting from V × V into K which enjoys the following properties:

1. it is linear with respect to the vectors of V, that is,
   (γx + λz, y) = γ(x, y) + λ(z, y),  ∀x, z ∈ V, ∀γ, λ ∈ K;

2. it is hermitian, that is, (y, x) is the complex conjugate of (x, y), ∀x, y ∈ V ;

3. it is positive definite, that is, (x, x) > 0, ∀x ≠ 0 (in other words, (x, x) ≥ 0, and (x, x) = 0 if and only if x = 0).

In the case V = C^n (or R^n), an example is provided by the classical Euclidean scalar product given by

(x, y) = y^H x = ∑_{i=1}^n xi ȳi ,

where z̄ denotes the complex conjugate of z.

Moreover, for any given square matrix A of order n and for any x, y ∈ C^n the following relation holds

(Ax, y) = (x, A^H y).    (1.12)

In particular, since for any matrix Q ∈ C^{n×n}, (Qx, Qy) = (x, Q^H Qy), one gets

Property 1.8 Unitary matrices preserve the Euclidean scalar product, that

is, (Qx, Qy) = (x, y) for any unitary matrix Q and for any pair of vectors

x and y.

Definition 1.17 Let V be a vector space over K. We say that the map ‖·‖ from V into R is a norm on V if the following axioms are satisfied:

1. (i) ‖v‖ ≥ 0 ∀v ∈ V and (ii) ‖v‖ = 0 if and only if v = 0;

2. ‖αv‖ = |α| ‖v‖ ∀α ∈ K, ∀v ∈ V (homogeneity property);

3. ‖v + w‖ ≤ ‖v‖ + ‖w‖ ∀v, w ∈ V (triangle inequality),

where |α| denotes the absolute value of α if K = R, the modulus of α if K = C.


The pair (V, ‖·‖) is called a normed space. We shall distinguish among norms by a suitable subscript at the margin of the double bar symbol. If the map |·| from V into R enjoys only the properties 1(i), 2 and 3, we shall call such a map a seminorm. Finally, we shall call a unit vector any vector of V having unit norm.

An example of a normed space is R^n, equipped for instance with the p-norm (or Hölder norm); this latter is defined, for a vector x of components {xi}, as

‖x‖p = ( ∑_{i=1}^n |xi|^p )^{1/p} ,  for 1 ≤ p < ∞.    (1.13)

Notice that the limit as p goes to infinity of ‖x‖p exists, is finite, and equals the maximum modulus of the components of x. Such a limit defines in turn a norm, called the infinity norm (or maximum norm), given by

‖x‖∞ = max_{1≤i≤n} |xi| .
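The convergence of ‖x‖p to ‖x‖∞ can be observed numerically; a sketch on a made-up vector:

```python
# The p-norms (1.13): as p grows, ||x||_p decreases towards the infinity
# norm max_i |x_i|.
import numpy as np

x = np.array([3.0, -4.0, 1.0])

def p_norm(x, p):
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

vals = [p_norm(x, p) for p in (1, 2, 10, 100)]
monotone_ok = all(a >= b for a, b in zip(vals, vals[1:]))   # nonincreasing in p
limit_ok = np.isclose(p_norm(x, 100), np.max(np.abs(x)), rtol=1e-2)
inf_norm_ok = np.isclose(np.linalg.norm(x, np.inf), 4.0)
```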

When p = 2, from (1.13) the standard definition of the Euclidean norm is recovered,

‖x‖2 = (x, x)^{1/2} = ( ∑_{i=1}^n |xi|^2 )^{1/2} = (x^T x)^{1/2} ,

for which the following property holds.

Property 1.9 (Cauchy-Schwarz inequality) For any pair x, y ∈ R^n,

|(x, y)| = |x^T y| ≤ ‖x‖2 ‖y‖2 ,    (1.14)

where equality holds iff y = αx for some α ∈ R.

We recall that the scalar product in R^n can be related to the p-norms introduced over R^n in (1.13) by the Hölder inequality

|(x, y)| ≤ ‖x‖p ‖y‖q ,  with 1/p + 1/q = 1.
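A quick numerical check of the Hölder inequality (of which Cauchy-Schwarz is the case p = q = 2), on random vectors of our choosing:

```python
# Hölder inequality |(x, y)| <= ||x||_p ||y||_q with 1/p + 1/q = 1.
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

def holds(p):
    q = p / (p - 1.0)               # conjugate exponent of p
    return abs(x @ y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q) + 1e-12

holder_ok = all(holds(p) for p in (1.5, 2.0, 3.0))
```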

In the case where V is a finite-dimensional space the following property holds (for a sketch of the proof, see Exercise 14).

Property 1.10 Any vector norm ‖·‖ defined on V is a continuous function of its argument, namely, ∀ε > 0, ∃C > 0 such that if ‖x − x̃‖ ≤ ε then |‖x‖ − ‖x̃‖| ≤ Cε, for any x, x̃ ∈ V.

New norms can be easily built using the following result.