To compute the eigenvalues of A, given by -4, ±i, 2 and 5, we apply the QR method
and we compute the matrix T^(40) after 40 iterations of Program 30. Notice that
the algorithm converges to the real Schur decomposition (5.34) of A, with three
diagonal blocks R_ii of order 1 (i = 1, 2, 3) and with the block R_44 = T^(40)(4:5, 4:5)
having eigenvalues equal to ±i

T^(40) =
  [ 4.9997   18.9739   -34.2570   32.8760   -28.4604 ]
  [ 0        -3.9997     6.7693   -6.4968     5.6216 ]
  [ 0         0          2        -1.4557     1.1562 ]
  [ 0         0          0         0.3129    -0.8709 ]
  [ 0         0          0         1.2607    -0.3129 ].
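The claim on R_44 can be checked directly: the eigenvalues of a 2x2 block [a b; c d] are the roots of λ² - (a+d)λ + (ad - bc) = 0. A small Python check (an aside, not one of the book's MATLAB programs; the entries are taken from the trailing block of the display above):

```python
import cmath

# Trailing 2x2 block R44 = T(40)(4:5,4:5) of the computed Schur form
a, b = 0.3129, -0.8709
c, d = 1.2607, -0.3129

tr = a + d              # ~ 0
det = a * d - b * c     # ~ 1
# Roots of lambda^2 - tr*lambda + det = 0
lam1 = (tr + cmath.sqrt(tr * tr - 4 * det)) / 2
lam2 = (tr - cmath.sqrt(tr * tr - 4 * det)) / 2
print(lam1, lam2)       # two nearly purely imaginary conjugate values, ~ +/- i
```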

212 5. Approximation of Eigenvalues and Eigenvectors

•

Example 5.9 Let us now employ the QR method to generate the real Schur
decomposition of the matrix A below, after reducing it to upper Hessenberg form

A =
  [ 17   24    1    8   15 ]
  [ 23    5    7   14   16 ]
  [  4    6   13   20   22 ]
  [ 10   12   19   21    3 ]
  [ 11   18   25    2    9 ].
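Incidentally, A is the 5x5 magic square (magic(5) in MATLAB): every row sums to 65, so the vector of all ones is an eigenvector with eigenvalue 65. A quick Python check (an aside, not one of the book's programs):

```python
# Every row of the magic square A sums to the magic constant 65, hence
# A * [1,...,1]^T = 65 * [1,...,1]^T and 65 is an eigenvalue of A.
A = [[17, 24,  1,  8, 15],
     [23,  5,  7, 14, 16],
     [ 4,  6, 13, 20, 22],
     [10, 12, 19, 21,  3],
     [11, 18, 25,  2,  9]]

row_sums = [sum(row) for row in A]
print(row_sums)  # -> [65, 65, 65, 65, 65]
```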

The eigenvalues of A are real and given (to four significant figures) by λ1 =
65, λ2,3 = ±21.28 and λ4,5 = ±13.13. After 40 iterations of Program 30, the
computed matrix reads

T^(40) =
  [ 65    0          0           0          0       ]
  [ 0    14.6701    14.2435     4.4848    -3.4375   ]
  [ 0    16.6735   -14.6701    -1.2159     2.0416   ]
  [ 0     0          0        -13.0293    -0.7643   ]
  [ 0     0          0         -3.3173    13.0293   ].

It is not upper triangular, but block upper triangular, with a diagonal block
R11 = 65 and the two blocks

R22 = [ 14.6701   14.2435 ]       R33 = [ -13.0293   -0.7643 ]
      [ 16.6735  -14.6701 ],            [  -3.3173   13.0293 ],

having spectra given by σ(R22) = λ2,3 and σ(R33) = λ4,5 respectively.
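As with the 2x2 block in the previous example, the spectra of R22 and R33 follow from their traces and determinants. A Python check (illustrative aside; entries copied from the blocks above):

```python
import math

def eig2(a, b, c, d):
    """Eigenvalues of a 2x2 matrix [a b; c d] with real spectrum."""
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

l2, l3 = eig2(14.6701, 14.2435, 16.6735, -14.6701)   # sigma(R22) ~ +/- 21.28
l4, l5 = eig2(-13.0293, -0.7643, -3.3173, 13.0293)   # sigma(R33) ~ +/- 13.13
print(l2, l3, l4, l5)
```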

It is important to recognize that the matrix T^(40) is not the real Schur decomposition
of A, but only a "cheating" version of it. In fact, in order for the QR method
to converge to the real Schur decomposition of A, it is mandatory to resort to
the shift techniques introduced in Section 5.7.
•

5.6.5 Implementation of Transformation Matrices

In the definition (5.42) it is convenient to choose the minus sign, obtaining
w^(k) = x^(n-k) - ||x^(n-k)||_2 e_1^(n-k), in such a way that the vector R_{n-k} x^(n-k)
is a positive multiple of e_1^(n-k). If x_{k+1} is positive, in order to avoid numerical
cancellation, the computation can be rationalized as follows

w_1^(k) = (x_{k+1}² - ||x^(n-k)||_2²) / (x_{k+1} + ||x^(n-k)||_2)
        = -( Σ_{j=k+2}^{n} x_j² ) / (x_{k+1} + ||x^(n-k)||_2).
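The point of the rationalized form is that it replaces the difference x_{k+1} - ||x^(n-k)||_2, which suffers cancellation when x_{k+1} dominates, by a quotient of quantities that carry no subtraction of nearly equal numbers. The Python fragment below (illustrative only) shows the naive formula losing all digits while the rationalized one keeps them:

```python
import math

# A vector whose first (positive) component dominates: the dangerous case
x = [1.0, 1e-9, 1e-9]
norm_x = math.sqrt(sum(t * t for t in x))

w1_naive = x[0] - norm_x                                    # catastrophic cancellation
w1_rational = -sum(t * t for t in x[1:]) / (x[0] + norm_x)  # rationalized formula
print(w1_naive, w1_rational)   # 0.0 versus approximately -1e-18
```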


The construction of the Householder vector is performed by Program 32,
which takes as input a vector p ∈ R^(n-k) (formerly, the vector x^(n-k)) and
returns a vector q ∈ R^(n-k) (the Householder vector w^(k)), with a cost of
the order of n flops.

If M ∈ R^(m×m) is the generic matrix to which the Householder matrix P
(5.38) is applied (where I is the identity matrix of order m and v ∈ R^m),
letting w = M^T v, then

PM = M - β v w^T,   β = 2/||v||_2².   (5.49)

Therefore, performing the product PM amounts to a matrix-vector product
(w = M^T v) plus an outer product of vectors (v w^T). The overall
computational cost of the product PM is thus equal to 2(m² + m) flops.

Similar considerations hold in the case where the product MP is to be
computed; defining w = Mv, we get

MP = M - β w v^T.   (5.50)

Notice that (5.49) and (5.50) do not require the explicit construction of
the matrix P. This reduces the computational cost to the order of m² flops,
whilst executing the product PM without taking advantage of the special
structure of P would increase the operation count to the order of m³ flops.
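The saving is easy to see in code. The sketch below (plain Python on a small dense matrix, illustrative only) computes PM both ways, via formula (5.49) without ever forming P, and by building P explicitly; the two results agree to roundoff:

```python
def apply_householder(M, v, beta):
    """P M = M - beta * v * w^T with w = M^T v (formula (5.49)), without forming P."""
    m = len(M)
    w = [sum(M[i][j] * v[i] for i in range(m)) for j in range(m)]  # w = M^T v
    return [[M[i][j] - beta * v[i] * w[j] for j in range(m)] for i in range(m)]

v = [1.0, -2.0, 0.5]
beta = 2.0 / sum(t * t for t in v)        # beta = 2 / ||v||_2^2
M = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

# Explicit construction P = I - beta * v v^T, then the full product P M
P = [[(1.0 if i == j else 0.0) - beta * v[i] * v[j] for j in range(3)]
     for i in range(3)]
PM_explicit = [[sum(P[i][l] * M[l][j] for l in range(3)) for j in range(3)]
               for i in range(3)]
PM_implicit = apply_householder(M, v, beta)
err = max(abs(PM_explicit[i][j] - PM_implicit[i][j])
          for i in range(3) for j in range(3))
print(err)  # agreement up to roundoff
```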

Program 32 - vhouse : Construction of the Householder vector

function [v,beta]=vhouse(x)
n=length(x); x=x/norm(x); s=x(2:n)'*x(2:n); v=[1; x(2:n)];
if (s==0), beta=0;
else
  mu=sqrt(x(1)^2+s);
  if (x(1) <= 0), v(1)=x(1)-mu;
  else, v(1)=-s/(x(1)+mu); end
  beta=2*v(1)^2/(s+v(1)^2); v=v/v(1);
end
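Program 32 translates almost line by line into other languages. Below is a Python rendition (an illustrative port, not from the book), followed by a check that the reflector P = I - β v v^T built from its output maps x onto a positive multiple of e_1, as stated above:

```python
import math

def vhouse(x):
    """Householder vector v (with v[0] = 1) and coefficient beta; port of Program 32."""
    nrm = math.sqrt(sum(t * t for t in x))
    x = [t / nrm for t in x]
    s = sum(t * t for t in x[1:])
    v = [1.0] + x[1:]
    if s == 0:
        return v, 0.0
    mu = math.sqrt(x[0] ** 2 + s)
    # stable choice of v[0], avoiding cancellation when x[0] > 0
    v[0] = x[0] - mu if x[0] <= 0 else -s / (x[0] + mu)
    beta = 2 * v[0] ** 2 / (s + v[0] ** 2)
    return [t / v[0] for t in v], beta

x = [3.0, 1.0, 2.0]
v, beta = vhouse(x)
vTx = sum(vi * xi for vi, xi in zip(v, x))
Px = [xi - beta * vTx * vi for vi, xi in zip(v, x)]  # P x = x - beta*(v^T x)*v
print(Px)  # first entry ~ ||x||_2 = sqrt(14), remaining entries ~ 0
```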

Concerning the Givens rotation matrices, the computation of c and s is
carried out as follows. Let i and k be two fixed indices and assume that
the k-th component of a given vector x ∈ R^n must be set to zero. Letting
r = sqrt(x_i² + x_k²), relation (5.44) yields

[ c  -s ] [ x_i ]   [ r ]
[ s   c ] [ x_k ] = [ 0 ]      (5.51)

hence there is no need of explicitly computing θ, nor evaluating any trigonometric
function.

Executing Program 33 to solve system (5.51) requires 5 flops, plus the
evaluation of a square root. As already noticed in the case of Householder


matrices, even for Givens rotations we don't have to explicitly compute the
matrix G(i, k, θ) to perform its product with a given matrix M ∈ R^(m×m).
For that purpose Programs 34 and 35 are used, both at the cost of 6m
flops. Looking at the structure (5.43) of matrix G(i, k, θ), it is clear that
the first algorithm only modifies rows i and k of M, whilst the second one
only changes columns i and k of M.

We conclude by noticing that the computation of the Householder vector
v and of the Givens sine and cosine (c, s) are well-conditioned operations
with respect to rounding errors (see [GL89], pp. 212-217 and the references
therein).

The solution of system (5.51) is implemented in Program 33. The input
parameters are the vector components x_i and x_k, whilst the output data
are the Givens cosine and sine c and s.

Program 33 - givcos : Computation of Givens cosine and sine

function [c,s]=givcos(xi, xk)
if (xk==0), c=1; s=0;
else
  if abs(xk) > abs(xi)
    t=-xi/xk; s=1/sqrt(1+t^2); c=s*t;
  else
    t=-xk/xi; c=1/sqrt(1+t^2); s=c*t;
  end
end

Programs 34 and 35 compute G(i, k, θ)^T M and MG(i, k, θ) respectively.
The input parameters c and s are the Givens cosine and sine. In Program
34, the indices i and k identify the rows of the matrix M that are being
affected by the update M ← G(i, k, θ)^T M, while j1 and j2 are the indices
of the columns involved in the computation. Similarly, in Program 35, i
and k identify the columns affected by the update M ← MG(i, k, θ), while
j1 and j2 are the indices of the rows involved in the computation.

Program 34 - garow : Product G(i, k, θ)^T M

function [M]=garow(M,c,s,i,k,j1,j2)

for j=j1:j2

t1=M(i,j);

t2=M(k,j);

M(i,j)=c*t1-s*t2;

M(k,j)=s*t1+c*t2;

end

Program 35 - gacol : Product MG(i, k, θ)

function [M]=gacol(M,c,s,j1,j2,i,k)
for j=j1:j2
  t1=M(j,i);
  t2=M(j,k);
  M(j,i)=c*t1-s*t2;
  M(j,k)=s*t1+c*t2;
end
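For completeness, here is the same machinery in Python (an illustrative, 0-based port of Programs 33 and 34): the rotation computed by givcos, applied to rows i and k, zeroes the (k, i) entry exactly as described by (5.51):

```python
import math

def givcos(xi, xk):
    """Givens cosine and sine, computed without trigonometric functions."""
    if xk == 0:
        return 1.0, 0.0
    if abs(xk) > abs(xi):
        t = -xi / xk
        s = 1.0 / math.sqrt(1.0 + t * t)
        return s * t, s
    t = -xk / xi
    c = 1.0 / math.sqrt(1.0 + t * t)
    return c, c * t

def garow(M, c, s, i, k):
    """In-place update M <- G(i,k)^T M on all columns (0-based indices)."""
    for j in range(len(M[0])):
        t1, t2 = M[i][j], M[k][j]
        M[i][j] = c * t1 - s * t2
        M[k][j] = s * t1 + c * t2

M = [[4.0, 1.0],
     [3.0, 2.0]]
c, s = givcos(M[0][0], M[1][0])
garow(M, c, s, 0, 1)
print(M)  # M[1][0] annihilated; M[0][0] ~ 5, the norm of the original first column
```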

5.7 The QR Iteration with Shifting Techniques

Example 5.9 reveals that the QR iteration does not always converge to the
real Schur form of a given matrix A. To make this happen, an effective
approach consists of incorporating in the QR iteration (5.32) a shifting
technique similar to that introduced for inverse iteration in Section 5.3.2.
This leads to the QR method with single shift described in Section 5.7.1,
which is used to accelerate the convergence of the QR iteration when A has
eigenvalues with moduli very close to each other.

In Section 5.7.2, a more sophisticated shifting technique is considered,
which guarantees the convergence of the QR iteration to the (approximate)
Schur form of matrix A (see Property 5.8). The resulting method (known as
the QR iteration with double shift) is the most popular version of the QR
iteration (5.32) for solving the matrix eigenvalue problem, and is implemented
in the MATLAB intrinsic function eig.

5.7.1 The QR Method with Single Shift

Given μ ∈ R, the shifted QR iteration is defined as follows. For k = 1, 2, . . . ,
until convergence:

determine Q^(k), R^(k) such that
Q^(k) R^(k) = T^(k-1) - μI   (QR factorization);
then, let                                                          (5.52)
T^(k) = R^(k) Q^(k) + μI,

where T^(0) = (Q^(0))^T A Q^(0) is in upper Hessenberg form. Since the QR
factorization in (5.52) is performed on the shifted matrix T^(k-1) - μI, the
scalar μ is called a shift. The sequence of matrices T^(k) generated by (5.52)
is still similar to the initial matrix A, since for any k ≥ 1

T^(k) = R^(k) Q^(k) + μI = (Q^(k))^T ( Q^(k) R^(k) Q^(k) + μ Q^(k) )
      = (Q^(k))^T ( Q^(k) R^(k) + μI ) Q^(k) = (Q^(k))^T T^(k-1) Q^(k)
      = (Q^(0) Q^(1) · · · Q^(k))^T A (Q^(0) Q^(1) · · · Q^(k)),    k ≥ 0.
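A single step of (5.52) is easy to trace on a 2x2 matrix. The Python sketch below (illustrative only; the QR factorization is done with one Givens rotation) confirms the similarity just derived: trace and determinant, and hence the eigenvalues, are preserved, while the subdiagonal entry shrinks:

```python
import math

def qr_step_shifted(T, mu):
    """One shifted QR step (5.52) on a 2x2 matrix T."""
    a00, a01 = T[0][0] - mu, T[0][1]
    a10, a11 = T[1][0], T[1][1] - mu
    r = math.hypot(a00, a10)
    c, s = a00 / r, a10 / r
    # R = Q^T (T - mu*I) with Q = [[c, -s], [s, c]]; R is upper triangular
    r00, r01 = r, c * a01 + s * a11
    r11 = -s * a01 + c * a11
    # T_new = R Q + mu*I
    return [[r00 * c + r01 * s + mu, -r00 * s + r01 * c],
            [r11 * s, r11 * c + mu]]

T = [[2.0, 1.0],
     [0.5, 1.0]]
T1 = qr_step_shifted(T, T[1][1])   # shift mu = t_22
print(T1)  # same trace (3) and determinant (1.5), smaller subdiagonal entry
```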


Assume μ is fixed and that the eigenvalues of A are ordered in such a way
that

|λ_1 - μ| ≥ |λ_2 - μ| ≥ . . . ≥ |λ_n - μ|.

Then it can be shown that, for 1 < j ≤ n, the subdiagonal entry t^(k)_{j,j-1}
tends to zero with a rate that is proportional to the ratio

|(λ_j - μ)/(λ_{j-1} - μ)|^k.

This extends the convergence result (5.37) to the shifted QR method (see
[GL89], Sections 7.5.2 and 7.3).

The result above suggests that if μ is chosen in such a way that

|λ_n - μ| < |λ_i - μ|,   i = 1, . . . , n - 1,

then the matrix entry t^(k)_{n,n-1} in the iteration (5.52) tends rapidly to zero
as k increases. (In the limit, if μ were equal to an eigenvalue of T^(k), that
is, of A, then t^(k)_{n,n-1} = 0 and t^(k)_{n,n} = μ.) In practice one takes

μ = t^(k)_{n,n},   (5.53)

yielding the so-called QR iteration with single shift. Correspondingly, the
convergence to zero of the sequence t^(k)_{n,n-1} is quadratic in the sense that
if |t^(k)_{n,n-1}| / ||T^(0)||_2 = η_k < 1, for some k ≥ 0, then
|t^(k+1)_{n,n-1}| / ||T^(0)||_2 = O(η_k²)
(see [Dem97], pp. 161-163 and [GL89], pp. 354-355).

This can be profitably taken into account when programming the QR
iteration with single shift by monitoring the size of the subdiagonal entry
|t^(k)_{n,n-1}|. In practice, t^(k)_{n,n-1} is set equal to zero if

|t^(k)_{n,n-1}| ≤ ε (|t^(k)_{n-1,n-1}| + |t^(k)_{n,n}|),   k ≥ 0,   (5.54)

for a prescribed ε, in general of the order of the roundoff unit. (This convergence
test is adopted in the library EISPACK.) If A is a Hessenberg
matrix, when for a certain k the entry t^(k)_{n,n-1} is set to zero, t^(k)_{n,n} provides the desired
approximation of λ_n. Then the QR iteration with shift can continue on the
matrix T^(k)(1 : n - 1, 1 : n - 1), and so on. This is a deflation algorithm
(for another example see Remark 5.3).
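The test (5.54) is a one-liner in any language. The Python fragment below (illustrative; the machine epsilon plays the role of the prescribed tolerance) shows it accepting a tiny subdiagonal entry and rejecting a large one:

```python
import sys

def negligible(t_sub, t_d1, t_d2, tol=sys.float_info.epsilon):
    """Deflation test (5.54): the subdiagonal entry t_sub is declared zero
    when small relative to the adjacent diagonal entries t_d1, t_d2."""
    return abs(t_sub) <= tol * (abs(t_d1) + abs(t_d2))

small = negligible(1e-18, 21.2768, 13.1263)   # True: safe to deflate
large = negligible(0.5, 21.2768, 13.1263)     # False: keep iterating
print(small, large)
```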

Example 5.10 We consider again the matrix A of Example 5.9. Program 36,
with toll equal to the roundoff unit, converges in 14 iterations to the following
approximate real Schur form of A, which displays the correct eigenvalues of the matrix
A on its diagonal (to six significant figures)

T^(40) =
  [ 65     0           0          0          0       ]
  [ 0    -21.2768     2.5888    -0.0445    -4.2959   ]
  [ 0      0        -13.1263    -4.0294   -13.079    ]
  [ 0      0           0        21.2768    -2.6197   ]
  [ 0      0           0         0         13.1263   ].


We also report in Table 5.2 the convergence rate p^(k) of the sequence t^(k)_{n,n-1}
(n = 5), computed as

p^(k) = 1 + (1/log(η_{k-1})) log( |t^(k)_{n,n-1}| / |t^(k-1)_{n,n-1}| ),   k ≥ 1.

The results show good agreement with the expected quadratic rate.

 k    |t^(k)_{n,n-1}| / ||T^(0)||_2    p^(k)
 0    0.13865
 1    1.5401 · 10^-2                   2.1122
 2    1.2213 · 10^-4                   2.1591
 3    1.8268 · 10^-8                   1.9775
 4    8.9036 · 10^-16                  1.9449

TABLE 5.2. Convergence rate of the sequence t^(k)_{n,n-1} in the QR iteration with
single shift
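The rates in Table 5.2 can be reproduced from the tabulated quantities η_k alone. A Python check (values copied from the table):

```python
import math

# eta_k = |t(k)_{n,n-1}| / ||T(0)||_2, from Table 5.2
eta = [0.13865, 1.5401e-2, 1.2213e-4, 1.8268e-8, 8.9036e-16]

p = [1 + math.log(eta[k] / eta[k - 1]) / math.log(eta[k - 1])
     for k in range(1, len(eta))]
print([round(v, 4) for v in p])  # -> [2.1122, 2.1591, 1.9775, 1.9449]
```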

•

The coding of the QR iteration with single shift (5.52) is given in Program
36. The code utilizes Program 29 to reduce the matrix A to upper
Hessenberg form and Program 31 to perform the QR factorization step.
The input parameters toll and itmax are the tolerance in (5.54) and
the maximum admissible number of iterations, respectively. In output, the
program returns the (approximate) real Schur form of A and the number
of iterations needed for its computation.

Program 36 - qrshift : QR iteration with single shift

function [T,iter]=qrshift(A,toll,itmax)
n=max(size(A)); iter=0; [T,Q]=houshess(A);
for k=n:-1:2
  I=eye(k);
  while abs(T(k,k-1)) > toll*(abs(T(k,k))+abs(T(k-1,k-1)))
    iter=iter+1;
    if (iter > itmax)
      return
    end
    mu=T(k,k); [Q,R,c,s]=qrgivens(T(1:k,1:k)-mu*I);
    T(1:k,1:k)=R*Q+mu*I;
  end
  T(k,k-1)=0;
end


5.7.2 The QR Method with Double Shift

The single-shift QR iteration (5.52) with the choice (5.53) for µ is e¬ective

if the eigenvalues of A are real, but not necessarily when complex conjugate

eigenvalues are present, as happens in the following example.

Example 5.11 The matrix A ∈ R^(4×4) (reported below to five significant figures)

A =
  [  1.5726   -0.6392   -1.3143    3.7696 ]
  [ -1.2054    0.2166    0.4006   -0.0420 ]
  [ -0.1411    0.0226    0.3592    0.2045 ]