
3.8.1 Block LU Factorization
Let $A \in \mathbb{R}^{n\times n}$ be the following block partitioned matrix

$$
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
$$

where $A_{11} \in \mathbb{R}^{r\times r}$ is a nonsingular square matrix whose factorization
$L_{11} D_1 R_{11}$ is known, while $A_{22} \in \mathbb{R}^{(n-r)\times(n-r)}$. In such a case it is possible
to factorize A using only the LU factorization of the block $A_{11}$. Indeed, it
is true that

$$
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} L_{11} & 0 \\ L_{21} & I_{n-r} \end{bmatrix}
\begin{bmatrix} D_1 & 0 \\ 0 & \Delta_2 \end{bmatrix}
\begin{bmatrix} R_{11} & R_{12} \\ 0 & I_{n-r} \end{bmatrix},
$$

where

$$
L_{21} = A_{21} R_{11}^{-1} D_1^{-1}, \qquad
R_{12} = D_1^{-1} L_{11}^{-1} A_{12}, \qquad
\Delta_2 = A_{22} - L_{21} D_1 R_{12}.
$$

If necessary, the reduction procedure can be repeated on the matrix $\Delta_2$,
thus obtaining a block version of the LU factorization.
If $A_{11}$ were a scalar, the above approach would reduce the size of the
factorization of a given matrix by one. Applying this method iteratively yields
an alternative way of performing Gaussian elimination.
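To make the construction concrete, the following sketch (a Python/NumPy illustration, not taken from the book; the helper `ldu_unpivoted`, the random test matrix and the use of explicit inverses for readability are all assumptions of the example) builds $L_{21}$, $R_{12}$ and $\Delta_2$ from an unpivoted LDU factorization of $A_{11}$ and checks the block identity numerically.

```python
import numpy as np

# Minimal sketch (not from the book) of the 2x2 block factorization above.
def ldu_unpivoted(M):
    """Return unit lower L, diagonal D, unit upper R with M = L D R (no pivoting,
    so nonzero pivots are assumed)."""
    m = M.shape[0]
    L, U = np.eye(m), M.astype(float).copy()
    for k in range(m - 1):
        for i in range(k + 1, m):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    D = np.diag(np.diag(U))
    R = np.linalg.solve(D, U)            # unit upper triangular factor
    return L, D, R

rng = np.random.default_rng(0)
n, r = 6, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)      # test matrix, assumed well conditioned
A11, A12, A21, A22 = A[:r, :r], A[:r, r:], A[r:, :r], A[r:, r:]

L11, D1, R11 = ldu_unpivoted(A11)
L21 = A21 @ np.linalg.inv(R11) @ np.linalg.inv(D1)   # L21 = A21 R11^{-1} D1^{-1}
R12 = np.linalg.inv(D1) @ np.linalg.inv(L11) @ A12   # R12 = D1^{-1} L11^{-1} A12
Delta2 = A22 - L21 @ D1 @ R12                        # block "Schur complement"

L = np.block([[L11, np.zeros((r, n - r))], [L21, np.eye(n - r)]])
D = np.block([[D1,  np.zeros((r, n - r))], [np.zeros((n - r, r)), Delta2]])
R = np.block([[R11, R12], [np.zeros((n - r, r)), np.eye(n - r)]])
print(np.allclose(L @ D @ R, A))                     # expected: True
```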
We also notice that the proof of Theorem 3.4 can be extended to the
case of block matrices, obtaining the following result.

Theorem 3.7 Let $A \in \mathbb{R}^{n\times n}$ be partitioned in $m \times m$ blocks $A_{ij}$ with
$i, j = 1, \ldots, m$. A admits a unique LU block factorization (with L having
unit diagonal entries) iff the $m - 1$ dominant principal block minors of A
are nonzero.
Since the block factorization is an equivalent formulation of the standard
LU factorization of A, the stability analysis carried out for the latter holds
for its block-version as well. Improved results concerning the efficient use
in block algorithms of fast forms of matrix-matrix product are dealt with
in [Hig88]. In the forthcoming section we focus solely on block-tridiagonal
matrices.


3.8.2 Inverse of a Block-partitioned Matrix
The inverse of a block matrix can be constructed using the LU factorization
introduced in the previous section. A remarkable application is when A is
a block matrix of the form

A = C + UBV,

where C is a block matrix that is “easy” to invert (for instance, when C
is given by the diagonal blocks of A), while U, B and V take into account
the connections between the diagonal blocks. In such an event A can be
inverted by using the Sherman-Morrison or Woodbury formula

$$
A^{-1} = (C + UBV)^{-1}
       = C^{-1} - C^{-1} U \left( I + B V C^{-1} U \right)^{-1} B V C^{-1},
\qquad (3.57)
$$

having assumed that C and $I + BVC^{-1}U$ are two nonsingular matrices.
This formula has several practical and theoretical applications, and is par-
ticularly effective if the connections between blocks are of modest relevance.
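As a numerical illustration (a sketch in Python/NumPy, not from the book; the diagonal choice of C, the random factors U, B, V and the small dimension k are assumptions of the example), the Woodbury formula (3.57) can be checked directly; only a small $k \times k$ system has to be solved besides inverting the "easy" part C.

```python
import numpy as np

# Minimal sketch (not from the book) checking the Woodbury identity (3.57)
# on a random "easy-plus-low-rank" matrix A = C + U B V.
rng = np.random.default_rng(1)
n, k = 8, 2
C = np.diag(rng.uniform(1.0, 2.0, n))          # "easy to invert" part (here diagonal)
U = rng.standard_normal((n, k))
B = rng.standard_normal((k, k))
V = rng.standard_normal((k, n))

A = C + U @ B @ V                               # assumed nonsingular for this test
Cinv = np.diag(1.0 / np.diag(C))                # inverse of the easy part
S = np.eye(k) + B @ V @ Cinv @ U                # small k x k matrix, assumed nonsingular
Ainv = Cinv - Cinv @ U @ np.linalg.solve(S, B @ V @ Cinv)   # formula (3.57)

print(np.allclose(Ainv @ A, np.eye(n)))         # expected: True
```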



3.8.3 Block Tridiagonal Systems
Consider block tridiagonal systems of the form
® ®  ® 
0
A11 A12 x1 b1
   
.. . .
 A21   
. .
.
A22
  . .
= ,
An x =   (3.58)
 
.. ..
 ° . .
»° »
. .
. .
° An’1,n » . .
0 xn bn
An,n’1 Ann

where $A_{ij}$ are matrices of order $n_i \times n_j$ and $x_i$ and $b_i$ are column vectors of
size $n_i$, for $i, j = 1, \ldots, n$. We assume that the diagonal blocks are square,
although not necessarily of the same size. For $k = 1, \ldots, n$, set

$$
A_k =
\begin{bmatrix}
I_{n_1} & & & 0 \\
L_1 & I_{n_2} & & \\
 & \ddots & \ddots & \\
0 & & L_{k-1} & I_{n_k}
\end{bmatrix}
\begin{bmatrix}
U_1 & A_{12} & & 0 \\
 & U_2 & \ddots & \\
 & & \ddots & A_{k-1,k} \\
0 & & & U_k
\end{bmatrix}.
$$

Equating for $k = n$ the matrix above with the corresponding blocks of $A_n$,
it turns out that $U_1 = A_{11}$, while the remaining blocks can be obtained by
solving sequentially, for $i = 2, \ldots, n$, the systems $L_{i-1} U_{i-1} = A_{i,i-1}$ for
the columns of $L_{i-1}$ and computing $U_i = A_{ii} - L_{i-1} A_{i-1,i}$.
This procedure is well defined only if all the matrices $U_i$ are nonsingular,
which is the case if, for instance, the matrices $A_1, \ldots, A_n$ are nonsingular.
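The recursion just described, followed by block forward and back substitution, is sketched below (a Python/NumPy illustration, not the book's code; the function name, the storage of the blocks as Python lists and the small random test case are assumptions, and all the $U_i$ are assumed nonsingular).

```python
import numpy as np

# Sketch (not from the book) of the block tridiagonal factorization and solve.
# diag[i] = A_{i+1,i+1}, lower[i] = A_{i+2,i+1}, upper[i] = A_{i+1,i+2} (0-based lists).
def block_tridiag_solve(diag, lower, upper, b):
    n = len(diag)
    U = [None] * n              # U_i blocks
    L = [None] * (n - 1)        # L_i blocks of the unit block lower bidiagonal factor
    U[0] = diag[0]
    for i in range(1, n):
        # Solve L_{i-1} U_{i-1} = A_{i,i-1}, i.e. L_{i-1} = A_{i,i-1} U_{i-1}^{-1}
        L[i - 1] = np.linalg.solve(U[i - 1].T, lower[i - 1].T).T
        U[i] = diag[i] - L[i - 1] @ upper[i - 1]          # assumed nonsingular
    # Block forward substitution with the unit lower bidiagonal factor
    y = [None] * n
    y[0] = b[0]
    for i in range(1, n):
        y[i] = b[i] - L[i - 1] @ y[i - 1]
    # Block back substitution with the upper bidiagonal factor
    x = [None] * n
    x[n - 1] = np.linalg.solve(U[n - 1], y[n - 1])
    for i in range(n - 2, -1, -1):
        x[i] = np.linalg.solve(U[i], y[i] - upper[i] @ x[i + 1])
    return x

# Quick check against the dense solve on a small random test case
rng = np.random.default_rng(2)
diag  = [rng.standard_normal((2, 2)) + 4 * np.eye(2) for _ in range(3)]
lower = [rng.standard_normal((2, 2)) for _ in range(2)]
upper = [rng.standard_normal((2, 2)) for _ in range(2)]
b = [rng.standard_normal(2) for _ in range(3)]
A_full = np.zeros((6, 6))
for i in range(3):
    A_full[2*i:2*i+2, 2*i:2*i+2] = diag[i]
for i in range(2):
    A_full[2*i+2:2*i+4, 2*i:2*i+2] = lower[i]
    A_full[2*i:2*i+2, 2*i+2:2*i+4] = upper[i]
x = np.concatenate(block_tridiag_solve(diag, lower, upper, b))
print(np.allclose(A_full @ x, np.concatenate(b)))   # expected: True
```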
As an alternative, one could resort to factorization methods for banded
matrices, even if this requires the storage of a large number of zero entries
(unless a suitable reordering of the rows of the matrix is performed).
A remarkable instance is when the matrix is block tridiagonal and sym-
metric, with symmetric and positive definite blocks. In such a case (3.58)

takes the form
® ®  ® 
A11 AT 0 x1 b1
21
   
.. . .
 A21 A22   
. .
.
  . .
= .
   
.. ..
 ° . .
»° »
. .
AT
. .
° » . .
n,n’1
0 xn bn
An,n’1 Ann

Here we consider an extension to the block case of the Thomas algorithm,
which aims at transforming A into a block bidiagonal matrix. To this pur-
pose, we first have to eliminate the block corresponding to the matrix $A_{21}$.
Assume that the Cholesky factorization of $A_{11}$ is available and denote by
$H_{11}$ the Cholesky factor. If we multiply the first row of the block system
by $H_{11}^{-T}$, we find

$$
H_{11} x_1 + H_{11}^{-T} A_{21}^T x_2 = H_{11}^{-T} b_1.
$$

Letting $H_{21} = H_{11}^{-T} A_{21}^T$ and $c_1 = H_{11}^{-T} b_1$, it follows that $A_{21} = H_{21}^T H_{11}$ and
thus the first two rows of the system are

$$
\begin{aligned}
H_{11} x_1 + H_{21} x_2 &= c_1, \\
H_{21}^T H_{11} x_1 + A_{22} x_2 + A_{32}^T x_3 &= b_2.
\end{aligned}
$$

As a consequence, multiplying the first row by $H_{21}^T$ and subtracting it from
the second one, the unknown $x_1$ is eliminated and the following equivalent
equation is obtained

$$
A_{22}^{(1)} x_2 + A_{32}^T x_3 = b_2 - H_{21}^T c_1,
$$

with $A_{22}^{(1)} = A_{22} - H_{21}^T H_{21}$. At this point, the factorization of $A_{22}^{(1)}$ is carried
out and the unknown $x_2$ is eliminated from the third row of the system,
and the same is repeated for the remaining rows of the system. At the end
of the procedure, which requires solving $\sum_{j=1}^{n-1} n_j$ linear systems to
compute the matrices $H_{i+1,i}$, $i = 1, \ldots, n-1$, we end up with the following
block bidiagonal system
block bidiagonal system
® ® ® 
0
H11 H21 x1 c1
  .   . 
..
  .   . 
.
H22
  .   . 
  = 
..
 . Hn,n’1  ° . » ° . ». .
° » . .
0 xn cn
Hnn

which can be solved with a (block) back substitution method. If all blocks
have the same size p, then the number of multiplications required by the
algorithm is about $(7/6)(n-1)p^3$ flops (assuming both p and n very large).
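The elimination just described can be summarized as a short routine (a sketch in Python/SciPy, not the book's code; the function name and the list-of-blocks storage are assumptions, the diagonal blocks are assumed symmetric positive definite so that the Cholesky factorizations exist, and the upper triangular Cholesky factors $H_{ii}$ satisfy $A_{ii}^{(1)} = H_{ii}^T H_{ii}$).

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

# Sketch (not from the book) of the block Thomas elimination for a symmetric block
# tridiagonal matrix: diag[i] = A_{i+1,i+1} (s.p.d.), sub[i] = A_{i+2,i+1},
# so the super-diagonal blocks are the transposes A_{i+2,i+1}^T.
def block_thomas_spd(diag, sub, b):
    n = len(diag)
    H_diag, H_sup, c = [None] * n, [None] * (n - 1), [None] * n
    Ai = diag[0]                       # current modified diagonal block A_ii^(1)
    bi = b[0]                          # current modified right-hand side
    for i in range(n):
        Hii = cholesky(Ai)             # upper triangular, Ai = Hii^T Hii
        H_diag[i] = Hii
        c[i] = solve_triangular(Hii, bi, trans='T')        # c_i = Hii^{-T} b_i^(1)
        if i < n - 1:
            # H_{i+1,i} = Hii^{-T} A_{i+1,i}^T
            H_sup[i] = solve_triangular(Hii, sub[i].T, trans='T')
            Ai = diag[i + 1] - H_sup[i].T @ H_sup[i]       # A_{i+1,i+1}^(1)
            bi = b[i + 1] - H_sup[i].T @ c[i]
    # Back substitution on the bidiagonal system H_ii x_i + H_{i+1,i} x_{i+1} = c_i
    x = [None] * n
    x[n - 1] = solve_triangular(H_diag[n - 1], c[n - 1])
    for i in range(n - 2, -1, -1):
        x[i] = solve_triangular(H_diag[i], c[i] - H_sup[i] @ x[i + 1])
    return x
```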

3.9 Sparse Matrices
In this section we briefly address the numerical solution of sparse linear
systems, that is, systems where the matrix $A \in \mathbb{R}^{n\times n}$ has a number of
nonzero entries of the order of n (and not $n^2$). We call the pattern of a sparse
matrix the set of its nonzero coefficients.
Banded matrices with sufficiently small bands are sparse matrices. Ob-
viously, for a sparse matrix the matrix structure itself is redundant and it
can be more conveniently replaced by a vector-like structure by means
of matrix compacting techniques, like the banded matrix format discussed
in Section 3.7.
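As a concrete illustration of such a compact, vector-like structure (using the compressed sparse row format of the SciPy sparse module, which is a different compacting technique from the banded format of Section 3.7; the example matrix is an assumption), only the nonzero values are stored together with two integer index vectors.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustration (not from the book): compressed sparse row (CSR) storage keeps only
# the nonzero values plus integer index vectors, instead of the full n x n array.
A = np.array([[4., 0., 0., 1.],
              [0., 3., 0., 0.],
              [0., 2., 5., 0.],
              [1., 0., 0., 6.]])
S = csr_matrix(A)
print(S.data)      # nonzero values, row by row: [4. 1. 3. 2. 5. 1. 6.]
print(S.indices)   # column index of each stored value
print(S.indptr)    # row pointers: row i occupies data[indptr[i]:indptr[i+1]]
```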


FIGURE 3.3. Pattern of a symmetric sparse matrix (left) and of its associated
graph (right). For the sake of clarity, the loops have not been drawn; moreover,
since the matrix is symmetric, only one of the two sides associated with each
$a_{ij} \neq 0$ has been reported

For the sake of convenience, we associate with a sparse matrix A an oriented
graph G(A). A graph is a pair (V, X) where V is a set of p points and X
is a set of q ordered pairs of elements of V that are linked by a line. The
elements of V are called the vertices of the graph, while the connection lines
are called the paths of the graph.
The graph G(A) associated with a matrix $A \in \mathbb{R}^{m\times n}$ can be constructed
by identifying the vertices with the set of indices from 1 to the maximum
between m and n and supposing that a path exists which connects two
vertices i and j if $a_{ij} \neq 0$ and is directed from i to j, for $i = 1, \ldots, m$ and
$j = 1, \ldots, n$. For a diagonal entry $a_{ii} \neq 0$, the path joining the vertex i
with itself is called a loop. Since an orientation is associated with each side,
the graph is called oriented (or finite directed). As an example, Figure 3.3
displays the pattern of a symmetric and sparse $12 \times 12$ matrix, together
with its associated graph.
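The construction of G(A) from the pattern is straightforward; the following sketch (a Python illustration, not from the book; the function name and the small example matrix are assumptions) lists the directed paths $(i, j)$ with $a_{ij} \neq 0$, using the 1-based vertex labels of the text.

```python
import numpy as np

# Sketch (not from the book): build the oriented graph G(A) of a matrix as the
# set of directed edges (i, j) with a_ij != 0 (1-based vertex labels, as in the text).
def matrix_graph(A):
    m, n = A.shape
    vertices = set(range(1, max(m, n) + 1))
    edges = {(i + 1, j + 1) for i in range(m) for j in range(n) if A[i, j] != 0}
    return vertices, edges

A = np.array([[1., 2., 0.],
              [0., 3., 0.],
              [4., 0., 5.]])
V, E = matrix_graph(A)
print(sorted(E))   # [(1, 1), (1, 2), (2, 2), (3, 1), (3, 3)]; the (i, i) pairs are loops
```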
As previously noticed, during the factorization procedure, nonzero entries
can be generated in memory positions that correspond to zero entries in

the starting matrix. This action is referred to as fill-in. Figure 3.4 shows the
effect of fill-in on the sparse matrix whose pattern is shown in Figure 3.3.
Since the use of pivoting in the factorization process makes things even more
complicated, we shall only consider the case of symmetric positive definite
matrices, for which pivoting is not necessary.
A first remarkable result concerns the amount of fill-in. Let
$m_i(A) = i - \min\{\, j < i : a_{ij} \neq 0 \,\}$ and denote by E(A) the convex hull of A, given
by

$$
E(A) = \{(i, j) : 0 < i - j \leq m_i(A)\}. \qquad (3.59)
$$

For a symmetric positive definite matrix,

$$
E(A) = E(H + H^T), \qquad (3.60)
$$

where H is the Cholesky factor, so that fill-in is confined within the convex
hull of A (see Figure 3.4). Moreover, if we denote by $l_k(A)$ the number of
active rows at the k-th step of the factorization (i.e., the number of rows
of A with $i > k$ and $a_{ik} \neq 0$), the computational cost of the factorization
process is

$$
\frac{1}{2} \sum_{k=1}^{n} l_k(A)\,\bigl(l_k(A) + 3\bigr) \ \text{flops}, \qquad (3.61)
$$

having accounted for all the nonzero entries of the convex hull. Confinement
of fill-in within E(A) ensures that the LU factorization of A can be stored
without extra memory areas, simply by storing all the entries of E(A) (in-
cluding the null elements). However, such a procedure might still be highly
inefficient due to the large number of zero entries in the hull (see Exercise
11).
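The quantities entering (3.59) and (3.61) are easy to compute from the pattern; the sketch below (a Python illustration, not from the book) evaluates $m_i(A)$, the hull E(A) and the flop estimate. Note that the code uses 0-based indices, whereas the text is 1-based, and the convention $m_i = 0$ for rows with no nonzero entry left of the diagonal is an assumption of the example.

```python
import numpy as np

# Sketch (not from the book): m_i(A), the convex hull E(A) of (3.59) and the
# active-row counts l_k(A) entering the cost estimate (3.61), for a given pattern.
def envelope_and_cost(A):
    n = A.shape[0]
    # m_i(A) = i - min{ j < i : a_ij != 0 }  (0 if row i has no nonzero left of the diagonal)
    m = np.zeros(n, dtype=int)
    for i in range(n):
        cols = np.nonzero(A[i, :i])[0]
        m[i] = i - cols.min() if cols.size else 0
    hull = {(i, j) for i in range(n) for j in range(n) if 0 < i - j <= m[i]}
    # l_k(A): number of rows i > k with a_ik != 0 (active rows at step k)
    l = np.array([np.count_nonzero(A[k + 1:, k]) for k in range(n)])
    flops = 0.5 * np.sum(l * (l + 3))
    return m, hull, flops
```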
On the other hand, from (3.60) one gets that the reduction in the convex
hull reflects a reduction of fill-in, and in turn, due to (3.61), of the number
of operations needed to perform the factorization. For this reason several
strategies for reordering the graph of the matrix have been devised. Among
them, we recall the Cuthill-McKee method, which will be addressed in the
next section.
An alternative consists of decomposing the matrix into sparse subma-
trices, with the aim of reducing the original problem to the solution of
subproblems of reduced size, where matrices can be stored in full format.
This approach leads to submatrix decomposition methods which will be
addressed in Section 3.9.2.


3.9.1 The Cuthill-McKee Algorithm
The Cuthill-McKee algorithm is a simple and effective method for reorder-
ing the system variables.


FIGURE 3.4. The shaded regions in the left figure show the areas of the matrix
that can be affected by fill-in, for the matrix considered in Figure 3.3. Solid
lines denote the boundary of E(A). The right figure displays the factors that
have actually been computed. Black dots denote the elements of A that were
originally equal to zero

The first step of the algorithm consists of associating with each vertex of
the graph the number of its connections with neighboring vertices, called
the degree of the vertex. Next, the following steps are taken:

1. a vertex with a low number of connections is chosen as the ¬rst vertex
of the graph;

2. the vertices connected to it are progressively re-labeled starting from
those having lower degrees;

3. the procedure is repeated starting from the vertices connected to the
second vertex in the updated list. The nodes already re-labeled are
ignored. Then, a third new vertex is considered, and so on, until all
the vertices have been explored (a sketch of this procedure is given below).
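The steps above amount to a breadth-first visit in which neighbors are relabeled by increasing degree. A minimal sketch follows (a Python illustration, not the book's algorithm listing; the adjacency-dictionary input, the function name and the restart rule for disconnected graphs are assumptions of the example).

```python
from collections import deque

# Sketch (not from the book) of the Cuthill-McKee relabeling on an undirected
# graph given as an adjacency dictionary {vertex: set of neighbours}.
def cuthill_mckee(adj):
    degree = {v: len(nb) for v, nb in adj.items()}
    order, visited = [], set()
    # restart from an unvisited vertex of minimal degree (handles disconnected graphs)
    while len(order) < len(adj):
        start = min((v for v in adj if v not in visited), key=degree.get)
        queue = deque([start])
        visited.add(start)
        while queue:
            v = queue.popleft()
            order.append(v)
            # relabel the still unvisited neighbours by increasing degree
            for w in sorted(adj[v] - visited, key=degree.get):
                visited.add(w)
                queue.append(w)
    return order                # order[k] is the vertex receiving the new label k+1

adj = {1: {2, 4}, 2: {1, 3}, 3: {2}, 4: {1}}
print(cuthill_mckee(adj))       # new labeling of the four vertices
```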

The usual way to improve the efficiency of the algorithm is based on the
so-called reverse form of the Cuthill-McKee method. This consists of ex-
ecuting the Cuthill-McKee algorithm described above where, at the end,
the i-th vertex is moved into the (n - i + 1)-th position of the list, n being
the number of nodes in the graph. Figure 3.5 reports, for comparison, the
graphs obtained using the direct and reverse Cuthill-McKee reordering in
the case of the matrix pattern represented in Figure 3.3, while in Figure
3.6 the factors L and U are compared. Notice the absence of fill-in when
the reverse Cuthill-McKee method is used.
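In practice the reverse ordering is available as a library routine; the snippet below (an illustration, not from the book; the example matrix is an assumption) obtains it from SciPy and applies the resulting symmetric permutation.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

# Library route (not the book's code): SciPy implements the reverse Cuthill-McKee
# ordering; permuting rows and columns accordingly typically shrinks the envelope.
A = csr_matrix(np.array([[4., 0., 1., 0., 0.],
                         [0., 3., 0., 0., 1.],
                         [1., 0., 5., 0., 0.],
                         [0., 0., 0., 2., 1.],
                         [0., 1., 0., 1., 6.]]))
perm = reverse_cuthill_mckee(A, symmetric_mode=True)   # new vertex numbering
A_perm = A[perm, :][:, perm]                           # symmetrically permuted matrix
rows, cols = A_perm.nonzero()
print(perm, np.abs(rows - cols).max())                 # permutation and resulting bandwidth
```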

Remark 3.5 For an efficient solution of linear systems with sparse ma-
trices, we mention the public domain libraries SPARSKIT [Saa90], UMF-
PACK [DD95] and the MATLAB sparfun package.


[FIGURE 3.5. Graphs obtained with the direct and reverse Cuthill-McKee reorderings of the matrix pattern of Figure 3.3; each vertex label is followed, in parentheses, by its new number.]
