<< . .

. 28
( : 59)

. . >>


A nice insight into the properties of this local estimator was given by
Rodr´±guez ([64], see also [35]), who compared it with the corresponding
residual estimator (4.9). Namely, neglecting in the residual estimator just
the residual part, i.e., setting
1 2
hE [νE · ∇uh ]E
˜2 and · 2 := ˜2
·K := ˜ ·K ,

then the following result is true:
Theorem 4.3 There exist two constants c1 , c2 > 0 depending only on the
family of triangulations such that
c1 · ¤ · ¤ c2 · .
˜ ˜
The motivation for the method of gradient recovery is to be seen in the
fact that Rh uh possesses special convergence properties. Namely, under
certain assumptions the recovered gradient Rh uh converges asymptoti-
cally to ∇u faster than ∇uh does. In such a case Rh uh is said to be a
superconvergent approximation to ∇u.
If superconvergence holds, the simple decomposition
∇u ’ ∇uh = Rh uh ’ ∇uh + ∇u ’ Rh uh
demonstrates that the ¬rst di¬erence on the right-hand side represents
the asymptotically dominating, computable part of the gradient error
∇u ’ ∇uh . In other words, if we could de¬ne, for the class of problems
under consideration, a superconvergent gradient recovery Rh uh that is com-
putable with moderate expense, then the quantities · K and · de¬ned above
may serve as a tool for a posteriori error estimation.
194 4. Grid Generation and A Posteriori Error Estimation

Unfortunately, such superconvergence properties are valid only under
rather restrictive assumptions (especially with respect to the grid and to the
regularity of the weak solution). Thus it is di¬cult to obtain a full math-
ematical foundation in practice. Nevertheless, gradient recovery is often
applied and yields satisfactory results in many situations.
The following example, which is due to Repin [63], shows that a recovered
gradient does not have to re¬‚ect the real behaviour of the error.
Example 4.4 Consider the following boundary value problem for d = 1
and „¦ = (0, 1):
’u = f u(0) = u(1) ’ 1 = 0 .
in „¦ ,
If f is constant, the exact solution reads u(x) = x(2 + (1 ’ x)f )/2. Suppose
we have found the function vh = x as an approximate solution. For an
arbitrary partition of „¦ into subintervals, this function is piecewise linear
and it satis¬es the boundary conditions formulated above. Now let Rh be
an arbitrary gradient recovery operator that is able to reproduce at least
constants. Since vh = 1, we have vh ’ Rh vh = 0, whereas the real error is
vh ’ u = (x ’ 1 )f.
An interpretation of this e¬ect is that the function vh does not solve the
corresponding discrete (Galerkin) equations. But this property of uh is used
for the proof of superconvergence. This property also plays an important
role in the derivation of the residual error estimates, because the error
equation is used therein.

Dual-Weighted Residual Error Estimators
The aforementioned a posteriori error estimates have two disadvantages:
On the one hand, certain global constants, which are not known in general,
are part of the bounds. Typical examples are ±’1 in (4.6) and the constants
C1 , C2 in the equivalence relation (4.10). On the other hand, we obtained

scaling factors like hK and hE simply by using a particular approximation
In the following, we will outline a method that attempts to circumvent
these drawbacks. It is especially appropriate for the estimation of errors of
functionals depending linearly on the solution.
So let J : V ’ R denote a linear, continuous functional. We are interested
in an estimate of |J(u) ’ J(uh )|. Therefore, the following auxiliary dual
problem is considered:
Find w ∈ V such that a(v, w) = J(v) for all v ∈ V.
Taking v = u ’ uh , we get immediately
J(u) ’ J(uh ) = J(u ’ uh ) = a(u ’ uh , w) .
If wh ∈ Vh is an arbitrary element, the error equation (2.39) yields
J(u) ’ J(uh ) = a(u ’ uh , w ’ wh ) .
4.2. A Posteriori Error Estimates and Grid Adaptation 195

Obviously, the right-hand side is of the same structure as in the derivation
of the estimate (4.8). Consequently, by using the same arguments it follows

|J(u) ’ J(uh )| ¤ w ’ wh
rK (uh ) 0,K 0,K

[νE · ∇uh ]E w ’ wh
+ .

In contrast to the previous approaches, here the norms of w ’ wh will not
be theoretically analyzed but numerically approximated. This can be done
by an approximation of the dual solution w. There are several (more or less
heuristic) ways to do this.

(1) Estimation of the approximation error: Here, the norms of w ’ wh
are estimated as in the case of residual error estimators. Since the
result depends on the unknown H 1 -seminorm of w, which is equiva-
lent to the L2 -norm of ∇w, the ¬nite element solution wh ∈ Vh of the
auxiliary problem is used to approximate ∇w. It is a great disadvan-
tage of this approach that again global constants enter in the ¬nal
estimate through the estimation of the approximation error. Further-
more, the discrete auxiliary problem is of similar complexity to that
of the original discrete problem.

(2) Higher-order discretizations of the auxiliary problem: The auxiliary
problem is solved numerically by using a method that is more accu-
rate than the original method to determine a solution in Vh . Then
w is replaced by that solution and wh ∈ Vh by an interpolant of
that solution. Unfortunately, since the discrete auxiliary problem is
of comparatively large dimension, this approach is rather expensive.

(3) Approximation by means of higher-order recovery: This method
works similarly to the approach described in the previous subsection;
w is replaced by an element that is recovered from the ¬nite element
solution wh ∈ Vh of the auxiliary problem. The recovered element
approximates w with higher order in both norms than wh does. This
method exhibits two problems: On the one hand, the auxiliary prob-
lem has to be solved numerically, and on the other hand, ensuring
the corresponding superconvergence properties may be di¬cult.

At the end of this section we want to mention how the method could be
used to estimate certain norms of the error. In the case where the norms
are induced by particular scalar products, there is a simple, formal way.
For example, for the L2 -norm we have
u ’ uh , u ’ uh 0
u ’ uh = .
u ’ uh 0
196 4. Grid Generation and A Posteriori Error Estimation

Keeping u and uh ¬xed, we get with the de¬nition
v, u ’ uh 0
J(v) :=
u ’ uh 0
a linear, continuous functional J : H 1 („¦) ’ R such that J(u) ’ J(uh ) =
u ’ uh 0 .
The practical di¬culty of this approach consists in the fact that to be
able to ¬nd the solution w of the auxiliary problem we have to know the
values of J, but they depend on the unknown element u ’ uh . The idea
of approximating these values immediately implies two problems: There is
additional expense, and the in¬‚uence of the approximation quality on the
accuracy of the obtained bounds has to be analyzed.

4.4 Let „¦ ‚ R2 be a bounded domain with a polygonal, Lipschitz con-
tinuous boundary and V := H0 („¦). Now consider a V -elliptic, continuous
bilinear form a and a continuous linear form b. The problem
u∈V : a(u, v) = b(v) for all v ∈ V
is discretized using piecewise linear, continuous ¬nite elements. If Ei de-
notes the support of the nodal basis functions of Vh associated with the
vertex ai , show that the abstract local error indicators
a(e, v)
·i := sup
v∈H0 (Ei )

can be estimated by means of the solutions ei ∈ H0 (Ei ) of the local

boundary value problems
ei ∈ H0 (Ei ) : a(ei , v) = b(v) ’ a(uh , v) for all v ∈ H0 (Ei )
1 1

as follows (M and ± denote the constants appearing in the continuity and
ellipticity conditions on a):
± ei ¤ ·i ¤ M ei .
If necessary, the elements of H0 (Ei ) are extended by zero to the whole
domain „¦.

4.5 A linear polynomial on some triangle is uniquely de¬ned either by
its values at the vertices or by its values at the edge midpoints. For a
¬xed triangulation of a polygonally bounded, simply connected domain
„¦ ‚ R2 , there can be de¬ned two ¬nite element spaces by identifying
common degrees of freedom of adjacent triangles.
Exercises 197

(a) Show that the dimension of the space de¬ned by the degrees of free-
dom located at the vertices is less than the dimension of the other
space (provided that the triangulation consists of more than one
(b) How can one explain this “loss of degrees of freedom”?

4.6 Denote by Th a triangulation of the domain „¦ ‚ Rd . Show that for
a function v : „¦ ’ R that is continuously di¬erentiable on each element
the jump [νE · ∇v]E of the normal derivative of v across an element edge
E does not depend on the orientation of the normal νE .

4.7 Let a regular family of triangulations (Th ) of a domain „¦ ‚ R2 be
given. Show that there exist constants C > 0 that depend only on the
family (Th ) such that
|v|2 ¤ for all v ∈ L2 („¦) ,
Cv 0
|v|2 ¤ for all v ∈ L2 („¦) .
Cv 0

4.8 Let „¦ ‚ Rd be a bounded domain. Show that there are constants
C1 , C2 > 0 such that for all v ∈ H0 („¦),

C1 |v|1 ¤ v ¤ C2 |v|1 .
Iterative Methods
for Systems of Linear Equations

We consider again the system of linear equations
Ax = b (5.1)
with nonsingular matrix A ∈ Rm,m , right-hand side b ∈ Rm , and solution
x ∈ Rm . As shown in Chapters 2 and 3, such systems of equations arise
from ¬nite element discretizations of elliptic boundary value problems. The
matrix A is the sti¬ness matrix and thus sparse, as can be seen from (2.37).
A sparse matrix is vaguely a matrix with so many vanishing elements that
using this structure in the solution of (5.1) is advantageous. Taking advan-
tage of a band or hull structure was discussed in Section 2.5. More precisely,
if (5.1) represents a ¬nite element discretization, then it is not su¬cient to
know the properties of the solution method for a ¬xed m. It is on the con-
trary necessary to study a sequence of problems with growing dimension
m, as it appears by the re¬nement of a triangulation. In the strict sense we
understand by the notion sparse matrices a sequence of matrices in which
the number of nonzero elements per row is bounded independently of the
dimension. This is the case for the sti¬ness matrices due to (2.37) if the un-
derlying sequence of triangulations is regular in the sense of De¬nition 3.28,
for example. In ¬nite element discretizations of time-dependent problems
(Chapter 7) as well as in ¬nite volume discretizations (Chapter 6) systems
of equations of equal properties arise, so that the following considerations
can be also applied there.
The described matrix structure is best applied in iterative methods that
have the operation matrix — vector as an essential module, where either
the system matrix A or a matrix of similar structure derived from it is
5. Iterative Methods for Systems of Linear Equations 199

concerned. If the matrix is sparse in the strict sense, then O(m) elementary
operations are necessary. In particular, list-oriented storage schemes can be
of use, as pointed out in Section 2.5.
The e¬ort for the approximative solution of (5.1) by an iterative method
is determined by the number of elementary operations per iteration step
and the number of iterations k that are necessary in order to reach the
desired relative error level µ > 0, i.e., to meet the demand
x(k) ’ x ¤ µ x(0) ’ x . (5.2)

Here x(k) k is the sequence of iterates for the initial value x(0) , · a ¬xed
norm in Rm , and x = A’1 b the exact solution of (5.1).
For all methods to be discussed we will have linear convergence of the
x(k) ’ x ¤ x(0) ’ x
with a contraction number with 0 < < 1, which in general depends on
the dimension m. To satisfy (5.2), k iterations are thus su¬cient, with
1 1
k≥ ln ln . (5.4)
The computational e¬ort of a method obviously depends on the size of µ,
although this will be seen as ¬xed and only the dependence on the dimen-
sion m is considered: often µ will be omitted in the corresponding Landau™s
symbols. The methods di¬er therefore by their convergence behaviour, de-
scribed by the contraction number and especially by its dependence
on m (for speci¬c classes of matrices and boundary value problems). A
method is (asymptotically) optimal if the contraction numbers are bounded
independently of m:
(m) ¤ < 1. (5.5)
In this case the total e¬ort for a sparse matrix is O(m) elementary opera-
tions, as for a matrix — vector step. Of course, for a more exact comparison,
the corresponding constants, which also re¬‚ect the e¬ort of an iteration
step, have to be exactly estimated.
While direct methods solve the system of equations (5.1) with machine
precision, provided it is solvable in a stable manner, one can freely choose
the accuracy with iterative methods. If (5.1) is generated by the discretiza-
tion of a boundary value problem, it is recommended to solve it only with
that accuracy with which (5.1) approximates the boundary value prob-
lem. Asymptotic statements hereto have, among others, been developed in
(3.89), (7.129) and give an estimation of the approximation error by Ch± ,
with constants C, ± > 0, whereby h is the mesh size of the corresponding
triangulation. Since the constants in these estimates are usually unknown,
the error level can be adapted only asymptotically in m, in order to gain
200 5. Iterative Methods for Systems of Linear Equations

an algorithmic error of equal asymptotics compared to the error of ap-
proximation. Although this contradicts the above-described point of view
of a constant error level, it does not alter anything in the comparison of
the methods: The respective e¬ort always has to be multiplied by a fac-
tor O(ln m) if in d space dimensions m ∼ h’d is valid, and the relations
between the methods compared remain the same.
Furthermore, the choice of the error level µ will be in¬‚uenced by the
quality of the initial iterate. Generally, statements about the initial iterate
are only possible for special situations: For parabolic initial boundary value
problems (Chapter 7) and a one-step time discretization it is recommended
to use the approximation of the old time level as initial iterate. In the
case of a hierarchy of space discretizations, a nested iteration is possible
(Section 5.6), where the initial iterates will naturally result.

5.1 Linear Stationary Iterative Methods
5.1.1 General Theory
We begin with the study of the following class of a¬ne-linear iteration
¦(x) := M x + N b , (5.6)
with matrices M, N ∈ Rm,m to be speci¬ed later. By means of ¦ an iter-
ation sequence x(0) , x(1) , x(2) , . . . is de¬ned through a ¬xed-point iteration

x(k+1) := ¦ x(k) , k = 0, 1, . . . , (5.7)
from an initial approximation x(0) . Methods of this kind are called linear
stationary, because of their form (5.6) with a ¬xed iteration matrix M . The
function ¦ : Rm ’ Rm is continuous, so that in case of convergence of x(k)
for k ’ ∞, for the limit x we have
x = ¦(x) = M x + N b .
In order to achieve that the ¬xed-point iteration de¬ned by (5.6) is con-
sistent with Ax = b, i.e., each solution of (5.1) is also a ¬xed point, we
must require
A’1 b = M A’1 b + N b for arbitrary b ∈ Rm ,
i.e., A’1 = M A’1 + N , and thus
I = M + NA . (5.8)
On the other hand, if N is nonsingular, which will always be the case in
the following, then (5.8) also implies that a ¬xed point of (5.6) solves the
system of equations.
5.1. Linear Stationary Iterative Methods 201

Assuming the validity of (5.8), the ¬xed-point iteration for (5.6) can also
be written as
x(k+1) = x(k) ’ N Ax(k) ’ b , (5.9)
M x(k) + N b = (I ’ N A) x(k) + N b .
If N is nonsingular, we have additionally an equivalent form given by
W x(k+1) ’ x(k) = ’ Ax(k) ’ b (5.10)
with W := N ’1 . The correction x(k+1) ’x(k) for x(k) is given by the residual

g (k) := Ax(k) ’ b
through (5.9) or (5.10), possibly by solving a system of equations. In order
to compete with the direct method, the solution of (5.10) should require
one order in m fewer elementary operations. For dense matrices no more
operations than O(m2 ) should be necessary as are already necessary for
the calculation of g (k) . The same holds for sparse matrices, for example
band matrices. On the other side the method should converge, and that as
quickly as possible.
In the form (5.6) ¦ is Lipschitz continuous for a given norm · on
R with Lipschitz constant M , where · is a norm on Rm,m that is

consistent with the vector norm (see (A3.9)).
More precisely, for a consistent iteration the error
e(k) := x(k) ’ x ,
with x = A’1 b still denoting the exact solution, even satis¬es
e(k+1) = M e(k) ,
because (5.7) and (5.8) imply
e(k+1) = x(k+1) ’ x = M x(k) + N b ’ M x ’ N Ax = M e(k) . (5.11)
The spectral radius of M , that is, the maximum of the absolute values of
the (complex) eigenvalues of M , will be denoted by (M ).
The following general convergence theorem holds:
Theorem 5.1 A ¬xed-point iteration given by (5.6) to solve Ax = b is
globally and linearly convergent if
(M ) < 1 . (5.12)
· on Rm,m induced by a norm ·
This is satis¬ed if for a matrix norm
on Rm we have
M < 1. (5.13)
202 5. Iterative Methods for Systems of Linear Equations

If the consistency condition (5.8) holds and the matrix and vector norms
applied are consistent, then the convergence is monotone in the following
e(k+1) ¤ M e(k) . (5.14)

Proof: Assuming (5.12), then for µ = (1 ’ (M )) /2 > 0 there is a norm
· S on Rm such that the induced norm · S on Rm,m satis¬es
¤ (M ) + µ < 1

(see [16, p. 34]). The function ¦ is a contraction with respect to this special
norm on Rm . Therefore, Banach™s ¬xed-point theorem (Theorem 8.4) can
be applied on X = (Rm , · S ), which ensures the global convergence of
the sequence x(k) k to a ¬xed point x of ¦.
If (5.13) holds, ¦ is a contraction even with respect to the norm ·
on Rm , and M is the Lipschitz constant. Finally relation (5.14) follows
from (5.11).

In any case, we have convergence in any norm on Rm , since they are all
equivalent. Linear convergence for (5.12) holds only in the generally not
available norm · S with M S as contraction number.
As termination criterion for the concrete iteration methods to be
introduced, often
g (k) ¤ δ g (0) (5.15)
is used with a control parameter δ > 0, abbreviated as g (k) = 0. The
connection to the desired reduction of the relative error according to (5.2)
is given by

<< . .

. 28
( : 59)

. . >>