


FIGURE 7.3. Convergence history of the penalty method in Example 7.6



7.3.3 The Method of Lagrange Multipliers
A variant of the penalty method makes use, instead of Lα(x) in (7.53), of the augmented Lagrangian function Gα : Rn × Rm → R given by

Gα(x, λ) = f(x) + λᵀh(x) + (1/2) α ‖h(x)‖₂²,    (7.55)
where λ ∈ Rm is a Lagrange multiplier. Clearly, if x∗ is a solution to problem (7.50), then it will also be a solution to (7.55), but with the advantage, with respect to (7.53), of disposing of the further degree of freedom λ. The penalty method applied to (7.55) reads: for k = 0, 1, . . . , until convergence, solve the sequence of problems

minimize Gαk(x, λk) for x ∈ Ω,    (7.56)

where {λk} is a bounded sequence of unknown vectors in Rm, and the parameters αk are defined as above (notice that if λk were zero, then we would recover method (7.54)).
Property 7.16 also holds for method (7.56), provided that the multipliers are assumed to be bounded. Notice that the existence of the minimizer of (7.56) is not guaranteed, even in the case where f has a unique global minimizer (see Example 7.7). This circumstance can be overcome by adding further nonquadratic terms to the augmented Lagrangian function (e.g., of the form ‖h‖₂^p, with p large).

Example 7.7 Let us find the minimizer of f(x) = −x⁴ under the constraint x = 0. This problem clearly admits the solution x∗ = 0. If, instead, one considers the augmented Lagrangian function

Gαk(x, λk) = −x⁴ + λk x + (1/2) αk x²,

one finds that it no longer admits a minimum at x = 0, though vanishing there, for any αk different from zero; indeed, Gαk(x, λk) → −∞ as |x| → ∞, since the quartic term dominates the quadratic penalty, so that no global minimizer exists.

As far as the choice of the multipliers is concerned, the sequence of vectors λk is typically assigned by the following formula

λk+1 = λk + αk h(x(k)),

where λ0 is a given value, while the sequence of αk can be set a priori or modified at run-time.

As for the convergence properties of the method of Lagrange multipliers,
the following local result holds.

Property 7.17 Assume that x∗ is a regular strict local minimizer of (7.50) and that:

1. f, hi ∈ C²(B(x∗; R)) with i = 1, . . . , m and for a suitable R > 0;
2. the pair (x∗, λ∗) satisfies zᵀ HG₀(x∗, λ∗) z > 0 for any z ≠ 0 such that Jh(x∗)z = 0;
3. ∃ᾱ > 0 such that HGᾱ(x∗, λ∗) > 0.

Then, there exist three positive scalars δ, γ and M such that, for any pair (λ, α) ∈ V = {(λ, α) ∈ Rm+1 : ‖λ − λ∗‖₂ < δα, α ≥ ᾱ}, the problem

minimize Gα(x, λ), with x ∈ B(x∗; γ),

admits a unique solution x(λ, α), differentiable with respect to its arguments. Moreover, ∀(λ, α) ∈ V,

‖x(λ, α) − x∗‖₂ ≤ M ‖λ − λ∗‖₂.

Under further assumptions (see [Ber82], Proposition 2.7), it can be proved that the Lagrange multipliers method converges. Moreover, if αk → ∞ as k → ∞, then

lim_{k→∞} ‖λk+1 − λ∗‖₂ / ‖λk − λ∗‖₂ = 0,

and the convergence of the method is more than linear. In the case where the sequence αk has an upper bound, the method converges linearly.

Finally, we notice that, unlike in the penalty method, it is no longer necessary that the sequence of αk tend to infinity. This, in turn, limits the ill-conditioning that affects problem (7.56) as αk grows. Another advantage of the Lagrange multipliers technique concerns the convergence rate of the method, which turns out to be independent of the growth rate of the penalty parameter. This, of course, implies a considerable reduction of the computational cost.
The method of Lagrange multipliers is implemented in Program 64. Compared with Program 63, this program further requires as input the initial value lambda0 of the multiplier.

Program 64 - lagrmult : Method of Lagrange multipliers

function [x,vinc,nit]=lagrmult(x0,lambda0,alpha0,beta,f,h,toll)
x = x0; [r,c] = size(h); vinc = 0; lambda = lambda0;
% maximum violation of the r constraints at the current point
for i=1:r, vinc = max(vinc,eval(h(i,1:c))); end
% build the string for the squared norm of h
norm2h = ['(',h(1,1:c),')^2'];
for i=2:r, norm2h = [norm2h,'+(',h(i,1:c),')^2']; end
alpha = alpha0; options(1) = 0; options(2) = toll*0.1; nit = 0;
while vinc > toll
   % build the string for the term lambda'*h(x)
   lh = ['(',h(1,1:c),')*',num2str(lambda(1))];
   for i=2:r, lh = [lh,'+(',h(i,1:c),')*',num2str(lambda(i))]; end
   % augmented Lagrangian (7.55), minimized by the simplex method
   g = [f,'+0.5*',num2str(alpha,16),'*',norm2h,'+',lh];
   [x] = fmins(g,x,options);
   vinc = 0; nit = nit + 1;
   for i=1:r, vinc = max(vinc,eval(h(i,1:c))); end
   alpha = alpha*beta;
   for i=1:r, lambda(i) = lambda(i)+alpha*eval(h(i,1:c)); end
end
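Program 64 relies on fmins and on assembling the objective as a text string, which ties it to older MATLAB releases. A minimal modern sketch of the same loop, assuming f and h are supplied as function handles (f mapping Rn to R, h mapping Rn to a column vector in Rm) and using the built-in fminsearch, might read as follows; the name lagrmult2 and the handle-based interface are illustrative, not part of the book's programs.

% Sketch of the method of Lagrange multipliers with function handles.
% f: objective handle; h: constraint handle (returns a column vector).
function [x,vinc,nit] = lagrmult2(x0,lambda0,alpha0,beta,f,h,toll)
x = x0; lambda = lambda0; alpha = alpha0; nit = 0;
vinc = norm(h(x),inf);                % constraint violation
while vinc > toll
   % augmented Lagrangian (7.55) with the current alpha and lambda
   G = @(x) f(x) + lambda'*h(x) + 0.5*alpha*sum(h(x).^2);
   x = fminsearch(G,x);               % unconstrained minimization (7.56)
   vinc = norm(h(x),inf); nit = nit + 1;
   alpha = alpha*beta;                % increase the penalty parameter
   lambda = lambda + alpha*h(x);      % multiplier update
end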



Example 7.8 We use the method of Lagrange multipliers to solve the problem presented in Example 7.6, setting λ0 = 10 and leaving the remaining parameters unchanged. The method converges in 6 iterations and the crosses in Figure 7.4 show the iterates computed by Program 64. The constraint is here satisfied up to machine precision.




7.4 Applications
The two applications of this section are concerned with nonlinear systems
arising in the simulation of the electric potential in a semiconductor device
and in the triangulation of a two-dimensional polygon.


FIGURE 7.4. Convergence history for the method of Lagrange multipliers in Example 7.8


7.4.1 Solution of a Nonlinear System Arising from
Semiconductor Device Simulation
Let us consider the nonlinear system in the unknown u ∈ Rn

F(u) = Au + φ(u) − b = 0,    (7.57)

where A = (λ/h)² tridiagn(−1, 2, −1), with h = 1/(n+1), and φi(u) = 2K sinh(ui) for i = 1, . . . , n, where λ and K are two positive constants and b ∈ Rn is a given vector. Problem (7.57) arises in the numerical simulation of semiconductor devices in microelectronics, where u and b represent the electric potential and the doping profile, respectively.
In Figure 7.5 (left) we show schematically the particular device considered in the numerical example, a p-n junction diode of unit normalized length, subject to an external bias V = Vb − Va, together with the doping profile of the device, normalized to 1 (right). Notice that bi = b(xi) for i = 1, . . . , n, where xi = ih. The mathematical model of the problem at hand comprises a nonlinear Poisson equation for the electric potential and two continuity equations of advection-diffusion type, like those addressed in Chapter 12, for the current densities. For the complete derivation of the model and its analysis see, for instance, [Mar86] and [Jer96].
Solving system (7.57) corresponds to finding the minimizer in Rn of the function f : Rn → R defined as

f(u) = (1/2) uᵀAu + 2 Σ_{i=1}^{n} cosh(ui) − bᵀu,    (7.58)

whose gradient is ∇f(u) = Au + φ(u) − b = F(u).

FIGURE 7.5. Scheme of a semiconductor device (left); doping profile (right)

It can be checked (see Exercise 5) that for any u, v ∈ Rn with u ≠ v and for any λ ∈ (0, 1)

λf(u) + (1 − λ)f(v) − f(λu + (1 − λ)v) > (1/2) λ(1 − λ) ‖u − v‖A²,

where ‖·‖A denotes the energy norm introduced in (1.28). This implies that f(u) is a uniformly convex function in Rn, that is, it strictly satisfies (7.49) with ρ = 1/2.
Property 7.10 ensures, in turn, that the function in (7.58) admits a unique minimizer u∗ ∈ Rn, and it can be shown (see [OR70], Theorem 14.4.3, p. 503) that there exists a sequence {αk} such that the iterates of the damped Newton method introduced in Section 7.2.6 converge to u∗ (at least) superlinearly.

Thus, using the damped Newton method for solving system (7.57) leads to the following sequence of linearized problems: given u(0) ∈ Rn, ∀k ≥ 0 solve

(A + 2K diagn(cosh(ui(k)))) δu(k) = b − (Au(k) + φ(u(k))),    (7.59)

then set u(k+1) = u(k) + αk δu(k).
Let us now address two possible choices of the acceleration parameters αk. The first one has been proposed in [BR81] and is

αk = 1 / (1 + ρk ‖F(u(k))‖),    k = 0, 1, . . . ,    (7.60)

where ‖·‖ denotes a vector norm, for instance ‖·‖ = ‖·‖∞, and the coefficients ρk ≥ 0 are suitable acceleration parameters picked in such a way that the descent condition ‖F(u(k) + αk δu(k))‖∞ < ‖F(u(k))‖∞ is satisfied (see [BR81] for the implementation details of the algorithm).

We notice that, as ‖F(u(k))‖∞ → 0, (7.60) yields αk → 1, thus recovering the full (quadratic) convergence of Newton's method. Otherwise, as typically happens in the first iterations, ‖F(u(k))‖∞ ≫ 1 and αk is quite close to zero, with a strong reduction of the Newton variation (damping).
As an alternative to (7.60), the sequence {αk} can be generated using the simpler formula, suggested in [Sel84], Chapter 7,

αk = 2^{−i(i−1)/2},    k = 0, 1, . . . ,    (7.61)

where i is the first integer in the interval [1, Itmax] such that the descent condition above is satisfied, Itmax being the maximum admissible number of damping cycles for any Newton iteration (fixed equal to 10 in the numerical experiments).
As a comparison, both damped and standard Newton's methods have been implemented, the former with both choices (7.60) and (7.61) for the coefficients αk. In the case of standard Newton's method, we have set αk = 1 in (7.59) for any k ≥ 0.
The numerical examples have been performed with n = 49, bi = −1 for i ≤ n/2 and the remaining values bi equal to 1. Moreover, we have taken λ² = 1.67 · 10⁻⁴, K = 6.77 · 10⁻⁶ and fixed the first n/2 components of the initial vector u(0) equal to Va and the remaining ones equal to Vb, where Va = 0 and Vb = 10.
The tolerance on the maximum change between two successive iterates, which monitors the convergence of damped Newton's method (7.59), has been set equal to 10⁻⁴.
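For illustration, a minimal MATLAB sketch of iteration (7.59) combined with the damping strategy (7.61) on the data above might read as follows. It is a sketch under stated assumptions (among them, splitting the odd n at floor(n/2)), not the code actually used for the experiments.

% Damped Newton sketch for F(u) = A*u + phi(u) - b = 0, phi_i(u) = 2K*sinh(u_i)
n = 49; h = 1/(n+1); lambda2 = 1.67e-4; K = 6.77e-6;
A = (lambda2/h^2)*(2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1));
m = floor(n/2);                          % split point (n is odd: assumption)
b = [-ones(m,1); ones(n-m,1)];           % doping profile
Va = 0; Vb = 10;
u = [Va*ones(m,1); Vb*ones(n-m,1)];      % initial vector u^(0)
F = @(u) A*u + 2*K*sinh(u) - b;
toll = 1e-4; Itmax = 10; err = 1; nit = 0;
while err > toll
   J = A + 2*K*diag(cosh(u));            % Jacobian of F, see (7.59)
   du = -J\F(u);                         % Newton direction delta u^(k)
   for i = 1:Itmax                       % damping cycle, formula (7.61)
      alpha = 2^(-i*(i-1)/2);
      if norm(F(u + alpha*du),inf) < norm(F(u),inf), break, end
   end
   u = u + alpha*du; nit = nit + 1;
   err = norm(alpha*du,inf);             % max change between iterates
end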
FIGURE 7.6. Absolute error (left) and damping parameters αk (right). The error curve for standard Newton's method is denoted by (1), while (2) and (3) refer to damped Newton's method with the choices (7.61) and (7.60) for the coefficients αk, respectively

Figure 7.6 (left) shows the log-scale absolute error for the three algorithms as functions of the iteration number. Notice the rapid convergence of the damped Newton method (8 and 10 iterations in the case of (7.60) and (7.61), respectively), compared with the extremely slow convergence of the standard Newton method (192 iterations). Moreover, it is interesting to analyze in Figure 7.6 (right) the plot of the sequences of parameters αk as functions of the iteration number.
The starred and the circled curves refer to the choices (7.60) and (7.61) for the coefficients αk, respectively. As previously observed, the αk's start from very small values, then converge quickly to 1 as the damped Newton method (7.59) enters the attraction region of the minimizer u∗.



7.4.2 Nonlinear Regularization of a Discretization Grid
In this section we go back to the problem of regularizing a discretization grid, introduced in Section 3.14.2. There, we considered the technique of barycentric regularization, which leads to solving a linear system, typically of large size and featuring a sparse coefficient matrix.
Here we address two alternative techniques, denoted as regularization by edges and by areas. The main difference with respect to the method described in Section 3.14.2 lies in the fact that these new approaches lead to systems of nonlinear equations.

Using the notation of Section 3.14.2, for each pair of nodes xj, xk ∈ Zi, denote by ljk the edge on the boundary ∂Pi of Pi which connects them, and by xjk the midpoint of ljk, while for each triangle T ∈ Pi we denote by xb,T the centroid of T. Moreover, let ni = dim(Zi) and denote for any geometric entity (side or triangle) by |·| its measure in R1 or R2.
In the case of regularization by edges, we let

xi = ( Σ_{ljk∈∂Pi} xjk |ljk| ) / |∂Pi|,    ∀xi ∈ Nh,    (7.62)

while in the case of regularization by areas, we let

xi = ( Σ_{T∈Pi} xb,T |T| ) / |Pi|,    ∀xi ∈ Nh.    (7.63)


In both regularization procedures we assume that xi = xi^(∂D) if xi ∈ ∂D, that is, the nodes lying on the boundary of the domain D are fixed. Letting n = N − Nb be the number of internal nodes, relation (7.62) amounts to solving the following two systems of nonlinear equations for the coordinates {xi} and {yi} of the internal nodes, with i = 1, . . . , n:
xi − [ Σ_{ljk∈∂Pi} (1/2)(xj + xk)|ljk| ] / [ Σ_{ljk∈∂Pi} |ljk| ] = 0,

yi − [ Σ_{ljk∈∂Pi} (1/2)(yj + yk)|ljk| ] / [ Σ_{ljk∈∂Pi} |ljk| ] = 0.    (7.64)

Similarly, (7.63) leads to the following nonlinear systems, for i = 1, . . . , n:

xi − [ Σ_{T∈Pi} (1/3)(x1,T + x2,T + x3,T)|T| ] / [ Σ_{T∈Pi} |T| ] = 0,

yi − [ Σ_{T∈Pi} (1/3)(y1,T + y2,T + y3,T)|T| ] / [ Σ_{T∈Pi} |T| ] = 0,    (7.65)

where xs,T = (xs,T, ys,T), for s = 1, 2, 3, are the coordinates of the vertices of each triangle T ∈ Pi. Notice that the nonlinearity of systems (7.64) and (7.65) is due to the presence of the terms |ljk| and |T|.
Both systems (7.64) and (7.65) can be cast in the form (7.1), denoting, as usual, by fi the i-th nonlinear equation of the system, for i = 1, . . . , n. The complex functional dependence of fi on the unknowns makes it prohibitive to use Newton's method (7.4), which would require the explicit computation of the Jacobian matrix JF.
A convenient alternative is provided by the nonlinear Gauss-Seidel method (see [OR70], Chapter 7), which generalizes the corresponding method proposed in Chapter 4 for linear systems and can be formulated as follows. Denote by zi, for i = 1, . . . , n, either of the unknowns xi or yi. Given the initial vector z(0) = (z1(0), . . . , zn(0))ᵀ, for k = 0, 1, . . . until convergence solve, for i = 1, . . . , n,

fi(z1(k+1), . . . , zi−1(k+1), ξ, zi+1(k), . . . , zn(k)) = 0,    (7.66)

with respect to the scalar unknown ξ, then set zi(k+1) = ξ.
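For illustration, one possible MATLAB realization of the loop (7.66) is sketched below. It assumes the residuals fi are accessible through a (hypothetical) handle F(z,i) returning fi(z) for a column vector z, and solves each scalar equation with the built-in root finder fzero; the names nlgauss_seidel and F are illustrative.

function z = nlgauss_seidel(F, z0, toll, kmax)
% Nonlinear Gauss-Seidel sweeps of the form (7.66): at step i, solve
% f_i = 0 for the scalar xi, freezing the other components at their
% most recently updated values. z0 is assumed to be a column vector.
z = z0; n = length(z);
for k = 1:kmax
   zold = z;
   for i = 1:n
      g = @(xi) F([z(1:i-1); xi; z(i+1:n)], i);
      z(i) = fzero(g, z(i));   % previous value as the starting guess
   end
   if norm(z - zold, inf) < toll, return, end
end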
