so we are faced with a system of simultaneous partial differential equations.
(For a very slightly different view of the matter see Exercise E.2.)
Remark 3: We proved Theorem 13.2.4 under the assumption that the linear map
α : Rn → Rn given by

αt = Df(x0, y0)(0, t)

is invertible.
This condition gives unnecessary prominence to the last n coordinates.
To get round this, we introduce a new definition.
Definition 13.2.7. We say that a linear map β : Rn+m → Rn has full rank
if β is surjective.
The next, very easy, lemma gives some context.
346 A COMPANION TO ANALYSIS

Lemma 13.2.8. Suppose that a linear map β : Rn+m → Rn has matrix B
with respect to standard coordinates. (We use column vectors.) Then the
following conditions are equivalent.
(i) β has full rank.
(ii) B has n linearly independent columns.
(iii) By permuting the coordinate axes, we can ensure that the last n
columns of B are linearly independent.
(iv) By permuting the coordinate axes, we can ensure that the linear map
α : Rn → Rn given by
αt = β(0, t)
is invertible.
Proof. To see that (i) implies (ii), observe that
rank β = dim βRn+m = dim span{b1, b2, . . . , bn+m}
where bj is the jth column of B. If β has full rank then, since any spanning
set contains a basis, n of the bj must be linearly independent.
It is clear that (ii) implies (iii). To see that (iii) implies (iv), observe that,
if α has matrix A with respect to standard coordinates after our permutation,
then A is an n × n matrix whose columns are the last n columns of B and
so linearly independent. Thus A is invertible and so α is.
To see that (iv) implies (i), observe that
{b1, b2, . . . , bn+m} ⊇ {a1, a2, . . . , an}
where ak is the kth column of A, the matrix of α with respect to standard
coordinates. Thus
Rn ⊇ span{b1, b2, . . . , bn+m} ⊇ span{a1, a2, . . . , an} = Rn,
so βRn+m = Rn and β has full rank.
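Lemma 13.2.8 is easy to check by machine. The sketch below is my own illustration, not part of the text: the 2 × 5 matrix B is an arbitrary example, and `rank` is a small exact row reduction over the rationals. It confirms that β (here R5 → R2, so n = 2) has full rank precisely when n of its columns are linearly independent.

```python
from fractions import Fraction

def rank(rows):
    """Row rank (= column rank) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        # find a pivot in column `col` at or below row `r`
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# beta : R^5 -> R^2 (n = 2, m = 3); B is its 2 x 5 matrix (arbitrary example).
B = [[1, 2, 0, 1, 3],
     [2, 4, 0, 0, 1]]
n = len(B)                 # dimension of the codomain

print(rank(B))             # 2: full rank, so beta is surjective

# Condition (ii): columns 0 and 3, say, are n linearly independent columns.
cols = [[row[j] for row in B] for j in range(len(B[0]))]
print(rank([[cols[0][i], cols[3][i]] for i in range(n)]))   # 2
```

Permuting the coordinate axes so that these two columns come last then makes the map t ↦ β(0, t) invertible, as in condition (iv).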
The appropriate modification of Theorem 13.2.4 is now obvious.
Theorem 13.2.9. Consider a function f : Rm+n → Rn which is differen-
tiable on an open set U. Suppose further that Df is continuous at every point
of U, that w0 ∈ U and that Df(w0) has full rank. Then we can permute the
coordinate axes in such a way that, if we write w0 = (x0, y0) ∈ Rm × Rn,
we can find an open set B1 in Rm, with x0 ∈ B1, and an open set B2 in
Rn with f(w0) ∈ B2 such that, if z ∈ B2, there exists a differentiable map
gz : B1 → Rn with (x, gz(x)) ∈ U and
f(x, gz(x)) = z
for all x ∈ B1.
347
Please send corrections however trivial to twk@dpmms.cam.ac.uk
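Theorem 13.2.9 is an existence statement, but in concrete cases gz can actually be computed. Here is a minimal numerical sketch of mine, assuming the simplest situation m = n = 1 and the hypothetical choice f(x, y) = x² + y² near (x0, y0) = (0, 1): Newton's method in the second slot produces the promised gz.

```python
def f(x, y):
    # f : R^2 -> R; Df(x, y) = (2x, 2y) has full rank whenever (x, y) != (0, 0)
    return x * x + y * y

def g(z, x, y0=1.0, steps=25):
    """Newton's method in the second slot: solve f(x, y) = z for y near y0."""
    y = y0
    for _ in range(steps):
        y -= (f(x, y) - z) / (2 * y)   # df/dy = 2y
    return y

# Near (x0, y0) = (0, 1) we have f(x0, y0) = 1, and the theorem promises a
# differentiable g_z with f(x, g_z(x)) = z for z near 1 and x near 0:
for z in (0.9, 1.0, 1.1):
    for x in (-0.2, 0.0, 0.2):
        assert abs(f(x, g(z, x)) - z) < 1e-9

print(g(1.0, 0.6))   # close to 0.8, since 0.6^2 + 0.8^2 = 1
```

Of course the iteration only converges while we stay close to the base point; this mirrors the purely local nature of the theorem.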

Remark 4: In Appendix C, I emphasised the importance of a large stock of
examples, but I talked mainly about examples of badly behaved functions.
It is also true that mathematicians find it hard to lay their hands on a
sufficiently diverse collection of well behaved functions. In this book we
started with the polynomials. To these we added the function x ↦ 1/x
and got the rational functions x ↦ P(x)/Q(x) with P and Q polynomials.
We then added power series and solutions of differential equations. The
inverse function theorem gives us a new collection of interesting functions and
the implicit function theorem greatly extends this collection. The implicit
function theorem is particularly important in giving us interesting examples
of well behaved functions f : Rn → R when n ≥ 2.


13.3 Lagrange multipliers ♥
Suppose that t : R4 → R and f : R4 → R2 are well behaved functions and we
wish to maximise t(x) subject to the condition f(x) = h.
This will involve a fair amount of calculation, so it seems reasonable to
simplify our task as much as possible. One obvious simplification is to trans-
late so that h = (0, 0) and f(0, 0, 0, 0) = (0, 0). We now wish to investigate
whether x = (0, 0, 0, 0) maximises t(x) subject to the condition f(x) = (0, 0).
A more interesting simplification is to observe that, since we are assuming
good behaviour, Df(0) will have full rank. We can therefore find invertible
linear maps α : R4 → R4 and β : R2 → R2 such that βDf(0)α has matrix

J = ( 1 0 0 0 )
    ( 0 1 0 0 )

with respect to the standard basis (and using the column vector representa-
tion).
If we set
F(x) = β(f(αx)), T(x) = t(αx)
then our problem becomes that of investigating whether x = (0, 0, 0, 0) max-
imises T(x), subject to the condition F(x) = (0, 0). By the implicit function
theorem, we can find well behaved u, v : R2 → R such that u(0, 0) = v(0, 0) =
0 and (at least if z and w are small)
F(u(z, w), v(z, w), z, w) = (0, 0).
By the chain rule DF(0) has matrix J with respect to the standard basis
(using the column vector representation) and so
u,1(0, 0) = u,2(0, 0) = v,1(0, 0) = v,2(0, 0) = 0.

Before proceeding further the reader should do the next exercise.

Exercise 13.3.1. Explain why our conditions on F tell us that near (0, 0, 0, 0)

F(x, y, z, w) ≈ (x, y).

Suppose that, in fact, F(x, y, z, w) = (x, y) exactly. Find u and v. What
conditions on T ensure that x = (0, 0, 0, 0) maximises T(x) subject to the
condition F(x) = (0, 0)?

Since (at least locally) F(u(z, w), v(z, w), z, w) = (0, 0), our problem re-
duces to studying whether (z, w) = (0, 0) maximises τ(z, w) = T(u(z, w), v(z, w), z, w).
Since every maximum is a stationary point we first ask if (z, w) = (0, 0) is a
stationary point, that is, if

∂τ/∂z (0, 0) = ∂τ/∂w (0, 0) = 0.

By the chain rule,

∂τ/∂z (z, w) = T,1(u(z, w), v(z, w), z, w) ∂u/∂z (z, w)
             + T,2(u(z, w), v(z, w), z, w) ∂v/∂z (z, w)
             + T,3(u(z, w), v(z, w), z, w),

and so

∂τ/∂z (0, 0) = T,3(0, 0, 0, 0).

A similar calculation gives

∂τ/∂w (0, 0) = T,4(0, 0, 0, 0),

so (0, 0) is a stationary point of τ if and only if

T,3(0, 0, 0, 0) = T,4(0, 0, 0, 0) = 0.

We have found a necessary condition for (0, 0, 0, 0) to maximise T(x) subject
to F(x) = 0, and so for (0, 0, 0, 0) to maximise t(x) subject to f(x) = 0.
In order to make this condition usable, we must translate it back into the
terms of our original problem and this is most easily done by expressing it
in coordinate-free notation. Observe that if we write

l(x) = T(x) − T,1(0)F1(x) − T,2(0)F2(x)

then

l,1(0) = T,1(0) − T,1(0)F1,1(0) − T,2(0)F2,1(0) = T,1(0) − T,1(0) = 0,
l,2(0) = T,2(0) − T,1(0)F1,2(0) − T,2(0)F2,2(0) = T,2(0) − T,2(0) = 0,
l,3(0) = T,3(0) − T,1(0)F1,3(0) − T,2(0)F2,3(0) = T,3(0),
l,4(0) = T,4(0) − T,1(0)F1,4(0) − T,2(0)F2,4(0) = T,4(0),

so, if (0, 0, 0, 0) is a maximum,

D(T − T,1(0)F1 − T,2(0)F2)(0) = 0.

Thus, if (0, 0, 0, 0) gives a maximum and θ : R2 → R is the linear map
given by θ(x, y) = T,1(0)x + T,2(0)y, we have

D(T − θF)(0) = 0.

Using the definitions of T and F, this gives

D(tα − θβfα)(0) = 0,

so by the chain rule

(Dt(0) − θβDf(0))α = 0.

Since α is invertible, this is equivalent to saying

Dt(0) − θβDf(0) = 0,

and, since β is invertible, this, in turn, is equivalent to saying that there
exists a linear map φ : R2 → R such that

Dt(0) − φDf(0) = 0,

so, using the chain rule once again,

D(t − φf)(0) = 0.

Writing φ(x, y) = λ1x + λ2y, we see that x = (0, 0, 0, 0) maximises t(x),
subject to the condition f(x) = (0, 0), only if we can find λ1 and λ2 such
that t − λ1f1 − λ2f2 has (0, 0, 0, 0) as a stationary point.
The argument generalises easily to any number of dimensions.

Exercise 13.3.2. Prove Lemma 13.3.3.

Lemma 13.3.3. (Lagrangian necessity.) Consider functions f : Rm+n →
Rn and t : Rm+n → R which are differentiable on an open set U. Suppose
further that Df is continuous and of full rank at every point of U. If h ∈ Rn,
then, if z ∈ U is such that
(i) f(z) = h, and
(ii) t(z) ≥ t(x) for all x ∈ U such that f(x) = h,
then it follows that there exists a λ ∈ Rn such that the function t − λ · f :
Rm+n → R has z as a stationary point.
Exercise 13.3.4. Suppose that f : R2 → R and t : R2 → R are very well
behaved. Convince yourself that, in this case, Lemma 13.3.3 reads as follows.
A man takes the path f(x, y) = h over a hill of height t(x, y). He reaches the
highest point of the path at a point (x0, y0) where the path and the contour line
t(x, y) = t(x0, y0) have a common tangent. Generalise to higher dimensions.
Lemma 13.3.3 gives us a recipe for finding the maximum of t(x), subject to
the conditions x ∈ U and f(x) = h, when t : Rn+m → R and f : Rn+m → Rn
are well behaved functions and U is open.
(1) Set

L(λ, x) = t(x) − λ · f(x)

with λ ∈ Rn. (L is called the Lagrangian.)
(2) For each fixed λ, find the set E(λ) of stationary points of L(λ, ) lying
within U. (The previous sentence uses the notational convention described
in the paragraph labelled Abuse of Language on Page 421⁵.)
(3) Now vary λ and find all λ* and x* ∈ E(λ*) such that f(x*) = h. The
maximum, if it exists, will lie among these x*.
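The three steps can be followed to the letter on a toy problem (my example, not the book's): maximise t(x, y) = x + y subject to x² + y² = 1.

```python
import math

def t(x, y): return x + y              # objective
def f(x, y): return x * x + y * y      # constraint function, h = 1

# Step (1): L(lam, x, y) = t(x, y) - lam * f(x, y).
# Step (2): stationary points of L(lam, , ) for fixed lam:
#   1 - 2*lam*x = 0,  1 - 2*lam*y = 0   =>   E(lam) = {(1/(2*lam), 1/(2*lam))}.
def stationary(lam):
    return (1 / (2 * lam), 1 / (2 * lam))

# Step (3): vary lam so that the constraint f = 1 holds:
#   2 * (1/(2*lam))**2 = 1   =>   lam = +1/sqrt(2) or -1/sqrt(2).
candidates = [stationary(lam) for lam in (1 / math.sqrt(2), -1 / math.sqrt(2))]

# The maximum, if it exists, lies among these candidates:
best = max(candidates, key=lambda p: t(*p))
print(best)        # (0.707..., 0.707...)
print(t(*best))    # close to sqrt(2)
```

Note that step (3) produced two candidates and the recipe alone does not tell us which, if either, is the maximum; that is exactly the gap discussed below.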
Put like this, the whole procedure has the same unreal quality that in-
structions on how to ride a bicycle would have for someone who had only
seen a photograph of a bicycle. I suspect that the only way to learn how to
use Lagrange's method is to use it. I expect that most of my readers will
indeed be familiar with Lagrange's method and, in any case, this is a book on
the theory of analysis. However, I include one simple example of the method
in practice.
Example 13.3.5. Find the circular cylinder of greatest volume with given
surface area. (That is, find the optimum dimensions for a tin of standard
shape.)
⁵ This convention was thoroughly disliked by several of the readers of my manuscript
but, as one of them remarked, 'No really satisfactory notation exists'. The reader should
feel free to use her own notation. In particular she may prefer to use a placeholder and
write L(λ, ·).

Calculation. Let the height of the cylinder be h and the radius be r. We
wish to maximise the volume

V(r, h) = πr²h,

subject to keeping the surface area

A(r, h) = 2πr² + 2πrh

fixed, with A(r, h) = A [r, h > 0], say.
The instructions for step (1) tell us to form the Lagrangian

L(λ, r, h) = V(r, h) − λA(r, h) = πr²h − λ(2πr² + 2πrh).

The instructions for step (2) tell us to seek stationary values for L(λ, , )
when λ is fixed. We thus seek to solve

∂L/∂r (λ, r, h) = 0,  ∂L/∂h (λ, r, h) = 0,

that is

2πrh − λ(4πr + 2πh) = 0,  πr² − 2λπr = 0,

or, simplifying,

rh − λ(2r + h) = 0,  r − 2λ = 0.

Thus, if λ is fixed and positive, L(λ, , ) has a unique stationary point given
by r = rλ = 2λ, h = hλ = 4λ. (If λ ≤ 0, there are no stationary points
consistent with our restrictions r > 0, h > 0.)
We now proceed to step (3) which requires us to find λ such that

A = A(rλ, hλ),

that is

A = A(2λ, 4λ) = 24πλ².

This has a unique solution with λ > 0. We know, on general grounds (see
Exercise 13.3.6), that a maximum exists so, since Lagrange's method must
produce the maximum point, and produces only one point in this case, this
point must be the maximum.
The neatest way of writing our solution (and one that we could have
obtained directly by eliminating λ from the simplified stationarity equations
above) is that the cylinder of maximum volume with given surface area has
height twice its radius.
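The calculation is easily checked numerically. In the sketch below (my illustration; the value A0 = 10 for the prescribed surface area is arbitrary) the stationary point produced by Lagrange's method satisfies the constraint, has height twice its radius, and beats neighbouring points on the constraint surface.

```python
import math

def V(r, h): return math.pi * r**2 * h
def A(r, h): return 2 * math.pi * r**2 + 2 * math.pi * r * h

A0 = 10.0                                # the prescribed surface area (arbitrary)
lam = math.sqrt(A0 / (24 * math.pi))     # unique positive root of 24*pi*lam^2 = A0
r, h = 2 * lam, 4 * lam                  # the stationary point (r_lam, h_lam)

assert abs(A(r, h) - A0) < 1e-9          # the constraint holds
assert abs(h - 2 * r) < 1e-12            # height is twice the radius

# V at (r, h) beats nearby points lying on the same constraint surface:
for rr in (0.9 * r, 0.95 * r, 1.05 * r, 1.1 * r):
    hh = (A0 - 2 * math.pi * rr**2) / (2 * math.pi * rr)   # solve A(rr, hh) = A0
    assert V(rr, hh) < V(r, h)
```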

It must be admitted that the standard tin does not follow our prescription
very closely. (Presumably the cost of materials is relatively small compared
to other costs. We must also remember that near a maximum small changes
produce very small effects so the penalties for deviating are not high.)
Exercise 13.3.6. We work with the notation and hypotheses introduced in
the calculations in Example 13.3.5. Choose r0 and h0 such that r0, h0 > 0
and A(r0, h0) = A. Show that there exists an R > 1 such that, if r > R or
R⁻¹ > r > 0 and A(r, h) = A, then V(r, h) ≤ V(r0, h0)/2. Show that

K = {(r, h) : A(r, h) = A, R ≥ r ≥ R⁻¹, h > 0}

is a closed bounded set in R2. Deduce that V(r, h) attains a maximum on K
and conclude that V(r, h) attains a maximum on

{(r, h) : A(r, h) = A, r > 0, h > 0}.

Exercise 13.3.7. Suppose that f : R2 → R and t : R2 → R are given by
f(x, y) = y and t(x, y) = y² − x². We seek the maximum value of t(x, y)
subject to f(x, y) = a. Show that L(λ, ) = t − λf is stationary at (0, λ/2),
and that the only value of λ which gives f(0, λ/2) = a is λ = 2a. [We again
use the notation described on Page 421.]
(i) Show that (0, a) does indeed maximise t(x, y) subject to f(x, y) = a.
(ii) Show, however, that L(2a, ) = t − 2af does not have a maximum at
(0, a).
(iii) Draw a diagram in the manner of Exercise 13.3.4 to illustrate what
is happening.
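A few lines of computation illustrate the point of Exercise 13.3.7 (my sketch, with the arbitrary choice a = 1): (0, a) maximises t along the constraint, and yet L(2a, ) is unbounded above, so it certainly has no maximum at (0, a).

```python
# t(x, y) = y^2 - x^2, f(x, y) = y, constraint f = a, with a = 1 (arbitrary).
a = 1.0
def t(x, y): return y * y - x * x
def L(x, y): return t(x, y) - 2 * a * y      # L(2a, ) = t - 2a f

# (i) On the constraint y = a, t(x, a) = a^2 - x^2 is largest at x = 0 ...
assert all(t(x, a) <= t(0.0, a) for x in [k / 10 for k in range(-50, 51)])

# (ii) ... yet L(2a, ) has no maximum at (0, a): moving up the y-axis
# increases it without bound.
assert L(0.0, 10.0) > L(0.0, a)   # y^2 - 2ay grows like y^2
print(L(0.0, a), L(0.0, 10.0))
```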

Exercise 13.3.8. Suppose that f : R2 → R and t : R2 → R are given by
f(x, y) = x − αy² and t(x, y) = (x − 1)² + y². We seek maxima and minima
of t subject to f(x, y) = 0. Show that L(λ, ) = t − λf is stationary at
(1 + λ/2, 0), and that the only value of λ for which f(1 + λ/2, 0) = 0 is λ = −2.
(i) Show that, if α < 1/2, (0, 0) minimises t(x, y) subject to f(x, y) = 0.
Show, moreover, that L(−2, ) = t + 2f has a strict minimum at 0.
(ii) Show that, if α = 1/2, (0, 0) minimises t(x, y) subject to f(x, y) = 0.
Show that L(−2, ) has a minimum at 0 but this is not a strict minimum.
(iii) Show that, if α > 1/2, (0, 0) maximises t(x, y) subject to f(x, y) = 0.
Show, however, that L(−2, ) does not have a maximum at 0.
(iv) Draw a diagram in the manner of Exercise 13.3.4 to illustrate (as far
as you can, do not overdo it) what is happening.

The following remark is entirely trivial but extremely useful.

Lemma 13.3.9. (Lagrangian sufficiency.) Consider a set X and func-
tions f : X → Rn and t : X → R. Set

L(λ, x) = t(x) − λ · f(x)

for all λ ∈ Rn and x ∈ X.
Suppose h ∈ Rn, λ* ∈ Rn, and x* ∈ X are such that f(x*) = h and

L(λ*, x*) ≥ L(λ*, x)

for all x ∈ X. Then

t(x*) ≥ t(x)

for all x ∈ X such that f(x) = h.
Proof. The proof is shorter than the statement. Observe that, under the
given hypotheses,

t(x*) = t(x*) − λ* · f(x*) + λ* · h = L(λ*, x*) + λ* · h
      ≥ L(λ*, x) + λ* · h = t(x) − λ* · f(x) + λ* · h = t(x)

for all x ∈ X such that f(x) = h, as required.
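Since Lemma 13.3.9 makes no demands on X, it can be tested on, say, a finite set of integers. In this arbitrary example of mine, f(x) = x², t(x) = x, h = 4 and λ* = 1/4: the point x* = 2 maximises L(λ*, ) over all of X, and the lemma then guarantees that it maximises t over the fibre f(x) = h.

```python
# The lemma needs no smoothness: X here is just a finite set of integers.
X = range(-5, 6)
f = lambda x: x * x          # constraint function
t = lambda x: x              # objective
h, lam_star, x_star = 4, 0.25, 2

L = lambda lam, x: t(x) - lam * f(x)

# Hypotheses: f(x*) = h and L(lam*, x*) >= L(lam*, x) for every x in X.
assert f(x_star) == h
assert all(L(lam_star, x_star) >= L(lam_star, x) for x in X)

# Conclusion: x* maximises t over the fibre {x in X : f(x) = h} = {-2, 2}.
assert all(t(x_star) >= t(x) for x in X if f(x) == h)
```

The work, of course, is hidden in finding a λ* for which the hypothesis holds; the lemma itself is, as the text says, entirely trivial.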
It is worth noting that Lemma 13.3.9 (our Lagrangian sufficiency condition)
does not require f or t to be well behaved or make any demands on the un-
derlying space X. This is particularly useful since, if the reader notes where
Lagrange's method is used, she will find that it is often used in circumstances
much more general than those envisaged in Lemma 13.3.3 (our Lagrangian
necessity condition).
However, the very simple examples given in Exercises 13.3.7 and 13.3.8
show that there is a very big gap between our Lagrangian necessity and suf-
ficiency conditions even when we restrict ourselves to well behaved functions
on finite dimensional spaces. In the first three chapters of his elegant text
Optimization Under Constraints [48] Whittle considers how this gap can be
closed. It turns out that in certain very important practical cases (in partic-
ular those occurring in linear programming) the sufficient condition is also
necessary but that, from the point of view of the present book, these cases
have very special features.
In general, the Lagrangian method is very effective in suggesting the
correct answer but when that answer has been found it is often easier to
prove correctness by some other method (compare the treatment of geodesics
in Section 10.5 where we found Theorem 10.5.11 by one method but proved
it by another).

The alternative strategy, once the Lagrangian method has produced a
possible solution, is to claim that 'it is obvious from physical considerations
that this is indeed the correct solution'. This works well provided at least
one of the following conditions applies:-
(1) you are answering an examination question, or
(2) you have genuine physical insight, or
(3) you are reasonably lucky.
It is, of course, characteristic of bad luck that it strikes at the most inconve-
nient time.
Chapter 14

Completion

14.1 What is the correct question?
We cannot do interesting analysis on the rationals but we can on the larger
space of real numbers. On the whole, we cannot do interesting analysis on
metric spaces which are not complete. Given such an incomplete metric
space, can we find a larger complete metric space which contains it? Our
first version of this question might run as follows.
Question A: If (X, d) is a metric space, can we find a complete metric space
(Z, δ) such that Z ⊇ X and d(u, v) = δ(u, v) for all u, v ∈ X?
Most mathematicians prefer to ask this question in a slightly different
way.
Question A′: If (X, d) is a metric space, can we find a complete metric space
(Y, d̃) and a map θ : X → Y such that d̃(θu, θv) = d(u, v)?
Question A asks us to build a complete metric space containing (X, d).
Question A′ asks us to build a complete metric space containing a perfect
copy of (X, d). Since mathematicians cannot distinguish between the original
space (X, d) and a perfect copy, they consider the two questions to be equiv-
alent. However, both from a philosophical and a technical point of view, it
is easier to handle Question A′.
Exercise 14.1.1. Convince yourself either that Question A′ is equivalent to
Question A or that Question A′ is more appropriate than Question A.
For the moment, we stick to Question A. A little thought shows that it
does not have a unique answer.
Exercise 14.1.2. Consider the open interval X = (0, 1) on R with the usual
metric d.
(i) Show that (X, d) is not complete.


(ii) Show that, if we take Z = [0, 1] with the usual metric δ, then (Z, δ)
answers Question A.
(iii) Show that, if we take Y = R with the usual metric δ, then (Y, δ)
answers Question A.
