
$$f(a, b) = \exp\Bigl(-\sum_{i=1}^{k} (y_i - a t_i - b)^2\Bigr),$$

with $\sum_{i=1}^{k} t_i = 0$. Use the results above to find the values of $a$ and $b$ which
maximise $f$. (Of course, this result can be obtained without calculus but most
people do it this way.)
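As a numerical sanity check, the sketch below assumes that $f(a, b) = \exp(-\sum (y_i - a t_i - b)^2)$ with $\sum t_i = 0$, so that maximising $f$ is the same as minimising the sum of squares. The data `ts`, `ys` and the function names are invented for illustration; the closed form used is what setting both partial derivatives to zero gives under that assumption.

```python
def best_line(ts, ys):
    """Return (a, b) minimising sum((y - a*t - b)**2), assuming sum(ts) == 0."""
    assert abs(sum(ts)) < 1e-12, "this sketch assumes the t_i sum to zero"
    # With sum(ts) == 0 the two critical-point equations decouple.
    a = sum(t * y for t, y in zip(ts, ys)) / sum(t * t for t in ts)
    b = sum(ys) / len(ys)
    return a, b


def sum_of_squares(a, b, ts, ys):
    """The exponent of f with its sign removed; f is largest when this is smallest."""
    return sum((y - a * t - b) ** 2 for t, y in zip(ts, ys))


ts = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical data with sum(ts) == 0
ys = [0.1, 0.9, 2.2, 2.8, 4.1]
a, b = best_line(ts, ys)
```

Perturbing `a` or `b` slightly and re-evaluating `sum_of_squares` gives a quick check that the critical point found is a minimum of the sum of squares, and hence a maximum of $f$.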
Mathematicians with a good understanding of the topic they are investigating
can use insight as a substitute for rigorous verification, but intuition
may lead us astray.
Exercise 7.3.10. Four towns lie on the vertices of a square of side a. What
is the shortest total length of a system of roads joining all four towns? (The
answer is given in Exercise K.107, but try to find the answer first before
looking it up.)
The following are standard traps for the novice and occasional traps for
the experienced.
(1) Critical points need not be maxima or minima.
(2) Local maxima and minima need not be global maxima or minima.
(3) Maxima and minima may occur on the boundary and may then not
be critical points. [We may restate this more exactly as follows. Suppose
$f : E \to \mathbb{R}$. Unless $E$ is open, $f$ may take a maximum value at a point $e \in E$
such that we cannot find any $\delta > 0$ with $B(e, \delta) \subseteq E$. However well $f$ is
behaved, the argument of Lemma 7.3.2 will fail. For a specific instance see
Exercise 7.3.4.]
(4) A function need not have a maximum or minimum. [Consider
$f : U \to \mathbb{R}$ given by $f(x, y) = x$ where $U = B(0, 1)$ or $U = \mathbb{R}^2$.]
Exercise 7.3.11. Find the maxima and minima of the function $f : \mathbb{R}^2 \to \mathbb{R}$
given by

$$f(x, y) = y^2 - x^3 - ax$$

in the region $\{(x, y) : x^2 + y^2 \le 1\}$.
Your answer will depend on the constant a.
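
Trap (3) can be seen numerically in Exercise 7.3.11 for one illustrative value of the constant. The following sketch assumes $a = 1$ and uses a crude polar grid on the closed unit disc (the grid sizes and function names are our own choices). It is a search, not a proof, and it does not replace the case analysis in $a$ that the exercise asks for.

```python
import math


def f(x, y, a):
    return y * y - x ** 3 - a * x


def extremes_on_disc(a, n_r=200, n_theta=720):
    """Brute-force largest and smallest values of f over a polar grid on the closed disc."""
    best_max = (f(0.0, 0.0, a), 0.0, 0.0)
    best_min = best_max
    for i in range(1, n_r + 1):
        r = i / n_r                          # r = 1 (the boundary) is included
        for j in range(n_theta):
            theta = 2 * math.pi * j / n_theta
            x, y = r * math.cos(theta), r * math.sin(theta)
            value = (f(x, y, a), x, y)
            best_max = max(best_max, value)
            best_min = min(best_min, value)
    return best_max, best_min


a = 1.0                                      # illustrative choice of the constant
(max_val, mx, my), (min_val, nx, ny) = extremes_on_disc(a)
grad_at_max = (-3 * mx * mx - a, 2 * my)     # (f_x, f_y) at the reported maximum
```

With $a = 1$ the first partial derivative $-3x^2 - 1$ never vanishes, so there are no critical points at all: both extremes reported by the search lie on the boundary circle, and `grad_at_max` is far from zero.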

Figure 7.3: Light paths in an ellipse

Matters are further complicated by the fact that different kinds of problems
call for different kinds of solutions. The engineer seeks a global minimum
to the cost of a process. On the other hand, if we drop a handful of ball bearings
on the ground they will end up at local minima (lowest points) and most
people suspect that evolutionary, economic and social changes all involve local
maxima and minima. Finally, although we like to think of many physical
processes as minimising some function, it is often the case that they are really
stationarising (finding critical points for) that function. We like to say that
light takes a shortest path, but, if you consider a bulb $A$ at the centre of an
ellipse, light is reflected back to $A$ from $B$ and $B'$, the two closest points on
the ellipse, and from $C$ and $C'$, the two furthest points (see Figure 7.3).
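
The ellipse remark can be illustrated numerically. The sketch below assumes the ellipse is parametrised as $(p\cos t, q\sin t)$ with $p > q > 0$ (the names `p`, `q` and the grid size are our own) and looks for the parameter values at which the distance from the centre is stationary; these are the points at which a ray from the centre meets the curve at right angles and so returns along its own path.

```python
import math


def squared_distance(t, p, q):
    """Squared distance from the centre to the point (p cos t, q sin t)."""
    return (p * math.cos(t)) ** 2 + (q * math.sin(t)) ** 2


def stationary_points(p=2.0, q=1.0, n=3600):
    """Grid points that beat both neighbours: local extremes of the distance."""
    found = []
    step = 2 * math.pi / n
    for k in range(n):
        t = k * step
        before = squared_distance(t - step, p, q)
        here = squared_distance(t, p, q)
        after = squared_distance(t + step, p, q)
        if here > before and here > after:
            found.append((t, "furthest"))
        elif here < before and here < after:
            found.append((t, "closest"))
    return found


points = stationary_points()
```

Only two of the four stationary points found are minima of the path length; the other two are maxima, which is the sense in which light "stationarises" rather than minimises.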
We have said that, if $f : \mathbb{R}^2 \to \mathbb{R}$ has a Taylor expansion in the neighbourhood
of a point, then (ignoring the possibility that the Hessian is singular)
the contour map will look like that in Figures 7.1 or 7.2. But it is very
easy to imagine other contour maps and the reader may ask what happens
if the local contour map does not look like that in Figures 7.1 or 7.2. The
answer is that the appropriate Taylor expansion has failed and therefore the
hypotheses which ensure the appropriate Taylor expansion must themselves
have failed.
Exercise 7.3.12. Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is given by $f(0, 0) = 0$ and

$$f(r\cos\theta, r\sin\theta) = r g(\theta)$$

when $r > 0$, where $g : \mathbb{R} \to \mathbb{R}$ is periodic with period $2\pi$. [Informally, we
define $f$ using polar coordinates.] Show that, if $g(-\theta) = -g(\theta)$ for all $\theta$, then
$f$ has directional derivatives (see Definition 6.1.6) in all directions at $(0, 0)$.
If we choose $g(\theta) = \sin\theta$, we obtain a contour map like Figure 7.1, but,
if $g(\theta) = \sin 3\theta$, we obtain something very different.
Exercise 7.3.13. We continue with the notation of Exercise 7.3.12.
Please send corrections however trivial to twk@dpmms.cam.ac.uk

(i) If $g(\theta) = \sin\theta$, find $f(x, y)$ and sketch the contour lines
$f(x, y) = h, 2h, 3h, \dots$ with $h$ small.
(ii) If $g(\theta) = \sin 3\theta$, show that

$$f(x, y) = \frac{y(3x^2 - y^2)}{x^2 + y^2}$$

for $(x, y) \ne 0$. Sketch the contour lines $f(x, y) = h, 2h, 3h, \dots$ with $h$ small.
Example 7.3.14. If

$$f(x, y) = \frac{y(3x^2 - y^2)}{x^2 + y^2} \quad \text{for } (x, y) \ne (0, 0),$$
$$f(0, 0) = 0,$$

then $f$ is differentiable except at $(0, 0)$, is continuous everywhere, has directional
derivatives in all directions at $(0, 0)$ but is not differentiable at $(0, 0)$.
Proof. By standard results on differentiation (the chain rule, product rule
and so on), $f$ is differentiable (and so continuous) except, perhaps, at $(0, 0)$.
If $u^2 + v^2 = 1$ we have

$$\frac{f(uh, vh) - f(0, 0)}{h} \to v(3u^2 - v^2)$$

as $h \to 0$, so $f$ has directional derivatives in all directions at $(0, 0)$. Since

$$|f(x, y) - f(0, 0)| \le \frac{4(\max(|x|, |y|))^3}{(\max(|x|, |y|))^2} = 4\max(|x|, |y|) \to 0$$

as $(x^2 + y^2)^{1/2} \to 0$, $f$ is continuous at $(0, 0)$.
Suppose $f$ were differentiable at $(0, 0)$. Then

$$f(h, k) = f(0, 0) + Ah + Bk + \epsilon(h, k)(h^2 + k^2)^{1/2}$$

with $\epsilon(h, k) \to 0$ as $(h^2 + k^2)^{1/2} \to 0$, and $A = f_{,1}(0, 0)$, $B = f_{,2}(0, 0)$. The
calculations of the previous paragraph with $v = 0$ show that $f_{,1}(0, 0) = 0$
and the same calculations with $u = 0$ show that $f_{,2}(0, 0) = -1$. Thus

$$f(h, k) + k = \epsilon(h, k)(h^2 + k^2)^{1/2}$$

and so

$$\frac{f(h, k) + k}{(h^2 + k^2)^{1/2}} \to 0$$

as $(h^2 + k^2)^{1/2} \to 0$. Setting $k = h$ with $h > 0$, we get

$$2^{1/2} = \frac{h + h}{(h^2 + h^2)^{1/2}} = \frac{f(h, h) + h}{(h^2 + h^2)^{1/2}} \to 0$$

as $h \to 0$, which is absurd. Thus $f$ is not differentiable at $(0, 0)$.
(We give a stronger result in Exercise C.8 and a weaker but slightly easier
result in Exercise 7.3.16.)
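
The two computations in the proof of Example 7.3.14 are easy to check numerically. The sketch below (function names are ours) evaluates the difference quotient along a unit direction $(u, v)$ and the ratio that would have to tend to zero if $f$ were differentiable at the origin with $A = 0$ and $B = -1$.

```python
import math


def f(x, y):
    """The function of Example 7.3.14."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return y * (3 * x * x - y * y) / (x * x + y * y)


def directional_quotient(u, v, h):
    """(f(uh, vh) - f(0, 0)) / h for a unit vector (u, v)."""
    return (f(u * h, v * h) - f(0.0, 0.0)) / h


def linear_error_ratio(h, k):
    """(f(h, k) - A h - B k) / |(h, k)| with A = 0 and B = -1, as in the proof."""
    return (f(h, k) + k) / math.hypot(h, k)


u, v = 0.6, 0.8
quotient = directional_quotient(u, v, 1e-6)      # compare with v * (3 * u**2 - v**2)
ratio_on_diagonal = linear_error_ratio(1e-8, 1e-8)
```

The quotient agrees with $v(3u^2 - v^2)$, which is not linear in $(u, v)$, while the ratio along the diagonal stays at $2^{1/2}$ however small $h$ is taken.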
Exercise 7.3.15. Write down the details behind the first sentence of our
proof of Example 7.3.14. You will probably wish to quote Lemma 6.2.11 and
Exercise 6.2.17.
Exercise 7.3.16. If

$$f(x, y) = \frac{xy}{(x^2 + y^2)^{1/2}} \quad \text{for } (x, y) \ne (0, 0),$$
$$f(0, 0) = 0,$$

show that $f$ is differentiable except at $(0, 0)$, is continuous at $(0, 0)$ and has
partial derivatives $f_{,1}(0, 0)$ and $f_{,2}(0, 0)$ at $(0, 0)$ but has directional derivatives
in no other directions at $(0, 0)$. Discuss your results briefly using the
ideas of Exercise 7.3.12.
A further exercise on the ideas just used is given as Exercise K.108.
Emboldened by our success, we could well guess immediately a suitable
function to look for in the context of Theorem 7.2.6.
Exercise 7.3.17. Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is given by $f(0, 0) = 0$ and

$$f(r\cos\theta, r\sin\theta) = r^2 \sin 4\theta,$$

for $r > 0$. Show that

$$f(x, y) = \frac{4xy(x^2 - y^2)}{x^2 + y^2}$$

for $(x, y) \ne 0$. Sketch the contour lines $f(x, y) = h, 2^2 h, 3^2 h, \dots$ and
compare the result with Figure 7.2.
Exercise 7.3.18. Suppose that

$$f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2} \quad \text{for } (x, y) \ne (0, 0),$$
$$f(0, 0) = 0.$$

(i) Compute $f_{,1}(0, y)$, for $y \ne 0$, by using standard results of the calculus.
(ii) Compute $f_{,1}(0, 0)$ directly from the definition of the derivative.
(iii) Find $f_{,2}(x, 0)$ for all $x$.
(iv) Compute $f_{,12}(0, 0)$ and $f_{,21}(0, 0)$.
(v) Show that $f$ has first and second partial derivatives everywhere but
$f_{,12}(0, 0) \ne f_{,21}(0, 0)$.
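
Before attempting Exercise 7.3.18 by hand, it may help to see the asymmetry numerically. The sketch below uses central differences with hand-picked step sizes (`e` and `s` are our choices, and the estimates are only approximate). It deliberately does not say which of the two numbers is called $f_{,12}$ and which $f_{,21}$, since that depends on the ordering convention in use.

```python
def f(x, y):
    """The function of Exercise 7.3.18."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def d1(x, y, e=1e-7):
    """Central-difference estimate of the partial derivative in the first variable."""
    return (f(x + e, y) - f(x - e, y)) / (2 * e)


def d2(x, y, e=1e-7):
    """Central-difference estimate of the partial derivative in the second variable."""
    return (f(x, y + e) - f(x, y - e)) / (2 * e)


s = 1e-3
# Differentiate y -> f_{,1}(0, y) at y = 0, and x -> f_{,2}(x, 0) at x = 0.
slope_of_d1_along_y = (d1(0.0, s) - d1(0.0, -s)) / (2 * s)
slope_of_d2_along_x = (d2(s, 0.0) - d2(-s, 0.0)) / (2 * s)
```

The two estimates come out close to $-1$ and $+1$ respectively, so the two mixed second partial derivatives at the origin cannot be equal.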
It is profoundly unfortunate that Example 7.3.14 and Exercise 7.3.18 seem
to act on some examiners like catnip on a cat. Multi-dimensional calculus
leads towards differential geometry and infinite dimensional calculus (functional
analysis). Both subjects depend on understanding objects which we
know to be well behaved but which our limited geometric intuition makes it
hard for us to comprehend. Counterexamples, such as the ones just produced,
which depend on functions having some precise degree of differentiability are
simply irrelevant.
At the beginning of this section we used a first order local Taylor expansion
and results on linear maps to establish the behaviour of a well behaved
function $f$ near a point $x$ where $Df(x) \ne 0$. We then used a second order local
Taylor expansion and results on bilinear maps to establish the behaviour
of a well behaved function $f$ near a point $x$ where $Df(x) = 0$ on condition
that $D^2 f(x)$ was non-singular. Why should we stop here?
It is not the case that we can restrict ourselves to functions $f$ for which
$D^2 f(x)$ is non-singular at all points.
Exercise 7.3.19. (i) Let $A(t)$ be a $3 \times 3$ real symmetric matrix with
$A(t) = (a_{ij}(t))$. Suppose that the entries $a_{ij} : \mathbb{R} \to \mathbb{R}$ are continuous. Explain why
$\det A : \mathbb{R} \to \mathbb{R}$ is continuous. By using an expression for $\det A$ in terms
of the eigenvalues of $A$, show that, if $A(0)$ is positive definite and $A(1)$ is
negative definite, then there must exist a $c \in (0, 1)$ with $A(c)$ singular.
(ii) Let $m$ be an odd positive integer, $U$ an open subset of $\mathbb{R}^m$ and
$\gamma : [0, 1] \to U$ a continuous map. Suppose that $f : U \to \mathbb{R}$ has continuous
second order partial derivatives on $U$, that $f$ attains a local minimum at $\gamma(0)$
and a local maximum at $\gamma(1)$. Show that there exists a $c \in [0, 1]$ such that
$D^2 f(\gamma(c))$ is singular.
There is nothing special about the choice of m odd in Exercise 7.3.19.
We do the case m = 2 in Exercise K.106 and ambitious readers may wish to
attack the general case themselves (however, it is probably only instructive if
you make the argument watertight). Exercise K.43 gives a slightly stronger
result when m = 1.
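
The sign-change idea behind Exercise 7.3.19 (i) can be tried out on a concrete path of matrices. In the sketch below the two matrices `POSITIVE` and `NEGATIVE`, the straight-line path between them and the use of bisection are all our own illustrative choices; the exercise itself asks for an argument via eigenvalues and the intermediate value theorem.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


POSITIVE = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]        # positive definite
NEGATIVE = [[-3.0, 0.0, -1.0], [0.0, -1.0, 0.0], [-1.0, 0.0, -2.0]]   # negative definite


def A(t):
    """Straight-line path of symmetric matrices from POSITIVE (t = 0) to NEGATIVE (t = 1)."""
    return [[(1 - t) * p + t * n for p, n in zip(prow, nrow)]
            for prow, nrow in zip(POSITIVE, NEGATIVE)]


def singular_parameter(lo=0.0, hi=1.0, steps=60):
    """Bisection on t -> det A(t), which is positive at lo and negative at hi."""
    assert det3(A(lo)) > 0 > det3(A(hi))
    for _ in range(steps):
        mid = (lo + hi) / 2
        if det3(A(mid)) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = singular_parameter()
```

The determinant is positive at $t = 0$ and negative at $t = 1$ because the dimension is odd, so the bisection is guaranteed a sign change to home in on.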
However, it is only when $Df(x)$ vanishes and $D^2 f(x)$ is singular at the
same point x that we have problems and we can readily convince ourselves
(note this is not the same as proving) that this is rather unusual.

Exercise 7.3.20. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = ax^3 + bx^2 + cx + d$
with $a, b, c, d$ real. Show that there is a $y$ with $f'(y) = f''(y) = 0$ if and
only if one of the following two conditions holds:- $a \ne 0$ and $b^2 = 3ac$, or
$a = b = c = 0$.
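
The condition in Exercise 7.3.20 can be tested exhaustively on a small range of integer coefficients. The sketch below reads the exercise as asking for a common zero of $f'$ and $f''$, uses exact rational arithmetic, and compares a direct search for such a zero with the stated condition (the function names and the coefficient range are our own choices).

```python
from fractions import Fraction
from itertools import product


def has_common_zero(a, b, c):
    """Is there a real y with f'(y) = f''(y) = 0 for f(x) = a x^3 + b x^2 + c x + d?"""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a != 0:
        y = -b / (3 * a)                 # the only zero of f''(y) = 6 a y + 2 b
        return 3 * a * y * y + 2 * b * y + c == 0
    if b != 0:
        return False                     # f'' is the non-zero constant 2 b
    return c == 0                        # f'' vanishes identically and f' is the constant c


def stated_condition(a, b, c):
    return (a != 0 and b * b == 3 * a * c) or (a == 0 and b == 0 and c == 0)


mismatches = [(a, b, c) for a, b, c in product(range(-4, 5), repeat=3)
              if has_common_zero(a, b, c) != stated_condition(a, b, c)]
```

An empty `mismatches` list is of course only evidence on this finite grid, but it shows how thin the exceptional set is among all coefficient triples.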

Faced with this kind of situation mathematicians tend to use the word
generic and say ‘in the generic case, the Hessian is non-singular at the critical
points’. This is a useful way of thinking but we must remember that:-
(1) If we leave the word generic undefined, any sentence containing the
word generic is, strictly speaking, meaningless.
(2) In any case, if we look at any particular function, it ceases to be
generic. (A generic function is one without any particular properties. Any
particular function that we look at has the particular property that we are
interested in it.)
(3) The generic case may be a lot worse than we expect. Most mathematicians
would agree that the generic function $f : \mathbb{R} \to \mathbb{R}$ is unbounded on every
interval $(a, b)$ with $a < b$, that the generic bounded function $f : \mathbb{R} \to \mathbb{R}$ is
discontinuous at every point and that the generic continuous function $f : \mathbb{R} \to \mathbb{R}$
is nowhere differentiable. We should have said something more precise like
‘the generic 3 times differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ has a non-singular
Hessian at its critical points’.
So far in this section we have looked at stationary points of f by studying
the local behaviour of the function. In this we have remained true to our
17th and 18th century predecessors. In a paper entitled On Hills and Dales,
Maxwell⁷ raises our eyes from the local and shows us the prospect of a global
theory.

Plausible statement 7.3.21. (Hill and dale theorem.) Suppose the
surface of the moon has a finite number $S$ of summits, $B$ of bottoms and
$P$ of passes (all heights being measured from the moon’s centre). Then

$$S + B - P = 2.$$

Plausible Proof. By digging out pits and piling up soil we may ensure that
all the bottoms are at the same height, that all the passes are at different
heights, but all higher than the bottoms, and that all the summits are at the
same height which is greater than the height of any pass. Now suppose that
it begins to rain and that the water level rises steadily (and that the level is
the same for each body of water). We write $L(h)$ for the number of lakes (a
lake is the largest body of water that a swimmer can cover without going on
to dry land), $I(h)$ for the number of islands (an island is the largest body of
dry land that a walker can cover without going into the water) and $P(h)$ for
the number of passes visible when the height of the water is $h$.

Figure 7.4: A pass vanishes under water

⁷ Maxwell notes that he was anticipated by Cayley.

When the rain has just begun and the height $h_0$, say, of the water is
higher than the bottoms, but lower than the lowest pass, we have

$$L(h_0) = B, \quad I(h_0) = 1, \quad P(h_0) = P. \tag{1}$$

(Observe that there is a single body of dry land that a walker can get to
without going into the water so $I(h_0) = 1$ even if the man in the street would
object to calling the surface of the moon with a few puddles an island.) Every
time the water rises just high enough to drown a pass, then either
(a) two arms of a lake join so an island appears, a pass vanishes and the
number of lakes remains the same, or
(b) two lakes come together so the number of lakes diminishes by one, a
pass vanishes and the number of islands remains the same.
We illustrate this in Figure 7.4. In either case, we see that

$$I(h) - L(h) + P(h) \text{ remains constant}$$

and so, by equation (1),

$$I(h) - L(h) + P(h) = I(h_0) - L(h_0) + P(h_0) = 1 - B + P. \tag{2}$$

When the water is at a height $h_1$, higher than the highest pass but lower
than the summits, we have

$$L(h_1) = 1, \quad I(h_1) = S, \quad P(h_1) = 0. \tag{3}$$

(Though the man in the street would now object to us calling something a
lake when it is obviously an ocean with $S$ isolated islands.) Using equations
(2) and (3), we now have

$$1 - B + P = I(h_1) - L(h_1) + P(h_1) = S - 1$$

and so $B + S - P = 2$.
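
The bookkeeping in the plausible proof can be replayed mechanically. The sketch below is only a replay of the counting, not a model of any actual landscape: it assumes, as the argument does, that the flood starts with $B$ lakes and one island and ends with a single lake, so exactly $B - 1$ of the drowned passes must be of type (b). The function name, the random ordering of events and the seed are our own.

```python
import random


def flood(bottoms, passes, seed=0):
    """Replay the counting argument and return the final number of islands (summits)."""
    merges = bottoms - 1                      # type (b) events needed to end with one lake
    assert passes >= merges >= 0
    events = ["merge"] * merges + ["island"] * (passes - merges)
    random.Random(seed).shuffle(events)       # the order of the events does not matter

    lakes, islands, visible = bottoms, 1, passes      # the situation at height h_0
    invariant = islands - lakes + visible
    for event in events:
        if event == "island":                 # case (a): two arms of one lake join
            islands += 1
        else:                                 # case (b): two lakes come together
            lakes -= 1
        visible -= 1                          # in both cases one pass is drowned
        assert islands - lakes + visible == invariant
    assert lakes == 1 and visible == 0        # the situation at height h_1
    return islands


summits = flood(bottoms=3, passes=5)
```

Whatever order the passes are drowned in, the returned number of summits satisfies $S + B - P = 2$.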

Figure 7.5: One- and two-holed doughnuts

Exercise 7.3.22. State and provide plausible arguments for plausible results
corresponding to Plausible Statement 7.3.21 when the moon is in the shape
of a one-holed doughnut, two-holed doughnut and an n-holed doughnut (see
Figure 7.5).
Notice that local information about the nature of a function at special
points provides global ‘topological’ information about the number of holes in
a doughnut.
If you know Euler’s theorem (memory jogger ‘V − E + F = 2’), can you connect
it with this discussion?

Exercise 7.3.23. The function $f : \mathbb{R}^2 \to \mathbb{R}$ is well behaved (say 3 times
differentiable). We have $f(x, y) = 0$ for $x^2 + y^2 = 1$ and $f(x, y) > 0$ for
$x^2 + y^2 < 1$. State and provide a plausible argument for a plausible result
concerning the number of maxima, minima and saddle points $(x, y)$ for $f$
with $x^2 + y^2 < 1$.

I find the plausible argument just used very convincing but it is not clear
how we would go about converting it into an argument from first principles
(in effect, from the fundamental axiom of analysis). Here are some of the
problems we must face.
(1) Do contour lines actually exist (that is, do the points $(x, y)$ with
$f(x, y) = h$ actually lie on nice curves)⁸? We shall answer this question
locally by the implicit function theorem (Theorem 13.2.4) and our discussion
of the solution of differential equations in Section 12.3 will shed some light
on the global problem.
(2) ‘The largest body of water that a swimmer can cover without going
on to dry land’ is a vivid but not a mathematical expression. In later work
this problem is resolved by giving a formal definition of a connected set.
(3) Implicit in our argument is the idea that a loop divides a sphere
into two parts. A result called the Jordan curve theorem gives the formal
statement of this idea but the proof turns out to be unexpectedly hard.

⁸ The reader will note that though we have used contour lines as a heuristic tool we have
not used them in proofs. Note that, in specific cases, we do not need a general theorem to
tell us that contour lines exist. For example, the contour lines of $f(x, y) = a^{-2} x^2 + b^{-2} y^2$
are given parametrically by $(x, y) = (a h^{1/2} \cos\theta, b h^{1/2} \sin\theta)$ for $h \ge 0$.

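
The explicit parametrisation quoted in the footnote on contour lines is easy to confirm numerically. The sketch below takes $f(x, y) = a^{-2} x^2 + b^{-2} y^2$ and checks that points of the form $(a h^{1/2} \cos\theta, b h^{1/2} \sin\theta)$ lie on the contour $f = h$; the particular values of `a`, `b` and `h` are arbitrary choices for illustration.

```python
import math


def f(x, y, a, b):
    return x * x / (a * a) + y * y / (b * b)


def contour_point(h, theta, a, b):
    """Point on the contour f = h given by the parametrisation, for h >= 0."""
    root = math.sqrt(h)
    return a * root * math.cos(theta), b * root * math.sin(theta)


a, b, h = 3.0, 2.0, 0.5     # illustrative values only
worst = 0.0
for k in range(100):
    x, y = contour_point(h, 2 * math.pi * k / 100, a, b)
    worst = max(worst, abs(f(x, y, a, b) - h))
```

Here `worst` records the largest deviation of $f$ from $h$ over the sampled points and is zero up to rounding error.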
Another, less important, problem is to show that the hypothesis that
there are only a ‘finite number $S$ of summits, $B$ of bottoms and $P$ of passes’
applies to an interesting variety of cases. It is certainly not the case that a
function $f : \mathbb{R} \to \mathbb{R}$ will always have only a finite number of maxima in a
closed bounded interval. In the same way, it is not true that a moon need
have only a finite number of summits.

Exercise 7.3.24. Reread Example 7.1.5. Define $f : \mathbb{R} \to \mathbb{R}$ by

$$f(x) = (\cos(1/x) - 1)\exp(-1/x^2) \quad \text{if } x \ne 0,$$
$$f(0) = 0.$$

Show that $f$ is infinitely differentiable everywhere and that $f$ has an infinite
number of distinct strict local maxima in the interval $[-1, 1]$.
(Exercise K.42 belongs to the same circle of ideas.)

The answer, once again, is to develop a suitable notion of genericity but
we shall not do so here.
Some will say that there is no need to answer these questions since
the plausible argument which establishes Plausible Statement 7.3.21 is in
some sense ‘obviously correct’. I would reply that the reason for attacking
these questions is their intrinsic interest. Plausible Statement 7.3.21 and the
accompanying discussion are the occasion for us to ask these questions, not
the reason for trying to answer them. I would add that we cannot claim to
understand Maxwell’s result fully unless we can see either how it generalises
to higher dimensions or why it does not.
Students often feel that multidimensional calculus is just a question of
generalising results from one dimension to many. Maxwell’s result shows that
the change from one to many dimensions introduces genuinely new phenomena,
whose existence cannot be guessed from a one dimensional perspective.
Chapter 8

The Riemann integral

8.1 Where is the problem?
Everybody knows what area is, but then everybody knows what honey tastes
like. But does honey taste the same to you as it does to me? Perhaps the
question is unanswerable but, for many practical purposes, it is sufficient
that we agree on what we call honey. In the same way, it is important that,
when two mathematicians talk about area, they should agree on the answers
to the following questions:-
(1) Which sets E actually have area?
(2) When a set E has area, what is that area?
One of the discoveries of 20th century mathematics is that decisions on (1)
and (2) are linked in rather subtle ways to the question:-
(3) What properties should area have?
As an indication of the ideas involved, consider the following desirable
properties for area.
(a) Every bounded set $E$ in $\mathbb{R}^2$ has an area $|E|$ with $|E| \ge 0$.
(b) Suppose that $E$ is a bounded set in $\mathbb{R}^2$. If $E$ is congruent to $F$ (that
is $E$ can be obtained from $F$ by translation and rotation), then $|E| = |F|$.
(c) Any square $E$ of side $a$ has area $|E| = a^2$.
(d) If $E_1, E_2, \dots$ are disjoint bounded sets in $\mathbb{R}^2$ whose union $F = \bigcup_{j=1}^{\infty} E_j$
is also bounded, then $|F| = \sum_{j=1}^{\infty} |E_j|$ (so ‘the whole is equal to the sum of
its parts’).

Exercise 8.1.1. Suppose that conditions (a) to (d) all hold.
(i) Let $A$ be a bounded set in $\mathbb{R}^2$ and $B \subseteq A$. By writing $A = B \cup (A \setminus B)$
and using condition (d) together with other conditions, show that $|A| \ge |B|$.
(ii) By using (i) and condition (c), show that, if $A$ is a non-empty bounded
open set in $\mathbb{R}^2$, then $|A| > 0$.


We now show that assuming all of conditions (a) to (d) leads to a contradiction.
We start with an easy remark.

Exercise 8.1.2. If $0 \le x, y < 1$, write $x \sim y$ whenever $x - y \in \mathbb{Q}$. Show
that if $x, y, z \in [0, 1)$ we have
(i) $x \sim x$,
(ii) $x \sim y$ implies $y \sim x$,
(iii) $x \sim y$ and $y \sim z$ together imply $x \sim z$.
(In other words, $\sim$ is an equivalence relation.)
Write

$$[x] = \{y \in [0, 1) : y \sim x\}.$$

(In other words, write $[x]$ for the equivalence class of $x$.) By quoting the
appropriate theorem or direct proof, show that
(iv) $\bigcup_{x \in [0, 1)} [x] = [0, 1)$,
