\[ f(a, b) = \exp\Bigl(-\sum_{i=1}^{k} (y_i - a t_i - b)^2\Bigr), \]
with $\sum_{i=1}^{k} t_i = 0$. Use the results above to find the values of $a$ and $b$ which
maximise $f$. (Of course, this result can be obtained without calculus but most

people do it this way.)
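A crude numerical check can be run alongside the calculus. The sketch below uses made-up data (the lists `t` and `y` are assumptions, chosen only so that the `t` values sum to zero) and a plain grid search; it is an illustration, not a substitute for the argument the exercise asks for.

```python
import math

# Hypothetical sample data (not from the text); the t values sum to zero.
t = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0.1, 0.9, 2.1, 2.9, 4.2]

def f(a, b):
    """f(a, b) = exp(-sum_i (y_i - a*t_i - b)^2)."""
    return math.exp(-sum((yi - a * ti - b) ** 2 for ti, yi in zip(t, y)))

def grid_argmax(lo=-5.0, hi=5.0, steps=201):
    """Return the grid point (a, b) in [lo, hi]^2 at which f is largest."""
    step = (hi - lo) / (steps - 1)
    best_value, best_a, best_b = -1.0, lo, lo
    for i in range(steps):
        a = lo + i * step
        for j in range(steps):
            b = lo + j * step
            value = f(a, b)
            if value > best_value:
                best_value, best_a, best_b = value, a, b
    return best_a, best_b

a_hat, b_hat = grid_argmax()
print(f"grid maximiser: a = {a_hat:.2f}, b = {b_hat:.2f}")
```

The grid answer is only accurate to the grid spacing (0.05 here), so it should be compared with, not used in place of, the values obtained from the critical point equations.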

Mathematicians with a good understanding of the topic they are investigating
can use insight as a substitute for rigorous verification, but intuition

may lead us astray.

Exercise 7.3.10. Four towns lie on the vertices of a square of side a. What

is the shortest total length of a system of roads joining all four towns? (The

answer is given in Exercise K.107, but try to find the answer first before

looking it up.)

The following are standard traps for the novice and occasional traps for

the experienced.

(1) Critical points need not be maxima or minima.

(2) Local maxima and minima need not be global maxima or minima.

(3) Maxima and minima may occur on the boundary and may then not

be critical points. [We may restate this more exactly as follows. Suppose

$f : E \to \mathbb{R}$. Unless $E$ is open, $f$ may take a maximum value at a point $e \in E$
such that we cannot find any $\delta > 0$ with $B(e, \delta) \subseteq E$. However well $f$ is
behaved, the argument of Lemma 7.3.2 will fail. For a specific instance see

Exercise 7.3.4.]

(4) A function need not have a maximum or minimum. [Consider $f : U \to \mathbb{R}$
given by $f(x, y) = x$ where $U = B(0, 1)$ or $U = \mathbb{R}^2$.]

Exercise 7.3.11. Find the maxima and minima of the function $f : \mathbb{R}^2 \to \mathbb{R}$
given by
\[ f(x, y) = y^2 - x^3 - ax \]
in the region $\{(x, y) : x^2 + y^2 \le 1\}$.

Your answer will depend on the constant a.
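Before doing the calculation it can help to see how the answer moves with $a$. The following sketch samples $f$ on a polar grid over the closed unit disc and reports the smallest and largest sampled values; the function names, the grid sizes and the particular values of $a$ tried are all assumptions made for illustration.

```python
import math

def f(x, y, a):
    return y * y - x ** 3 - a * x

def extremes_on_disc(a, radial_steps=100, angular_steps=360):
    """Sample f on a polar grid covering the closed unit disc.

    Returns ((min_value, x, y), (max_value, x, y)) over the sample points.
    """
    samples = []
    for i in range(radial_steps + 1):
        r = i / radial_steps
        for k in range(angular_steps):
            theta = 2.0 * math.pi * k / angular_steps
            x, y = r * math.cos(theta), r * math.sin(theta)
            samples.append((f(x, y, a), x, y))
    return min(samples), max(samples)

for a in (-4.0, -1.0, 0.0, 1.0, 4.0):
    lowest, highest = extremes_on_disc(a)
    print(f"a = {a:5.1f}: min {lowest[0]: .4f} at ({lowest[1]: .3f}, {lowest[2]: .3f}), "
          f"max {highest[0]: .4f} at ({highest[1]: .3f}, {highest[2]: .3f})")
```

A grid search can only suggest where the extrema lie and whether they sit on the boundary; deciding which critical points are genuine maxima or minima still needs the analysis of the traps listed above.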

160 A COMPANION TO ANALYSIS

Figure 7.3: Light paths in an ellipse

Matters are further complicated by the fact that different kinds of problems
call for different kinds of solutions. The engineer seeks a global minimum
to the cost of a process. On the other hand if we drop a handful of ball bearings
on the ground they will end up at local minima (lowest points) and most
people suspect that evolutionary, economic and social changes all involve local
maxima and minima. Finally, although we like to think of many physical
processes as minimising some function, it is often the case they are really
stationarising (finding critical points for) that function. We like to say that
light takes a shortest path, but, if you consider a bulb $A$ at the centre of an
ellipse, light is reflected back to $A$ from $B$ and $B'$, the two closest points on
the ellipse, and from $C$ and $C'$, the two furthest points (see Figure 7.3).
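The ellipse example can be checked numerically. A ray from the centre returns to the centre when it meets the ellipse at right angles, which happens exactly where the distance from the centre to the boundary is stationary. The sketch below assumes hypothetical semi-axes 2 and 1, parametrises the boundary as $(2\cos t, \sin t)$ and looks for grid points where the squared distance is a local minimum or maximum.

```python
import math

A_SEMI, B_SEMI = 2.0, 1.0  # hypothetical semi-axes, with A_SEMI > B_SEMI

def dist_sq(t):
    """Squared distance from the centre to the point (A cos t, B sin t)."""
    return (A_SEMI * math.cos(t)) ** 2 + (B_SEMI * math.sin(t)) ** 2

def classify_critical_points(steps=360):
    """Grid points at which dist_sq is smaller (or larger) than at both neighbours."""
    found = []
    for k in range(steps):
        t_prev = 2.0 * math.pi * (k - 1) / steps
        t_here = 2.0 * math.pi * k / steps
        t_next = 2.0 * math.pi * (k + 1) / steps
        before, here, after = dist_sq(t_prev), dist_sq(t_here), dist_sq(t_next)
        if here < before and here < after:
            found.append((round(math.degrees(t_here)), "closest"))
        elif here > before and here > after:
            found.append((round(math.degrees(t_here)), "furthest"))
    return found

print(classify_critical_points())
```

Two of the four stationary points are maxima of the path length rather than minima, which is the point of the remark about stationarising.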

We have said that, if $f : \mathbb{R}^2 \to \mathbb{R}$ has a Taylor expansion in the neighbourhood
of a point, then (ignoring the possibility that the Hessian is singular)

the contour map will look like that in Figures 7.1 or 7.2. But it is very

easy to imagine other contour maps and the reader may ask what happens

if the local contour map does not look like that in Figures 7.1 or 7.2. The

answer is that the appropriate Taylor expansion has failed and therefore the

hypotheses which ensure the appropriate Taylor expansion must themselves

have failed.

Exercise 7.3.12. Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is given by $f(0, 0) = 0$ and
\[ f(r\cos\theta, r\sin\theta) = r g(\theta) \]
when $r > 0$, where $g : \mathbb{R} \to \mathbb{R}$ is periodic with period $2\pi$. [Informally, we
define $f$ using polar coordinates.] Show that, if $g(-\theta) = -g(\theta)$ for all $\theta$, then
$f$ has directional derivatives (see Definition 6.1.6) in all directions at $(0, 0)$.

If we choose $g(\theta) = \sin\theta$, we obtain a contour map like Figure 7.1, but,
if $g(\theta) = \sin 3\theta$, we obtain something very different.

Exercise 7.3.13. We continue with the notation of Exercise 7.3.12.


Please send corrections however trivial to twk@dpmms.cam.ac.uk

(i) If $g(\theta) = \sin\theta$, find $f(x, y)$ and sketch the contour lines
$f(x, y) = h, 2h, 3h, \dots$ with $h$ small.
(ii) If $g(\theta) = \sin 3\theta$, show that
\[ f(x, y) = \frac{y(3x^2 - y^2)}{x^2 + y^2} \]
for $(x, y) \neq 0$. Sketch the contour lines $f(x, y) = h, 2h, 3h, \dots$ with $h$
small.
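A quick numerical comparison of the polar definition with the Cartesian formula in part (ii) is a useful guard against algebra slips. The sketch below evaluates both on a polar grid of sample points; the function names and the choice of radii are assumptions for illustration, and agreement at sample points is of course not a proof.

```python
import math

def f_polar(r, theta):
    """f defined through polar coordinates with g(theta) = sin(3 theta)."""
    return r * math.sin(3.0 * theta)

def f_cartesian(x, y):
    """The closed form from part (ii), valid away from the origin."""
    return y * (3.0 * x * x - y * y) / (x * x + y * y)

def max_discrepancy(radii=(0.01, 0.5, 1.0, 7.0), angular_steps=360):
    """Largest |f_polar - f_cartesian| over a polar grid of sample points."""
    worst = 0.0
    for r in radii:
        for k in range(angular_steps):
            theta = 2.0 * math.pi * k / angular_steps
            x, y = r * math.cos(theta), r * math.sin(theta)
            worst = max(worst, abs(f_polar(r, theta) - f_cartesian(x, y)))
    return worst

print(max_discrepancy())
```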

Example 7.3.14. If
\[ f(x, y) = \frac{y(3x^2 - y^2)}{x^2 + y^2} \quad\text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0, \]
then $f$ is differentiable except at $(0, 0)$, is continuous everywhere, has directional
derivatives in all directions at $(0, 0)$ but is not differentiable at $(0, 0)$.

Proof. By standard results on differentiation (the chain rule, product rule
and so on), $f$ is differentiable (and so continuous) except, perhaps, at $(0, 0)$.
If $u^2 + v^2 = 1$ we have
\[ \frac{f(uh, vh) - f(0, 0)}{h} \to v(3u^2 - v^2) \]
as $h \to 0$, so $f$ has directional derivatives in all directions at $(0, 0)$. Since
\[ |f(x, y) - f(0, 0)| \le \frac{4(\max(|x|, |y|))^3}{(\max(|x|, |y|))^2} = 4\max(|x|, |y|) \to 0 \]
as $(x^2 + y^2)^{1/2} \to 0$, $f$ is continuous at $(0, 0)$.

Suppose $f$ were differentiable at $(0, 0)$. Then
\[ f(h, k) = f(0, 0) + Ah + Bk + \epsilon(h, k)(h^2 + k^2)^{1/2} \]
with $\epsilon(h, k) \to 0$ as $(h^2 + k^2)^{1/2} \to 0$, and $A = f_{,1}(0, 0)$, $B = f_{,2}(0, 0)$. The
calculations of the previous paragraph with $v = 0$ show that $f_{,1}(0, 0) = 0$
and the same calculations with $u = 0$ show that $f_{,2}(0, 0) = -1$. Thus
\[ f(h, k) + k = \epsilon(h, k)(h^2 + k^2)^{1/2} \]
and
\[ \frac{f(h, k) + k}{(h^2 + k^2)^{1/2}} \to 0 \]


as $(h^2 + k^2)^{1/2} \to 0$. Setting $k = h$, we get
\[ 2^{1/2} = \frac{h + h}{(h^2 + h^2)^{1/2}} = \frac{f(h, h) + h}{(h^2 + h^2)^{1/2}} \to 0 \]
as $h \to 0$, which is absurd. Thus $f$ is not differentiable at $(0, 0)$.

(We give a stronger result in Exercise C.8 and a weaker but slightly easier

result in Exercise 7.3.16.)
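The two halves of the proof can be watched numerically. In the sketch below (function names are assumptions) `directional_quotient` is the difference quotient along a unit vector $(u, v)$, and `linearisation_error_ratio` is the quantity $(f(h, k) + k)/(h^2 + k^2)^{1/2}$ which would have to tend to zero if $f$ were differentiable at the origin.

```python
import math

def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return y * (3.0 * x * x - y * y) / (x * x + y * y)

def directional_quotient(u, v, h):
    """(f(uh, vh) - f(0, 0)) / h for a unit vector (u, v)."""
    return (f(u * h, v * h) - f(0.0, 0.0)) / h

def linearisation_error_ratio(h, k):
    """(f(h, k) - (A h + B k)) / |(h, k)| with A = 0 and B = -1, as in the proof."""
    return (f(h, k) + k) / math.hypot(h, k)

u, v = math.cos(0.7), math.sin(0.7)
for h in (1e-1, 1e-3, 1e-6):
    print(h, directional_quotient(u, v, h), linearisation_error_ratio(h, h))
print("expected directional derivative:", v * (3.0 * u * u - v * v))
```

The first column of output settles on $v(3u^2 - v^2)$, while along the diagonal $k = h > 0$ the error ratio stays at $2^{1/2}$ however small $h$ is.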

Exercise 7.3.15. Write down the details behind the first sentence of our

proof of Example 7.3.14. You will probably wish to quote Lemma 6.2.11 and

Exercise 6.2.17.

Exercise 7.3.16. If
\[ f(x, y) = \frac{xy}{(x^2 + y^2)^{1/2}} \quad\text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0, \]
show that $f$ is differentiable except at $(0, 0)$, is continuous at $(0, 0)$ and has
partial derivatives $f_{,1}(0, 0)$ and $f_{,2}(0, 0)$ at $(0, 0)$ but has directional derivatives
in no other directions at $(0, 0)$. Discuss your results briefly using the
ideas of Exercise 7.3.12.

A further exercise on the ideas just used is given as Exercise K.108.

Emboldened by our success, we could well guess immediately a suitable

function to look for in the context of Theorem 7.2.6.

Exercise 7.3.17. Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is given by $f(0, 0) = 0$ and
\[ f(r\cos\theta, r\sin\theta) = r^2 \sin 4\theta, \]
for $r > 0$. Show that
\[ f(x, y) = \frac{4xy(x^2 - y^2)}{x^2 + y^2} \]
for $(x, y) \neq 0$. Sketch the contour lines $f(x, y) = h, 2^2 h, 3^2 h, \dots$ and
compare the result with Figure 7.2.

Exercise 7.3.18. Suppose that
\[ f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2} \quad\text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0. \]


(i) Compute $f_{,1}(0, y)$, for $y \neq 0$, by using standard results of the calculus.
(ii) Compute $f_{,1}(0, 0)$ directly from the definition of the derivative.
(iii) Find $f_{,2}(x, 0)$ for all $x$.
(iv) Compute $f_{,12}(0, 0)$ and $f_{,21}(0, 0)$.
(v) Show that $f$ has first and second partial derivatives everywhere but
$f_{,12}(0, 0) \neq f_{,21}(0, 0)$.
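Finite differences give a rough preview of part (iv). The sketch below estimates the two mixed second partial derivatives at the origin by nested central differences; the step sizes are assumptions chosen by hand, and the code deliberately labels the results by the order of differentiation rather than by the $f_{,12}$ notation, since which is which depends on the convention in force.

```python
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def d_dx(x, y, h=1e-6):
    """Central difference estimate of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def d_dy(x, y, h=1e-6):
    """Central difference estimate of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def mixed_partials_at_origin(k=1e-3):
    """Return (d/dy of df/dx, d/dx of df/dy) at (0, 0), estimated numerically."""
    dy_of_dx = (d_dx(0.0, k) - d_dx(0.0, -k)) / (2.0 * k)
    dx_of_dy = (d_dy(k, 0.0) - d_dy(-k, 0.0)) / (2.0 * k)
    return dy_of_dx, dx_of_dy

print(mixed_partials_at_origin())
```

The two estimates come out close to $-1$ and $+1$ respectively. The inner step must be much smaller than the outer one here, because the mixed partials are not continuous at the origin.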

It is profoundly unfortunate that Example 7.3.14 and Exercise 7.3.18 seem

to act on some examiners like catnip on a cat. Multi-dimensional calculus

leads towards differential geometry and infinite dimensional calculus (functional
analysis). Both subjects depend on understanding objects which we

know to be well behaved but which our limited geometric intuition makes it

hard for us to comprehend. Counterexamples, such as the ones just produced,

which depend on functions having some precise degree of differentiability are

simply irrelevant.

At the beginning of this section we used a first order local Taylor expansion
and results on linear maps to establish the behaviour of a well behaved
function $f$ near a point $x$ where $Df(x) \neq 0$. We then used a second order local
Taylor expansion and results on bilinear maps to establish the behaviour
of a well behaved function $f$ near a point $x$ where $Df(x) = 0$ on condition
that $D^2 f(x)$ was non-singular. Why should we stop here?

It is not the case that we can restrict ourselves to functions $f$ for which
$D^2 f(x)$ is non-singular at all points.

Exercise 7.3.19. (i) Let $A(t)$ be a $3 \times 3$ real symmetric matrix with
$A(t) = (a_{ij}(t))$. Suppose that the entries $a_{ij} : \mathbb{R} \to \mathbb{R}$ are continuous. Explain why
$\det A : \mathbb{R} \to \mathbb{R}$ is continuous. By using an expression for $\det A$ in terms
of the eigenvalues of $A$, show that, if $A(0)$ is positive definite and $A(1)$ is
negative definite, then there must exist a $c \in (0, 1)$ with $A(c)$ singular.
(ii) Let $m$ be an odd positive integer, $U$ an open subset of $\mathbb{R}^m$ and
$\gamma : [0, 1] \to U$ a continuous map. Suppose that $f : U \to \mathbb{R}$ has continuous
second order partial derivatives on $U$, that $f$ attains a local minimum at $\gamma(0)$
and a local maximum at $\gamma(1)$. Show that there exists a $c \in [0, 1]$ such that
$D^2 f(\gamma(c))$ is singular.
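Part (i) can be illustrated with a concrete path of matrices. The sketch below takes two hypothetical symmetric matrices, `P` (positive definite) and `N` (negative definite), joins them by the straight line $A(t) = (1 - t)P + tN$, and uses bisection on the sign of $\det A(t)$ to locate a parameter at which the matrix is singular. The matrices and the straight-line path are assumptions chosen for illustration; the exercise concerns any continuous path.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

P = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]        # positive definite
N = [[-3.0, 0.0, -1.0], [0.0, -1.0, 0.0], [-1.0, 0.0, -3.0]]   # negative definite

def A(t):
    """Entrywise continuous path from P (t = 0) to N (t = 1)."""
    return [[(1.0 - t) * P[i][j] + t * N[i][j] for j in range(3)] for i in range(3)]

def singular_parameter(iterations=60):
    """Bisect on the sign of det A(t), using det A(0) > 0 > det A(1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if det3(A(mid)) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = singular_parameter()
print(c, det3(A(c)))
```

The sign change that drives the bisection is available because the dimension is odd: a negative definite $3 \times 3$ matrix has three negative eigenvalues and so a negative determinant.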

There is nothing special about the choice of m odd in Exercise 7.3.19.

We do the case m = 2 in Exercise K.106 and ambitious readers may wish to

attack the general case themselves (however, it is probably only instructive if

you make the argument watertight). Exercise K.43 gives a slightly stronger

result when m = 1.

However, it is only when $Df(x)$ vanishes and $D^2 f(x)$ is singular at the

same point x that we have problems and we can readily convince ourselves

(note this is not the same as proving) that this is rather unusual.


Exercise 7.3.20. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = ax^3 + bx^2 + cx + d$
with $a, b, c, d$ real. Show that there is a $y$ with $f'(y) = f''(y) = 0$ if and
only if one of the following two conditions holds: $a \neq 0$ and $b^2 = 3ac$, or
$a = b = c = 0$.
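The stated criterion can be tested by brute force on small integer coefficients. The sketch below reads the condition in the exercise as $f'(y) = f''(y) = 0$ (an assumption about the intended primes) and compares a direct case analysis, done in exact rational arithmetic, with the criterion on $a$, $b$, $c$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def has_common_zero(a, b, c):
    """True if some real y has f'(y) = f''(y) = 0 for f = a x^3 + b x^2 + c x + d."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a != 0:
        y = -b / (3 * a)                       # the only zero of f''(y) = 6 a y + 2 b
        return 3 * a * y * y + 2 * b * y + c == 0
    if b != 0:
        return False                           # f'' is the non-zero constant 2 b
    return c == 0                              # f'' vanishes identically; f' is the constant c

def stated_condition(a, b, c):
    return (a != 0 and b * b == 3 * a * c) or (a == b == c == 0)

mismatches = [abc for abc in product(range(-6, 7), repeat=3)
              if has_common_zero(*abc) != stated_condition(*abc)]
print(len(mismatches))
```

An empty list of mismatches is consistent with the criterion but proves nothing; it also shows how thin the exceptional set is, since only a small fraction of the triples tested satisfy it.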

Faced with this kind of situation mathematicians tend to use the word

generic and say 'in the generic case, the Hessian is non-singular at the critical
points'. This is a useful way of thinking but we must remember that:-

(1) If we leave the word generic undefined, any sentence containing the

word generic is, strictly speaking, meaningless.

(2) In any case, if we look at any particular function, it ceases to be

generic. (A generic function is one without any particular properties. Any

particular function that we look at has the particular property that we are

interested in it.)

(3) The generic case may be a lot worse than we expect. Most mathematicians
would agree that the generic function $f : \mathbb{R} \to \mathbb{R}$ is unbounded on every
interval $(a, b)$ with $a < b$, that the generic bounded function $f : \mathbb{R} \to \mathbb{R}$ is
discontinuous at every point and that the generic continuous function $f : \mathbb{R} \to \mathbb{R}$
is nowhere differentiable. We should have said something more precise like
'the generic 3 times differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ has a non-singular
Hessian at its critical points'.

So far in this section we have looked at stationary points of f by studying

the local behaviour of the function. In this we have remained true to our

17th and 18th century predecessors. In a paper entitled On Hills and Dales,

Maxwell[7] raises our eyes from the local and shows us the prospect of a global

theory.

Plausible statement 7.3.21. (Hill and dale theorem.) Suppose the
surface of the moon has a finite number $S$ of summits, $B$ of bottoms and
$P$ of passes (all heights being measured from the moon's centre). Then
$S + B - P = 2$.

Plausible Proof. By digging out pits and piling up soil we may ensure that

all the bottoms are at the same height, that all the passes are at different

heights, but all higher than the bottoms, and that all the summits are at the

same height which is greater than the height of any pass. Now suppose that

it begins to rain and that the water level rises steadily (and that the level is

the same for each body of water). We write $L(h)$ for the number of lakes (a
lake is the largest body of water that a swimmer can cover without going on
to dry land), $I(h)$ for the number of islands (an island is the largest body of
dry land that a walker can cover without going into the water) and $P(h)$ for
the number of passes visible when the height of the water is $h$.

[7] Maxwell notes that he was anticipated by Cayley.

Figure 7.4: A pass vanishes under water

When the rain has just begun and the height $h_0$, say, of the water is
higher than the bottoms, but lower than the lowest pass, we have
\[ L(h_0) = B, \quad I(h_0) = 1, \quad P(h_0) = P. \tag{1} \]

(Observe that there is a single body of dry land that a walker can get to

without going into the water so $I(h_0) = 1$ even if the man in the street would

object to calling the surface of the moon with a few puddles an island.) Every

time the water rises just high enough to drown a pass, then either

(a) two arms of a lake join so an island appears, a pass vanishes and the

number of lakes remains the same, or

(b) two lakes come together so the number of lakes diminishes by one, a

pass vanishes and the number of islands remains the same.

We illustrate this in Figure 7.4. In either case, we see that
\[ I(h) - L(h) + P(h) \text{ remains constant} \]
and so, by equation (1),
\[ I(h) - L(h) + P(h) = I(h_0) - L(h_0) + P(h_0) = 1 - B + P. \tag{2} \]

When the water is at a height $h_1$, higher than the highest pass but lower
than the summits, we have
\[ L(h_1) = 1, \quad I(h_1) = S, \quad P(h_1) = 0. \tag{3} \]
(Though the man in the street would now object to us calling something a
lake when it is obviously an ocean with $S$ isolated islands.) Using equations
(2) and (3), we now have
\[ 1 - B + P = I(h_1) - L(h_1) + P(h_1) = S - 1 \]
and so $B + S - P = 2$.
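The bookkeeping in the plausible proof can be replayed mechanically. The sketch below does not model any terrain: it simply starts from the counts at height $h_0$, applies the two kinds of pass-drowning event in a random order until the counts at height $h_1$ are reached, and checks that $I - L + P$ never changes. The function name, the event labels and the random ordering are assumptions made for illustration.

```python
import random

def flood(summits, bottoms, seed=0):
    """Replay the flooding argument and return the number of passes drowned.

    Starts from (lakes, islands) = (bottoms, 1) and applies, in random order,
    the bottoms - 1 lake mergers (case (b)) and summits - 1 island births
    (case (a)) needed to reach (lakes, islands) = (1, summits).
    """
    events = ["merge_lakes"] * (bottoms - 1) + ["new_island"] * (summits - 1)
    random.Random(seed).shuffle(events)
    lakes, islands, passes_left = bottoms, 1, len(events)
    invariant = islands - lakes + passes_left
    for event in events:
        if event == "merge_lakes":
            lakes -= 1          # case (b): two lakes come together
        else:
            islands += 1        # case (a): two arms of a lake join
        passes_left -= 1        # either way one pass goes under water
        assert islands - lakes + passes_left == invariant
    assert (lakes, islands, passes_left) == (1, summits, 0)
    return len(events)

for s, b in [(1, 1), (3, 2), (5, 7)]:
    p = flood(s, b)
    print(s, b, p, s + b - p)
```

Each pass drowned accounts for exactly one merger or one new island, which is why the last column always reads 2.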


Figure 7.5: One- and two-holed doughnuts

Exercise 7.3.22. State and provide plausible arguments for plausible results

corresponding to Plausible Statement 7.3.21 when the moon is in the shape

of a one-holed doughnut, two-holed doughnut and an n-holed doughnut (see

Figure 7.5).

Notice that local information about the nature of a function at special

points provides global 'topological' information about the number of holes in

a doughnut.

If you know Euler's theorem (memory jogger '$V - E + F = 2$'), can you connect
it with this discussion?

Exercise 7.3.23. The function $f : \mathbb{R}^2 \to \mathbb{R}$ is well behaved (say 3 times
differentiable). We have $f(x, y) = 0$ for $x^2 + y^2 = 1$ and $f(x, y) > 0$ for
$x^2 + y^2 < 1$. State and provide a plausible argument for a plausible result
concerning the number of maxima, minima and saddle points $(x, y)$ for $f$
with $x^2 + y^2 < 1$.

I find the plausible argument just used very convincing but it is not clear
how we would go about converting it into an argument from first principles
(in effect, from the fundamental axiom of analysis). Here are some of the

problems we must face.

(1) Do contour lines actually exist (that is do the points $(x, y)$ with
$f(x, y) = h$ actually lie on nice curves)[8]? We shall answer this question
locally by the implicit function theorem (Theorem 13.2.4) and our discussion
of the solution of differential equations in Section 12.3 will shed some light
on the global problem.

(2) 'The largest body of water that a swimmer can cover without going
on to dry land' is a vivid but not a mathematical expression. In later work
this problem is resolved by giving a formal definition of a connected set.
(3) Implicit in our argument is the idea that a loop divides a sphere
into two parts. A result called the Jordan curve theorem gives the formal
statement of this idea but the proof turns out to be unexpectedly hard.

[8] The reader will note that though we have used contour lines as a heuristic tool we have
not used them in proofs. Note that, in specific cases, we do not need a general theorem to
tell us that contour lines exist. For example, the contour lines of $f(x, y) = a^{-2}x^2 + b^{-2}y^2$
are given parametrically by $(x, y) = (ah^{1/2}\cos\theta, bh^{1/2}\sin\theta)$ for $h \ge 0$.
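The explicit parametrisation quoted in footnote 8 is easy to confirm numerically. The sketch below picks hypothetical values of $a$ and $b$, walks round each parametrised curve and records how far $f$ strays from the intended height $h$; the names and sample values are assumptions.

```python
import math

def f(x, y, a, b):
    """f(x, y) = x^2 / a^2 + y^2 / b^2."""
    return (x / a) ** 2 + (y / b) ** 2

def contour_point(h, theta, a, b):
    """The point (a h^(1/2) cos(theta), b h^(1/2) sin(theta))."""
    root_h = math.sqrt(h)
    return a * root_h * math.cos(theta), b * root_h * math.sin(theta)

def max_error(a=3.0, b=0.5, heights=(0.0, 0.25, 1.0, 4.0), steps=720):
    """Largest |f - h| found along the parametrised curves."""
    worst = 0.0
    for h in heights:
        for k in range(steps):
            theta = 2.0 * math.pi * k / steps
            x, y = contour_point(h, theta, a, b)
            worst = max(worst, abs(f(x, y, a, b) - h))
    return worst

print(max_error())
```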

Another, less important, problem is to show that the hypothesis that

there are only a 'finite number $S$ of summits, $B$ of bottoms and $P$ of passes'
applies to an interesting variety of cases. It is certainly not the case that a
function $f : \mathbb{R} \to \mathbb{R}$ will always have only a finite number of maxima in a
closed bounded interval. In the same way, it is not true that a moon need
have only a finite number of summits.

Exercise 7.3.24. Reread Example 7.1.5. Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = (\cos(1/x) - 1)\exp(-1/x^2) \quad\text{if } x \neq 0, \qquad f(0) = 0. \]
Show that $f$ is infinitely differentiable everywhere and that $f$ has an infinite
number of distinct strict local maxima in the interval $[-1, 1]$.

(Exercise K.42 belongs to the same circle of ideas.)
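For the function of Exercise 7.3.24 the first few maxima can be inspected directly. The sketch below assumes the natural candidates $x_n = 1/(2\pi n)$, where $\cos(1/x) = 1$, and compares $f$ there with its values a little to either side. Only small $n$ are used because $\exp(-1/x^2)$ underflows to zero in floating point once $x$ is small; the helper names and the 1% offset are assumptions.

```python
import math

def f(x):
    """f(x) = (cos(1/x) - 1) exp(-1/x^2) for x != 0, and f(0) = 0."""
    if x == 0.0:
        return 0.0
    return (math.cos(1.0 / x) - 1.0) * math.exp(-1.0 / (x * x))

def candidate_maxima(count):
    """The points x_n = 1 / (2 pi n) for n = 1, ..., count."""
    return [1.0 / (2.0 * math.pi * n) for n in range(1, count + 1)]

def beats_neighbours(x, relative_offset=0.01):
    """True if f(x) exceeds f at the two nearby points x (1 +/- relative_offset)."""
    return f(x) > f(x * (1.0 + relative_offset)) and f(x) > f(x * (1.0 - relative_offset))

for x in candidate_maxima(3):
    print(x, f(x), beats_neighbours(x))
```

The values of $f$ near these points are astronomically small, which is a reminder that a picture of this function would show nothing at all; the infinitely many maxima are visible only to the argument.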

The answer, once again, is to develop a suitable notion of genericity but

we shall not do so here.

Some will say that there is no need to answer these questions since
the plausible argument which establishes Plausible Statement 7.3.21 is in
some sense 'obviously correct'. I would reply that the reason for attacking

these questions is their intrinsic interest. Plausible Statement 7.3.21 and the

accompanying discussion are the occasion for us to ask these questions, not

the reason for trying to answer them. I would add that we cannot claim to

understand Maxwell's result fully unless we can see either how it generalises

to higher dimensions or why it does not.

Students often feel that multidimensional calculus is just a question of

generalising results from one dimension to many. Maxwell's result shows that
the change from one to many dimensions introduces genuinely new phenomena,
whose existence cannot be guessed from a one dimensional perspective.

Chapter 8

The Riemann integral

8.1 Where is the problem?

Everybody knows what area is, but then everybody knows what honey tastes

like. But does honey taste the same to you as it does to me? Perhaps the

question is unanswerable but, for many practical purposes, it is sufficient

that we agree on what we call honey. In the same way, it is important that,

when two mathematicians talk about area, they should agree on the answers

to the following questions:-

(1) Which sets E actually have area?

(2) When a set E has area, what is that area?

One of the discoveries of 20th century mathematics is that decisions on (1)

and (2) are linked in rather subtle ways to the question:-

(3) What properties should area have?

As an indication of the ideas involved, consider the following desirable

properties for area.

(a) Every bounded set $E$ in $\mathbb{R}^2$ has an area $|E|$ with $|E| \ge 0$.
(b) Suppose that $E$ is a bounded set in $\mathbb{R}^2$. If $E$ is congruent to $F$ (that
is $E$ can be obtained from $F$ by translation and rotation), then $|E| = |F|$.
(c) Any square $E$ of side $a$ has area $|E| = a^2$.
(d) If $E_1, E_2, \dots$ are disjoint bounded sets in $\mathbb{R}^2$ whose union
$F = \bigcup_{j=1}^{\infty} E_j$ is also bounded, then $|F| = \sum_{j=1}^{\infty} |E_j|$ (so 'the whole is
equal to the sum of its parts').

Exercise 8.1.1. Suppose that conditions (a) to (d) all hold.

(i) Let $A$ be a bounded set in $\mathbb{R}^2$ and $B \subseteq A$. By writing $A = B \cup (A \setminus B)$
and using condition (d) together with other conditions, show that $|A| \ge |B|$.
(ii) By using (i) and condition (c), show that, if $A$ is a non-empty bounded
open set in $\mathbb{R}^2$, then $|A| > 0$.


We now show that assuming all of conditions (a) to (d) leads to a contradiction.
We start with an easy remark.

Exercise 8.1.2. If $0 \le x, y < 1$, write $x \sim y$ whenever $x - y \in \mathbb{Q}$. Show
that if $x, y, z \in [0, 1)$ we have

(i) x ∼ x,

(ii) x ∼ y implies y ∼ x,

(iii) x ∼ y and y ∼ z together imply x ∼ z.

(In other words, ∼ is an equivalence relation.)

Write

[x] = {y ∈ [0, 1) : y ∼ x}.

(In other words, write [x] for the equivalence class of x.) By quoting the

appropriate theorem or direct proof, show that

(iv) [x] = [0, 1),