
\[
\int_{g(c)}^{g(d)} f(s)\,ds = \int_c^d f(g(x))g'(x)\,dx.
\]

4 Arnol'd calls it the Newton-Leibniz-Gauss-Green-Ostrogradskii-Stokes-Poincaré theorem but most mathematicians call it the generalised Stokes' theorem or just Stokes' theorem.
188 A COMPANION TO ANALYSIS

Exercise 8.3.14. (i) Prove Theorem 8.3.13 by considering
\[
U(t) = \int_{g(c)}^{g(t)} f(s)\,ds - \int_c^t f(g(x))g'(x)\,dx.
\]

(ii) Derive Theorem 8.3.11 from Theorem 8.3.13 by choosing f appropriately.
(iii) Strengthen Theorem 8.3.13 along the lines of Exercise 8.3.12.
(iv) (An alternative proof.) If f is as in Theorem 8.3.13, explain why we can find an F : (α, β) → R with F' = f. Obtain Theorem 8.3.13 by applying the chain rule to F(g(x)), noting that F'(g(x))g'(x) = f(g(x))g'(x).
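The change of variables formula is easy to check numerically. The following sketch is mine, not the book's: it compares the two sides of Theorem 8.3.13 for the sample choices f(s) = s², g(x) = x³, using a simple midpoint-rule integrator.

```python
# Numerical check of \int_{g(c)}^{g(d)} f(s) ds = \int_c^d f(g(x)) g'(x) dx
# for the sample choices f(s) = s^2, g(x) = x^3 (so g'(x) = 3x^2).

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

f = lambda s: s ** 2
g = lambda x: x ** 3
g_prime = lambda x: 3 * x ** 2

c, d = 0.0, 1.0
lhs = midpoint_integral(f, g(c), g(d))                         # direct integral of f
rhs = midpoint_integral(lambda x: f(g(x)) * g_prime(x), c, d)  # substituted form

assert abs(lhs - rhs) < 1e-6
assert abs(lhs - 1 / 3) < 1e-6   # exact value: \int_0^1 s^2 ds = 1/3
```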
Because the proof of Theorem 8.3.13 is so simple, and because the main use of the result in elementary calculus is to evaluate integrals, there is a tendency to underestimate the importance of this result. However, it is important for later developments that the reader has an intuitive grasp of this result.
Exercise 8.3.15. (i) Suppose that f : R → R is the constant function f(t) = K and that g : R → R is the linear function g(t) = λt + μ. Show by direct calculation that
\[
\int_{g(c)}^{g(d)} f(s)\,ds = \int_c^d f(g(x))g'(x)\,dx,
\]

and describe the geometric content of this result in words.
(ii) Suppose now that f : R → R and g : R → R are well behaved functions. By splitting [c, d] into small intervals on which f is 'almost constant' and g is 'almost linear', give a heuristic argument for the truth of Theorem 8.3.13. To see how this heuristic argument can be converted into a rigorous one, consult Exercise K.118.
Exercise 8.3.16. There is one peculiarity in our statement of Theorem 8.3.13 which is worth noting. We do not demand that g be bijective. Suppose that f : R → R is continuous and g(t) = sin t. Show that, by choosing different intervals (c, d), we obtain
\[
\int_0^{\sin\alpha} f(s)\,ds = \int_0^{\alpha} f(\sin x)\cos x\,dx = \int_0^{\alpha+2\pi} f(\sin x)\cos x\,dx = \int_0^{\pi-\alpha} f(\sin x)\cos x\,dx.
\]

Explain what is going on.
The extra flexibility given by allowing g not to be bijective is one we are usually happy to sacrifice in the interests of generalising Theorem 8.3.13.
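It is instructive to check the equalities of Exercise 8.3.16 numerically. The sketch below is my own illustration (the book has no code); it takes f = exp and α = 1, and all three right-hand integrals agree with the direct integral even though sin is far from injective on the longer intervals.

```python
import math

def midpoint_integral(func, a, b, n=200_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

f = math.exp          # any continuous f will do; exp is a sample choice
alpha = 1.0

lhs = midpoint_integral(f, 0.0, math.sin(alpha))
i1 = midpoint_integral(lambda x: f(math.sin(x)) * math.cos(x), 0.0, alpha)
i2 = midpoint_integral(lambda x: f(math.sin(x)) * math.cos(x), 0.0, alpha + 2 * math.pi)
i3 = midpoint_integral(lambda x: f(math.sin(x)) * math.cos(x), 0.0, math.pi - alpha)

# All three right-hand sides equal \int_0^{\sin\alpha} f(s) ds = e^{\sin 1} - 1,
# because sin takes the same end value on each interval.
assert abs(i1 - lhs) < 1e-6
assert abs(i2 - lhs) < 1e-6
assert abs(i3 - lhs) < 1e-6
```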
189
Please send corrections however trivial to twk@dpmms.cam.ac.uk

Exercise 8.3.17. The following exercise is traditional.
(i) Show that integration by substitution, using x = 1/t, gives
\[
\int_a^b \frac{dx}{1+x^2} = \int_{1/b}^{1/a} \frac{dt}{1+t^2}
\]
when b > a > 0.
(ii) If we set a = −1, b = 1 in the formula of (i), we obtain
\[
\int_{-1}^{1} \frac{dx}{1+x^2} \overset{?}{=} -\int_{-1}^{1} \frac{dt}{1+t^2}.
\]
Explain this apparent failure of the method of integration by substitution.
(iii) Write the result of (i) in terms of tan⁻¹ and prove it using standard trigonometric identities.
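A numerical look at both parts makes the contrast vivid. The sketch below is my own (the endpoints a = 1, b = 2 are sample choices): the identity holds when b > a > 0, while for a = −1, b = 1 the two sides differ in sign, since the substitution x = 1/t is not even defined at x = 0, which lies inside the interval.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

integrand = lambda x: 1.0 / (1.0 + x * x)

# (i) Valid case: b > a > 0, so x = 1/t never passes through 0.
a, b = 1.0, 2.0
assert abs(midpoint_integral(integrand, a, b)
           - midpoint_integral(integrand, 1 / b, 1 / a)) < 1e-6
# In terms of tan^{-1}: atan(b) - atan(a) = atan(1/a) - atan(1/b).
assert abs((math.atan(b) - math.atan(a))
           - (math.atan(1 / a) - math.atan(1 / b))) < 1e-12

# (ii) With a = -1, b = 1 the formula fails: the left side is pi/2 > 0,
# while the claimed right side is its negative.
lhs = midpoint_integral(integrand, -1.0, 1.0)
assert abs(lhs - math.pi / 2) < 1e-6
```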
In sections 5.4 and 5.6 we gave a treatment of the exponential and logarithmic functions based on differentiation. The reader may wish to look at Exercise K.126 in which we use integration instead.
Another result which can be proved in much the same manner as Theorems 8.3.11 and 8.3.13 is the lemma which justifies integration by parts. (Recall the notation [h(x)]_a^b = h(b) − h(a).)

Lemma 8.3.18. Suppose that f : (α, β) → R has continuous derivative and g : (α, β) → R is continuous. Let G : (α, β) → R be an indefinite integral of g. Then, if [a, b] ⊆ (α, β), we have
\[
\int_a^b f(x)g(x)\,dx = [f(x)G(x)]_a^b - \int_a^b f'(x)G(x)\,dx.
\]

Exercise 8.3.19. (i) Obtain Lemma 8.3.18 by differentiating an appropriate U in the style of the proofs of Theorems 8.3.11 and 8.3.13. Quote carefully the results that you use.
(ii) Obtain Lemma 8.3.18 by integrating both sides of the equality (uv)' = u'v + uv' and choosing appropriate u and v. Quote carefully the results that you use.
(iii) Strengthen Lemma 8.3.18 along the lines of Exercise 8.3.12.
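Lemma 8.3.18 can likewise be checked numerically. In the sketch below (mine, not the book's) the choices f(x) = x² and g(x) = cos x, so that G(x) = sin x, are illustrative only.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

f = lambda x: x * x          # f'(x) = 2x
f_prime = lambda x: 2 * x
g = math.cos                 # an indefinite integral of g is G = sin
G = math.sin

a, b = 0.0, 1.0
lhs = midpoint_integral(lambda x: f(x) * g(x), a, b)
boundary = f(b) * G(b) - f(a) * G(a)            # [f(x)G(x)]_a^b
rhs = boundary - midpoint_integral(lambda x: f_prime(x) * G(x), a, b)

assert abs(lhs - rhs) < 1e-6   # Lemma 8.3.18 for these sample functions
```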
Integration by parts gives a global Taylor theorem with a form that is
easily remembered and proved for examination.
Theorem 8.3.20. (A global Taylor's theorem with integral remainder.) If f : (u, v) → R is n times continuously differentiable and 0 ∈ (u, v), then
\[
f(t) = \sum_{j=0}^{n-1} \frac{f^{(j)}(0)}{j!}\, t^j + R_n(f, t)
\]

where
\[
R_n(f, t) = \frac{1}{(n-1)!} \int_0^t (t - x)^{n-1} f^{(n)}(x)\,dx.
\]

Exercise 8.3.21. By integration by parts, show that
\[
R_{n-1}(f, t) = \frac{f^{(n-1)}(0)}{(n-1)!}\, t^{n-1} + R_n(f, t).
\]
Use repeated integration by parts to obtain Theorem 8.3.20.
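For a concrete instance, take f = exp, so that f^(j)(0) = 1 for every j. The sketch below is my own: it computes R_n(f, t) by numerical integration and checks both the statement of Theorem 8.3.20 and the step-down identity of Exercise 8.3.21.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def remainder(n, t):
    """R_n(f, t) = 1/(n-1)! * \\int_0^t (t - x)^{n-1} f^{(n)}(x) dx for f = exp."""
    return midpoint_integral(lambda x: (t - x) ** (n - 1) * math.exp(x),
                             0.0, t) / math.factorial(n - 1)

t, n = 1.5, 5
partial_sum = sum(t ** j / math.factorial(j) for j in range(n))  # f^{(j)}(0) = 1

# Theorem 8.3.20: f(t) = sum_{j<n} f^{(j)}(0) t^j / j! + R_n(f, t).
assert abs(math.exp(t) - (partial_sum + remainder(n, t))) < 1e-6

# Exercise 8.3.21: R_{n-1}(f, t) = f^{(n-1)}(0) t^{n-1} / (n-1)! + R_n(f, t).
step = t ** (n - 1) / math.factorial(n - 1) + remainder(n, t)
assert abs(remainder(n - 1, t) - step) < 1e-6
```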
Exercise 8.3.22. Reread Example 7.1.5. If F is as in that example, identify R_{n−1}(F, t).
Exercise 8.3.23. If f : (−a, a) → R is n times continuously differentiable with |f⁽ⁿ⁾(t)| ≤ M for all t ∈ (−a, a), show that
\[
\left| f(t) - \sum_{j=0}^{n-1} \frac{f^{(j)}(0)}{j!}\, t^j \right| \le \frac{M|t|^n}{n!}.
\]

Explain why this result is slightly weaker than that of Exercise 7.1.1 (v).
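With f = sin one may take M = 1 for every n, and the bound of Exercise 8.3.23 can then be tested directly. The sketch below is mine; it uses the fact that the derivatives of sin at 0 cycle through 0, 1, 0, −1.

```python
import math

# f = sin: f^{(j)}(0) cycles through 0, 1, 0, -1, and |f^{(n)}(t)| <= M = 1
# for every n and t, so the hypothesis of Exercise 8.3.23 holds on all of R.
def sin_deriv_at_0(j):
    return (0, 1, 0, -1)[j % 4]

M = 1.0
for n in (3, 5, 8):
    for t in (-2.0, -0.5, 0.7, 2.5):
        partial = sum(sin_deriv_at_0(j) * t ** j / math.factorial(j)
                      for j in range(n))
        bound = M * abs(t) ** n / math.factorial(n)
        # |f(t) - partial sum| <= M |t|^n / n!
        assert abs(math.sin(t) - partial) <= bound + 1e-12
```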
There are several variants of Theorem 8.3.20 with different expressions for R_n(f, t) (see, for example, Exercise K.49 (vi)). However, although the theory of the Taylor expansion is very important (see, for example, Exercise K.125 and Exercise K.266), these global theorems are not much used in relation to specific functions outside the examination hall. We discuss two of the reasons why at the end of Section 11.5. In Exercises 11.5.20 and 11.5.22 I suggest that it is usually easier to obtain Taylor series by power series solutions rather than by using theorems like Theorem 8.3.20. In Exercise 11.5.23 I suggest that power series are often not very suitable for numerical computation.


8.4 First steps in the calculus of variations
The most famous early problem in the calculus of variations is that of the brachistochrone. It asks for the equation y = f(x) of the wire down which a frictionless particle with initial velocity v will slide from one point (a, α) to another (b, β) (so f(a) = α, f(b) = β, a ≠ b and α > β) in the shortest time.
It turns out that the time taken by the particle is
\[
J(f) = \frac{1}{(2g)^{1/2}} \int_a^b \left( \frac{1 + f'(x)^2}{\kappa - f(x)} \right)^{1/2} dx
\]
where κ = v²/(2g) + α and g is the acceleration due to gravity.

Exercise 8.4.1. If you know sufficient mechanics, verify this. (Your argument will presumably involve arc length, which has not yet been mentioned in this book.)
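To get a feel for J, here is a rough numerical evaluation, entirely my own sketch. For the straight wire f(x) = 1 − x from (0, 1) to (1, 0) with v = 0 (so κ = α = 1), the integrand reduces to (2/x)^{1/2}, and the integral has exact value 2√2, so J(f) = 2/√g; the midpoint rule reproduces this despite the integrable singularity at x = 0.

```python
import math

def midpoint_integral(func, a, b, n=200_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

g_accel = 9.81                            # acceleration due to gravity
a, b = 0.0, 1.0
alpha, v = 1.0, 0.0
kappa = v ** 2 / (2 * g_accel) + alpha    # here kappa = 1

f = lambda x: 1.0 - x                     # straight wire: f(a) = alpha, f(b) = 0
f_prime = lambda x: -1.0

def time_functional():
    """J(f) = (2g)^{-1/2} * int_a^b ((1 + f'(x)^2)/(kappa - f(x)))^{1/2} dx."""
    integrand = lambda x: math.sqrt((1 + f_prime(x) ** 2) / (kappa - f(x)))
    return midpoint_integral(integrand, a, b) / math.sqrt(2 * g_accel)

# Exact value for this wire: (1/sqrt(2g)) * int_0^1 sqrt(2/x) dx = 2/sqrt(g).
assert abs(time_functional() - 2 / math.sqrt(g_accel)) < 1e-2
```

(The midpoint rule converges slowly here because of the singularity at x = 0, which is why the tolerance is loose.)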
This is a minimisation problem very different from those dealt with in elementary calculus. Those problems ask us to choose a point x₀ from a one-dimensional space which minimises some function g(x). In section 7.3 we considered problems in which we sought to choose a point x₀ from an n-dimensional space which minimises some function g(x). Here we seek to choose a function f₀ from an infinite dimensional space to minimise a function J(f) of functions f.
Exercise 8.4.2. In the previous sentence we used the words 'infinite dimensional' somewhat loosely. However, we can make precise statements along the same lines.
(i) Show that the collection P of polynomials P with P(0) = P(1) = 0 forms a vector space over R with the obvious operations. Show that P is infinite dimensional (in other words, has no finite spanning set).
(ii) Show that the collection E of infinitely differentiable functions f : [0, 1] → R with f(0) = f(1) forms a vector space over R with the obvious operations. Show that E is infinite dimensional.
John Bernoulli published the brachistochrone problem as a challenge in 1696. Newton, Leibniz, L'Hôpital, John Bernoulli and James Bernoulli all found solutions within a year⁵. However, it is one thing to solve a particular problem and quite another to find a method of attack for the general class of problems to which it belongs. Such a method was developed by Euler and Lagrange. We shall see that it does not resolve all difficulties but it represents a marvellous leap of imagination.
We begin by proving that, under certain circumstances, we can interchange the order of integration and differentiation. (We will extend the result in Theorem 11.4.21.)
Theorem 8.4.3. (Differentiation under the integral.) Let (a', b') × (c', d') ⊇ [a, b] × [c, d]. Suppose that g : (a', b') × (c', d') → R is continuous and that the partial derivative g_{,2} exists and is continuous. Then, writing G(y) = ∫_a^b g(x, y) dx, we have G differentiable on (c, d) with
\[
G'(y) = \int_a^b g_{,2}(x, y)\,dx.
\]
5 They were giants in those days. Newton had retired from mathematics and submitted his solution anonymously. 'But', said John Bernoulli, 'one recognises the lion by his paw.'

This result is more frequently written as
\[
\frac{d}{dy} \int_a^b g(x, y)\,dx = \int_a^b \frac{\partial g}{\partial y}(x, y)\,dx,
\]
and interpreted as 'the d clambers through the integral and curls up'. If we use the D notation we get
\[
G'(y) = \int_a^b D_2 g(x, y)\,dx.
\]

It may, in the end, be more helpful to note that ∫_a^b g(x, y) dx is a function of the single variable y, but g(x, y) is a function of the two variables x and y.
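Before the proof, a numerical sanity check may be helpful. The sketch below is mine; the kernel g(x, y) = sin(xy), so that g_{,2}(x, y) = x cos(xy), is just a convenient example. A central difference of G should agree with the integral of the partial derivative.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def G(y):
    """G(y) = \\int_0^1 sin(xy) dx."""
    return midpoint_integral(lambda x: math.sin(x * y), 0.0, 1.0)

def G_prime_via_theorem(y):
    """Differentiate under the integral: integrate g_{,2}(x, y) = x cos(xy)."""
    return midpoint_integral(lambda x: x * math.cos(x * y), 0.0, 1.0)

y, h = 0.8, 1e-5
central_difference = (G(y + h) - G(y - h)) / (2 * h)
assert abs(central_difference - G_prime_via_theorem(y)) < 1e-5
```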
Proof. We use a proof technique which is often useful in this kind of situation (we have already used a simple version in Theorem 8.3.6, when we proved the fundamental theorem of the calculus).
We first put everything under one integral sign. Suppose y, y + h ∈ (c, d) and h ≠ 0. Then
\[
\left| \frac{G(y+h) - G(y)}{h} - \int_a^b g_{,2}(x, y)\,dx \right|
= \frac{1}{|h|} \left| G(y+h) - G(y) - \int_a^b h g_{,2}(x, y)\,dx \right|
= \frac{1}{|h|} \left| \int_a^b \big( g(x, y+h) - g(x, y) - h g_{,2}(x, y) \big)\,dx \right|.
\]
In order to estimate the last integral we use the simple result (Exercise 8.2.13 (iv))
\[
|\text{integral}| \le \text{length} \times \sup,
\]
which gives us
\[
\frac{1}{|h|} \left| \int_a^b \big( g(x, y+h) - g(x, y) - h g_{,2}(x, y) \big)\,dx \right|
\le \frac{b-a}{|h|} \sup_{x \in [a,b]} |g(x, y+h) - g(x, y) - h g_{,2}(x, y)|.
\]

We expect |g(x, y+h) − g(x, y) − hg_{,2}(x, y)| to be small when h is small because the definition of the partial derivative tells us that g(x, y+h) − g(x, y) ≈ hg_{,2}(x, y). In such circumstances, the mean value theorem is frequently useful. In this case, setting f(t) = g(x, y+t) − g(x, y) − tg_{,2}(x, y), the mean value theorem tells us that
\[
|f(h)| = |f(h) - f(0)| \le |h| \sup_{0 \le \theta \le 1} |f'(\theta h)|
\]

and so
\[
|g(x, y+h) - g(x, y) - h g_{,2}(x, y)| \le |h| \sup_{0 \le \theta \le 1} |g_{,2}(x, y + \theta h) - g_{,2}(x, y)|.
\]

There is one further point to notice. Since we are taking a supremum over all x ∈ [a, b], we shall need to know, not merely that we can make |g_{,2}(x, y + θh) − g_{,2}(x, y)| small at a particular x by taking h sufficiently small, but that we can make |g_{,2}(x, y + θh) − g_{,2}(x, y)| uniformly small for all x. However, we know that g_{,2} is continuous on [a, b] × [c, d] and that a function which is continuous on a closed bounded set is uniformly continuous, and this will enable us to complete the proof.
Let ε > 0. By Theorem 4.5.5, g_{,2} is uniformly continuous on [a, b] × [c, d] and so we can find a δ(ε) > 0 such that
\[
|g_{,2}(x, y) - g_{,2}(u, v)| \le \varepsilon/(b-a)
\]
whenever (x − u)² + (y − v)² < δ(ε)² and (x, y), (u, v) ∈ [a, b] × [c, d]. It follows that, if y, y + h ∈ (c, d) and |h| < δ(ε), then
\[
\sup_{0 \le \theta \le 1} |g_{,2}(x, y + \theta h) - g_{,2}(x, y)| \le \varepsilon/(b-a)
\]
for all x ∈ [a, b]. Putting all our results together, we have shown that
\[
\left| \frac{G(y+h) - G(y)}{h} - \int_a^b g_{,2}(x, y)\,dx \right| \le \varepsilon
\]
whenever y, y + h ∈ (c, d) and 0 < |h| < δ(ε), and the result follows.
Exercise 8.4.4. Because I have tried to show where the proof comes from, the proof above is not written in a very economical way. Rewrite it more economically.
A favourite examiner's variation on the theme of Theorem 8.4.3 is given in Exercise K.132.
Exercise 8.4.5. In what follows we will use a slightly different version of Theorem 8.4.3.
Suppose that g : [a, b] × [c, d] → R is continuous and that the partial derivative g_{,2} exists and is continuous. Then, writing G(y) = ∫_a^b g(x, y) dx, we have G differentiable on [c, d] with
\[
G'(y) = \int_a^b g_{,2}(x, y)\,dx.
\]
Explain what this means in terms of left and right derivatives and prove it.

The method of Euler and Lagrange applies to the following class of problems. Suppose that F : R³ → R has continuous second partial derivatives. We consider the set A of functions f : [a, b] → R which are differentiable with continuous derivative and are such that f(a) = α and f(b) = β. We write
\[
J(f) = \int_a^b F(t, f(t), f'(t))\,dt
\]
and seek to minimise J, that is, to find an f₀ ∈ A such that
\[
J(f_0) \le J(f)
\]
whenever f ∈ A.
In section 7.3, when we asked if a particular point x₀ from an n-dimensional space minimised g : Rⁿ → R, we examined the behaviour of g close to x₀. In other words, we looked at g(x₀ + ηu) when u was an arbitrary vector and η was small. The idea of Euler and Lagrange is to look at
\[
G_h(\eta) = J(f_0 + \eta h)
\]
where h : [a, b] → R is differentiable with continuous derivative and is such that h(a) = 0 and h(b) = 0 (we shall call the set of such functions E). We observe that G_h is a function from R to R and that G_h has a minimum at 0 if J is minimised by f₀. This observation, combined with some very clever, but elementary, calculus gives the celebrated Euler-Lagrange equation.

Theorem 8.4.6. Suppose that F : R³ → R has continuous second partial derivatives. Consider the set A of functions f : [a, b] → R which are differentiable with continuous derivative and are such that f(a) = α and f(b) = β. We write
\[
J(f) = \int_a^b F(t, f(t), f'(t))\,dt.
\]
If f ∈ A is such that
\[
J(f) \le J(g)
\]
whenever g ∈ A, then
\[
F_{,2}(t, f(t), f'(t)) = \frac{d}{dt} F_{,3}(t, f(t), f'(t)).
\]

Proof. We use the notation of the paragraph preceding the statement of the theorem. If h ∈ E (that is to say, h : [a, b] → R is differentiable with continuous derivative and is such that h(a) = 0 and h(b) = 0) then the chain rule tells us that the function g_h : R² → R given by
\[
g_h(\eta, t) = F(t, f(t) + \eta h(t), f'(t) + \eta h'(t))
\]
has continuous partial derivative
\[
g_{h,1}(\eta, t) = h(t) F_{,2}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) + h'(t) F_{,3}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)).
\]
Thus, by Theorem 8.4.3, we may differentiate under the integral to show that G_h is differentiable everywhere with
\[
G_h'(\eta) = \int_a^b \big( h(t) F_{,2}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) + h'(t) F_{,3}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) \big)\,dt.
\]
If f minimises J, then 0 minimises G_h and so G_h'(0) = 0. We deduce that
\[
0 = \int_a^b \big( h(t) F_{,2}(t, f(t), f'(t)) + h'(t) F_{,3}(t, f(t), f'(t)) \big)\,dt
= \int_a^b h(t) F_{,2}(t, f(t), f'(t))\,dt + \int_a^b h'(t) F_{,3}(t, f(t), f'(t))\,dt.
\]
Using integration by parts and the fact that h(a) = h(b) = 0, we obtain
\[
\int_a^b h'(t) F_{,3}(t, f(t), f'(t))\,dt = [h(t) F_{,3}(t, f(t), f'(t))]_a^b - \int_a^b h(t) \frac{d}{dt} F_{,3}(t, f(t), f'(t))\,dt
= - \int_a^b h(t) \frac{d}{dt} F_{,3}(t, f(t), f'(t))\,dt.
\]
Combining the results of the last two sentences, we see that
\[
0 = \int_a^b h(t) \left( F_{,2}(t, f(t), f'(t)) - \frac{d}{dt} F_{,3}(t, f(t), f'(t)) \right) dt.
\]
Since this result must hold for all h ∈ E, we see that
\[
F_{,2}(t, f(t), f'(t)) - \frac{d}{dt} F_{,3}(t, f(t), f'(t)) = 0
\]
for all t ∈ [a, b] (for details see Lemma 8.4.7 below) and this is the result we set out to prove.
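A small numerical experiment (again my own, not the book's) illustrates the theorem for F(t, u, p) = p², so that J(f) = ∫₀¹ f'(t)² dt with f(0) = 0, f(1) = 1. Here the Euler-Lagrange equation reads (d/dt)(2f'(t)) = 0, so the minimiser is affine: f₀(t) = t, and every admissible perturbation f₀ + ηh raises J.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# J(f) = \int_0^1 F(t, f(t), f'(t)) dt with F(t, u, p) = p^2; J depends
# only on f', so we pass the derivative directly.
def J(f_prime):
    return midpoint_integral(lambda t: f_prime(t) ** 2, 0.0, 1.0)

f0_prime = lambda t: 1.0                              # candidate minimiser f0(t) = t
h_prime = lambda t: math.pi * math.cos(math.pi * t)   # h(t) = sin(pi t): h(0) = h(1) = 0

J0 = J(f0_prime)                                      # = 1
for eta in (-0.5, -0.1, 0.1, 0.5):
    perturbed = lambda t, e=eta: f0_prime(t) + e * h_prime(t)
    # Exact value: J(f0 + eta h) = 1 + eta^2 pi^2 / 2 > J(f0).
    assert J(perturbed) > J0
    assert abs(J(perturbed) - (1 + eta ** 2 * math.pi ** 2 / 2)) < 1e-6

assert abs(J0 - 1.0) < 1e-9
```

(The cross term 2η∫₀¹ h'(t) dt vanishes because h(0) = h(1) = 0, which is exactly the boundary condition exploited in the proof.)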
