<< . .

. 27
( : 70)



. . >>


If we choose nj ’ ∞ and mj ’ ∞ in such a way that m2 /nj ’ » as
j
j ’ ∞, what happens to A(mj , nj )? Can you choose nj ’ ∞ and mj ’ ∞
so that A(mj , nj ) ’ ∞ as j ’ ∞? Can you choose nj ’ ∞ and mj ’ ∞
so that A(mj , nj ) is bounded but does not converge? Can you choose nj ’ ∞
and mj ’ ∞ so that A(mj , nj ) ’ π?
Exercise 9.5.15. Explain in words and without using calculations why, if
we ¬x n, A(m, n) ’ ∞ as m ’ ∞. (Unless you can give a simple ge-
ometric account of what is going on, you have not understood Schwarz™s
example.) Deduce directly that we can choose nj ’ ∞ and mj ’ ∞ such
that A(mj , nj ) ’ ∞ as j ’ ∞. [Thus, if we simply want to show that the
naive approach fails, we do not need the calculations of the previous exercise.
However, if we want more information (for example, necessary and su¬cient
conditions for A(mj , nj ) ’ 2π) then we must do the calculations.]
Once the reader has grasped the point of Exercise 9.5.15, she may feel
that it is obvious what is wrong. However, the problem is not that we cannot
see what the area of a cylinder ought to be, but that we cannot think of a
simple de¬nition of area which will apply to general surfaces in the same
way that our de¬nition of length applied to general curves. Up to now, all
attempts to produce a de¬nition of area to parallel our de¬nition of length
have failed12 .
If a surface is su¬ciently well behaved, it is more or less clear how to frame
a notion of approximation by polyhedra which excludes Schwarz™s example.
But, if a surface is su¬ciently well behaved, we can produce a rigorous de¬ni-
tion of its area by tidying up the standard mathematical methods de¬nition.
If we are going to do this for the area of a surface and its higher dimensional
analogues, it seems a waste of time to have a special theory for length. Life,
as the saying goes, is too short to stu¬ an olive. The next exercise gives the
standard development.
Exercise 9.5.16. (Standard treatment of line integrals.) We deal only
with curves γ : [a, b] ’ Rm which are continuously di¬erentiable (so that,
writing
γ(t) = (γ1 (t), γ2 (t), . . . , γm (t)),
we know that γj : [a, b] ’ R exists and is continuous). If f : Rm ’ R is
continuous, we de¬ne
b
f (γ1 (t), γ2 (t), . . . , γm (t))(γ1 (t)2 + γ2 (t)2 + · · · + γm (t)2 )1/2 dt.
f (x) ds =
a
γ
12
There are other ways of tackling this problem. Once again I refer the reader to the
circle of ideas associated with the name of Hausdor¬.
232 A COMPANION TO ANALYSIS



Suppose that θ : [c, d] ’ [a, b] is a surjective function with continuous
derivative. Explain why „ = γ —¦ θ is a continuously di¬erentiable function
from [c, d] to Rm . Show that, if θ is injective, then

f (x) ds = f (x) ds,
„ γ

but give an example to show that, if θ is not injective, this may not hold.
We de¬ne the length of γ to be 1 ds.
γ
Chapter 10

Metric spaces

Sphere packing ™
10.1
(In this section and the next we are much more interested in ideas than
rigour. We shall use methods and results which go well beyond the scope of
this book.)
Human beings prefer order to disorder. Asked to pack a crate of oranges,
we do not throw them in at random, but try to pack them in a regular
pattern. But choosing patterns requires insight and sometimes we have no
insight. Consider the problem of packing n dimensional balls in a very large
box. (We shall take our balls and boxes to be open, but it should be clear
that such details do not matter.)
As an example of a regular packing, let the balls have radius 1/2 and let
us adopt a cubical packing so that the centre of each ball is one of integer
points Zn . Is this reasonably e¬cient or not? To answer this question it is
helpful to know the volume Vn (r) of an n dimensional ball of radius r.
Lemma 10.1.1. (i) If we write Vn = Vn (1), then Vn (r) = Vn rn .
(ii) If a > 0 and I[0,a] : [0, ∞) ’ R is de¬ned by I[0,a] (t) = 1 if t ∈ [0, a],
I[0,a] (t) = 0, otherwise, then

I[0,a] ( r ) dV = nVn I[0,a] (r)rn’1 dr.
Rn 0
k
(iii) If f : [0, ∞) ’ R is given by f = »j I[0,aj ] , then
j=1

f (r)rn’1 dr.
f ( r ) dV = nVn
Rn 0
(iv) If f : [0, ∞) ’ R is such that f (r)r n+1 ’ 0 as r ’ ∞, then

f (r)rn’1 dr.
f ( r ) dV = nVn
Rn 0

233
234 A COMPANION TO ANALYSIS

(v) Taking f (r) = exp(’r 2 /2) in (ii), we obtain

n/2
rn’1 exp(’r2 /2) dr,
(2π) = nVn
0

and so
πn n!22n π n’1
V2n = , V2n’1 = .
n! (2n)!

Sketch proof. (i) Use similarity.
(ii) This is a restatement of (i).
(iii) Use linearity of the integral.
(iv) Use an approximation argument.
(v) Using repeated integration,
∞ ∞ ∞
exp(’(x2 + x2 + · · · + x2 )/2) dx1 dx2 . . . dxn
···
f ( r ) dV = 1 2 n
Rn ’∞ ’∞ ’∞
∞ ∞ ∞
exp(’x2 /2) dx1 exp(’x2 /2) dx2 · · · exp(’x2 /2) dxn = (2π)n/2 .
= 1 2 n
’∞ ’∞ ’∞

On the other hand, integration by parts gives
∞ ∞
n’1 2
rn’3 exp(’r2 /2) dr,
exp(’r /2) dr = (n ’ 2)
r
0 0

so

n/2
rn’1 exp(’r2 /2) dr
(2π) = nVn
0

rn’3 exp(’r2 /2) dr = (2π)(n’2)/2 nVn /Vn’2
= n(n ’ 2)Vn
0

and Vn = (2π)Vn’2 /n. The stated results follow by induction.
Remark: Since we are not seeking rigour in this section, I have only sketched
a proof. However, few mathematicians would demand more proof for this
result. As I emphasise in Appendix C, we need rigour when we make general
statements which must apply to objects we have not yet even imagined. Here,
we need a result about a speci¬c property of a speci¬c object (the volume of
an n-dimension sphere), so we can have much more con¬dence in our sketched
proof.
If we use a cubical packing, the proportion of space occupied by balls is
2 Vn (think about the ball center q radius 1/2 inside the cube n (qj ’
’n
j=1
235
Please send corrections however trivial to twk@dpmms.cam.ac.uk

1/2, qj ’ 1/2)) and Lemma 10.1.1 shows that this proportion drops very
rapidly indeed as n becomes large.
What happens if we ˜just place balls wherever we can™. Suppose we are
trying to ¬ll the cube

C = {x : ’N ’ 1/2 < xj < N + 1/2 [1 ¤ j ¤ n]}

with non-intersecting balls of radius 1/2, and that we have managed to place
m such balls

Sk = {x : x ’ yk < 1/2} [1 ¤ k ¤ m].

Now consider the balls with the same centres and doubled radius

σk = {x : x ’ yk < 1} [1 ¤ k ¤ m].

A little thought shows that, if y lies inside the cube

C = {x : ’N < xj < N [1 ¤ j ¤ n]}

and does not lie in any σk , then the ball

S = {x : x ’ y < 1/2}

lies in C and does not intersect any Sk , so we may add another ball to our
collection. Such a point y will always exist if
m
Vol C > Vol σk
k=1
m m
σk , and so C \ σk = …). Thus if
(since then Vol C > Vol k=1 k=1

Vol C > mVn

we can add extra balls and so, by just ¬lling up available spaces, without
following any pattern, we can ¬nd at least Vol C /Vn disjoint balls of radius
1/2 in C. Since the volume of a ball of radius 1/2 is 2’n Vn the proportion of
1
space occupied by balls is 2’n Vol C / Vol C = 2’n (1 + 2N )’n so, for N large,
the proportion of space occupied by balls is essentially 2’n , an astounding
gain in e¬ciency over cubical packing.
What morals should we draw? The most immediate is that we do not
know what an n-dimensional ball looks like! (Here, ˜we™ refers to the writer
and most of his audience. Specialists in topics like the geometry of Banach
spaces do have substantial insight.) The second is that, when we have little
insight, trying to impose order on things may well be counter productive.
236 A COMPANION TO ANALYSIS

Shannon™s theorem ™
10.2
It costs money to send messages. In the old days, telegrams might be charged
for by the word. Nowadays, we pay for communication channels which carry
so many bits per second (so we are charged by the bit). Suppose that I can
a¬ord to send three bits of information to my partner. Then we can make
up eight words 000, 001, . . . and use them to send eight di¬erent messages.
For example, 000 could mean ˜send money™, 001 ˜come at once™, 010 ˜sell all
shares™, 011 ˜¬‚ee the country™ and so on. Unfortunately, bits are sometimes
received wrongly, so 011 could be sent and 001 received with unfortunate
consequences. We might, therefore, decide to have only two messages 000
(˜all well™) and 111 (˜¬‚ee at once™) so that, if at most one digit was received
wrongly, my partner could still take the correct action.
Exercise 10.2.1. Explain why the last statement is true.
If I can a¬ord to send 100 bits, then we can still decide to only have two
messages 0000 . . . 0 and 1111 . . . 1 but, if mistakes are rare, this is extremely
wasteful. How many di¬erent messages can we send and still be reasonably
con¬dent that we can tell which message was transmitted even when errors
occur? This section gives a simple but very useful model for the problem
and solves it.
Consider X = {0, 1}n as a vector space over F2 1 . Each element x ∈
{0, 1}n may be considered as a ˜word™ so X contains 2n possible words. In
transmitting a word over a noisy channel it may become corrupted to
x+e
where each coordinate ej of the random error e takes the value 1 with prob-
ability p and the value 0 with probability 1 ’ p, independent of the other
coordinates [0 < p < 1/2].
Exercise 10.2.2. Why do we not have to consider p > 1/2? Why is it
hopeless to consider p = 1/2?
If we choose q such that 1/2 > q > p > 0, then the law of large numbers
tells us that, provided n is large enough, it is very unlikely that more than
qn of the j are non-zero2 We may, therefore, simplify our problem to one in
which we know that at most qn of the j are non-zero.
It is natural to introduce the following notion of the distance d(x, y)
between two words (the ˜Hamming distance™).
1
This means that when we add two vectors x and y the jth component of x + y has
the form xj + yj where 0 + 0 = 1 + 1 = 0 and 1 + 0 = 0 + 1 = 1.
2
We prove this result directly in Exercise K.173.
237
Please send corrections however trivial to twk@dpmms.cam.ac.uk

De¬nition 10.2.3. If x, y ∈ X, we write
n
|xi ’ yi |.
d(x, y) =
j=1


Suppose that we only transmit the words y1 , y2 , . . . ym and that the
balls

Sk = {x : d(x, yk ) < qn} [1 ¤ k ¤ m],

do not intersect. Since there are no more than qn errors, the received message
will lie in one of the balls Sj , say, and the transmitted message will have been
yj . Thus our system allows us to communicate m distinct messages correctly
in spite of the random noise e.
This sounds impressive, but is not very useful unless m is reasonably
large. How large can m be? This is clearly a variant of our orange packing
problem. The natural way to proceed is to de¬ne the volume of a subset E
of X to be the number of points in E. Reusing the ideas of our previous
argument, we consider the balls with the same centres and doubled radius

σk = {x : d(x, yk ) < 2qn} [1 ¤ k ¤ m].

If y does not lie in any σk , then the ball

S = {x : d(x, y) < qn}

does not intersect any Sk , so we may add another ball to our collection. Such
a point y will always exist if
m
Vol X > Vol σk ,
k=1

that is, if
m
2n > Vol σk .
k=1

The only problem that faces us, is to estimate the volume of a ball in this
context.
The key turns out to be a simple but, to many people, initially surprising
result.
238 A COMPANION TO ANALYSIS

Lemma 10.2.4. Suppose 0 < » < 1/2.
(i) If 1 ¤ r ¤ »n then
n » n
¤ .
r’1 1’» r
(ii) There exists a constant A(») such that a ball of radius »n in X has
n n
volume at most A(») and at least .
[»n] [»n]
Proof. (i) Observe that
r
n n r »
n
¤
= = .
1 ’ r’1
r’1 n+1’r 1’»
r n

(ii) Thus, if r» = [»n], the greatest integer less than »n, we have
n
Volume ball radius »n =
r
r¤»n
r» ’r
» n
¤
1’» r»
r¤r»
∞ m
» n
¤
1’» r»
m=0
1’» n
= .
1 ’ 2» [»n]
Taking A(») = (1 ’ »)/(1 ’ 2»), we have the required result.
Exercise 10.2.5. (i) If 1 > » > 1/2, ¬nd a good estimate for the volume of
a ball of radius »n in X when n is large.
(ii) Lemma 10.2.4 says, in e¬ect, that the volume of an appropriate ball in
X is concentrated near its ˜surface™ (that is those points whose distance from
the centre is close to the radius). To what extent is this true for ordinary
balls in Rm when m is large?
In part (i) of Lemma 10.2.6 we recall a simple form of Stirling™s formula
(obtained in Exercise 9.2.7 (iii)). In the rest of the lemma we use it to obtain
estimates of the volume V (», n) of a ball of radius »n in X = {0, 1}n when
n is large. We use the notation
log b
loga b =
log a
where a, b > 0. (The reader probably knows that log a b is called ˜the loga-
rithm of b to base a™. She should check the relation aloga b = b.)
239
Please send corrections however trivial to twk@dpmms.cam.ac.uk

Lemma 10.2.6. (i) loge N ! = N loge N + N + θ(N )N , where θ(N ) ’ 0 as
N ’ ∞.
(ii) If 0 < » < 1/2, then
n’1 loge V (», n) ’ ’» loge » ’ (1 ’ ») loge (1 ’ »)
as n ’ ∞.
(iii) If 0 < » < 1/2, then
n’1 log2 V (», n) ’ ’» log2 » ’ (1 ’ ») log2 (1 ’ »)
as n ’ ∞.
Proof. (i) This is Exercise 9.2.7 (iii).
(ii) In what follows we shall be replacing a real number y by an integer
m with |m ’ y| ¤ 1. Observe that, if f is well behaved, the mean value
inequality gives |f (y) ’ f (m)| ¤ sup|t’y|¤1 |f (t)|.
By Lemma 10.2.4, part (i) of the present lemma, and the remark just
made, there exist θ1 (n), θ2 (n) ’ 0, as n ’ ∞ such that
n
n’1 loge V (», n) = n’1 loge
[»n]
= n’1 loge n! ’ loge (n ’ [»n])! ’ loge [»n]!
= n’1 n loge n + n ’ (n ’ [»n]) loge (n ’ [»n]) ’ [»n] ’ [»n] loge [»n] + θ1 (n)
= n’1 n loge n + n ’ (n ’ »n) loge (n ’ »n) ’ »n ’ »n loge »n + θ2 (n)
= n’1 n loge n + n ’ (1 ’ »)n(loge (1 ’ ») + loge n) ’ »n(loge » + loge n) + θ2 (n)
= ’» loge » ’ (1 ’ ») loge (1 ’ ») + θ2 (n),
and this is the required result.
(iii) Just apply the de¬nition of log 2 .
Let us write H(0) = H(1) = 0 and
H(») = ’» log2 » ’ (1 ’ ») log2 (1 ’ »)
for 0 < » < 1.
Exercise 10.2.7. Prove the following results and then sketch the graph of
H.
(i) H(s) = H(1 ’ s) for all s ∈ [0, 1].
(ii) H : [0, 1] ’ R is continuous.
(iii) H is twice di¬erentiable on (0, 1) with H (t) < 0 for all t ∈ (0, 1).
(iv) 0 ¤ H(t) ¤ 1 for all t ∈ [0, 1]. Identify the maximum and minimum
points.
(v) H(t)/t ’ 1 as t ’ 0 through positive values.
240 A COMPANION TO ANALYSIS

Our discussion has shown that we can ¬nd 2n(1’H(2q)) code words all a
Hamming distance at least q n apart. We can immediately interpret this
result in terms of our original problem.

Lemma 10.2.8. Consider our noisy channel. If p, the probability of error
in a single bit, is less than 1/4, then, choosing p < p < 1/4, we can transmit,
with error rate as small as we please, so that information is passed at a rate
of 1 ’ H(2p ) times that of our original channel.

Shannon pushed the argument a little bit further. Let · > 0 be small
and let N be such that

N — Vol ball radius qn ≈ · Vol X.

Choose N points yj at random in X and let them be code words. There is
no reason to expect that the balls

Sk = {x : d(x, yk ) < qn} [1 ¤ k ¤ N ],

do not intersect (and excellent reasons for supposing that they will).

<< . .

. 27
( : 70)



. . >>