
outcome has changed from Republican in one year to Democratic in
the next year, and for b the fraction of reverse changes.
We can obtain a better approximation by taking into account the
previous two elections. In this case our states are RR, RD, DR, and
DD, indicating the outcome of two successive elections. Being in state
RR means that the last two elections were Republican victories. If the
next election is a Democratic victory, we will be in state RD. If the
election outcomes for a series of years are DDDRDRR, then our process
has moved from state DD to DD to DR to RD to DR, and finally to RR.
Notice that the first letter of the state to which we move must agree
with the second letter of the state from which we came, since these
refer to the same election year. Our matrix of transition probabilities
will then have the form,
        RR    DR    RD    DD
  RR  ( 1-a    0     a     0  )
  DR  (  b     0    1-b    0  )
  RD  (  0    1-c    0     c  )
  DD  (  0     d     0    1-d )
Again the numbers a, b, c, and d would have to be estimated. The
study of this example is continued in ??. ♦
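
For readers who want to experiment, here is a minimal Python sketch of this four-state chain. The numerical values chosen for a, b, c, and d are illustrative placeholders only, not estimates from election data.

    import random

    # The four states and some placeholder values for a, b, c, d.
    states = ["RR", "DR", "RD", "DD"]
    a, b, c, d = 0.2, 0.3, 0.3, 0.2

    # Transition matrix with rows and columns in the order RR, DR, RD, DD.
    P = [
        [1 - a, 0.0,   a,     0.0  ],  # from RR
        [b,     0.0,   1 - b, 0.0  ],  # from DR
        [0.0,   1 - c, 0.0,   c    ],  # from RD
        [0.0,   d,     0.0,   1 - d],  # from DD
    ]

    def step(i):
        # Pick the index of the next state according to row i of P.
        return random.choices(range(4), weights=P[i])[0]

    # Simulate a few elections starting from two Republican victories.
    i = states.index("RR")
    sequence = [states[i]]
    for _ in range(6):
        i = step(i)
        sequence.append(states[i])
    print(sequence)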

Example 4.30 The following example of a Markov chain has been
used in physics as a simple model for diffusion of gases. We shall see
later that a similar model applies to an idealized problem in changing
populations.
We imagine n black balls and n white balls which are put into
two urns so that there are n balls in each urn. A single experiment
consists in choosing a ball from each urn at random and putting the ball
obtained from the first urn into the second urn, and the ball obtained
from the second urn into the first. We take as state the number of
black balls in the first urn. If at any time we know this number, then
we know the exact composition of each urn. That is, if there are j black
balls in urn 1, there must be n − j black balls in urn 2, n − j white
balls in urn 1, and j white balls in urn 2. If the process is in state j,
then after the next exchange it will be in state j − 1, if a black ball is
chosen from urn 1 and a white ball from urn 2. It will be in state j if a
ball of the same color is drawn from each urn. It will be in state j + 1

if a white ball is drawn from urn 1 and a black ball from urn 2. The
transition probabilities are then given by (see Exercise 12)

p_{j,j-1} = (j/n)^2,            for j > 0,
p_{j,j}   = 2j(n-j)/n^2,
p_{j,j+1} = ((n-j)/n)^2,        for j < n,
p_{j,k}   = 0                   otherwise.

A physicist would be interested, for example, in predicting the compo-
sition of the urns after a certain number of exchanges have taken place.
Certainly any predictions about the early stages of the process would
depend upon the initial composition of the urns. For example, if we
started with all black balls in urn 1, we would expect that for some time
there would be more black balls in urn 1 than in urn 2. On the other
hand, it might be expected that the effect of this initial distribution
would wear off after a large number of exchanges. We shall see later,
in ??, that this is indeed the case. ♦
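
A short Python sketch can build this transition matrix directly from the formulas above; the function name urn_transition_matrix is ours, introduced only for illustration.

    # Build the (n+1) x (n+1) transition matrix of the urn model, where
    # state j is the number of black balls in urn 1.
    def urn_transition_matrix(n):
        P = [[0.0] * (n + 1) for _ in range(n + 1)]
        for j in range(n + 1):
            if j > 0:
                P[j][j - 1] = (j / n) ** 2        # black from urn 1, white from urn 2
            P[j][j] = 2 * j * (n - j) / n ** 2    # a ball of the same color from each urn
            if j < n:
                P[j][j + 1] = ((n - j) / n) ** 2  # white from urn 1, black from urn 2
        return P

    # Each row sums to 1; print the matrix for a small case, say n = 3.
    for row in urn_transition_matrix(3):
        print([round(p, 3) for p in row])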


Exercises
1. Draw a state diagram for the Markov chain with transition prob-
abilities given by the following matrices.
( 1/2  1/2   0  )
(  0    1    0  ),
(  0   1/2  1/2 )

( 1/3  1/3  1/3 )
( 1/3  1/3  1/3 ),
( 1/3  1/3  1/3 )

( 0  1 )
( 1  0 ),
( 0   1    0    0  )
( 1   0    0    0  )
( 0   0   1/2  1/2 )
( 0   0   1/2  1/2 ).




Figure 4.24

2. Give the matrix of transition probabilities corresponding to the
transition diagrams in Figure 4.24.

3. Find the matrix P (2) for the Markov chain determined by the
matrix of transition probabilities
P = ( 1/2  1/2 )
    ( 1/3  2/3 ).


[Ans. ( 5/12   7/12 )
      ( 7/18  11/18 ).]
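
One way to check the printed answer is to multiply P by itself with exact fractions; a minimal Python sketch:

    from fractions import Fraction as F

    # Transition matrix of this exercise.
    P = [[F(1, 2), F(1, 2)],
         [F(1, 3), F(2, 3)]]

    # P(2) is the matrix product P * P.
    P2 = [[sum(P[i][k] * P[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    for row in P2:
        print([str(entry) for entry in row])   # ['5/12', '7/12'] then ['7/18', '11/18']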


4. What is the matrix of transition probabilities for the Markov
chain in Example 4.30, for the case of two white balls and two
black balls?

5. Find the matrices P (2) , P (3) , P (4) for the Markov chain deter-
mined by the transition probabilities

( 1  0 )
( 0  1 ).

Find the same for the Markov chain determined by the matrix

( 0  1 )
( 1  0 ).

6. Suppose that a Markov chain has two states, a1 and a2 , and
transition probabilities given by the matrix
( 1/3  2/3 )
( 1/2  1/2 ).

By means of a separate chance device we choose a state in which
to start the process. This device chooses a1 with probability 1/2
and a2 with probability 1/2. Find the probability that the process
is in state a1 after the first step. Answer the same question in the
case that the device chooses a1 with probability 1/3 and a2 with
probability 2/3.

[Ans. 5/12; 4/9.]


7. Referring to the Markov chain with transition probabilities indi-
cated in Figure 4.22, construct the tree measures and determine
the values of
p_21^(3), p_22^(3), p_23^(3)
and
p_31^(3), p_32^(3), p_33^(3).

8. A certain calculating machine uses only the digits 0 and 1. It is
supposed to transmit one of these digits through several stages.
However, at every stage there is a probability p that the digit
which enters this stage will be changed when it leaves. We form a
Markov chain to represent the process of transmission by taking
as states the digits 0 and 1. What is the matrix of transition
probabilities?

9. For the Markov chain in Exercise 8, draw a tree and assign a tree
measure, assuming that the process begins in state 0 and moves
through three stages of transmission. What is the probability
that the machine after three stages produces the digit 0, i.e., the
correct digit? What is the probability that the machine never
changed the digit from 0?

10. Assume that a man's profession can be classified as professional,
skilled laborer, or unskilled laborer. Assume that of the sons
of professional men 80 per cent are professional, 10 per cent are
skilled laborers, and 10 per cent are unskilled laborers. In the




Figure 4.25

case of sons of skilled laborers, 60 per cent are skilled laborers, 20
per cent are professional, and 20 per cent are unskilled laborers.
Finally, in the case of unskilled laborers, 50 per cent of the sons
are unskilled laborers, and 25 per cent each are in the other two
categories. Assume that every man has a son, and form a Markov
chain by following a given family through several generations. Set
up the matrix of transition probabilities. Find the probability
that the grandson of an unskilled laborer is a professional man.

[Ans. .375.]
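
The printed answer can be checked numerically from the matrix set up in the exercise; a minimal Python sketch (the state order professional, skilled, unskilled is our own choice):

    # Occupational mobility matrix from Exercise 10.
    P = [[0.80, 0.10, 0.10],   # sons of professional men
         [0.20, 0.60, 0.20],   # sons of skilled laborers
         [0.25, 0.25, 0.50]]   # sons of unskilled laborers

    # Two-step probability that the grandson of an unskilled laborer is a
    # professional man: the (unskilled, professional) entry of P squared.
    p2 = sum(P[2][k] * P[k][0] for k in range(3))
    print(round(p2, 3))   # 0.375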

11. In Exercise 10 we assumed that every man has a son. Assume
instead that the probability a man has a son is .8. Form a Markov
chain with four states. The first three states are as in Exercise
10, and the fourth state is such that the process enters it if a man
has no son, and that the state cannot be left. This state repre-
sents families whose male line has died out. Find the matrix of
transition probabilities and find the probability that an unskilled
laborer has a grandson who is a professional man.

[Ans. .24.]

12. Explain why the transition probabilities given in Example 4.30
are correct.

Supplementary exercises.

13. Five points are marked on a circle. A process moves clockwise
from a given point to its neighbor with probability 2/3, or counter-
clockwise to its neighbor with probability 1/3.

(a) Considering the process to be a Markov chain process, find
the matrix of transition probabilities.
(b) Given that the process starts in state 3, what is the prob-
ability that it returns to the same state in two steps?

14. In northern New England, years for apples can be described as
good, average, or poor. Suppose that following a good year the
probabilities of good, average, or poor years are respectively .4, .4,
and .2. Following a poor year the probabilities of good, average,
or poor years are .2, .4, and .4 respectively. Following an average
year the probabilities that the next year will be good or poor are
each .2, and of an average year, .6.

(a) Set up the transition matrix of this Markov chain.
(b) 1965 was a good year. Compute the probabilities for 1966,
1967, and 1968.
[Ans. For 1967: .28, .48, .24.]
15. In Exercise 14 suppose that there is probability 1/4 for a good
year, 1/2 for an average year, and 1/4 for a poor year. What are the
probabilities for the following year?

16. A teacher in an oversized mathematics class finds, after grading
all homework papers for the ¬rst two assignments, that it is nec-
essary to reduce the amount of time spent in such grading. He
therefore designs the following system: Papers will be marked
satisfactory or unsatisfactory. All papers of students receiving a
mark of unsatisfactory on any assignment will be read on each of
the two succeeding days. Of the remaining papers, the teacher
will read one-fifth, chosen at random. Assuming that each paper
has a probability of one-fifth of being classified “unsatisfactory”,

(a) Set up a three-state Markov chain to describe the process.
(b) Suppose that a student has just handed in a satisfactory
paper. What are the probabilities for the next two assign-
ments?

17. In another model for diffusion, it is assumed that there are two
urns which together contain N balls numbered from 1 to N . Each
second a number from 1 to N is chosen at random, and the ball
with the corresponding number is moved to the other urn. Set

up a Markov chain by taking as state the number of balls in urn
1. Find the transition matrix.


4.14 The central limit theorem
We continue our discussion of the independent trials process with two
outcomes. As usual, let p be the probability of success on a trial, and
f (n, p; x) be the probability of exactly x successes in n trials.
In Figure 4.26 we have plotted bar graphs which represent f (n, .3; x)
for n = 10, 50, 100, and 200. We note first of all that the graphs are
drifting off to the right. This is not surprising, since their peaks occur
at np, which is steadily increasing. We also note that while the total
area is always 1, this area becomes more and more spread out.
We want to redraw these graphs in a manner that prevents the
drifting and the spreading out. First of all, we replace x by x − np,
assuring that our peak always occurs at 0. Next we introduce a new
unit for measuring the deviation, which depends on n, and which gives
comparable scales. As we saw in Section 4.10, the standard deviation
√(npq) is such a unit.
We must still insure that probabilities are represented by areas in
the graph. In Figure 4.26 this is achieved by having a unit base for each
rectangle, and having the probability f (n, p; x) as height. Since we are
now representing a standard deviation as a single unit on the horizontal
axis, we must take f (n, p; x)·√(npq) as the heights of our rectangles. The
resulting curves for n = 50 and 200 are shown in Figures 4.27 and 4.28,
respectively.
We note that the two figures look very much alike. We have also
shown in Figure 4.28 that it can be approximated by a bell-shaped
curve. This curve represents the function

f (x) = (1/√(2π)) e^(-x^2/2)


and is known as the normal curve. It is a fundamental theorem of
probability theory that as n increases, the appropriately rescaled bar
graphs more and more closely approach the normal curve. The theorem
is known as the Central Limit Theorem, and we have illustrated it
graphically.
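
The rescaling just described can be examined numerically as well. The following Python sketch, taking n = 200 and p = .3 as in the last of the bar graphs, tabulates the rescaled bar heights f (n, p; x)·√(npq) against the normal curve at the corresponding points t = (x − np)/√(npq):

    import math

    # f(n, p; x): probability of exactly x successes in n independent trials.
    def f(n, p, x):
        return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

    # The normal curve.
    def normal(t):
        return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

    n, p = 200, 0.3
    sd = math.sqrt(n * p * (1 - p))   # sqrt(npq)

    for x in range(40, 81, 8):
        t = (x - n * p) / sd
        print(f"t = {t:6.2f}   rescaled bar = {f(n, p, x) * sd:.4f}"
              f"   normal curve = {normal(t):.4f}")
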
More precisely, the theorem states that for any two numbers a and




Figure 4.26




Figure 4.27




Figure 4.28




Figure 4.29

b, with a < b,
Pr[ a < (x − np)/√(npq) < b ]
approaches the area under the normal curve between a and b, as n
increases. This theorem is particularly interesting in that the normal
curve is symmetric about 0, while f (n, p; x) is symmetric about the
expected value np only for the case p = 1/2. It should also be noted that
we always arrive at the same normal curve, no matter what the value
of p is.
In Figure 4.29 we give a table for the area under the normal curve
between 0 and d. Since the total area is 1, and since it is symmetric
about the origin, we can compute arbitrary areas from this table. For
example, suppose that we wish the area between −1 and +2. The area
between 0 and 2 is given in the table as .477. The area between −1
and 0 is the same as between 0 and 1, and hence is given as .341. Thus
the total area is .818. The area outside the interval (−1, 2) is then

1 − .818 = .182.
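
The table values, and the computation just made, can be reproduced with the error function in Python's standard library; a minimal sketch:

    import math

    # Area under the normal curve between 0 and d.
    def area_0_to(d):
        return 0.5 * math.erf(d / math.sqrt(2))

    a1 = area_0_to(1)      # about .341
    a2 = area_0_to(2)      # about .477
    inside = a1 + a2       # about .819 (the text's .818 adds the rounded table entries)
    outside = 1 - inside   # about .181
    print(round(a1, 3), round(a2, 3), round(inside, 3), round(outside, 3))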

Example 4.31 Let us find the probability that x differs from the ex-
pected value np by as much as d standard deviations.
Pr[ |x − np| ≥ d√(npq) ] = Pr[ |(x − np)/√(npq)| ≥ d ],

and hence the approximate answer should be the area outside the
interval (−d, d) under the normal curve. For d = 1, 2, 3 we obtain
1 − (2 · .341) = .318, 1 − (2 · .477) = .046, 1 − (2 · .4987) = .0026, re-
spectively. These agree with the values given in Section 4.10, to within
rounding errors. In fact, the Central Limit Theorem is the basis of
those estimates. ♦
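
The same areas can be computed directly rather than read from the table; a minimal Python sketch:

    import math

    # Approximate probability of a deviation of at least d standard
    # deviations: the area outside (-d, d) under the normal curve.
    def outside(d):
        return 1 - math.erf(d / math.sqrt(2))

    for d in (1, 2, 3):
        print(d, round(outside(d), 4))
    # Prints approximately 0.3173, 0.0455, and 0.0027, agreeing with the
    # rounded values .318, .046, and .0026 above.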

Example 4.32 In Example 4.19 we considered the example of throw-
ing a coin 10,000 times. The expected number of heads that turn up
is 5000 and the standard deviation is √(10,000 · 1/2 · 1/2) = 50. We ob-
served that the probability of a deviation of more than two standard
deviations (or 100) was very unlikely. On the other hand, consider the
probability of a deviation of less than .1 standard deviation. That is,
