curve is .040 and hence the probability of a deviation from 5000 of less

than ¬ve is approximately .08. Thus, while a deviation of 100 is very

unlikely, it is also very unlikely that a deviation of less than ¬ve will

™¦

occur.

Example 4.33 The normal approximation can be used to estimate

the individual probabilities f (n, x; p) for large n. For example, let us

estimate f (200, 65; .3). The graph of the probabilities f (200, x; .3) was

given in Figure 4.28 together with the normal approximation. The

desired probability is the area of the bar corresponding to x = 65. An

inspection of the graph suggests that we should take the area under the

normal curve between 64.5 and 65.5 as an estimate for this probability.

In normalized units this is the area between

4.5

√

200(.3)(.7)

and

5.5

√

200(.3)(.7)

179

4.14. THE CENTRAL LIMIT THEOREM

or between .6944 and .8487. Our table is not ¬ne enough to ¬nd this

area, but from more complete tables, or by machine computation, this

area may be found to be .046 to three decimal places. The exact value

to three decimal places is .045. This procedure gives us a good estimate.

If we check all of the values of f (200, x; .3) we ¬nd in each case

that we would make an error of at most .001 by using the normal

approximation. There is unfortunately no simple way to estimate the

error caused by the use of the Central Limit Theorem. The error will

clearly depend upon how large n is, but it also depends upon how near

1

™¦

p is to 0 or 1. The greatest accuracy occurs when p is near 2 .

Example 4.34 Suppose that a drug has been administered to a num-

ber of patients and found to be e¬ective a fraction p of the time. Assum-

ing an independent trials process, it is natural to take p as an estimate

for the unknown probability p for success on any one trial. It is useful

to have a method of estimating the reliability of this estimate. One

method is the following. Let x be the number of successes for the drug

given to n patients. Then by the Central Limit Theorem

x ’ np

| ¤ 2] ≈ .95.

Pr[| √

npq

This is the same as saying

x/n ’ p

| ¤ 2] ≈ .95.

Pr[|

pq/n

Putting p = x/n, we have

¯

Pr[|¯ ’ p| ¤ 2 pq/n] ≈ .95.

p

Using the fact that pq < 4 (see Exercise 12) we have

1

Pr[|¯ ’ p| ¤ √ ] ≥ .95.

p

n

This says that no matter what p is, with probability ≥ .95, the true

1

value will not deviate from the estimate p by more than √n It is cus-

tomary then to say that

1 1

p’ √ ¤p¤p+ √

¯ ¯

n n

180 CHAPTER 4. PROBABILITY THEORY

with con¬dence .95. The interval

1 1

[¯ ’ √ , p + √ ]

p ¯

n n

is called a 95 per cent con¬dence interval. Had we started with

x ’ np

| ¤ 3] ≈ .99,

Pr[| √

npq

we would have obtained the 99 per cent con¬dence interval

3 3

[¯ ’ √ , p + √ ]

p ¯

2n 2n

For example, if in 400 trials the drug is found e¬ective 124 times,

or .31 of the times, the 95 per cent con¬dence interval for p is

1 1

[.31 ’ , .31 + ] = [.26, .36]

20 20

and the 99 per cent con¬dence interval is

3 3

[.31 ’ , .31 + ] = [.235, .385].

40 40

™¦

Exercises

1. Let x be the number of successes in n trials of an independent

trials process with probability p for success. Let x = x’np For

√

npq

large n estimate the following probabilities.

(a) Pr[x < ’2.5].

[Ans. .006.]

(b) Pr[x < 2.5].

(c) Pr[x ≥ ’.5].

(d) Pr[’1.5 < x < 1].

[Ans. .774.]

2. A coin is biased in such a way that a head comes up with proba-

bility .8 on a single toss. Use the normal approximation to esti-

mate the probability that in a million tosses there are more than

800,400 heads.

181

4.14. THE CENTRAL LIMIT THEOREM

3. Plot a graph of the probabilities f (10, x, .5). Plot a graph also of

the normalized probabilities as in Figures 4.27 and 4.28.

4. An ordinary coin is tossed one million times. Let x be the number

of heads which turn up. Estimate the following probabilities.

(a) Pr[499, 500 < x < 500, 500]

[Ans. Approximately .682.]

(b) Pr[499, 000 < x < 501, 000],

[Ans. Approximately .954.]

(c) Pr[498, 500 < x < 501, 500],

[Ans. Approximately .997.]

5. Assume that a baseball player has probability .37 of getting a hit

each time he or she comes to bat. Find the probability of getting

an average of .388 or better if he or she comes to bat 300 times

during the season. (In 1957 Ted Williams had a batting average

of .388 and Mickey Mantle had an average of .353. If we assume

this di¬erence is due to chance, we may estimate the probability

of a hit as the combined average, which is about .37.)

[Ans. .242.]

6. A true-false examination has 48 questions. Assume that the prob-

ability that a given student knows the answer to any one question

3

is 4 . A passing score is 30 or better. Estimate the probability

that the student will fail the exam.

7. In Example 4.21 of Section 4.10, assume that the school decides

to admit 1296 students. Estimate the probability that they will

have to have additional dormitory space.

[Ans. Approximately .115.]

8. Peter and Paul each have 20 pennies. They agree to match pen-

nies 400 times, keeping score but not paying until the 400 matches

are over. What is the probability that one of the players will not

be able to pay? Answer the same question for the case that Peter

has 10 pennies and Paul has 30.

182 CHAPTER 4. PROBABILITY THEORY

9. In tossing a coin 100 times, the probability of getting 50 heads

is, to three decimal places, .080. Estimate this same probability

using the Central Limit Theorem.

[Ans. .080.]

10. A standard medicine has been found to be e¬ective in 80 per

cent of the cases where it is used. A new medicine for the same

purpose is found to be e¬ective in 90 of the ¬rst 100 patients on

which the medicine is used. Could this be taken as good evidence

that the new medication is better than the old?

11. In the Weldon dice experiment, 12 dice were thrown 26,306 times

and the appearance of a 5 or a 6 was considered to be a suc-

cess. The mean number of successes observed was, to four deci-

mal places, 4.0524. Is this result signi¬cantly di¬erent from the

expected average number of 4?

[Ans. Yes.]

1 1

12. Prove that pq ¤ 4 . [Hint: write p = + x.]

2

13. Suppose that out of 1000 persons interviewed 650 said that they

would vote for Mr. Big for mayor. Construct the 99 per cent

con¬dence interval for p, the proportion in the city that would

vote for Mr. Big.

14. Opinion pollsters in election years usually poll about 3000 voters.

Suppose that in an election year 51 per cent favor candidate A and

49 per cent favor candidate B. Construct 95 per cent con¬dence

limits for candidate A winning.

[Ans. [.492, .528].]

15. In an experiment with independent trials we are going to estimate

p by the fraction p of successes. We wish our estimate to be within

.02 of the correct value with probability .95. Show that 2500

observations will always su¬ce. Show that if it is known that p

is approximately .1, then 900 observations would be su¬cient.

183

4.15. GAMBLER™S RUIN

16. An experimenter has an independent trials process and he or she

has a hypothesis that the true value of p is p0 . He decides to carry

out a number of trials, and from the observed r calculate the 95

per cent con¬dence interval for p. He will reject p0 if it does not

fall within these limits. What is the probability that he or she

will reject p0 when in fact it is correct? Should he or she accept

p0 if it does fall within the con¬dence interval?

17. A coin is tossed 100 times and turns up heads 61 times. Using

the method of Exercise 16 test the hypothesis that the coin is a

fair coin.

[Ans. Reject.]

18. Two railroads are competing for the passenger tra¬c of 1000 pas-

sengers by operating similar trains at the same hour. If a given

passenger is equally likely to choose one train as the other, how

many seats should the railroad provide if it wants to be sure that

its seating capacity is su¬cient in 99 out of 100 cases?

[Ans. 537.]

4.15 Gambler™s ruin

In this section we will study a particular Markov chain, which is inter-

esting in itself and has far-reaching applications. Its name, “gambler™s

ruin”, derives from one of its many applications. In the text we will

describe the chain from the gambling point of view, but in the exercises

we will present several other applications.

Let us suppose that you are gambling against a professional gambler,

or gambling house. You have selected a speci¬c game to play, on which

you have probability p of winning. The gambler has made sure that the

1

game is in his or her favor, so that p < 2 . However, in most situations

p will be close to 1 . (The cases p = 1 and p > 2 are considered in the

1

2 2

exercises.)

At the start of the game you have A dollars, and the gambler has B

dollars. You bet $1 on each game, and play until one of you is ruined.

What is the probability that you will be ruined? Of course, the answer

depends on the exact values of p, A, and B. We will develop a formula

for the ruin-probability in terms of these three given numbers.

184 CHAPTER 4. PROBABILITY THEORY

Figure 4.30: ™¦

First we will set the problem up as a Markov chain. Let N = A+B,

the total amount of money in the game. As states for the chain we

choose the numbers 0, 1, 2, . . . , N . At any one moment the position of

the chain is the amount of money you have. The initial position is

shown in Figure 4.30.

If you win a game, your money increases by $1, and the gambler™s

fortune decreases by $1. Thus the new position is one state to the right

of the previous one. If you lose a game, the chain moves one step to

the left. Thus at any step there is probability p of moving one step

to the right, and probability q = 1 ’ p of one step to the left. Since

the probabilities for the next position are determined by the present

position, it is a Markov chain.

If the chain reaches 0 or N , we stop. When 0 is reached, you are

ruined. When N is reached, you have all the money, and you have

ruined the gambler. We will be interested in the probability of your

ruin, i.e., the probability of reaching 0.

Let us suppose that p and N are ¬xed. We actually want the prob-

ability of ruin when we start at A. However, it turns out to be easier to

solve a problem that appears much harder: Find the ruin-probability

for every possible starting position. For this reason we introduce the

notation xi , to stand for the probability of your ruin if you start in

position i (that is, if you have i dollars).

Let us ¬rst solve the problem for the case N = 5. We have the

unknowns x0 , x1 , x2 , x3 , x4 , x5 . Suppose that we start at position 2.

The chain moves to 3, with probability p, or to 1, with probability q.

Thus

Pr[ruin|start at 2] = Pr[ruin|start at 3] · p + Pr[ruin|start at 1] · q.

using the conditional probability formula, with a set of two alternatives.

But once it has reached state 3, a Markov chain behaves just as if it

had been started there. Thus

Pr[ruin|start at 3] = x3 .

185

4.15. GAMBLER™S RUIN

And, similarly,

Pr[ruin|start at 1] = x1 .

We obtain the key relation

x2 = px3 + qx1 .

We can modify this as follows:

(p + q)x2 = px3 + qx1 ,

p(x2 ’ x3 ) = q(x1 ’ x2 ),

x1 ’ x2 = r(x2 ’ x3 ),

where r = p/q and hence r < 1. When we write such an equation for

each of the four “ordinary” positions, we obtain

x0 ’ x 1 r(x1 ’ x2 ),

=

x1 ’ x 2 r(x2 ’ x3 ),

=

(4.4)

x2 ’ x 3 r(x3 ’ x4 ),

=

x3 ’ x 4 r(x4 ’ x5 )

=

We must still consider the two extreme positions. Suppose that the

chain reaches 0. Then you are ruined, hence the probability of your

ruin is 1. While if the chain reaches N = 5, the gambler drops out of

the game, and you can™t be ruined. Thus

x0 = 1, x5 = 0. (4.5)

If we substitute the value of x5 in the last equation of 4.4, we have

x3 ’x4 = rx4 . This in turn may be substituted in the previous equation,

etc. We thus have the simpler equations

1 · x4 ,

x4 =

x3 ’ x 4 = rx4 .

r 2 x4 .

x2 ’ x 3 = (4.6)

r 3 x4 .

x1 ’ x 2 =

r 4 x4

x0 ’ x 1 =

Let us add all the equations. We obtain

x0 = (1 + r + r 2 + r 3 + r 4 )x4 .

From 4.5 we have that x0 = 1. We also use the simple identity

(1 ’ r)(1 + r + r 2 + r 3 + r 4 ) = 1 ’ r 5 .

186 CHAPTER 4. PROBABILITY THEORY

And then we solve for x4 :

1’r

x4 = .

1 ’ r5

If we add the ¬rst two equations in 4.6, we have that x3 = (1 + r)x4 .

Similarly, adding the ¬rst three equations, we solve for x2 , and adding

the ¬rst four equations we obtain x1 . We now have our entire solution,

1 ’ r4 1 ’ r3 1 ’ r2 1 ’ r1

x1 = , x2 = , x3 = , x4 = . (4.7)

1 ’ r5 1 ’ r5 1 ’ r5 1 ’ r5

The same method will work for any value of N . And it is easy to guess

from 4.7 what the general solution looks like. If we want xA , the answer

is a fraction like those in 4.7. In the denominator the exponent of r is

always N . In the numerator the exponent is N ’ A, or B. Thus the

ruin-probability is

1 ’ rB

xA = . (4.8)

1 ’ rN

We recall that A is the amount of money you have, B is the gambler™s

stake, N = A + B, p is your probability of winning a game, and r =

p/(1 ’ p).

In Figure 4.31 we show some typical values of the ruin-probability.

Some of these are quite startling. If the probability of p is as low as .45

(odds against you on each game 11: 9) and the gambler has 20 dollars

to put up, you are almost sure to be ruined. Even in a nearly fair game,

say p = .495, with each of you having $50 to start with, there is a .731

chance for your ruin.

It is worth examining the ruin-probability formula 4.8 more closely.

Since the denominator is always less than 1, your probability of ruin is

at least 1 ’ r B . This estimate does not depend on how much money

you have, only on p and B. Since r is less than 1, by making B large

enough, we can make r B practically 0, and hence make it almost certain

that you will be ruined.

Suppose, for example, that a gambler wants to have probability .999

of ruining you. (You can hardly call him or her a gambler under those

circumstances!) The gambler must make sure that r B < .001. For

example, if p = .495, the gambler needs $346 to have probability .999

of ruining you, even if you are a millionaire. If p = .48, the gambler

needs only $87. And even for the almost fair game with p = .499, $1727

will su¬ce.

187

4.15. GAMBLER™S RUIN

Figure 4.31: ™¦

188 CHAPTER 4. PROBABILITY THEORY

There are two ways that gamblers achieve this goal. Small gambling

houses will ¬x the odds quite a bit in their favor, making r much less

than 1. Then even a relatively small bank of B dollars su¬ces to assure

them of winning. Larger houses, with B quite sizable, can a¬ord to let

you play nearly fair games.

Exercises

1. An urn has nine white balls and 11 black balls. A ball is drawn,

and replaced. If it is white, you win ¬ve cents, if black, you lose

¬ve cents. You have a dollar to gamble with, and your opponent

has ¬fty cents. If you keep on playing till one of you loses all