
Bulkley and Tonks (1989) exploit the fact that $P_t \neq V_t$, to generate profit

based on an aggregate UK stock price index (annual, 1918-1985). The predictio

for real dividends is $\ln D_t = a_t + g_t t$, where $a_t$ and $g_t$ are estimated recursi

then assume that the growth rate of dividends is used in the RVF and the

fundamental value from recursive estimates using the regression $\ln P_t = b_t + g_t t$, where $g_t$

is constrained to be the same as in the dividend equation (but $b_t$ varies period

They then investigate the profitability of a 'switching' trading rule whereby if a

$P_t$ exceeds the predicted price $\hat{P}_t$ from the regression by more than $K_t$ perce

index and hold (risk-free) bonds. The investor then holds bonds until $P_t$ is

$c_t$ and then buys back the index. (At any date $t$, the value of $K_t$ is cho

below

would have maximised profits over the period $(0, t-1)$.) The passive strateg

and hold the index. They find that (over 1930-1985) the switching strategy ea

annual excess return of 1.61 percent over the buy and hold strategy. Also,

of risk for the switching strategy is no higher than in the buy and hold strate

switching strategy only involves trades on seven separate occasions, transac

would have to be of the order of 12.5 percent to outweigh net profits from the

strategy.

Thus the above models, where an elementary learning process is introduced

broad movements in price and fundamentals (i.e. dividends) are linked over lon

However, they also indicate that for long periods prices may deviate from fun

free' tests and 'model-based' tests. In the former we do not have to assume a

statistical model for the fundamental variables(1). However, as we shall see, t

that we merely obtain a point estimate of the relevant test statistic but we can

confidence limits on this measure. Formal hypothesis testing is therefore not po

one can do is to try and ensure that the estimator (based on sample data) of

statistic is an unbiased estimate of its population value(2). Critics of the earl

ratio tests highlighted the problem of bias in finite samples (Flavin, 1983).

A 'model-based test' assumes a particular stochastic process for dividends an

associated statistical distribution. This provides a test statistic with appropriate

limits and enables one to examine small sample properties using Monte Carl

However, with model-based tests, rejection of the null that the RVF is correct is

on having the correct statistical model for dividends. We therefore have the pr

joint null hypothesis.

A further key factor in interpreting the various tests on stock prices is w

dividend process is assumed to be stationary or non-stationary. Under non

the usual distributional assumptions do not apply and interpretation of the re

variance bounds tests is problematic. Much recent work has been directed to

procedures that take account of non-stationarity in the data. This issue is dis

variance bounds tests in this chapter and is taken up again in Chapter 16.

6.2.1 Shiller Volatility Tests

Shiller takes the rational valuation formula as his model of the determinatio

prices. Hence stock prices are determined by economic fundamentals, $V_t$, n

discounted present value (DPV) of expected future dividends.

\[ P_t = V_t = \sum_{i=1}^{n-1} \delta^i E_t D_{t+i} + \delta^n E_t P_{t+n} \qquad (6.23) \]

where $E_t P_{t+n}$ is the expected 'terminal price' at time $t + n$. It is assumed that a

take the same view of the future. Hence all investors form the same expectatio

dividends and it is also assumed for the moment that the discount factor $\delta$ (

is a constant in all future periods. $\delta$ is defined as $1/(1 + k)$ where $k$ is the re

of return (see Chapter 4). Clearly the assumption of a constant nominal requ

return is rather unrealistic and hence tests of the model invariably assume a co

discount rate and therefore stock prices are also measured in real terms.

To set the ball rolling, it is instructive to note that if, for each time period

data on expected future dividends, the expected terminal price and the cons

we could work out the right-hand side of equation (6.23) and compare it with

stock price $P_t$. Of course, at time $t$, we do not know what investors' forecasts o

future dividends would have been. However, Shiller (1981) proposed a simp

ingenious way of getting round this problem.

Data are available on actual dividends in the past, say from 1900 onwar

have the actual price $P_{t+n}$ today, say in 1996. It is assumed $\delta$ is a known

from 1900 onwards. As described above, the data series $P_t^*$ has been compute

following formula:

\[ P_t^* = \sum_{i=1}^{n-1} \delta^i D_{t+i} + \delta^n P_{t+n} \qquad (6.24) \]

When calculating $P_t^*$ for 1900 the influence of the terminal price $P_{t+n}$ is fair

since $n$ is large and $\delta^n$ is relatively small. As we approach the end-point o

term $\delta^n P_{t+n}$ carries more weight in our calculation of $P_t^*$. One option is t

truncate our sample, say ten years prior to the present, in order to apply the DP

Alternatively, we can assume that the actual price at the terminal date is 'c

expected value $E_t P_{t+n}^*$ and the latter is usually done in empirical work.

Comparing $P_t$ and $P_t^*$ we see that they differ by the sum of the forecas

dividends $w_{t+i}$, weighted by the discount factors $\delta^i$ where

\[ w_{t+i} = D_{t+i} - E_t D_{t+i} \]

If agents do not make systematic forecast errors then we would expect these fore

in a long sample of data to be positive about as many times as they are negati

average for them to be close to zero). This is the unbiasedness assumption of R

up again. Hence we might expect the (weighted) sum of $w_{t+i}$ to be relatively sm

broad movements in $P_t^*$ should then be correlated with those for $P_t$. Shiller (198

a graph of (detrended) $P_t$ and $P_t^*$ (in real terms) for the period 1871-1979 (F

One can immediately see that the correlation between $P_t$ and $P_t^*$ is low thu

rejection of the view that stock prices are determined by fundamentals in a

Figure 6.9 (horizontal axis: Year, 1870-1970). Source: Shiller (1981). Reproduced by permission of the American Economic Association

\[ \operatorname{var}(x) = \frac{1}{n-1} \sum_{t=1}^{n} (x_t - \bar{x})^2 \]

where $\bar{x}$ = sample mean and $n$ = number of observations. In 1900 investors di

what future dividends were going to be and therefore the actual stock price

from the perfect foresight stock price. Hindsight has shown that investors ma

errors, $\eta_t$, which may be represented as

\[ \eta_t = P_t^* - P_t \qquad (6.27) \]

(where $\eta_t$ is a weighted average of the forecast errors for dividends $w_{t+i}$ at t

etc.). If investors are rational then $\eta_t$ will be independent of all information

when investors made their forecast. In particular, $\eta_t$ will be independent of the

at time $t$. From (6.27) we obtain:

\[ \operatorname{var}(P_t^*) = \operatorname{var}(P_t) + \operatorname{var}(\eta_t) + 2 \operatorname{cov}(P_t, \eta_t) \qquad (6.28) \]

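Informational efficiency (orthogonality) implies $\operatorname{cov}(P_t, \eta_t)$ is zero and (6.28)

\[ \operatorname{var}(P_t^*) = \operatorname{var}(P_t) + \operatorname{var}(\eta_t) \qquad (6.29) \]

Since the variance of the forecast error is positive then:

\[ \operatorname{var}(P_t^*) > \operatorname{var}(P_t) \qquad (6.30) \]

or

\[ VR = \operatorname{var}(P_t^*) / \operatorname{var}(P_t) > 1 \qquad (6.31) \]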

Hence if the market sets stock prices according to the rational valuation fo

('identical') agents are rational in processing information and the discoun

constant, we would expect the variance inequality in equation (6.31) to ho

variance ratio (VR) (or standard deviation ratio (SDR)) to exceed unity.
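As a purely illustrative check of the variance inequality, the following Python sketch draws an artificial price series and an independent forecast error, forms the perfect foresight price as their sum, and computes the variance ratio and standard deviation ratio. The normal distributions, the sample size, the seed and the function name `variance_ratio` are all assumptions made for the illustration; none of them comes from Shiller (1981).

```python
import numpy as np

def variance_ratio(p_star, p):
    """VR = var(P*) / var(P), using sample variances."""
    return np.var(p_star, ddof=1) / np.var(p, ddof=1)

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable
n_obs = 100_000                     # hypothetical sample size

# Artificial 'actual' price P_t and a forecast error eta_t drawn independently of it
p = rng.normal(loc=100.0, scale=1.0, size=n_obs)
eta = rng.normal(loc=0.0, scale=0.5, size=n_obs)

# Perfect foresight price: P*_t = P_t + eta_t
p_star = p + eta

vr = variance_ratio(p_star, p)
sdr = np.sqrt(vr)                   # standard deviation ratio
print(f"VR = {vr:.3f}, SDR = {sdr:.3f}")
```

Because the error is drawn independently of the price, the sample covariance is close to zero and the ratio should sit near 1 + var(eta)/var(P), here 1.25 in the population.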

For expositional reasons it is assumed that $P_t^*$ is calculated as in (6.24) usin

formula. However, in much of the empirical work an equivalent method is

DPV formula (6.24) is consistent with the Euler equation:

\[ P_t^* = \delta (P_{t+1}^* + D_{t+1}) \qquad t = 1, 2, \ldots, n \qquad (6.32) \]

Hence if we assume a terminal value for $P_{t+n}^*$ we can use (6.32) to calcula

by backward recursion. This, in fact, is the method used in Shiller (1981)

whatever method is used to calculate an observable version of $P_t^*$, a termina

is required. One criticism of Shiller (1981) is that he uses the sample mean o

the terminal value, i.e.

\[ P_{t+n}^* = n^{-1} \sum_{i=1}^{n} P_i \]
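The backward recursion implied by the Euler equation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `perfect_foresight_price`, the toy dividend and price numbers and the discount factor of 0.5 are hypothetical, chosen only so the arithmetic is easy to follow. The two calls contrast the two terminal-value choices discussed in the text (the end-of-sample actual price and the sample mean of prices).

```python
import numpy as np

def perfect_foresight_price(dividends, terminal_value, delta):
    """Backward recursion P*_t = delta * (P*_{t+1} + D_{t+1}).

    dividends[t] holds D_t; the last element of the result is the
    assumed terminal value.
    """
    d = np.asarray(dividends, dtype=float)
    p_star = np.empty_like(d)
    p_star[-1] = terminal_value
    for t in range(len(d) - 2, -1, -1):
        p_star[t] = delta * (p_star[t + 1] + d[t + 1])
    return p_star

# Hypothetical toy data
dividends = [1.0, 1.0, 1.0, 1.0]
prices = [8.0, 9.0, 11.0, 12.0]
delta = 0.5

# Terminal value = actual price at the end of the sample
p_star_end = perfect_foresight_price(dividends, prices[-1], delta)
# Terminal value = sample mean of prices
p_star_mean = perfect_foresight_price(dividends, np.mean(prices), delta)

print(p_star_end)
print(p_star_mean)
```

The two series differ most near the end of the sample, where the terminal value carries the largest weight, and converge as one moves back in time.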

6.2.2 First Generation Volatility Tests

Empirical tests of the RVF often use 'real variables', that is nominal variables

stock price and dividends are deflated by some general price index of goods (e.g

price index (CPI)). In this case the discount rate $\delta$ must also be in real terms.

on volatility tests assumes a constant real discount rate and Shiller (1981) fou

stock prices are excessively volatile, that is to say, inequality (6.31) is grossly v

SDR = 5.59). However, LeRoy and Porter (1981) using a slightly different f

(see Appendix 6.1) found that although the variance bound is violated, the re

of borderline statistical significance.

Time Varying Real Interest Rates

So far, in our analysis, we have assumed that the real discount factor $\delta$ i

However, we could rework the perfect foresight price, assuming the real requ

$k_t$ and hence $\delta_t = (1 + k_t)^{-1}$ varies over time. For example, we could set $k_t$ t

actual real interest rate which existed in all future years $r_t$, plus a constant ris

($k_t = r_t + rp$). Hence $k_t$ varies in each year and $P_t^*$ is calculated as:

with a terminal value equal to the end of sample actual price. However, when t

bounds test is repeated using this new measure of $P_t^*$ it is still violated (e.g. M

(1989) and Scott (1990)).
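A perfect foresight price with a time-varying discount rate can be computed by the same kind of backward recursion. The sketch below is an assumption-laden illustration: it takes $k_t = r_t + rp$ and assumes that cash flows received at $t + 1$ are discounted at $k_{t+1}$ (the timing convention is not recoverable from the text above). The function name and the toy numbers are hypothetical.

```python
import numpy as np

def perfect_foresight_price_tv(dividends, real_rates, risk_premium, terminal_price):
    """P*_t = (P*_{t+1} + D_{t+1}) / (1 + k_{t+1}), with k_t = r_t + rp."""
    d = np.asarray(dividends, dtype=float)
    k = np.asarray(real_rates, dtype=float) + risk_premium   # k_t = r_t + rp
    p_star = np.empty_like(d)
    p_star[-1] = terminal_price       # end-of-sample actual price
    for t in range(len(d) - 2, -1, -1):
        # Assumed timing: period t+1 cash flows are discounted at k_{t+1}
        p_star[t] = (p_star[t + 1] + d[t + 1]) / (1.0 + k[t + 1])
    return p_star

# Hypothetical toy data
dividends = [1.0, 1.0, 1.0]
real_rates = [0.05, 0.15, 0.0]        # ex-post real interest rates r_t
risk_premium = 0.10                   # constant risk premium rp

p_star_tv = perfect_foresight_price_tv(dividends, real_rates, risk_premium, terminal_price=10.0)
print(p_star_tv)
```

Setting every element of `real_rates` to the same value reduces this to the constant discount factor case.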

We can turn the above calculation on its head. Knowing what the variab

actual stock price was, we can calculate the variability in real returns $k_t$ tha

necessary to equate $\operatorname{var}(P_t^*)$ with $\operatorname{var}(P_t)$. Shiller (1981) performs this calcula

the assumption that $P_t$ and $D_t$ have deterministic trends. Using the detrended

finds that the standard deviation of real returns needs to be greater than 4

annum for the variance in the perfect foresight price to be brought into equali

variance of actual prices. However, the actual historic variability in real inter

much smaller than that required to 'save' the variance bounds test. Hence the e

for the violation of the excess volatility relationship does not appear to lie w

varying ex-post real interest rate.

Consumption CAPM

Another line of attack, in order to 'rescue' the violation of the variance bound, i

that the actual or ex-post real interest rate (as used above) may not be a particu

proxy for the ex-ante real interest rate. The consumption CAPM, where the

maximises the discounted present value of the utility from future consumpt

to a lifetime budget constraint, can give one a handle on what the ex-ante r

rate of return by investors (i.e. the discount rate) depends upon the rate of

consumption and the constant rate of time preference.

For simplicity, assume for the moment that dividends have a growth rate

$D_t = D_0 (g_d)^t$ and the perfect foresight price is:

Hence $P_t^*$ varies over time (Grossman and Shiller, 1981) depending on the c

of consumption relative to a weighted harmonic average of future consump

Clearly, this introduces much greater variability in $P_t^*$ than does the consta

rate assumption (i.e. where the coefficient of relative risk aversion, $\alpha = 0$).

the constant growth rate of dividends by actual ex-post dividends while re

C-CAPM formulation, Shiller (1987) recalculates the variance bounds tests

for the period 1889-1985 using $\alpha = 4$. The pictorial evidence (Figure 6.10) su

up to about 1950 the variance bounds test is not violated.

However, the relationship between the variability of actual prices and perfe

prices is certainly not close, in the years after 1950. Over the whole period S

that the variance ratio is not violated under the assumption that $\alpha = 4$, which s

view as an implausibly high value of the risk aversion parameter. Thus, on bala

not appear as if the assumption of a time varying discount rate based on the co

CAPM can wholly explain movements in stock prices.

Small Sample Problems

Flavin (1983) and Kleidon (1986) point out that there are biases in small

measuring $\operatorname{var}(P_t)$ and $\operatorname{var}(P_t^*)$ which might invalidate some of the 'first

Figure 6.10 Consumption-based Time-Varying Interest Rates (horizontal axis: t (Year), 1890-1970). Source: Grossman and Shiller (1981). Reproduced by permission of the American Economic Association

the degree of bias depending on the degree of serial correlation in

Since $P_t^*$ is more strongly autocorrelated than $P_t$ then $\operatorname{var}(P_t^*)$ is esti

greater downward bias than $\operatorname{var}(P_t)$. Hence it is possible that the sam

yield $\operatorname{var}(P_t^*) < \operatorname{var}(P_t)$ in a finite sample, even when the null of the RV

(ii) Shiller's use of the sample average of prices as a proxy for terminal val

$t + n$ also induces a bias towards rejection.

There is a further issue surrounding the terminal price, noted by Gilles and LeR

The correct value of the perfect foresight price is

\[ P_t^* = \sum_{i=1}^{\infty} \delta^i D_{t+i} \]

which is unobservable. The observable series $P_{t|n}^*$ should be constructed using

price $P_{t+n}$ at the end of the sample, since this ensures $P_{t|n}^* = E(P_t^* | \Omega_t)$. Howev

still a problem since the sample variance of $P_{t|n}^*$ understates the true (but un

variance of $P_t^*$. Intuitively this is because $P_{t|n}^*$ is 'anchored' on $P_{t+n}$, and

take account of innovations in dividends which occur after the end of the s

implicitly sets these to zero but $P_t^*$ includes these, since the summation is

Clearly this problem is minimal if the sample is very large (infinite) but may b

in finite samples.

Flavin's (1983) criticisms of these 'first generation tests' assumed, as did

of these tests, stationarity of the series being used. Later work tackled the i

validity of variance bounds tests when the price and dividend series are non

(i.e. have a stochastic trend). The problem posed by non-stationary series

population variances are functions of time and hence the sample variances

constant) are not correct measures of their population values. However, it is n

how to 'remove' these stochastic trends from the data, in order meaningfully t

variance bounds tests. It is to this issue that we now turn.

6.2.3 Volatility Tests and Stationarity

Shiller's volatility inequality is a consequence purely of the assumption tha

stock price is an unbiased and optimal predictor of the perfect foresight price

\[ P_t^* = P_t + u_t \]

where $u_t$ is a random error term, with $E(u_t | \Omega_t) = 0$. Put another way, $P_t$

cient statistic to forecast $P_t^*$. No information other than $P_t$ can improve on

of $P_t^*$: in this sense $P_t$ is 'optimal'. The latter implies that the conditional fo

$E[(P_t^* - P_t) | \Omega_t]$ is independent of all information available at time $t$ or earlie

is independent and therefore uncorrelated with $\Omega_t$. Since $P_t \subset \Omega_t$ then $P_t$ is i

of $u_t$ (i.e. $\operatorname{cov}(P_t, u_t) = 0$). The latter is the 'informational efficiency' or RE or

assumption. Using the definition of covariance for a stationary series, it follo

\[ \operatorname{cov}(P_t, P_t^*) = \operatorname{cov}(P_t, P_t + u_t) = \sigma^2(P_t) \]

where we have used $\operatorname{cov}(P_t, u_t) = 0$. The definition of the correlation coefficie

$P_t$ and $P_t^*$ is

\[ \rho(P_t, P_t^*) = \frac{\operatorname{cov}(P_t, P_t^*)}{\sigma(P_t)\, \sigma(P_t^*)} \]

Substituting for 'cov' from (6.36) in (6.35) we obtain a variance equality

\[ \sigma(P_t) = \rho(P_t, P_t^*)\, \sigma(P_t^*) \qquad (6.37) \]

Since the maximum value of $\rho = 1$ then (6.37) implies the familiar variance

\[ \sigma(P_t) \leq \sigma(P_t^*) \qquad (6.38) \]

Under the assumption of informational efficiency and that $P_t$ is determined

fundamentals, then (6.38) must hold in the population for any stationary se

Stationary series have a time invariant and constant population mean varianc

deviation) and covariance. The difficulty in applying (6.38) to a sample of data

whether the sample is drawn from an underlying stationary series in the popu

It is also worth noting that the standard deviations in (6.38) are simple un

measures. If a time series plot is such that it changes direction often and he

its mean value frequently (i.e. is 'jagged') then in a short sample of data we

a 'good' estimate of the population value of $\sigma(P)$ from its sample value. How

time series wanders substantially from its (constant) mean value in long slow s

one will need a long sample of data to obtain a good estimate of the 'true'

variance (i.e. a representative series of 'cycles' in the data set is required, n

one-quarter or one-half a cycle). In fact, stock prices appear to move in quite l

or cycles (see Figure 6.9) and hence a long data set is required to measure acc

true standard deviation (or variance).

If a series is non-stationary then it has a time varying population mean or v

hence (6.38) is undefined. We then need to devise an alternative variance in

terms of a transformation of the variables $P_t$ and $P_t^*$ into 'new' variables that are

The latter has led to alternative forms of the variance inequality condition. T

is that it is often difficult to ascertain whether a particular series is station

from statistical tests based on any finite data set. For example, the series g

$x_t = x_{t-1} + \varepsilon_t$ is non-stationary while $x_t = 0.98 x_{t-1} + \varepsilon_t$ is stationary. Howe

finite data set (on stock prices) it is often difficult statistically to discriminate b

two, since in a regression $x_t = a + b x_{t-1} + \varepsilon_t$, the estimate of 'b' is subject t

error and often one could take it as being either 1 or 0.98. (Also the distribu

test statistic that $b = 1$ is 'non-standard'.)
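The difficulty of telling a unit root from a near unit root in a finite sample can be illustrated with a small simulation. The Python sketch below generates many artificial samples from each of the two processes and estimates $b$ by OLS in each. The sample length of 100, the 500 replications, the standard normal shocks, the zero starting value and the function names are all assumptions made for the illustration.

```python
import numpy as np

def simulate_ar1(b, n_obs, rng):
    """Generate x_t = b * x_{t-1} + e_t with standard normal e_t and x_0 = 0."""
    x = np.zeros(n_obs)
    shocks = rng.normal(size=n_obs)
    for t in range(1, n_obs):
        x[t] = b * x[t - 1] + shocks[t]
    return x

def ols_slope(x):
    """OLS estimate of b in the regression x_t = a + b * x_{t-1} + e_t."""
    y, lagged = x[1:], x[:-1]
    design = np.column_stack([np.ones(len(lagged)), lagged])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

rng = np.random.default_rng(1)       # fixed seed so the run is repeatable
n_obs, n_reps = 100, 500             # hypothetical sample length and replications

est_unit_root = np.array([ols_slope(simulate_ar1(1.0, n_obs, rng)) for _ in range(n_reps)])
est_near_unit = np.array([ols_slope(simulate_ar1(0.98, n_obs, rng)) for _ in range(n_reps)])

for label, est in [("b = 1.00", est_unit_root), ("b = 0.98", est_near_unit)]:
    low, high = np.percentile(est, [5, 95])
    print(f"true {label}: mean estimate {est.mean():.3f}, 5%-95% range [{low:.3f}, {high:.3f}]")
```

If the two printed ranges overlap heavily, a single estimate of $b$ from one sample of this length says little about which process generated the data.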

The fact that $P_t$ is an optimal forecast of $P_t^*$ does not necessarily imply

accurate forecast but only that one cannot improve upon the forecast based o

For example, if dividends are accurately described by $D_t = a + w_t$ where

noise, then using the DPV formula, $P_t$ will be constant. However, the varia

will depend on a weighted average of the variance of $w_t$ and is certainly greate

$\rho(P_t, P_t^*) = 1$. Then the equality in (6.37) holds and $\sigma(P_t) = \sigma(P_t^*)$. The in

position is most likely to occur in practice as agents make imperfect forecast

hence $0 < \rho(P_t, P_t^*) < 1$.
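The white-noise dividend case mentioned above can be worked through explicitly. The derivation below is a sketch under assumptions introduced here for illustration: an infinite horizon (so the terminal price term vanishes), a constant discount factor $0 < \delta < 1$, a constant dividend level written as $a$, and serially uncorrelated $w_t$ with mean zero and variance $\sigma_w^2$.

```latex
\[
\begin{aligned}
P_t &= \sum_{i=1}^{\infty} \delta^i E_t D_{t+i}
     = \sum_{i=1}^{\infty} \delta^i a
     = \frac{\delta a}{1 - \delta},
     \qquad \operatorname{var}(P_t) = 0, \\
P_t^* &= \sum_{i=1}^{\infty} \delta^i (a + w_{t+i})
       = P_t + \sum_{i=1}^{\infty} \delta^i w_{t+i}, \\
\operatorname{var}(P_t^*) &= \sigma_w^2 \sum_{i=1}^{\infty} \delta^{2i}
       = \frac{\delta^2 \sigma_w^2}{1 - \delta^2} > 0 .
\end{aligned}
\]
```

Under these assumptions the price is the best available forecast yet does not move at all, while the perfect foresight price has strictly positive variance, so the inequality holds with a wide margin.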

Let us return now to the issue of whether (real) dividends and (therefore fo

$\delta$) $P_t^*$ and $P_t$ are non-stationary. How much difference does non-stationarit

practice when estimating the sample values of $\sigma(P_t)$ and $\sigma(P_t^*)$ from a finite da

can generate artificial data for these variables (i.e. the population) under the

of non-stationarity and see if in the generated sample of data the variance

(6.38) is met (i.e. Monte Carlo studies). For example, in his early work Sh

'detrended' the variables $P_t$ and $P_t^*$ by dividing by a simple deterministic tre

where $b$ is estimated from the regression $\ln P_t = a + bt$ over the whole sam

If $P_t$ follows a stochastic trend then 'detrending' by assuming a determinis

statistically invalid. Use of $\lambda_t$ will not 'correctly' detrend the series. The qu

arises as to whether the violation of the variance bounds found in Shiller's (1

is due to this inappropriate detrending of the data.

Kleidon (1986)(5) and LeRoy and Parke (1992) examine this question us

Carlo methods. For example, Kleidon (1986) assumes that expected dividend

constant (= $\theta$) so actual dividends are generated by a (geometric) random walk

\[ \ln D_t = \theta + \ln D_{t-1} + \varepsilon_t \qquad (6.39) \]

where $\varepsilon_t$ is white noise. This yields a non-stationary stochastic trend for D

generated series for $D_t$ for $m$ observations using (6.39) one can use the DPV

generate a time series of length '$m$' for $P_t^*$ and for $P_t$. One can then establi

$\operatorname{var}(P_t) > \operatorname{var}(P_t^*)$ in the artificial sample of data of length $m$. One can

'experiment' $n$ times (each time generating $m$ observations) and see how m

$\operatorname{var}(P_t) > \operatorname{var}(P_t^*)$. Since the EMH/fundamentals model is 'true' by constru

would not expect the variance bound to be violated in a large number of cas

repeated experiments. (Some violations will be due to chance or 'statistical o

fact Kleidon (1986) finds that when using the generated data and detrending b

the variance bound is frequently violated even though the EMH/fundamenta

true. The frequency of violations is 90 percent (when using Shiller's method of d

while the frequency of 'gross violations' (i.e. VR > 5) varied considerably de

the rate of interest (discount rate) assumed in the simulations. (For example,

percent the frequency of gross violations is only about 5 percent, but for r = 5

the figure rises dramatically to about 40 percent.)
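A Monte Carlo experiment in the spirit of the one described above can be sketched in Python. This is not Kleidon's code and does not use his parameter values: the drift, shock volatility, interest rate, sample length, number of replications, normal shocks and the choice of the last actual price as terminal value are all assumptions made for the illustration. Under a geometric random walk with normal shocks, $E_t D_{t+i} = D_t \exp\{i(\theta + \sigma^2/2)\}$, so the DPV formula gives a price proportional to the current dividend.

```python
import numpy as np

def simulate_violation_frequency(n_reps=200, m=100, theta=0.01, sigma=0.1, r=0.075, seed=0):
    """Share of artificial samples with var(P_t) > var(P*_t) after deterministic
    detrending, although prices are generated from the DPV formula."""
    rng = np.random.default_rng(seed)
    delta = 1.0 / (1.0 + r)
    growth = np.exp(theta + 0.5 * sigma**2)     # E_t(D_{t+1}) / D_t with normal shocks
    if delta * growth >= 1.0:
        raise ValueError("need delta * growth < 1 for a finite price")
    price_dividend = delta * growth / (1.0 - delta * growth)
    time = np.arange(m)
    violations = 0
    for _ in range(n_reps):
        # Geometric random walk: ln D_t = theta + ln D_{t-1} + e_t
        d = np.exp(np.cumsum(theta + sigma * rng.normal(size=m)))
        p = price_dividend * d                  # rational price under the null
        # Perfect foresight price by backward recursion; terminal value = last actual price
        p_star = np.empty(m)
        p_star[-1] = p[-1]
        for t in range(m - 2, -1, -1):
            p_star[t] = delta * (p_star[t + 1] + d[t + 1])
        # Deterministic detrending: regress ln P_t on a constant and t, divide by exp(b t)
        b = np.polyfit(time, np.log(p), 1)[0]
        trend = np.exp(b * time)
        if np.var(p / trend, ddof=1) > np.var(p_star / trend, ddof=1):
            violations += 1
    return violations / n_reps

freq = simulate_violation_frequency()
print(f"share of samples violating the bound: {freq:.2f}")
```

Re-running with different values of `r` or `theta` gives a rough feel for how sensitive the violation frequency is to the assumed interest rate and dividend growth, which is the point at issue between Kleidon and Shiller.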

Shiller (1988) refined Kleidon's procedure by noting that Kleidon's combine

tions for the growth rate of dividends (= $\theta$) and the level of interest rates

implausible value for the dividend price ratio. Shiller allows $\theta$ to vary with $r$

the artificially generated data, the dividend price ratio equals its average his

Under the null of the RVF he finds that the gross violations of the varianc

substantially less than those found by Kleidon.

Further, Shiller (1989, page 85) notes that in none of the above Monte Ca

is the violation of the variance inequality as large as that actually found by Shi

past values of dividends (i.e. an AR(q) process where q is large) and may not

root (i.e. the sum of the coefficients on lagged dividends is less than unity).

This debate highlights the problem of trying to discredit results which use

by using 'specific special cases' (e.g. random walk) in a Monte Carlo anal

Monte Carlo studies provide a highly specific 'sensitivity test' of empirical

such experiments may not provide an accurate description of real world data.

problem is that in a finite 'real world' data set one often does not know what i

realistic' statistical representation of the data. Often, data are equally well repr

a stationary or non-stationary univariate or multivariate series. However, one

with Shiller (1989) that on a priori economic grounds it is hard to accept tha

believe that when faced with an unexpected increase in current dividends, of

they expect that dividends will be higher by z percent, in all future periods. Ho

latter is implied by the (geometric) random walk model of dividends (equat

used by Kleidon and others. Shiller (1989) prefers the view that firms attempt

nominal dividends (and hardly ever cut dividends). In this case real dividend

in the above studies) may appear to be 'close to' a unit root series but the 'tru

stationary.

The outcome of all of the above arguments is that not only may the sm

properties of the variance bounds tests be unreliable but if there is non-statio

even tests based on large samples may be suspect. Clearly, all one can do

(while awaiting new time series data as 'time' moves on!) is to assess the ro

the volatility results under different methods of detrending. For example, Shi

reworks some of his earlier variance inequality results using $P_t/E_t^{30}$ and $P_t^*/$
