to +1 (or −1), the better the relationship. An r value of zero indicates no relationship between the variables.

In a multivariable regression equation, the multiple R measures how well the dependent variable is correlated with all of the independent variables in the regression equation. Multiple R measures the total amount of variation in the dependent variable that is explained by the independent variables. In our case, the value of 99.88% (B20) is very close to 1, indicating that almost all of the variation in adjusted costs is explained by sales.6

The square of the single or multiple R value, referred to as R-square (or R²), measures the percentage of the variation in the dependent variable explained by the independent variable. It is the main measure of the goodness of fit. We obtain an R² of 99.75% (B21), which means that sales explains 99.75% of the variation in adjusted costs.
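The arithmetic behind r and R² can be sketched in a few lines. The sales and cost figures below are hypothetical stand-ins, not the data from Table 2-1A; the point is that for a one-variable OLS regression, R² is simply the square of r.

```python
# Hypothetical sales and adjusted-cost data (NOT the book's Table 2-1A).
import numpy as np

sales = np.array([250_000, 300_000, 340_000, 400_000, 450_000], dtype=float)
costs = np.array([210_000, 247_000, 275_000, 318_000, 354_000], dtype=float)

# Fit costs = a + b * sales by ordinary least squares.
b, a = np.polyfit(sales, costs, 1)

# R-squared: the fraction of the variation in costs explained by sales.
resid = costs - (a + b * sales)
r_squared = 1 - (resid @ resid) / np.sum((costs - costs.mean()) ** 2)

# Single-variable correlation r; for simple OLS, r**2 equals R-squared.
r = np.corrcoef(sales, costs)[0, 1]
print(r_squared, r ** 2)
```

Because the made-up data are nearly linear, this sketch also produces an R² close to 1, as the chapter says is typical for adjusted costs versus sales.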

Adding more independent variables to the regression equation usually adds to R², even when there is no true causality. In statistics, this is called "spurious correlation." The adjusted R², which is 99.72% in our example (B22), removes the expected spurious correlation in the "gross" R²:

Adj R² = (R² − k/(n − 1)) × (n − 1)/(n − k − 1)

where n is the number of observations and k is the number of independent variables (also known as regressors).
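As a check on the formula, a few lines of Python reproduce the chapter's numbers: plugging R² = 99.75% with n = 10 observations and k = 1 regressor (the n and k of the regression discussed later in the chapter) into the equation above returns the 99.72% reported in cell B22.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adj R^2 = (R^2 - k/(n-1)) * (n-1)/(n-k-1): strips the expected
    spurious correlation out of the 'gross' R^2."""
    return (r2 - k / (n - 1)) * (n - 1) / (n - k - 1)

# The chapter's values: R^2 = 99.75%, n = 10, k = 1.
adj = adjusted_r_squared(0.9975, n=10, k=1)
print(f"{adj:.2%}")  # → 99.72%
```

This form is algebraically identical to the more common statement Adj R² = 1 − (1 − R²)(n − 1)/(n − k − 1).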

Although the data in Table 2-1A are fictitious, in practice I have found that regressions of adjusted costs versus sales usually give rise to R² values of 98% or better.7

6. Although the spreadsheet labels this statistic Multiple R, because our example is an OLS regression, it is simply R.
7. This obviously does not apply to start-ups.

CHAPTER 2 Using Regression Analysis 29

Standard Error of the y-Estimate

The standard error of the y-estimate is another important regression statistic that gives us information about the reliability of the regression estimate. We can multiply the standard error of $16,014 (B23) by two to calculate an approximate 95% confidence interval for the regression estimate. Thus, we are 95% sure that the true adjusted costs are within $32,028 of the regression estimate of total adjusted costs.8 Dividing the full width of approximately $64,000 by the mean of adjusted costs (approximately $1 million) leads to a 95% confidence interval that varies by about ±3%, or 6% total. Later in the chapter we will calculate precise confidence intervals.
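The back-of-the-envelope interval above is easy to reproduce; the sketch below simply re-runs the arithmetic with the chapter's figures (standard error of $16,014, mean adjusted costs of roughly $1 million).

```python
std_err_y = 16_014          # standard error of the y-estimate (cell B23)
mean_costs = 1_000_000      # approximate mean of adjusted costs

half_width = 2 * std_err_y  # two standard errors ≈ 95% confidence
print(f"95% interval: estimate ± ${half_width:,}")                    # ± $32,028
print(f"as a share of mean costs: ± {half_width / mean_costs:.1%}")   # ± 3.2%
```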

The Mean of a and b

Because a and b are specific numbers that we calculate in a regression analysis, it is easy to lose sight of the fact that they are not simply numbers, but rather random variables. Remember that we are trying to estimate α and β, the true fixed and variable cost, which we will never know. If we had 20 years of financial history for our Subject Company, we could take any number of combinations of years for our regression analysis. Suppose we had data for 1978–1997. We could use only the last five years, 1993–1997, or choose 1992–1995 and 1997, still keeping five years of data, but excluding 1996, although there is no good reason to do so. We could use 5, 6, 7, or more years of data. There are a large number of different samples we can draw out of 20 years of data. Each different sample would lead to a different calculation of a and b in our attempt to estimate α and β, which is why a and b are random variables. Of course, we will never be exactly correct in our estimate, and even if we were, there would be no way to know it!

Equations (2-1) and (2-2) state that a and b are unbiased estimators of α and β, which means that their expected values equal α and β. The capital E is the expected value operator.

E(a) = α    the mean of a is alpha    (2-1)
E(b) = β    the mean of b is beta    (2-2)
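Unbiasedness is easy to see by simulation. The sketch below invents a "true" α and β (arbitrary values, not from the book), generates many noisy cost samples, fits OLS to each, and checks that a and b average out to α and β, as equations (2-1) and (2-2) state.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 100_000.0, 0.6                 # hypothetical true fixed/variable cost
sales = np.linspace(500_000, 1_500_000, 10)  # hypothetical sales history

a_estimates, b_estimates = [], []
for _ in range(5_000):
    # Each trial is one possible "sample of history" with random noise.
    costs = alpha + beta * sales + rng.normal(0, 20_000, size=sales.size)
    b, a = np.polyfit(sales, costs, 1)
    a_estimates.append(a)
    b_estimates.append(b)

# E(a) ≈ alpha and E(b) ≈ beta across many samples.
print(np.mean(a_estimates), np.mean(b_estimates))
```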

The Variance of a and b

We want to do everything we can to minimize the variances of a and b in order to improve their reliability as estimators of α and β. If their variances are high, we cannot place much reliability on our regression estimate of costs, something we would like to avoid.

Equations (2-3) and (2-4) below for the variance of a and b give us important insights into deciding how many years of financial data to gather and analyze. Common practice is that an appraisal should encompass five years of data. Most appraisers consider anything older than five years to be stale data, and anything less than five years insufficient. You will see that the common practice may be wrong.

The mathematical definition for the variance of a is:

8. This is true at the sample mean of X, and the confidence interval widens as we move away from that.

PART 1 Forecasting Cash Flows 30

Var(a) = σ²/n    (2-3)

where σ² is the true and unobservable population variance around the true regression line and n = the number of observations.9 Therefore, the variance of our estimate of fixed costs decreases with n, the number of years of data. If n = 10, the variance of our estimate of α is one-half of its variance if we use a sample of five years of data. The standard deviation of a, which is the square root of its variance, decreases somewhat less dramatically than the variance, but significantly nonetheless. Having 10 years of data reduces the standard deviation of our estimate of fixed costs by 29% vis-à-vis five years of data. Thus, having more years of data may increase the reliability of our statistical estimate of fixed costs if the data are not "stale," that is, out of date due to changes in the business, all else being constant.
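The 29% figure follows directly from Var(a) = σ²/n; a short sketch makes the arithmetic explicit (the unknown σ cancels out of the comparison).

```python
import math

# Variance of a is sigma^2 / n, so going from n = 5 to n = 10 halves it.
variance_ratio = 5 / 10
std_dev_ratio = math.sqrt(variance_ratio)      # ≈ 0.707

print(f"standard deviation falls by {1 - std_dev_ratio:.0%}")  # → 29%
```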

The variance of b is equal to the population variance divided by the sum of the squared deviations from the mean of the independent variable, or:

Var(b) = σ² / Σᵢ₌₁ⁿ xᵢ²    (2-4)

where xᵢ = Xᵢ − X̄, the deviation of the independent variable of each observation, Xᵢ, from the mean, X̄, of all its observations. In this context, it is each year's sales minus the average of sales in the period of analysis. Since we have no control over the numerator (indeed, we cannot even know it), the denominator is the only portion where we can affect the variance of b. Let's take a further look at the denominator.

Table 2-2 is a simple example to illustrate the meaning of x versus X̄. Expenses (Column C) is our Y (dependent) variable, and sales (Column D) is our X (independent) variable.

T A B L E  2-2
OLS Regression: Example of Deviation from Mean

      A             B      C           D           E              F
 5                                     Variable
 6                         Y           X           x              x²
 7                                                 Deviation      Squared Dev.
 8    Observation   Year   Expenses    Sales       from Mean      from Mean
 9    1             1994   $ 80,000    $100,000    $(66,667)       4,444,444,444
10    2             1996   $115,000    $150,000    $(16,667)         277,777,778
11    3             1997   $195,000    $250,000    $ 83,333        6,944,444,444
12    Total                            $500,000    $       -      11,666,666,667
13    Average                          $166,667

9. Technically this is true only when the y-axis is placed through the mean of x. The following arguments are valid, however, in either case.

The three years' sales total $500,000 (cell D12), which averages to $166,667 (D13) per year, which is X̄. Column E shows x, the deviation of each X observation from the sample mean, X̄, of $166,667. In 1994, x₁ = $100,000 − $166,667 = −$66,667. In 1996, x₂ = $150,000 − $166,667 = −$16,667. Finally, in 1997, x₃ = $250,000 − $166,667 = $83,333. The sum of all deviations is always zero, or

Σᵢ₌₁³ xᵢ = 0

Finally, Column F shows x², the square of Column E. The sum of the squared deviations is

Σᵢ₌₁³ xᵢ² = $11,666,666,667

This squared term appears in several OLS formulas and is particularly important in calculating the variance of b.
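Table 2-2's deviation columns can be reproduced in a few lines; the sketch below confirms that the deviations sum to zero and that the squared deviations total $11,666,666,667.

```python
sales = [100_000, 150_000, 250_000]        # Column D of Table 2-2
mean_sales = sum(sales) / len(sales)       # X-bar = $166,667 (cell D13)

deviations = [s - mean_sales for s in sales]   # Column E: x_i = X_i - X-bar
sum_sq_dev = sum(d ** 2 for d in deviations)   # Column F total

print(round(sum(deviations)))   # deviations always sum to 0
print(round(sum_sq_dev))        # → 11666666667
```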

When we use relatively fewer years of data, there tends to be less variation in sales. If sales are confined to a fairly narrow range, the squared deviations in the denominator are relatively small, which makes the variance of b large. The opposite is true when we use more years of data. A countervailing consideration is that using more years of data may lead to a higher sample variance, which is the regression estimate of σ². Thus, it is difficult to say in advance how many years of data are optimal.

This means that the common practice in the industry of using only five years of data so as not to corrupt our analysis with stale data may be incorrect if there are no significant structural changes in the competitive environment. The number of years of available data that gives the best overall statistical output for the regression equation is the most desirable. Ideally, the analyst should experiment with different numbers of years of data and let the regression statistics (the adjusted R², t-statistics, and standard error of the y-estimate) provide the feedback for making the optimal choice of how many years of data to use.
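That feedback loop can be automated. The sketch below uses made-up data for a ten-year history (nothing here comes from the book's tables) and compares adjusted R² across regression windows of 5 through 10 years, which is the kind of experiment the paragraph above recommends.

```python
import numpy as np

# Hypothetical 10-year sales and cost history (not the book's data).
rng = np.random.default_rng(1)
sales = np.linspace(600_000, 1_400_000, 10)
costs = 120_000 + 0.62 * sales + rng.normal(0, 15_000, size=10)

def adj_r2(x: np.ndarray, y: np.ndarray) -> float:
    """Adjusted R-squared of a one-variable OLS fit of y on x."""
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    n, k = len(y), 1
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Regress on the most recent 5, 6, ..., 10 years and compare the statistic.
for window in range(5, 11):
    print(window, round(adj_r2(sales[-window:], costs[-window:]), 4))
```

In practice the same loop would also track the t-statistics and the standard error of the y-estimate before settling on a window.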

Sometimes prior data can truly be stale. For example, if the number of competitors in the Company's geographic area doubled, this would tend to drive down prices relative to costs, resulting in a decreased contribution margin and an increase in variable costs per dollar of sales. In this case, using the old data without adjustment would distort the regression results. Nevertheless, it may be advisable in some circumstances to use some of the old data, with adjustments, in order to have enough data points for analysis. In the example of more competition in later years, it is possible to reduce the sales in the years prior to the competitive change on a pro forma basis, keeping the costs the same. The regression on this adjusted data is often likely to be more accurate than "winging it" with only two or three years of fresh data.

Of course, the company's management has its view of the future. It is important for the appraiser to understand that view and consider it in his or her statistical work.


Confidence Intervals

Constructing confidence intervals around the regression estimates a and b is another important step in using regression analysis. We would like to be able to make a statement that we are 95% sure that the true variable (either α or β) is within a specific range of numbers, with our regression estimate (a or b) at the midpoint. To calculate the range, we must use the Student's t-distribution, which we define in equation (2-6).

We begin with a standardized normal (Z) distribution. A standardized normal distribution of b (our estimate of β) is constructed by subtracting the mean of b, which is β, and dividing by its standard deviation:

Z = (b − β) / (σ / √(Σᵢ xᵢ²))    (2-5)

Since we do not know σ, the population standard deviation, the best we can do is estimate it with s, the sample standard deviation. The result is the Student's t-distribution, or simply the t-distribution. Figure 2-1 shows a Z-distribution and a t-distribution. The t-distribution is very similar to the normal (Z) distribution, with t being slightly more spread out. The equation for the t-distribution is:

t = (b − β) / (s / √(Σᵢ xᵢ²))    (2-6)

where the denominator is the standard error of b, commonly denoted as sb (the standard error of a is sa).

Since β is unobservable, we have to make an assumption about it in order to calculate a t-distribution for it. The usual procedure is to test for the probability that, regardless of the regression's estimate of β (which is our b), the true β is really zero. In statistics, this is known as the "null hypothesis." The magnitude of the t-statistic is indicative of our ability to reject the null hypothesis for an individual variable in the regression equation. When we reject the null hypothesis, we are saying that our regression estimate of β is statistically significant.

We can construct 95% confidence intervals around our estimate, b, of the unknown β. This means that we are 95% sure the correct value of β is in the interval described in equation (2-7):

b ± t0.025 sb    (2-7)
Formula for 95% confidence interval for the slope
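As a numeric sketch of equation (2-7): with the eight-degree-of-freedom critical value t0.025 = 2.306 (taken from Table 2-3 below), a hypothetical slope b = 0.72 and standard error sb = 0.0126 (placeholder values, not the book's regression output) give the interval computed here.

```python
t_crit = 2.306     # t_0.025 with 8 degrees of freedom (from a standard t-table)
b = 0.72           # hypothetical slope estimate
s_b = 0.0126       # hypothetical standard error of b

lower = b - t_crit * s_b
upper = b + t_crit * s_b
print(f"95% CI for beta: ({lower:.4f}, {upper:.4f})")
```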

Figure 2-2 shows a graph of the confidence interval. The graph is a t-distribution, with its center at b, our regression estimate of β. The markings on the x-axis are the number of standard errors below or above b. As mentioned before, we denote the standard error of b as sb. The lower boundary of the 95% confidence interval is b − t0.025 sb, and the upper boundary of the 95% confidence interval is b + t0.025 sb. The


F I G U R E  2-1
Z-distribution vs. t-distribution
[Graph comparing the two probability density curves. The t-distribution is slightly more spread out than the Z-distribution. X-axis: for Z, standard deviations from the mean; for t, standard errors from the mean.]

F I G U R E  2-2
t-distribution of β around the Estimate b
[Graph of the t-distribution centered at β = b, with 2.5% of the area in each tail beyond β = b − t0.025 sb and β = b + t0.025 sb. X-axis: β measured in standard errors away from b.]

area under the curve for any given interval is the probability that β will be in that interval.

The t-distribution values are found in standard tables in most statistics books. It is very important to use the 0.025 probability column in the tables for a 95% confidence interval, not the 0.05 column. The 0.025 column tells us that for the given degrees of freedom there is a 2.5% probability that the true and unobservable β is higher than the upper end of the 95% confidence interval and a 2.5% probability that the true and unobservable β is lower than the lower end of the 95% confidence interval (see Figure 2-2). The number of degrees of freedom equals n − k − 1, where n is the number of observations and k is the number of independent variables.

Table 2-3 is an excerpt from a t-distribution table. We use the 0.025 column for a 95% confidence interval. To select the appropriate row in the table, we need to know the number of degrees of freedom. Assuming n = 10 observations and k = 1 independent variable, there are eight degrees of freedom (10 − 1 − 1 = 8). The t-statistic in Table 2-3 is 2.306 (C7). That means that we must go 2.306 standard errors below and above our regression estimate to achieve a 95% confidence interval for β. The regression itself will provide us with the standard error of b. As n, the number of observations, goes to infinity, the t-distribution becomes a Z-distribution. When n is large (over 100), the t-distribution is very close to a standardized normal distribution. You can see this in Table 2-3 in that the standard errors in Row 9 are very close to those in Row 10, the latter of which is equivalent to a standardized normal distribution.
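A short sketch ties together the degrees-of-freedom rule and the t → Z convergence just described. The critical values below are copied from a standard t-table's 0.025 column, not computed.

```python
def degrees_of_freedom(n: int, k: int) -> int:
    """df = n - k - 1: observations minus regressors minus one."""
    return n - k - 1

# 0.025-column critical values from a standard t-table.
t_025 = {8: 2.306, 30: 2.042, 100: 1.984}
z_025 = 1.960  # standardized normal limit as df -> infinity

print(degrees_of_freedom(n=10, k=1))                   # → 8
for df in sorted(t_025):
    print(df, t_025[df], round(t_025[df] - z_025, 3))  # gap shrinks as df grows
```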

The t-statistics for our regression in Table 2-1B are 3.82 (D33) and 56.94 (D34). The P-value, also known as the probability (or prob) value, represents the level at which we can reject the null hypothesis. One minus the P-value is the level of statistical significance of the y-intercept and independent variable(s). The P-values of 0.005 (E33) and 10⁻¹¹ (E34) mean that the y-intercept and slope coefficients are significant at the 99.5% and