
99.9% levels, respectively, which means we are 99.5% sure that the true
y-intercept is not zero and 99.9% sure that the true slope is not zero.10

T A B L E 2-3

Abbreviated Table of T-Statistics

4  Selected t Statistics

5  d.f.\Pr.   0.050   0.025   0.010
6  3          2.353   3.182   4.541
7  8          1.860   2.306   2.896
8  12         1.782   2.179   2.681
9  120        1.658   1.980   2.358
10 Infinity   1.645   1.960   2.326

10. For spreadsheets that do not provide P-values, another way of calculating the statistical
significance is to look up the t-statistics in a Student's t-distribution table and find the level
of statistical significance that corresponds to the t-statistic obtained in the regression.
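For readers without such a table at hand, that lookup can also be done numerically: the one-tailed probability corresponding to a given t-statistic can be computed by integrating the Student's t density. This is a sketch; the function names are illustrative, not from the text.

```python
import math

def t_pdf(t, df):
    """Student's t probability density with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + t * t / df) ** (-(df + 1) / 2)

def t_tail_prob(t, df, upper=200.0, steps=100_000):
    """One-tailed P(T > t), approximated by trapezoidal integration of the density."""
    h = (upper - t) / steps
    total = 0.5 * (t_pdf(t, df) + t_pdf(upper, df))
    for i in range(1, steps):
        total += t_pdf(t + i * h, df)
    return total * h

# Table 2-3's 0.025 column: t = 2.306 at 8 d.f. corresponds to Pr = 0.025
p = t_tail_prob(2.306, 8)
```

Running the same check against the table's 3 d.f. row (t = 3.182) likewise returns approximately 0.025.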

PART 1 Forecasting Cash Flows
The F test is another method of testing the null hypothesis. In
multivariable regressions, the F-statistic measures whether the independent
variables as a group explain a statistically significant portion of the
variation in Y.
We interpret the confidence intervals as follows: there is a 95% prob-
ability that true fixed costs (the y-intercept) fall between $22,496 (F33) and
$91,045 (G33); similarly, there is a 95% probability that the true variable
cost (the slope coefficient) falls between $0.77 (F34) and $0.84 (G34).
The denominator of equation (2-6) is called the standard error of b,
or s_b. The standard error of the Y-estimate, which is defined as

$s = \sqrt{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2}$

is $16,014 (B23). The larger the amount of scatter of the points around
the regression line, the greater the standard error.11
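A minimal sketch of that computation, using made-up actual and fitted values rather than the chapter's data:

```python
import math

def std_error_of_y_estimate(actual, fitted):
    """s = sqrt( sum((Y_hat_i - Y_i)^2) / (n - 2) ): scatter around the regression line."""
    n = len(actual)
    sse = sum((yhat - y) ** 2 for y, yhat in zip(actual, fitted))
    return math.sqrt(sse / (n - 2))

# Illustrative numbers only (not Table 2-1B's)
actual = [100.0, 140.0, 185.0, 210.0, 260.0]
fitted = [105.0, 142.0, 178.0, 215.0, 255.0]
s = std_error_of_y_estimate(actual, fitted)
```

The further the fitted values stray from the actuals, the larger the squared residuals and hence the larger s.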

Precise Confidence Intervals12
Earlier in the chapter, we estimated 95% confidence intervals by subtracting
and adding two standard errors of the y-estimate around the regression
estimate. In this section, we demonstrate how to calculate precise
95% confidence intervals around the regression estimate using the equations:

$\hat{Y}_0 \pm t_{0.025}\, s\, \sqrt{\dfrac{1}{n} + \dfrac{x_0^2}{\sum x_i^2}}$   (2-8)

95% confidence interval for the mean forecast

$\hat{Y}_0 \pm t_{0.025}\, s\, \sqrt{1 + \dfrac{1}{n} + \dfrac{x_0^2}{\sum x_i^2}}$   (2-9)

95% confidence interval for a specific year's forecast
In the context of forecasting adjusted costs as a function of sales,
equation (2-8) is the formula for the 95% confidence interval for the mean
adjusted cost, while equation (2-9) is the 95% confidence interval for the
costs in a particular year. We will explain what that means at the end of
this section, after we present some material that illustrates this in Table
2-1B, page 2.
Note that these confidence intervals are different from those in equa-
tion (2-7), which was around the forecast slope only, i.e., b. In this section,

11. This standard error of the Y-estimate applies to the mean of our estimate of costs, i.e., the
average error if we estimate adjusted costs and expenses many times. This is appropriate in
valuation, as a valuation is a forecast of net income and/or cash flows for an infinite
number of years. The standard error, and hence the 95% confidence interval, for a single
year's costs is higher.
12. This section is optional, as the material is somewhat advanced, and it is not necessary to
understand this in order to be able to use regression analysis in business valuation.
Nevertheless, it will enhance your understanding should you choose to read it.

CHAPTER 2 Using Regression Analysis 37
we are calculating confidence intervals around the entire regression forecast.
The first 15 rows of Table 2-1B, page 2, are identical to the first page
and require no explanation. The $989,032 in B16 is the average of the 10
years of sales in B6–B15.
Column D is the deviation of each observation from the mean, which
is the sales in Column B minus the mean sales in B16. For example, D6
(−$739,032) is equal to B6 ($250,000) minus B16 ($989,032). D7
(−$489,032) equals B7 ($500,000) minus B16 ($989,032). The total of all
deviations from the mean must always equal zero, which it does (D16).
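That zero-sum property is easy to verify in code; the sales figures below are illustrative, not the table's full data set:

```python
sales = [250_000, 500_000, 750_000, 900_000, 1_100_000]  # illustrative only
mean = sum(sales) / len(sales)
deviations = [x - mean for x in sales]
total = sum(deviations)  # always zero, up to floating-point rounding
```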
Column E is the squared deviations, i.e., the square of Column D. In
statistics, the independent variable(s) is known as X, while the deviations
from the mean are known as x, which explains the column labels in B5
and D5. The sum of squared deviations,

$\sum_{i=1988}^{1997} x_i^2$

equals 1.28 × 10¹² (E16).
The next step is to compute the squared deviations for our sample
forecast year. We assume that forecast sales for 1998 are $1.6 million (B17).
We repeat the coefficients from the regression formula from the first page
of the table in B36 and B37. Applying the regression equation, we would
then forecast expenses at $1,343,928 (C17).
In order to compute a 95% confidence interval around the expense
forecast of $1,343,928, we apply equations (2-8) and (2-9). 1998 forecast
sales are $610,968 (D17 = B17 − B16) above the mean of the historical
period. That is the x0 in (2-8) and (2-9). We square the term to get 3.73
× 10¹¹ (E17). Then we divide that by the sum of the squared deviations in
the historical period, 1.28 × 10¹² (E16), to get 0.2905650 (F17), which we
repeat below in Row 28.
In Row 25, we insert the t-statistic of 2.306, which one can find in a
table for a 95% confidence level (the 0.025 column in a two-tailed distri-
bution) and eight degrees of freedom (10 observations − 1 indepen-
dent variable − 1 = 8). In Row 26 we show the standard error of the y-
estimate of $16,014, which came from Table 2-1B, B23. Row 27 is 1/n =
1/10 = 0.1, where n is the number of observations.
Row 28 is a repetition of F17, the ratio of the squared deviation of
the forecast to the sum of the squared deviations of the independent
variables from their mean.
In B29 we add zero, and in C29 we add 1, according to equations
(2-8) and (2-9), respectively. We will explain the difference in the two
formulas shortly.
In Row 30 we add Rows 27 to 29, which are the terms in the square
root sign in the equations. Obviously, C30 = B30 + 1. In Row 31 we take
the square root of Row 30.
Finally, we are able to calculate our 95% confidence intervals as Row
25 × Row 26 × Row 31. The 95% confidence interval for the mean is
2.306 × $16,014 × 0.6249520 = $23,078 (B32), approximately 1.44 times
the size of the standard error of the y-estimate. The 95% confidence
interval for the specific year's cost forecast is $43,547 (C32), approximately
2.72 times the size of the standard error of the y-estimate. The 95% con-
fidence intervals are 1.7% (B33) and 3.2% (C33) of the forecast costs for
the mean and the specific year's forecast, respectively.
You can see that the 95% confidence intervals for both the mean and
the specific year's forecast cost are roughly two times the standard
error of the y-estimate. Statisticians often loosely approximate the 95%
confidence intervals as two standard errors below and above the
regression estimate. Equations (2-8) and (2-9) are more precise.
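The steps in Rows 25–31 can be collapsed into a direct application of equations (2-8) and (2-9), using the values quoted in the text. This is a sketch with an illustrative function name:

```python
import math

def ci_half_width(t, s, n, dev_ratio, specific_year=False):
    """t * s * sqrt([1 if a specific year] + 1/n + x0^2 / sum(x_i^2))."""
    extra = 1.0 if specific_year else 0.0
    return t * s * math.sqrt(extra + 1.0 / n + dev_ratio)

t = 2.306              # Row 25: t-statistic, 8 d.f., 0.025 column
s = 16_014             # Row 26: standard error of the y-estimate
n = 10                 # observations, so Row 27 = 1/n = 0.1
dev_ratio = 0.2905650  # Row 28: x0^2 / sum of squared deviations

ci_mean = ci_half_width(t, s, n, dev_ratio)                      # ~ $23,078 (B32)
ci_year = ci_half_width(t, s, n, dev_ratio, specific_year=True)  # ~ $43,547 (C32)
```

The only difference between the two calls is the extra 1 under the square root, which is what widens the specific-year interval relative to the mean interval.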
Now we will discuss the difference between equations (2-8) and
(2-9). We forecast sales to be $1.6 million in 1998, which means that our
forecast of adjusted costs for that year according to the regression equation
is $1,343,928. Of course, the actual expenses will not equal that number,
even if actual sales by some miracle equal forecast sales. The
95% confidence interval for the mean tells us that if we add and subtract
$23,078 to our forecast of $1,343,928, then we are 95% sure that the true
regression line at sales of $1.6 million should have been between
$1,320,850 and $1,367,006. If we were to experience sales of $1.6 million
many times (say, 1,000 times), we would be 95% sure that the average
cost would fall in our confidence interval.13 Equation (2-8) is the equation
describing this confidence interval.
That does not mean that we are 95% sure that costs would be be-
tween $1,320,850 and $1,367,006 in any particular year when sales are $1.6
million. We need a wider confidence interval to be 95% sure of costs in
a particular year, given a particular level of sales. Equation (2-9) describes
the confidence interval for a particular year.
Thus, the $23,078 confidence interval (meaning that we add and
subtract that number from forecast costs) appropriately quantifies our
long-run expectation of the confidence interval around forecast costs,
given the level of sales. In business valuation we are not very concerned
that every individual year conform to our forecasts. Rather, we are con-
cerned with the long-run accuracy of the regression equation. Thus, equa-
tion (2-8) is the relevant equation for 95% confidence intervals for valu-
ation analysts. Remember that the confidence interval expands the further
we move away from the mean of the historical period. Therefore, if we
forecast the costs to go with forecast sales of, say, $5 million in the year
2005, the confidence interval around the cost estimate is wider than the
1.7% (B33) around the 1998 forecast.

Selecting the Data Set and Regression Equation
Table 2-4 is otherwise identical to Table 2-1B, except that instead of all 10
years of data, it only contains the last 5 years. The regression equation
for the 5 years of data is (Table 2-4, B27 and B28):

Adjusted costs = $71,252 + ($0.79 × Sales)

Examining the regression statistics, we find that the adjusted R² is

13. This ignores the need to recompute the regression equation with new data.

T A B L E 2-4

Regression Analysis 1993–1997

4  Year   Sales        Adjusted Costs

5  1993   $1,123,600   $965,043
6  1994   $1,191,016   $1,012,745
7  1995   $1,262,477   $1,072,633
8  1996   $1,338,226   $1,122,714
9  1997   $1,415,000   $1,199,000

13 Regression Statistics

14 Multiple R          99.79%
15 R square            99.58%
16 Adjusted R square   99.44%
17 Standard error      6,840
18 Observations        5

21              df   SS         MS         F     Significance F

22 Regression   1    3.35E+10   3.35E+10   716   1.15E-04
23 Residual     3    1.40E+08   4.68E+07
24 Total        4    3.36E+10

26                Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%

27 Intercept [1]  71,252         37,624           1.89     0.15      (48,485)    190,989
28 Sales [2]      0.79           0.03             26.75    0.00      0.70        0.89

[1] This is the regression estimate of fixed costs.
[2] This is the regression estimate of variable costs.

[Regression Plot: Adjusted Costs vs. Sales, with trendline y = 0.7924x + 71252, R² = 0.9958]

99.44% (B16), still indicating an excellent relationship. We do see a dif-
ference in the t-statistics for the two regressions.
The t-statistic for the intercept is now 1.89 (D27), indicating it is no
longer significant at the 95% level, whereas it was 3.82 in Table 2-1B.
Another effect of fewer data is that the 95% confidence interval for the
intercept value is −$48,485 (F27) to $190,989 (G27), a range of $239,475.
In addition, the t-statistic for the slope coefficient, while still significant,
has fallen from 56.94 (Table 2-1B, D34) to 26.75 (D28). The 95% confidence
interval for the slope now becomes $0.70 (F28) to $0.89 (G28), a range
that is 3½ times greater than that in Table 2-1B and indicates much more
uncertainty in the variable cost than we obtain using 10 years of data.
The standard error of the Y-estimate, however, decreases from
$16,014 (Table 2-1B, B23) to $6,840. This indicates that decreasing the
number of data points improves the Y-estimate, an opposite result from
all of the preceding. Why?
Earlier, we pointed out that using only a small range for the inde-
pendent variable leads to a small denominator in the variance of b, i.e.,

$s_b^2 = \dfrac{s^2}{\sum x_i^2}$

which leads to larger confidence intervals. However, larger data sets (us-
ing more years of data) tend to lead to a larger standard error of the y-
estimate, s. As we mentioned earlier,

$s = \sqrt{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2}$

where $\hat{Y}_i$ are the forecast (regression-fitted) costs, $Y_i$ are the historical
costs, and n is the number of observations.14 Thus, we often have a trade-
off in deciding how many years of data to include in the regression. More
years of data lead to better confidence intervals, but fewer years may
lead to smaller standard errors of the y-estimate.
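As a numerical cross-check of Table 2-4, the five 1993–1997 data points can be run through the ordinary least-squares formulas directly. This is a sketch, not the spreadsheet's own computation:

```python
import math

sales = [1_123_600, 1_191_016, 1_262_477, 1_338_226, 1_415_000]  # 1993-1997
costs = [965_043, 1_012_745, 1_072_633, 1_122_714, 1_199_000]

n = len(sales)
x_bar = sum(sales) / n
y_bar = sum(costs) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sales, costs))
s_xx = sum((x - x_bar) ** 2 for x in sales)

slope = s_xy / s_xx                # variable cost per sales dollar, ~0.79 (B28)
intercept = y_bar - slope * x_bar  # fixed costs, ~$71,252 (B27)

# Standard error of the Y-estimate: sqrt(SSE / (n - 2)), ~$6,840 (B17)
sse = sum((intercept + slope * x - y) ** 2 for x, y in zip(sales, costs))
s = math.sqrt(sse / (n - 2))
```

Dividing the sum of squared residuals by n − 2 = 3 matches the Residual df row of the ANOVA table.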
Table 2-4 was constructed to demonstrate that you should evaluate
all of the regression statistics carefully to determine if the relationship is
sufficiently strong to merit using it and which data set is best to use.
Simply looking at the adjusted R² value is insufficient; all the regression
statistics should be evaluated in their entirety, as an improvement in one
may be counterbalanced by a deterioration in another. Therefore, it is best
to test different data sets and compare all of the regression statistics to
select the regression equation that represents the best overall relationship
between the variables.

14. We divide by n − 2 instead of n because it takes two points to determine a line. If we only had
two years of historical data, we could determine a regression line, but we would know
absolutely nothing about the variance around the line. It takes a minimum of three years of
data to be able to say anything at all about how well the regression line fits the data, and
three years is usually insufficient. It is much better to have at least five years of data,
though four years can often suffice.

Although regression analysis is a powerful tool, its blind application can
lead to serious errors. Various problems can be encountered, and one
should be cognizant of the limitations of this technique. Aside from the
obvious problems of poor fit and insufficient data, structural changes in
the company can also invalidate the historical relationship of sales and
costs.
Insufficient Data
Insufficient data leads to increased error in the regression, which in turn
will lead to increased error in the forecast data. As mentioned previously,
to optimize the regression equation it is best to examine overlapping data
sets to determine which gives the best results.

Substantial Changes in Competition or Product/Service
Although regression analysis is applicable in most situations, substantial
structural changes in a business may render it inappropriate. As men-
tioned previously, the appraiser can often compensate for changes in the
competitive environment by making pro forma adjustments to historical
sales, keeping costs the same. However, when a company changes its
business, the past is less likely to be a good indicator of what may occur
in the future, depending on the significance of the change.

Table 2-5 is an example of using regression techniques to forecast sales.
In order to do this, it must be reasonable to assume that past performance
is a reasonable indicator of future expectations. If there are fundamental
changes in the industry that render the past a poor indicator of the future,
then regression may be useless and even quite misleading. As cautioned by
Pratt, Reilly, and Schweihs (1996), blind application of regression, where
past performance is the sole indicator of future sales, can be misleading
and incorrect. Instead, careful analysis is required to determine whether
past income generating forces will be duplicated in the future. Neverthe-
less, regression analysis is often useful as a benchmark in forecasting.
In our example in Table 2-5, the primary independent variable is
gross domestic product (GDP), which we show for the years 1988“1998
in billions of dollars in cells B5:B15 (the cell references separated by a
colon will be our way to indicate contiguous spreadsheet ranges). In
C5:C15, we show the square of GDP in billions of dollars, which is our
second potential independent variable.15 Our dependent variable is sales,
which appears in D5:D15.
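A sketch of how the two potential independent variables would enter such a regression: build columns for GDP and GDP², then solve the normal equations for the coefficients. The GDP and sales figures below are synthetic, not Table 2-5's data:

```python
def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

gdp = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]            # synthetic GDP, $ trillions
sales = [2.0 + 0.3 * g + 0.05 * g * g for g in gdp]  # generated from known coefficients

# Design matrix columns: intercept, GDP, GDP^2; solve (X'X) beta = X'y
X = [[1.0, g, g * g] for g in gdp]
XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
Xty = [sum(row[i] * y for row, y in zip(X, sales)) for i in range(3)]
beta = solve3(XtX, Xty)  # recovers approximately [2.0, 0.3, 0.05]
```

Because the sales series is generated from known coefficients, the fit recovers them, which confirms the squared-GDP column enters the regression like any other independent variable.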

15. Another variation of this procedure is to substitute the square root of GDP for its square.

