T A B L E 2-5

Regression Analysis of Sales as a Function of GDP [1]

A B C D E F G H I

GDP2

4 Year GDP Sales

5 1988 5,049.6 25,498,460.2 $1,000,000

6 1989 5,438.7 29,579,457.7 $1,090,000

7 1990 5,743.0 32,982,049.0 $1,177,200

8 1991 5,916.7 35,007,338.9 $1,259,604

9 1992 6,244.4 38,992,531.4 $1,341,478

10 1993 6,558.1 43,008,675.6 $1,442,089

11 1994 6,947.0 48,260,809.0 $1,528,614

12 1995 7,269.6 52,847,084.2 $1,617,274

13 1996 7,661.6 58,700,114.6 $1,706,224

14 1997 8,110.9 65,786,698.8 $1,812,010

15 1998 8,510.7 72,432,014.5 $1,929,791

17 SUMMARY OUTPUT

19 Regression Statistics

20 Multiple R 0.999156207

21 R square 0.998313125

22 Adjusted R square 0.997891407

23 Standard error 13893.80997

24 Observations 11

26 ANOVA

27 df SS MS F Signi¬cance F

28 Regression 2 9.13938E 11 4.5697E 11 2367.24925 8.0971E 12

29 Residual 8 1544303643 193037955.4

30 Total 10 9.15482E 11

32 Coef¬cients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%

33 Intercept 824833.1304 182213.8131 4.526732175 0.001932674 1245019.209 404647.0522 1245019.209 404647.0522

34 GDP 412.8368996 54.65310215 7.553768832 6.5848E-05 286.8065386 538.8672622 286.8065386 538.8672607

GDP2

35 0.010625314 0.004016833 2.64519663 0.029474667 0.019888154 0.001362473 0.019888154 0.001362473

[1] GDP, Gross Domestic Product, is in billions of dollars. GDP is a proxy for the overall economy.

43

Spreadsheet Procedures to Perform Regression

It is mandatory to put the variables in columns and the time periods in

rows. Electronic spreadsheets will not permit you to perform regression

analysis with time in columns and the variables in rows. In other words,

we cannot transpose the data in Table 2-5, cells A4:D15 and still perform

a regression analysis.

Another requirement is that all cells must contain numeric data. You

cannot perform regression with blank cells or cells with alphanumeric

data in them. Also, you will receive an error message if one of your

independent variables is a multiple of another. For example, if each cell

in C5:C15 is three times the corresponding cell in B5:B15, then the x var-

iables are perfectly collinear and the regression produce an error message.

We will explain regression procedures in Microsoft Excel ¬rst, then

in Lotus 123.

In Excel, the procedure to perform the regression analysis is as fol-

lows:

1. Select Tools Data Analysis Regression. This will bring up a

dialog box and automatically places the cursor in Input Y

Range.16

2. For the Y range (which is the dependent variable, sales in our

example), click on the range icon with the red arrow

immediately to the right. Doing so minimizes the dialog box

and enables you to highlight the cell range D4:D15 with your

mouse.17 Note that we have included the label Sales in D4 in

this range. Click again on the range icon again to return to the

dialog box.

3. For the X range, which are the independent variables GDP and

GDP2 in our case, repeat the procedure in (2) and highlight the

range B4:C15.

4. Click on the box Labels, which will put a check mark in the

box.

5. Click on Output Range. Click on the box to the right, click on

the range icon with the red arrow, and then click on cell A17.

This tells the spreadsheet to begin the regression output at that

cell.

6. Click OK.

Excel now calculates the regression and outputs the data as shown

in the bottom half of Table 2-5.

The instructions for Lotus 123 are almost identical. The only differ-

ences are:

1. The command is Range Analyze Regression.

2. The ranges for the dependent and independent variables should

not include the label in Row 4. Thus they are D5:D15 and B5:

C15, respectively.

16. If Data Analysis is not yet enabled in Excel, you must select add-ins and then select

Analysis ToolPak.

17. Excel actually shows the range with dollar signs, e.g., $D$4:$D$15

PART 1 Forecasting Cash Flows

44

3. Lotus 123 does not compute t-statistics for you.18 You will have

to do that manually by creating a formula. Divide the regression

coef¬cient by its standard error. Unfortunately, Lotus 123 does

not calculate the p-values either. You will have to look up your

results in a standard table of t-statistics. We will cover that later.

Examining the Regression Statistics

Once again, we look at the statistical measures resulting from the regres-

sion to determine how strong is the relationship between sales and time.

Adjusted R 2 is 99.8% (B22), a near-perfect relationship. The t-statistics for

the independent variables, GDP and GDP2, are 7.55 (D34) and “2.65

(D35), both statistically signi¬cant. The easiest way to determine the level

of statistical signi¬cance is through the p-value. One minus the p-value

is the level of statistical signi¬cance. For GDP, the p-value is 6.5848 10 5

(E34), which is much less than 0.1%. Thus GNP is statistically signi¬cant

at a level greater than 100% 0.1% 99.9%. The square of GDP has a

p-value of 0.029 (E35), which indicates statistical signi¬cance at the 97.1%

level. We normally accept any regressor with signi¬cance greater than or

equal to 95%, and we may consider accepting a regressor that is signi¬-

cant at the 90% to 95% level.

The standard error of the y-estimate, i.e., sales, is $13,894 (B23). Our

approximate 95% con¬dence interval is two standard errors

$27,788, which is less than 2% of the mean of sales.

In actual practice, adjusted R 2 for a regression of sales of mature

¬rms is often above 90% and frequently around 98%.

Adding Industry-Speci¬c Independent Variables

One should also consider adding industry-speci¬c independent variables.

For example, when valuing a jeweler, we should try adding the price of

gold and silver (and the nonlinear transformations, i.e., squares, square

roots, and logarithms) as independent variables. When valuing a ¬rm in

the oil industry, we should try using the price of a barrel of oil (and its

nonlinear transformations).

When valuing a coffee producer, we would want to have not only

the average price of coffee as an independent variable, but also the price

of tea and perhaps even sugar. The analyst should look to the prices of

the product itself, complements, and substitutes.

Once again, it is important to examine the statistical validity of the

relationship and use professional judgment to determine the usefulness

of the equation. Sales forecasts obtained from regression analysis can

serve as a benchmark from which adjustments can be made based on

qualitative factors that may in¬‚uence future sales.

One should also keep in mind that just because a less quantitative

method of forecasting sales does not have an embarrassingly low R 2 star-

ing the analyst in the face does not mean that it is superior to the re-

18. That is true of version 5, which is already at least four years old. If Lotus has added that

feature in a later version, I would not be aware of that.

CHAPTER 2 Using Regression Analysis 45

gression. It means we have no clue as to the reliability of the forecast. We

should always be uncomfortable with our ignorance.

Try All Combinations of Potential Independent Variables

It is important to try all combinations of independent variables. With a

statistics package, this is done automatically in using automated forward

or backward regression. However, statistics packages have their draw-

backs. They are not very user friendly in communicating with spread-

sheet programs, which most appraisers use in valuation analysis. Most

appraisers will ¬nd the spreadsheet regression capabilities more than ad-

equate.

Therefore, it is important to try all combinations of potential inde-

pendent variables in the regression process. For example, in regressing

sales against both GDP and GDP2, it is not at all unusual to ¬nd both

independent variables statistically insigni¬cant when regressed together,

i.e., p-values greater than 0.05. However, they still may be statistically

signi¬cant when regressed individually. So it is important to regress sales

against GDP and perform a second regression against GDP2. This process

becomes more complicated with additional candidates for independent

variables.

APPLICATION OF REGRESSION ANALYSIS TO THE

GUIDELINE COMPANY METHOD

Valuation using the guideline company method involves the use of ratios

of stock price to: earnings (P/E multiples), cash ¬‚ow (P/CF or P/EBIT

multiples), book value (P/BV multiples), sales (P/Sales), or other mea-

sures of income, cash ¬‚ow, or value. The stock prices typically are those

of public companies in the same or similar business as the company.

Consideration is therefore given to the opinion of the informed investor

and what he or she is willing to pay for the stock of comparative public

companies adjusted for the speci¬c circumstances of the company being

valued. While the use of ratios is common in valuation, regression anal-

ysis is more sophisticated and informative because it provides us with

statistical feedback on the strength of the relationship. Pratt, Reilly, and

Schweihs (1996) present a comprehensive chapter on use of the guideline

company method, so we will only discuss it within the context of regres-

sion analysis.

Table 2-6: Regression Analysis of Guideline Companies

Table 2-6 shows data from an actual guideline company analysis, with

the company names disguised in Column A. Column B contains the fair

market values (FMVs) (market capitalization) for 11 companies, ranging

from slightly over $3 million (B5) to over $150 million (B15). The average

FMV is $41.3 million (B16), with a standard deviation of $44.6 million

(B17). Net income (Column C) averages about $5.1 million (C16), with a

range of $600,000 to $16.9 million. We had to exclude companies A and

B, which were outliers with price earnings (PE) ratios over 60.

PART 1 Forecasting Cash Flows

46

T A B L E 2-6

Regression Analysis of Guideline Companies

A B C D E F G H I

4 Company FMV Net Income ln FMV ln NI 1/g g PE Ratio

5 C 3,165,958 602,465 14.9680 13.3088 20.0000 0.0500 5.2550

6 D 6,250,000 659,931 15.6481 13.3999 10.0000 0.1000 9.4707

7 E 12,698,131 1,375,000 16.3570 14.1340 10.5263 0.0950 9.2350

8 F 24,062,948 2,325,000 16.9962 14.6592 9.0909 0.1100 10.3497

9 G 23,210,578 2,673,415 16.9601 14.7989 12.1951 0.0820 8.6820

10 H 16,683,567 2,982,582 16.6299 14.9083 20.0000 0.0500 5.5937

11 I 37,545,523 4,369,808 17.4411 15.2902 12.5000 0.0800 8.5920

12 J 46,314,262 4,438,000 17.6510 15.3057 9.3023 0.1075 10.4358

13 K 36,068,550 7,384,000 17.4009 15.8148 20.8333 0.0480 4.8847

14 L 97,482,000 12,679,000 18.3952 16.3555 9.5238 0.1050 7.6885

15 M 150,388,518 16,865,443 18.8287 16.6408 9.0909 0.1100 8.9170

16 Average 41,260,912 5,123,149 17.0251 14.9651 13.0057 0.0852 8.1004

17 Standard deviation 44,558,275 5,233,919 1.1212 1.0814 4.8135 0.0252 1.9954

20 SUMMARY OUTPUT

22 Regression Statistics

23 Multiple R 0.997820486

24 R square 0.995645723

25 Adjusted R square 0.994557153

26 Standard error 0.082720079

27 Observations 11

29 ANOVA

30 df SS MS F Signi¬cance F

31 Regression 2 12.51701011 6.258505055 914.6369206 3.59471E-10

32 Residual 8 0.054740892 0.006842611

33 Total 10 12.571751

35 Coef¬cients Standard t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper

Error 95.0%

36 Intercept 3.430881701 0.390158993 8.79354767 2.19714E-05 2.531172869 4.330590533 2.531172869 4.330590533

37 ln NI 0.957081978 0.024655341 38.81844378 2.13125E-10 0.900226622 1.013937333 0.900226622 1.013937333

38 1/g 0.056021702 0.005538834 10.11434967 7.79687E-06 0.068794284 0.04324912 0.068794284 0.04324912

47

48

T A B L E 2-6 (continued)

Regression Analysis of Guideline Companies

A B C D E F G H I

40 Valuation

41 NI 100,000 200,000 300,000 400,000 500,000 1,000,000

42 In NI 11.5129 12.2061 12.6115 12.8992 13.1224 13.8155

43 X coef¬cient-NI 0.957081978 0.957081978 0.957081978 0.957081978 0.957081978 0.957081978

44 In NI X coef¬cient 11.01881347 11.68221215 12.07027549 12.34561082 12.55917749 13.22257672

45 g 0.05 0.055 0.06 0.065 0.07 0.075

46 1/g 20 18.18181818 16.66666667 15.38461538 14.28571429 13.33333333

47 X coef¬cient-1 / g 0.056021702 0.056021702 0.056021702 0.056021702 0.056021702 0.0560217

48 1/g X coef¬cient 1.120434033 1.018576394 0.933695028 0.861872333 0.800310024 0.746956022

49 Add intercept 3.430881701 3.430881701 3.43088176 3.430881701 3.430881701 3.430881702

50 Total ln FMV 13.329261101 14.09451745 14.56746217 14.91462019 15.18974917 15.90650185

51 FMV $614,928 $1,321,816 $2,121,136 $3,001,492 $3,952,067 $8,092,934

52 PE Ratio 6.149284138 6.609082291 7.070452024 7.50373099 7.904133036 8.09293361

54 95% Con¬dence Intervals

55 2 Standard errors 0.165440158

e2 Std Err

56 1.179912352

e-2 Std Err

57 0.847520579

First we will brie¬‚y describe the regression results for the regression

of FMV against net income. The regression yields an adjusted R 2 of 94.6%

and a t-statistic for the x-coef¬cient of 12.4, which seems to indicate a

successful regression. The regression equation obtained for the complete

data set is:

FMV $1,272,335 (8.3 Net Income)

If we were to use to value a ¬rm with net income of $100,000, the re-

gression would produce a value of $442,000. Something is wrong!

The problem is that the full regression equation is:

FMV a b Net Income ui (2-10)

where ui is an error term, assumed to be normally distributed with an

expected value of zero. Our speci¬c regression equation is:

$1,272,335 (8.3 Net Income) ui (2-11)

The problem is that this error term is additive and likely to be cor-

related to the size of the ¬rm. When that occurs, we have a problem called

˜˜heteroscedasticity.™™

There are two possible solutions to the problem. The ¬rst is to use

weighted least squares (WLS) instead of ordinary least squares regression.

In WLS, we weight the extreme values less than the more mainstream

values. This usually will not produce a usable solution for a privately

held ¬rm that is much smaller than the publicly traded guideline com-

panies.

The second possible solution is to use a log“log speci¬cation. In do-

ing so, we regress the natural logarithm of market capitalization as a

function of the natural logarithm of net income. Its form is:

ln FMVi a bi ln NI ui, i guideline company 1, 2, 3, . . . n

(2-12)

When we take antilogs, the original equation is:

A NIib vi

FMVi (2-13)

e a, vi e ui is Euler™s constant, and the expected value of

where A

vi 1.

In equation (2-13), the regression equation x-coef¬cient, bi, from equa-

tion (2-12) for net income thus becomes an exponent to net income. If

b 1, then size has no scaling effect on the FMV, and we would expect

price earnings ratios to be uncorrelated to size, all other things being

constant. If b 1, then the price earnings multiple should rise with net

income, and the opposite is true of b 1. Relating this to the log size

model in Chapter 4, we would thus expect to ¬nd b 1 because over

long periods of time large ¬rms have lower discount rates than small

¬rms, which means larger values relative to earnings.

Using equation (2-13), consider two identical errors of 20% for ¬rms

i and j, where ¬rm i has net income of $100,000 and ¬rm j has net income

of $200,000. In other words, the error terms vi and vj are both 1.2.19 For

19. This means the error terms ui and uj in equation (2-12) are equal to ln (1.2) 0.182.

CHAPTER 2 Using Regression Analysis 49

simplicity, suppose that b 1 for both ¬rms. The same statistical error in

the log of the fair market value of both ¬rms produces an error in fair

market value that is twice as large in ¬rm j as in ¬rm i. This is a desirable

property, as it corresponds to our intuition that large ¬rms will tend to

have larger absolute deviations from the regression determined values.

Thus, this form of regression is likely to be more successful than equation

(2-10) for valuing small ¬rms.

Equation (2-10) is probably ¬ne for valuing ¬rms of the same size as

the guideline companies. When we apply equation (2“10) to various lev-

els of net income, we ¬nd the forecast FMVs are $442,000, $0 (rounded),

$2.9 million, and $7.0 million for net incomes of $100,000, $154,000,

$500,000, and $1 million. Obviously equation (3-10) works poorly at the

low end. We would also have a similar, but opposite, scaling problem

forecasting value for a ¬rm with net income of $5 billion. The additive

error term restricts the applicability of equation (2-10) to subject compa-

nies of similar size to the guideline companies.

There is an important possible enhancement to the regression equa-

tion, and that is the introduction of forecast growth as an independent

variable. The emergence of the Internet makes it easier to obtain growth

forecasts, although frequently there are no such estimates for smaller pub-

licly traded ¬rms.

For a ¬rm with constant forecast growth, a midyear Gordon model

is its proper valuation equation.

1 r

FMV CFt (2-14)

1

r g

In Chapter 4, we show that New York Stock Exchange returns are nega-

tively related to the natural logarithm of market capitalization (which can

also be referred to as fair market value or size), which means that there

is a nonlinear relationship between return and size. Therefore, the dis-

count rate, r, in equation (2-14) impounds a nonlinear size effect. To the

extent that there is a nonlinear size effect in equation (2-13), we should

hopefully pick that up in the b coef¬cient.

Note that in equation (2-14) there is a growth term, g, which appears

in the denominator of the Gordon model multiple. Thus, it is reasonable

to try 1/g as an additional independent variable in equation (2-13).

Continuing our description of Table 2-6, Column C is net income and

Columns D and E are the natural logarithms of FMV and net income.

These are actual data from a real valuation. Column G shows a growth

rate, and it is not actual data (which were unavailable). Column F is the

inverse of Column G, i.e., 1/g. Thus, Column D is our dependent variable

and Columns E and F are our independent variables.20

Adjusted R 2 is 99.5% (B25), an excellent result. The standard error of

the y-estimate is 0.08272 (B26). The y-intercept is 3.43 (B36) and the x-

coef¬cients for ln NI and 1/g are 0.95708 and “0.05602 (B37, B38), re-

spectively.

20. Electronic spreadsheets require that the independent variables be in contiguous columns.

PART 1 Forecasting Cash Flows

50

On page 2 of Table 2-6, we show valuations for subject companies

with differing levels of net income and expected growth. Row 41 shows

¬rms with net incomes ranging from $100,000 to $1 million. Row 42 is

the natural log of net income.21 We multiply that by the x-coef¬cient for

net income in Row 43, which produces a subtotal in Row 44.

Row 45 contains our forecast of constant growth for the various sub-

ject companies. We are assuming growth of 5% per year for the $100,000

net income ¬rm in Column B, and we increase the growth estimate by

0.5% for each ¬rm. Row 46 is one divided by forecast growth.

In Row 47 we repeat the x-coef¬cient for 1/g from the regression,

and we multiply Row 46 Row 47 Row 48, which is another subtotal.

In Row 49 we repeat the y-intercept from the regression. In Row 50

we add Rows 44, 48, and 49, which is the natural logarithm of the forecast

FMV (at the marketable minority interest level). We must then exponen-

tiate that result, i.e., take the antilog. The Excel formula for B51 is

EXP(B50).22 Finally, we calculate the P/E ratio in Row 52 as Row 51 di-