
following conditions

0 < α ≤ 2,  −1 ≤ β ≤ 1,  γ > 0   (3.3.3)

The parameter μ corresponds to the mean of the stable distribution and can be any real number. The parameter α characterizes the distribution peakedness. If α = 2, the distribution is normal. The parameter β characterizes the skewness of the distribution. Note that the skewness of the normal distribution equals zero, and the parameter β does not affect the characteristic function when α = 2. For the normal distribution

ln FN(q) = iμq − γq²   (3.3.4)

The non-negative parameter γ is the scale factor that characterizes the spread of the distribution. In the case of the normal distribution, γ = σ²/2 (where σ² is the variance). The Cauchy distribution is defined with the parameters α = 1 and β = 0. Its characteristic function equals

ln FC(q) = iμq − γ|q|   (3.3.5)
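As a quick sanity check of (3.3.4), one can compare the sample characteristic function of normal draws with exp(iμq − γq²) at γ = σ²/2. A minimal Python sketch; the parameter values μ = 0.5, σ = 2, q = 0.7 are illustrative choices, not from the text:

```python
import cmath
import random

random.seed(0)
mu, sigma = 0.5, 2.0
gamma = sigma ** 2 / 2            # scale factor of the normal case, gamma = sigma^2/2
xs = [random.gauss(mu, sigma) for _ in range(200_000)]

def empirical_cf(q):
    # Monte Carlo estimate of the characteristic function F(q) = E[exp(iqX)]
    return sum(cmath.exp(1j * q * x) for x in xs) / len(xs)

q = 0.7
theory = cmath.exp(1j * mu * q - gamma * q * q)   # eq. (3.3.4)
estimate = empirical_cf(q)
```

With 200,000 draws the sampling error of the estimate is of order 1/√N ≈ 0.002, so the two complex values agree to roughly two decimal places.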

The important feature of the stable distributions with α < 2 is that they exhibit power-law decay at large absolute values of the argument x

fL(|x|) ~ |x|^(−(1+α))   (3.3.6)

The distributions with power-law asymptotes are also named the Pareto distributions. Many processes exhibit power-law asymptotic behavior. Hence, there has been persistent interest in the stable distributions.
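The α = 1 case can be checked directly: the standard Cauchy density f(x) = 1/(π(1 + x²)) falls off as |x|^(−2), i.e., as |x|^(−(1+α)), in agreement with (3.3.6). A short verification sketch (the probe points are illustrative):

```python
import math

def cauchy_pdf(x):
    # standard Cauchy density: stable with alpha = 1, beta = 0
    return 1.0 / (math.pi * (1.0 + x * x))

alpha = 1
ratios = []
for x in (50.0, 100.0, 500.0):
    # for a pure power law |x|^-(1+alpha), f(10x)/f(x) = 10^-(1+alpha)
    ratios.append(cauchy_pdf(10 * x) / cauchy_pdf(x))
```

Far in the tail each ratio is close to 10^(−2) = 0.01, the signature of the exponent −(1 + α) with α = 1.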

The power-law distributions describe scale-free processes. Scale invariance of a distribution means that it has a similar shape on different scales of the independent variable. Namely, a function f(x) is scale-invariant under the transformation x → ax if there exists a parameter Λ such that

f(x) = Λ f(ax)   (3.3.7)

The solution to equation (3.3.7) is simply the power law

f(x) = x^n   (3.3.8)

where n = −ln(Λ)/ln(a). The power-law function (3.3.8) is scale-free since the ratio f(ax)/f(x) = a^n = 1/Λ does not depend on x. Note that the parameter a is closely related to the fractal dimension of the function f(x). The fractal theory will be discussed in Chapter 6.
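The scale-invariance property (3.3.7) is easy to verify numerically for the power law (3.3.8). In the sketch below, a = 2 and Λ = 0.25 are arbitrary illustrative values:

```python
import math

a, lam = 2.0, 0.25                    # illustrative scale factor a and parameter Lambda
n = -math.log(lam) / math.log(a)      # exponent of the power law (3.3.8)

def f(x):
    return x ** n

# f(x) = Lambda * f(a x) holds at every x > 0, eq. (3.3.7),
# so the ratio f(a x)/f(x) is the same constant on every scale
checks = [(f(x), lam * f(a * x)) for x in (0.5, 1.0, 3.7, 100.0)]
```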

Unfortunately, the moments E[x^n] of the stable processes with power-law asymptotes (i.e., when α < 2) diverge for n ≥ α. As a result, the mean of a stable process is infinite when α ≤ 1. In addition, the variance of a stable process is infinite when α < 2. Therefore, the normal distribution is the only stable distribution with finite mean and finite variance.

The stable distributions have very helpful features for data analysis, such as a flexible description of peakedness and skewness. However, as was mentioned previously, the usage of the stable distributions in financial applications is often restricted because of their infinite variance at α < 2. The compromise that retains the flexibility of the Levy distribution yet yields finite variance is named the truncated Levy flight. This distribution is defined as [2]

fTL(x) = { C fL(x),  −ℓ ≤ x ≤ ℓ
         { 0,        |x| > ℓ          (3.3.9)

In (3.3.9), fL(x) is the Levy distribution, ℓ is the cutoff length, and C is the normalization constant. Sometimes an exponential cutoff is used at large distances [3]

fTL(x) ~ exp(−λ|x|),  λ > 0,  |x| > ℓ   (3.3.10)

Since fTL(x) has finite variance, it converges to the normal distribution according to the central limit theorem.
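A numerical illustration of (3.3.9) for the α = 1 (Cauchy) case, assuming an illustrative cutoff ℓ = 10: the truncated density integrates to one and, unlike the untruncated Cauchy law, has finite variance, computed here both by a Riemann sum and in closed form.

```python
import math

ell = 10.0                                    # illustrative cutoff length
C = math.pi / (2.0 * math.atan(ell))          # normalization constant of eq. (3.3.9)

def f_tl(x):
    # truncated Cauchy density: C * f_L(x) inside [-ell, ell], zero outside
    return C / (math.pi * (1.0 + x * x)) if abs(x) <= ell else 0.0

# midpoint Riemann sums over [-ell, ell]
n_bins = 20_000
dx = 2 * ell / n_bins
grid = [-ell + (k + 0.5) * dx for k in range(n_bins)]
total = sum(f_tl(x) * dx for x in grid)
var = sum(x * x * f_tl(x) * dx for x in grid)
# closed form of the variance for this case: (ell - atan(ell)) / atan(ell)
var_exact = (ell - math.atan(ell)) / math.atan(ell)
```

The finite variance is exactly what lets the central limit theorem take over for sums of truncated Levy variables.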

3.4 REFERENCES FOR FURTHER READING

Feller's textbook is the classical reference on probability theory [1]. The concept of scaling in financial data has been advocated by Mandelbrot since the 1960s (see the collection of his work in [7]). This problem is widely discussed in the current Econophysics literature [2, 3, 8].

3.5 EXERCISES

1. Calculate the correlation coefficients between the prices of

Microsoft (MSFT), Intel (INTC), and Wal-Mart (WMT). Use

monthly closing prices for the period 1994–2003. What do you

think of the opposite signs for some of these coefficients?

2. Familiarize yourself with Microsoft Excel's statistical tools. Assuming that Z is the standard normal distribution: (a) calculate Pr(1 ≤ Z ≤ 3) using the NORMSDIST function; (b) calculate x such that Pr(Z ≤ x) = 0.95 using the NORMSINV function; (c) calculate x such that Pr(Z ≥ x) = 0.15; (d) generate 100 random numbers from the standard normal distribution using Tools/Data Analysis/Random Number Generation. Calculate the sample mean and standard deviation. How do they differ from the theoretical values of μ = 0 and σ = 1, respectively? (e) Do the same for the standard uniform distribution as in (d). (f) Generate 100 normally distributed random numbers x using the function x = NORMSINV(z), where z is taken from a sample of the standard uniform distribution. Explain why this works. Calculate the sample mean and the standard deviation. How do they differ from the theoretical values of μ and σ, respectively?
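For readers working outside Excel, parts (a)-(c) can be reproduced with Python's standard library, where `statistics.NormalDist` plays the role of NORMSDIST/NORMSINV. This is a sketch of the same calculations, not part of the exercise itself:

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)

# (a) Pr(1 <= Z <= 3); Excel: NORMSDIST(3) - NORMSDIST(1)
p_a = Z.cdf(3.0) - Z.cdf(1.0)

# (b) x such that Pr(Z <= x) = 0.95; Excel: NORMSINV(0.95)
x_b = Z.inv_cdf(0.95)

# (c) Pr(Z >= x) = 0.15 is equivalent to Pr(Z <= x) = 0.85
x_c = Z.inv_cdf(0.85)
```

The three values come out near 0.157, 1.645, and 1.036, respectively.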

3. Calculate the mean, standard deviation, excess kurtosis, and skewness for the SPY data sample from Exercise 2.1. Draw the distribution function of this data set in comparison with the standard normal distribution and the standard Cauchy distribution. Compare the results with Figure 3.1.
Hint: (1) Normalize the returns by subtracting their mean and dividing the results by the standard deviation. (2) Calculate the histogram using the Histogram tool of the Data Analysis menu. (3) Divide the histogram frequencies by the product of their sum and the bin size (explain why this is necessary).

4. Let X1 and X2 be two independent copies of the normal random variable X ~ N(μ, σ²). Since X is stable, aX1 + bX2 ~ CX + D. Calculate C and D in terms of the given μ, σ, a, and b.
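A Monte Carlo simulation can be used to check a candidate answer to this exercise. Matching the first two moments of aX1 + bX2 and CX + D suggests C = √(a² + b²) and D = μ(a + b − C); simulation should then show the two sides agreeing in mean and standard deviation. The parameter values below are illustrative:

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, a, b = 0.3, 1.5, 2.0, 0.5    # illustrative parameters

C = math.hypot(a, b)                    # candidate C = sqrt(a^2 + b^2)
D = mu * (a + b - C)                    # candidate D, chosen so the means match

N = 100_000
lhs = [a * random.gauss(mu, sigma) + b * random.gauss(mu, sigma) for _ in range(N)]
rhs = [C * random.gauss(mu, sigma) + D for _ in range(N)]
```

Treat this only as a numerical consistency check of your own derivation, not a substitute for it.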

Chapter 4

Stochastic Processes

Financial variables, such as prices and returns, are random time-dependent variables. The notion of a stochastic process is used to describe their behavior. Specifically, the Wiener process (or Brownian motion) plays the central role in mathematical finance. Section 4.1 begins with the generic path: Markov process → Chapman-Kolmogorov equation → Fokker-Planck equation → Wiener process. This methodology is supplemented with two other approaches in Section 4.2. Namely, the Brownian motion is derived using the Langevin equation and the discrete random walk. Then the basics of stochastic calculus are described. In particular, the stochastic differential equation is defined using Ito's lemma (Section 4.3), and the stochastic integral is given in both the Ito and the Stratonovich forms (Section 4.4). Finally, the notion of martingale, which is widely popular in mathematical finance, is introduced in Section 4.5.

4.1 MARKOV PROCESSES

Consider a process X(t) for which the values x1, x2, ... are measured at times t1, t2, .... Here, the one-dimensional variable x is used for notational simplicity, though the extension to multidimensional systems is trivial. It is assumed that the joint probability density f(x1, t1; x2, t2; ...) exists and defines the system completely. The conditional probability density function is defined as

f(x1, t1; x2, t2; ... xk, tk | xk+1, tk+1; xk+2, tk+2; ...) =
f(x1, t1; x2, t2; ... xk+1, tk+1; ...) / f(xk+1, tk+1; xk+2, tk+2; ...)   (4.1.1)

In (4.1.1) and further in this section, t1 > t2 > ... > tk > tk+1 > ... unless stated otherwise. In the simplest stochastic process, the present has no dependence on the past. The probability density function for such a process equals

f(x1, t1; x2, t2; ...) = f(x1, t1) f(x2, t2) ... = ∏_i f(xi, ti)   (4.1.2)

The Markov process represents the next level of complexity, which embraces an extremely wide class of phenomena. In this process, the future depends on the present but not on the past. Hence, its conditional probability density function equals

f(x1, t1; x2, t2; ... xk, tk | xk+1, tk+1; xk+2, tk+2; ...) =
f(x1, t1; x2, t2; ... xk, tk | xk+1, tk+1)   (4.1.3)

This means that the evolution of the system is determined by the initial condition (i.e., by the value xk+1 at time tk+1). It follows for the Markov process that

f(x1, t1; x2, t2; x3, t3) = f(x1, t1 | x2, t2) f(x2, t2 | x3, t3) f(x3, t3)   (4.1.4)

Using the definition of the conditional probability density, one can introduce the general equation

f(x1, t1 | x3, t3) = ∫ f(x1, t1; x2, t2 | x3, t3) dx2
                   = ∫ f(x1, t1 | x2, t2; x3, t3) f(x2, t2 | x3, t3) dx2   (4.1.5)

For the Markov process,

f(x1, t1 | x2, t2; x3, t3) = f(x1, t1 | x2, t2)   (4.1.6)

Then the substitution of equation (4.1.6) into equation (4.1.5) leads to the Chapman-Kolmogorov equation

f(x1, t1 | x3, t3) = ∫ f(x1, t1 | x2, t2) f(x2, t2 | x3, t3) dx2   (4.1.7)
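For Gaussian transition densities, (4.1.7) can be verified numerically: convolving the transition density from t3 to t2 with the one from t2 to t1 must reproduce the direct density from t3 to t1. A sketch using a Gaussian transition density (cf. the solution (4.1.14) below) with D = 1; the space-time points are illustrative:

```python
import math

def p(x, t, x0, t0, D=1.0):
    # Gaussian transition density with variance D*(t - t0), cf. eq. (4.1.14)
    var = D * (t - t0)
    return math.exp(-(x - x0) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# integrate over the intermediate state x2 at the intermediate time t2
x1, t1, x3, t3, t2 = 0.7, 2.0, -0.2, 0.0, 1.2   # illustrative points, t1 > t2 > t3
n_bins = 24_000
dx = 24.0 / n_bins
integral = sum(p(x1, t1, x2, t2) * p(x2, t2, x3, t3) * dx
               for x2 in (-12.0 + (k + 0.5) * dx for k in range(n_bins)))
direct = p(x1, t1, x3, t3)
```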

The Chapman-Kolmogorov equation (4.1.7) can be used as the starting point for deriving the Fokker-Planck equation (see, e.g., [1] for details). First, equation (4.1.7) is transformed into the differential equation

∂f(x, t | x0, t0)/∂t = −∂/∂x [A(x, t) f(x, t | x0, t0)] + (1/2) ∂²/∂x² [D(x, t) f(x, t | x0, t0)]
  + ∫ [R(x | z, t) f(z, t | x0, t0) − R(z | x, t) f(x, t | x0, t0)] dz   (4.1.8)

In (4.1.8), the drift coefficient A(x, t) and the diffusion coefficient D(x, t) equal

A(x, t) = lim_{Δt→0} (1/Δt) ∫ (z − x) f(z, t + Δt | x, t) dz   (4.1.9)

D(x, t) = lim_{Δt→0} (1/Δt) ∫ (z − x)² f(z, t + Δt | x, t) dz   (4.1.10)

The integral in the right-hand side of the differential Chapman-Kolmogorov equation (4.1.8) is determined by the function

R(x | z, t) = lim_{Δt→0} (1/Δt) f(x, t + Δt | z, t)   (4.1.11)

It describes possible discontinuous jumps of the random variable. Neglecting this term in equation (4.1.8) yields the Fokker-Planck equation

∂f(x, t | x0, t0)/∂t = −∂/∂x [A(x, t) f(x, t | x0, t0)] + (1/2) ∂²/∂x² [D(x, t) f(x, t | x0, t0)]   (4.1.12)

This equation with A(x, t) = 0 and D = const reduces to the diffusion equation that describes the Brownian motion

∂f(x, t | x0, t0)/∂t = (D/2) ∂²f(x, t | x0, t0)/∂x²   (4.1.13)

Equation (4.1.13) has the analytic solution in the Gaussian form

f(x, t | x0, t0) = [2πD(t − t0)]^(−1/2) exp[−(x − x0)²/(2D(t − t0))]   (4.1.14)

The mean and variance of the distribution (4.1.14) equal

E[x(t)] = x0,  Var[x(t)] = E[(x(t) − x0)²] = σ² = D(t − t0)   (4.1.15)

The diffusion equation (4.1.13) with D = 1 describes the standard Wiener process, for which

E[(x(t) − x0)²] = t − t0   (4.1.16)
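One can confirm by finite differences that the Gaussian density (4.1.14) indeed solves the diffusion equation (4.1.13); the parameter values and evaluation point below are illustrative:

```python
import math

D = 0.8                                 # illustrative diffusion coefficient
x0, t0 = 0.0, 0.0

def f(x, t):
    # Gaussian solution (4.1.14)
    var = D * (t - t0)
    return math.exp(-(x - x0) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# central differences: df/dt should match (D/2) d2f/dx2, eq. (4.1.13)
x, t, h = 0.6, 1.5, 1e-4
df_dt = (f(x, t + h) - f(x, t - h)) / (2 * h)
d2f_dx2 = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h ** 2
residual = df_dt - 0.5 * D * d2f_dx2
```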


The notions of the generic Wiener process and the Brownian motion are sometimes used interchangeably, though there are some fine differences in their definitions [2, 3]. I shall denote the Wiener process by W(t) and reserve this term for the standard version (4.1.16), as is often done in the literature.

The Brownian motion is a classical topic of statistical physics. Different approaches to introducing this process are described in the next section.

4.2 BROWNIAN MOTION

In mathematical statistics, the notion of the Brownian motion is used for describing a generic stochastic process. Yet, this term referred originally to Brown's observation of the random motion of pollen in water. Random particle motion in a fluid can be described using different theoretical approaches. Einstein's original theory of the Brownian motion implicitly employs both the Chapman-Kolmogorov equation and the Fokker-Planck equation [1]. However, choosing either one of these equations as the starting point leads to the diffusion equation. Langevin offered another simple method for deriving the Fokker-Planck equation. He considered one-dimensional motion of a spherical particle of mass m and radius R that is subjected to two forces. The first force is the viscous drag force described by the Stokes formula, F = −6πηRv, where η is the viscosity and v = dr/dt is the particle velocity. The other force, Z, describes collisions of the water molecules with the particle and therefore has a random nature. The Langevin equation of the particle motion is

m dv/dt = −6πηRv + Z   (4.2.1)

Let us multiply both sides of equation (4.2.1) by r. Since r dv/dt = d(rv)/dt − v² and rv = (1/2) d(r²)/dt, then

(m/2) d²(r²)/dt² − m (dr/dt)² = −3πηR d(r²)/dt + Zr   (4.2.2)

Note that the mean kinetic energy of a spherical particle, E[mv²/2], equals 3kT/2. Since E[Zr] = 0 due to the random nature of Z, averaging of equation (4.2.2) yields

m d²E[r²]/dt² + 6πηR dE[r²]/dt = 6kT   (4.2.3)

The solution to equation (4.2.3) is

dE[r²]/dt = kT/(πηR) + C exp(−6πηRt/m)   (4.2.4)

where C is an integration constant. The second term in equation (4.2.4) decays exponentially and can be neglected in the asymptotic solution. Then

E[r²] − r0² = [kT/(πηR)] t   (4.2.5)

where r0 is the particle position at t = 0. It follows from the comparison of equations (4.2.5) and (4.1.15) that D = kT/(πηR).¹

The Brownian motion can also be derived as the continuous limit of the discrete random walk (see, e.g., [3]). First, let us introduce the process ε(t) that is named the white noise and satisfies the following conditions

E[ε(t)] = 0;  E[ε²(t)] = σ²;  E[ε(t)ε(s)] = 0 if t ≠ s.   (4.2.6)

Hence, the white noise has zero mean and constant variance σ². The last condition in (4.2.6) implies that there is no linear correlation between different observations of the white noise. Such a model represents an independently and identically distributed (IID) process and is sometimes denoted IID(0, σ²). The IID process can still have non-linear correlations (see Section 5.3). The normal distribution N(0, σ²) is a special case of the white noise. First, consider a simple discrete process

y(k) = y(k − 1) + ε(k)   (4.2.7)

where the white noise innovations can take only two values²

ε(k) = { Δ,   with probability p, p = const < 1
       { −Δ,  with probability (1 − p)          (4.2.8)

Now, let us introduce the continuous process yn(t) within the time interval t ∈ [0, T], such that

yn(t) = y([t/h]) = y([nt/T]),  t ∈ [0, T]   (4.2.9)


In (4.2.9), [x] denotes the greatest integer that does not exceed x, and h = T/n is the time step. The process yn(t) has a stepwise form: it is constant except at the moments t = kh, k = 1, ..., n. The mean and variance of the process yn(T) equal

E[yn(T)] = n(2p − 1)Δ = T(2p − 1)Δ/h   (4.2.10)

Var[yn(T)] = nΔ² = TΔ²/h   (4.2.11)

Both the mean (4.2.10) and variance (4.2.11) become infinite in the limiting case h → 0 with arbitrary Δ. Hence, we must impose a relation between Δ and h that ensures finite values of the moments E[yn(T)] and Var[yn(T)]. Namely, let us set

p = (1 + μ√h/σ)/2,  Δ = σ√h   (4.2.12)

where μ and σ are some parameters. Then

E[yn(T)] = μT,  Var[yn(T)] = σ²T   (4.2.13)

It can be shown that yn(T) converges to the normal distribution N(μT, σ²T) in the continuous limit. Hence, μ and σ are the drift and diffusion parameters, respectively. Obviously, the drift parameter differs from zero only when p ≠ 0.5, that is, when there is a preference for one direction of innovations over another. The continuous process defined with the relations (4.2.13) is named the arithmetic Brownian motion. It reduces to the Wiener process when μ = 0 and σ = 1.
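The convergence (4.2.13) is easy to see in simulation: build the walk (4.2.7)-(4.2.8) with p and Δ chosen per (4.2.12), then compare the sample mean and variance of yn(T) with μT and σ²T. The parameters below are illustrative:

```python
import random
import statistics

random.seed(7)
mu, sigma, T, n = 0.4, 1.0, 1.0, 200     # illustrative drift, diffusion, horizon, steps
h = T / n
p = (1 + mu * h ** 0.5 / sigma) / 2      # eq. (4.2.12)
delta = sigma * h ** 0.5

def walk():
    # binomial random walk (4.2.7)-(4.2.8), observed at the final time T
    y = 0.0
    for _ in range(n):
        y += delta if random.random() < p else -delta
    return y

samples = [walk() for _ in range(10_000)]
m, v = statistics.mean(samples), statistics.variance(samples)
```

With 10,000 paths, m lands near μT = 0.4 and v near σ²T = 1 up to sampling noise.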

Note that in a more generic approach, the time intervals between observations of y(t) themselves represent a random variable [4, 5]. While this process (the so-called continuous-time random walk) better resembles market price variations, its description is beyond the scope of this book.

In the general case, the arithmetic Brownian motion can be expressed in the following form

y(t) = μ(t) t + σ(y(t), t) W(t)   (4.2.14)

The random variable in this process may have negative values. This creates a problem for describing prices, which are essentially positive. Therefore, the geometric Brownian motion Y(t) = exp[y(t)] is often used in financial applications.
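A short sketch of the arithmetic-to-geometric construction (the drift, volatility, and step count are illustrative): simulating y(t) step by step and exponentiating keeps Y(t) = exp[y(t)] strictly positive even when y(t) itself goes negative.

```python
import math
import random

random.seed(5)
mu, sigma, T, n = 0.05, 0.3, 1.0, 252     # illustrative drift, volatility, horizon, steps
dt = T / n

y, path = 0.0, []
for _ in range(n):
    # arithmetic Brownian increment: dy = mu*dt + sigma*dW, with dW = N(0,1)*sqrt(dt)
    y += mu * dt + sigma * random.gauss(0.0, 1.0) * math.sqrt(dt)
    path.append(math.exp(y))              # geometric Brownian motion Y(t) = exp[y(t)]
```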

One can simulate the Wiener process with the following equation

[W(t + Δt) − W(t)] ≡ ΔW = N(0, 1) √Δt   (4.2.15)

While the Wiener process is a continuous process, its innovations are random. Therefore, the limit of the expression ΔW/Δt does not converge when Δt → 0. Indeed, it follows for the Wiener process that

lim_{Δt→0} [ΔW(t)/Δt] = lim_{Δt→0} [Δt^(−1/2)]   (4.2.16)

As a result, the derivative dW(t)/dt does not exist in the ordinary sense. Thus, one needs a special calculus to describe stochastic processes.
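The divergence (4.2.16) shows up directly in simulation: sampling ΔW via (4.2.15) and averaging |ΔW/Δt|, the difference quotient grows like Δt^(−1/2) as the step shrinks. The step sizes and sample count below are illustrative:

```python
import math
import random
import statistics

random.seed(3)

def mean_abs_quotient(dt, n=20_000):
    # average |dW/dt| for Wiener increments dW = N(0,1)*sqrt(dt), eq. (4.2.15)
    return statistics.mean(abs(random.gauss(0.0, 1.0)) * math.sqrt(dt) / dt
                           for _ in range(n))

# E|dW/dt| = sqrt(2/(pi*dt)): shrinking dt by a factor of 100 grows it about 10x
r_coarse = mean_abs_quotient(1e-2)
r_fine = mean_abs_quotient(1e-4)
```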

4.3 STOCHASTIC DIFFERENTIAL EQUATION

The Brownian motion (4.2.14) can be presented in the differential form³

dy(t) = μ dt + σ dW(t)   (4.3.1)

Equation (4.3.1) is named the stochastic differential equation. Note that the term dW(t) = [W(t + dt) − W(t)] has the following properties

E[dW] = 0,  E[dW dW] = dt,  E[dW dt] = 0   (4.3.2)

Let us calculate (dy)², having in mind (4.3.2) and retaining the terms O(dt):⁴

(dy)² = [μ dt + σ dW]² = μ² dt² + 2μσ dt dW + σ² dW² ≈ σ² dt   (4.3.3)

It follows from (4.3.3) that while dy is a random variable, (dy)2 is a
