In (3.3.2), d = q/|q|, and the distribution parameters must satisfy the
following conditions:

0 < α ≤ 2,   −1 ≤ β ≤ 1,   γ > 0   (3.3.3)
The parameter μ corresponds to the mean of the stable distribution
and can be any real number. The parameter α characterizes the
peakedness of the distribution. If α = 2, the distribution is normal. The
parameter β characterizes the skewness of the distribution. Note that
the skewness of the normal distribution equals zero, and the parameter
β does not affect the characteristic function when α = 2. For the
normal distribution,

ln F_N(q) = iμq − γq²   (3.3.4)
The non-negative parameter γ is the scale factor that characterizes the
spread of the distribution. In the case of the normal distribution,
γ = σ²/2 (where σ² is the variance). The Cauchy distribution is defined
with the parameters α = 1 and β = 0. Its characteristic function is

ln F_C(q) = iμq − γ|q|   (3.3.5)
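The normal case (3.3.4) is easy to check numerically: the empirical characteristic function of a Gaussian sample should match exp(iμq − γq²) with γ = σ²/2. The following sketch does this with NumPy; the values of μ, σ, and the grid of q points are illustrative choices, not taken from the text.

```python
import numpy as np

# Check (3.3.4): for X ~ N(mu, sigma^2), the characteristic function
# F_N(q) = E[exp(i*q*X)] equals exp(i*mu*q - gamma*q^2) with gamma = sigma^2/2.
# mu, sigma, and the q grid below are illustrative choices.
rng = np.random.default_rng(0)
mu, sigma = 0.5, 2.0
gamma = sigma**2 / 2
x = rng.normal(mu, sigma, size=200_000)

q = np.linspace(-1.0, 1.0, 9)
empirical = np.array([np.exp(1j * qi * x).mean() for qi in q])
theoretical = np.exp(1j * mu * q - gamma * q**2)

max_err = np.abs(empirical - theoretical).max()
print(max_err)  # small sampling error only
```

The agreement is limited only by the Monte Carlo error of order n^(−1/2).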
An important feature of the stable distributions with α < 2 is that
they exhibit power-law decay at large absolute values of the
argument x:

f_L(|x|) ~ |x|^(−(1+α))   (3.3.6)

Distributions with power-law asymptotes are also named Pareto
distributions. Many processes exhibit power-law asymptotic
behavior; hence, there has been persistent interest in the stable
distributions.
The power-law distributions describe scale-free processes. Scale
invariance of a distribution means that it has a similar shape on
different scales of the independent variable. Namely, the function f(x) is
scale-invariant under the transformation x → ax if there is a parameter
Λ such that

f(x) = Λ f(ax)   (3.3.7)

The solution to equation (3.3.7) is simply the power law

f(x) = x^n   (3.3.8)

where n = −ln(Λ)/ln(a). The power-law function f(x) in (3.3.8) is scale-
free since the ratio f(x)/f(ax) = Λ does not depend on x. Note that the
parameter a is closely related to the fractal dimension of the function
f(x). Fractal theory will be discussed in Chapter 6.
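The scale-free property of (3.3.8) can be verified in a few lines: the ratio f(x)/f(ax) is the same constant Λ at every x, and n can be recovered from Λ via n = −ln(Λ)/ln(a). The exponent and scale factor below are arbitrary illustrative values.

```python
import numpy as np

# Scale invariance of the power law f(x) = x^n: for any rescaling x -> a*x,
# f(x)/f(a*x) = a^(-n) = Lambda, independent of x. a and n are illustrative.
a, n = 3.0, -1.5
f = lambda x: x**n

x = np.linspace(1.0, 100.0, 500)
ratio = f(x) / f(a * x)        # constant over the whole grid
Lambda = a**(-n)               # the predicted constant

n_recovered = -np.log(Lambda) / np.log(a)
print(Lambda, n_recovered)
```

Any non-power-law f (for example, an exponential) fails this test: the ratio then depends on x.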
Unfortunately, the moments E[xⁿ] of stable processes with power-
law asymptotes (i.e., when α < 2) diverge for n ≥ α. As a result, the
mean of a stable process is infinite when α ≤ 1. In addition, the variance
of a stable process is infinite when α < 2. Therefore, the normal
distribution is the only stable distribution with finite mean and finite
variance.
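The divergence of moments for n ≥ α can be made concrete with the Cauchy case (α = 1), whose density is known in closed form. The truncated moment I_n(L) = ∫ from −L to L of |x|ⁿ f(x) dx settles to a finite value as L grows when n < α, but keeps growing (like ln L for n = 1) when n ≥ α. This is a sketch, not part of the original text.

```python
import numpy as np

# Divergent moments of a stable law with alpha < 2, illustrated with the
# standard Cauchy density f(x) = 1/(pi*(1 + x^2)) (alpha = 1): the truncated
# moment I_n(L) converges in L for n < alpha but diverges for n >= alpha.
def trunc_moment(n, L):
    x = np.logspace(-8, np.log10(L), 200_001)   # x > 0; the integrand -> 0 at 0
    integrand = x**n / (np.pi * (1.0 + x**2))
    # trapezoidal rule on the (nonuniform) log-spaced grid, doubled for symmetry
    return 2.0 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))

I1_a, I1_b = trunc_moment(1.0, 1e3), trunc_moment(1.0, 1e6)    # n = alpha = 1
I05_a, I05_b = trunc_moment(0.5, 1e3), trunc_moment(0.5, 1e6)  # n = 0.5 < alpha

print(I1_a, I1_b)    # roughly doubles between L = 1e3 and 1e6: grows like ln(L)
print(I05_a, I05_b)  # nearly unchanged: this moment is finite
```

For n = 1 the exact value is I₁(L) = ln(1 + L²)/π, which the numerical integral reproduces.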
The stable distributions have very helpful features for data analysis,
such as a flexible description of peakedness and skewness. However, as
was mentioned previously, the use of the stable distributions in
financial applications is often restricted because of their infinite
variance at α < 2. A compromise that retains the flexibility of the Levy
distribution yet yields finite variance is named the truncated Levy flight.
This distribution is defined as [2]

f_TL(x) = 0,          |x| > ℓ
f_TL(x) = C f_L(x),   −ℓ ≤ x ≤ ℓ   (3.3.9)

In (3.3.9), f_L(x) is the Levy distribution, ℓ is the cutoff length, and C is
the normalization constant.
Sometimes an exponential cutoff is used
at large distances [3]:

f_TL(x) ~ exp(−λ|x|),   λ > 0,   |x| > ℓ   (3.3.10)
Since f_TL(x) has finite variance, it converges to the normal distribu-
tion according to the central limit theorem.
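A concrete instance of (3.3.9) can be built from the Cauchy case, whose Levy density is known in closed form. The sketch below computes the normalization constant C and shows that the truncated density has finite variance, in contrast with the untruncated Cauchy distribution; the cutoff length ℓ = 50 is an arbitrary illustrative choice.

```python
import numpy as np

# Truncated Levy flight (3.3.9) built from the Cauchy density (alpha = 1,
# gamma = 1): f_L(x) = 1/(pi*(1 + x^2)). C restores unit total mass on
# [-ell, ell]; the resulting variance is finite thanks to the cutoff.
ell = 50.0
x = np.linspace(-ell, ell, 400_001)
f_L = 1.0 / (np.pi * (1.0 + x**2))

def integrate(y, x):
    # trapezoidal rule
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

C = 1.0 / integrate(f_L, x)           # analytically C = 1/((2/pi)*arctan(ell))
f_TL = C * f_L
mass = integrate(f_TL, x)
variance = integrate(x**2 * f_TL, x)  # finite; diverges without the cutoff

print(C, mass, variance)
```

As ℓ grows, C → 1 while the variance grows roughly linearly in ℓ, showing how the cutoff tames the fat Cauchy tails.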

Feller's textbook is the classical reference for probability
theory [1]. The concept of scaling in financial data has been advocated
by Mandelbrot since the 1960s (see the collection of his work in [7]).
This problem is widely discussed in the current Econophysics litera-
ture [2, 3, 8].

1. Calculate the correlation coefficients between the prices of
Microsoft (MSFT), Intel (INTC), and Wal-Mart (WMT). Use
monthly closing prices for the period 1994–2003. What do you
think of the opposite signs for some of these coefficients?
2. Familiarize yourself with Microsoft Excel's statistical tools. As-
suming that Z is the standard normal distribution: (a) calculate
Pr(1 ≤ Z ≤ 3) using the NORMSDIST function; (b) calculate x
such that Pr(Z ≤ x) = 0.95 using the NORMSINV function; (c)
calculate x such that Pr(Z ≥ x) = 0.15; (d) generate 100 random
numbers from the standard normal distribution using Tools/
Data Analysis/Random Number Generation. Calculate the
sample mean and standard deviation. How do they differ from
the theoretical values of μ = 0 and σ = 1, respectively? (e) Do
the same for the standard uniform distribution as in (d).

(f) Generate 100 normally distributed random numbers x using
the function x = NORMSINV(z), where z is taken from a sample
of the standard uniform distribution. Explain why this is possible.
Calculate the sample mean and the standard deviation. How do
they differ from the theoretical values of μ and σ, respectively?
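The trick behind part (f) is the inverse transform method: if Z is uniform on (0, 1) and F is the standard normal CDF, then Pr(F⁻¹(Z) ≤ x) = Pr(Z ≤ F(x)) = F(x), so F⁻¹(Z) is standard normal. The sketch below demonstrates this with the Python standard library, where `statistics.NormalDist().inv_cdf` plays the role of Excel's NORMSINV; sample size and seed are illustrative.

```python
import random
import statistics

# Inverse transform sampling: feed standard uniform draws through the
# inverse standard normal CDF (Excel's NORMSINV) to get normal draws.
random.seed(1)
inv_cdf = statistics.NormalDist().inv_cdf

z = [random.random() for _ in range(1000)]   # standard uniform sample
x = [inv_cdf(zi) for zi in z]                # now distributed as N(0, 1)

m = statistics.fmean(x)
s = statistics.stdev(x)
print(m, s)  # close to the theoretical mu = 0 and sigma = 1
```

The same argument works for any distribution with an invertible CDF, which is why inverse transform sampling is a general-purpose recipe.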
3. Calculate the mean, standard deviation, excess kurtosis, and skewness
for the SPY data sample from Exercise 2.1. Draw the distribu-
tion function of this data set in comparison with the standard
normal distribution and the standard Cauchy distribution.
Compare the results with Figure 3.1.
Hint: (1) Normalize the returns by subtracting their mean and divid-
ing the results by the standard deviation. (2) Calculate the histo-
gram using the Histogram tool of the Data Analysis menu. (3)
Divide the histogram frequencies by the product of their sum and
the bin size (explain why this is necessary).
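Hint (3) amounts to converting raw counts into a probability density: after dividing by (sum of counts × bin size), the histogram bars integrate to one and can be overlaid on the normal and Cauchy densities. A minimal sketch, using an arbitrary normal sample in place of the SPY returns:

```python
import numpy as np

# Histogram-to-density normalization from hint (3): a histogram estimates a
# density only once its bars integrate to unity. The sample below is an
# illustrative stand-in for the normalized SPY returns.
rng = np.random.default_rng(2)
returns = rng.normal(0.0, 1.0, size=5000)
returns = (returns - returns.mean()) / returns.std()   # hint (1): normalize

counts, edges = np.histogram(returns, bins=40)
bin_size = edges[1] - edges[0]
density = counts / (counts.sum() * bin_size)           # hint (3)

total_area = (density * bin_size).sum()
print(total_area)  # 1.0: the rescaled histogram integrates to unity
```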
4. Let X₁ and X₂ be two independent copies of the normal random
variable X ~ N(μ, σ²). Since X is stable, aX₁ + bX₂ ~ CX + D.
Calculate C and D in terms of the given μ, σ, a, and b.
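Once a candidate answer is derived (for a normal variable it suffices to match the first two moments, which gives C = √(a² + b²) and D = (a + b − C)μ), it can be checked by simulation. The parameter values below are arbitrary illustrative choices; treat this as a sanity check of the algebra, not a substitute for the derivation.

```python
import math
import random

# Monte Carlo check of Exercise 4: both a*X1 + b*X2 and C*X + D are normal,
# so equality in distribution reduces to matching their means and variances.
random.seed(3)
mu, sigma, a, b = 1.5, 2.0, 0.7, 1.3

C = math.sqrt(a * a + b * b)      # matches the variances: (a^2+b^2)*sigma^2
D = (a + b - C) * mu              # matches the means: (a+b)*mu = C*mu + D

n = 100_000
lhs = [a * random.gauss(mu, sigma) + b * random.gauss(mu, sigma) for _ in range(n)]
rhs = [C * random.gauss(mu, sigma) + D for _ in range(n)]

mean_lhs = sum(lhs) / n
mean_rhs = sum(rhs) / n
var_lhs = sum((v - mean_lhs) ** 2 for v in lhs) / n
var_rhs = sum((v - mean_rhs) ** 2 for v in rhs) / n
print(mean_lhs, mean_rhs, var_lhs, var_rhs)
```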
Chapter 4

Stochastic Processes

Financial variables, such as prices and returns, are random time-
dependent variables. The notion of a stochastic process is used to de-
scribe their behavior. Specifically, the Wiener process (or the Brownian
motion) plays the central role in mathematical finance. Section 4.1
begins with the generic path: Markov process → Chapman-Kolmo-
gorov equation → Fokker-Planck equation → Wiener process. This
methodology is supplemented with two other approaches in Section
4.2. Namely, the Brownian motion is derived using the Langevin
equation and the discrete random walk. Then the basics of stochastic
calculus are described. In particular, the stochastic differential equa-
tion is defined using Ito's lemma (Section 4.3), and the stochastic
integral is given in both the Ito and the Stratonovich forms
(Section 4.4). Finally, the notion of a martingale, which is widely popu-
lar in mathematical finance, is introduced in Section 4.5.

Consider a process X(t) for which the values x1, x2, . . . are measured
at times t1, t2, . . . Here, a one-dimensional variable x is used
for notational simplicity, though extension to multidimensional
systems is trivial. It is assumed that the joint probability density
f(x1, t1; x2, t2; . . .) exists and defines the system completely. The con-
ditional probability density function is defined as


f(x1, t1; x2, t2; . . . xk, tk | xk+1, tk+1; xk+2, tk+2; . . .) =
f(x1, t1; x2, t2; . . . xk+1, tk+1; . . .)/f(xk+1, tk+1; xk+2, tk+2; . . .)   (4.1.1)

In (4.1.1) and further in this section, t1 > t2 > . . . > tk > tk+1 > . . .
unless stated otherwise. In the simplest stochastic process, the present
has no dependence on the past. The probability density function for
such a process equals

f(x1, t1; x2, t2; . . .) = f(x1, t1) f(x2, t2) . . . f(xi, ti) . . .   (4.1.2)

The Markov process represents the next level of complexity, which
embraces an extremely wide class of phenomena. In this process, the
future depends on the present but not on the past. Hence, its condi-
tional probability density function equals

f(x1, t1; x2, t2; . . . xk, tk | xk+1, tk+1; xk+2, tk+2; . . .) =
f(x1, t1; x2, t2; . . . xk, tk | xk+1, tk+1)   (4.1.3)

This means that the evolution of the system is determined by the initial
condition (i.e., by the value xk+1 at time tk+1). It follows for the
Markov process that

f(x1, t1; x2, t2; x3, t3) = f(x1, t1 | x2, t2) f(x2, t2 | x3, t3) f(x3, t3)   (4.1.4)

Using the definition of the conditional probability density, one can
introduce the general equation

f(x1, t1 | x3, t3) = ∫ f(x1, t1; x2, t2 | x3, t3) dx2
                  = ∫ f(x1, t1 | x2, t2; x3, t3) f(x2, t2 | x3, t3) dx2   (4.1.5)

For the Markov process,

f(x1, t1 | x2, t2; x3, t3) = f(x1, t1 | x2, t2)   (4.1.6)

Then the substitution of equation (4.1.6) into equation (4.1.5) leads to
the Chapman-Kolmogorov equation

f(x1, t1 | x3, t3) = ∫ f(x1, t1 | x2, t2) f(x2, t2 | x3, t3) dx2   (4.1.7)
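Equation (4.1.7) can be verified numerically for Gaussian transition densities of the form (4.1.14) with D = 1: integrating the product of two transition densities over the intermediate point x2 must reproduce the direct transition density. The times and end points below are arbitrary illustrative choices.

```python
import numpy as np

# Numerical check of the Chapman-Kolmogorov equation (4.1.7) with Gaussian
# transition densities f(x, t | x0, t0) of a diffusion with D = 1.
def gauss_td(x, t, x0, t0, D=1.0):
    var = D * (t - t0)
    return np.exp(-(x - x0) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

t1, t2, t3 = 3.0, 2.0, 0.0   # t1 > t2 > t3, as in the text
x1, x3 = 1.0, -0.5

x2 = np.linspace(-30, 30, 60_001)
integrand = gauss_td(x1, t1, x2, t2) * gauss_td(x2, t2, x3, t3)
lhs = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x2))  # trapezoid
rhs = gauss_td(x1, t1, x3, t3)
print(lhs, rhs)  # the two sides agree
```

This is just the statement that the convolution of two Gaussians is a Gaussian whose variance is the sum of the two variances.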

This equation can be used as the starting point for deriving the
Fokker-Planck equation (see, e.g., [1] for details). First, equation
(4.1.7) is transformed into the differential equation

∂f(x, t | x0, t0)/∂t = −(∂/∂x)[A(x, t) f(x, t | x0, t0)] + (1/2)(∂²/∂x²)[D(x, t) f(x, t | x0, t0)]
    + ∫ [R(x|z, t) f(z, t | x0, t0) − R(z|x, t) f(x, t | x0, t0)] dz   (4.1.8)

In (4.1.8), the drift coefficient A(x, t) and the diffusion coefficient
D(x, t) equal

A(x, t) = lim_{Δt→0} (1/Δt) ∫ (z − x) f(z, t + Δt | x, t) dz   (4.1.9)

D(x, t) = lim_{Δt→0} (1/Δt) ∫ (z − x)² f(z, t + Δt | x, t) dz   (4.1.10)

The integral on the right-hand side of the Chapman-Kolmogorov
equation (4.1.8) is determined by the function

R(x|z, t) = lim_{Δt→0} (1/Δt) f(x, t + Δt | z, t)   (4.1.11)

It describes possible discontinuous jumps of the random variable. Neg-
lecting this term in equation (4.1.8) yields the Fokker-Planck equation

∂f(x, t | x0, t0)/∂t = −(∂/∂x)[A(x, t) f(x, t | x0, t0)] + (1/2)(∂²/∂x²)[D(x, t) f(x, t | x0, t0)]   (4.1.12)

This equation with A(x, t) = 0 and D = const reduces to the
diffusion equation that describes the Brownian motion:

∂f(x, t | x0, t0)/∂t = (D/2) ∂²f(x, t | x0, t0)/∂x²   (4.1.13)
Equation (4.1.13) has an analytic solution in the Gaussian form

f(x, t | x0, t0) = [2πD(t − t0)]^(−1/2) exp[−(x − x0)²/(2D(t − t0))]   (4.1.14)

The mean and variance for the distribution (4.1.14) equal

E[x(t)] = x0,   Var[x(t)] = E[(x(t) − x0)²] = σ² = D(t − t0)   (4.1.15)

The diffusion equation (4.1.13) with D = 1 describes the standard
Wiener process, for which

E[(x(t) − x0)²] = t − t0   (4.1.16)
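The moments (4.1.15)–(4.1.16) are easy to reproduce by simulating many independent paths of the standard Wiener process as sums of Gaussian increments. The number of paths, time step, and starting point x0 below are illustrative choices.

```python
import numpy as np

# Monte Carlo check of (4.1.15)-(4.1.16): paths of the standard Wiener
# process (D = 1) started at x0 have mean x0 and variance t - t0.
rng = np.random.default_rng(4)
n_paths, n_steps, dt, x0 = 100_000, 100, 0.01, 0.3

increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
x_final = x0 + increments.sum(axis=1)   # x(t) after t - t0 = n_steps*dt = 1.0

mean_est = x_final.mean()
var_est = x_final.var()
print(mean_est, var_est)  # close to x0 = 0.3 and t - t0 = 1.0
```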

The notions of the generic Wiener process and the Brownian motion
are sometimes used interchangeably, though there are some fine
differences in their definitions [2, 3]. I shall denote the Wiener process
with W(t) and reserve this term for the standard version (4.1.16), as it
is often done in the literature.
The Brownian motion is the classical topic of statistical physics.
Different approaches for introducing this process are described in the
next section.

In mathematical statistics, the notion of the Brownian motion is
used for describing a generic stochastic process. Yet, this term
referred originally to Brown's observation of the random motion of
pollen in water. Random particle motion in a fluid can be described
using different theoretical approaches. Einstein's original theory of
the Brownian motion implicitly employs both the Chapman-Kolmo-
gorov equation and the Fokker-Planck equation [1]. However, choos-
ing either one of these theories as the starting point can lead to the
diffusion equation. Langevin offered another simple method for de-
riving the Fokker-Planck equation. He considered the motion
of a spherical particle of mass m and radius R that is subjected
to two forces. The first force is the viscous drag force described by the
Stokes formula, F = −6πηRv, where η is the viscosity and v = dr/dt is the
particle velocity. The other force, Z, describes collisions of the water
molecules with the particle and therefore has a random nature. The
Langevin equation of the particle motion is

m dv/dt = −6πηRv + Z   (4.2.1)
Let us multiply both sides of equation (4.2.1) by r. Since

r dv/dt = d(rv)/dt − v²   and   rv = (1/2) d(r²)/dt,

then

(m/2) d²(r²)/dt² − m (dr/dt)² = −3πηR d(r²)/dt + Zr   (4.2.2)

Note that the mean kinetic energy of a spherical particle, E[(1/2)mv²],
equals (3/2)kT. Since E[Zr] = 0 due to the random nature of Z, averaging
of equation (4.2.2) yields

m d²E[r²]/dt² + 6πηR dE[r²]/dt = 6kT   (4.2.3)

The solution to equation (4.2.3) is

E[r²] = [kT/(πηR)]t + C exp(−6πηRt/m)   (4.2.4)

where C is an integration constant. The second term in equation
(4.2.4) decays exponentially and can be neglected in the asymptotic
solution. Then

E[r²] − r0² = [kT/(πηR)]t   (4.2.5)

where r0 is the particle position at t = 0. It follows from the compari-
son of equations (4.2.5) and (4.1.15) that D = kT/(πηR).¹
The Brownian motion can also be derived as the continuous limit
of the discrete random walk (see, e.g., [3]). First, let us introduce the
process ε(t) that is named the white noise and satisfies the following
conditions:

E[ε(t)] = 0;   E[ε²(t)] = σ²;   E[ε(t) ε(s)] = 0 if t ≠ s   (4.2.6)

Hence, the white noise has zero mean and constant variance σ². The
last condition in (4.2.6) implies that there is no linear correlation
between different observations of the white noise. Such a model repre-
sents an independently and identically distributed (IID) process and is
sometimes denoted IID(0, σ²). An IID process can still have non-
linear correlations (see Section 5.3). The normal distribution N(0, σ²)
is a special case of the white noise. Now, consider a simple discrete
random walk

y(k) = y(k − 1) + ε(k)   (4.2.7)
where the white noise innovations can take only two values²:

ε(k) = Δ,    with probability p, p = const < 1
ε(k) = −Δ,   with probability (1 − p)   (4.2.8)

Now, let us introduce the continuous process yn(t) within the time
interval t ∈ [0, T], such that

yn(t) = y([t/h]) = y([nt/T]),   t ∈ [0, T]   (4.2.9)

In (4.2.9), [x] denotes the greatest integer that does not exceed x, and
h = T/n is the time step. The process yn(t) has a stepwise form: it is
constant except at the moments t = kh, k = 1, . . . , n. The mean and
variance of the process yn(T) equal

E[yn(T)] = n(2p − 1)Δ = T(2p − 1)Δ/h   (4.2.10)

Var[yn(T)] = nΔ² = TΔ²/h   (4.2.11)

Both the mean (4.2.10) and variance (4.2.11) become infinite in the
limiting case h → 0 with arbitrary Δ. Hence, we must impose a rela-
tion between Δ and h that ensures finite values of the moments
E[yn(T)] and Var[yn(T)]. Namely, let us set

p = (1 + μ√h/σ)/2,   Δ = σ√h   (4.2.12)

where μ and σ are some parameters. Then

E[yn(T)] = μT,   Var[yn(T)] = σ²T   (4.2.13)

It can be shown that yn(T) converges to the normal distribution
N(μT, σ²T) in the continuous limit. Hence, μ and σ are the drift
and diffusion parameters, respectively. Obviously, the drift parameter
differs from zero only when p ≠ 0.5, that is, when there is a preference
for one direction of innovations over the other. The continuous process
defined with the relations (4.2.13) is named the arithmetic Brownian
motion. It is reduced to the Wiener process when μ = 0 and σ = 1.
Note that in a more generic approach, the time intervals between
observations of y(t) themselves represent a random variable [4, 5].
While this process (so-called continuous-time random walk) better
resembles the market price variations, its description is beyond the
scope of this book.
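The scaling choice (4.2.12) can be checked directly: simulate many binomial walks with step ±Δ and verify that the terminal mean and variance approach μT and σ²T as promised by (4.2.13). The parameter values below are illustrative choices.

```python
import numpy as np

# Check of (4.2.10)-(4.2.13): a binomial walk with step +Delta with
# probability p = (1 + mu*sqrt(h)/sigma)/2 and -Delta otherwise, where
# Delta = sigma*sqrt(h), has terminal mean mu*T and variance close to
# sigma^2*T (exactly sigma^2*T*(1 - mu^2*h/sigma^2), which -> sigma^2*T).
rng = np.random.default_rng(5)
mu, sigma, T, n = 0.5, 1.5, 1.0, 500
h = T / n
p = (1 + mu * np.sqrt(h) / sigma) / 2
Delta = sigma * np.sqrt(h)

n_paths = 20_000
steps = np.where(rng.random((n_paths, n)) < p, Delta, -Delta)
y_T = steps.sum(axis=1)

print(y_T.mean(), y_T.var())  # close to mu*T = 0.5 and sigma^2*T = 2.25
```

A histogram of y_T is already nearly indistinguishable from N(μT, σ²T) at n = 500 steps, illustrating the convergence to the normal distribution.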
In the general case, the arithmetic Brownian motion can be ex-
pressed in the following form:

y(t) = μ(t)t + σ(y(t), t)W(t)   (4.2.14)

The random variable in this process may have negative values. This
creates a problem for describing prices, which are essentially positive.
Therefore, the geometric Brownian motion Y(t) = exp[y(t)] is often
used in financial applications.
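The positivity of the geometric form is immediate to see in simulation: however far the arithmetic path y(t) wanders below zero, Y(t) = exp[y(t)] remains strictly positive. Drift and volatility values below are illustrative.

```python
import numpy as np

# Arithmetic vs. geometric Brownian motion: y(t) can change sign, while
# Y(t) = exp(y(t)) stays strictly positive, which suits price modeling.
rng = np.random.default_rng(6)
mu, sigma, n_steps, dt = 0.05, 0.4, 1000, 0.01

dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
y = np.cumsum(mu * dt + sigma * dW)   # arithmetic Brownian motion path
Y = np.exp(y)                         # geometric Brownian motion path

print(y.min(), Y.min())  # Y stays positive regardless of the sign of y
```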
One can simulate the Wiener process with the following equation:

[W(t + Δt) − W(t)] ≡ ΔW = N(0, 1)√Δt   (4.2.15)
Stochastic Processes

While the Wiener process is a continuous process, its innovations are
random. Therefore, the expression ΔW/Δt does not
converge when Δt → 0. Indeed, it follows for the Wiener process that

lim_{Δt→0} [ΔW(t)/Δt] ~ lim_{Δt→0} [Δt^(−1/2)]   (4.2.16)

As a result, the derivative dW(t)/dt does not exist in the ordinary
sense. Thus, one needs a special calculus to describe stochastic
processes.
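The divergence in (4.2.16) can be quantified exactly: since |ΔW| is of order √Δt and E|N(0, 1)| = √(2/π), the mean absolute difference quotient is E|ΔW/Δt| = √(2/(πΔt)), which grows by a factor of √10 for every decade by which Δt shrinks. A short arithmetic sketch:

```python
import math

# Why dW/dt does not exist: from (4.2.15), E|Delta W| = sqrt(2*dt/pi),
# so the mean absolute difference quotient E|Delta W / Delta t| equals
# sqrt(2/(pi*dt)) and blows up like dt^(-1/2) as dt -> 0.
def mean_abs_rate(dt):
    return math.sqrt(2.0 / math.pi) * math.sqrt(dt) / dt   # E|dW| / dt

rates = [mean_abs_rate(10.0 ** (-k)) for k in range(1, 5)]
ratios = [rates[i + 1] / rates[i] for i in range(3)]
print(rates)
print(ratios)  # each decade in dt multiplies the rate by sqrt(10)
```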

The Brownian motion (4.2.14) can be presented in the differential form

dy(t) = μ dt + σ dW(t)   (4.3.1)

Equation (4.3.1) is named the stochastic differential equation.
Note that the term dW(t) = [W(t + dt) − W(t)] has the following
properties:

E[dW] = 0,   E[dW dW] = dt,   E[dW dt] = 0   (4.3.2)

Let us calculate (dy)², having in mind (4.3.2) and retaining only the
terms of order dt:

(dy)² = [μ dt + σ dW]² = μ² dt² + 2μσ dt dW + σ² dW² ≈ σ² dt   (4.3.3)

It follows from (4.3.3) that while dy is a random variable, (dy)² is a
deterministic quantity in the leading order in dt.
