glement for the general case of two distinguishable quantum objects, and then extend
this definition to indistinguishable particles and to the electromagnetic field.
6.3.1 Tensor product spaces
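In Section 4.2.1, the Hilbert space $\mathcal{H}_{\mathrm{QED}}$ for quantum electrodynamics was constructed
as the tensor product of the Hilbert space $\mathcal{H}_{\mathrm{chg}}$ for the atoms and the Fock space
$\mathcal{H}_F$ for the field. This construction depends only on the Born interpretation and the
superposition principle; consequently, it works equally well for any pair of distinguishable
physical systems A and B described by Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$. Let
$\{|\phi_\alpha\rangle\}$ and $\{|\eta_\beta\rangle\}$ be basis sets for $\mathcal{H}_A$ and $\mathcal{H}_B$
respectively; then for any pair of vectors $(|\psi\rangle_A, |\vartheta\rangle_B)$ the product vector
$|\Lambda\rangle = |\psi\rangle_A |\vartheta\rangle_B$ is defined by the probability amplitudes
$$\langle \phi_\alpha, \eta_\beta | \Lambda \rangle = \langle \phi_\alpha | \psi \rangle \langle \eta_\beta | \vartheta \rangle . \tag{6.7}$$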
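Since $\{|\phi_\alpha\rangle\}$ and $\{|\eta_\beta\rangle\}$ are complete orthonormal sets of vectors in their
respective spaces, the inner product between two such vectors is consistently defined by
$$
\begin{aligned}
\langle \Lambda_1 | \Lambda_2 \rangle
&= \sum_\alpha \sum_\beta \langle \Lambda_1 | \phi_\alpha, \eta_\beta \rangle \langle \phi_\alpha, \eta_\beta | \Lambda_2 \rangle \\
&= \sum_\alpha \sum_\beta \langle \psi_1 | \phi_\alpha \rangle \langle \vartheta_1 | \eta_\beta \rangle \langle \phi_\alpha | \psi_2 \rangle \langle \eta_\beta | \vartheta_2 \rangle \\
&= \langle \psi_1 | \psi_2 \rangle \langle \vartheta_1 | \vartheta_2 \rangle ,
\end{aligned}
\tag{6.8}
$$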
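where the inner products $\langle \psi_1 | \psi_2 \rangle$ and $\langle \vartheta_1 | \vartheta_2 \rangle$ refer respectively to
$\mathcal{H}_A$ and $\mathcal{H}_B$. The linear combination of two product vectors is defined by componentwise
addition, i.e. the ket
$$|\Phi\rangle = c_1 |\Lambda_1\rangle + c_2 |\Lambda_2\rangle \tag{6.9}$$
is defined by the probability amplitudes
$$
\begin{aligned}
\langle \phi_\alpha, \eta_\beta | \Phi \rangle
&= c_1 \langle \phi_\alpha, \eta_\beta | \Lambda_1 \rangle + c_2 \langle \phi_\alpha, \eta_\beta | \Lambda_2 \rangle \\
&= c_1 \langle \phi_\alpha | \psi_1 \rangle \langle \eta_\beta | \vartheta_1 \rangle + c_2 \langle \phi_\alpha | \psi_2 \rangle \langle \eta_\beta | \vartheta_2 \rangle .
\end{aligned}
\tag{6.10}
$$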
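As a numerical illustration of eqns (6.7)-(6.10), the following minimal sketch represents kets by their components in the bases $\{|\phi_\alpha\rangle\}$ and $\{|\eta_\beta\rangle\}$ and uses NumPy's `kron` to form product vectors. The helper name `product_ket`, the dimensions, and the numerical values are assumptions made for illustration only; they are not part of the text.

```python
import numpy as np

def product_ket(psi_a, theta_b):
    """Amplitudes <phi_alpha, eta_beta|Lambda> = <phi_alpha|psi><eta_beta|theta>,
    flattened so that the pair (alpha, beta) maps to index alpha * d_B + beta."""
    return np.kron(psi_a, theta_b)

# Illustrative normalized vectors (arbitrary values): d_A = 2, d_B = 3.
psi1 = np.array([1.0, 1.0j]) / np.sqrt(2)
psi2 = np.array([0.6, 0.8])
theta1 = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
theta2 = np.array([1.0, 1.0j, -1.0]) / np.sqrt(3)

lam1 = product_ket(psi1, theta1)
lam2 = product_ket(psi2, theta2)

# Left side of (6.8): inner product in the composite space.
lhs = np.vdot(lam1, lam2)
# Right side of (6.8): product of the inner products in H_A and H_B.
rhs = np.vdot(psi1, psi2) * np.vdot(theta1, theta2)

# Linear combination (6.9), formed by componentwise addition as in (6.10).
c1, c2 = 0.5, -0.5j
phi = c1 * lam1 + c2 * lam2
```

Here `np.vdot` conjugates its first argument, so it plays the role of the bra in each inner product; `lhs` and `rhs` should agree to rounding error.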
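The tensor product space $\mathcal{H}_C = \mathcal{H}_A \otimes \mathcal{H}_B$ is the family of all linear combinations of
product kets. The family of product kets,
$$\{ |\chi_{\alpha\beta}\rangle = |\phi_\alpha, \eta_\beta\rangle = |\phi_\alpha\rangle_A |\eta_\beta\rangle_B \} , \tag{6.11}$$
forms a complete orthonormal set with respect to the inner product (6.8), i.e.
$$\langle \chi_{\alpha'\beta'} | \chi_{\alpha\beta} \rangle = \langle \phi_{\alpha'} | \phi_\alpha \rangle \langle \eta_{\beta'} | \eta_\beta \rangle = \delta_{\alpha'\alpha} \delta_{\beta'\beta} , \tag{6.12}$$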
Extensions of the notion of entanglement
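and a general vector $|\Phi\rangle$ in $\mathcal{H}_C$ can be expressed as
$$|\Phi\rangle = \sum_\alpha \sum_\beta \Phi_{\alpha\beta} |\chi_{\alpha\beta}\rangle = \sum_\alpha \sum_\beta \Phi_{\alpha\beta} |\phi_\alpha\rangle_A |\eta_\beta\rangle_B . \tag{6.13}$$
The inner product between any two vectors is
$$\langle \Psi | \Phi \rangle = \sum_\alpha \sum_\beta \Psi^*_{\alpha\beta} \Phi_{\alpha\beta} . \tag{6.14}$$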
One can show that choosing new basis sets in $\mathcal{H}_A$ and $\mathcal{H}_B$ produces an equivalent
basis set for $\mathcal{H}_C$. This notion can be extended to composite systems composed of $N$
distinguishable subsystems described by Hilbert spaces $\mathcal{H}_1, \ldots, \mathcal{H}_N$. The composite
system is described by the $N$-fold tensor product space
$$\mathcal{H}_C = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_N , \tag{6.15}$$
which is defined by repeated use of the two-space definition given above.
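The repeated use of the two-space definition in eqn (6.15) can be sketched numerically by folding NumPy's `kron` over a list of kets. The function name `tensor_product` and the three subsystem dimensions are illustrative assumptions, not notation from the text.

```python
from functools import reduce
import numpy as np

def tensor_product(*kets):
    """N-fold product ket built by repeated use of the two-space product."""
    return reduce(np.kron, kets)

# Three distinguishable subsystems with dimensions 2, 3, and 2 (arbitrary choice).
kets = [np.array([1.0, 0.0]),
        np.array([0.0, 1.0, 0.0]),
        np.array([1.0, 1.0]) / np.sqrt(2)]
composite = tensor_product(*kets)

# The dimension of the composite space is the product of the subsystem dimensions.
dim = composite.size
```

Because the two-space product is associative, the order in which the pairwise products are grouped does not affect the result.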
It is useful to extend the tensor product construction for vectors to a similar one
for operators. Let $A$ and $B$ be operators acting on $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively; then the
operator tensor product, $A \otimes B$, is the operator acting on $\mathcal{H}_C$ defined by
$$(A \otimes B) |\Phi\rangle = \sum_\alpha \sum_\beta \Phi_{\alpha\beta} \, A |\phi_\alpha\rangle_A \, B |\eta_\beta\rangle_B . \tag{6.16}$$
This definition immediately yields the rule
$$(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2) \tag{6.17}$$
for the product of two such operators. Since the notion of the outer or tensor product
of matrices and operators is less familiar than the idea of product wave functions, we
sometimes use the explicit $\otimes$ notation for operator tensor products when it is needed
for clarity. The definition (6.16) also allows us to treat $A$ and $B$ as operators acting
on the product space $\mathcal{H}_C$ by means of the identifications
$$A \leftrightarrow A \otimes I_B , \qquad B \leftrightarrow I_A \otimes B , \tag{6.18}$$
where $I_A$ and $I_B$ are respectively the identity operators for $\mathcal{H}_A$ and $\mathcal{H}_B$. These relations
lead to the rule
$$AB \leftrightarrow A \otimes B , \tag{6.19}$$
so we can use either notation as dictated by convenience.
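The rules (6.17) and (6.19) are easy to check numerically when operators are represented by matrices and the operator tensor product by the Kronecker product. The sketch below uses random matrices as stand-ins for generic operators; the helper `random_operator`, the seed, and the dimensions are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_operator(d):
    """Random complex d x d matrix standing in for a generic operator."""
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

d_a, d_b = 2, 3
A1, A2 = random_operator(d_a), random_operator(d_a)
B1, B2 = random_operator(d_b), random_operator(d_b)
I_a, I_b = np.eye(d_a), np.eye(d_b)

# Rule (6.17): (A1 x B1)(A2 x B2) = (A1 A2) x (B1 B2).
lhs_617 = np.kron(A1, B1) @ np.kron(A2, B2)
rhs_617 = np.kron(A1 @ A2, B1 @ B2)

# Identifications (6.18) and rule (6.19): (A x I_B)(I_A x B) = A x B.
A_on_C = np.kron(A1, I_b)
B_on_C = np.kron(I_a, B1)
lhs_619 = A_on_C @ B_on_C
rhs_619 = np.kron(A1, B1)
```

The embedded operators `A_on_C` and `B_on_C` also commute, as expected for operators that act on different subsystems.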
As explained in Section 2.3.2, a mixed state of the composite system is described
by a density operator
$$\rho = \sum_e P_e |\Psi_e\rangle \langle \Psi_e| , \tag{6.20}$$
where $P_e$ is a probability distribution on the ensemble $\{|\Psi_e\rangle\}$ of pure states. The
expectation values of observables for the subsystem A are determined by the reduced
density operator
Entangled states
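$$\rho_A = \mathrm{Tr}_B (\rho) , \tag{6.21}$$
where the partial trace over $\mathcal{H}_B$ of a general operator $X$ acting on $\mathcal{H}_C$ is the operator
on $\mathcal{H}_A$ with matrix elements
$$\langle \phi_{\alpha'} | \mathrm{Tr}_B (X) | \phi_\alpha \rangle = \sum_\beta \langle \chi_{\alpha'\beta} | X | \chi_{\alpha\beta} \rangle . \tag{6.22}$$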
This can be expressed more explicitly by using the fact that every operator on $\mathcal{H}_C$ can
be decomposed into a sum of operator tensor products, i.e.
$$X = \sum_n A_n \otimes B_n . \tag{6.23}$$
Substituting this into the definition (6.22) defines the operator
$$\mathrm{Tr}_B (X) = \sum_n A_n \, \mathrm{Tr}_B (B_n) \tag{6.24}$$
acting on $\mathcal{H}_A$, where the c-number
$$\mathrm{Tr}_B (B_n) = \sum_\beta \langle \eta_\beta | B_n | \eta_\beta \rangle \tag{6.25}$$
is the trace over $\mathcal{H}_B$. The average of an observable $A$ for the subsystem A is thus given
by
$$\mathrm{Tr} (\rho A) = \mathrm{Tr}_A (\rho_A A) . \tag{6.26}$$
In the same way the average of an observable $B$ for the subsystem B is
$$\mathrm{Tr} (\rho B) = \mathrm{Tr}_B (\rho_B B) , \tag{6.27}$$
where
$$\rho_B = \mathrm{Tr}_A (\rho) . \tag{6.28}$$
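The partial trace (6.22) and the equality (6.26) can be illustrated with a short numerical sketch. We assume that composite-space matrices are indexed by $\alpha d_B + \beta$, which is the ordering produced by the Kronecker product; the function names, the seed, and the two-member ensemble are hypothetical choices made only for this example.

```python
import numpy as np

def partial_trace_b(x, d_a, d_b):
    """Tr_B(X): matrix elements sum_beta <alpha', beta| X |alpha, beta>, eqn (6.22)."""
    return np.einsum("ibjb->ij", x.reshape(d_a, d_b, d_a, d_b))

def partial_trace_a(x, d_a, d_b):
    """Tr_A(X): the analogous sum over the index of subsystem A."""
    return np.einsum("aiaj->ij", x.reshape(d_a, d_b, d_a, d_b))

rng = np.random.default_rng(1)
d_a, d_b = 2, 3

def random_state(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

# Mixed state (6.20) built from an illustrative two-member ensemble.
probs = [0.3, 0.7]
states = [random_state(d_a * d_b) for _ in probs]
rho = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))

rho_a = partial_trace_b(rho, d_a, d_b)   # eqn (6.21)
rho_b = partial_trace_a(rho, d_a, d_b)   # eqn (6.28)

# A Hermitian observable for subsystem A and an arbitrary operator on H_B.
m = rng.normal(size=(d_a, d_a)) + 1j * rng.normal(size=(d_a, d_a))
A = m + m.conj().T
B_op = rng.normal(size=(d_b, d_b))

avg_full = np.trace(rho @ np.kron(A, np.eye(d_b)))   # left side of (6.26)
avg_reduced = np.trace(rho_a @ A)                    # right side of (6.26)
```

For a single product term $X = A \otimes B$, `partial_trace_b` reduces to `A * np.trace(B_op)`, which is the content of eqn (6.24).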
6.3.2 The Schmidt decomposition
For finite-dimensional spaces, the general expansion (6.13) becomes
$$|\Psi\rangle = \sum_{\alpha=1}^{d_A} \sum_{\beta=1}^{d_B} \Psi_{\alpha\beta} |\chi_{\alpha\beta}\rangle , \tag{6.29}$$
where $\Psi_{\alpha\beta} = \langle \chi_{\alpha\beta} | \Psi \rangle$. In the study of entanglement, it is useful to have an alternative
representation that is specifically tailored to a particular state vector $|\Psi\rangle$.
For our immediate purposes it is sufficient to explain the geometrical concepts
leading to this special expansion; the technical details of the proof are given in Section
6.3.3. The basic idea is illustrated in Fig. 6.1, which shows the original vector, $|\Psi\rangle$,
and the normalized product vector, $|\zeta_1\rangle_A |\vartheta_1\rangle_B$, that has the largest projection $Y_1$
onto $|\Psi\rangle$.
Fig. 6.1 A qualitative sketch of the procedure for deriving the Schmidt decomposition, given
by eqn (6.30). The heavy arrow represents the original vector $|\Psi\rangle$ and the plane represents
the set of all product vectors $|\zeta\rangle |\vartheta\rangle$. The light arrow, labeled $|\zeta_1\rangle |\vartheta_1\rangle$, denotes the
projection of $|\Psi\rangle$ onto the plane.
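After determining this first product vector, we define a new vector, $|\Psi_1\rangle = |\Psi\rangle -
Y_1 |\zeta_1\rangle_A |\vartheta_1\rangle_B$, that is orthogonal to $|\zeta_1\rangle_A |\vartheta_1\rangle_B$. The same game can be played with
$|\Psi_1\rangle$; that is, we find the normalized product vector $|\zeta_2\rangle_A |\vartheta_2\rangle_B$ that has the maximum
projection $Y_2$ onto $|\Psi_1\rangle$ and is orthogonal to $|\zeta_1\rangle_A |\vartheta_1\rangle_B$. Since the spaces $\mathcal{H}_A$ and $\mathcal{H}_B$
are finite dimensional, this process must terminate after a finite number $r$ of steps, i.e.
when $Y_{r+1} = 0$. The orthogonality of the successive product vectors implies that they
are linearly independent; therefore, the largest possible number of steps is the smaller
of the two dimensions, $\min (d_A, d_B)$. The final result is the Schmidt decomposition
$$|\Psi\rangle = \sum_{n=1}^{r} Y_n |\zeta_n\rangle_A |\vartheta_n\rangle_B , \tag{6.30}$$
where the Schmidt rank $r \le \min (d_A, d_B)$. The density operator for this pure state
is therefore
$$
\begin{aligned}
\rho &= \sum_{m=1}^{r} \sum_{n=1}^{r} Y_m Y_n^* \, |\zeta_m\rangle_A |\vartheta_m\rangle_B \; {}_A\langle \zeta_n| \, {}_B\langle \vartheta_n| \\
&= \sum_{m=1}^{r} \sum_{n=1}^{r} Y_m Y_n^* \, \left( |\zeta_m\rangle_A \, {}_A\langle \zeta_n| \right) \otimes \left( |\vartheta_m\rangle_B \, {}_B\langle \vartheta_n| \right) .
\end{aligned}
\tag{6.31}
$$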
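The minimum value ($r = 1$) of the Schmidt rank occurs when $|\Psi\rangle$ is a product vector.
The product vectors $|\zeta_n\rangle_A |\vartheta_n\rangle_B$ are orthonormal by construction, i.e. $\langle \zeta_n | \zeta_m \rangle =
\langle \vartheta_n | \vartheta_m \rangle = \delta_{nm}$, and the coefficients $Y_n$ satisfy the normalization condition
$$\sum_{n=1}^{r} |Y_n|^2 = 1 . \tag{6.32}$$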
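In numerical work, an expansion of the form (6.30) is usually obtained from the singular value decomposition (SVD) of the coefficient matrix $\Psi_{\alpha\beta}$ of eqn (6.29), rather than from the successive maximization described above. The following is a minimal sketch of that substitute technique, using NumPy's `numpy.linalg.svd`; the function name `schmidt_decomposition`, the tolerance, and the two test states are assumptions introduced for illustration.

```python
import numpy as np

def schmidt_decomposition(psi, d_a, d_b, tol=1e-12):
    """Return (Y, zeta, theta) with psi = sum_n Y[n] * kron(zeta[n], theta[n])."""
    coeffs = psi.reshape(d_a, d_b)            # Psi_{alpha beta} of eqn (6.29)
    u, s, vh = np.linalg.svd(coeffs, full_matrices=False)
    r = int(np.sum(s > tol))                  # Schmidt rank
    # Row n of zeta (theta) holds the components of the n-th vector of A (B).
    return s[:r], u[:, :r].T, vh[:r, :]

# Singlet-like state (|0>|1> - |1>|0>)/sqrt(2): Schmidt rank 2.
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
Y, zeta, theta = schmidt_decomposition(singlet, 2, 2)
rebuilt = sum(Y[n] * np.kron(zeta[n], theta[n]) for n in range(len(Y)))

# A product vector: Schmidt rank 1.
product = np.kron([0.6, 0.8], [1.0, 0.0])
Y_prod, _, _ = schmidt_decomposition(product, 2, 2)
```

The singular values returned by the SVD are real and non-negative, so they correspond to the choice of phases in which the coefficients $Y_n$ are real.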
In applications of the Schmidt decomposition (6.30), it is important to keep in mind
that the basis vectors $|\zeta_n\rangle_A |\vartheta_n\rangle_B$ themselves, and not just the coefficients $Y_n$, are
uniquely associated with the vector $|\Psi\rangle$. The Schmidt decomposition for a new vector
$|\Phi\rangle$ would require a new set of basis vectors.
6.3.3 Proof of the Schmidt decomposition*
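We offer here a proof, modeled on one of the arguments given by Peres (1995, Sec.
5-3), that the expansion (6.30) exists. For normalized vectors $|\zeta_1\rangle_A$ and $|\vartheta_1\rangle_B$, set
$|\zeta_1, \vartheta_1\rangle = |\zeta_1\rangle_A |\vartheta_1\rangle_B$, and consider the projection operator $P_1 = |\zeta_1, \vartheta_1\rangle \langle \zeta_1, \vartheta_1|$. The
identity $|\Psi\rangle = P_1 |\Psi\rangle + (1 - P_1) |\Psi\rangle$ can then be written as $|\Psi\rangle = Y_1 |\zeta_1, \vartheta_1\rangle + |\Psi_1\rangle$,
where $Y_1 = \langle \zeta_1, \vartheta_1 | \Psi \rangle$ and the vector $|\Psi_1\rangle = (1 - P_1) |\Psi\rangle$ is orthogonal to $|\zeta_1, \vartheta_1\rangle$. By
applying the general expansion (6.29) to the vectors $|\Psi\rangle$ and $|\zeta_1, \vartheta_1\rangle$, one can express
$|Y_1|^2$ as
$$|Y_1|^2 = \left| \sum_{\alpha=1}^{d_A} \sum_{\beta=1}^{d_B} \Psi^*_{\alpha\beta} \, x_\alpha y_\beta \right|^2 \le 1 , \tag{6.33}$$
where $x_\alpha = \langle \phi_\alpha | \zeta_1 \rangle$, $y_\beta = \langle \eta_\beta | \vartheta_1 \rangle$, and the upper bound follows from the
normalization of the vectors defining $Y_1$.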
From a geometrical point of view, $|Y_1|$ is the magnitude of the projection of $|\zeta_1, \vartheta_1\rangle$
onto $|\Psi\rangle$. In quantum terms, $|Y_1|^2$ is the probability that a measurement of $P_1$ will
result in the eigenvalue unity and will leave the system in the state $|\zeta_1, \vartheta_1\rangle$. The
next step is to choose the product vector $|\zeta_1, \vartheta_1\rangle$, i.e. to find values of $x_\alpha$ and $y_\beta$,
that maximizes $|Y_1|^2$. This is always possible, since $|Y_1|^2$ is a bounded, continuous
function of the finite set of complex variables $(x_1, \ldots, x_{d_A}, y_1, \ldots, y_{d_B})$, and the
normalization conditions restrict these variables to a closed and bounded set. The solution
is not unique, since the overall phase of $|\zeta_1, \vartheta_1\rangle$ is not determined by the maximization
procedure. This is not a real difficulty; the undetermined phases can be chosen so that
$Y_1$ is real. In general, there may be several linearly independent solutions for $|\zeta_1, \vartheta_1\rangle$,
but this is also not a serious difficulty. By forming appropriate linear combinations of
the degenerate solutions it is always possible to make them mutually orthogonal. We
will therefore simplify the discussion by assuming that the maximum is always unique.
Note that the maximum value of $|Y_1|^2$ can only be unity if the original vector is itself
a product vector.
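Now that we have made our choice of $|\zeta_1, \vartheta_1\rangle$, we pick a new product vector
$|\zeta_2, \vartheta_2\rangle$, with projection operator $P_2 = |\zeta_2, \vartheta_2\rangle \langle \zeta_2, \vartheta_2|$, and write the identity $|\Psi_1\rangle =
P_2 |\Psi_1\rangle + (1 - P_2) |\Psi_1\rangle$ as
$$|\Psi_1\rangle = Y_2 |\zeta_2, \vartheta_2\rangle + |\Psi_2\rangle , \tag{6.34}$$
where $Y_2 = \langle \zeta_2, \vartheta_2 | \Psi_1 \rangle$ and $|\Psi_2\rangle = (1 - P_2) |\Psi_1\rangle$. Since $|\Psi_1\rangle$ is orthogonal to $|\zeta_1, \vartheta_1\rangle$,
we can assume that $|\zeta_2, \vartheta_2\rangle$ is also orthogonal to $|\zeta_1, \vartheta_1\rangle$. Now we proceed, as in the
first step, by choosing $|\zeta_2, \vartheta_2\rangle$ to maximize $|Y_2|^2$. At this point, we have
$$|\Psi\rangle = Y_1 |\zeta_1, \vartheta_1\rangle + Y_2 |\zeta_2, \vartheta_2\rangle + |\Psi_2\rangle , \tag{6.35}$$
and this procedure can be repeated until the next projection vanishes. The last remark
implies that the number of terms is limited by the minimum dimensionality,
$\min (d_A, d_B)$; therefore, we arrive at eqn (6.30).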
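The successive maximization used in this proof can be imitated numerically. The sketch below is one possible realization, under our own assumptions: the maximum of $|Y|$ over normalized product vectors is sought by alternately optimizing $x_\alpha$ for fixed $y_\beta$ and vice versa (a power-iteration scheme that is not part of the proof), and the test state is constructed with known coefficients 0.8 and 0.6. The function names, the iteration count, and the seed are hypothetical.

```python
import numpy as np

def best_product_vector(m, iters=500, seed=0):
    """Alternating maximization of |Y|, Y = <zeta, theta|Psi>, over unit x and y.
    m[alpha, beta] holds the coefficients Psi_{alpha beta} of the current vector."""
    rng = np.random.default_rng(seed)
    y = rng.normal(size=m.shape[1]) + 1j * rng.normal(size=m.shape[1])
    y /= np.linalg.norm(y)
    for _ in range(iters):
        x = m @ y.conj()             # optimal x for fixed y
        x /= np.linalg.norm(x)
        y = m.T @ x.conj()           # optimal y for fixed x
        y /= np.linalg.norm(y)
    # With this update order Y is real and non-negative up to rounding.
    Y = np.real(x.conj() @ m @ y.conj())
    return Y, x, y

def greedy_schmidt(psi, d_a, d_b, tol=1e-10):
    """Repeat the projection step until the remaining vector vanishes."""
    m = psi.reshape(d_a, d_b).astype(complex)
    terms = []
    for _ in range(min(d_a, d_b)):
        if np.linalg.norm(m) < tol:          # the next projection vanishes
            break
        Y, x, y = best_product_vector(m)
        terms.append((Y, x, y))
        m = m - Y * np.outer(x, y)           # |Psi_n> = |Psi_{n-1}> - Y_n |zeta_n, theta_n>
    return terms

# Test state with known coefficients 0.8 and 0.6 (d_A = 2, d_B = 3).
u1, u2 = np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)
v1, v2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
psi = 0.8 * np.kron(u1, v1) + 0.6 * np.kron(u2, v2)

terms = greedy_schmidt(psi, 2, 3)
coefficients = [t[0] for t in terms]
rebuilt = sum(Y * np.kron(x, y) for Y, x, y in terms)
```

For this state the coefficients found in successive steps coincide with the singular values of the coefficient matrix, which is why the SVD is the standard numerical shortcut. The alternating scheme relies on a generic starting vector and on a unique maximum, in line with the simplifying assumption made in the proof.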
6.4 Entanglement for distinguishable particles
In Section 6.3.1 we saw that the Hilbert space for a composite system formed from any
two distinguishable subsystems A and B (which can be atoms, molecules, quantum
dots, etc.) is the tensor product $\mathcal{H}_C = \mathcal{H}_A \otimes \mathcal{H}_B$. The current intense interest in
quantum information processing has led to the widespread use of the terms parties
for A and B, and bipartite system, for what has traditionally been called a two-particle
system. Since our interests in this book are not limited to quantum information
processing, we will adhere to the traditional terminology in which the distinguishable
objects A and B are called particles and the composite system is called a two-particle
or two-part system.
In order to simplify the discussion, we will assume that the two Hilbert spaces have
finite dimensions, $d_A, d_B < \infty$. A composite system composed of two distinguishable,
spin-1/2 particles (for example, impurity atoms bound to adjacent sites in a crystal
lattice) provides a simple example that fits within this framework. In this case,
$\mathcal{H}_A = \mathcal{H}_B = \mathbb{C}^2$, and all observables can be written as linear combinations of the spin
operators, e.g.
$$O^A = C_0 I^A + C_1 \, \mathbf{n} \cdot \mathbf{S}^A , \tag{6.36}$$
where $C_0$ and $C_1$ are constants, $I^A$ is the identity operator, $\mathbf{n}$ is a unit vector, $\mathbf{S}^A =
\boldsymbol{\sigma}^A / 2$, and $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ is the vector of Pauli matrices. A discrete analogue of the
EPR wave function is given by the singlet state
$$|S = 0\rangle_{AB} = \frac{1}{\sqrt{2}} \left\{ |\uparrow\rangle_A |\downarrow\rangle_B - |\downarrow\rangle_A |\uparrow\rangle_B \right\} , \tag{6.37}$$
where the spin-up and spin-down states are defined by
$$\mathbf{n} \cdot \mathbf{S}^A |\uparrow\rangle_A = +\frac{1}{2} |\uparrow\rangle_A , \qquad \mathbf{n} \cdot \mathbf{S}^A |\downarrow\rangle_A = -\frac{1}{2} |\downarrow\rangle_A , \quad \text{etc.} \tag{6.38}$$
The singlet state has total spin angular momentum zero, so one can show (as in
Exercise 6.3) that it has the same expression for every choice of $\mathbf{n}$. If several spin
projections are under consideration, the notation $|\uparrow_{\mathbf{n}}\rangle_A$ and $|\downarrow_{\mathbf{n}}\rangle_A$ can be used to
distinguish them.
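The most important feature of entanglement for pure states is that the result of
one measurement yields information about the probability distribution of a second,
independent measurement. For the two-spin system in the singlet state (6.37), a measurement
of $\mathbf{n} \cdot \mathbf{S}^A$ with the result $\pm 1/2$ guarantees that a subsequent measurement of $\mathbf{n} \cdot \mathbf{S}^B$ will
yield the result $\mp 1/2$. A discrete version of the unentangled (separable) state (6.4) is
$$|\phi\rangle = \left\{ c_\uparrow |\uparrow\rangle_A + c_\downarrow |\downarrow\rangle_A \right\} \left\{ b_\uparrow |\uparrow\rangle_B + b_\downarrow |\downarrow\rangle_B \right\} . \tag{6.39}$$
In this case, measuring $\mathbf{n} \cdot \mathbf{S}^A$ provides no information at all on the distribution of
values for $\mathbf{n} \cdot \mathbf{S}^B$.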
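The contrast between the singlet state (6.37) and the product state (6.39) can be made concrete by computing joint outcome probabilities for measurements of the two spin projections. The following minimal sketch uses the standard Pauli matrices; the helper names, the direction $(1, 2, 2)/3$, and the coefficients of the product state are illustrative assumptions.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_along(n):
    """n . S for a spin-1/2 particle, with S = sigma / 2."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return 0.5 * (n[0] * sx + n[1] * sy + n[2] * sz)

def joint_probabilities(state, n):
    """P(a, b) for outcomes a, b = +1/2 or -1/2 of n.S^A and n.S^B."""
    vals, vecs = np.linalg.eigh(spin_along(n))
    probs = {}
    for i, a in enumerate(vals):
        for j, b in enumerate(vals):
            ket = np.kron(vecs[:, i], vecs[:, j])
            key = (float(round(a, 1)), float(round(b, 1)))
            probs[key] = abs(np.vdot(ket, state)) ** 2
    return probs

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Singlet state (6.37), written in the z basis.
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
p_singlet = joint_probabilities(singlet, [1, 2, 2])

# Spin-spin correlation <(n.S^A)(n.S^B)> in the singlet state.
s_n = spin_along([1, 2, 2])
corr = np.vdot(singlet, np.kron(s_n, s_n) @ singlet).real

# Product state (6.39) with illustrative coefficients.
product = np.kron(0.6 * up + 0.8 * down, (up + down) / np.sqrt(2))
p_product = joint_probabilities(product, [0, 0, 1])
```

For the singlet, the outcomes with equal signs have vanishing probability for any direction, so the result for A fixes the result for B. For the product state, each joint probability is the product of the two single-particle probabilities, so the outcome for A carries no information about B.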
6.4.1 Definition of entanglement
We will approach the general idea of entanglement indirectly by first defining separable
(unentangled) pure and mixed states, and then defining entangled states as those that
are not separable. Since entangled states are the focus of this chapter, this negative
procedure may seem a little strange. The explanation is that separable states are simple
and entangled states are complicated. We will define separability and entanglement
in terms of properties of the state vector or density operator. This is the traditional
approach, and it provides a quick entry into the applications of these notions.
A Pure states
The definitions we give here are simply generalizations of the examples presented in
Sections 6.1 and 6.2, or rather the finite-dimensional analogues given by eqns (6.37)
and (6.39). Thus we say that a pure state $|\Psi\rangle$ of the two-particle system described by
the Hilbert space $\mathcal{H}_C = \mathcal{H}_A \otimes \mathcal{H}_B$ is separable if it can be expressed as
$$|\Psi\rangle = |\Phi\rangle_A |\Xi\rangle_B , \tag{6.40}$$
which is the general version of eqn (6.39), and entangled if it is not separable. This
awkward negative definition of entanglement as the absence of separability can be
avoided by using the Schmidt decomposition (6.30). A little thought shows that the
states that cannot be written in the form (6.40) are just the states with $r > 1$. With
this in mind, we could define entanglement positively by saying that $|\Psi\rangle$ is entangled
if it has Schmidt rank $r > 1$. The discrete analogue (6.37) of the continuous EPR wave
function is an example of an entangled state.
The definitions given above imply several properties of the state vector which,
conversely, imply the original definitions. Thus the new properties can be used as
equivalent definitions of separability and entanglement for pure states. For ease of
reference, we present these results as theorems.
Theorem 6.1 A pure state is separable if and only if the reduced density operators
represent pure states, i.e. separable states satisfy the classical separability principle.
There are two assertions to be proved.
(a) The reduced density operators for a separable pure state $|\Psi\rangle$ represent pure states
of A and B.
(b) If the reduced density operators for a pure state $|\Psi\rangle$ describe pure states of A and
B, then $|\Psi\rangle$ is separable.
Suggestions for these arguments are given in Exercise 6.1.
Since entanglement is the absence of separability, this result can also be stated as
follows.
Theorem 6.2 A pure state is entangled if and only if the reduced density operators
for the subsystems describe mixed states.
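Theorems 6.1 and 6.2 suggest a simple numerical test for pure states: compute the reduced density operator and check whether its purity equals one. The sketch below assumes a normalized state stored with index $\alpha d_B + \beta$; the function names and the tolerance are hypothetical choices, and the test applies only to pure states of the composite system.

```python
import numpy as np

def reduced_purity_a(psi, d_a, d_b):
    """Purity Tr(rho_A^2) of rho_A = Tr_B |psi><psi| for a normalized pure state."""
    m = psi.reshape(d_a, d_b)        # coefficients Psi_{alpha beta}
    # (rho_A)_{alpha' alpha} = sum_beta Psi_{alpha' beta} Psi*_{alpha beta}
    rho_a = m @ m.conj().T
    return float(np.real(np.trace(rho_a @ rho_a)))

def is_separable(psi, d_a, d_b, tol=1e-10):
    """A pure state is separable if and only if its reduced state is pure."""
    return abs(reduced_purity_a(psi, d_a, d_b) - 1.0) < tol

# A product state and the singlet state, both for two spin-1/2 particles.
product = np.kron([0.6, 0.8], [1.0, 1.0j]) / np.sqrt(2)
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

purity_product = reduced_purity_a(product, 2, 2)
purity_singlet = reduced_purity_a(singlet, 2, 2)
```

The product state gives a reduced purity of one, while the singlet gives one half, the smallest value possible for a two-dimensional subsystem.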
Mixed states are, by definition, not states of maximum information, so this result
explicitly demonstrates that possession of maximum information for the total system
does not yield maximum information for the constituent parts. However, the statistical
properties of the mixed states for the subsystems are closely related. This can be seen
by using the Schmidt decomposition (6.31) to evaluate the reduced density operators:
$$\rho_A = \mathrm{Tr}_B (\rho) = \sum_{m=1}^{r} |Y_m|^2 \left( |\zeta_m\rangle \langle \zeta_m| \right) \tag{6.41}$$
and
$$\rho_B = \mathrm{Tr}_A (\rho) = \sum_{m=1}^{r} |Y_m|^2 \left( |\vartheta_m\rangle \langle \vartheta_m| \right) . \tag{6.42}$$
Comparing eqns (6.41) and (6.42) shows that the two reduced density operators,
although they act in different Hilbert spaces, have the same set of nonzero eigenvalues
$\{ |Y_1|^2, \ldots, |Y_r|^2 \}$. This implies that the purities of the two reduced states agree,
$$\mathcal{P} (\rho_A) = \mathrm{Tr}_A \left( \rho_A^2 \right) = \sum_{m=1}^{r} |Y_m|^4 = \mathcal{P} (\rho_B) < 1 , \tag{6.43}$$
and that the subsystems have identical von Neumann entropies,
$$S (\rho_A) = - \mathrm{Tr}_A \left[ \rho_A \ln \rho_A \right] = - \sum_{m=1}^{r} |Y_m|^2 \ln |Y_m|^2 = S (\rho_B) . \tag{6.44}$$
An entangled pure state is said to be maximally entangled if the reduced density
operators are maximally mixed according to eqn (2.141), where the number of degenerate
nonzero eigenvalues is given by $M = r$. The corresponding values of the purity
and von Neumann entropy are respectively $\mathcal{P} (\rho) = 1/r$ and $S (\rho) = \ln r$.
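The equalities (6.43) and (6.44) hold even when the two subsystems have different dimensions, which is easy to confirm numerically. The sketch below is a minimal illustration; the function names, the seed, the dimensions $d_A = 2$, $d_B = 4$, and the maximally entangled test state with $r = 3$ are assumptions made for this example.

```python
import numpy as np

def reduced_states(psi, d_a, d_b):
    """Reduced density matrices of the pure state psi (index alpha * d_b + beta)."""
    m = psi.reshape(d_a, d_b)
    rho_a = m @ m.conj().T          # Tr_B |psi><psi|
    rho_b = m.T @ m.conj()          # Tr_A |psi><psi|
    return rho_a, rho_b

def purity(rho):
    return float(np.real(np.trace(rho @ rho)))

def von_neumann_entropy(rho, tol=1e-12):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]      # 0 ln 0 is taken to be 0
    return float(-np.sum(evals * np.log(evals)))

# Random pure state with d_A = 2 and d_B = 4.
rng = np.random.default_rng(2)
psi = rng.normal(size=8) + 1j * rng.normal(size=8)
psi /= np.linalg.norm(psi)
rho_a, rho_b = reduced_states(psi, 2, 4)

# Maximally entangled state with Schmidt rank r = 3 (d_A = d_B = 3).
r = 3
psi_max = np.eye(r).reshape(r * r) / np.sqrt(r)
rho_a_max, rho_b_max = reduced_states(psi_max, r, r)
```

For the random state the two purities and the two entropies agree to rounding error, and for `psi_max` they take the values $1/r$ and $\ln r$ quoted above.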
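We next turn to results that are more directly related to experiment. For observables
$A$ and $B$ acting on $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively and any state $|\Psi\rangle$ in $\mathcal{H}_C = \mathcal{H}_A \otimes \mathcal{H}_B$,
we define the averages $\langle A \rangle = \langle \Psi | A \otimes I_B | \Psi \rangle$ and $\langle B \rangle = \langle \Psi | I_A \otimes B | \Psi \rangle$ and the
fluctuation operators $\delta A = A - \langle A \rangle$ and $\delta B = B - \langle B \rangle$. The quantum fluctuations are
said to be uncorrelated if $\langle \Psi | \delta A \, \delta B | \Psi \rangle = 0$. With this preparation we can state the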