
The last formula is often used for proving theoretical results, while equation (1.24) is used for calculations. As an example, the von Neumann entropy of $\rho = p|0\rangle\langle 0| + \frac{1-p}{2}(|0\rangle + |1\rangle)(\langle 0| + \langle 1|)$ is found to be

\[
S(\rho) = S\begin{pmatrix} p + \frac{1-p}{2} & \frac{1-p}{2} \\[2pt] \frac{1-p}{2} & \frac{1-p}{2} \end{pmatrix} = -\lambda_1 \log \lambda_1 - \lambda_2 \log \lambda_2,
\]

where

\[
\lambda_1 = \frac{1 + \sqrt{1 + 2p^2 - 2p}}{2} \quad \text{and} \quad \lambda_2 = \frac{1 - \sqrt{1 + 2p^2 - 2p}}{2}
\]

are the eigenvalues of the corresponding matrix. Surprisingly, S(ρ) ≠ H(p, 1−p), even though the same probabilities were assigned in both cases! This shows that quantum probabilities cannot simply be identified with classical ones. The equality could only hold if the probabilities entering Shannon's entropy are the eigenvalues of the density matrix.
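A quick numeric sketch of this example, assuming base-2 logarithms and using the closed-form eigenvalues of the 2×2 matrix above (the value p = 0.3 is an arbitrary illustration):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

p = 0.3
# Eigenvalues of the 2x2 density matrix, via the quadratic formula.
root = math.sqrt(1 - 2 * p + 2 * p * p)
lam1, lam2 = (1 + root) / 2, (1 - root) / 2

S_rho = h([lam1, lam2])          # von Neumann entropy of rho
H_classical = h([p, 1 - p])      # Shannon entropy of the mixing weights
print(S_rho < H_classical)       # True: the two entropies differ
```

Here S(ρ) comes out strictly below H(p, 1−p), in line with the remark above: the mixed states |0⟩ and |+⟩ are not orthogonal, so the eigenvalues of ρ are not the mixing probabilities.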

Following the same path as for classical information theory, in the quantum case it is straightforward to define the joint entropy, the relative entropy of ρ to σ, the entropy of A conditional on knowing B, and the common or mutual information of A and B. These are, respectively,

\[
\begin{aligned}
S(A,B) &\equiv -\operatorname{tr}\!\left(\rho^{AB}\log\rho^{AB}\right), &&(1.26a)\\
S(\rho\|\sigma) &\equiv \operatorname{tr}(\rho\log\rho) - \operatorname{tr}(\rho\log\sigma), &&(1.26b)\\
S(A|B) &\equiv S(A,B) - S(B), &&(1.26c)\\
S(A{:}B) &\equiv S(A) + S(B) - S(A,B) = S(A) - S(A|B). &&(1.26d)
\end{aligned}
\]
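As a sketch of how these four quantities are computed in practice, consider a classically correlated two-qubit state, diagonal in the computational basis, so every entropy reduces to a Shannon entropy of an eigenvalue list (the joint eigenvalues p_ij below are arbitrary illustrations):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Illustrative joint eigenvalues p_ij of a diagonal two-qubit state.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

S_AB = h(p.values())                              # joint entropy (1.26a)
S_A = h([p[0, 0] + p[0, 1], p[1, 0] + p[1, 1]])   # marginal of A
S_B = h([p[0, 0] + p[1, 0], p[0, 1] + p[1, 1]])   # marginal of B

S_cond = S_AB - S_B          # conditional entropy (1.26c)
S_mut = S_A + S_B - S_AB     # mutual information (1.26d)
print(S_cond, S_mut)
```

For a classically correlated state like this one the conditional entropy stays non-negative; the quantum surprise discussed later in the chapter is that it can become negative.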

One can see that there are a lot of similarities between Shannon's and von Neumann's entropy. For instance, one can prove a result reminiscent of equation (1.4),

\[
\begin{aligned}
S(\rho\|\sigma) &\ge 0, &&(1.27a)\\
S(\rho\|\sigma) &= 0 \iff \rho = \sigma, &&(1.27b)
\end{aligned}
\]

which is known as Klein's inequality. This inequality indicates why the von Neumann relative entropy is close to the notion of a metric. What is also important is that it can be used, as in the classical case, to prove an analogue of equation (1.5) for Shannon's entropy:

\[
\begin{aligned}
S(\rho) &\le \log d, &&(1.28a)\\
S(\rho) &= \log d \iff \rho = \tfrac{1}{d} I. &&(1.28b)
\end{aligned}
\]

In addition, from the definition (1.25) it follows that

\[
\begin{aligned}
S(\rho) &\ge 0, &&(1.29a)\\
S(\rho) &= 0 \iff \rho \text{ is pure}, &&(1.29b)
\end{aligned}
\]

which resembles equation (1.6).
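A small numeric sketch of these bounds for diagonal three-level states (the eigenvalue lists are arbitrary illustrations):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

d = 3
bound = math.log2(d)  # the maximal entropy log d, in bits

states = [
    [1 / 3, 1 / 3, 1 / 3],  # completely mixed: saturates the bound
    [0.5, 0.3, 0.2],        # generic mixed state
    [1.0, 0.0, 0.0],        # pure state: entropy 0
]
entropies = [h(s) for s in states]
print([S <= bound + 1e-12 for S in entropies])  # all True
print(abs(entropies[0] - bound) < 1e-12)        # the mixed state saturates
```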

One can also prove that if some states ρ_i, with probabilities p_i, have support on orthogonal subspaces, then

\[
S\Big(\sum_i p_i \rho_i\Big) = H(p_i) + \sum_i p_i S(\rho_i).
\]

Directly from this relation the joint entropy theorem can be proven: if p_i are probabilities, |i⟩ are orthogonal states of a system A, and ρ_i is a set of density matrices of another system B, then

\[
S\Big(\sum_i p_i\, |i\rangle\langle i| \otimes \rho_i\Big) = H(p_i) + \sum_i p_i S(\rho_i). \qquad (1.30)
\]

Using the definition of von Neumann entropy (1.25) and the above theorem for the case where ρ_i = σ for every i, and letting ρ be a density matrix with eigenvalues p_i and eigenvectors |i⟩, the entropy of a tensor product ρ ⊗ σ is found to be

\[
S(\rho \otimes \sigma) = S(\rho) + S(\sigma). \qquad (1.31)
\]

Another interesting result can be derived from the Schmidt decomposition: if a composite system AB is in a pure state, its subsystems A and B have density matrices with equal eigenvalues, and by (1.24)

\[
S(A) = S(B). \qquad (1.32)
\]
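Equation (1.31) can be checked numerically for commuting (diagonal) ρ and σ, where the eigenvalues of ρ ⊗ σ are all products p_i q_j (the eigenvalue lists are illustrative):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

rho = [0.7, 0.3]           # eigenvalues of rho
sigma = [0.5, 0.25, 0.25]  # eigenvalues of sigma

# Eigenvalues of the tensor product are all pairwise products.
product = [p * q for p in rho for q in sigma]

additive = abs(h(product) - (h(rho) + h(sigma))) < 1e-9
print(additive)  # True: S(rho ⊗ sigma) = S(rho) + S(sigma)
```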


1.2.4 A great discrepancy between classical and quantum information theory:

quantum entanglement

The tools developed in the above subsection can help reveal a great discrepancy between classical and quantum information theory: entanglement. Concerning the nomenclature, in quantum mechanics two systems are called entangled if their joint state cannot be written as a tensor product of states of the subsystems. To demonstrate the aforementioned discrepancy, let a composite system AB be in an entangled pure state |AB⟩; then, because of entanglement, its Schmidt decomposition must contain more than one term,

\[
|AB\rangle = \sum_{i\in I} \lambda_i\, |i_A\rangle |i_B\rangle, \quad \text{with } |I| > 1, \qquad (1.33)
\]

where {|i_A⟩} and {|i_B⟩} are orthonormal bases. The corresponding density matrix is obviously $\rho_{AB} = |AB\rangle\langle AB| = \sum_{i,j\in I} \lambda_i \lambda_j\, |i_A\rangle |i_B\rangle \langle j_A| \langle j_B|$. As usual, the density matrix of subsystem B can be found by tracing out system A,

\[
\rho_B = \operatorname{tr}_A \rho_{AB} = \sum_{i,j,k\in I} \lambda_i \lambda_j \,\langle k_A|i_A\rangle \langle j_A|k_A\rangle\, |i_B\rangle\langle j_B| = \sum_{i\in I} \lambda_i^2\, |i_B\rangle\langle i_B|.
\]

Now, because of the assumption |I| > 1 in (1.33) and the fact that the |i_B⟩ are orthonormal, ρ_B is a mixture of more than one orthogonal state and hence cannot be pure. Thus by equation (1.29a) S(B) > 0, while AB is pure, so by (1.29b) S(A,B) = 0, and hence by (1.26c) S(A|B) < 0. The last steps can be repeated backwards, and the conclusion to be drawn is that a pure composite system AB is entangled if and only if S(A|B) < 0.

Of course, in classical information theory conditional entropy can only satisfy H(X|Y) ≥ 0 (property 2 in subsection 1.1.2), and that is obviously why entangled states have no classical counterpart! This is an exclusive feature of quantum information theory; a very intriguing, or better said, a very entangled feature, which is still not fully understood by physicists. However, it has remarkable applications, such as quantum cryptography, which will be the main topic of chapter 4. Concerning the nomenclature, entangled states are named after the fact that, by (1.26c), S(A|B) < 0 ⟺ S(A,B) < S(B), which means that in quantum mechanics the ignorance about a system B can exceed the ignorance about both A and B! This suggests some correlation between the two systems.

How can nature have such a peculiar property? Imagine a simple pair of quantum particles, each with two possible states |0⟩ and |1⟩. A possible realization of entanglement is the state

\[
|\psi\rangle = \frac{|0\rangle \otimes |0\rangle}{\sqrt{2}} + \frac{|1\rangle \otimes |1\rangle}{\sqrt{2}}.
\]

After a measurement M_m of the first particle, for example, according to (1.12) and up to normalization,

\[
M_m |\psi\rangle = 1\,(|m\rangle \otimes |m\rangle) + 0\,(|1-m\rangle \otimes |1-m\rangle) = |m\rangle \otimes |m\rangle,
\]

hence both particles collapse to the state |m⟩. This example sheds light on the quantum property whereby the ignorance about both particles is greater than the ignorance about one of them, since perfect knowledge of one implies perfect knowledge of the other.
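The entropy bookkeeping for this example can be sketched numerically; since every state involved is diagonal in the computational basis, only eigenvalue lists are needed (base-2 logarithms assumed):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

S_AB = h([1.0])           # the joint state is pure: one unit eigenvalue
S_A = h([0.5, 0.5])       # rho_A is maximally mixed
S_B = h([0.5, 0.5])       # and so is rho_B
S_cond = S_AB - S_B       # S(A|B), eq. (1.26c)
S_mut = S_A + S_B - S_AB  # S(A:B), eq. (1.26d)
print(S_cond < 0, S_mut)  # True 2.0
```

The conditional entropy S(A|B) = −1 is negative, the signature of entanglement derived above, and the mutual information reaches 2 bits.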

1.2.5 Basic properties of von Neumann entropy

The basic properties of von Neumann entropy, which can be compared to the properties of Shannon entropy

discussed in subsection 1.1.2, are:

1. S(A,B) = S(B,A), S(A:B) = S(B:A).

2. Unitary operations preserve entropy: S(UρU†) = S(ρ).

3. Subadditivity: S(A,B) ≤ S(A) + S(B).

4. S(A,B) ≥ |S(A) − S(B)| (triangle or Araki-Lieb inequality).

5. Strict concavity of the entropy: Suppose there are probabilities p_i ≥ 0 and corresponding density matrices ρ_i; then S(Σ_i p_i ρ_i) ≥ Σ_i p_i S(ρ_i), and S(Σ_i p_i ρ_i) = Σ_i p_i S(ρ_i) ⇔ the ρ_i for which p_i > 0 are all identical.


6. Upper bound for a mixture of states: Suppose ρ = Σ_i p_i ρ_i, where p_i are probabilities and ρ_i the corresponding density matrices; then S(ρ) ≤ Σ_i p_i S(ρ_i) + H(p_i), and S(ρ) = Σ_i p_i S(ρ_i) + H(p_i) ⇔ the ρ_i have support on orthogonal subspaces.

7. Strong subadditivity: S(A,B,C) + S(B) ≤ S(A,B) + S(B,C), or equivalently S(A) + S(B) ≤ S(A,C) + S(B,C).

8. Conditioning reduces entropy: S(A|B,C) ≤ S(A|B).

9. Discarding quantum systems never increases mutual information: Suppose ABC is a composite quantum system; then S(A:B) ≤ S(A:B,C).

10. Trace-preserving quantum operations never increase mutual information: Suppose AB is a composite quantum system and E is a trace-preserving quantum operation on system B. Let S(A:B) denote the mutual information between systems A and B before E is applied to system B, and S(A:B′) the mutual information after E is applied to system B. Then S(A:B′) ≤ S(A:B).

11. Relative entropy is jointly convex in its arguments: let 0 ≤ λ ≤ 1; then

\[
S\big(\lambda A_1 + (1-\lambda) A_2 \,\big\|\, \lambda B_1 + (1-\lambda) B_2\big) \le \lambda S(A_1\|B_1) + (1-\lambda) S(A_2\|B_2).
\]

12. The relative entropy is monotonic: S(ρ^A‖σ^A) ≤ S(ρ^{AB}‖σ^{AB}).
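Properties 3 and 4 can be spot-checked on a classically correlated state, diagonal in the computational basis, where all entropies reduce to Shannon entropies of eigenvalue lists (the joint eigenvalues are illustrative):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Illustrative joint eigenvalues of a diagonal two-qubit state.
p = {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.2}

S_AB = h(p.values())
S_A = h([p[0, 0] + p[0, 1], p[1, 0] + p[1, 1]])
S_B = h([p[0, 0] + p[1, 0], p[0, 1] + p[1, 1]])

print(S_AB <= S_A + S_B)       # subadditivity (property 3)
print(S_AB >= abs(S_A - S_B))  # Araki-Lieb inequality (property 4)
```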

Proof

1. Obvious from the definitions of joint entropy (1.26a) and mutual information (1.26d).

2. Let U be a unitary matrix; then S(UρU†) = −tr(UρU† log(UρU†)) = −tr(UρU† U(log ρ)U†), where the fact that UρU† and ρ are similar, and hence have the same eigenvalues, was employed. Furthermore U is unitary, so U†U = I, and using tr(AB) = tr(BA) the proof is concluded:

\[
S(U\rho U^{\dagger}) = -\operatorname{tr}\!\left(U\rho(\log\rho)U^{\dagger}\right) = -\operatorname{tr}\!\left(U^{\dagger}U\rho\log\rho\right) = -\operatorname{tr}(\rho\log\rho) = S(\rho).
\]

3. Refer to [1, p. 516].

4. Assume a fictitious state R purifying the system ρ_AB; then by applying subadditivity (property 3), S(R) + S(A) ≥ S(A,R). Because ρ_ABR is pure, by (1.32) S(R) = S(A,B) and S(A,R) = S(B). Combining these with the inequality gives S(A,B) ≥ S(B) − S(A). Moreover, because S(A,B) = S(B,A), A and B can be interchanged, so S(A,B) ≥ S(A) − S(B), and thus S(A,B) ≥ |S(A) − S(B)|. It is obvious by (1.31) that equality holds if ρ_AR = ρ_A ⊗ ρ_R. This is hard to interpret, because the system R was artificially introduced. Another way to obtain the equality condition is to assume ρ_AB has a spectral decomposition $\rho_{AB} = \sum_{ik} \lambda_{ik}\, |i_A\rangle\langle i_A| \otimes |k_B\rangle\langle k_B|$, so that obviously $\rho_A = \operatorname{tr}_B \rho_{AB} = \sum_i \big(\sum_k \lambda_{ik}\big) |i_A\rangle\langle i_A|$ and $\rho_B = \sum_k \big(\sum_i \lambda_{ik}\big) |k_B\rangle\langle k_B|$; then one can write

\[
S(A,B) = S(B) - S(A) \iff \sum_{ik}\lambda_{ik}\log\lambda_{ik} = \sum_{ik}\lambda_{ik}\log\frac{\sum_j \lambda_{jk}}{\sum_m \lambda_{im}} \iff \lambda_{ik} = \frac{\sum_j \lambda_{jk}}{\sum_m \lambda_{im}}.
\]

Summing over k in the last equation,

\[
\sum_k \lambda_{ik} = \frac{\sum_{kj}\lambda_{jk}}{\sum_m \lambda_{im}} = \frac{1}{\sum_m \lambda_{im}} \iff \Big(\sum_k \lambda_{ik}\Big)^2 = 1 \iff \sum_k \lambda_{ik} = 1,
\]

where the last relation holds because λ_{ik} ≥ 0. Now combining the last two outcomes,

\[
S(A,B) = S(B) - S(A) \iff \lambda_{ik} = \sum_j \lambda_{jk} \equiv \lambda_k.
\]

Rephrasing this result, the equality condition holds if and only if the matrices tr_B(|i⟩⟨i|) have a common eigenbasis and the matrices tr_A(|i⟩⟨i|) have orthogonal support. As an example of the above comment, consider the systems ρ_A = |0⟩⟨0| and ρ_B = ½|0⟩⟨0| + ½|1⟩⟨1|; the entropy of each is S(A) = 0 and S(B) = 1, and the joint system ρ_AB = ρ_A ⊗ ρ_B = ½|00⟩⟨00| + ½|01⟩⟨01| has entropy S(A,B) = 1.


5. Introduce an auxiliary system B whose state has an orthonormal basis |i⟩ corresponding to the states ρ_i of system A. The joint system has state ρ_AB = Σ_i p_i ρ_i ⊗ |i⟩⟨i|, and its entropy, according to the joint entropy theorem (1.30), is S(A,B) = H(p_i) + Σ_i p_i S(ρ_i). Moreover, the entropy of each subsystem is S(A) = S(Σ_i p_i ρ_i) and S(B) = S(Σ_i p_i |i⟩⟨i|) = H(p_i); applying subadditivity (property 3), S(A,B) ≤ S(A) + S(B), then yields Σ_i p_i S(ρ_i) ≤ S(Σ_i p_i ρ_i), q.e.d. Concerning the equality conditions, assume ρ = ρ_i for every i; then ρ_AB = Σ_i p_i ρ_i ⊗ |i⟩⟨i| = ρ^A ⊗ Σ_i p_i |i⟩⟨i| = ρ^A ⊗ ρ^B, and equality follows from (1.31).

Conversely, suppose S(Σ_i p_i ρ_i) = Σ_i p_i S(ρ_i), and that there is at least one density matrix not equal to the others, say σ ≡ ρ_j; then

\[
S(q\rho + p\sigma) = qS(\rho) + pS(\sigma), \qquad (1.34)
\]

where the quantities q ≡ Σ_{i≠j} p_i, p ≡ p_j and ρ ≡ ρ_i for i ≠ j were defined. If the density matrices ρ, σ have spectral decompositions, say $\rho = \sum_i \lambda_i |i_\rho\rangle\langle i_\rho|$ and $\sigma = \sum_j \kappa_j |j_\sigma\rangle\langle j_\sigma|$, it is easy to find out that

\[
qS(\rho) + pS(\sigma) = -\sum_m q\lambda_m \log(q\lambda_m) - \sum_m p\kappa_m \log(p\kappa_m). \qquad (1.35)
\]

This was the right-hand side of (1.34). To get the left-hand side, assume that the matrix qρ + pσ has a spectral decomposition $q\rho + p\sigma = \sum_m \mu_m |m_{\rho\sigma}\rangle\langle m_{\rho\sigma}|$; one can then find unitary matrices connecting the bases, $|i_\rho\rangle = \sum_m u_{im} |m_{\rho\sigma}\rangle$ and $|j_\sigma\rangle = \sum_m w_{jm} |m_{\rho\sigma}\rangle$. This implies that

\[
\begin{aligned}
S(q\rho + p\sigma) &= -\sum_n \langle n_{\rho\sigma}| \Big( q \sum_{iml} \lambda_i u_{im} u_{il}^*\, |m_{\rho\sigma}\rangle\langle l_{\rho\sigma}| + p \sum_{jml} \kappa_j w_{jm} w_{jl}^*\, |m_{\rho\sigma}\rangle\langle l_{\rho\sigma}| \Big) \\
&\qquad\quad \times \log \Big( q \sum_{iml} \lambda_i u_{im} u_{il}^*\, |m_{\rho\sigma}\rangle\langle l_{\rho\sigma}| + p \sum_{jml} \kappa_j w_{jm} w_{jl}^*\, |m_{\rho\sigma}\rangle\langle l_{\rho\sigma}| \Big) |n_{\rho\sigma}\rangle \qquad (1.36) \\
&= -\sum_n (q\lambda_n + p\kappa_n) \log (q\lambda_n + p\kappa_n).
\end{aligned}
\]

In the last step the fact that u_{ij} and w_{ij} are unitary ($UU^{\dagger} = I \Rightarrow \sum_j u_{ij} u_{lj}^* = \delta_{il}$) was employed. Taking the left- and right-hand sides of equation (1.34) as found in (1.36) and (1.35) respectively, it is simple to verify that

\[
\sum_m q\lambda_m \log\!\Big(1 + \frac{p\kappa_m}{q\lambda_m}\Big) + \sum_m p\kappa_m \log\!\Big(1 + \frac{q\lambda_m}{p\kappa_m}\Big) = 0.
\]

The fact that both logarithms are non-negative implies that both summations are greater than or equal to zero, and the last equality leaves no other case than both of them being zero. This can only happen if, for the non-zero λ_m, the κ_m are null. An alternative proof that the von Neumann entropy is concave can be presented by defining f(p) ≡ S(pρ + (1−p)σ). By calculus, if f″(p) ≤ 0 then f is concave, that is, for 0 ≤ p ≤ 1, f(px + (1−p)y) ≥ pf(x) + (1−p)f(y). Selecting x = 1, y = 0 gives f(p) ≥ pf(1) + (1−p)f(0), which, according to the definition of f, implies that S(pρ + (1−p)σ) ≥ pS(ρ) + (1−p)S(σ).
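The concavity just established can be spot-checked numerically; a minimal sketch for commuting (diagonal) density matrices, where the von Neumann entropy reduces to the Shannon entropy of the eigenvalues (the eigenvalue lists and weights are illustrative):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(x * math.log2(x) for x in probs if x > 0)

rho, sigma = [0.9, 0.1], [0.2, 0.8]  # eigenvalues of two distinct states
q, p = 0.6, 0.4                      # mixing probabilities

mix = [q * a + p * b for a, b in zip(rho, sigma)]

# Strict concavity: mixing distinct states strictly increases entropy.
print(h(mix) > q * h(rho) + p * h(sigma))  # True
```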

6. Refer to [1, p.518,519].

7. For the proof refer to [1, pp. 519-521]. The fact that the two inequalities are equivalent will be shown here. If S(R) + S(B) ≤ S(R,C) + S(B,C) holds, then by introducing an auxiliary system A which purifies the system RBC, S(R) = S(A,B,C) and S(R,C) = S(A,B), so the last inequality becomes S(A,B,C) + S(B) ≤ S(A,B) + S(B,C). Conversely, if S(R,B,C) + S(B) ≤ S(R,B) + S(B,C) holds, then by again inserting a system A purifying RBC, because S(R,B,C) = S(A) and S(R,B) = S(A,C), the last inequality becomes S(A) + S(B) ≤ S(A,C) + S(B,C). From this, another equivalent form of strong subadditivity is 0 ≤ S(C|A) + S(C|B), or S(A) + S(B) − S(A,B) + S(A) + S(C) − S(A,C) ≤ 2S(A) ⇔ S(A:B) + S(A:C) ≤ 2S(A). This inequality corresponds to the second property of Shannon entropy, H(X:Y) ≤ H(X), which is not always true in quantum information theory. For example, the composite system |AB⟩ = (1/√2)|00⟩ + (1/√2)|11⟩ has entropy S(A,B) = 0 because it is pure, and ρ_A = ½|0⟩⟨0| + ½|1⟩⟨1|, thus S(A) = 1 and similarly S(B) = 1. Hence S(A:B) = 2 > S(A).

8. Refer to [1, p.523].

9. Refer to [1, p.523].

10. Refer to [1, p.523].

11. For the proof refer to [1, p. 520]. Concerning this property, it should be emphasized that joint convexity implies convexity in each input; this is obvious by selecting B1 = B2 ≡ B or A1 = A2 ≡ A. The converse is not true; stated here for concavity, which is the mirror case, the function f(x,y) = −x²e^y is concave in x, because ∂²f/∂x² = −2e^y ≤ 0 for every x, y, and concave in y, because ∂²f/∂y² = −x²e^y ≤ 0 for every x, y. It is nevertheless not jointly concave; for instance, with the illustrative points (x₁,y₁) = (0,0) and (x₂,y₂) = (2,−2),

\[
f\Big(\tfrac{x_1+x_2}{2}, \tfrac{y_1+y_2}{2}\Big) = f(1,-1) \approx -0.37 < \tfrac{1}{2}f(0,0) + \tfrac{1}{2}f(2,-2) \approx -0.27,
\]

violating the requirement f(midpoint) ≥ midpoint of the values.
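The failure of joint concavity for f(x, y) = −x²e^y can be checked directly; the two evaluation points below are illustrative choices:

```python
import math

def f(x, y):
    """A function concave in each variable separately, not jointly."""
    return -x * x * math.exp(y)

p1, p2 = (0.0, 0.0), (2.0, -2.0)
mid = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)

lhs = f(*mid)                      # f evaluated at the midpoint
rhs = 0.5 * f(*p1) + 0.5 * f(*p2)  # midpoint of the values
print(lhs < rhs)  # True: joint concavity would require lhs >= rhs
```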

12. Refer to [1, pp. 524-525].

Using strict concavity to prove other properties of von Neumann entropy

Strict concavity (property 5) of the von Neumann entropy can be used to prove (1.28b), and moreover that the completely mixed state (1/d)I on a space of d dimensions is the unique state of maximal entropy. To do this the following result is stated: for a d × d normal matrix A there exists a set of unitary matrices U_i^{(A)} such that

\[
\sum_{i=1}^{d} U_i^{(A)} A\, U_i^{(A)\dagger} = \operatorname{tr}(A)\, I. \qquad (1.37)
\]

For a proof refer to appendix A.1 (the proof of a more general proposition is due to Abbas Edalat). To prove the uniqueness of (1/d)I as the maximal state, take A ≡ ρ, any quantum state; then

\[
S(\rho) = \sum_{i=1}^{d} \frac{1}{d}\, S\!\left(U_i^{(\rho)} \rho\, U_i^{(\rho)\dagger}\right) \le S\!\left(\sum_{i=1}^{d} \frac{1}{d}\, U_i^{(\rho)} \rho\, U_i^{(\rho)\dagger}\right) = S\!\left(\tfrac{1}{d} I\right),
\]

where property 2 gives the first equality, concavity (property 5) the inequality, and (1.37) the final step. Hence any state ρ has entropy less than or equal to that of the completely mixed state (1/d)I, and by strict concavity, for the entropies to be equal the states must be identical.
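Equation (1.37) can be illustrated for a diagonal (hence normal) matrix, taking cyclic permutations of the eigenbasis as the unitaries U_i; this particular choice is an assumption in the spirit of the appendix A.1 construction, not a quotation of it:

```python
d = 4
A = [3.0, 1.0, 0.5, 2.5]  # diagonal entries of a normal matrix, tr(A) = 7

# Conjugating a diagonal matrix by the i-th cyclic permutation just
# shifts its diagonal by i; summing the d shifts gives tr(A) * I.
total = [sum(A[(j + i) % d] for i in range(d)) for j in range(d)]
print(total)  # [7.0, 7.0, 7.0, 7.0]
```

Every diagonal entry of the sum equals tr(A), i.e. the sum is tr(A) times the identity, as (1.37) asserts.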

Using von Neumann entropy, a proof of the concavity of Shannon entropy can be provided. Let p_i be a probability distribution and q^{(i)} a family of probability distributions. Then

\[
H\Big(\sum_i p_i q^{(i)}\Big) = S\Big(\sum_i p_i \sum_j q_j^{(i)}\, |j\rangle\langle j|\Big) \ge \sum_i p_i\, S\Big(\sum_j q_j^{(i)}\, |j\rangle\langle j|\Big) = \sum_i p_i H\big(q^{(i)}\big), \qquad (1.38)
\]

where the inequality is property 5, with equality if and only if the matrices Σ_j q_j^{(i)} |j⟩⟨j| are the same, that is, the distributions q^{(i)} are identical.

1.2.6 Measurement in quantum physics and information theory

As already noted in the introduction of the present section, quantum physics seems very puzzling. This is because there are two types of evolution a quantum system can undergo: unitary evolution and measurement. One understands that the first is needed to preserve probability during evolution (see subsection 1.2.1). Why, then, is a second type of evolution needed? Information theory can explain this, using the following results:

1. Projective measurement can increase entropy. This is derived using strict concavity. Let P be a projector and Q ≡ I − P the complementary projector; then there exist unitary matrices U1, U2 and a probability p such that for all ρ, PρP + QρQ = pU1ρU1† + (1−p)U2ρU2† (refer to A.2 for a proof), thus

\[
S(P\rho P + Q\rho Q) = S\!\left(pU_1\rho U_1^{\dagger} + (1-p)U_2\rho U_2^{\dagger}\right) \ge pS\!\left(U_1\rho U_1^{\dagger}\right) + (1-p)S\!\left(U_2\rho U_2^{\dagger}\right) = pS(\rho) + (1-p)S(\rho) = S(\rho), \qquad (1.39)
\]

where the inequality is property 5 and the second-to-last equality is property 2. Because of strict concavity, equality holds if and only if PρP + QρQ = ρ.


2. General measurement can decrease entropy. One can convince oneself by considering a qubit in the state ρ = ½|0⟩⟨0| + ½|1⟩⟨1|, which is not pure, thus S(ρ) > 0, measured using the measurement matrices M1 = |0⟩⟨0| and M2 = |0⟩⟨1|. If the result of the measurement is unknown, the state of the system afterwards is ρ′ = M1ρM1† + M2ρM2† = |0⟩⟨0|, which is pure, hence S(ρ′) = 0 < S(ρ).
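A numeric sketch of this example, representing the diagonal matrices by their eigenvalue lists (base-2 logarithms):

```python
import math

def h(probs):
    """Shannon entropy (base 2) of a probability list, skipping zeros."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

rho = [0.5, 0.5]        # eigenvalues of the maximally mixed qubit
S_before = h(rho)
# M1 rho M1^dag contributes (1/2)|0><0|, and M2 rho M2^dag moves the
# |1><1| weight onto |0><0| as well, so the post-measurement state is
# |0><0|, with eigenvalues:
rho_after = [1.0, 0.0]
S_after = h(rho_after)
print(S_before, S_after == 0)  # 1.0 True
```

The measurement has erased one full bit of entropy, exactly the opposite of the projective case in result 1.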

3. Unitary evolution preserves entropy. This was already seen as property 2 of the von Neumann entropy.

Now one should recall the information-theoretic interpretation of entropy given throughout this chapter: entropy measures the ignorance one has about a system. Result 3 shows that if only unitary evolution were present in quantum theory, then no knowledge about any physical system could ever be gained! One is relieved to see that ignorance can decrease or increase through measurements, as shown by results 1 and 2, and of course that is what measurements were meant to do in the first place.


Chapter 2

Some important results of information

theory

Some very important results derived from information theory, which will be useful for the development of the next two chapters, concern the amount of accessible information and how data can be processed. These are presented, respectively, in section 2.1 and section 2.2, both for the classical and the quantum case.

2.1 Accessible information

Information is not always perfectly known; for example, during transmission over a noisy channel there is a possibility of information loss. This means that obtaining upper bounds on accessible information can be very useful in practice. These upper bounds are calculated in subsection 2.1.1 for the classical case and in subsection 2.1.2 for the quantum case.

2.1.1 Accessible classical information: Fano's inequality

Of major importance, in classical information theory, is the amount of information that can be extracted from

a random variable X based on the knowledge of another random variable Y. That should be given as an upper

˜

bound for H (X|Y ) , and is going to be calculated next. Suppose X f (Y ) is some function which is used

˜

as the best guess for X. Let pe p X = X be the probability that this guess is incorrect. Then an ™error™

random variable can be de¬ned

˜

1, X = X

E ˜,

0, X = X

thus H (E) = H (pe ) . Using the chaining rule for conditional entropies H (E, X|Y ) = H (E|X, Y ) + H (X|Y )

(Shannon entropy, property 8), however E is completely determined once X and Y are known, so H (E|X, Y ) = 0