and

Applications to Quantum Cryptography

By

Nikolaos P. Papadakos

B.Sc. in Mathematics, University of Athens, 2001

Supervisor

Professor Abbas Edalat

Individual Study Option,

on the First Term of the Academic year 2001-2002,

for the M.Sc. in Advanced Computing

Department of Computing

Imperial College of Science, Technology and Medicine

University of London, United Kingdom

Contents

List of Figures iii

Acknowledgements iv

About this Individual Study Option v

1 Physics and information theory 1

1.1 Classical physics and information theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.1 Shannon entropy and relevant measures of classical information theory . . . . . . . . . . . 2

1.1.2 Basic properties of Shannon entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Quantum physics and information theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2.1 Basic features of quantum mechanics relevant to information theory . . . . . . . . . . . . 5

1.2.2 The language of quantum information theory: quantum thermodynamics . . . . . . . . . 7

1.2.3 Von Neumann entropy and relevant measures of quantum information theory . . . . . . . 9

1.2.4 A great discrepancy between classical and quantum information theory: quantum entanglement . . . . 11

1.2.5 Basic properties of von Neumann entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.2.6 Measurement in quantum physics and information theory . . . . . . . . . . . . . . . . . . 14

2 Some important results of information theory 16

2.1 Accessible information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.1.1 Accessible classical information: Fano's inequality . . . . . . . . . . . . . . . . 16

2.1.2 Accessible quantum information: quantum Fano inequality and the Holevo bound . . . . 17

2.2 Data processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2.1 Classical data processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2.2 Quantum data processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3 Real-life applications of information theory 20

3.1 Data compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3.1.1 Shannon's noiseless channel coding theorem . . . . . . . . . . . . . . . . . . . . . 20

3.1.2 Schumacher's quantum noiseless channel coding theorem . . . . . . . . . . . . . . . 21

3.2 Information over noisy channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2.1 Error correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2.2 Classical information over noisy classical channels . . . . . . . . . . . . . . . . . . . . . . 24

3.2.3 Classical information over noisy quantum channels . . . . . . . . . . . . . . . . . . . . . . 25

3.2.4 Quantum information over noisy quantum channels . . . . . . . . . . . . . . . . . . . . . . 26

4 Practical quantum cryptography 27

4.1 Theoretical principles of quantum cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.1 The BB84 quantum key distribution protocol . . . . . . . . . . . . . . . . . . . . . . . . . 28

4.1.2 Information reconciliation and privacy amplification . . . . . . . . . . . . . . . . 29

4.1.3 Privacy and security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.1.4 A provably secure version of the BB84 protocol . . . . . . . . . . . . . . . . . . . 30

4.2 A commercial quantum cryptographic device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4.2.1 Experimental quantum key distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.2.2 Commercial quantum cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31


Summary 33

Appendices 35

A Omitted proofs 35

A.1 Special diagonal transformation of normal matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 35

A.2 Projective measurements are equivalent to an addition of unitary operations . . . . . . . . . . . . 36

B Distance measures of information 37

B.1 Classical information distance: Hamming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

B.2 Quantum information distance: Fidelity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Bibliography 38


List of Figures

1.1 Relationships between different entropies. . . . . . . . . . . . . . . . . . . . . . . 3

1.2 A summary of the basic information-relevant features of quantum mechanics. . . . . . . 7

1.3 Transition from classical to quantum thermodynamics. . . . . . . . . . . . . . . . . . . . . . . . . 9

3.1 Quantum data compression. The compression operation C^n compresses a quantum source ρ stored in n log d qubits into nS(ρ) qubits. The source is accurately recovered via the decompression operation D^n. . . . . 22

3.2 The noisy channel coding problem for classical messages. It is required that every one of the 2^{nR} possible messages should be sent uncorrupted through the channel with high probability. . . . . 25

4.1 Alice and Bob communicating with the fear of Eve stealing information. . . . . . . . . . . . . . . 27

4.2 The setup of the BB84 quantum key distribution protocol. The quantum line is used for the distribution of the key, while the classical line is used for discussing security topics and for the transmission of encrypted messages. . . . . 28

4.3 Typical system for quantum cryptography using polarization coding (LD: laser diode, BS: beamsplitter, F: neutral density filter, PBS: polarizing beam splitter, λ/2: half waveplate, APD: avalanche photodiode). . . . . 31

S.1 Summary of classical and quantum information theory. . . . . . . . . . . . . . . . . . . . . . . . . 33

S.2 Summary of quantum cryptography. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34


Acknowledgements

I am very pleased to thank my supervisor, Professor Abbas Edalat, whose important comments not only helped me improve the presentation of this Individual Study Option; by suggesting that I solve all 64 exercises of the last two chapters of [1, chap. 11, 12], he also contributed dramatically to my understanding and augmented my self-confidence in the field. I am indebted to him for his effort in transforming the present material into something more than just an Individual Study Option.


About this Individual Study Option

In this Individual Study Option the concepts of classical and quantum information theory are presented, with

some real-life applications, such as data compression, information transmission over noisy channels and quantum

cryptography.

Abstract

In each chapter the following topics are covered:

1. The relation of information theory to physics is studied, and, using the notion of Shannon's entropy, measures of information are introduced. Their basic properties are then proven. Moreover, important information-relevant features of quantum physics, such as the impossibility of distinguishing or cloning states in general, are studied. In order to obtain a quantum measure of information, an introduction to quantum thermodynamics is given, with a special focus on explaining the utility of the density matrix. After this, von Neumann entropy and its related measures are defined. In this context a major discrepancy between classical and quantum information theory is presented: quantum entanglement. The basic properties of von Neumann entropy are proven, and an information-theoretic interpretation of quantum measurement is given.

2. The amount of accessible information is bounded by Fano's inequality in the classical case, and by its quantum analogue and Holevo's bound in the quantum case. In addition, classical and quantum data processing are discussed.

3. Furthermore, as real-life applications, data compression is studied via Shannon's classical and Schumacher's quantum noiseless channel coding theorems. Another application is the transmission of classical information over noisy channels. For this, a summary of classical and quantum error correction is given, and then Shannon's classical noisy channel coding theorem and the Holevo-Schumacher-Westmoreland quantum noisy channel coding theorem are studied. The present state of the transmission of quantum information over quantum channels is summarized.

4. A practical application of the aforementioned results is presented: quantum cryptography. In this context BB84, a quantum key distribution protocol, is demonstrated and its security is discussed. The current experimental status of quantum key distribution is summarized, and the possibility of a commercial device realizing quantum cryptography is presented.

Special viewpoints emphasized

The above material is closely connected to the last two chapters of [1, chap. 11, 12], and the error correction part is a summary of the results given in chapter 10 of [1]. From this book Figures 1.1, 3.1, 3.2 and S.1 were extracted. However, the treatment of information theory was much influenced by [2], and some results given therein, for example equation (3.1), were extended to (3.2). Of equal impact was [3] concerning error correction, and [3, 4] concerning quantum cryptography. Note that Figure 4.2 was extracted from [4]. Of course, some notions which were vital to this Individual Study Option were taken from almost all the chapters of [1].

However, there were some topics not well explained or sometimes not sufficiently emphasized. The most important tool of quantum information theory, the density matrix, is misleadingly described in [1, p.98]: "the density matrix [...] is mathematically equivalent to the state vector approach, but it provides a much more convenient language for thinking about commonly encountered scenarios in quantum mechanics". The density matrix is not equivalent to the state vector approach, and it is much more than just a convenient language. The former is a tool for quantum thermodynamics, a field where it is impossible to use the latter; rephrasing, quantum thermodynamics is the only possible language there. In order to augment the understanding of this meaningful tool, subsection 1.2.2 is devoted to it, and it stays close to the viewpoint of [5, p.295-307], a classic textbook on quantum mechanics.

In this Individual Study Option some topics are presented from a different perspective. There is an attempt to emphasize the information-theoretic discrepancies between classical and quantum information theory, the greatest of which is perhaps quantum entanglement. A very special demonstration of this difference is given in subsection 1.2.4. It should also be noted that a particular information-theoretic meaning of measurement in physics is presented in subsection 1.2.6.

Concerning the different views presented in this text, nothing could have been more beneficial than the 64 exercises of the last two chapters of [1, chap. 11, 12]. As an example, the delicate presentation of measures of information given on page 1 is due to exercise 11.2 [1, p.501], and subsection 1.2.4 on quantum entanglement was inspired by exercise 11.14 [1, p.514]. On some occasions special properties were proved in order to solve the exercises. One such property is the preservation of quantum entropy by unitary operations (property 2 of von Neumann entropy, on page 11), which was needed to solve, for example, exercises 11.19 and 11.20 [1, p.517, 518]. For these last two exercises some special lemmas are proven in appendices A.1 and A.2. It is important to notice that only about half of the above-mentioned exercises of [1, chap. 11, 12] are presented here.

Notation

Finally, concerning the mathematical notation involved: the symbol ≜ should be interpreted as "defined by", and the symbol ≡ as "to be identified with".


Chapter 1

Physics and information theory

Since its invention, computer science [6] has been considered a branch of mathematics, in contrast to information theory [7, 8], which was viewed through its physical realization; quoting Rolf Landauer [9], "information is physical". The last decades changed the landscape, and both computers and information are now mostly approached through their physical implementations [10, 11, 12, 13]. This view is not only more natural, but in the case of quantum laws it gives very exciting results and sometimes an intriguing view of what information is or can be. Such an understanding could never have been inspired by mathematics alone. Moreover, there is a possibility that the inverse relation between physics and information exists [14, 15]; quoting Steane [3], one could find new methods of studying physics by "the ways that nature allows, and prevents, information to be expressed and manipulated, rather than [the ways it allows] particles to move". Such a program is still in its infancy; however, one relevant application is presented in subsection 1.2.6. In this chapter the well-established information theory based on classical physics is presented in section 1.1, and the corresponding up-to-date results for quantum physics are analyzed in section 1.2.

1.1 Classical physics and information theory

In classical physics all entities have definite properties, which can be known up to any desired accuracy. This fact gives a simple pattern for storing and transmitting information: assign information content to each of the properties a physical object can have. For example, storage can be realized by writing on paper, where information lies in each letter, or on a magnetic disk, where information, in binary digits (bits), is represented by each of the two spin states a magnetic dipole can have. Concerning transmission, speech is one example, where each sound corresponds to a piece of information; a second example is an electronic signal on a wire, where each state of the electric current is related to some piece of information. Unfortunately, in everyday life such simple patterns are non-economical and unreliable. This is because communication is realized by physical entities which are imperfect, and hence can be influenced by environmental noise, resulting in information distortion.

Concerning the economical transmission of information, one can see that the naive pattern of information assignment presented in the last paragraph is not always an optimal choice. This is because a message, in the English language for example, contains symbols with different occurrence frequencies. Looking at this text, for example, one notes immediately that the occurrence probability of the letter a, p_a, is much greater than that of the exclamation mark, p_!. Under the naive assignment, English symbols are encoded into codewords of identical length l; the average space needed to store a is then l·p_a and that of the exclamation mark l·p_!, and since p_a > p_!, a lot of space is wasted on the letter a. To see how an encoding scheme can be economical, consider a four-letter alphabet A, B, C, D with occurrence probabilities p_A = 3/4, p_B = 1/8, p_C = p_D = 1/16, and the prefix-free assignment of bits A → 0, B → 10, C → 110, D → 111. A message of n symbols, using this encoding, needs on average n (p_A + 2p_B + 3p_C + 3p_D) = n (3/4 + 2·1/8 + 3·1/16 + 3·1/16) = 11n/8 bits, instead of the 2n which would be needed if each letter were simply mapped to a two-bit codeword.
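As a numerical illustration (a Python sketch, not part of the source text; the codeword table is a prefix-free code matching these probabilities), the saving of the variable-length encoding over the fixed two-bit code can be checked directly:

```python
from math import log2

# Occurrence probabilities of the four-letter alphabet
probs = {"A": 3 / 4, "B": 1 / 8, "C": 1 / 16, "D": 1 / 16}
# A prefix-free code: no codeword is a prefix of another,
# so any concatenation of codewords decodes unambiguously
code = {"A": "0", "B": "10", "C": "110", "D": "111"}

# Average number of bits per symbol under this encoding
avg_len = sum(p * len(code[s]) for s, p in probs.items())
print(avg_len)  # 1.375 = 11/8 bits, versus 2 bits for a fixed-length code

# Shannon entropy: the theoretical minimum average length
entropy = -sum(p * log2(p) for p in probs.values())
print(entropy)  # ~1.186 bits, so even this code is not quite at the limit
```

The gap between 11/8 and the entropy illustrates that a variable-length code can beat the naive fixed-length assignment without necessarily reaching the theoretical optimum.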

The topics discussed in the last two paragraphs give rise to the most important information-theoretic question: what are the minimal resources needed to communicate reliably? An answer to this question can be given by abstractly quantifying information in terms of the physical resources needed to carry it. Motivated by the four-letter alphabet example demonstrated above, probabilities are going to be used for this abstraction. One now defines a function H, quantifying a piece of information I, which should exhibit the following reasonable properties:


1. H(I) is a function only of the probability of occurrence p of the information I; thus H(I) ≡ H(p).

2. H is a smooth function.

3. The resources needed for two independent pieces of information with individual probabilities p, q > 0 are the sum of the resources needed for each alone; in mathematical language, H(pq) = H(p) + H(q).

The second and third properties imply that H is a logarithmic function, and by setting q = 1 in the third it is immediate to see that H(1) = 0. Hence H(p) = −k log_a p, where k and a are constants to be determined (refer to the comments after equations (1.2) and (1.5)). This means that the average of the resources needed, when one of a mutually exclusive set of pieces of information with probabilities p_1, ..., p_n occurs, is

H(p_1, ..., p_n) = −k Σ_i p_i log_a p_i.    (1.1)

It should be noted that probability is not the only way of quantifying information [14, 15, 16, 17, 18, 19].

1.1.1 Shannon entropy and relevant measures of classical information theory

The function H found in (1.1) is known in physics as entropy, and measures the disorder of a specific statistical system. One interesting physical system is an n-bit computer memory; if all the possible cases of data entry are described by a random variable X with probabilities p_1, ..., p_{2^n}, then the computer's memory has an entropy given by

H(X) ≡ H(p_1, ..., p_{2^n}) ≜ −Σ_x p_x log p_x.    (1.2)

Here a modified version of (1.1) is used, with log ≜ log_2 and k ≜ 1, an assignment to be verified after equation (1.5). Equation (1.2) is known in information theory as the Shannon entropy [7]. There are two complementary ways of understanding Shannon's entropy: it can be considered either as a measure of uncertainty before learning the value of a piece of physical information, or as the information gain after learning it.
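As a minimal illustration (a Python sketch, not from the source), Shannon's entropy (1.2) can be computed directly from a probability distribution:

```python
from math import log2

def shannon_entropy(probs):
    """H(X) = -sum_x p_x log2 p_x, in bits; terms with p = 0
    contribute nothing, since p log p -> 0 as p -> 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # 1.0: a fair coin flip carries one full bit
print(shannon_entropy([0.9, 0.1]))  # ~0.469: a biased coin is less uncertain
print(shannon_entropy([1.0]))       # a certain outcome carries no information
```

The three calls mirror the two readings of the entropy given above: maximal uncertainty for the uniform distribution, less for a biased one, and none for a definite state.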

The Shannon entropy gives rise to other important measures of information. One such is the relative entropy, which is defined by

H(p_x || q_x) ≜ Σ_x p_x log (p_x / q_x) = −H(X) − Σ_x p_x log q_x,    (1.3)

and is a measure of distance between two probability distributions. This is because it can be proven that

H(p_x || q_x) ≥ 0,    (1.4)
H(p_x || q_x) = 0 ⇔ ∀x : p_x = q_x.

Of course it is not a metric, because, as one can check, H(p_x || q_x) = H(q_x || p_x) does not hold in general. The relative

entropy is often useful, not in itself, but because it helps in deriving important results. One such result is derived using the last equality in (1.3) together with (1.4): in a memory of n bits,

H(X) ≤ log 2^n = n,    (1.5)
H(X) = log 2^n = n ⇔ ∀i : p_i = 1/2^n,

which justifies the selection of k = 1 in (1.2), because in the optimal case and in the absence of noise, the maximum physical resources needed to transmit or to store an n-bit word should not exceed n. One can also see from (1.2) that

H(X) ≥ 0,    (1.6)
H(X) = 0 ⇔ system X is in a definite state (some p_x = 1).
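The non-negativity (1.4) and the failure of symmetry can be checked numerically; the following Python sketch (with arbitrarily chosen illustrative distributions, not from the text) implements the relative entropy (1.3) directly:

```python
from math import log2

def relative_entropy(p, q):
    """H(p||q) = sum_x p_x log2(p_x / q_x), in bits.
    Assumes q_x > 0 wherever p_x > 0; terms with p_x = 0 vanish."""
    return sum(px * log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(relative_entropy(p, q))  # ~0.737: strictly positive since p != q
print(relative_entropy(q, p))  # ~0.531: a different value, so H(p||q) != H(q||p)
print(relative_entropy(p, p))  # 0.0: zero exactly when the distributions agree
```

The asymmetry of the two middle values is exactly why the relative entropy fails to be a metric, as noted above.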

Other important results are deduced from the relative entropy, and concern useful entropic quantities such as the joint entropy, the entropy of X conditional on knowing Y, and the common or mutual information of X and Y. Those entropies are correspondingly defined by the subsequent intuitive relations

H(X, Y) ≜ −Σ_{xy} p_{xy} log p_{xy},    (1.7)
H(X|Y) ≜ H(X, Y) − H(Y),    (1.8)
H(X : Y) ≜ H(X) + H(Y) − H(X, Y) = H(X) − H(X|Y),    (1.9)

and can be represented in the 'entropy Venn diagram' as shown in Figure 1.1.

"¢

#"¢

¨¤

! ¤¢

©¨¦¤£¢

§¥ ¡

Figure 1.1: Relationships between different entropies.
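The definitions (1.7)-(1.9) can be exercised on a small example; the following Python sketch (the joint distribution is a hypothetical choice, not from the text) verifies that both equalities of (1.9) agree:

```python
from math import log2

def H(probs):
    """Shannon entropy of a list of probabilities, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A hypothetical joint distribution p(x, y) for two correlated binary variables
p_xy = [0.4, 0.1, 0.1, 0.4]        # p(0,0), p(0,1), p(1,0), p(1,1)
p_x = [0.5, 0.5]                   # marginal distribution over x
p_y = [0.5, 0.5]                   # marginal distribution over y

H_XY = H(p_xy)                     # joint entropy (1.7)
H_X_given_Y = H_XY - H(p_y)        # conditional entropy (1.8)
mutual = H(p_x) + H(p_y) - H_XY    # mutual information (1.9), first equality

print(H_XY)          # ~1.722
print(H_X_given_Y)   # ~0.722
print(mutual)        # ~0.278

# Second equality of (1.9): H(X:Y) = H(X) - H(X|Y)
assert abs(mutual - (H(p_x) - H_X_given_Y)) < 1e-12
```

The correlation between the two variables shows up as a strictly positive mutual information, the overlap region of the Venn diagram in Figure 1.1.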

1.1.2 Basic properties of Shannon entropy

It is worth mentioning here the basic properties of Shannon entropy:

1. H(X, Y) = H(Y, X), H(X : Y) = H(Y : X).

2. H(Y|X) ≥ 0, and thus by the second equality of (1.9), H(X : Y) ≤ H(Y), with equality if and only if Y is a function of X, Y = f(X).

3. H(X) ≤ H(X, Y), with equality if and only if Y is a function of X.

4. Subadditivity: H(X, Y) ≤ H(X) + H(Y), with equality if and only if X and Y are independent random variables.

5. H(Y|X) ≤ H(Y), and thus by the second equality of (1.9), H(X : Y) ≥ 0, with equality in each if and only if X and Y are independent random variables.

6. Strong subadditivity: H(X, Y, Z) + H(Y) ≤ H(X, Y) + H(Y, Z), with equality if and only if Z → Y → X forms a Markov chain.

7. Conditioning reduces entropy: H(X|Y, Z) ≤ H(X|Y).

8. Chaining rule for conditional entropies: let X_1, ..., X_n and Y be any set of random variables; then H(X_1, ..., X_n | Y) = Σ_{i=1}^n H(X_i | Y, X_1, ..., X_{i−1}).

9. Concavity of the entropy: suppose p_i > 0 are probabilities and q_j^{(i)} are probability distributions; then H(Σ_i p_i q_j^{(i)}) ≥ Σ_i p_i H(q_j^{(i)}), with equality if and only if all the q_j^{(i)} are identical.
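Properties 4 and 5 can be spot-checked numerically; the following Python sketch (an illustration, not a substitute for the proofs) samples random joint distributions over a 2x2 alphabet and verifies both inequalities:

```python
from math import log2
import random

def H(probs):
    """Shannon entropy of a list of probabilities, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

random.seed(0)
for _ in range(1000):
    # Draw a random joint distribution p(x, y) over two binary variables
    w = [random.random() for _ in range(4)]
    p00, p01, p10, p11 = (v / sum(w) for v in w)
    H_XY = H([p00, p01, p10, p11])
    H_X = H([p00 + p01, p10 + p11])   # marginal entropy of X
    H_Y = H([p00 + p10, p01 + p11])   # marginal entropy of Y
    assert H_XY <= H_X + H_Y + 1e-9   # property 4: subadditivity
    assert H_XY - H_X <= H_Y + 1e-9   # property 5: H(Y|X) <= H(Y)
print("all sampled joint distributions satisfied properties 4 and 5")
```

Equality in property 4 occurs for independent variables, e.g. the uniform joint distribution, where H(X, Y) = 2 = H(X) + H(Y) exactly.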

The various relationships between entropies may mostly be derived from the 'entropy Venn diagram' shown in Figure 1.1. Such figures are not completely reliable as a guide to the properties of entropy, but they provide a useful mnemonic for remembering the various definitions and properties of entropy. The proofs of the above-mentioned properties follow.

Proof


1. Obvious from the definitions of joint entropy (1.7) and mutual information (1.9).

2. Based on the definition of conditional entropy (1.8),

H(Y|X) ≜ H(X, Y) − H(X) = −Σ_{xy} p(x, y) log p(x, y) + Σ_x p(x) log p(x)
= −Σ_{xy} p(x, y) log p(x, y) + Σ_{xy} p(x, y) log p(x) = −Σ_{xy} p(x, y) log (p(x, y) / p(x))
= −H(p(x, y) || p(x)),

and using (1.4), equality holds if and only if p(x, y) = p(x), that is Y = f(X).

3. It is immediately proven from property 2 and the definition of conditional entropy (1.8).

4. Following the subsequent steps of calculation,

H(p(x, y) || p(x) p(y)) ≜ −H(p(x, y)) − Σ_{xy} p(x, y) log p(x) p(y)