
Computing and Communications 2. Information Theory - Entropy, Ying Cui - PowerPoint PPT Presentation



  1. Computing and Communications 2. Information Theory - Entropy. Ying Cui, Department of Electronic Engineering, Shanghai Jiao Tong University, China. 2017, Autumn

  2. Outline • Entropy • Joint entropy and conditional entropy • Relative entropy and mutual information • Relationship between entropy and mutual information • Chain rules for entropy, relative entropy and mutual information • Jensen’s inequality and its consequences

  3. Reference • T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley

  4. OVERVIEW

  5. Information Theory • Information theory answers two fundamental questions in communication theory – what is the ultimate data compression? -- entropy H – what is the ultimate transmission rate of communication? -- channel capacity C • Information theory is considered a subset of communication theory

  6. Information Theory • Information theory has made fundamental contributions to other fields

  7. A Mathematical Theory of Commun. • In 1948, Shannon published “A Mathematical Theory of Communication”, founding information theory • Shannon made two major modifications that have had a huge impact on communication design – the source and channel are modeled probabilistically – bits became the common currency of communication

  8. A Mathematical Theory of Commun. • Shannon proved the following three theorems – Theorem 1. The minimum compression rate of the source is its entropy rate H – Theorem 2. The maximum reliable rate over the channel is its mutual information I – Theorem 3. End-to-end reliable communication happens if and only if H < I, i.e., there is no loss in performance by using a digital interface between source and channel coding • Impacts of Shannon’s results – after almost 70 years, all communication systems are designed based on the principles of information theory – the limits not only serve as benchmarks for evaluating communication schemes, but also provide insights into designing good ones – the basic information-theoretic limits in Shannon’s theorems have now been successfully achieved using efficient algorithms and codes

  9. ENTROPY

  10. Definition • Entropy is a measure of the uncertainty of a r.v. • Consider a discrete r.v. X with alphabet $\mathcal{X}$ and p.m.f. $p(x) = \Pr[X = x],\ x \in \mathcal{X}$; its entropy is $H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$ – log is to the base 2, and entropy is expressed in bits • e.g., the entropy of a fair coin toss is 1 bit – define $0 \log 0 = 0$, since $x \log x \to 0$ as $x \to 0$ • adding terms of zero probability does not change the entropy
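
As a quick illustration of this definition (a minimal sketch, not from the slides; the function name entropy is my own), the following Python snippet computes H(X) in bits and reproduces the fair-coin example:

```python
import math

def entropy(pmf):
    """Entropy in bits: H(X) = -sum_x p(x) log2 p(x); zero-probability
    terms are skipped, matching the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

print(entropy([0.5, 0.5]))       # fair coin toss: 1.0 bit
print(entropy([0.5, 0.5, 0.0]))  # adding a zero-probability outcome: still 1.0 bit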

  11. Properties – entropy is nonnegative – base of log can be changed
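
A small numerical check of these two properties (my own sketch; the standard base-change relation is H_b(X) = (log_b a) * H_a(X), so entropy in nats equals ln 2 times entropy in bits):

```python
import math

def entropy(pmf, base=2):
    """Entropy of a p.m.f. in the given log base (0 log 0 := 0)."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

pmf = [0.7, 0.2, 0.1]
h_bits = entropy(pmf, base=2)
h_nats = entropy(pmf, base=math.e)
print(h_bits >= 0)                                  # nonnegativity
print(math.isclose(h_nats, math.log(2) * h_bits))   # base change: H_e = (ln 2) * H_2
```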

  12. Example – H(X) = 1 bit when p = 0.5 • maximum uncertainty – H(X) = 0 bits when p = 0 or 1 • minimum uncertainty – H(X) is a concave function of p
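
These properties can be checked numerically (a sketch under the assumption that X is a binary r.v. with Pr[X = 1] = p, so H(X) is the binary entropy function):

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with 0 log 0 := 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))  # 1.0 bit: maximum uncertainty
print(binary_entropy(0.0))  # 0.0 bits: minimum uncertainty
print(binary_entropy(1.0))  # 0.0 bits
# Concavity spot check: H((p1 + p2) / 2) >= (H(p1) + H(p2)) / 2
p1, p2 = 0.1, 0.7
print(binary_entropy((p1 + p2) / 2) >= (binary_entropy(p1) + binary_entropy(p2)) / 2)  # True
```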

  13. Example

  14. JOINT ENTROPY AND CONDITIONAL ENTROPY

  15. Joint Entropy • Joint entropy is a measure of the uncertainty of a pair of r.v.s • Consider a pair of discrete r.v.s (X, Y) with alphabets $\mathcal{X}$, $\mathcal{Y}$ and p.m.f.s $p(x) = \Pr[X = x],\ x \in \mathcal{X}$ and $p(y) = \Pr[Y = y],\ y \in \mathcal{Y}$; the joint entropy is $H(X, Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(x, y)$, where $p(x, y) = \Pr[X = x, Y = y]$

  16. Conditional Entropy • Conditional entropy of a r.v. (Y) given another r.v. (X) – the expected value of the entropies of the conditional distributions, averaged over the conditioning r.v.: $H(Y|X) = \sum_{x \in \mathcal{X}} p(x) H(Y \mid X = x)$
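
To make these definitions concrete, here is a sketch using a made-up joint p.m.f. (the numbers are illustrative, not from the slides); it computes H(X, Y) and H(Y|X) as the weighted average of the entropies of the conditional distributions, and checks the chain rule H(X, Y) = H(X) + H(Y|X) from the following slides:

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities (0 log 0 := 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint p.m.f. p(x, y) over {0, 1} x {0, 1} (example values only).
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}

H_XY = H(list(p_xy.values()))                       # joint entropy H(X, Y)
H_X = H(list(p_x.values()))                         # marginal entropy H(X)
H_Y_given_X = sum(                                  # H(Y|X) = sum_x p(x) H(Y | X = x)
    p_x[x] * H([p_xy[(x, y)] / p_x[x] for y in (0, 1)])
    for x in (0, 1) if p_x[x] > 0
)

print(H_XY)                                         # 1.75 bits
print(math.isclose(H_XY, H_X + H_Y_given_X))        # chain rule: True
```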

  17. Chain Rule

  18. Chain Rule

  19. Example

  20. Example

  21. RELATIVE ENTROPY AND MUTUAL INFORMATION

  22. Relative Entropy • Relative entropy is a measure of the “distance” between two distributions: $D(p \| q) = \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)}$ – conventions: $0 \log \frac{0}{0} = 0$, $0 \log \frac{0}{q} = 0$, and $p \log \frac{p}{0} = \infty$ – if there is any $x \in \mathcal{X}$ such that $p(x) > 0$ and $q(x) = 0$, then $D(p \| q) = \infty$
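
A sketch of this definition (the function name is my own) that applies the conventions above, including the infinite-divergence case:

```python
import math

def kl_divergence(p, q):
    """D(p || q) in bits, with 0 log(0/0) = 0, 0 log(0/q) = 0, p log(p/0) = inf."""
    d = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue                 # 0 log(0/q) = 0
        if qx == 0:
            return math.inf          # some x with p(x) > 0 and q(x) = 0
        d += px * math.log2(px / qx)
    return d

p, q = [0.5, 0.5], [0.75, 0.25]
print(kl_divergence(p, q))                     # ~0.2075 bits
print(kl_divergence(q, p))                     # ~0.1887 bits: not symmetric, so not a true metric
print(kl_divergence([0.5, 0.5], [1.0, 0.0]))   # inf
```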

  23. Example

  24. Mutual Information • Mutual information is a measure of the amount of information that one r.v. contains about another r.v.
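
As a numerical sketch (reusing the same made-up joint p.m.f. as before), the standard definition I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ) can be evaluated directly; it is zero exactly when X and Y are independent:

```python
import math

# Hypothetical joint p.m.f. (example values only).
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

# I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
I_XY = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items() if p > 0)
print(I_XY)   # small positive value: the information X and Y share, in bits
```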

  25. RELATIONSHIP BETWEEN ENTROPY AND MUTUAL INFORMATION

  26. Relation

  27. Proof

  28. Illustration

  29. CHAIN RULES FOR ENTROPY, RELATIVE ENTROPY, AND MUTUAL INFORMATION

  30. Chain Rule for Entropy

  31. Proof

  32. Alternative Proof

  33. Chain Rule for Information

  34. Proof

  35. Chain Rule for Relative Entropy

  36. Proof

  37. JENSEN'S INEQUALITY AND ITS CONSEQUENCES

  38. Convex & Concave Functions • Examples: – convex functions: $x^2$, $|x|$, $e^x$, and $x \log x$ (for $x \ge 0$) – concave functions: $\log x$ and $\sqrt{x}$ (for $x \ge 0$) – linear functions $ax + b$ are both convex and concave
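
A quick spot check of Jensen's inequality using functions from the list above (my own sketch with a made-up distribution): for a convex f, E[f(X)] >= f(E[X]); for a concave f the inequality reverses.

```python
import math

pmf = {1.0: 0.2, 2.0: 0.5, 4.0: 0.3}          # hypothetical distribution of X
expect = lambda f: sum(p * f(x) for x, p in pmf.items())
mean = expect(lambda x: x)                     # E[X]

print(expect(lambda x: x * x) >= mean ** 2)    # convex x^2:  E[X^2] >= (E[X])^2 -> True
print(expect(math.log) <= math.log(mean))      # concave log: E[log X] <= log E[X] -> True
```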

  39. Convex & Concave Functions

  40. Jensen’s Inequality

  41. Information Inequality

  42. Proof

  43. Nonnegativity of Mutual Information

  44. Max. Entropy Dist. – Uniform Dist.
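
A small check of this bound (my own sketch): H(X) <= log2 |X|, with equality iff X is uniform over its alphabet.

```python
import math

def entropy(pmf):
    return -sum(p * math.log2(p) for p in pmf if p > 0)

alphabet_size = 4
uniform = [1 / alphabet_size] * alphabet_size
skewed = [0.7, 0.1, 0.1, 0.1]
print(entropy(uniform), math.log2(alphabet_size))   # both 2.0 bits: equality for the uniform dist.
print(entropy(skewed) <= math.log2(alphabet_size))  # True: a non-uniform dist. has lower entropy
```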

  45. Conditioning Reduces Entropy
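
And a check of this inequality, H(X|Y) <= H(X), on the same made-up joint p.m.f. used earlier (illustrative values only):

```python
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

H_X = H(list(p_x.values()))
H_X_given_Y = sum(                 # H(X|Y) = sum_y p(y) H(X | Y = y)
    p_y[y] * H([p_xy[(x, y)] / p_y[y] for x in (0, 1)])
    for y in (0, 1) if p_y[y] > 0
)
print(H_X_given_Y <= H_X)          # True: on average, observing Y cannot increase uncertainty about X
```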

  46. Independence Bound on Entropy

  47. Summary

  48. Summary

  49. Summary

  50. cuiying@sjtu.edu.cn iwct.sjtu.edu.cn/Personal/yingcui
