Chapter 4: Entropy Rates of a Stochastic Process
Peng-Hua Wang
Graduate Inst. of Comm. Engineering, National Taipei University
Chapter Outline
Chap. 4 Entropy Rates of a Stochastic Process
4.1 Markov Chains
4.2 Entropy Rate
4.3 Example: Entropy Rate of a Random Walk on a Weighted Graph
4.4 Second Law of Thermodynamics
4.5 Functions of Markov Chains
4.1 Markov Chains
Stationary

Definition (Stationary). A stochastic process is said to be stationary if
$$\Pr\{X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n\} = \Pr\{X_{1+\ell} = x_1, X_{2+\ell} = x_2, \ldots, X_{n+\ell} = x_n\}$$
for every $n$ and every shift $\ell$.
■ The joint distribution of any subset of the sequence of random variables is invariant with respect to shifts in the time index.
Markov chain

Definition (Markov chain). A discrete stochastic process $X_1, X_2, \ldots$ is said to be a Markov chain or a Markov process if, for $n = 1, 2, \ldots$,
$$\Pr\{X_{n+1} = x_{n+1} \mid X_n = x_n, X_{n-1} = x_{n-1}, \ldots, X_1 = x_1\} = \Pr\{X_{n+1} = x_{n+1} \mid X_n = x_n\}.$$
■ The joint pmf can then be written as
$$p(x_1, x_2, \ldots, x_n) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2) \cdots p(x_n \mid x_{n-1}).$$
Definition (Time invariant). The Markov chain is said to be time invariant if the transition probability $p(x_{n+1} \mid x_n)$ does not depend on $n$; that is,
$$\Pr\{X_{n+1} = b \mid X_n = a\} = \Pr\{X_2 = b \mid X_1 = a\} \quad \text{for all } a, b \in \mathcal{X}.$$
Markov chain

■ We will assume that the Markov chain is time invariant.
■ $X_n$ is called the state at time $n$.
■ A time-invariant Markov chain is characterized by its initial state and a probability transition matrix $P = [P_{ij}]$, $i, j \in \{1, 2, \ldots, m\}$, where $P_{ij} = \Pr\{X_{n+1} = j \mid X_n = i\}$.
■ The pmf at time $n+1$ is
$$p(x_{n+1}) = \sum_{x_n} p(x_n)\, P_{x_n x_{n+1}}.$$
■ A distribution on the states such that the distribution at time $n+1$ is the same as the distribution at time $n$ is called a stationary distribution.
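As a numerical illustration that is not part of the original slides, the following minimal Python sketch (transition probabilities are assumed values) iterates $p_{n+1} = p_n P$ and compares the result with the stationary distribution obtained as the left eigenvector of $P$ with eigenvalue 1.

```python
import numpy as np

# Minimal sketch (transition probabilities are assumed, not from the slides):
# evolve the state distribution of a time-invariant Markov chain via
# p_{n+1} = p_n P and compare with the stationary distribution.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])            # P[i, j] = Pr{X_{n+1} = j | X_n = i}
p = np.array([1.0, 0.0])              # initial distribution on the states

for _ in range(50):
    p = p @ P                         # p(x_{n+1}) = sum_{x_n} p(x_n) P_{x_n x_{n+1}}

# Stationary distribution: left eigenvector of P for eigenvalue 1, normalized.
eigvals, eigvecs = np.linalg.eig(P.T)
mu = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
mu = mu / mu.sum()

print("p after 50 steps:", p)         # approaches mu
print("stationary mu:   ", mu)
```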
Example 4.1.1

Consider a two-state Markov chain with probability transition matrix
$$P = \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}.$$
Find its stationary distribution and entropy.

Solution. Let $\mu_1, \mu_2$ be the stationary distribution. Then
$$\mu_1 = \mu_1(1-\alpha) + \mu_2\beta, \qquad \mu_2 = \mu_1\alpha + \mu_2(1-\beta),$$
and $\mu_1 + \mu_2 = 1$. Solving gives
$$\mu_1 = \frac{\beta}{\alpha+\beta}, \qquad \mu_2 = \frac{\alpha}{\alpha+\beta},$$
and the entropy rate is $H(X_2 \mid X_1) = \mu_1 H(\alpha) + \mu_2 H(\beta)$, where $H(\cdot)$ denotes the binary entropy function.
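A short Python check of the closed-form answer above; the particular values of $\alpha$ and $\beta$ are my own choice for illustration and do not come from the slides.

```python
import numpy as np

# Sketch for Example 4.1.1 (alpha and beta values are assumed for illustration):
# closed-form stationary distribution and entropy rate of the two-state chain.
def H2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

alpha, beta = 0.2, 0.5
mu1 = beta / (alpha + beta)               # solves mu = mu P with mu1 + mu2 = 1
mu2 = alpha / (alpha + beta)

rate = mu1 * H2(alpha) + mu2 * H2(beta)   # H(X2 | X1), bits per symbol
print("stationary distribution:", (mu1, mu2))
print("entropy rate:", rate, "bits/symbol")
```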
4.2 Entropy Rate
Entropy Rate

Definition (Entropy Rate). The entropy rate of a random process $\{X_i\}$ is defined by
$$H(X) = \lim_{n\to\infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n)$$
when the limit exists.

Definition (Conditional Entropy Rate). The conditional entropy rate of a random process $\{X_i\}$ is defined by
$$H'(X) = \lim_{n\to\infty} H(X_n \mid X_1, X_2, \ldots, X_{n-1})$$
when the limit exists.
Entropy Rate

■ If $X_1, X_2, \ldots$ are i.i.d. random variables, then
$$H(X) = \lim_{n\to\infty} \frac{H(X_1, X_2, \ldots, X_n)}{n} = \lim_{n\to\infty} \frac{nH(X_1)}{n} = H(X_1).$$
■ If $X_1, X_2, \ldots$ are independent but not identically distributed, then
$$H(X) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} H(X_i).$$
■ We can choose a sequence of distributions on $X_1, X_2, \ldots$ such that this limit does not exist; one such construction is sketched below.
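The following sketch is my own illustrative construction, not from the slides: let the per-symbol entropies $H(X_i)$ alternate between 1 bit (a fair coin) and 0 bits (a deterministic symbol) over blocks of doubling length, so the running average $\frac{1}{n}\sum_{i=1}^{n} H(X_i)$ keeps oscillating instead of converging.

```python
import numpy as np

# Assumed construction (not from the slides): per-symbol entropies H(X_i)
# alternate between 1 bit and 0 bits over blocks of length 1, 2, 4, ...,
# so the Cesaro average (1/n) * sum_i H(X_i) never settles down.
h = []
bit_value = 1.0
for k in range(20):                   # block lengths 2^0, 2^1, ..., 2^19
    h.extend([bit_value] * 2**k)
    bit_value = 1.0 - bit_value
h = np.array(h)

running_avg = np.cumsum(h) / np.arange(1, len(h) + 1)
# Sample the running average at the ends of successive blocks: it oscillates
# between roughly 1/3 and 2/3 instead of converging.
for k in [10, 11, 18, 19]:
    n = 2**(k + 1) - 1                # index of the last symbol of block k
    print(n, running_avg[n - 1])
```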
Entropy Rate

Theorem 4.2.2. For a stationary stochastic process, $H(X_n \mid X_{n-1}, \ldots, X_1)$ is nonincreasing in $n$ and has a limit $H'(X)$.

Proof.
$$H(X_{n+1} \mid X_1, X_2, \ldots, X_n) \le H(X_{n+1} \mid X_2, \ldots, X_n) \quad (\text{conditioning reduces entropy})$$
$$= H(X_n \mid X_1, \ldots, X_{n-1}) \quad (\text{stationarity}).$$
Since $H(X_n \mid X_{n-1}, \ldots, X_1)$ is nonnegative and nonincreasing, it has a limit $H'(X)$. □
Entropy Rate

Theorem 4.2.1. For a stationary stochastic process, both $H(X)$ and $H'(X)$ exist and are equal: $H(X) = H'(X)$.

Proof. By the chain rule,
$$\frac{1}{n} H(X_1, X_2, \ldots, X_n) = \frac{1}{n} \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1),$$
that is, the entropy rate is the time average of the conditional entropies. Since the conditional entropies tend to the limit $H'(X)$, the Cesàro mean theorem implies that their time average, and hence the entropy rate, tends to the same limit. □
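As a hedged numerical check (the values of $\alpha$ and $\beta$ are assumed; this is not part of the original slides): for a stationary two-state Markov chain, $H(X_n \mid X_{n-1}, \ldots, X_1) = H(X_2 \mid X_1)$ for $n \ge 2$ and $H(X_1, \ldots, X_n) = H(X_1) + (n-1)H(X_2 \mid X_1)$, so the per-symbol entropy $\frac{1}{n}H(X_1, \ldots, X_n)$ converges to $H'(X) = H(X_2 \mid X_1)$.

```python
import numpy as np

def entropy_bits(p):
    """Entropy of a probability vector, in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

alpha, beta = 0.3, 0.6                           # assumed example values
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])
mu = np.array([beta, alpha]) / (alpha + beta)    # stationary distribution

H_cond = sum(mu[i] * entropy_bits(P[i]) for i in range(2))   # H'(X) = H(X2|X1)
H_first = entropy_bits(mu)                                    # H(X1) under stationarity

for n in [1, 2, 5, 10, 100, 1000]:
    # Chain rule for a stationary Markov chain started from mu:
    # H(X1,...,Xn) = H(X1) + (n-1) * H(X2|X1)
    per_symbol = (H_first + (n - 1) * H_cond) / n
    print(f"n={n:5d}  H(X^n)/n = {per_symbol:.6f}  H'(X) = {H_cond:.6f}")
```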
Cesàro mean

Theorem (Cesàro mean). If $a_n \to a$ and $b_n = \frac{1}{n}\sum_{i=1}^{n} a_i$, then $b_n \to a$.

Proof. Let $\epsilon > 0$. Since $a_n \to a$, there exists a number $N$ such that $|a_n - a| \le \epsilon$ for all $n > N$. Hence,
$$|b_n - a| = \left|\frac{1}{n}\sum_{i=1}^{n}(a_i - a)\right| \le \frac{1}{n}\sum_{i=1}^{n}|a_i - a| \le \frac{1}{n}\sum_{i=1}^{N}|a_i - a| + \frac{n-N}{n}\epsilon \le \frac{1}{n}\sum_{i=1}^{N}|a_i - a| + \epsilon \le 2\epsilon$$
when $n$ is large enough, since the first term tends to 0 as $n \to \infty$. As $\epsilon$ was arbitrary, $b_n \to a$. □
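A small numeric illustration of the theorem; the particular sequence is an assumption chosen only for demonstration.

```python
import numpy as np

# Cesaro mean sketch: a_n = 0.5 + (-1)^n / n converges to 0.5, and the
# running averages b_n = (1/n) * sum(a_1..a_n) converge to the same limit.
n = np.arange(1, 100001)
a = 0.5 + (-1.0) ** n / n
b = np.cumsum(a) / n
print("a_n ->", a[-1])
print("b_n ->", b[-1])
```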