Smooth ergodic theory, lecture 21 (M. Verbitsky)

Teoria Ergódica Diferenciável, lecture 21: Entropy

Instituto Nacional de Matemática Pura e Aplicada

Misha Verbitsky, November 29, 2017
Measure-theoretic entropy

DEFINITION: A partition of a probability space $(M, \mu)$ is a countable decomposition $M = \bigsqcup_i V_i$ into a disjoint union of measurable sets. A refinement of a partition $\mathcal{V} = \{V_i\}$ is a partition $\mathcal{W}$ obtained by partitioning some of the $V_i$ into subpartitions. In this case we write $\mathcal{V} \prec \mathcal{W}$. The minimal common refinement of partitions $\mathcal{V} = \{V_i\}$, $\mathcal{W} = \{W_j\}$ is the partition $\mathcal{V} \vee \mathcal{W} = \{V_i \cap W_j\}$.

DEFINITION: The entropy of a partition $\mathcal{V} = \{V_i\}$ is $H_\mu(\mathcal{V}) := -\sum_i \mu(V_i) \log \mu(V_i)$.

EXERCISE: The entropy of an infinite partition can be infinite. Find a partition with infinite entropy.
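The definition of $H_\mu(\mathcal{V})$ above can be evaluated directly for a finite partition. The following is a minimal sketch, not part of the lecture: the function name `partition_entropy` and the example partition of $[0,1]$ with Lebesgue measure are assumptions chosen for illustration, and cells of measure zero are skipped using the usual convention $0 \log 0 = 0$.

```python
import math

def partition_entropy(masses):
    """Return H = -sum m_i log m_i for the masses m_i of a finite partition.

    Cells of measure zero are skipped (convention: 0 log 0 = 0).
    """
    if abs(sum(masses) - 1.0) > 1e-9:
        raise ValueError("the masses of a partition must sum to 1")
    return -sum(m * math.log(m) for m in masses if m > 0)

# Hypothetical example: [0, 1] with Lebesgue measure, split into
# [0, 1/2), [1/2, 3/4), [3/4, 1].
h = partition_entropy([0.5, 0.25, 0.25])
print(h)
```

Refining a cell (for instance, splitting $[0, 1/2)$ into two halves) can only increase the value returned.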
Entropy of a communication channel

Consider a communication channel which sends words composed of $k$ letters, chosen randomly, which appear with probabilities $p_1, ..., p_k$, with $\sum_i p_i = 1$. The entropy of this channel, $H(p_1, ..., p_k)$, measures the "informational density" of communication (C. Shannon). It should satisfy the following natural conditions.

1. Let $l > k$. The information density is clearly higher for $l$ equiprobable letters, $q_1 = ... = q_l = 1/l$, than for $k$ equiprobable letters, $p_1 = ... = p_k = 1/k$. Therefore, $H(1/k, ..., 1/k) < H(1/l, ..., 1/l)$.

2. $H$ should be continuous as a function of the $p_i$ and symmetric under their permutations.

3. Suppose that we have replaced the first letter in the alphabet of $k$ letters by $l$ letters, appearing with probabilities $q_1, ..., q_l$. We have obtained a communication channel with $k + l - 1$ letters, with probabilities $p_1 q_1, ..., p_1 q_l, p_2, ..., p_k$. Then
$$H(p_1 q_1, ..., p_1 q_l, p_2, ..., p_k) = H(p_1, ..., p_k) + p_1 H(q_1, ..., q_l).$$

Clearly, $H(p_1, ..., p_k) = -\sum_i p_i \log p_i$ satisfies these axioms. Indeed, using $\sum_j q_j = 1$,
$$-\sum_{i=2}^{k} p_i \log p_i - \sum_{j=1}^{l} p_1 q_j \log(p_1 q_j) = -\sum_{i=2}^{k} p_i \log p_i - p_1 \log p_1 - p_1 \sum_{j=1}^{l} q_j \log q_j.$$

It is possible to show that, up to a positive constant factor (the choice of the base of the logarithm), $H(p_1, ..., p_k) = -\sum_i p_i \log p_i$ is the only function which satisfies these axioms.
C. Shannon, "A Mathematical Theory of Communication", p. 10
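The first and third conditions on $H$ listed above can be checked numerically for $H(p_1, ..., p_k) = -\sum_i p_i \log p_i$. This is only a sanity check on one example, not a proof; the probability vectors `p` and `q` are arbitrary values chosen for illustration.

```python
import math

def H(probs):
    """Shannon entropy -sum p_i log p_i of a probability vector."""
    return -sum(x * math.log(x) for x in probs if x > 0)

# Hypothetical alphabets: k = 3 letters, the first one replaced by l = 2 letters.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.75]

# Probabilities p_1 q_1, ..., p_1 q_l, p_2, ..., p_k of the refined alphabet.
refined = [p[0] * qj for qj in q] + p[1:]

lhs = H(refined)
rhs = H(p) + p[0] * H(q)
print(lhs, rhs)  # the two values agree up to rounding

# Condition 1: more equiprobable letters give a larger entropy.
uniform = [H([1.0 / k] * k) for k in range(1, 6)]
```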
Entropy of dynamical system

In this lecture, we consider only dynamical systems $(M, \mu, T)$ with $\mu$ a probability measure and $T$ measure-preserving. Given a partition $\mathcal{V}$, $M = \bigsqcup_i V_i$, we denote by $T^{-1}(\mathcal{V})$ the partition $M = \bigsqcup_i T^{-1}(V_i)$.

DEFINITION: Let $(M, \mu, T)$ be a dynamical system, and $\mathcal{V}$, $M = \bigsqcup_i V_i$, a partition of $M$. Denote by $\mathcal{V}^n$ the partition
$$\mathcal{V}^n := \mathcal{V} \vee T^{-1}(\mathcal{V}) \vee T^{-2}(\mathcal{V}) \vee ... \vee T^{-n+1}(\mathcal{V}).$$
The entropy of $(M, \mu, T)$ with respect to the partition $\mathcal{V}$ is
$$h_\mu(T, \mathcal{V}) := \lim_n \frac{1}{n} H_\mu(\mathcal{V}^n).$$
The entropy of $(M, \mu, T)$ is the supremum of $h_\mu(T, \mathcal{V})$ taken over all partitions $\mathcal{V}$ with finite entropy.

REMARK: Let $\mathcal{V} \succ \mathcal{W}$ be a refinement of the partition $\mathcal{W}$. Clearly, $H_\mu(\mathcal{V}) \geq H_\mu(\mathcal{W})$. This implies $h_\mu(T, \mathcal{V}) \geq h_\mu(T, \mathcal{W})$.
Entropy of dynamical system and iterations

REMARK: Clearly, $\bigvee_{j=0}^{n-1} T^{-j}(\mathcal{V}^k) = \mathcal{V}^{n+k-1}$. This gives
$$h_\mu(T, \mathcal{V}^k) = \lim_n \frac{1}{n} H_\mu(\mathcal{V}^{n+k-1}) = h_\mu(T, \mathcal{V}).$$
The last equation holds because $\lim_n \frac{n}{n+k-1} = 1$.

COROLLARY: This implies $h_\mu(T, \mathcal{V}) = \frac{1}{n} h_\mu(T^n, \mathcal{V}^n)$.

Proof: Indeed, $\bigvee_{j=0}^{k-1} T^{-nj}(\mathcal{V}^n) = \mathcal{V}^{kn}$, giving
$$h_\mu(T^n, \mathcal{V}^n) = \lim_k \frac{1}{k} H_\mu(\mathcal{V}^{kn}) = n\, h_\mu(T, \mathcal{V})$$
(the last equation holds because $\frac{1}{kn} H_\mu(\mathcal{V}^{kn})$ is a subsequence of the sequence defining $h_\mu(T, \mathcal{V})$).

COROLLARY: For any $(M, \mu, T)$, one has $h_\mu(T^n) = n h_\mu(T)$.

Proof: Since $\mathcal{V}^n$ is a refinement of $\mathcal{V}$, one has $h_\mu(T^n, \mathcal{V}^n) \geq h_\mu(T^n, \mathcal{V})$. This gives
$$h_\mu(T^n) = \sup_{\mathcal{V}} h_\mu(T^n, \mathcal{V}) = \sup_{\mathcal{V}} h_\mu(T^n, \mathcal{V}^n) = n \sup_{\mathcal{V}} h_\mu(T, \mathcal{V}) = n h_\mu(T).$$

COROLLARY: Let $\mu = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ be a sum of atomic measures. Then $h_\mu(T) = 0$.

Proof: Since $T$ preserves $\mu$, $T$ acts on the set $\{x_1, ..., x_n\}$ by permutations. Therefore $T^{n!} = \mathrm{Id}$ on the support of $\mu$, giving
$$h_\mu(T, \mathcal{V}) = h_\mu(T, \mathcal{V}^{n!}) = \frac{1}{n!} h_\mu(T^{n!}, \mathcal{V}^{n!}) = 0.$$
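The vanishing of entropy for a sum of atomic measures can be observed numerically. The sketch below is an illustration under stated assumptions, not the lecture's argument: the permutation `T`, the labelling `labels` of the points by cells of $\mathcal{V}$, and the helper name `join_entropy` are all made up for the example. It uses the fact that the cell of $\mathcal{V}^m$ containing $x$ is determined by the itinerary of $x$, that is, by the cells visited by $x, Tx, ..., T^{m-1}x$.

```python
import math
from collections import Counter

def join_entropy(T, labels, m):
    """H_mu(V^m) for the uniform measure on range(len(T)).

    T[x] is the image of x; labels[x] is the index of the cell of V containing x.
    """
    n = len(T)
    itineraries = Counter()
    for x in range(n):
        word, y = [], x
        for _ in range(m):
            word.append(labels[y])
            y = T[y]
        itineraries[tuple(word)] += 1
    return -sum((c / n) * math.log(c / n) for c in itineraries.values())

T = [1, 2, 0, 4, 3]        # hypothetical permutation: a 3-cycle and a 2-cycle
labels = [0, 1, 1, 0, 1]   # hypothetical partition into two cells

# H_mu(V^m) never exceeds log(5), so H_mu(V^m) / m tends to 0.
rates = [join_entropy(T, labels, m) / m for m in (1, 10, 100, 1000)]
print(rates)
```

Since there are only finitely many points, $H_\mu(\mathcal{V}^m)$ is bounded by the logarithm of the number of points, which is another way to see that the limit is zero.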
Independent partitions

DEFINITION: Let $\mathcal{V}$, $\mathcal{W}$ be finite partitions. We say that they are independent if for all $V_i \in \mathcal{V}$ and $W_j \in \mathcal{W}$, one has $\mu(V_i \cap W_j) = \mu(V_i) \mu(W_j)$.

REMARK: In probabilistic terms, this means that the events associated with $V_i$ and $W_j$ are uncorrelated.

REMARK: Let $\mathcal{V}$, $\mathcal{W}$ be independent partitions, with $p_1, ..., p_k$ the measures of the $V_i$ and $q_1, ..., q_l$ the measures of the $W_j$. Then
$$H_\mu(\mathcal{V} \vee \mathcal{W}) = -\sum_{i,j} p_i q_j \log(p_i q_j) = -\sum_i p_i \sum_j q_j \log q_j - \sum_j q_j \sum_i p_i \log p_i = H_\mu(\mathcal{V}) + H_\mu(\mathcal{W}).$$

COROLLARY: Let $(M, \mu, T)$ be a dynamical system, and $\mathcal{V}$ a partition of $M$. Assume that $T^{-i}(\mathcal{V})$ is independent from $\mathcal{V}^i$ for all $i$. Then $H_\mu(\mathcal{V}^n) = n H_\mu(\mathcal{V})$, giving $h_\mu(T, \mathcal{V}) = H_\mu(\mathcal{V})$.

REMARK: It is possible to show (and it clearly follows from Shannon's description of entropy) that $H(\mathcal{V} \vee \mathcal{W}) \leq H(\mathcal{V}) + H(\mathcal{W})$, and the equality is reached if and only if $\mathcal{V}$ and $\mathcal{W}$ are independent. This result is called subadditivity of entropy. This implies, in particular, that $H_\mu(\mathcal{V}^n) \leq n H_\mu(\mathcal{V})$, hence the limit $\lim_n \frac{1}{n} H_\mu(\mathcal{V}^n)$ is always finite.
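Additivity for independent partitions and subadditivity in general can be illustrated on a small example. In this sketch the joint masses $\mu(V_i \cap W_j)$ are stored as a matrix; the two matrices `independent` and `dependent` are hypothetical data with the same marginals, chosen only to show the two cases.

```python
import math

def H(masses):
    """Entropy -sum m log m of a list of masses (0 log 0 = 0)."""
    return -sum(m * math.log(m) for m in masses if m > 0)

def flatten(joint):
    """Masses of the cells V_i ∩ W_j of the common refinement."""
    return [m for row in joint for m in row]

p = [0.5, 0.5]        # masses of the cells of V
q = [0.2, 0.3, 0.5]   # masses of the cells of W

# Independent case: mu(V_i ∩ W_j) = p_i q_j.
independent = [[pi * qj for qj in q] for pi in p]

# Dependent case with the same row sums p and column sums q.
dependent = [[0.2, 0.3, 0.0],
             [0.0, 0.0, 0.5]]

h_independent = H(flatten(independent))   # equals H(p) + H(q) up to rounding
h_dependent = H(flatten(dependent))       # strictly smaller than H(p) + H(q)
print(h_independent, h_dependent, H(p) + H(q))
```

In the dependent example the cell of $\mathcal{W}$ determines the cell of $\mathcal{V}$, so the common refinement has the same entropy as $\mathcal{W}$ alone.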
Entropy of dynamical system: Bernoulli space

DEFINITION: Let $P$ be a finite set, $P^{\mathbb{Z}}$ the product of $\mathbb{Z}$ copies of $P$, $\Sigma \subset \mathbb{Z}$ a finite subset, and $\pi_\Sigma : P^{\mathbb{Z}} \to P^{|\Sigma|}$ the projection to the corresponding components. Cylindrical sets are sets $C_R := \pi_\Sigma^{-1}(R)$, where $R \subset P^{|\Sigma|}$ is any subset.

REMARK: For Bernoulli space, the complement of a cylindrical set is again a cylindrical set, and the cylindrical sets form a Boolean algebra.

DEFINITION: Bernoulli measure on $P^{\mathbb{Z}}$ is $\mu$ such that $\mu(C_R) := \frac{|R|}{|P|^{|\Sigma|}}$.

EXAMPLE: Let $\mathcal{V} = \{V_i\}$ be a finite partition of the Bernoulli space $M = P^{\mathbb{Z}}$ into cylindrical sets, and $T$ the Bernoulli shift. Let $\Sigma \subset \mathbb{Z}$ be a finite subset such that all $V_i$ are obtained as $\pi_\Sigma^{-1}(R_i)$ for some $R_i \subset P^{|\Sigma|}$. For $N$ sufficiently big, the set $\Sigma$ and its translates $\Sigma + iN$, $i \neq 0$, don't intersect. In this case, the partitions $\bigvee_{j=0}^{k-1} T^{-jN}(\mathcal{V})$ and $T^{-kN}(\mathcal{V})$ are independent for all $k$, giving $h_\mu(T^N, \mathcal{V}) = H_\mu(\mathcal{V})$. Since $h_\mu(T) = \frac{1}{N} h_\mu(T^N) \geq \frac{1}{N} H_\mu(\mathcal{V})$, this implies that the entropy of $T$ is positive.
Approximating partitions

LEMMA 1: Let $(M, \mu)$ be a space with measure, and $\mathcal{A}$ an algebra of measurable subsets of $M$ which generates any measurable subset up to measure 0. Then for any partition $\mathcal{V}$ with finite entropy and any $\varepsilon > 0$ there exists a finite partition $\mathcal{W} \subset \mathcal{A}$ such that $H_\mu(\mathcal{W} \vee \mathcal{V}) - H_\mu(\mathcal{W}) < \varepsilon$.

Proof: Using the Lebesgue approximation theorem, we can approximate the partition $\mathcal{V}$ by $\mathcal{W} \subset \mathcal{A}$ with arbitrary precision: for each $V_i \in \mathcal{V}$ there exists $W_i \in \mathcal{W}$ (which can be empty) such that $\mu(V_i \triangle W_i) < \varepsilon_i$. Then
$$H_\mu(\mathcal{W} \vee \mathcal{V}) - H_\mu(\mathcal{W}) = \sum_i p_i H\left(p_i^{-1} \mu(W_i \cap V_1), ..., p_i^{-1} \mu(W_i \cap V_n)\right),$$
where $p_i = \mu(W_i)$. However, $\mathcal{W}$ is chosen in such a way that $\mu(W_i \cap V_i)$ is arbitrarily close to $p_i$, and $\mu(W_i \cap V_j)$ is arbitrarily small for $j \neq i$, hence the entropy $H\left(p_i^{-1} \mu(W_i \cap V_1), ..., p_i^{-1} \mu(W_i \cap V_n)\right)$ is arbitrarily small.
Kolmogorov-Sinai theorem

THEOREM: (Kolmogorov-Sinai) Let $(M, \mu, T)$ be a dynamical system, and $\mathcal{V}_1 \prec \mathcal{V}_2 \prec ...$ a sequence of partitions of $M$ of finite entropy, such that the subsets $\bigcup_{i=1}^{\infty} \mathcal{V}_i$ generate the $\sigma$-algebra of measurable sets, up to measure zero. Then $h_\mu(T) = \lim_n h_\mu(T, \mathcal{V}_n)$.

Proof: Notice that $h_\mu(T, \mathcal{V}_n)$ is monotone as a function of $n$, because $\mathcal{V}_1 \prec \mathcal{V}_2 \prec ...$. Moreover, $h_\mu(T, \mathcal{V}_n^N) = h_\mu(T, \mathcal{V}_n)$ as shown above, where $\mathcal{V}_n^N = \bigvee_{i=0}^{N-1} T^{-i}(\mathcal{V}_n)$. Since any partition $\mathcal{W}$ admits an approximation by a partition from the $\sigma$-algebra generated by $\mathcal{V}_n$, we obtain that for $n$ sufficiently big, one has
$$h_\mu(T, \mathcal{W}) \leq h_\mu(T, \mathcal{V}_n^N) + \varepsilon = h_\mu(T, \mathcal{V}_n) + \varepsilon.$$
Passing to the limit as $\varepsilon \to 0$, we obtain that $h_\mu(T, \mathcal{W}) \leq \lim_n h_\mu(T, \mathcal{V}_n)$. The opposite inequality $\lim_n h_\mu(T, \mathcal{V}_n) \leq h_\mu(T)$ holds by the definition of $h_\mu(T)$ as a supremum.

DEFINITION: We say that a partition $\mathcal{V}$ is a generator, or generating partition, if the union of all $\mathcal{V}^n = \bigvee_{i=0}^{n-1} T^{-i}(\mathcal{V})$ generates the $\sigma$-algebra of measurable sets, up to measure zero.

COROLLARY: Let $\mathcal{V}$ be a generating partition on $(M, \mu, T)$. Then $h_\mu(T) = h_\mu(T, \mathcal{V})$.

Proof: By Kolmogorov-Sinai, $h_\mu(T) = \lim_n h_\mu(T, \mathcal{V}^n)$. However, $h_\mu(T, \mathcal{V}^n) = h_\mu(T, \mathcal{V})$ as shown above.
Entropy of a dynamical system: Bernoulli space (2)

REMARK: Let $(M = P^{\mathbb{Z}}, \mu, T)$ be the Bernoulli system, with $P = \{x_1, ..., x_p\}$ and $\Pi_i$ the projection to the $i$-th component. Consider the partition $\mathcal{V}$ with $M = \bigsqcup_{i=1}^{p} \Pi_0^{-1}(x_i)$. Clearly, the Borel $\sigma$-algebra is generated by the sets $\Pi_i^{-1}(\{x\})$. Then $\mathcal{V}$ is a generating partition. However, $T^{-i}(\mathcal{V})$ is independent from $\mathcal{V}^i$ for all $i$, hence $h_\mu(T, \mathcal{V}) = H_\mu(\mathcal{V}) = \sum_{i=1}^{p} \frac{1}{p} \log(p) = \log(p)$. We have proved that $h_\mu(T) = \log(|P|)$.
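The value $\log(|P|)$ can be checked by brute force for small $n$. The sketch below assumes that the cells of $\mathcal{V}^n$ are exactly the cylinders fixing the coordinates $0, ..., n-1$, each of Bernoulli measure $|P|^{-n}$; the function name `cylinder_join_entropy` and the choice $|P| = 3$ are illustrative only.

```python
import math
from itertools import product

def cylinder_join_entropy(p, n):
    """H_mu(V^n) for the Bernoulli measure on an alphabet of p letters.

    Each cell of V^n fixes coordinates 0..n-1, so it has measure p**(-n).
    The sum runs over all p**n words of length n.
    """
    cell_mass = (1.0 / p) ** n
    return -sum(cell_mass * math.log(cell_mass)
                for _ in product(range(p), repeat=n))

p = 3  # hypothetical alphabet size
rates = [cylinder_join_entropy(p, n) / n for n in range(1, 7)]
print(rates, math.log(p))
```

Here $\frac{1}{n} H_\mu(\mathcal{V}^n)$ is already constant in $n$, which reflects the independence of the coordinates.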