Novel Lower Bounds on the Entropy Rate of Binary Hidden Markov Processes
Or Ordentlich, MIT
ISIT, Barcelona, July 11, 2016
Binary Markov Processes

[State diagram: two states 0 and 1; transition probability $q_{01}$ from 0 to 1 and $q_{10}$ from 1 to 0, with self-loop probabilities $1-q_{01}$ and $1-q_{10}$.]

- Transition matrix and stationary distribution:
$$P = \begin{pmatrix} 1-q_{01} & q_{01} \\ q_{10} & 1-q_{10} \end{pmatrix}, \qquad \pi P = \pi, \qquad \pi = [\pi_0 \;\; \pi_1]$$
- $X_1 \sim \mathrm{Bernoulli}(\pi_1)$, and $\Pr(X_n = j \mid X_{n-1} = i, X_{n-2}, \ldots, X_1) = P_{ij}$.
- Entropy Rate: for a stationary process $\{X_n\}$ the entropy rate is defined as
$$\bar{H}(X) \triangleq \lim_{n\to\infty} \frac{H(X_1,\ldots,X_n)}{n} = \lim_{n\to\infty} H(X_n \mid X_{n-1},\ldots,X_1).$$
- For the Markov process above,
$$\bar{H}(X) = H(X_n \mid X_{n-1}) = \pi_0\, h(q_{01}) + \pi_1\, h(q_{10}),$$
where $h(\alpha) \triangleq -\alpha \log_2(\alpha) - (1-\alpha)\log_2(1-\alpha)$.
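As a quick numerical companion to the closed-form expression above, here is a minimal sketch (not part of the original slides; the parameter values are arbitrary) that computes the stationary distribution and the entropy rate $\bar{H}(X) = \pi_0 h(q_{01}) + \pi_1 h(q_{10})$ of the binary Markov chain.

```python
import numpy as np

def binary_entropy(a):
    """h(a) = -a*log2(a) - (1-a)*log2(1-a), with h(0) = h(1) = 0."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * np.log2(a) - (1 - a) * np.log2(1 - a)

def markov_entropy_rate(q01, q10):
    """Entropy rate of the stationary binary Markov chain with
    crossover probabilities q01 (0 -> 1) and q10 (1 -> 0)."""
    # Stationary distribution solving pi P = pi for the two-state chain.
    pi0 = q10 / (q01 + q10)
    pi1 = q01 / (q01 + q10)
    return pi0 * binary_entropy(q01) + pi1 * binary_entropy(q10)

if __name__ == "__main__":
    print(markov_entropy_rate(0.1, 0.3))  # illustrative parameters
```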
Binary Hidden Markov Processes

- $\{X_n\}$: a stationary binary Markov chain with transition probabilities $q_{01}$ (0 to 1) and $q_{10}$ (1 to 0), as above.
- $\{Y_n\}$: the output of a memoryless channel $P_{Y|X}$ driven by $X_n$. Here the channel is a BSC: $Z_n \sim \mathrm{Bernoulli}(\alpha)$ i.i.d., independent of $\{X_n\}$, and $Y_n = X_n \oplus Z_n$.
- Entropy rate unknown: no closed-form expression $\bar{H}(Y) = f(\alpha, q_{10}, q_{01})$ is known.
- Our contribution: new lower bounds on $\bar{H}(Y)$.
Binary Symmetric Hidden Markov Processes

- Symmetric case: $q_{01} = q_{10} = q$, so $\{X_n\}$ is a symmetric binary Markov chain.
- $Z_n \sim \mathrm{Bernoulli}(\alpha)$, $Y_n = X_n \oplus Z_n$.
- Entropy rate unknown: $\bar{H}(Y) = f(\alpha, q) = \,???$
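A minimal simulation sketch of this symmetric setting (not from the slides; the function name and parameter values are illustrative): it draws the hidden chain with flip probability $q$ from its stationary Bernoulli(1/2) start and produces $Y_n = X_n \oplus Z_n$ with $Z_n \sim \mathrm{Bernoulli}(\alpha)$.

```python
import numpy as np

def simulate_symmetric_hmp(n, q, alpha, rng=None):
    """Sample (X_1..X_n, Y_1..Y_n) for the binary symmetric HMP:
    X is a symmetric Markov chain with flip probability q started from
    its stationary Bernoulli(1/2) distribution; Y_n = X_n XOR Z_n with
    Z_n ~ Bernoulli(alpha) i.i.d."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n, dtype=np.int8)
    x[0] = rng.integers(0, 2)            # stationary start: Bernoulli(1/2)
    flips = rng.random(n - 1) < q        # transitions of the hidden chain
    for i in range(1, n):
        x[i] = x[i - 1] ^ int(flips[i - 1])
    z = (rng.random(n) < alpha).astype(np.int8)
    y = x ^ z                            # BSC(alpha) observation
    return x, y

if __name__ == "__main__":
    x, y = simulate_symmetric_hmp(20, q=0.1, alpha=0.2)  # illustrative parameters
    print(y)
```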
Binary Symmetric HMP - Simple Bounds

- "Cover-Thomas bounds":
$$H(Y_n \mid Y_{n-1}, \ldots, Y_1, X_0) \;\le\; \bar{H}(Y) \;\le\; H(Y_n \mid Y_{n-1}, \ldots, Y_0)$$
- Accuracy improves exponentially with $n$ [Birch'62].
- Simple lower bound by Mrs. Gerber's Lemma (MGL):
$$H(Y_1,\ldots,Y_n) \;\ge\; n\, h\!\left(\alpha * h^{-1}\!\left(\frac{H(X_1,\ldots,X_n)}{n}\right)\right),$$
where $h^{-1} : [0,1] \to [0,1/2]$ and $a * b \triangleq a(1-b) + b(1-a)$.
- Dividing by $n$ and using the continuity of the MGL function $\varphi(u) = h\!\left(\alpha * h^{-1}(u)\right)$,
$$\bar{H}(Y) \;\ge\; h\!\left(\alpha * h^{-1}\!\left(\lim_{n\to\infty} \frac{H(X_1,\ldots,X_n)}{n}\right)\right).$$
- Since $\bar{H}(X) = h(q)$, this gives $\bar{H}(Y) \ge h(\alpha * q)$.
- This is the same as the Cover-Thomas lower bound of order $n = 1$: the standard MGL gives only a weak estimate.
- We will use an improved version of MGL.
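The following sketch (not from the slides; parameter values and helper names are illustrative) evaluates the weak MGL bound $h(\alpha * q)$ and, for comparison, an order-$n$ Cover-Thomas upper bound $H(Y_n \mid Y_{n-1},\ldots,Y_1)$, computed by brute-force enumeration of all $2^n$ output sequences with a forward recursion; this is feasible only for small $n$.

```python
import itertools
import numpy as np

def binary_entropy(a):
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * np.log2(a) - (1 - a) * np.log2(1 - a)

def bconv(a, b):
    """Binary convolution a * b = a(1-b) + b(1-a)."""
    return a * (1 - b) + b * (1 - a)

def mgl_lower_bound(alpha, q):
    """Standard Mrs. Gerber's Lemma bound: Hbar(Y) >= h(alpha * q)."""
    return binary_entropy(bconv(alpha, q))

def block_entropy(n, q, alpha):
    """H(Y_1,...,Y_n) for the binary symmetric HMP, by brute-force
    enumeration of all 2^n output sequences (forward recursion)."""
    P = np.array([[1 - q, q], [q, 1 - q]])        # hidden-chain transitions
    emit = np.array([[1 - alpha, alpha],          # Pr(y | x): rows x, cols y
                     [alpha, 1 - alpha]])
    H = 0.0
    for y in itertools.product((0, 1), repeat=n):
        f = 0.5 * emit[:, y[0]]                   # stationary start, first emission
        for t in range(1, n):
            f = (f @ P) * emit[:, y[t]]
        p = f.sum()
        H -= p * np.log2(p)
    return H

def cover_thomas_upper(n, q, alpha):
    """Order-n Cover-Thomas upper bound H(Y_n | Y_1,...,Y_{n-1})."""
    return block_entropy(n, q, alpha) - block_entropy(n - 1, q, alpha)

if __name__ == "__main__":
    q, alpha = 0.1, 0.2                           # illustrative parameters
    print("MGL lower bound :", mgl_lower_bound(alpha, q))
    print("CT upper, n = 8 :", cover_thomas_upper(8, q, alpha))
```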
Samorodnitsky's MGL

- $X, Y \in \{0,1\}^n$ are the input and output of a $\mathrm{BSC}(\alpha)$; let $\lambda \triangleq (1-2\alpha)^2$.
- The projection of $X$ onto a subset of coordinates $S \subseteq [n]$ is $X_S \triangleq \{X_i : i \in S\}$.
- Let $V$ be a random subset of $[n]$ generated by independently sampling each element $i$ with probability $\lambda$.
- Theorem [Samorodnitsky'15]:
$$H(Y) \;\ge\; n\, h\!\left(\alpha * h^{-1}\!\left(\frac{H(X_V \mid V)}{\lambda n}\right)\right)$$
- By Han's inequality, $\frac{H(X_V \mid V)}{\lambda n}$ is nonincreasing* in $\lambda$.
- ⇒ The new bound is stronger than MGL (which corresponds to evaluating at $\lambda = 1$).
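To make the theorem concrete in the Markov setting of this talk, here is a small numerical sketch (not part of the slides; it assumes brute force over all subsets is acceptable, so it only runs for small $n$, and the parameter values are illustrative). It uses the fact that the restriction of a stationary symmetric binary Markov chain to ordered coordinates $s_1 < \cdots < s_k$ is again Markov, with $d$-step flip probability $(1-(1-2q)^d)/2$, so $H(X_S)$ has a closed form; averaging over subsets weighted by $\lambda^{|S|}(1-\lambda)^{n-|S|}$ gives $H(X_V \mid V)$, and $h^{-1}$ is obtained by bisection.

```python
import itertools
import numpy as np

def h(a):
    """Binary entropy in bits."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * np.log2(a) - (1 - a) * np.log2(1 - a)

def h_inv(u, tol=1e-12):
    """Inverse of h on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bconv(a, b):
    return a * (1 - b) + b * (1 - a)

def flip_prob(q, d):
    """d-step flip probability of the symmetric chain: (1 - (1-2q)^d) / 2."""
    return 0.5 * (1 - (1 - 2 * q) ** d)

def subset_entropy(S, q):
    """H(X_S) for the stationary symmetric Markov chain: the restriction to
    ordered coordinates is Markov, so the joint entropy is 1 bit for the
    uniform first coordinate plus one conditional term per gap."""
    if not S:
        return 0.0
    return 1.0 + sum(h(flip_prob(q, S[j + 1] - S[j])) for j in range(len(S) - 1))

def samorodnitsky_bound(n, q, alpha):
    """Evaluate n * h(alpha * h^{-1}(H(X_V|V) / (lambda n))) by exact
    enumeration of all subsets V (feasible only for small n)."""
    lam = (1 - 2 * alpha) ** 2
    HXV = 0.0
    for S in itertools.chain.from_iterable(
            itertools.combinations(range(n), k) for k in range(n + 1)):
        w = lam ** len(S) * (1 - lam) ** (n - len(S))
        HXV += w * subset_entropy(S, q)
    return n * h(bconv(alpha, h_inv(HXV / (lam * n))))

if __name__ == "__main__":
    n, q, alpha = 10, 0.1, 0.2                    # illustrative parameters
    # Standard MGL uses H(X_1^n) = 1 + (n-1) h(q) for the symmetric chain.
    mgl = n * h(bconv(alpha, h_inv((1 + (n - 1) * h(q)) / n)))
    print("standard MGL (per symbol) :", mgl / n)
    print("Samorodnitsky (per symbol):", samorodnitsky_bound(n, q, alpha) / n)
```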
Samorodnitsky's MGL - Proof Outline

$$H(Y) \;=\; \sum_{i=1}^{n} H(Y_i \mid Y_1^{i-1}) \;\ge\; \sum_{i=1}^{n} \varphi\!\left(H(X_i \mid Y_1^{i-1})\right) \;=\; \sum_{i=1}^{n} \varphi\!\left(H(X_i) - I(X_i; Y_1^{i-1})\right),$$
where $\varphi(x) \triangleq h\!\left(\alpha * h^{-1}(x)\right)$.

Need to upper bound $I(X_i; Y_1^{i-1})$:
$$\begin{aligned}
I(X_i; Y_1^{i-1}) &= I(X_i; Y_1^{i-2}) + I(X_i; Y_{i-1} \mid Y_1^{i-2}) \\
&\le I(X_i; Y_1^{i-2}) + \lambda\, I(X_i; X_{i-1} \mid Y_1^{i-2}) && \text{(SDPI)} \\
&= (1-\lambda)\, I(X_i; Y_1^{i-2}) + \lambda\left( I(X_i; Y_1^{i-2}) + I(X_i; X_{i-1} \mid Y_1^{i-2}) \right) \\
&= (1-\lambda)\, I(X_i; Y_1^{i-2}) + \lambda\, I(X_i; X_{i-1}, Y_1^{i-2}) && \text{(Chain Rule)}
\end{aligned}$$