Entropy property testing with finitely many errors
Changlong Wu (Univ. of Hawaii, Manoa)
Joint work with Narayana Santhanam (Univ. of Hawaii, Manoa)
ISIT 2020 Online Talk, June 2020
Introduction

Meta-question: when will a scientist find the perfect theory, eventually almost surely?

Consider a scientist building a theory that describes a natural phenomenon by making observations.

The scientist may refine the theory every time new observations arrive (e.g., Newton → Einstein).

Will the scientist perpetually refine the theory, or settle on a perfect theory after making finitely many observations?
A toy example

Let p be a distribution over {1, 2, ..., m}, and let H(p) be the entropy of p.

For some fixed h ∈ [0, log m], we would like to decide: Is H(p) = h? by observing i.i.d. samples X_1, X_2, ... ~ p.

This seems to be an ill-posed problem, since one cannot decide correctly for distributions p with H(p) arbitrarily close to, but not equal to, h.
A toy example

We are allowed to sample as long as we want, but after some point we must make the right decision.

We show that for any h ∈ [0, log m], there exists a universal decision rule Φ such that, for any distribution p over [m],

Φ(X_1^n) → 1{H(p) = h} almost surely as n → ∞,

where X_1, X_2, ... ~ p independently.

In other words, Φ makes the right decision eventually almost surely.
Proof?

Let p̂_n be the empirical distribution of p with n samples.

A standard concentration inequality yields that there exists a number N such that for all n ≥ N,

P( ||p̂_n − p||_TV ≥ (log^2 n)/√n ) ≤ 1/n^2 .

Since the entropy function is uniformly continuous over a bounded support, there is a function t(n) → 0 such that for all n ≥ N,

P( |H(p̂_n) − H(p)| ≥ t(n) ) ≤ 1/n^2 .
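One concrete way to obtain such a t(n) (my sketch; the slide only requires some t(n) → 0) is to combine the total-variation bound above with a Fannes–Audenaert-type continuity estimate for entropy on an alphabet of size m, stated here as an assumption:

```latex
% Assumption: a Fannes--Audenaert-type continuity bound for entropy on [m],
% valid for delta small enough; any uniform continuity modulus would do.
\[
  \|\hat p_n - p\|_{TV} \le \delta
  \;\Longrightarrow\;
  |H(\hat p_n) - H(p)| \le \delta \log(m-1) + h_b(\delta),
\]
where $h_b$ denotes the binary entropy. Taking $\delta_n = \log^2 n / \sqrt{n}$ then gives
\[
  t(n) = \delta_n \log(m-1) + h_b(\delta_n) \;\longrightarrow\; 0 .
\]
```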
Proof?

The decision rule is as follows: if |H(p̂_n) − h| ≤ t(n), we decide "yes"; otherwise we decide "no".

Now, if indeed H(p) = h, the Borel–Cantelli lemma implies that the rule is correct for all but finitely many n ≥ N, with probability 1.

If H(p) ≠ h, there exists a number N_p such that, for all but finitely many n ≥ N_p, we have |H(p̂_n) − h| ≥ |H(p) − h| − t(n) > t(n) with probability 1, since t(n) → 0.
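A minimal simulation sketch of this decision rule in Python (mine, not from the slides): the function names and the particular threshold t(n) below, based on the Fannes-type choice discussed above, are my own choices; any vanishing threshold that eventually dominates the estimation error would serve the same purpose.

```python
import numpy as np

def empirical_entropy(samples, m):
    """Plug-in (empirical) entropy in nats for samples over {0, ..., m-1}."""
    counts = np.bincount(samples, minlength=m)
    p_hat = counts / counts.sum()
    nz = p_hat > 0
    return -np.sum(p_hat[nz] * np.log(p_hat[nz]))

def threshold(n, m):
    """One possible vanishing threshold t(n) (a Fannes-type choice);
    any t(n) -> 0 dominating the entropy-estimation error works."""
    delta = (np.log(n) ** 2) / np.sqrt(n)
    delta = min(max(delta, 1e-12), 0.5)  # keep the binary entropy well defined
    h_b = -delta * np.log(delta) - (1 - delta) * np.log(1 - delta)
    return delta * np.log(max(m - 1, 1)) + h_b

def decide_entropy_equals(samples, h, m):
    """Decide "is H(p) = h?": answer yes iff |H(p_hat) - h| <= t(n)."""
    n = len(samples)
    return abs(empirical_entropy(samples, m) - h) <= threshold(n, m)

# Example: p uniform over 4 symbols, so H(p) = log 4; test h = log 4.
rng = np.random.default_rng(0)
samples = rng.integers(0, 4, size=100_000)
print(decide_entropy_equals(samples, np.log(4), m=4))  # eventually always True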
Testing general entropy property

Let P be a class of distributions over N, and let A ⊂ R^+. For which combinations of P and A can we find a decision rule Φ such that

Φ(X_1^n) → 1{H(p) ∈ A} almost surely as n → ∞,

for all p ∈ P and X_1, X_2, ... i.i.d. ~ p?
F_σ-separable

Sets A ⊂ R^+ and A^c = R^+ \ A are said to be F_σ-separable if there exist collections of sets {B_n}_{n∈N} and {C_n}_{n∈N} such that
1. A = ∪_{n∈N} B_n and A^c = ∪_{n∈N} C_n;
2. for all n ∈ N, B_n ⊂ B_{n+1} and C_n ⊂ C_{n+1};
3. for all n ∈ N, inf{ |x − y| : x ∈ B_n, y ∈ C_n } > 0.
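As a quick sanity check (my example, not on the slide), the singleton A = {h} from the toy problem and its complement are F_σ-separable:

```latex
% Example (mine, not from the slides): A = {h} and A^c = R^+ \ {h}
% are F_sigma-separable, matching the toy question "is H(p) = h?".
\[
  B_n = \{h\}, \qquad
  C_n = \{\, x \in \mathbb{R}^+ : |x - h| \ge \tfrac{1}{n} \,\}.
\]
Then $A = \bigcup_{n} B_n$, $A^c = \bigcup_{n} C_n$, both sequences are
increasing, and $\inf\{|x - y| : x \in B_n,\ y \in C_n\} = 1/n > 0$ for every $n$.
```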
Bounded support case

Theorem. For any A ⊂ [0, log m], we can decide "Is H(p) ∈ A?" eventually almost surely for all distributions p over [m] iff A and A^c are F_σ-separable.
Infinite Alphabets

Does the result extend to distributions on the naturals with arbitrary support?

The answer is no; we prove the following theorem:

Theorem. For any k ≥ 1, there is no decision rule that decides
1. Is H(p) ≥ k?
2. Is H(p) finite?
eventually almost surely for all distributions over N.

The proof uses a diagonalization argument...
Infinite Alphabets

We note the following somewhat surprising theorem:

Theorem. For any k ≥ 1, there exists a decision rule that decides "Is H(p) > k?" eventually almost surely for all distributions over N.

The difference from the H(p) ≥ k case is that one can construct an estimator Ĥ such that Ĥ(X_1^n) ≤ H(p) and Ĥ(X_1^n) → H(p) almost surely. Decide "yes" if Ĥ(X_1^n) > k and "no" otherwise.
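A minimal sketch of that one-sided rule (mine, not from the slides): it assumes an under-estimator underestimate_entropy with the two properties stated above is available as a black box; constructing such an estimator is the nontrivial part and is not shown here.

```python
def decide_entropy_exceeds(samples, k, underestimate_entropy):
    """Decide "is H(p) > k?" using a hypothetical estimator H_hat with
    H_hat(X_1^n) <= H(p) always and H_hat(X_1^n) -> H(p) almost surely.

    If H(p) > k, H_hat eventually exceeds k, so we eventually answer "yes";
    if H(p) <= k, H_hat never exceeds k, so we always answer "no".
    """
    return underestimate_entropy(samples) > k
```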
Preparing for the main result: tail entropy

For a function ρ : N → R^+ and a class P of distributions over N, we say the tail entropy of P is eventually dominated by ρ if, for every p ∈ P, there exists a number N_p such that for all n ≥ N_p,

H_n(p) = − Σ_{i ≥ n} p_i log p_i ≤ ρ(n) .
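A concrete instance (my example, not on the slide): for the geometric-type distribution p_i = 2^{-i} on the naturals, the tail entropy has a closed form (entropy measured in bits):

```latex
% Example (mine, not from the slides): tail entropy of p_i = 2^{-i}, i >= 1, in bits.
\[
  H_n(p) = -\sum_{i \ge n} 2^{-i} \log_2 2^{-i}
         = \sum_{i \ge n} i\, 2^{-i}
         = (n+1)\, 2^{1-n} \;\longrightarrow\; 0 ,
\]
so the tail entropy of $\{p\}$ is eventually dominated by any $\rho$ with
$\rho(n) \ge (n+1)\, 2^{1-n}$ for all large $n$.
```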
Main Result

Theorem. Let ρ : N → R^+ be an arbitrary function such that ρ(n) → 0 as n → ∞, let P be a class of distributions whose tail entropy is eventually dominated by ρ, and let A ⊂ R^+. Then there exists a decision rule that decides "Is H(p) ∈ A?" eventually almost surely for all p ∈ P, iff A and A^c are F_σ-separable.