Statistical inference for Rényi entropy of integer order


  1. Statistical inference for Rényi entropy of integer order. David Källberg, August 23, 2010.

  2. Outline: Introduction; Measures of uncertainty; Estimation of entropy; Numerical experiment; More results; Conclusion.

  3. Coauthor: Oleg Seleznjev, Department of Mathematics and Mathematical Statistics, Umeå University.

  4. Introduction. A system is described by a probability distribution $P$. Only partial information about $P$ is available, e.g. the covariance matrix. A measure of uncertainty (entropy) in $P$. Which $P$ should we use, if any?

  5. Why entropy? The entropy maximization principle: choose $P$, satisfying given constraints, with maximum uncertainty. Objectivity: we don't use more information than we have.

  6. Measures of uncertainty: the Shannon entropy. Discrete $P = \{p(k), k \in D\}$:
     $$h_1(P) := -\sum_k p(k) \log p(k).$$
     Continuous $P$ with density $p(x)$, $x \in R^d$:
     $$h_1(P) := -\int_{R^d} \log(p(x))\, p(x)\, dx.$$

  7. Measures of uncertainty: the Rényi entropy. A class of entropies, of order $s \ge 0$, $s \ne 1$. Discrete $P = \{p(k), k \in D\}$:
     $$h_s(P) := \frac{1}{1-s} \log\Big(\sum_k p(k)^s\Big).$$
     Continuous $P$ with density $p(x)$, $x \in R^d$:
     $$h_s(P) := \frac{1}{1-s} \log\Big(\int_{R^d} p(x)^s\, dx\Big).$$

  8. Motivation. The Rényi entropy satisfies axioms on how a measure of uncertainty should behave, Rényi (1961, 1970). For both discrete and continuous $P$, the Rényi entropy is a generalization of the Shannon entropy:
     $$\lim_{q \to 1} h_q(P) = h_1(P).$$
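The two discrete definitions and the limit as the order tends to 1 can be checked numerically. The following is a minimal sketch, not part of the slides: the function names and the example distribution `p` are hypothetical, chosen only for illustration.

```python
import math


def shannon_entropy(p):
    """h_1(P) = -sum_k p(k) log p(k); atoms with zero probability are skipped."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)


def renyi_entropy(p, s):
    """h_s(P) = log(sum_k p(k)^s) / (1 - s) for order s >= 0, s != 1."""
    if s == 1:
        raise ValueError("s = 1 is the Shannon case; use shannon_entropy")
    return math.log(sum(pk ** s for pk in p if pk > 0)) / (1 - s)


# Hypothetical example distribution on four atoms.
p = [0.5, 0.25, 0.125, 0.125]

h1 = shannon_entropy(p)
for order in (2, 1.1, 1.01, 1.001):
    # The Rényi values should approach the Shannon value as the order tends to 1.
    print(order, renyi_entropy(p, order), h1)
```

The orders are taken from above here; approaching 1 from below gives the same limit.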

  9. Problem. Non-parametric estimation of integer order Rényi entropy, for discrete and continuous multivariate $P$, from a sample $\{X_1, \ldots, X_n\}$ of $P$-i.i.d. observations. Estimators of entropy are widely used: distribution identification problems (Student-r distributions), average case analysis for random databases, clustering.

  10. Overview of Rényi entropy estimation. Consistency of nearest neighbor estimators for any $s$, for continuous entropy only: Leonenko et al. (2008). Consistency and asymptotic normality for the quadratic case $s = 2$: Leonenko and Seleznjev (2010).

  11. Notation. $I(C)$: indicator function of the event $C$. $S_{s,n}$: set of all $s$-subsets of $\{1, \ldots, n\}$. $\sum_{(s,n)}$: summation over $S_{s,n}$. $d(x, y)$: the Euclidean distance in $R^d$. $B_\epsilon(x) := \{y : d(x, y) \le \epsilon\}$: ball of radius $\epsilon$ with center $x$. $b_\epsilon$: volume of $B_\epsilon(x)$. $p_\epsilon(x) := P(X \in B_\epsilon(x))$: the $\epsilon$-ball probability at $x$.

  12. Estimation. The method relies on estimating the functional
      $$q_s := E(p^{s-1}(X)) = \begin{cases} \sum_k p(k)^s & \text{(discrete)} \\ \int_{R^d} p(x)^s\, dx & \text{(continuous)} \end{cases}$$

  13. Estimation, discrete case. An estimator of $q_s$:
      $$Q_{s,n} := \binom{n}{s}^{-1} \sum_{(s,n)} \psi(S), \quad \text{where} \quad \psi(S) := \frac{1}{s} \sum_{i \in S} I(X_i = X_j, \forall j \in S).$$
      An estimator of $h_s$:
      $$H_{s,n} := \frac{1}{1-s} \log(\max(Q_{s,n}, 1/n)).$$
      $Q_{s,n}$ is a $U$-statistic, so its properties follow from conventional theory.
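In the discrete case the kernel $\psi(S)$ equals 1 exactly when all observations indexed by $S$ coincide, so the sum over $s$-subsets reduces to counting, for each observed value, the $s$-subsets drawn from its occurrences. The sketch below uses that shortcut; the function names are hypothetical, and it assumes the sample size is at least $s$.

```python
import math
from collections import Counter


def q_discrete(sample, s):
    """U-statistic Q_{s,n}: fraction of s-subsets whose observations are all equal.

    A value observed c times contributes comb(c, s) such subsets, so the sum
    over all s-subsets never has to be enumerated. Assumes len(sample) >= s.
    """
    n = len(sample)
    equal_subsets = sum(math.comb(c, s) for c in Counter(sample).values())
    return equal_subsets / math.comb(n, s)


def h_discrete(sample, s):
    """H_{s,n} = log(max(Q_{s,n}, 1/n)) / (1 - s), for integer s >= 2."""
    n = len(sample)
    return math.log(max(q_discrete(sample, s), 1 / n)) / (1 - s)


# Hypothetical sample, for illustration only.
sample = ["a", "b", "a", "c", "a", "b"]
print(q_discrete(sample, 2), h_discrete(sample, 2))
```

The truncation at $1/n$ keeps the logarithm finite when no $s$-subset of equal observations occurs.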

  14. Estimation, continuous case. A $U$-statistic estimator of $q_{\epsilon,s} := E(p_\epsilon(X)^{s-1})$:
      $$Q_{s,n} := \binom{n}{s}^{-1} \sum_{(s,n)} \psi(S), \quad \text{where} \quad \psi(S) := \frac{1}{s} \sum_{i \in S} I(d(X_i, X_j) \le \epsilon, \forall j \in S).$$
      $\tilde{Q}_{s,n} := Q_{s,n} / b_\epsilon(d)^{s-1}$ is an asymptotically unbiased estimator of $q_s$ if $\epsilon = \epsilon(n) \to 0$ as $n \to \infty$. An estimator of $h_s$:
      $$H_{s,n} := \frac{1}{1-s} \log(\max(\tilde{Q}_{s,n}, 1/n)).$$
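For the continuous case, a subset $S$ contributes through index $i$ exactly when the other $s - 1$ points of $S$ lie within $\epsilon$ of $X_i$, so the kernel sum can be obtained from neighbour counts. The following is a brute-force sketch under that observation; the function names are hypothetical, points are assumed to be tuples of floats of equal dimension, and the $O(n^2)$ distance loop is for illustration only.

```python
import math


def ball_volume(eps, d):
    """Volume b_eps(d) of a Euclidean ball of radius eps in R^d."""
    return math.pi ** (d / 2) * eps ** d / math.gamma(d / 2 + 1)


def q_tilde(points, s, eps):
    """Normalised U-statistic Q~_{s,n} = Q_{s,n} / b_eps(d)^(s-1).

    If N_i is the number of other sample points within eps of X_i, then the
    sum of psi(S) over all s-subsets equals (1/s) * sum_i comb(N_i, s - 1).
    """
    n = len(points)
    d = len(points[0])
    total = 0
    for i, xi in enumerate(points):
        neighbours = sum(
            1 for j, xj in enumerate(points)
            if j != i and math.dist(xi, xj) <= eps
        )
        total += math.comb(neighbours, s - 1)
    q = total / (s * math.comb(n, s))
    return q / ball_volume(eps, d) ** (s - 1)


def h_continuous(points, s, eps):
    """H_{s,n} = log(max(Q~_{s,n}, 1/n)) / (1 - s), for integer s >= 2."""
    n = len(points)
    return math.log(max(q_tilde(points, s, eps), 1 / n)) / (1 - s)


# Hypothetical one-dimensional sample, for illustration only.
points = [(0.0,), (0.1,), (5.0,), (5.2,), (9.0,)]
print(q_tilde(points, 2, 0.5), h_continuous(points, 2, 0.5))
```

For large samples a spatial index would replace the double loop; that choice is outside what the slides specify.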

  15. Smoothness conditions. Denote by $H^{(\alpha)}(K)$, $0 < \alpha \le 2$, $K > 0$, the linear space of functions continuous in $R^d$ that satisfy an $\alpha$-Hölder condition if $0 < \alpha \le 1$ or, if $1 < \alpha \le 2$, have continuous partial derivatives satisfying an $(\alpha - 1)$-Hölder condition, with constant $K$.

  16. Asymptotics, continuous case. Let $\epsilon = \epsilon(n) \to 0$ as $n \to \infty$, and let $L(n) > 0$, $n \ge 1$, be a slowly varying function as $n \to \infty$. A consistent estimator of the asymptotic variance:
      $$K_{s,n} := \max\big(s^2 (\tilde{Q}_{2s-1,n} - \tilde{Q}_{s,n}^2),\, 1/n\big).$$
      Theorem. Suppose that $p(x)$ is bounded and continuous. Let $n\epsilon^d \to a$ for some $0 < a \le \infty$. (i) Then $H_{s,n}$ is a consistent estimator of $h_s$. (ii) Let $p(x)^{s-1} \in H^{(\alpha)}(K)$ for some $d/2 < \alpha \le 2$. If $\epsilon \sim L(n)\, n^{-1/d}$ and $a = \infty$, then
      $$\frac{\sqrt{n}\, \tilde{Q}_{s,n} (1 - s)}{\sqrt{K_{s,n}}} (H_{s,n} - h_s) \xrightarrow{D} N(0, 1) \quad \text{as } n \to \infty.$$
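One way to use the limit in part (ii) is to invert the studentized statistic into an approximate confidence interval for $h_s$. The slides state only the limit theorem; the interval below is an assumption about how it might be applied, and the function name and arguments are hypothetical. It takes already computed values of $\tilde{Q}_{s,n}$ and $\tilde{Q}_{2s-1,n}$ and assumes $\tilde{Q}_{s,n} > 0$.

```python
import math
from statistics import NormalDist


def renyi_confidence_interval(q_s, q_2s_minus_1, n, s, level=0.95):
    """Approximate interval for h_s from the asymptotic normality result.

    q_s and q_2s_minus_1 stand for Q~_{s,n} and Q~_{2s-1,n}; q_s must be positive.
    """
    h = math.log(max(q_s, 1 / n)) / (1 - s)               # H_{s,n}
    k = max(s ** 2 * (q_2s_minus_1 - q_s ** 2), 1 / n)    # K_{s,n}
    z = NormalDist().inv_cdf(0.5 + level / 2)             # normal quantile
    half_width = z * math.sqrt(k) / (math.sqrt(n) * q_s * abs(1 - s))
    return h - half_width, h + half_width


# Illustration with exact chi-square(4) values in place of estimates:
# q_3 = 1/54 is from the slides; q_5 = 3/6250 was computed from the same
# density for this sketch and is not stated in the slides.
print(renyi_confidence_interval(1 / 54, 3 / 6250, n=1000, s=3))
```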

  17. Numerical experiment. $\chi^2$ distribution with 4 degrees of freedom. $h_3 = -\frac{1}{2} \log(q_3)$, where $q_3 = 1/54$. 500 simulations, each of size $n = 1000$, with $\epsilon = 1/4$. The quantile plot and histogram support standard normality. Remark: the choice of $\epsilon$ in a practical situation remains an open problem.
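The value $q_3 = 1/54$ can be verified directly from the $\chi^2$ density with 4 degrees of freedom, $p(x) = \frac{1}{4} x e^{-x/2}$ for $x > 0$. The sketch below integrates $p(x)^3$ with a simple midpoint rule; the helper names, the truncation of the integral at 60, and the number of steps are choices made here for illustration.

```python
import math


def chi2_4_density(x):
    """Density of the chi-square distribution with 4 degrees of freedom, x > 0."""
    return 0.25 * x * math.exp(-x / 2)


def integrate_midpoint(f, a, b, steps):
    """Composite midpoint rule for the integral of f over [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


# The integrand decays like exp(-3x/2), so truncating at 60 loses a negligible tail.
q3 = integrate_midpoint(lambda x: chi2_4_density(x) ** 3, 0.0, 60.0, 200_000)
h3 = -0.5 * math.log(q3)

print(q3, 1 / 54)               # numerical value against the stated one
print(h3, 0.5 * math.log(54))   # h_3 = -(1/2) log(q_3) = (1/2) log 54
```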

  18. Figure: histogram of x, shown with the standard normal density (horizontal axis: x).

  19. Figure: normal Q-Q plot (sample quantiles against theoretical quantiles).

  20. More results. Estimation of (quadratic) Rényi entropy from an $m$-dependent sample. Two different distributions $P_X$ and $P_Y$. Inference for functionals of type
      $$\int_{R^d} p_X(x)^{s_1} p_Y(x)^{s_2}\, dx, \quad s_1, s_2 \in N_+,$$
      and for statistical distances (Bregman). Discrete:
      $$D_2 := \sum_k (p_X(k) - p_Y(k))^2.$$
      Continuous:
      $$D_2 := \int_{R^d} (p_X(x) - p_Y(x))^2\, dx.$$

  21. Conclusion. Asymptotically normal estimates are possible for Rényi entropy of integer order.

  22. References.
      Leonenko, N., Pronzato, L. and Savani, V. (2008). A class of Rényi information estimators for multidimensional densities, Annals of Statistics 36, 2153-2182.
      Leonenko, N. and Seleznjev, O. (2010). Statistical inference for ε-entropy and quadratic Rényi entropy, J. Multivariate Analysis 101, Issue 9, 1981-1994.
      Rényi, A. (1961). On measures of entropy and information, Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability 1960, 547-561.
      Rényi, A. (1970). Probability Theory, North-Holland Publishing Company.
