Statistical inference for Rényi entropy

Statistical inference for Rényi entropy, David Källberg (PowerPoint presentation)



  1. Statistical inference for Rényi entropy. David Källberg, Department of Mathematics and Mathematical Statistics, Umeå University.

  2. Coauthor: Oleg Seleznjev, Department of Mathematics and Mathematical Statistics, Umeå University.

  3. Outline: Introduction. Measures of uncertainty. U-statistics. Estimation of entropy. Numerical experiment. Conclusion.

  4. Introduction. A system is described by a probability distribution P. Only partial information about P is available, e.g. the mean. We want a measure of the uncertainty (entropy) in P. Which P should we use, if any?

  5. Why entropy? The entropy maximization principle: choose the P that satisfies the given constraints and has maximum uncertainty. Objectivity: we do not use more information than we have.

  6. Measures of uncertainty. The Shannon entropy. Discrete P = {p(k), k \in D}: h_1(P) := -\sum_{k} p(k) \log p(k). Continuous P with density p(x), x \in R^d: h_1(P) := -\int_{R^d} \log(p(x)) \, p(x) \, dx.
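The discrete formula can be sketched in a few lines of Python (illustrative, not from the slides; the function name is my own):

```python
import math

def shannon_entropy(p):
    """Shannon entropy h_1(P) = -sum_k p(k) log p(k), in nats."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

# The uniform distribution on 4 outcomes has maximum uncertainty: log 4.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # ≈ 1.3863
```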

  7. Measures of uncertainty. The Rényi entropy. A class of entropies of order s \neq 1, given by: Discrete P = {p(k), k \in D}: h_s(P) := \frac{1}{1-s} \log(\sum_{k} p(k)^s). Continuous P with density p(x), x \in R^d: h_s(P) := \frac{1}{1-s} \log(\int_{R^d} p(x)^s \, dx).
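A matching sketch for the discrete case (illustrative; the function name is my own). For a uniform distribution every order gives the same value, the log of the number of outcomes:

```python
import math

def renyi_entropy(p, s):
    """Rényi entropy h_s(P) = (1/(1-s)) log(sum_k p(k)^s), s != 1, in nats."""
    return math.log(sum(pk ** s for pk in p)) / (1 - s)

print(renyi_entropy([0.25] * 4, 2))         # uniform: log 4 ≈ 1.3863
print(renyi_entropy([0.5, 0.25, 0.25], 2))  # -log(0.375) ≈ 0.9808
```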

  8. Motivation. The Rényi entropy satisfies axioms on how a measure of uncertainty should behave, Rényi (1970). For both discrete and continuous P, the Rényi entropy is a generalization of the Shannon entropy, because \lim_{s \to 1} h_s(P) = h_1(P).
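The limit can be checked numerically for a small discrete P (an illustrative sketch; helper names are my own):

```python
import math

def shannon(p):
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def renyi(p, s):
    return math.log(sum(pk ** s for pk in p)) / (1 - s)

p = [0.5, 0.3, 0.2]
for s in (1.1, 1.01, 1.001):
    print(s, renyi(p, s))  # approaches shannon(p) ≈ 1.0297 as s -> 1
```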

  9. Problem. Non-parametric estimation of integer-order Rényi entropy, for discrete and continuous P, from a sample {X_1, ..., X_n} of P-iid observations.

  10. Overview of Rényi entropy estimation, continuous P. Consistency of nearest-neighbor estimators for any s, Leonenko et al. (2008). Consistency and asymptotic normality for the quadratic case s = 2, Leonenko and Seleznjev (2010).

  11. U-statistics: basic setup. For a P-iid sample {X_1, ..., X_n} and a symmetric kernel function h(x_1, ..., x_m) with E h(X_1, ..., X_m) = \theta(P), the U-statistic estimator of \theta is defined as U_n = U_n(h) := \binom{n}{m}^{-1} \sum_{1 \le i_1 < ... < i_m \le n} h(X_{i_1}, ..., X_{i_m}).
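A classical example, not from the slides: with the kernel h(x_1, x_2) = (x_1 - x_2)^2 / 2 we have E h(X_1, X_2) = Var(X), so the corresponding U-statistic is exactly the unbiased sample variance:

```python
from itertools import combinations

def u_statistic(sample, kernel, m):
    """U_n(h): average of the symmetric kernel over all m-element subsets."""
    vals = [kernel(*c) for c in combinations(sample, m)]
    return sum(vals) / len(vals)

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
h = lambda a, b: (a - b) ** 2 / 2   # E h(X1, X2) = Var(X)
print(u_statistic(x, h, 2))         # equals the sample variance 32/7 ≈ 4.5714
```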

  12. U-statistics: properties. Symmetric, unbiased. Optimality properties for a large class of P. Asymptotically normally distributed.

  13. Estimation, continuous case. The method relies on estimating the functional q_s := \int_{R^d} p^s(x) \, dx = E(p^{s-1}(X)).

  14. Estimation, some notation. d(x, y): the Euclidean distance in R^d. B_\epsilon(x) := {y : d(x, y) \le \epsilon}: the ball of radius \epsilon with center x. b_\epsilon: the volume of B_\epsilon(x). p_\epsilon(x) := P(X \in B_\epsilon(x)): the \epsilon-ball probability at x.

  15. Estimation, useful limit. When p(x) is bounded and continuous, we can rewrite q_s = \lim_{\epsilon \to 0} E(p_\epsilon^{s-1}(X)) / b_\epsilon^{s-1}. So an unbiased estimate of q_{s,\epsilon} := E(p_\epsilon^{s-1}(X)) leads to an asymptotically unbiased estimate of q_s as \epsilon \to 0.

  16. Estimation of q_{s,\epsilon}. For s = 2, 3, 4, ..., let I_{ij}(\epsilon) := I(d(X_i, X_j) \le \epsilon) and \tilde{I}_i(\epsilon) := \prod_{1 \le j \le s, j \neq i} I_{ij}(\epsilon). Define the U-statistic Q_{s,n} for q_{s,\epsilon} by the kernel h_s(x_1, ..., x_s) := \frac{1}{s} \sum_{i=1}^{s} \tilde{I}_i(\epsilon).

  17. Estimation of Rényi entropy. Denote by \tilde{Q}_{s,n} := Q_{s,n} / b_\epsilon^{s-1} an estimator of q_s, and by H_{s,n} := \frac{1}{1-s} \log(\max(\tilde{Q}_{s,n}, 1/n)) the corresponding estimator of h_s.
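For the quadratic case s = 2 in one dimension (d = 1, so b_\epsilon = 2\epsilon), the estimator reduces to counting \epsilon-close pairs. A small sketch (the function name and the toy data are my own):

```python
import math
from itertools import combinations

def renyi2_estimate(x, eps):
    """H_{2,n} = -log(max(Q_tilde, 1/n)), where Q_tilde = Q_{2,n} / b_eps and
    Q_{2,n} is the fraction of pairs with |X_i - X_j| <= eps (d = 1)."""
    n = len(x)
    pairs = list(combinations(x, 2))
    q = sum(abs(a - b) <= eps for a, b in pairs) / len(pairs)  # Q_{2,n}
    q_tilde = q / (2 * eps)                                    # b_eps = 2*eps in R^1
    return -math.log(max(q_tilde, 1 / n))

x = [0.0, 0.05, 0.5, 0.55, 1.0]
print(renyi2_estimate(x, 0.1))  # 2 of 10 pairs within 0.1: Q_tilde = 1, H = 0
```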

  18. Consistency. Assume \epsilon = \epsilon(n) \to 0 as n \to \infty. Let v_{s,n}^2 := Var(\tilde{Q}_{s,n}). If n\epsilon^d \to a \in (0, \infty], then v_{s,n}^2 \to 0. Theorem. Let n\epsilon^d \to a, 0 < a \le \infty, and let p(x) be bounded and continuous. Then H_{s,n} is a consistent estimator of h_s.

  19. Smoothness conditions. Denote by H^{(\alpha)}(K), 0 < \alpha \le 2, K > 0, the linear space of continuous functions in R^d satisfying an \alpha-Hölder condition if 0 < \alpha \le 1, or, if 1 < \alpha \le 2, having continuous partial derivatives that satisfy an (\alpha - 1)-Hölder condition with constant K.

  20. Asymptotic normality. When n\epsilon^d \to \infty, we have v_{s,n}^2 \sim s(q_{2s-1} - q_s^2)/n. Let K_{s,n} := \max(s(\tilde{Q}_{2s-1,n} - \tilde{Q}_{s,n}^2), 1/n), and let L(n) > 0, n \ge 1, be a slowly varying function as n \to \infty. Theorem. Let p^{s-1}(x) \in H^{(\alpha)}(K) for \alpha > d/2. If \epsilon \sim L(n) n^{-1/d} and n\epsilon^d \to \infty, then \sqrt{n} \, \tilde{Q}_{s,n} (1-s) (H_{s,n} - h_s) / \sqrt{K_{s,n}} \to N(0, 1) in distribution.
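A sketch of how the theorem yields an approximate confidence interval for h_2, on the \chi^2_4 example of the next slide (the fixed bandwidth, sample size, and the degree-counting shortcut for the triple sum are my own choices):

```python
import math
import numpy as np

# Single run: chi^2 with 4 df in d = 1; eps and n are ad hoc choices.
rng = np.random.default_rng(1)
n, eps = 400, 0.2
x = rng.chisquare(df=4, size=n)

dist = np.abs(x[:, None] - x[None, :])
deg = (dist <= eps).sum(axis=1) - 1          # eps-neighbors of each point

q2 = deg.sum() / 2 / math.comb(n, 2)         # Q_{2,n}: fraction of eps-close pairs
q3 = (deg * (deg - 1) / 2).sum() / (3 * math.comb(n, 3))  # Q_{3,n} via sum_i C(deg_i, 2)
q2t = q2 / (2 * eps)                         # divide by b_eps^(s-1), b_eps = 2*eps
q3t = q3 / (2 * eps) ** 2

h2 = -math.log(max(q2t, 1 / n))              # H_{2,n}
k2 = max(2 * (q3t - q2t ** 2), 1 / n)        # K_{s,n} for s = 2
half = 1.96 * math.sqrt(k2) / (math.sqrt(n) * q2t)  # 95% half-width from the theorem
print(h2 - half, h2 + half)                  # true h_2 = log 8 ≈ 2.079 for chi^2_4
```

Here the half-width comes from solving the theorem's pivot for H_{s,n} - h_s, with |1 - s| = 1.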

  21. Numerical experiment. \chi^2 distribution with 4 degrees of freedom. h_3 = -\frac{1}{2} \log(q_3), where q_3 = 1/54. 300 simulations, each of size n = 500. The quantile plot and histogram support standard normality.
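A rough single-run replication sketch for s = 3 (d = 1; the fixed bandwidth and the identity sum_i C(deg_i, 2) used to sum the kernel over triples are my own choices):

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n, eps = 500, 0.25                       # slides use eps ~ L(n) n^(-1/d); fixed here
x = rng.chisquare(df=4, size=n)

# Q_{3,n}: over each unordered triple the kernel averages the indicators that one
# point has both others within eps, so the triple sum equals sum_i C(deg_i, 2)
# with deg_i = number of eps-neighbors of X_i, divided by 3 * C(n, 3).
dist = np.abs(x[:, None] - x[None, :])
deg = (dist <= eps).sum(axis=1) - 1      # exclude the point itself
q3 = (deg * (deg - 1) / 2).sum() / (3 * math.comb(n, 3))
q3_tilde = q3 / (2 * eps) ** 2           # divide by b_eps^(s-1), b_eps = 2*eps
h3_hat = -0.5 * math.log(max(q3_tilde, 1 / n))
print(h3_hat)                            # target: h_3 = log(54)/2 ≈ 1.995
```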

  22. Figures. [Normal Q-Q plot of the standardized estimates: sample quantiles against theoretical quantiles.]

  23. Figures. [Histogram of the standardized estimates from the \chi^2 sample, with the standard normal density overlaid.]

  24. Conclusion. Asymptotically normal estimates are possible for integer-order Rényi entropy.

  25. References. Leonenko, N., Pronzato, L., Savani, V. (2008). A class of Rényi information estimators for multidimensional densities. Annals of Statistics 36, 2153-2182. Leonenko, N. and Seleznjev, O. (2009). Statistical inference for \epsilon-entropy and quadratic Rényi entropy. Univ. Umeå, Research Rep., Dep. Math. and Math. Stat., 1-21; J. Multivariate Analysis (submitted). Rényi, A. (1970). Probability Theory. North-Holland Publishing Company.
