Statistical inference for Rényi entropy - David Källberg

Statistical inference for Rényi entropy
David Källberg
Department of Mathematics and Mathematical Statistics, Umeå University

Coauthor
Oleg Seleznjev, Department of Mathematics and Mathematical Statistics, Umeå University

Outline
Introduction
Measures of uncertainty
U-statistics
Estimation of entropy
Numerical experiment
Conclusion

Introduction
A system is described by a probability distribution P.
Only partial information about P is available, e.g. the mean.
Entropy is a measure of the uncertainty in P.
Which P should we use, if any?

Why entropy?
The entropy maximization principle: among the distributions P satisfying the given constraints, choose the one with maximum uncertainty.
Objectivity: we do not use more information than we have.

Measures of uncertainty: the Shannon entropy
Discrete $P = \{p(k),\ k \in D\}$:
\[ h_1(P) := -\sum_{k} p(k) \log p(k) \]
Continuous $P$ with density $p(x)$, $x \in \mathbb{R}^d$:
\[ h_1(P) := -\int_{\mathbb{R}^d} \log(p(x))\, p(x)\, dx \]

Measures of uncertainty: the Rényi entropy
A class of entropies. The entropy of order $s \neq 1$ is given by:
Discrete $P = \{p(k),\ k \in D\}$:
\[ h_s(P) := \frac{1}{1-s} \log\Big(\sum_{k} p(k)^s\Big) \]
Continuous $P$ with density $p(x)$, $x \in \mathbb{R}^d$:
\[ h_s(P) := \frac{1}{1-s} \log\Big(\int_{\mathbb{R}^d} p(x)^s\, dx\Big) \]

Motivation
The Rényi entropy satisfies axioms on how a measure of uncertainty should behave, Rényi (1970).
For both discrete and continuous P, the Rényi entropy generalizes the Shannon entropy, because
\[ \lim_{s \to 1} h_s(P) = h_1(P) \]
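The limit above can be checked numerically in the discrete case. The following is a minimal sketch, not part of the original slides; the function names `shannon_entropy` and `renyi_entropy` are illustrative choices, and natural logarithms are assumed.

```python
import math


def shannon_entropy(p):
    """Shannon entropy h_1(P) = -sum_k p(k) log p(k) (natural log)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)


def renyi_entropy(p, s):
    """Renyi entropy h_s(P) = log(sum_k p(k)^s) / (1 - s), for s != 1."""
    if s == 1:
        raise ValueError("order s = 1 is the Shannon case; use shannon_entropy")
    return math.log(sum(pk ** s for pk in p if pk > 0)) / (1 - s)


if __name__ == "__main__":
    p = [0.5, 0.25, 0.25]
    print("Shannon:", shannon_entropy(p))
    # Orders approaching 1 should approach the Shannon value.
    for s in (2.0, 1.5, 1.1, 1.01, 1.001):
        print("Renyi, s =", s, ":", renyi_entropy(p, s))
```

For a uniform distribution every order gives the same value, log of the number of outcomes, which is a quick sanity check on the definition.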

Problem
Non-parametric estimation of integer-order Rényi entropy, for discrete and continuous P, from a sample $\{X_1, \ldots, X_n\}$ of i.i.d. observations from P.

Overview of Rényi entropy estimation, continuous P
Consistency of nearest-neighbor estimators for any s: Leonenko et al. (2008).
Consistency and asymptotic normality for the quadratic case s = 2: Leonenko and Seleznjev (2010).

U-statistics: basic setup
Consider a P-iid sample $\{X_1, \ldots, X_n\}$ and a symmetric kernel function $h(x_1, \ldots, x_m)$ with
\[ E\, h(X_1, \ldots, X_m) = \theta(P). \]
The U-statistic estimator of $\theta$ is defined as
\[ U_n = U_n(h) := \binom{n}{m}^{-1} \sum_{1 \le i_1 < \cdots < i_m \le n} h(X_{i_1}, \ldots, X_{i_m}). \]
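As an illustration of the definition, the sketch below averages a kernel over all size-m subsets of a sample. It is not from the slides; the helper name `u_statistic` and the example kernels are assumptions chosen for illustration. With the kernel h(x, y) = (x - y)^2 / 2, the U-statistic coincides with the unbiased sample variance.

```python
from itertools import combinations
from math import comb


def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel over all subsets {i_1 < ... < i_m}."""
    n = len(sample)
    total = sum(
        kernel(*(sample[i] for i in idx))
        for idx in combinations(range(n), m)
    )
    return total / comb(n, m)


def variance_kernel(x, y):
    """Kernel with E h(X_1, X_2) = Var(X)."""
    return (x - y) ** 2 / 2


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(u_statistic(data, variance_kernel, 2))  # unbiased sample variance
    print(u_statistic(data, lambda x: x, 1))      # sample mean
```

Enumerating all subsets costs on the order of n^m kernel evaluations, so this direct form is only practical for small samples or small m.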

U-statistics: properties
Symmetric, unbiased.
Optimality properties for a large class of P.
Asymptotically normally distributed.

Estimation, continuous case
The method relies on estimating the functional
\[ q_s := \int_{\mathbb{R}^d} p^s(x)\, dx = E\big(p^{s-1}(X)\big). \]

Estimation, some notation
$d(x, y)$: the Euclidean distance in $\mathbb{R}^d$.
$B_\epsilon(x) := \{y : d(x, y) \le \epsilon\}$: the ball of radius $\epsilon$ with center $x$.
$b_\epsilon$: the volume of $B_\epsilon(x)$.
$p_\epsilon(x) := P(X \in B_\epsilon(x))$: the $\epsilon$-ball probability at $x$.

Estimation, useful limit
When $p(x)$ is bounded and continuous, we can rewrite
\[ q_s = \lim_{\epsilon \to 0} E\big(p_\epsilon^{s-1}(X)\big) / b_\epsilon^{s-1}. \]
So an unbiased estimate of $q_{s,\epsilon} := E\big(p_\epsilon^{s-1}(X)\big)$ leads to an asymptotically unbiased estimate of $q_s$ as $\epsilon \to 0$.
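The limit can be motivated by the following short derivation. This is a sketch of the standard argument under the slide's assumptions (p bounded and continuous), not a reproduction of the authors' proof.

```latex
% Ball probability as a local average of the density (p continuous at x):
p_\epsilon(x) = \int_{B_\epsilon(x)} p(y)\, dy = b_\epsilon\, p(x) + o(b_\epsilon),
\qquad \epsilon \to 0,
% hence, pointwise,
\frac{p_\epsilon^{s-1}(x)}{b_\epsilon^{s-1}} \longrightarrow p^{s-1}(x).
% Since p is bounded, p_\epsilon(x)/b_\epsilon \le \sup_y p(y), so dominated
% convergence gives
\lim_{\epsilon \to 0} \frac{E\big(p_\epsilon^{s-1}(X)\big)}{b_\epsilon^{s-1}}
= E\big(p^{s-1}(X)\big)
= \int_{\mathbb{R}^d} p^{s-1}(x)\, p(x)\, dx
= q_s .
```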

Estimation of $q_{s,\epsilon}$
For $s = 2, 3, 4, \ldots$, let
\[ I_{ij}(\epsilon) := I\big(d(X_i, X_j) \le \epsilon\big), \qquad \tilde{I}_i(\epsilon) := \prod_{1 \le j \le s,\ j \ne i} I_{ij}(\epsilon). \]
Define the U-statistic $Q_{s,n}$ for $q_{s,\epsilon}$ by the kernel
\[ h_s(x_1, \ldots, x_s) := \frac{1}{s} \sum_{i=1}^{s} \tilde{I}_i(\epsilon). \]
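A short check of why this kernel targets $q_{s,\epsilon}$. It assumes that $\tilde{I}_i(\epsilon)$ is the product of the indicators $I_{ij}(\epsilon)$ over $j \ne i$ (the operator symbol is damaged in the extracted slide text, and the product is the reading under which the kernel is unbiased).

```latex
% Condition on X_i; the remaining s - 1 observations are i.i.d. and each
% falls in B_\epsilon(X_i) with probability p_\epsilon(X_i):
E\big(\tilde{I}_i(\epsilon)\big)
= E\Big( E\Big( \prod_{j \ne i} I\big(d(X_i, X_j) \le \epsilon\big) \,\Big|\, X_i \Big) \Big)
= E\big(p_\epsilon^{s-1}(X_i)\big)
= q_{s,\epsilon},
% and averaging over the s possible centers keeps the mean unchanged:
E\, h_s(X_1, \ldots, X_s) = \frac{1}{s} \sum_{i=1}^{s} q_{s,\epsilon} = q_{s,\epsilon}.
```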

Estimation of Rényi entropy
Denote by $\tilde{Q}_{s,n} := Q_{s,n} / b_\epsilon^{s-1}$ an estimator of $q_s$, and by
\[ H_{s,n} := \frac{1}{1-s} \log\big(\max(\tilde{Q}_{s,n},\ 1/n)\big) \]
the corresponding estimator of $h_s$.
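The estimator can be sketched in Python as follows. This is an illustrative implementation, not the authors' code: all function names are hypothetical, NumPy is assumed to be available, and the kernel is read as a product of pairwise indicators. The function `q_stat` uses a counting shortcut that is not on the slides: if N_i is the number of other sample points within distance eps of X_i, then summing the kernel over all size-s subsets gives sum_i C(N_i, s-1) / s. The brute-force version is included to check that shortcut on small samples.

```python
import math
from itertools import combinations

import numpy as np


def ball_volume(eps, d):
    """Volume b_eps of a Euclidean ball of radius eps in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * eps ** d


def pairwise_close(X, eps):
    """Boolean matrix with entry (i, j) equal to I(d(X_i, X_j) <= eps)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)) <= eps


def q_stat(X, s, eps):
    """U-statistic Q_{s,n} via neighbor counts (assumed counting identity)."""
    n = X.shape[0]
    counts = pairwise_close(X, eps).sum(axis=1) - 1  # exclude the point itself
    total = sum(math.comb(int(c), s - 1) for c in counts)
    return total / (s * math.comb(n, s))


def q_stat_bruteforce(X, s, eps):
    """Same statistic by direct enumeration of all size-s subsets."""
    n = X.shape[0]
    close = pairwise_close(X, eps)
    total = 0.0
    for idx in combinations(range(n), s):
        sub = close[np.ix_(idx, idx)]
        # Row i is all-True when every point of the subset lies within eps
        # of point i; the mean over rows is (1/s) * sum_i I~_i(eps).
        total += sub.all(axis=1).mean()
    return total / math.comb(n, s)


def renyi_entropy_estimate(X, s, eps):
    """H_{s,n} = log(max(Q_{s,n} / b_eps^(s-1), 1/n)) / (1 - s)."""
    n, d = X.shape
    q_tilde = q_stat(X, s, eps) / ball_volume(eps, d) ** (s - 1)
    return math.log(max(q_tilde, 1.0 / n)) / (1 - s)
```

Here `X` is an array of shape (n, d). The pairwise distance matrix needs memory of order n^2, which is fine for samples of a few thousand points; a spatial index would be the natural replacement for larger n.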

Consistency
Assume $\epsilon = \epsilon(n) \to 0$ as $n \to \infty$.
Let $v_{s,n}^2 := \mathrm{Var}(\tilde{Q}_{s,n})$.
Then $v_{s,n}^2 \to 0$ as $n\epsilon^d \to a \in (0, \infty]$, so we get:
Theorem. Let $n\epsilon^d \to a$, $0 < a \le \infty$, and let $p(x)$ be bounded and continuous. Then $H_{s,n}$ is a consistent estimator of $h_s$.

Smoothness conditions
Denote by $H^{\alpha}(K)$, $0 < \alpha \le 2$, $K > 0$, a linear space of continuous functions in $\mathbb{R}^d$ that satisfy an $\alpha$-Hölder condition if $0 < \alpha \le 1$, or, if $1 < \alpha \le 2$, have continuous partial derivatives satisfying an $(\alpha - 1)$-Hölder condition, in both cases with constant $K$.

Asymptotic normality
When $n\epsilon^d \to \infty$, we have $v_{s,n}^2 \sim s\,(q_{2s-1} - q_s^2)/n$.
Let $K_{s,n} := \max\big(s\,(\tilde{Q}_{2s-1,n} - \tilde{Q}_{s,n}^2),\ 1/n\big)$.
Let $L(n) > 0$, $n \ge 1$, be a slowly varying function as $n \to \infty$.
Theorem. Let $p^{s-1}(x) \in H^{\alpha}(K)$ for $\alpha > d/2$. If $\epsilon \sim L(n)\, n^{-1/d}$ and $n\epsilon^d \to \infty$, then
\[ \frac{\sqrt{n}\, \tilde{Q}_{s,n}\,(1-s)}{\sqrt{K_{s,n}}}\, (H_{s,n} - h_s) \xrightarrow{D} N(0, 1). \]

Numerical experiment
$\chi^2$ distribution with 4 degrees of freedom.
$h_3 = -\frac{1}{2} \log(q_3)$, where $q_3 = 1/54$.
300 simulations, each of size $n = 500$.
The quantile plot and the histogram support standard normality.
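The value $q_3 = 1/54$ can be verified directly from the $\chi^2_4$ density. The following worked computation is added for illustration and uses only the standard density formula and the Gamma integral.

```latex
% Density of the chi-square distribution with 4 degrees of freedom:
p(x) = \tfrac{1}{4}\, x\, e^{-x/2}, \qquad x > 0 .
% Third-order functional:
q_3 = \int_0^\infty p^3(x)\, dx
    = \frac{1}{64} \int_0^\infty x^3 e^{-3x/2}\, dx
    = \frac{1}{64} \cdot \frac{\Gamma(4)}{(3/2)^4}
    = \frac{1}{64} \cdot \frac{6 \cdot 16}{81}
    = \frac{1}{54} .
% Renyi entropy of order 3:
h_3 = \frac{1}{1-3} \log q_3 = \tfrac{1}{2} \log 54 \approx 1.994 .
```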

Figures
[Figure: Normal Q-Q plot; x-axis: theoretical quantiles, y-axis: sample quantiles.]

Figures
[Figure: Chi2 sample compared with the standard normal density.]

Conclusion
Asymptotically normal estimates are possible for integer-order Rényi entropy.

References
Leonenko, N., Pronzato, L. and Savani, V. (2008). A class of Rényi information estimators for multidimensional densities. Annals of Statistics 36, 2153-2182.
Leonenko, N. and Seleznjev, O. (2009). Statistical inference for $\epsilon$-entropy and quadratic Rényi entropy. Univ. Umeå, Research Rep., Dep. Math. and Math. Stat., 1-21; J. Multivariate Analysis (submitted).
Rényi, A. (1970). Probability Theory. North-Holland Publishing Company.
