Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension
Jisu Kim, Inria Saclay
2019-06-11
Poster: #188
Kernel Density Estimator
◮ For $X_1, \ldots, X_n \sim P$, a given kernel function $K$, and a bandwidth $h > 0$, the Kernel Density Estimator (KDE) $\hat{p}_h : \mathbb{R}^d \to \mathbb{R}$ is
$$\hat{p}_h(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right).$$
[Figure: the KDE $\hat{p}_h$ computed on a one-dimensional sample.]
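The definition above can be sketched directly in NumPy. This is a minimal illustration, not the authors' code; the Gaussian kernel and the sample distribution are assumptions chosen for the example.

```python
import numpy as np

def kde(x, data, h, d=1):
    """Kernel Density Estimator with a Gaussian kernel (an assumed choice of K):
    p_hat_h(x) = (1 / (n h^d)) * sum_i K((x - X_i) / h)."""
    n = len(data)
    u = (x - data) / h                               # scaled distances to each sample
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)     # Gaussian kernel values
    return K.sum() / (n * h**d)

rng = np.random.default_rng(0)
X = rng.normal(size=100)          # sample from P = N(0, 1)
print(kde(0.0, X, h=0.5))         # estimate of the smoothed density at x = 0
```

For a standard normal sample, the value at $x = 0$ should be close to the true density $\varphi(0) \approx 0.40$, up to smoothing bias and sampling noise.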
Average Kernel Density Estimator
◮ The Average Kernel Density Estimator (average KDE) $p_h : \mathbb{R}^d \to \mathbb{R}$ is
$$p_h(x) = \mathbb{E}_P[\hat{p}_h(x)] = \frac{1}{h^d}\, \mathbb{E}_P\left[K\left(\frac{x - X}{h}\right)\right].$$
[Figure: the average KDE $p_h$ overlaid on the KDE $\hat{p}_h$.]
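Since $p_h = \mathbb{E}_P[\hat{p}_h]$, averaging the KDE over many independent datasets should recover $p_h$. The sketch below checks this for an assumed setting where $p_h$ has a closed form: with a Gaussian kernel and $P = N(0,1)$, $p_h$ is the $N(0, 1 + h^2)$ density.

```python
import numpy as np

rng = np.random.default_rng(1)
h, x0, n, reps = 0.5, 0.0, 200, 500

def kde(x, data, h):
    u = (x - data) / h
    return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)).sum() / (len(data) * h)

# Monte Carlo estimate of E_P[p_hat_h(x0)] over many datasets drawn from N(0, 1)
est = np.mean([kde(x0, rng.normal(size=n), h) for _ in range(reps)])

# Closed form: N(0,1) convolved with the Gaussian kernel N(0, h^2) is N(0, 1 + h^2)
exact = np.exp(-0.5 * x0**2 / (1 + h**2)) / np.sqrt(2 * np.pi * (1 + h**2))
print(est, exact)   # the two values should agree closely
```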
We get the uniform convergence rate of the Kernel Density Estimator.
◮ For a fixed subset $\mathcal{X} \subset \mathbb{R}^d$, we need uniform control of the Kernel Density Estimator over $\mathcal{X}$,
$$\sup_{x \in \mathcal{X}} \left| \hat{p}_h(x) - p_h(x) \right|,$$
for various purposes.
◮ We derive concentration inequalities for the Kernel Density Estimator in the supremum norm that hold uniformly over the selection of the bandwidth, i.e., for
$$\sup_{h \ge l_n,\ x \in \mathcal{X}} \left| \hat{p}_h(x) - p_h(x) \right|.$$
[Figure: uniform bound on the KDE $\hat{p}_h$ around the average KDE $p_h$.]
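The quantity being controlled can be approximated numerically: evaluate $|\hat{p}_h(x) - p_h(x)|$ on a grid of $x$ values and a grid of bandwidths $h \ge l_n$, and take the maximum. This is a rough sketch under assumed choices (Gaussian kernel, $P = N(0,1)$ so that $p_h$ is exact, finite grids standing in for the suprema).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=n)            # sample from P = N(0, 1)

def kde_grid(xs, data, h):
    """KDE with a Gaussian kernel, evaluated on a grid of points xs."""
    u = (xs[:, None] - data[None, :]) / h
    return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)).mean(axis=1) / h

xs = np.linspace(-3, 3, 121)      # grid standing in for sup over x in X
l_n = 0.2
sup_dev = 0.0
for h in np.linspace(l_n, 1.0, 9):   # grid of bandwidths h >= l_n
    # exact p_h for N(0,1) data and a Gaussian kernel: the N(0, 1+h^2) density
    p_h = np.exp(-0.5 * xs**2 / (1 + h**2)) / np.sqrt(2 * np.pi * (1 + h**2))
    sup_dev = max(sup_dev, np.abs(kde_grid(xs, X, h) - p_h).max())
print(sup_dev)    # empirical sup over h >= l_n and x of |p_hat_h - p_h|
```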
The volume dimension characterizes the intrinsic dimension of the distribution relevant to the convergence rate of the Kernel Density Estimator.
◮ For a probability distribution $P$ on $\mathbb{R}^d$, the volume dimension is
$$d_{\mathrm{vol}} := \sup\left\{ \nu \ge 0 : \limsup_{r \to 0}\ \sup_{x \in \mathcal{X}} \frac{P(B(x, r))}{r^{\nu}} < \infty \right\},$$
where $B(x, r) = \{ y \in \mathbb{R}^d : \| x - y \| < r \}$.
◮ In other words, the volume dimension is the largest exponent $\nu$ such that the probability of balls, $P(B(x, r))$, is dominated by $r^{\nu}$ as $r \to 0$.
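The exponent in the definition can be probed empirically: since $P(B(x, r)) \asymp r^{d_{\mathrm{vol}}}$ for small $r$, the slope of $\log P(B(x, r))$ against $\log r$ estimates $d_{\mathrm{vol}}$. The sketch below (an illustration, not part of the paper) uses the uniform distribution on a 2-D square, for which $d_{\mathrm{vol}} = 2$.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(20000, 2))   # uniform on a 2-D square: d_vol = 2

x0 = np.zeros(2)
radii = np.array([0.05, 0.1, 0.2, 0.4])
# empirical P(B(x0, r)): fraction of sample points within distance r of x0
probs = np.array([(np.linalg.norm(X - x0, axis=1) < r).mean() for r in radii])

# slope of log P(B(x0, r)) vs log r estimates the volume-decay exponent
slope = np.polyfit(np.log(radii), np.log(probs), 1)[0]
print(slope)   # should be close to 2 for this full-dimensional distribution
```

For a distribution supported on a lower-dimensional set (e.g. uniform on a curve in $\mathbb{R}^2$), the same slope would instead approach the intrinsic dimension of the support.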
The uniform convergence rate of the Kernel Density Estimator is derived in terms of the volume dimension.
Theorem (Corollary 13, Corollary 17)
Let $P$ be a probability distribution on $\mathbb{R}^d$ satisfying weak assumptions and $K$ be a kernel function satisfying weak assumptions. Suppose $l_n \to 0$ and $n l_n \to \infty$. Then with high probability,
$$\sup_{h \ge l_n,\ x \in \mathcal{X}} \left| \hat{p}_h(x) - p_h(x) \right| \lesssim \sqrt{\frac{1}{n\, l_n^{2d - d_{\mathrm{vol}}}}} \vee \sqrt{\frac{\log(1/l_n)}{n\, l_n^{2d - d_{\mathrm{vol}}}}},$$
for all large $n$.
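The $n$-dependence of the rate can be sanity-checked by simulation. For a full-dimensional distribution on $\mathbb{R}$ (so $d = 1$ and $d_{\mathrm{vol}} = 1$) with the bandwidth held fixed, the bound predicts the supremum deviation shrinks like $n^{-1/2}$, so multiplying $n$ by 16 should shrink the error by about 4. This is a rough sketch under assumed choices (Gaussian kernel, $P = N(0,1)$, finite grid for the sup).

```python
import numpy as np

rng = np.random.default_rng(4)
h = 0.3
xs = np.linspace(-2, 2, 81)
# exact p_h for N(0,1) data and a Gaussian kernel: the N(0, 1+h^2) density
p_h = np.exp(-0.5 * xs**2 / (1 + h**2)) / np.sqrt(2 * np.pi * (1 + h**2))

def sup_dev(n):
    """sup over the x-grid of |p_hat_h(x) - p_h(x)| for one sample of size n."""
    X = rng.normal(size=n)
    u = (xs[:, None] - X[None, :]) / h
    p_hat = (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)).mean(axis=1) / h
    return np.abs(p_hat - p_h).max()

e1 = np.mean([sup_dev(500) for _ in range(30)])
e2 = np.mean([sup_dev(8000) for _ in range(30)])
print(e1 / e2)   # roughly sqrt(8000 / 500) = 4 if the error scales as n^{-1/2}
```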
Poster: Pacific Ballroom #188
◮ Poster: Tuesday, Jun 11th, 18:30-21:00 @ Pacific Ballroom #188
◮ Thank you!