Model selection for fast density estimation

László (Laci) Györfi
Department of Computer Science and Information Theory
Budapest University of Technology and Economics
Budapest, Hungary

July 17, 2008

e-mail: gyorfi@szit.bme.hu
www.szit.bme.hu/~gyorfi
Density estimation

$\mathbb{R}^d$-valued i.i.d. random vectors $X_1, \dots, X_n$ distributed according to unknown probability measure $\mu$ with density $f$.

The $L_1$ norm:

\[
\|f - g\| := \int_{\mathbb{R}^d} |f(x) - g(x)|\,dx = 2 \sup_{A} \left| \int_A f(x)\,dx - \int_A g(x)\,dx \right|
\]
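The identity between the $L_1$ distance and twice the largest discrepancy over sets can be checked numerically in a discrete analogue. The sketch below is only an illustration: it replaces densities by probability vectors on a finite set (an assumption, not part of the slides), and the function names are hypothetical.

```python
from itertools import combinations

def l1_distance(p, q):
    """Sum of |p_i - q_i|, the discrete analogue of the L1 norm of f - g."""
    return sum(abs(a - b) for a, b in zip(p, q))

def max_set_discrepancy(p, q):
    """Brute-force maximum over all subsets A of |P(A) - Q(A)| (exponential cost)."""
    indices = range(len(p))
    best = 0.0
    for size in range(len(p) + 1):
        for subset in combinations(indices, size):
            gap = abs(sum(p[i] - q[i] for i in subset))
            best = max(best, gap)
    return best

# Two probability vectors on four points (dyadic values keep the arithmetic exact).
p = [0.5, 0.25, 0.125, 0.125]
q = [0.25, 0.25, 0.25, 0.25]

# Both printed numbers should coincide.
print(l1_distance(p, q), 2 * max_set_discrepancy(p, q))
```

In this discrete setting the maximum is attained at the set of points where $p_i > q_i$, which mirrors the role of the set $\{x : f(x) > g(x)\}$ for densities.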
Kernel density estimate

For a kernel function $K$ and bandwidth $h > 0$, let $f_n$ be the kernel density estimate with sample size $n$:

\[
f_n(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left( \frac{x - X_i}{h} \right).
\]
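A minimal sketch of the kernel density estimate, assuming $d = 1$ (so that $h^d = h$) and a standard Gaussian kernel. Both choices are assumptions made for illustration; the slides leave $K$ and $d$ general, and the function names are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density_estimate(x, sample, h, kernel=gaussian_kernel):
    """f_n(x) = (1 / (n h)) * sum_i K((x - X_i) / h) in dimension d = 1."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)

# Usage: evaluate the estimate on a small, made-up sample.
sample = [0.0, 1.0, 2.0]
print(kernel_density_estimate(1.0, sample, h=0.5))
```

Since the kernel integrates to one, the estimate is itself a density for every sample and every $h > 0$.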
Density-free consistency

If

\[
\lim_{n \to \infty} h_n = 0 \quad \text{and} \quad \lim_{n \to \infty} n h_n^d = \infty
\]

then, for any density $f$,

\[
\lim_{n \to \infty} \mathbf{E} \|f - f_n\| = 0
\]

and

\[
\lim_{n \to \infty} \|f - f_n\| = 0 \quad \text{a.s.}
\]
Rate of convergence

If the density $f$ has a compact support and is twice differentiable, then

\[
\mathbf{E}\left( \|f_n - f\| \right) \le \frac{c_1}{\sqrt{n h_n^d}} + c_2 h_n^2 .
\]

If $h_n = c\, n^{-1/(d+4)}$ then

\[
\mathbf{E}\left( \|f_n - f\| \right) \le C n^{-2/(d+4)} .
\]

TOO SLOW.
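To see how slow the rate $n^{-2/(d+4)}$ is, the following sketch compares the sample size needed to push the bound down to a target value against the parametric rate $n^{-1/2}$. All constants are set to 1 and the target 0.1 is arbitrary; both are assumptions for illustration, and the function names are hypothetical.

```python
def kernel_rate_exponent(d):
    """Exponent a in the bound C * n**(-a) for the kernel estimate: a = 2 / (d + 4)."""
    return 2.0 / (d + 4)

def sample_size_for_rate(target, exponent):
    """n solving n**(-exponent) = target, with all constants set to 1 (an assumption)."""
    return target ** (-1.0 / exponent)

n_parametric = sample_size_for_rate(0.1, 0.5)
for d in (1, 2, 5, 10):
    a = kernel_rate_exponent(d)
    n_kernel = sample_size_for_rate(0.1, a)
    print(f"d={d}: exponent={a:.3f}, n_kernel={n_kernel:.3g}, n_parametric={n_parametric:.3g}")
```

The exponent $2/(d+4)$ is below $1/2$ for every $d \ge 1$ and shrinks as $d$ grows, so the required sample size increases rapidly with the dimension.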
Model selection for density estimation

We wish to estimate a density $f$ on $\mathbb{R}^d$ that belongs to a parametric family, $\mathcal{F}_k$, where $k$ is unknown, but $\mathcal{F}_k \subset \mathcal{F}_{k+1}$ for all $k$.

\[
\mathcal{F} = \bigcup_{k \ge 1} \mathcal{F}_k .
\]

The complexity associated with $f$ is defined as

\[
k^* = \min \{ k \ge 1 : f \in \mathcal{F}_k \} .
\]
Example

$\mathcal{F}_k$ is the set of mixtures of $d$-dimensional normal densities, where the number of components is at most $k$.
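As a small illustration of this example, the sketch below evaluates a normal mixture density in dimension $d = 1$ (an assumption made to keep the code short) and shows the nesting of the classes: a mixture with two components is also a mixture with at most three components, obtained here by adding a component of weight zero. The function names and all parameter values are hypothetical.

```python
import math

def normal_density(x, mean, sigma):
    """Density of the one-dimensional normal distribution N(mean, sigma^2)."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_density(x, weights, means, sigmas):
    """Density of a normal mixture; weights must be nonnegative and sum to 1."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(w * normal_density(x, m, s)
               for w, m, s in zip(weights, means, sigmas))

# A two-component mixture, and the same density written with three components.
two = mixture_density(0.3, [0.5, 0.5], [0.0, 2.0], [1.0, 0.5])
three = mixture_density(0.3, [0.5, 0.5, 0.0], [0.0, 2.0, 5.0], [1.0, 0.5, 1.0])
print(two, three)
```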
Objective

We wish to introduce an estimate $k_n$ of the complexity $k^*$ and to pick a density estimate $\hat{f}_{k_n}$ in $\mathcal{F}$ with

1. $k_n \to k^*$ almost surely (i.e., $k_n = k^*$ almost surely, for all $n$ large enough)
2. and
\[
\mathbf{E} \left\{ \| \hat{f}_{k_n} - f \| \right\} = O\!\left( \frac{1}{\sqrt{n}} \right).
\]

Biau, Devroye (2004): $k_n$ and $\hat{f}_{k_n}$ via projection of the empirical measure with respect to the Yatracos class.

Too complex.
Testing homogeneity

Two mutually independent samples $X_1, \dots, X_n$ and $X'_1, \dots, X'_n$ distributed according to unknown probability distributions $\mu$ and $\mu'$ on $\mathbb{R}^d$.

We are interested in testing the null hypothesis that the two samples are homogeneous, that is

\[
H_0 : \mu = \mu' .
\]

Empirical probability distributions: $\mu_n$ and $\mu'_n$.
The test statistic

Based on a partition $\mathcal{P}_n = \{ A_{n1}, \dots, A_{n m_n} \}$ of $\mathbb{R}^d$, we let the test statistic be defined as

\[
T_n = \sum_{j=1}^{m_n} \left| \mu_n(A_{nj}) - \mu'_n(A_{nj}) \right| .
\]
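The statistic $T_n$ can be sketched as follows, assuming $d = 1$ and a partition of the real line into $m$ cells: equal-width cells on an interval $[lo, hi)$, with the first and last cells extended to cover everything outside it. The partition, the interval, and the function names are assumptions chosen for illustration; the slides allow any partition.

```python
from collections import Counter

def cell_index(x, lo, hi, m):
    """Index of the cell containing x; the outer cells extend to -inf and +inf."""
    if x < lo:
        return 0
    if x >= hi:
        return m - 1
    return min(int((x - lo) / (hi - lo) * m), m - 1)

def l1_test_statistic(xs, ys, lo, hi, m):
    """T_n = sum_j |mu_n(A_j) - mu'_n(A_j)| for two samples of equal size n."""
    n = len(xs)
    if n == 0 or len(ys) != n:
        raise ValueError("both samples must be nonempty and of the same size")
    counts_x = Counter(cell_index(x, lo, hi, m) for x in xs)
    counts_y = Counter(cell_index(y, lo, hi, m) for y in ys)
    # Empirical measures are cell counts divided by n.
    return sum(abs(counts_x[j] - counts_y[j]) for j in range(m)) / n

# Usage on two tiny, made-up samples that fall into different cells.
print(l1_test_statistic([0.1, 0.2], [0.8, 0.9], lo=0.0, hi=1.0, m=2))
```

The statistic always lies between 0 (identical cell frequencies) and 2 (the two samples occupy disjoint cells).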
Asymptotic behavior of $T_n$

Theorem. Under $H_0$, for all $0 < \varepsilon < 2$,

\[
\mathbf{P} \{ T_n > \varepsilon \} = e^{-n ( g_T(\varepsilon) + o(1) )}, \quad \text{as } n \to \infty,
\]

where

\[
g_T(\varepsilon) = (1 + \varepsilon/2) \ln (1 + \varepsilon/2) + (1 - \varepsilon/2) \ln (1 - \varepsilon/2) \approx \varepsilon^2 / 4 .
\]

(Biau, Györfi (2005))
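The quality of the approximation $g_T(\varepsilon) \approx \varepsilon^2/4$ can be inspected numerically. The sketch below evaluates the rate function from the theorem next to the quadratic approximation; the function name `g_T` and the chosen values of $\varepsilon$ are illustrative assumptions.

```python
import math

def g_T(eps):
    """Rate function (1 + e/2) ln(1 + e/2) + (1 - e/2) ln(1 - e/2) for 0 < e < 2."""
    if not 0.0 < eps < 2.0:
        raise ValueError("eps must lie strictly between 0 and 2")
    u = eps / 2.0
    # log1p keeps the evaluation accurate when eps is small.
    return (1.0 + u) * math.log1p(u) + (1.0 - u) * math.log1p(-u)

for eps in (0.05, 0.1, 0.5, 1.0, 1.5):
    print(f"eps={eps}: g_T={g_T(eps):.6f}, eps^2/4={eps * eps / 4:.6f}")
```

A Taylor expansion gives $g_T(\varepsilon) = \varepsilon^2/4 + \varepsilon^4/96 + \dots$, so the quadratic term is a lower bound that is tight for small $\varepsilon$.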
A strongly consistent test

Corollary. Consider the test which rejects $H_0$ when

\[
T_n > 2 \sqrt{\ln 2} \, \sqrt{\frac{m_n}{n}} .
\]

Assume that

\[
\lim_{n \to \infty} \frac{m_n}{n} = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{m_n}{\ln n} = \infty .
\]

Then, under $H_0$, after a random sample size the test makes a.s. no error.
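A minimal sketch of the decision rule, reading the rejection condition as $T_n > 2\sqrt{\ln 2}\,\sqrt{m_n/n}$. The partition size $m_n = \lfloor \sqrt{n} \rfloor$ used below is only one illustrative choice that satisfies both conditions of the corollary; it is an assumption, not a recommendation from the slides, and the function names are hypothetical.

```python
import math

def rejection_threshold(n, m_n):
    """Threshold 2 * sqrt(ln 2) * sqrt(m_n / n) for the statistic T_n."""
    return 2.0 * math.sqrt(math.log(2.0)) * math.sqrt(m_n / n)

def reject_homogeneity(t_n, n, m_n):
    """True when the test rejects H_0, i.e. when T_n exceeds the threshold."""
    return t_n > rejection_threshold(n, m_n)

# With m_n = floor(sqrt(n)), m_n / n shrinks while m_n / ln(n) grows.
for n in (100, 10_000, 1_000_000):
    m_n = math.isqrt(n)
    print(n, m_n, m_n / n, m_n / math.log(n), rejection_threshold(n, m_n))
```

The threshold tends to zero as $n$ grows because $m_n / n \to 0$, so ever smaller discrepancies between the two empirical measures lead to rejection.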