Asymptotics for Empirical Process and Bootstrap

Marquis Hou
University of California
j7hou@ucsd.edu

Marquis Hou (UCSD), Learning Proofs
Overview

1 Introduction
2 Empirical Process on R
    Glivenko-Cantelli Theorem
    Càdlàg space and Donsker Theorem
    Weak Convergence in $\ell^\infty(\mathbb{R})$
3 Empirical Process in General Sample Space
    P-Glivenko-Cantelli and P-Donsker
    Measurability and P-Donsker Class
4 Empirical Bootstrap
    Weak Convergence with Donsker Class
    Functional δ-Method
References

Aad van der Vaart, Asymptotic Statistics, Ch. 19 and Ch. 23. Cambridge University Press, 1998.
Aad van der Vaart, Jon Wellner, Weak Convergence and Empirical Processes. Springer, 1996.
Evarist Giné, Joel Zinn, Some Limit Theorems for Empirical Processes. The Annals of Probability, Vol. 12, No. 4, 1984.
Evarist Giné, Joel Zinn, Necessary Conditions for the Bootstrap of the Mean. The Annals of Statistics, Vol. 17, No. 2, 1989.
Evarist Giné, Joel Zinn, Bootstrapping General Empirical Measures. The Annals of Probability, Vol. 18, No. 2, 1990.
Empirical Measure and Bootstrap Measure

Empirical cumulative distribution function:
$$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} \chi_{[X_i,+\infty)}(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$$

Empirical measure:
$$P_n(\omega) = \frac{1}{n}\sum_{i=1}^{n} \delta_{X_i(\omega)}, \qquad \omega \in (\Omega^\infty, \mathcal{P}^\infty, P^\infty)$$

Bootstrap measure:
$$P_n^*(\omega,\sigma) = \frac{1}{n}\sum_{i=1}^{n} \delta_{X_i^*(\omega,\sigma)} = \frac{1}{n}\sum_{i=1}^{n} \delta_{X_{\sigma_i}(\omega)}$$
where $\sigma \sim \mathrm{Multinomial}(n)$ with uniform $p_i$, that is, $p_i = 1/n$.
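The definitions above can be made concrete with a minimal sketch using only the Python standard library. The function names (`ecdf`, `multinomial_counts`, `bootstrap_ecdf`) are hypothetical and chosen for illustration. The bootstrap measure is written here through multinomial counts $M_i$, so that $P_n^* = \frac{1}{n}\sum_i M_i \delta_{X_i}$, which is one common way to read the resampling step and is an assumption about the intended notation.

```python
import random
from bisect import bisect_right

def ecdf(sample):
    """Return F_n, where F_n(x) = (1/n) * #{i : X_i <= x}."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n

def multinomial_counts(n, rng):
    """Counts (M_1, ..., M_n) of n uniform index draws with replacement,
    i.e. one draw from Multinomial(n; 1/n, ..., 1/n)."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts

def bootstrap_ecdf(sample, counts):
    """Return F*_n, where F*_n(x) = (1/n) * sum_i M_i * 1{X_i <= x}."""
    n = len(sample)
    return lambda x: sum(m for xi, m in zip(sample, counts) if xi <= x) / n

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(10)]
counts = multinomial_counts(len(sample), rng)

F_n = ecdf(sample)
F_n_star = bootstrap_ecdf(sample, counts)
print(F_n(0.0), F_n_star(0.0))
```

Both functions are step functions that jump only at the observed sample points; the bootstrap version simply reweights each point by $M_i / n$ instead of $1/n$.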
Glivenko-Cantelli Theorem on R

Theorem (Glivenko-Cantelli)
$$\|F_n - F\|_\infty \xrightarrow{a.s.} 0.$$

Proof by partition: pick the bigger jumps of $F(x)$ as cut points.
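A small simulation can illustrate the uniform convergence in the theorem. The sketch below assumes $F$ is the Uniform(0,1) c.d.f. (a choice made here for convenience, not taken from the slides), and the helper name `sup_distance_uniform` is hypothetical.

```python
import random

def sup_distance_uniform(sample):
    """sup_x |F_n(x) - F(x)| when F(x) = x is the Uniform(0,1) c.d.f."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum is attained at a jump of F_n: at the i-th order statistic
    # (0-based), F_n equals i/n just before the jump and (i+1)/n just after.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

rng = random.Random(1)
for n in (100, 1000, 10000):
    d = sup_distance_uniform([rng.random() for _ in range(n)])
    print(n, round(d, 4))
```

The printed distances should shrink as $n$ grows, roughly at the rate $1/\sqrt{n}$.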
Càdlàg space and Donsker Theorem

Càdlàg space $D[-\infty, +\infty]$: right-continuous functions with left limits.

Skorokhod metric:
$$\sigma(f, g) = \inf_{\lambda \in \Lambda} \max\{\|\lambda - I\|,\ \|f - g \circ \lambda\|\}$$
where $\Lambda$ is the set of all strictly increasing continuous bijections of $[-\infty, +\infty]$.

Theorem (Donsker)
In the Skorokhod topology of the càdlàg space $D[-\infty, +\infty]$,
$$\sqrt{n}\,(F_n - F) \xrightarrow{L} B \circ F$$
where $B$ is a Brownian bridge.
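As a consistency check on the limit $B \circ F$, the covariances on both sides can be compared directly. This is a short worked derivation, not part of the original slides; it writes $G_n = \sqrt{n}(F_n - F)$, assumes the observations are i.i.d. with c.d.f. $F$, and uses the standard covariance $\operatorname{Cov}(B(s), B(t)) = s \wedge t - st$ of a Brownian bridge on $[0,1]$.

```latex
\begin{align*}
\operatorname{Cov}\bigl(G_n(x), G_n(y)\bigr)
  &= \operatorname{Cov}\bigl(I(X_1 \le x),\, I(X_1 \le y)\bigr)
   = F(x \wedge y) - F(x)F(y), \\
\operatorname{Cov}\bigl(B(F(x)), B(F(y))\bigr)
  &= F(x) \wedge F(y) - F(x)F(y)
   = F(x \wedge y) - F(x)F(y).
\end{align*}
```

The two covariance functions agree for every $n$, because $F$ is nondecreasing and so $F(x) \wedge F(y) = F(x \wedge y)$. The content of the theorem is therefore the convergence of the whole process, not just of its second moments.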
Weak Convergence in $\ell^\infty(\mathbb{R})$

Fact: $F_n$ and $G_n = \sqrt{n}\,(F_n - F)$ are not Borel measurable ($\mathcal{P}^n \to \mathcal{B}(\ell^\infty(\mathbb{R}))$).
$\ell^\infty(\mathbb{R})$ is neither compact nor separable.
Thus, Dudley and Hoffmann-Jørgensen developed the extended theory of weak convergence.

Definition (Outer expectation)
With respect to $P$,
$$E^* T = \inf\{E U : U \ge T,\ U \text{ an extended r.v. and } E U = \int U\,dP \text{ exists}\}$$

Definition (Weak Convergence)
$G_n \to G$ in $\ell^\infty[0,1]$ if, for all bounded continuous $h : \ell^\infty[0,1] \to \mathbb{R}$,
$$E^* h(G_n) \to E\,h(G)$$
Second Donsker Theorem

Theorem (Donsker)
If $F$ is continuous, then $G_n$ converges weakly in $\ell^\infty(\mathbb{R})$ to $B \circ F$, a tight process concentrating on a complete separable subspace of $\ell^\infty(\mathbb{R})$.
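One way to see this theorem at work is through the supremum functional: for continuous $F$, the statistic $\sqrt{n}\,\|F_n - F\|_\infty$ should be approximately distributed as $\sup_t |B(t)|$, whose c.d.f. is given by the classical Kolmogorov series. The sketch below is a Monte Carlo illustration under the assumption $F = $ Uniform(0,1); the function names and the sample sizes are hypothetical choices made for this example.

```python
import math
import random

def ks_statistic_uniform(sample):
    """sqrt(n) * sup_x |F_n(x) - x| for a sample assumed Uniform(0,1)."""
    xs = sorted(sample)
    n = len(xs)
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    return math.sqrt(n) * d

def kolmogorov_cdf(x, terms=100):
    """P(sup_t |B(t)| <= x) for a standard Brownian bridge B and x > 0,
    via the series 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2)."""
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                 for k in range(1, terms + 1))
    return 1.0 - 2.0 * series

rng = random.Random(2)
n, reps, x = 500, 400, 1.0
hits = sum(ks_statistic_uniform([rng.random() for _ in range(n)]) <= x
           for _ in range(reps))
# Empirical frequency of {sqrt(n) * D_n <= x} next to the limiting probability.
print(hits / reps, kolmogorov_cdf(x))
```

The two printed numbers should be close, with the gap reflecting both Monte Carlo noise and the finite sample size.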
Empirical Process in General Sample Space

No more c.d.f. $F_n(\cdot)$ and $F(\cdot)$; everything is in terms of the measures $P_n$ and $P$.

For a measurable function $f : \Omega \to \mathbb{R}$,
$$P_n f = \frac{1}{n}\sum_{i=1}^{n} f(X_i), \qquad Pf = \int f\,dP$$

There is no proper extension to càdlàg and Skorokhod; instead we work in $\ell^\infty(\mathcal{F})$, where $\mathcal{F}$ is a class of functions.
P-Glivenko-Cantelli and P-Donsker

Suppose $\mathcal{F}$ is a class of measurable functions.

Definition (P-Glivenko-Cantelli)
$$\|P_n f - Pf\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |P_n f - Pf| \xrightarrow{a.s.} 0.$$

Definition (P-Donsker)
$G_n = \sqrt{n}\,(P_n - P)$ converges in law in $\ell^\infty(\mathcal{F})$ to a tight limit process $G_P$, also known as a $P$-Brownian bridge.
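To illustrate the notation $P_n f$ and the norm $\|\cdot\|_{\mathcal{F}}$ beyond indicator functions, here is a minimal sketch. It assumes $P = N(0,1)$ and a hypothetical class $\{x \mapsto \cos(tx)\}$ with $t$ on a finite grid, for which $P f_t = e^{-t^2/2}$ is known in closed form. A finite grid is trivially Glivenko-Cantelli by the strong law of large numbers, so it only stands in for a richer class here.

```python
import math
import random

def empirical_mean(f, sample):
    """P_n f = (1/n) * sum_i f(X_i)."""
    return sum(f(x) for x in sample) / len(sample)

# Hypothetical class F = {x -> cos(t * x) : t in ts}. Under P = N(0, 1),
# P f_t = E cos(t X) = exp(-t^2 / 2).
ts = [0.25 * k for k in range(13)]

def sup_deviation(sample):
    """sup over the grid of |P_n f_t - P f_t|."""
    return max(
        abs(empirical_mean(lambda x, t=t: math.cos(t * x), sample)
            - math.exp(-t * t / 2.0))
        for t in ts
    )

rng = random.Random(3)
for n in (100, 1000, 10000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(sup_deviation(sample), 4))
```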
Measurability and P-Donsker Class

In Giné and Zinn (1984), there is a long list of criteria for a proper class $\mathcal{F}$.
Usually, we need additional measurability for uncountable $\mathcal{F}$. The conditions considered are: SM, DM, LSM, LDM, NLSM, NLDM.
Empirical Bootstrap

In Giné and Zinn (1990), a general convergence theorem for the empirical bootstrap is established.
We need to assume a measurability condition $\mathcal{F} \in M(P)$: NLDM(P) for $\mathcal{F}$, and NLSM(P) for $\mathcal{F}^2$ and $\mathcal{F}'^2$.

Theorem (Giné and Zinn 1990)
Let $\mathcal{F} \in M(P)$. Then the following are equivalent:
(a) The envelope $F$ of $\mathcal{F}$ is in $L_2(P)$ and $\mathcal{F}$ is P-Donsker with limit $G_P$.
(b) There exists a centered tight Gaussian process $G$ on $\mathcal{F}$ such that $\sqrt{n}\,(P_n^* - P_n) \to G$ weakly in $\ell^\infty(\mathcal{F})$.
If either one holds, then $G = G_P$.
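For a single function the theorem reduces to the bootstrap of the mean, which is easy to simulate. The sketch below assumes $f(x) = x$ and an Exp(1) sample (both choices are illustrative, not from the slides), and the helper name `bootstrap_root_draws` is hypothetical. Given the sample, the conditional variance of $\sqrt{n}\,(P_n^* f - P_n f)$ equals the sample variance $\frac{1}{n}\sum_i (X_i - \bar{X})^2$ exactly, so the Monte Carlo estimate should land near it.

```python
import math
import random

def bootstrap_root_draws(sample, reps, rng):
    """Draws of sqrt(n) * (P*_n f - P_n f) for f(x) = x, given the sample."""
    n = len(sample)
    mean = sum(sample) / n
    return [
        math.sqrt(n) * (sum(rng.choices(sample, k=n)) / n - mean)
        for _ in range(reps)
    ]

rng = random.Random(4)
sample = [rng.expovariate(1.0) for _ in range(200)]
draws = bootstrap_root_draws(sample, 2000, rng)

n = len(sample)
mean = sum(sample) / n
# Conditional variance of the bootstrap root, computed exactly from the data.
sample_var = sum((x - mean) ** 2 for x in sample) / n
# Monte Carlo estimate of the same quantity (the root has conditional mean 0).
boot_var = sum(d * d for d in draws) / len(draws)
print(round(sample_var, 3), round(boot_var, 3))
```

Since the sample variance itself converges to $\operatorname{Var}_P f$, the bootstrap root has asymptotically the same Gaussian limit as $\sqrt{n}\,(P_n f - P f)$, which is the one-function case of statement (b).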
Convergence via Bounded Lipschitz Metric

The equivalence of weak convergence in $\ell^\infty(\mathcal{F})$:
$$\mathcal{L}\{G_n\} \rightsquigarrow \mathcal{L}\{G\} \iff \sup_{h \in BL_1(\ell^\infty(\mathcal{F}))} |E^* h(G_n) - E\,h(G)| \to 0$$
where $BL_1$ is the space of functions whose Lipschitz norm is bounded by 1.

Theorem
For every P-Donsker class $\mathcal{F}$ with envelope function $F$, i.e. $|f(\omega)| \le F(\omega) < \infty$ for all $\omega \in \Omega$ and $f \in \mathcal{F}$,
$$\sup_{h \in BL_1(\ell^\infty(\mathcal{F}))} |E_M h(G_n^*) - E\,h(G_P)| \xrightarrow{P} 0$$
where $G_n^* = \sqrt{n}\,(P_n^* - P_n)$ and $E_M$ is the expectation over the multinomial resampling, conditionally on the sample.
Moreover, $G_n^*$ is asymptotically measurable. If $P^* F^2 < \infty$, then the convergence is outer almost surely as well.
Functional δ-Method

Theorem (Delta method for Bootstrap)
Let $D$ be a normed space and let $\varphi : D_\varphi \subset D \to \mathbb{R}^k$ be Hadamard differentiable at $\theta$ tangentially to a subspace $D_0$. Let $\hat\theta_n$ and $\hat\theta_n^*$ be maps with values in $D_\varphi$ such that
$$\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{L} T, \quad T \text{ tight in } D_0,$$
$$\sup_{h \in BL_1(D)} |E_M h(\sqrt{n}\,(\hat\theta_n^* - \hat\theta_n)) - E\,h(T)| \xrightarrow{P} 0.$$
Then
$$\sup_{h \in BL_1(\mathbb{R}^k)} |E_M h(\sqrt{n}\,(\varphi(\hat\theta_n^*) - \varphi(\hat\theta_n))) - E\,h(\varphi'_\theta(T))| \xrightarrow{P} 0.$$
An Application

Corollary (Empirical distribution function)
The class $\mathcal{F} = \{f_t : f_t = 1_{(-\infty, t]}\}$ is Donsker, so the empirical distribution function $F_n$ satisfies the conditions of the preceding theorem. Thus, conditionally on the sample, $\sqrt{n}\,(\varphi(F_n^*) - \varphi(F_n))$ converges in distribution to the same limit as $\sqrt{n}\,(\varphi(F_n) - \varphi(F))$, for every Hadamard-differentiable function $\varphi$, e.g. quantiles and trimmed means.
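The corollary is what justifies bootstrap confidence intervals for quantiles. As a minimal sketch, take $\varphi$ to be the median and use the bootstrap distribution of $\varphi(F_n^*) - \varphi(F_n)$ as a stand-in for that of $\varphi(F_n) - \varphi(F)$. The N(0,1) sample, the sample size, the number of replicates and the helper name `bootstrap_median_roots` are all assumptions made for this illustration.

```python
import random
import statistics

def bootstrap_median_roots(sample, reps, rng):
    """Sorted draws of median(F*_n) - median(F_n), given the sample."""
    n = len(sample)
    med = statistics.median(sample)
    return sorted(
        statistics.median(rng.choices(sample, k=n)) - med
        for _ in range(reps)
    )

rng = random.Random(5)
sample = [rng.gauss(0.0, 1.0) for _ in range(401)]
med = statistics.median(sample)
roots = bootstrap_median_roots(sample, 1000, rng)

# Approximate 2.5% and 97.5% quantiles of the bootstrap root.
q_lo, q_hi = roots[24], roots[974]
# Interval for the true median obtained by inverting the root.
ci = (med - q_hi, med - q_lo)
print(round(med, 3), tuple(round(c, 3) for c in ci))
```

The interval inverts the root rather than reading off the bootstrap quantiles of the median directly, which matches the form of the statement in the corollary. Either construction relies on the same conditional convergence.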
The End