Mathematical Foundations of Infinite-Dimensional Statistical Models


  1. Mathematical Foundations of Infinite-Dimensional Statistical Models: 3.3 The Entropy Method and Talagrand's Inequality (3.3.4, 3.3.5)
     이종진, Seoul National University, ga0408@snu.ac.kr
     Nov 15, 2018

  2. Table of Contents
     3.3 The Entropy Method and Talagrand's Inequality
       3.3.2 & 3.3.3
       3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables
       3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗

  3. $\operatorname{Ent}_\mu f := E_\mu(f \log f) - E_\mu f \cdot \log E_\mu f$
     ◮ Exponential inequalities $E e^{\lambda(Z - EZ)} \le \dots$
       1. Subadditive random variables
       2. Functions satisfying the bounded differences condition
       3. Self-bounding random variables
     ◮ Talagrand's inequality
       1. The upper tail in Talagrand's inequality, Bousquet's version ($v_n$)
       2. The lower tail in Talagrand's inequality, Klein's version ($v_n$)
       3. The lower tail in Talagrand's inequality, Klein-Rio version ($V_n$)
       4. The upper tail in Talagrand's inequality for nonidentically distributed random variables ($V_n$)
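
The entropy functional above can be evaluated directly on a finite probability space. The following is a minimal sketch, assuming a discrete measure given by a list of probabilities; the function name `entropy_functional` and the sample values are illustrative and do not come from the slides.

```python
import math

def entropy_functional(values, probs):
    """Ent_mu(f) = E[f log f] - E[f] * log E[f] for a nonnegative f on a finite space.

    Uses the convention 0 * log 0 = 0.
    """
    mean = sum(p * v for v, p in zip(values, probs))
    e_f_log_f = sum(p * v * math.log(v) for v, p in zip(values, probs) if v > 0)
    mean_term = mean * math.log(mean) if mean > 0 else 0.0
    return e_f_log_f - mean_term

probs = [0.5, 0.25, 0.25]
ent_const = entropy_functional([2.0, 2.0, 2.0], probs)  # constant f: entropy vanishes
ent_var = entropy_functional([1.0, 2.0, 4.0], probs)    # non-constant f: entropy is positive
```

By Jensen's inequality applied to the convex function $x \log x$, the functional is nonnegative and vanishes for constant $f$, which is what the two sample calls illustrate.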

  4. 3.3.2 & 3.3.3

  5. Theorem 3.3.7
     Let $Z = Z(X_1, \dots, X_n)$, $X_i$ independent, be a subadditive random variable relative to $Z_k = Z_k(X_1, \dots, X_{k-1}, X_{k+1}, \dots, X_n)$, $k = 1, \dots, n$, such that $EZ \ge 0$ and for which there exist random variables $Y_k \le Z - Z_k \le 1$ such that $E_k Y_k \ge 0$. Let $\sigma^2 < \infty$ be any real number satisfying
     $$\frac{1}{n} \sum_{k=1}^{n} E_k Y_k^2 \le \sigma^2,$$
     and set $v := 2EZ + n\sigma^2$. Then
     $$\log E e^{\lambda(Z - EZ)} \le v(e^{\lambda} - \lambda - 1) = v\varphi(-\lambda), \quad \lambda \ge 0.$$

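  6. ◮ A Taylor expansion gives $\operatorname{Var} Z \le 2EZ + n\sigma^2$.
     ◮ Applying Proposition 3.1.6 to the bound of Theorem 3.3.7 yields upper bounds for the tail probabilities of $Z - EZ$.
     Corollary 3.3.8
     Let $Z$ be as in Theorem 3.3.7. Then, for all $t \ge 0$,
     $$P(Z \ge EZ + t) \le \exp\left(-v h_1(t/v)\right) \le \exp\left(-\frac{3t}{4} \log\left(1 + \frac{2t}{3v}\right)\right) \le \exp\left(-\frac{t^2}{2v + 2t/3}\right)$$
     and
     $$P\left(Z \ge EZ + \sqrt{2vx} + x/3\right) \le e^{-x}, \quad x \ge 0.$$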
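
The chain of bounds in Corollary 3.3.8 can be checked numerically. The sketch below assumes the usual definition $h_1(u) = (1+u)\log(1+u) - u$ (not spelled out on the slide); the function names and the sample values of $t$, $v$ and $x$ are illustrative only.

```python
import math

def h1(u):
    """h_1(u) = (1 + u) log(1 + u) - u for u >= 0 (assumed definition)."""
    return (1.0 + u) * math.log1p(u) - u

def upper_tail_bounds(t, v):
    """The three successive bounds on P(Z >= EZ + t), from sharpest to weakest."""
    bennett = math.exp(-v * h1(t / v))
    log_form = math.exp(-0.75 * t * math.log1p(2.0 * t / (3.0 * v)))
    bernstein = math.exp(-t * t / (2.0 * v + 2.0 * t / 3.0))
    return bennett, log_form, bernstein

def deviation(x, v):
    """Deviation level sqrt(2 v x) + x / 3 that is exceeded with probability at most e^{-x}."""
    return math.sqrt(2.0 * v * x) + x / 3.0

# Example: v = 5 and t = 3 give three increasing upper bounds.
bennett, log_form, bernstein = upper_tail_bounds(3.0, 5.0)
```

The first (Bennett-type) bound is the sharpest; the last (Bernstein-type) bound is the easiest to invert, which is why both forms are usually quoted.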

  7. Theorem 3.3.9 (Upper tail of Talagrand's inequality, Bousquet's version)
     Let $(S, \mathcal{S})$ be a measurable space, and let $n \in \mathbb{N}$. Let $X_1, \dots, X_n$ be independent $S$-valued random variables. Let $\mathcal{F}$ be a countable set of measurable real-valued functions on $S$ such that $\|f\|_\infty \le U < \infty$ and $Ef(X_1) = \dots = Ef(X_n) = 0$ for all $f \in \mathcal{F}$. Let
     $$S_j = \sup_{f \in \mathcal{F}} \sum_{k=1}^{j} f(X_k) \quad \text{or} \quad S_j = \sup_{f \in \mathcal{F}} \left| \sum_{k=1}^{j} f(X_k) \right|, \quad j = 1, \dots, n,$$
     and let the parameters $\sigma^2$ and $v_n$ be defined by
     $$U^2 \ge \sigma^2 \ge \frac{1}{n} \sum_{k=1}^{n} \sup_{f \in \mathcal{F}} Ef^2(X_k) \quad \text{and} \quad v_n = 2U\,ES_n + n\sigma^2.$$

  8. Theorem 3.3.9 (Upper tail of Talagrand's inequality, Bousquet's version), continued
     Then
     $$\log E e^{\lambda(S_n - ES_n)} \le v_n(e^{\lambda} - 1 - \lambda), \quad \lambda \ge 0.$$
     As a consequence,
     $$P(S_n \ge ES_n + x) \le P\left(\max_{1 \le j \le n} S_j \ge ES_n + x\right) \le e^{-(v_n/U^2)\, h_1(xU/v_n)} \le \exp\left[-\frac{3x}{4U} \log\left(1 + \frac{2xU}{3v_n}\right)\right] \le \exp\left[-\frac{x^2}{2v_n + 2xU/3}\right]$$
     and
     $$P\left(S_n \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\right) \le P\left(\max_{1 \le j \le n} S_j \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\right) \le e^{-x},$$
     for all $x \ge 0$.

  9. Theorem 3.3.10 (Lower tail of Talagrand's inequality: Klein's version)
     Under the same hypotheses and notation as in Theorem 3.3.9, we have
     $$E e^{-t(S_n - ES_n)} \le \exp\left(v_n \frac{e^{4t} - 1 - 4t}{16}\right) = e^{v_n \varphi(-4t)/16}, \quad \text{for } 0 \le t < 1.$$
     As a consequence, for all $x \ge 0$,
     $$P(S_n \le ES_n - x) \le \exp\left(-\frac{v_n}{16U^2}\, h_1\left(\frac{4xU}{v_n}\right)\right) \le \exp\left(-\frac{3x}{16U} \log\left(1 + \frac{8xU}{3v_n}\right)\right) \le \exp\left(-\frac{x^2}{2v_n + 8xU/3}\right)$$
     and
     $$P\left(S_n \le ES_n - \sqrt{2v_n x} - \frac{4Ux}{3}\right) \le e^{-x}.$$

  10. Remark 3.3.11 (Klein-Rio version, Klein and Rio (2005))
     Setting
     $$V_n = 2U\,ES_n + \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} Ef^2(X_k),$$
     one has
     $$E e^{-t(S_n - ES_n)} \le \exp\left(V_n \frac{e^{3t} - 1 - 3t}{9}\right) = e^{V_n \varphi(-3t)/9}, \quad \text{for } 0 \le t < 1,$$
     and, as a consequence, for all $x \ge 0$,
     $$P(S_n \le ES_n - x) \le \exp\left(-\frac{V_n}{9U^2}\, h_1\left(\frac{3xU}{V_n}\right)\right) \le \exp\left(-\frac{x}{4U} \log\left(1 + \frac{2xU}{V_n}\right)\right) \le \exp\left(-\frac{x^2}{2V_n + 2xU}\right)$$
     and
     $$P\left(S_n \le ES_n - \sqrt{2V_n x} - Ux\right) \le e^{-x}.$$

  11. 3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables

  12. Bounded Differences
     Definition 3.3.12
     Let $(S_i, \mathcal{S}_i)$, $i = 1, \dots, n$, be measurable spaces, and let $f : \prod_{i=1}^{n} S_i \to \mathbb{R}$ be a measurable function. $f$ has bounded differences if, for each $i \le n$,
     $$\sup_{x_i,\, x_i' \in S_i} \left| f(x_1, \dots, x_n) - f(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n) \right| \le c_i,$$
     where, for each $i$, $c_i$ is a measurable function of $x_j$, $j \ne i$, and there exists a finite constant $c$ such that $\sum_{i=1}^{n} c_i^2 \le c^2$ for all $(x_1, \dots, x_n) \in \prod_{i=1}^{n} S_i$.
     If $Z = f(X_1, \dots, X_n)$, where the $X_i$ are independent $S_i$-valued random variables, we say that the random variable $Z$ has bounded differences.
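
For a small finite alphabet the bounded differences condition can be verified by brute force. The sketch below assumes the special case where each $c_i$ is a constant (the supremum over all other coordinates), and uses the number of distinct values in a tuple as an illustrative $f$; neither the function names nor the example come from the slides.

```python
from itertools import product

def max_coordinate_change(f, alphabet, n, i):
    """Largest change in f when only coordinate i is replaced (exhaustive search)."""
    worst = 0
    for x in product(alphabet, repeat=n):
        for new_value in alphabet:
            y = x[:i] + (new_value,) + x[i + 1:]
            worst = max(worst, abs(f(x) - f(y)))
    return worst

def count_distinct(x):
    """Illustrative f: number of distinct values among the coordinates."""
    return len(set(x))

alphabet = (0, 1, 2)
n = 4
c = [max_coordinate_change(count_distinct, alphabet, n, i) for i in range(n)]
c_squared = sum(ci ** 2 for ci in c)  # the constant c^2 bounding sum of c_i^2
```

Changing one coordinate can add or remove at most one distinct value, so every $c_i$ equals 1 here and $c^2 = n$.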

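  13. Theorem 3.3.14
     If $Z$ has bounded differences and $\sum_{i=1}^{n} c_i^2 \le c^2$, then, for all $\lambda \ge 0$,
     $$E e^{\lambda(Z - EZ)} \le e^{\lambda^2 c^2/8} \quad (3.115)$$
     so that, for all $t \ge 0$,
     $$\Pr\{Z \ge EZ + t\} \le e^{-2t^2/c^2}, \quad \Pr\{Z \le EZ - t\} \le e^{-2t^2/c^2}. \quad (3.116)$$
     Moreover,
     $$\operatorname{Var}(Z) \le \frac{c^2}{4}. \quad (3.117)$$
     Proof. Use the identity
     $$\operatorname{Ent}_\mu\left(e^{\lambda(Y - EY)}\right) = E e^{\lambda Y}\left(\lambda L_Y'(\lambda) - L_Y(\lambda)\right) = E e^{\lambda Y} \int_0^{\lambda} t L_Y''(t)\, dt, \quad L_Y = \log F_Y,$$
     together with tensorisation of entropy (Proposition 2.5.3).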
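
As a sanity check of (3.116), one can compare the bound with a Monte Carlo estimate. This is a rough sketch under stated assumptions: $Z$ is taken to be the mean of $n$ independent Uniform[0,1] draws, so that $c_i = 1/n$ and $c^2 = 1/n$; the function names, the seed and the sample sizes are arbitrary illustrative choices.

```python
import math
import random

def bounded_differences_tail_bound(t, c_squared):
    """Right-hand side of (3.116): exp(-2 t^2 / c^2)."""
    return math.exp(-2.0 * t * t / c_squared)

def empirical_upper_tail(n, t, trials, seed=0):
    """Monte Carlo estimate of P(Z >= EZ + t) for Z = mean of n Uniform[0,1] draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        z = sum(rng.random() for _ in range(n)) / n
        if z >= 0.5 + t:  # EZ = 1/2 for the uniform mean
            hits += 1
    return hits / trials

n, t = 50, 0.1
c_squared = n * (1.0 / n) ** 2  # each coordinate moves the mean by at most 1/n
bound = bounded_differences_tail_bound(t, c_squared)
estimate = empirical_upper_tail(n, t, trials=20000)
```

The estimate should fall well below the bound, since (3.116) only uses the range of each coordinate and ignores its variance.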

  14. Previous seminar
     Definition
     If $Z$ and $Z_k$ satisfy $0 \le Z - Z_k \le 1$ $(1 \le k \le n)$ and $\sum_{k} (Z - Z_k) \le Z$, then $Z$ is called self-bounding.
     ◮ If $Z$ is self-bounding, then, with $L(\lambda) := \log F(\lambda)$, the inequality of Proposition 3.3.1 can be put in the much simpler form
     $$(\lambda - \varphi(\lambda)) L'(\lambda) - L(\lambda) \le \varphi(\lambda)\, EZ. \quad (3.79)$$
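
The self-bounding conditions can be checked on simulated data for a supremum of sums of $[0,1]$-valued functions. The sketch below assumes $Z_k$ is the same supremum with the $k$-th observation left out, and uses indicator functions of a few intervals as a hypothetical function class; all names and parameters are illustrative.

```python
import random

def sup_sum(points, function_class, skip=None):
    """sup over f of sum_i f(X_i), optionally leaving out the index `skip`."""
    return max(
        sum(f(x) for i, x in enumerate(points) if i != skip)
        for f in function_class
    )

def is_self_bounding_sample(points, function_class):
    """Check 0 <= Z - Z_k <= 1 for all k and sum_k (Z - Z_k) <= Z on one sample."""
    z = sup_sum(points, function_class)
    z_k = [sup_sum(points, function_class, skip=k) for k in range(len(points))]
    increments_ok = all(0 <= z - zk <= 1 for zk in z_k)
    sum_ok = sum(z - zk for zk in z_k) <= z
    return increments_ok and sum_ok

# Hypothetical class: indicators of the intervals [a, a + 0.3).
function_class = [
    (lambda x, a=a: 1 if a <= x < a + 0.3 else 0) for a in (0.0, 0.2, 0.4, 0.6)
]
rng = random.Random(1)
checks = [
    is_self_bounding_sample([rng.random() for _ in range(12)], function_class)
    for _ in range(200)
]
```

Indicator functions keep all sums integer-valued, so the comparisons are exact and not affected by floating-point rounding.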

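  15. Theorem 3.3.15
     Let $Z$ be a self-bounding random variable. Then
     $$\log E\left(e^{\lambda(Z - EZ)}\right) \le \varphi(-\lambda)\, EZ, \quad \lambda \in \mathbb{R}. \quad (3.123)$$
     This applies in particular to $Z = \sup_{f \in \mathcal{F}} \sum_{i=1}^{n} f(X_i)$, where $\mathcal{F}$ is countable and $0 \le f(x) \le 1$ for all $x \in S$ and $f \in \mathcal{F}$.
     Here $\varphi(\lambda) = e^{-\lambda} + \lambda - 1$.
     Proof. Since $\varphi(\lambda) + \varphi(-\lambda) = \varphi'(\lambda) \cdot \frac{d}{d\lambda}\varphi(-\lambda)$, the function $\psi_0(\lambda) := v\varphi(-\lambda)$ with $v = EZ$ solves (3.79) with equality.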
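
The claim that $\psi_0(\lambda) = v\varphi(-\lambda)$ solves (3.79) with equality when $v = EZ$ can be confirmed numerically. This is a small sketch; the function names and the chosen values of $\lambda$ and $v$ are illustrative.

```python
import math

def phi(lam):
    """phi(lambda) = e^{-lambda} + lambda - 1."""
    return math.exp(-lam) + lam - 1.0

def psi0(lam, v):
    """Candidate solution psi_0(lambda) = v * phi(-lambda)."""
    return v * phi(-lam)

def psi0_prime(lam, v):
    """Derivative of psi_0 in lambda: v * (e^{lambda} - 1)."""
    return v * (math.exp(lam) - 1.0)

def residual_379(lam, v):
    """Left side minus right side of (3.79) with L replaced by psi_0 and EZ = v."""
    lhs = (lam - phi(lam)) * psi0_prime(lam, v) - psi0(lam, v)
    return lhs - phi(lam) * v

residuals = [residual_379(lam, v=2.5) for lam in (-2.0, -0.5, 0.3, 1.0, 3.0)]
```

All residuals vanish up to rounding, for negative as well as positive $\lambda$, which matches the range $\lambda \in \mathbb{R}$ in (3.123).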

  16. As a consequence of Theorem 3.3.15 and Proposition 3.1.6,
     $$\Pr\{Z \ge EZ + t\} \le \exp\left(-(EZ)\, h_1(t/EZ)\right) \quad (3.124)$$
     $$\Pr\{Z \le EZ - t\} \le \exp\left(-(EZ)\, h_1(-t/EZ)\right)$$
     $$\Pr\{Z \ge EZ + t\} \le \exp\left(-\frac{3t}{4} \log\left(1 + \frac{2t}{3EZ}\right)\right) \le \exp\left(-\frac{t^2}{2EZ + 2t/3}\right)$$
     $$\Pr\{Z \le EZ - t\} \le \exp\left(-\frac{t^2}{2EZ}\right)$$
     and $\operatorname{Var}(Z) \le EZ$.

  17. 3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗

  18. Theorem 3.3.16
     Let $X_i$, $i \in \mathbb{N}$, be independent $S$-valued random variables, and let $\mathcal{F}$ be a countable class of functions $f = (f_1, \dots, f_n) : S \to [-1, 1]^n$ such that $Ef_k(X_k) = 0$ for all $f \in \mathcal{F}$ and $k = 1, \dots, n$. Set
     $$T_n(f) = \sum_{k=1}^{n} f_k(X_k), \quad Z = \sup_{f \in \mathcal{F}} T_n(f),$$
     and
     $$V_n = \sup_{f \in \mathcal{F}} E T_n^2(f) = \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} E[f_k(X_k)]^2, \quad \mathcal{V}_n = 2EZ + V_n. \quad (3.126)$$
     Then, for all $t \in [0, 2/3)$,
     $$L(t) := \log\left(E e^{tZ}\right) \le tEZ + \frac{t^2}{2 - 3t}\, \mathcal{V}_n, \quad (3.127)$$
     and therefore, for all $x \ge 0$,
     $$\Pr\left\{Z \ge EZ + \sqrt{2\mathcal{V}_n x} + \frac{3x}{2}\right\} \le e^{-x}. \quad (3.128)$$
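
The passage from (3.127) to (3.128) is a Chernoff bound optimised over $t$. The sketch below minimises the exponent on a grid and compares it with $-x$; it assumes the quantity multiplying $t^2/(2-3t)$ is $2EZ + V_n$ (called `v_total` here), and the grid size and sample values are arbitrary illustrative choices.

```python
import math

def log_mgf_bound(t, v_total):
    """Bound from (3.127) on L(t) - t*EZ: t^2 / (2 - 3t) * v_total, for 0 <= t < 2/3."""
    return t * t / (2.0 - 3.0 * t) * v_total

def deviation_3128(x, v_total):
    """Deviation level in (3.128): sqrt(2 * v_total * x) + 3x/2."""
    return math.sqrt(2.0 * v_total * x) + 1.5 * x

def chernoff_exponent(r, v_total, grid=20000):
    """Minimum over a grid of t in (0, 2/3) of [bound(t) - t*r].

    exp of this value bounds P(Z >= EZ + r) by Markov's inequality.
    """
    best = 0.0
    for j in range(1, grid):
        t = (2.0 / 3.0) * j / grid
        best = min(best, log_mgf_bound(t, v_total) - t * r)
    return best

v_total, x = 4.0, 2.0
exponent = chernoff_exponent(deviation_3128(x, v_total), v_total)
```

For these values the minimum is attained near $t = 0.4$ and the exponent is approximately $-x$, so the deviation level in (3.128) is exactly what the bound (3.127) allows.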

  19. Proof. To prove Theorem 3.3.16 we need Lemmas 3.3.17 to 3.3.19.
     Lemma 3.3.17
     Let $F(t) = E e^{tZ}$, let $g(t; X_1, \dots, X_n) = e^{tZ}$ and let $g_k(t; X_1, \dots, X_n)$, $k = 1, \dots, n$, be nonnegative functions such that $E(g_k \log g_k) < \infty$ for all $t \ge 0$. Then
     $$tF'(t) - F(t) \log F(t) = \operatorname{Ent}_P(g(t)) \le \sum_{k=1}^{n} E\left[g_k \log(g_k / E_k g_k)\right] + \sum_{k=1}^{n} E\left[(g - g_k) \log(g / E_k g)\right]. \quad (3.129)$$

  20. Lemma 3.3.18
     For $g = e^{tZ}$ and the functions $g_k$, $1 \le k \le n$, defined by (3.130), we have
     $$E\left((g - g_k) \log(g / E_k g)\right) \le tE(g - g_k).$$
     Lemma 3.3.19
     $$E\left[g_k \log\frac{g_k}{h_k} + (1 + t)(h_k - g_k)\right] \le \frac{t^2 e^t V_n}{2}\, F(t). \quad (3.134)$$

  21. Proof of Theorem 3.3.16
     Proof. Set, as usual, $L(t) = \log E e^{tZ} = \log F(t)$. Lemmas 3.3.17 to 3.3.19 give
     $$t(1 - t) L'(t) - L(t) \le t^2 e^t V_n / 2.$$
     Dividing both sides by $t^2$ and noting that $(L/t)' = L'/t - L/t^2$, this becomes
     $$\left(\frac{L}{t}\right)' - L' \le \frac{e^t V_n}{2}.$$
     Integrating from $0$ to $t$ (with $L(s)/s \to EZ$ as $s \to 0$) and using a Taylor expansion to bound $e^t - 1 \le 2t/(2 - t)$,
     $$L(t) - tEZ \le \frac{t^2}{(2 - t)(1 - t)} V_n + \frac{t^2}{1 - t} EZ \le \frac{t^2 (V_n + 2EZ)}{2 - 3t}.$$
     This proves (3.127). To prove (3.128), apply Proposition 3.1.6 with $\varphi(\lambda) = (V_n + 2EZ)\lambda^2 / (2(1 - 3\lambda/2))$.
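
The two elementary inequalities behind the last display can be checked on a grid of $t \in (0, 2/3)$. This is a sketch under assumptions: the Taylor-type bound is taken to be $e^t - 1 \le 2t/(2-t)$, which is one natural reading of the slide, and the function name, tolerance and sample values of $V_n$ and $EZ$ are illustrative.

```python
import math

def check_proof_inequalities(v_n, ez, grid=1000):
    """Check, on a grid of t in (0, 2/3), the elementary bounds used in the proof."""
    for j in range(1, grid):
        t = (2.0 / 3.0) * j / grid
        # Assumed Taylor-type bound: e^t - 1 <= 2t / (2 - t) for 0 <= t < 2.
        taylor_ok = math.exp(t) - 1.0 <= 2.0 * t / (2.0 - t) + 1e-12
        # Middle and final expressions of the last display in the proof.
        middle = t * t / ((2.0 - t) * (1.0 - t)) * v_n + t * t / (1.0 - t) * ez
        final = t * t * (v_n + 2.0 * ez) / (2.0 - 3.0 * t)
        if not (taylor_ok and middle <= final + 1e-12):
            return False
    return True

ok = check_proof_inequalities(v_n=3.0, ez=1.5)
```

The second comparison holds because $(2 - t)(1 - t) \ge 2 - 3t$ and $2(1 - t) \ge 2 - 3t$ for $t \ge 0$.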
