Stein's method, logarithmic and transport inequalities



  1. Stein's method, logarithmic and transport inequalities. M. Ledoux, Institut de Mathématiques de Toulouse, France.

  2. Joint work with I. Nourdin and G. Peccati (Luxembourg). New connections between Stein's method, logarithmic Sobolev inequalities, and transportation cost inequalities. I. Nourdin, G. Peccati, Y. Swan (2013).

  3. Classical logarithmic Sobolev inequality (L. Gross, 1975). Let γ be the standard Gaussian (probability) measure on ℝ^d, dγ(x) = (2π)^{-d/2} e^{-|x|²/2} dx. For h > 0 smooth with ∫ h dγ = 1,
  \[ \int_{\mathbb{R}^d} h \log h \, d\gamma \;\le\; \frac{1}{2} \int_{\mathbb{R}^d} \frac{|\nabla h|^2}{h}\, d\gamma \]
  (entropy on the left, Fisher information on the right). Equivalently, with h → h² (and ∫ h² dγ = 1),
  \[ \int_{\mathbb{R}^d} h^2 \log h^2 \, d\gamma \;\le\; 2 \int_{\mathbb{R}^d} |\nabla h|^2 \, d\gamma. \]
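
  A short check of the equivalence of the two forms, assuming h > 0 smooth (standard, not spelled out on the slide): set g = h² with ∫ g dγ = 1; then
  \[ \frac{|\nabla g|^2}{g} = \frac{|2 h \nabla h|^2}{h^2} = 4\, |\nabla h|^2, \]
  so the first inequality applied to g gives
  \[ \int_{\mathbb{R}^d} h^2 \log h^2 \, d\gamma = \int_{\mathbb{R}^d} g \log g \, d\gamma \;\le\; \frac12 \int_{\mathbb{R}^d} \frac{|\nabla g|^2}{g}\, d\gamma = 2 \int_{\mathbb{R}^d} |\nabla h|^2 \, d\gamma. \]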

  4. Classical logarithmic Sobolev inequality, continued. With ∫ h dγ = 1,
  \[ \int_{\mathbb{R}^d} h \log h \, d\gamma \;\le\; \frac{1}{2} \int_{\mathbb{R}^d} \frac{|\nabla h|^2}{h}\, d\gamma. \]
  For ν ≪ γ with dν = h dγ this reads
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, \mathrm{I}(\nu \mid \gamma), \]
  where the (relative) H-entropy is H(ν|γ) = ∫ h log h dγ and the (relative) Fisher information is I(ν|γ) = ∫ |∇h|²/h dγ. Applications: hypercontractivity (integrability of Wiener chaos), convergence to equilibrium, concentration inequalities.

  5. Logarithmic Sobolev inequality and concentration: the Herbst argument (1975). Start from
  \[ \int_{\mathbb{R}^d} h \log h \, d\gamma \;\le\; \frac12 \int_{\mathbb{R}^d} \frac{|\nabla h|^2}{h}\, d\gamma, \qquad \int_{\mathbb{R}^d} h \, d\gamma = 1. \]
  Let φ : ℝ^d → ℝ be 1-Lipschitz with ∫ φ dγ = 0, and apply the inequality to
  \[ h = \frac{e^{\lambda \varphi}}{\int_{\mathbb{R}^d} e^{\lambda \varphi}\, d\gamma}, \qquad \lambda \in \mathbb{R}, \qquad Z(\lambda) = \int_{\mathbb{R}^d} e^{\lambda \varphi}\, d\gamma. \]

  6. Logarithmic Sobolev inequality and concentration: the Herbst argument (1975), continued. The logarithmic Sobolev inequality applied to h = e^{λφ}/Z(λ) yields the differential inequality
  \[ \lambda Z'(\lambda) - Z(\lambda) \log Z(\lambda) \;\le\; \frac{\lambda^2}{2}\, Z(\lambda). \]
  Integrating gives
  \[ Z(\lambda) = \int_{\mathbb{R}^d} e^{\lambda \varphi}\, d\gamma \;\le\; e^{\lambda^2/2}, \]
  and Chebyshev's inequality then yields Gaussian concentration:
  \[ \gamma(\varphi \ge r) \;\le\; e^{-r^2/2}, \qquad r \ge 0. \]
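
  The integration step, spelled out (a standard sketch; the slide only says "integrate"): set
  \[ F(\lambda) = \frac{1}{\lambda} \log Z(\lambda), \qquad \lambda > 0, \qquad
     F'(\lambda) = \frac{\lambda Z'(\lambda) - Z(\lambda) \log Z(\lambda)}{\lambda^2 Z(\lambda)} \;\le\; \frac12, \]
  while F(0⁺) = Z'(0)/Z(0) = ∫ φ dγ = 0, so F(λ) ≤ λ/2, i.e. Z(λ) ≤ e^{λ²/2}. Then for r ≥ 0,
  \[ \gamma(\varphi \ge r) \;\le\; e^{-\lambda r} Z(\lambda) \;\le\; e^{-\lambda r + \lambda^2/2}, \]
  and the choice λ = r gives the bound e^{-r²/2}.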

  7. Logarithmic Sobolev inequality and concentration, continued. For φ : ℝ^d → ℝ 1-Lipschitz with ∫ φ dγ = 0, the Gaussian concentration bound
  \[ \gamma(\varphi \ge r) \;\le\; e^{-r^2/2}, \qquad r \ge 0, \]
  is equivalent (up to numerical constants) to the moment growth
  \[ \Big( \int_{\mathbb{R}^d} |\varphi|^p \, d\gamma \Big)^{1/p} \;\le\; C \sqrt{p}, \qquad p \ge 1, \]
  which encodes the concentration rate.
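
  A sketch of one direction of this equivalence (tail bound implies moment growth), applying the Gaussian tail to both φ and -φ:
  \[ \int_{\mathbb{R}^d} |\varphi|^p \, d\gamma
     = p \int_0^\infty r^{p-1}\, \gamma(|\varphi| \ge r)\, dr
     \;\le\; 2p \int_0^\infty r^{p-1} e^{-r^2/2}\, dr
     = p\, 2^{p/2}\, \Gamma\!\Big( \frac{p}{2} \Big), \]
  and, using Γ(p/2) ≤ (p/2)^{p/2}, the p-th root is at most 2√2 · √(p/2) ≤ C√p for a numerical constant C.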

  8. Gaussian processes. Let F be a collection of functions f : S → ℝ and (G(f))_{f∈F} a centered Gaussian process. The supremum M = sup_{f∈F} G(f) is a Lipschitz functional, and Gaussian concentration gives
  \[ \mathbb{P}\big( |M - m| \ge r \big) \;\le\; 2\, e^{-r^2/2\sigma^2}, \qquad r \ge 0, \]
  where σ² = sup_{f∈F} E[G(f)²] and m is the mean or a median of M. Gaussian isoperimetric inequality: C. Borell, V. Sudakov, B. Tsirel'son, I. Ibragimov (1975).

  9. Extension to empirical processes (M. Talagrand, 1996). Let X_1, ..., X_n be independent random variables in (S, 𝒮) and F a collection of functions f : S → ℝ. Set
  \[ M = \sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i). \]
  Since M is Lipschitz and convex, one looks for concentration inequalities on P(|M - m| ≥ r), r ≥ 0.

  10. Extension to empirical processes, continued. With M = sup_{f∈F} Σ_{i=1}^n f(X_i), and assuming |f| ≤ 1 and E f(X_i) = 0 for all f ∈ F,
  \[ \mathbb{P}\big( |M - m| \ge r \big) \;\le\; C \exp\Big( -\frac{r}{C} \log\Big( 1 + \frac{r}{\sigma^2 + m} \Big) \Big), \qquad r \ge 0, \]
  where σ² = sup_{f∈F} E Σ_{i=1}^n f²(X_i) and m is the mean or a median of M. M. Talagrand (1996): isoperimetric methods for product measures. Entropy method and Herbst argument: P. Massart (2000); S. Boucheron, G. Lugosi, P. Massart (2005, 2013).

  11. Stein's method (C. Stein, 1972). Let γ be the standard normal distribution on ℝ. The identity
  \[ \int_{\mathbb{R}} x\, \varphi \, d\gamma = \int_{\mathbb{R}} \varphi' \, d\gamma, \qquad \varphi : \mathbb{R} \to \mathbb{R} \ \text{smooth}, \]
  characterizes γ. Stein's inequality: for every probability measure ν on ℝ,
  \[ \| \nu - \gamma \|_{\mathrm{TV}} \;\le\; \sup \Big| \int_{\mathbb{R}} x\, \varphi \, d\nu - \int_{\mathbb{R}} \varphi' \, d\nu \Big|, \]
  the supremum running over smooth φ with ‖φ‖_∞ ≤ √(π/2) and ‖φ'‖_∞ ≤ 2.
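
  The "only if" direction of the characterization is a one-line integration by parts (standard, included here for completeness):
  \[ \int_{\mathbb{R}} \varphi'(x)\, \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx
     = \Big[ \varphi(x)\, \frac{e^{-x^2/2}}{\sqrt{2\pi}} \Big]_{-\infty}^{\infty}
       + \int_{\mathbb{R}} x\, \varphi(x)\, \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx
     = \int_{\mathbb{R}} x\, \varphi \, d\gamma, \]
  the boundary term vanishing for, say, bounded smooth φ.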

  12. The Stein factor. Let ν be a (centered) probability measure on ℝ. A Stein factor for ν is a map x ↦ τ_ν(x) such that
  \[ \int_{\mathbb{R}} x\, \varphi \, d\nu = \int_{\mathbb{R}} \tau_\nu\, \varphi' \, d\nu, \qquad \varphi : \mathbb{R} \to \mathbb{R} \ \text{smooth}. \]
  For the standard normal γ, τ_γ ≡ 1. The Stein discrepancy S(ν|γ) is defined by
  \[ S^2(\nu \mid \gamma) = \int_{\mathbb{R}} |\tau_\nu - 1|^2 \, d\nu, \]
  and Stein's inequality yields
  \[ \| \nu - \gamma \|_{\mathrm{TV}} \;\le\; 2\, S(\nu \mid \gamma). \]

  13. Stein factor and discrepancy, examples I. Recall the defining identity ∫ x φ dν = ∫ τ_ν φ' dν, with τ_γ ≡ 1 for the standard normal. If dν = f dx, then
  \[ \tau_\nu(x) = \frac{1}{f(x)} \int_x^\infty y\, f(y)\, dy, \qquad x \in \mathrm{supp}(f). \]
  (When τ_ν is a polynomial, ν belongs to the Pearson class.)
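
  A quick verification of this formula (a sketch, assuming f smooth on an interval and ν centered with a first moment): let T(x) = ∫_x^∞ y f(y) dy, so that T' = -x f, T(+∞) = 0 and T(-∞) = ∫ y f(y) dy = 0 since ν is centered. Then, for smooth bounded φ,
  \[ \int_{\mathbb{R}} \tau_\nu\, \varphi' \, d\nu
     = \int_{\mathbb{R}} T(x)\, \varphi'(x)\, dx
     = \big[ T \varphi \big]_{-\infty}^{\infty} - \int_{\mathbb{R}} T'(x)\, \varphi(x)\, dx
     = \int_{\mathbb{R}} x\, \varphi(x)\, f(x)\, dx
     = \int_{\mathbb{R}} x\, \varphi \, d\nu. \]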

  14. Stein factor and discrepancy, examples II: the central limit theorem. Let X, X_1, ..., X_n be iid random variables with mean zero and variance one, and
  \[ S_n = \frac{1}{\sqrt{n}} ( X_1 + \cdots + X_n ). \]
  Then (note that ∫ τ_{L(X)} dL(X) = 1 since X has variance one)
  \[ S^2\big( \mathcal{L}(S_n) \mid \gamma \big) \;\le\; \frac{1}{n}\, \mathrm{Var}\big( \tau_{\mathcal{L}(X)}(X) \big), \]
  so that
  \[ S^2\big( \mathcal{L}(S_n) \mid \gamma \big) = O\Big( \frac{1}{n} \Big). \]
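
  A sketch of why the 1/n bound holds (not spelled out on the slide; assume L(X) admits a Stein factor τ = τ_{L(X)}). Applying the Stein identity to X_i conditionally on the other variables,
  \[ \mathbb{E}\big[ S_n\, \varphi(S_n) \big]
     = \frac{1}{\sqrt n} \sum_{i=1}^n \mathbb{E}\big[ X_i\, \varphi(S_n) \big]
     = \mathbb{E}\Big[ \Big( \frac{1}{n} \sum_{i=1}^n \tau(X_i) \Big) \varphi'(S_n) \Big], \]
  so τ_{L(S_n)}(S_n) = E[ (1/n) Σ_i τ(X_i) | S_n ]. Since E τ(X) = Var(X) = 1, Jensen's inequality for conditional expectations gives
  \[ S^2\big( \mathcal{L}(S_n) \mid \gamma \big)
     = \mathbb{E}\big[ \big( \tau_{\mathcal{L}(S_n)}(S_n) - 1 \big)^2 \big]
     \;\le\; \mathrm{Var}\Big( \frac{1}{n} \sum_{i=1}^n \tau(X_i) \Big)
     = \frac{1}{n}\, \mathrm{Var}\big( \tau(X) \big). \]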

  15. Stein factor and discrepancy, examples III: Wiener multiple integrals (chaos). Consider the multilinear Gaussian polynomial
  \[ F = \sum_{i_1, \ldots, i_k = 1}^{N} a_{i_1, \ldots, i_k}\, X_{i_1} \cdots X_{i_k}, \]
  where X_1, ..., X_N are independent standard normal, the coefficients a_{i_1, ..., i_k} ∈ ℝ are symmetric and vanish on diagonals, and E(F²) = 1.

  16. Stein factor and discrepancy, examples III, continued: the fourth moment theorem (D. Nualart, G. Peccati, 2005). Let F = F_n, n ∈ ℕ, be elements of a k-chaos (fixed degree k) with N = N_n → ∞ and E(F_n²) = 1 (or → 1). Then F_n converges in distribution to a standard normal if and only if
  \[ \mathbb{E}(F_n^4) \;\to\; 3 = \int_{\mathbb{R}} x^4 \, d\gamma. \]

  17. Stein factor and discrepancy, examples III, continued. For F a Wiener chaos or multilinear polynomial, a Stein factor is given by
  \[ \tau_F(x) = \mathbb{E}\big[ \langle DF, -D L^{-1} F \rangle \mid F = x \big], \]
  where L is the Ornstein-Uhlenbeck operator and D the Malliavin derivative. For F in a k-chaos with E(F²) = 1,
  \[ S^2\big( \mathcal{L}(F) \mid \gamma \big) \;\le\; \frac{k-1}{3k} \big[ \mathbb{E}(F^4) - 3 \big]. \]
  Multidimensional versions: I. Nourdin, G. Peccati (2009); I. Nourdin, J. Rosinski (2012).

  18. Multidimensional Stein matrix. Let ν be a (centered) probability measure on ℝ^d. A Stein matrix for ν is a map x ↦ τ_ν(x) = (τ_ν^{ij}(x))_{1≤i,j≤d} such that
  \[ \int_{\mathbb{R}^d} x\, \varphi \, d\nu = \int_{\mathbb{R}^d} \tau_\nu\, \nabla \varphi \, d\nu, \qquad \varphi : \mathbb{R}^d \to \mathbb{R} \ \text{smooth}. \]
  The Stein discrepancy S(ν|γ) is defined by
  \[ S^2(\nu \mid \gamma) = \int_{\mathbb{R}^d} \| \tau_\nu - \mathrm{Id} \|_{\mathrm{HS}}^2 \, d\nu. \]
  There is no Stein inequality (total variation bounded by the discrepancy) in general in this multidimensional setting.

  19. Entropy and total variation. Stein's inequality (on ℝ) gives ‖ν - γ‖_TV ≤ 2 S(ν|γ). A stronger notion is convergence in entropy. For ν a probability measure on ℝ^d with dν = h dγ (density h), the (relative) H-entropy is
  \[ \mathrm{H}(\nu \mid \gamma) = \int_{\mathbb{R}^d} h \log h \, d\gamma, \]
  and Pinsker's inequality reads
  \[ \| \nu - \gamma \|_{\mathrm{TV}}^2 \;\le\; \frac12\, \mathrm{H}(\nu \mid \gamma). \]

  20. Logarithmic Sobolev and Stein. Let γ be the standard Gaussian measure on ℝ^d. The logarithmic Sobolev inequality states that for ν ≪ γ with dν = h dγ,
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, \mathrm{I}(\nu \mid \gamma), \]
  with (relative) H-entropy H(ν|γ) = ∫ h log h dγ, (relative) Fisher information I(ν|γ) = ∫ |∇h|²/h dγ, and (relative) Stein discrepancy
  \[ S^2(\nu \mid \gamma) = \int_{\mathbb{R}^d} \| \tau_\nu - \mathrm{Id} \|_{\mathrm{HS}}^2 \, d\nu. \]

  21. HSI inequality. The new HSI (H-entropy, Stein discrepancy, Fisher information) inequality:
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, S^2(\nu \mid \gamma)\, \log\Big( 1 + \frac{\mathrm{I}(\nu \mid \gamma)}{S^2(\nu \mid \gamma)} \Big). \]
  Since log(1 + x) ≤ x, this improves upon the logarithmic Sobolev inequality. Entropic convergence: if S(ν_n|γ) → 0 and I(ν_n|γ) stays bounded, then H(ν_n|γ) → 0.
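
  Two short consequences spelled out, using only the HSI inequality and log(1 + x) ≤ x:
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, S^2 \log\Big( 1 + \frac{\mathrm{I}}{S^2} \Big)
     \;\le\; \frac12\, S^2 \cdot \frac{\mathrm{I}}{S^2} = \frac12\, \mathrm{I}(\nu \mid \gamma), \]
  recovering the logarithmic Sobolev inequality; and if I(ν_n|γ) ≤ K while S(ν_n|γ) → 0, then
  \[ \mathrm{H}(\nu_n \mid \gamma) \;\le\; \frac12\, S^2(\nu_n \mid \gamma)\, \log\Big( 1 + \frac{K}{S^2(\nu_n \mid \gamma)} \Big) \;\longrightarrow\; 0, \]
  since x log(1 + K/x) → 0 as x → 0⁺.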

  22. HSI and entropic convergence: the entropic central limit theorem. Let X, X_1, ..., X_n be iid random variables with mean zero and variance one, and S_n = (X_1 + ··· + X_n)/√n. Then
  \[ S^2\big( \mathcal{L}(S_n) \mid \gamma \big) \;\le\; \frac{1}{n}\, \mathrm{Var}\big( \tau_{\mathcal{L}(X)}(X) \big), \]
  and Stam's inequality gives I(L(S_n)|γ) ≤ I(L(X)|γ) < ∞. The HSI inequality then yields
  \[ \mathrm{H}\big( \mathcal{L}(S_n) \mid \gamma \big) = O\Big( \frac{\log n}{n} \Big). \]
  The rate O(1/n) is optimal under a fourth moment condition on X: S. Bobkov, G. Chistyakov, F. Götze (2013-14).
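
  How the O(log n / n) rate comes out (a sketch; write c = Var(τ_{L(X)}(X)) and K = I(L(X)|γ), both assumed finite). Since x ↦ x log(1 + K/x) is nondecreasing, plugging S² ≤ c/n and I ≤ K into the HSI inequality gives
  \[ \mathrm{H}\big( \mathcal{L}(S_n) \mid \gamma \big)
     \;\le\; \frac12 \cdot \frac{c}{n}\, \log\Big( 1 + \frac{K n}{c} \Big)
     = O\Big( \frac{\log n}{n} \Big). \]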

  23. HSI and concentration inequalities. Let ν be a probability measure on ℝ^d and φ : ℝ^d → ℝ 1-Lipschitz with ∫ φ dν = 0. Then, for p ≥ 2 and a numerical constant C > 0, the moment growth
  \[ \Big( \int_{\mathbb{R}^d} |\varphi|^p \, d\nu \Big)^{1/p}
     \;\le\; C \Big[ S_p(\nu \mid \gamma) + \sqrt{p}\, \Big( \int_{\mathbb{R}^d} \| \tau_\nu \|_{\mathrm{Op}}^{p/2} \, d\nu \Big)^{1/p} \Big], \]
  where
  \[ S_p(\nu \mid \gamma) = \Big( \int_{\mathbb{R}^d} \| \tau_\nu - \mathrm{Id} \|_{\mathrm{HS}}^p \, d\nu \Big)^{1/p}. \]

  24. HSI and concentration inequalities, continued. Let X, X_1, ..., X_n be iid random variables in ℝ^d with mean zero and identity covariance, and S_n = (X_1 + ··· + X_n)/√n. For φ : ℝ^d → ℝ 1-Lipschitz,
  \[ \mathbb{P}\Big( \big| \varphi(S_n) - \mathbb{E}\,\varphi(S_n) \big| \ge r \Big) \;\le\; C\, e^{-r^2/C}, \qquad 0 \le r \le r_n, \]
  with r_n → ∞ according to the growth in p of ∫ ‖τ_ν - Id‖_HS^p dν.

  25. HSI inequality: elements of proof. Recall the HSI inequality
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, S^2(\nu \mid \gamma)\, \log\Big( 1 + \frac{\mathrm{I}(\nu \mid \gamma)}{S^2(\nu \mid \gamma)} \Big), \]
  relating the H-entropy H(ν|γ), the Fisher information I(ν|γ), and the Stein discrepancy S(ν|γ).

  26. HSI inequality: elements of proof, continued. Let (P_t)_{t≥0} be the Ornstein-Uhlenbeck semigroup,
  \[ P_t f(x) = \int_{\mathbb{R}^d} f\big( e^{-t} x + \sqrt{1 - e^{-2t}}\, y \big)\, d\gamma(y). \]
  With dν = h dγ, set dν_t = P_t h dγ (so ν_0 = ν, ν_∞ = γ). Then
  \[ \mathrm{H}(\nu \mid \gamma) = \int_0^\infty \mathrm{I}(\nu_t \mid \gamma)\, dt. \]
  Classical bound: I(ν_t|γ) ≤ e^{-2t} I(ν|γ). New main ingredient:
  \[ \mathrm{I}(\nu_t \mid \gamma) \;\le\; \frac{e^{-4t}}{1 - e^{-2t}}\, S^2(\nu \mid \gamma). \]

  27. HSI inequality: elements of proof, continued. Recall
  \[ \mathrm{H}(\nu \mid \gamma) = \int_0^\infty \mathrm{I}(\nu_t \mid \gamma)\, dt, \]
  together with the classical bound I(ν_t|γ) ≤ e^{-2t} I(ν|γ) and the new main ingredient I(ν_t|γ) ≤ e^{-4t}/(1 - e^{-2t}) S²(ν|γ). The latter rests on a representation of I(ν_t|γ): with v_t = log P_t h,
  \[ \mathrm{I}(\nu_t \mid \gamma) = \frac{e^{-2t}}{\sqrt{1 - e^{-2t}}}
     \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \big( \tau_\nu(x) - \mathrm{Id} \big) y \cdot \nabla v_t\big( e^{-t} x + \sqrt{1 - e^{-2t}}\, y \big)\, d\nu(x)\, d\gamma(y). \]
  One then optimizes over small t > 0 and large t > 0.
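
  One way to carry out this optimization, combining the two bounds from slide 26 (a sketch, with the cut-off a = e^{-2u} chosen at the end): for any u > 0,
  \[ \mathrm{H}(\nu \mid \gamma)
     \;\le\; \int_0^u e^{-2t}\, \mathrm{I}\, dt + \int_u^\infty \frac{e^{-4t}}{1 - e^{-2t}}\, S^2\, dt
     = \frac{\mathrm{I}}{2}\,(1 - a) + \frac{S^2}{2}\,\big( -a - \log(1 - a) \big), \]
  where a = e^{-2u} (the second integral is computed with the substitution s = e^{-2t}). Choosing a = I/(I + S²), the terms (I/2)(1 - a) and (S²/2) a cancel, both being equal to I S²/(2(I + S²)), while -log(1 - a) = log(1 + I/S²), giving exactly
  \[ \mathrm{H}(\nu \mid \gamma) \;\le\; \frac12\, S^2\, \log\Big( 1 + \frac{\mathrm{I}}{S^2} \Big). \]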

  28. HSI inequalities for other distributions. The same scheme yields
  \[ \mathrm{H}(\nu \mid \mu) \;\le\; \frac12\, S^2(\nu \mid \mu)\, \log\Big( 1 + \frac{\mathrm{I}(\nu \mid \mu)}{S^2(\nu \mid \mu)} \Big) \]
  for µ a gamma or beta distribution, for multidimensional families of log-concave distributions µ, and more generally in the setting of a Markov triple (E, µ, Γ) (typically an abstract Wiener space).
