Concentration inequalities and tail bounds
John Duchi


  1. Concentration inequalities and tail bounds
     Prof. John Duchi

  2. Outline
     I. Basics and motivation: 1. Law of large numbers; 2. Markov inequality; 3. Chernoff bounds.
     II. Sub-Gaussian random variables: 1. Definitions; 2. Examples; 3. Hoeffding inequalities.
     III. Sub-exponential random variables: 1. Definitions; 2. Examples; 3. Chernoff/Bernstein bounds.

  3. Motivation
     Often in this class, the goal is to argue that a sequence of random variables (or vectors) $X_1, X_2, \ldots$ satisfies
     $$\frac{1}{n} \sum_{i=1}^n X_i \stackrel{p}{\to} \mathbb{E}[X].$$
     Law of large numbers: if $\mathbb{E}[\|X\|] < \infty$, then
     $$\mathbb{P}\Big(\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n X_i \neq \mathbb{E}[X]\Big) = 0.$$
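     A quick numerical illustration of the law of large numbers; a minimal sketch assuming NumPy, with Exp(1) as an arbitrary example distribution, so $\mathbb{E}[X] = 1$:

```python
import numpy as np

# Sample means of i.i.d. Exp(1) draws approach E[X] = 1 as n grows.
rng = np.random.default_rng(0)
for n in [10, 100, 10_000, 1_000_000]:
    x = rng.exponential(scale=1.0, size=n)
    print(f"n = {n:>9}: sample mean = {x.mean():.4f}")
```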

  4. Markov inequalities
     Theorem (Markov's inequality). Let X be a non-negative random variable. Then for t > 0,
     $$\mathbb{P}(X \geq t) \leq \frac{\mathbb{E}[X]}{t}.$$
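     As a sanity check, the inequality can be verified by simulation; a small sketch (NumPy assumed, Exp(1) again an arbitrary non-negative choice with $\mathbb{E}[X] = 1$):

```python
import numpy as np

# Compare empirical tail probabilities against the Markov bound E[X]/t.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)
for t in [1.0, 2.0, 5.0]:
    print(f"t = {t}: P(X >= t) = {(x >= t).mean():.4f}"
          f" <= E[X]/t = {x.mean() / t:.4f}")
```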

  5. Chebyshev inequalities
     Theorem (Chebyshev's inequality). Let X be a real-valued random variable with $\mathbb{E}[X^2] < \infty$. Then for t > 0,
     $$\mathbb{P}(|X - \mathbb{E}[X]| \geq t) \leq \frac{\mathbb{E}[(X - \mathbb{E}[X])^2]}{t^2} = \frac{\mathrm{Var}(X)}{t^2}.$$
     Example: i.i.d. sampling.
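     In the i.i.d. sampling example, the sample mean of n draws has variance $\mathrm{Var}(X)/n$, so Chebyshev bounds its deviation by $\mathrm{Var}(X)/(n t^2)$. A simulation sketch (NumPy assumed; Uniform(0,1) is an arbitrary choice, with mean 1/2 and variance 1/12):

```python
import numpy as np

# Chebyshev for the sample mean: P(|mean - mu| >= t) <= Var(X) / (n t^2).
rng = np.random.default_rng(0)
n, t, reps = 100, 0.05, 100_000
means = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)  # mu = 0.5
empirical = (np.abs(means - 0.5) >= t).mean()
bound = (1 / 12) / (n * t**2)
print(f"P(|mean - mu| >= {t}) = {empirical:.5f} <= {bound:.5f}")
```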

  6. Chernoff bounds
     Moment generating function: for a random variable X, the MGF is
     $$M_X(\lambda) := \mathbb{E}[e^{\lambda X}].$$
     Example: normally distributed random variables.
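     The normal example has a closed form; the standard completing-the-square computation for $X \sim N(0, \sigma^2)$:

```latex
\begin{align*}
M_X(\lambda)
  &= \int_{-\infty}^{\infty} e^{\lambda x}\,
     \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/(2\sigma^2)}\, dx \\
  &= e^{\lambda^2\sigma^2/2}
     \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,
     e^{-(x - \lambda\sigma^2)^2/(2\sigma^2)}\, dx
   = \exp\Big(\frac{\lambda^2\sigma^2}{2}\Big).
\end{align*}
```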

  7. Chernoff bounds
     Theorem (Chernoff bound). For any random variable X and $t \geq 0$,
     $$\mathbb{P}(X - \mathbb{E}[X] \geq t) \leq \inf_{\lambda \geq 0} M_{X - \mathbb{E}[X]}(\lambda)\, e^{-\lambda t} = \inf_{\lambda \geq 0} \mathbb{E}\big[e^{\lambda (X - \mathbb{E}[X])}\big]\, e^{-\lambda t}.$$
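     Instantiating the bound for $X \sim N(\mu, \sigma^2)$ with the Gaussian MGF above shows how the infimum is computed: the exponent $\lambda^2 \sigma^2 / 2 - \lambda t$ is minimized at $\lambda = t/\sigma^2$, giving

```latex
\begin{align*}
\mathbb{P}(X - \mathbb{E}[X] \geq t)
  \leq \inf_{\lambda \geq 0} e^{\lambda^2 \sigma^2 / 2 - \lambda t}
  = \exp\Big(-\frac{t^2}{2\sigma^2}\Big),
  \quad \text{attained at } \lambda = t/\sigma^2.
\end{align*}
```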

  8. Sub-Gaussian random variables
     Definition (Sub-Gaussianity). A mean-zero random variable X is $\sigma^2$-sub-Gaussian if
     $$\mathbb{E}\big[e^{\lambda X}\big] \leq \exp\Big(\frac{\lambda^2 \sigma^2}{2}\Big) \quad \text{for all } \lambda \in \mathbb{R}.$$
     Example: $X \sim N(0, \sigma^2)$.

  9. Properties of sub-Gaussians
     Proposition (sums of sub-Gaussians). Let $X_i$ be independent, mean-zero, $\sigma_i^2$-sub-Gaussian. Then $\sum_{i=1}^n X_i$ is $\sum_{i=1}^n \sigma_i^2$-sub-Gaussian.
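     The proof is one line once independence is used to factor the MGF:

```latex
\begin{align*}
\mathbb{E}\Big[e^{\lambda \sum_{i=1}^n X_i}\Big]
  = \prod_{i=1}^n \mathbb{E}\big[e^{\lambda X_i}\big]
  \leq \prod_{i=1}^n e^{\lambda^2 \sigma_i^2 / 2}
  = \exp\Big(\frac{\lambda^2 \sum_{i=1}^n \sigma_i^2}{2}\Big).
\end{align*}
```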

  10. Concentration inequalities
     Theorem. Let X be $\sigma^2$-sub-Gaussian. Then for $t \geq 0$,
     $$\mathbb{P}(X - \mathbb{E}[X] \geq t) \leq \exp\Big(-\frac{t^2}{2\sigma^2}\Big), \qquad \mathbb{P}(X - \mathbb{E}[X] \leq -t) \leq \exp\Big(-\frac{t^2}{2\sigma^2}\Big).$$
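     A quick simulation sketch (NumPy assumed) comparing the empirical tail of a standard normal, which is 1-sub-Gaussian, against the bound $\exp(-t^2/2)$:

```python
import numpy as np

# Empirical Gaussian tail vs. the sub-Gaussian tail bound exp(-t^2/2).
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
for t in [1.0, 2.0, 3.0]:
    print(f"t = {t}: P(X >= t) = {(x >= t).mean():.5f}"
          f" <= exp(-t^2/2) = {np.exp(-t**2 / 2):.5f}")
```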

  11. Concentration: convergence of an independent sum
     Corollary. Let $X_i$ be independent $\sigma_i^2$-sub-Gaussian. Then for $t \geq 0$,
     $$\mathbb{P}\Big(\frac{1}{n} \sum_{i=1}^n X_i \geq t\Big) \leq \exp\Big(-\frac{n t^2}{\frac{2}{n} \sum_{i=1}^n \sigma_i^2}\Big).$$

  12. Example: bounded random variables
     Proposition. Let $X \in [a, b]$ with $\mathbb{E}[X] = 0$. Then
     $$\mathbb{E}\big[e^{\lambda X}\big] \leq \exp\Big(\frac{\lambda^2 (b - a)^2}{8}\Big).$$
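     A numeric check of the proposition for a Rademacher variable, uniform on $\{-1, +1\}$ (an arbitrary bounded example: $a = -1$, $b = 1$, so $(b - a)^2/8 = 1/2$), whose MGF is $\cosh(\lambda)$:

```python
import numpy as np

# Hoeffding's lemma for a Rademacher variable:
# E[e^(lam X)] = cosh(lam) <= exp(lam^2 / 2).
for lam in [0.5, 1.0, 2.0, 4.0]:
    print(f"lambda = {lam}: cosh(lam) = {np.cosh(lam):.4f}"
          f" <= exp(lam^2/2) = {np.exp(lam**2 / 2):.4f}")
```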

  13. Maxima of sub-Gaussian random variables (in expectation)
     If $X_1, \ldots, X_n$ are each $\sigma^2$-sub-Gaussian, then
     $$\mathbb{E}\Big[\max_{j \leq n} X_j\Big] \leq \sqrt{2 \sigma^2 \log n}.$$
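     A simulation sketch (NumPy assumed) for i.i.d. $N(0,1)$ maxima, where $\sigma^2 = 1$:

```python
import numpy as np

# E[max of n standard normals] stays below sqrt(2 log n).
rng = np.random.default_rng(0)
for n in [10, 100, 1000]:
    maxima = rng.standard_normal((10_000, n)).max(axis=1)
    print(f"n = {n:>4}: E[max] = {maxima.mean():.3f}"
          f" <= sqrt(2 log n) = {np.sqrt(2 * np.log(n)):.3f}")
```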

  14. Maxima of sub-Gaussian random variables (in probability)
     Under the same assumptions,
     $$\mathbb{P}\Big(\max_{j \leq n} X_j \geq \sqrt{2 \sigma^2 (\log n + t)}\Big) \leq e^{-t}.$$

  15. Hoeffding's inequality
     If the $X_i$ are independent and bounded in $[a_i, b_i]$, then for $t \geq 0$,
     $$\mathbb{P}\Big(\frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}[X_i]) \geq t\Big) \leq \exp\Big(-\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^n (b_i - a_i)^2}\Big),$$
     $$\mathbb{P}\Big(\frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}[X_i]) \leq -t\Big) \leq \exp\Big(-\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^n (b_i - a_i)^2}\Big).$$
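     A simulation sketch (NumPy assumed) for i.i.d. Bernoulli(1/2) variables, an arbitrary choice lying in $[0, 1]$, so the bound becomes $\exp(-2 n t^2)$:

```python
import numpy as np

# Hoeffding's inequality for averages of Bernoulli(1/2) draws.
rng = np.random.default_rng(0)
n, t, reps = 200, 0.1, 100_000
means = rng.integers(0, 2, size=(reps, n)).mean(axis=1)
empirical = (means - 0.5 >= t).mean()
print(f"P(mean - 1/2 >= {t}) = {empirical:.5f}"
      f" <= exp(-2 n t^2) = {np.exp(-2 * n * t**2):.5f}")
```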

  16. Equivalent definitions of sub-Gaussianity
     Theorem. The following are equivalent (up to constants):
     (i) $\mathbb{E}[\exp(X^2 / \sigma^2)] \leq e$;
     (ii) $\mathbb{E}[|X|^k]^{1/k} \leq \sigma \sqrt{k}$ for all $k \geq 1$;
     (iii) $\mathbb{P}(|X| \geq t) \leq \exp(-t^2 / (2\sigma^2))$.
     If in addition X is mean-zero, then (i)-(iii) are also equivalent to
     (iv) X is $\sigma^2$-sub-Gaussian.

  17. Sub-exponential random variables
     Definition (Sub-exponential). A mean-zero random variable X is $(\tau^2, b)$-sub-exponential if
     $$\mathbb{E}[\exp(\lambda X)] \leq \exp\Big(\frac{\lambda^2 \tau^2}{2}\Big) \quad \text{for } |\lambda| \leq \frac{1}{b}.$$
     Example: exponential random variable, with density $p(x) = \beta e^{-\beta x}$ for $x \geq 0$.
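     The exponential example's MGF is finite only on a bounded interval of $\lambda$, which is exactly why it is sub-exponential rather than sub-Gaussian:

```latex
\begin{align*}
\mathbb{E}\big[e^{\lambda X}\big]
  = \int_0^\infty e^{\lambda x}\, \beta e^{-\beta x}\, dx
  = \frac{\beta}{\beta - \lambda}
  \quad \text{for } \lambda < \beta,
\end{align*}
```

     while $\mathbb{E}[e^{\lambda X}] = +\infty$ for $\lambda \geq \beta$, so no bound of the sub-Gaussian form can hold for all $\lambda \in \mathbb{R}$.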

  18. Sub-exponential random variables
     Example: $\chi^2$ random variable. Let $Z \sim N(0, \sigma^2)$ and $X = Z^2$. Then
     $$\mathbb{E}\big[e^{\lambda X}\big] = \frac{1}{[1 - 2\lambda\sigma^2]_+^{1/2}}.$$
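     A Monte Carlo sanity check (NumPy assumed) of this identity with $\sigma = 1$, valid for $\lambda < 1/2$; the Monte Carlo estimate has finite variance only for $\lambda < 1/4$, so we keep $\lambda$ small:

```python
import numpy as np

# Check E[exp(lam Z^2)] = (1 - 2 lam)^(-1/2) for Z ~ N(0, 1).
rng = np.random.default_rng(0)
z = rng.standard_normal(2_000_000)
for lam in [-1.0, 0.1, 0.2]:
    mc = np.exp(lam * z**2).mean()
    exact = (1 - 2 * lam) ** -0.5
    print(f"lambda = {lam:>4}: Monte Carlo {mc:.4f} vs exact {exact:.4f}")
```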

  19. Concentration of sub-exponentials
     Theorem. Let X be $(\tau^2, b)$-sub-exponential. Then
     $$\mathbb{P}(X \geq \mathbb{E}[X] + t) \leq \max\Big\{ e^{-\frac{t^2}{2\tau^2}},\ e^{-\frac{t}{2b}} \Big\} = \begin{cases} e^{-\frac{t^2}{2\tau^2}} & \text{if } 0 \leq t \leq \frac{\tau^2}{b}, \\ e^{-\frac{t}{2b}} & \text{if } t \geq \frac{\tau^2}{b}. \end{cases}$$

  20. Sums of sub-exponential random variables
     Let $X_i$ be independent $(\tau_i^2, b_i)$-sub-exponential random variables. Then $\sum_{i=1}^n X_i$ is $\big(\sum_{i=1}^n \tau_i^2, b_*\big)$-sub-exponential, where $b_* = \max_i b_i$.
     Corollary: if the $X_i$ satisfy the above, then
     $$\mathbb{P}\Big(\Big|\frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}[X_i])\Big| \geq t\Big) \leq 2 \exp\Big(-\min\Big\{ \frac{n t^2}{\frac{2}{n} \sum_{i=1}^n \tau_i^2},\ \frac{n t}{2 b_*} \Big\}\Big).$$
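     A simulation sketch of the corollary for averages of centered $\chi^2_1$ variables $X_i = Z_i^2 - 1$; the parameters $(\tau^2, b) = (4, 4)$ are a standard choice for this example but are an assumption here, not stated on the slide:

```python
import numpy as np

# Two-regime sub-exponential bound for averages of Z_i^2 - 1, Z_i ~ N(0,1),
# using the (assumed) parameters tau_i^2 = 4 and b_star = 4.
rng = np.random.default_rng(0)
n, t, reps = 100, 0.5, 200_000
means = (rng.standard_normal((reps, n)) ** 2 - 1).mean(axis=1)
empirical = (np.abs(means) >= t).mean()
bound = 2 * np.exp(-min(n * t**2 / (2 * 4), n * t / (2 * 4)))
print(f"P(|mean| >= {t}) = {empirical:.5f} <= {bound:.5f}")
```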

  21. Bernstein conditions and sub-exponentials
     Suppose X is mean-zero with
     $$\big|\mathbb{E}[X^k]\big| \leq \frac{1}{2} k!\, \sigma^2 b^{k-2} \quad \text{for } k = 2, 3, \ldots$$
     Then
     $$\mathbb{E}\big[e^{\lambda X}\big] \leq \exp\Big(\frac{\lambda^2 \sigma^2}{2(1 - b|\lambda|)}\Big) \quad \text{for } |\lambda| < \frac{1}{b}.$$

  22. Johnson-Lindenstrauss and high-dimensional embedding
     Question: let $u_1, \ldots, u_m \in \mathbb{R}^d$ be arbitrary. Can we find a mapping $F : \mathbb{R}^d \to \mathbb{R}^n$, $n \ll d$, such that
     $$(1 - \delta)\, \|u_i - u_j\|_2^2 \leq \|F(u_i) - F(u_j)\|_2^2 \leq (1 + \delta)\, \|u_i - u_j\|_2^2 \,?$$
     Theorem (Johnson-Lindenstrauss embedding). For $n \gtrsim \frac{1}{\delta^2} \log m$, such a mapping exists.
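     One standard construction, consistent with the proof on the next slide, takes $F(u) = X u / \sqrt{n}$ with X an $n \times d$ matrix of i.i.d. $N(0, 1)$ entries. A small demo sketch (NumPy assumed; the dimensions are arbitrary choices):

```python
import numpy as np

# Random Gaussian projection as a Johnson-Lindenstrauss map.
rng = np.random.default_rng(0)
d, n, m = 10_000, 500, 20
U = rng.standard_normal((m, d))               # m arbitrary points in R^d
X = rng.standard_normal((n, d)) / np.sqrt(n)  # projection, F(u) = X u
V = U @ X.T                                   # embedded points in R^n
i, j = np.triu_indices(m, k=1)                # all pairs i < j
ratios = (np.linalg.norm(V[i] - V[j], axis=1) /
          np.linalg.norm(U[i] - U[j], axis=1)) ** 2
print(f"squared-distance ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")
```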

  23. Proof of Johnson-Lindenstrauss, continued
     $$\mathbb{P}\Big(\Big|\frac{\|X u\|_2^2}{n \|u\|_2^2} - 1\Big| \geq t\Big) \leq 2 \exp\Big(-\frac{n t^2}{8}\Big) \quad \text{for } t \in [0, 1].$$

  24. Reading and bibliography
     1. S. Boucheron, O. Bousquet, and G. Lugosi. Concentration inequalities. In O. Bousquet, U. von Luxburg, and G. Rätsch, editors, Advanced Lectures on Machine Learning, pages 208-240. Springer, 2004.
     2. V. Buldygin and Y. Kozachenko. Metric Characterization of Random Variables and Random Processes, volume 188 of Translations of Mathematical Monographs. American Mathematical Society, 2000.
     3. M. Ledoux. The Concentration of Measure Phenomenon. American Mathematical Society, 2001.
     4. S. Boucheron, G. Lugosi, and P. Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
