concentration inequalities for random matrices

Concentration Inequalities for Random Matrices
M. Ledoux, Institut de Mathématiques de Toulouse, France

exponential tail inequalities: classical theme in probability and statistics


  1. sample covariance matrices
     $Y = (Y_1, \ldots, Y_N)$: $M \times N$ matrix, $Y = (Y_{ij})_{1 \le i \le M,\, 1 \le j \le N}$
     $Y_{ij}$ independent identically distributed (real or complex), $E(Y_{ij}) = 0$, $E(Y_{ij}^2) = 1$
     Wishart model: $Y_j$ standard Gaussian in $\mathbb{R}^M$
     numerous extensions

  7. sample covariance matrices
     $Y = (Y_1, \ldots, Y_N)$: $M \times N$ matrix, $Y = (Y_{ij})_{1 \le i \le M,\, 1 \le j \le N}$ iid, $E(Y_{ij}) = 0$, $E(Y_{ij}^2) = 1$
     center of interest: eigenvalues $0 \le \lambda_1^N \le \cdots \le \lambda_M^N$ of $Y Y^t$ ($M \times M$ non-negative symmetric matrix)
     $\sqrt{\lambda_k^N}$: singular values of $Y$
     $\hat\lambda_k^N = \frac{1}{N} \lambda_k^N$: eigenvalues of $\frac{1}{N} Y Y^t$
     spectral measure: $\frac{1}{M} \sum_{k=1}^{M} \delta_{\hat\lambda_k^N}$
     asymptotics: $M = M(N) \sim \rho N$, $N \to \infty$
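
The objects on this slide can be computed directly. The following is a minimal sketch, assuming real standard Gaussian entries (the Wishart model) and NumPy; the function name `normalized_spectrum` and the sizes `M=50`, `N=100` are illustrative choices, not part of the talk.

```python
import numpy as np

def normalized_spectrum(M, N, rng):
    """Spectrum of an M x N matrix Y with iid N(0, 1) entries (assumed model).

    Returns the eigenvalues of (1/N) Y Y^t, the eigenvalues of Y Y^t,
    and the singular values of Y.
    """
    Y = rng.standard_normal((M, N))
    # eigvalsh handles symmetric matrices and returns eigenvalues in ascending order
    lam = np.linalg.eigvalsh(Y @ Y.T)
    sing = np.linalg.svd(Y, compute_uv=False)
    return lam / N, lam, sing

rng = np.random.default_rng(0)
lam_hat, lam, sing = normalized_spectrum(M=50, N=100, rng=rng)

# The squared singular values of Y are the eigenvalues of Y Y^t.
print(np.allclose(np.sort(sing**2), lam))
# The spectral measure puts mass 1/M at each lam_hat[k]; its mean is Tr(Y Y^t) / (M N).
print(lam_hat.mean())
```

The spectral measure itself is just the empirical distribution of `lam_hat`, so a histogram of that array is the natural way to look at it.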

  12. Marchenko-Pastur theorem (1967)
     asymptotic behavior of the spectral measure ($\hat\lambda_k^N = \lambda_k^N / N$)
     $\frac{1}{M} \sum_{k=1}^{M} \delta_{\hat\lambda_k^N} \to \nu$, the Marchenko-Pastur distribution
     $d\nu(x) = \left(1 - \frac{1}{\rho}\right)_+ \delta_0 + \frac{1}{\rho\, 2\pi x} \sqrt{(b - x)(x - a)}\; 1_{[a,b]}\, dx$
     $a = a(\rho) = \left(1 - \sqrt{\rho}\right)^2$, $b = b(\rho) = \left(1 + \sqrt{\rho}\right)^2$
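
The density above is easy to evaluate numerically. The sketch below, assuming NumPy, implements the continuous part of $\nu$ and checks its total mass with a midpoint rule: the mass is 1 when $\rho \le 1$ and $1/\rho$ when $\rho > 1$, the remainder being the atom $(1 - 1/\rho)_+$ at 0. The names `mp_support`, `mp_density` and `mp_continuous_mass` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def mp_support(rho):
    """Endpoints a(rho), b(rho) of the Marchenko-Pastur support."""
    a = (1.0 - np.sqrt(rho)) ** 2
    b = (1.0 + np.sqrt(rho)) ** 2
    return a, b

def mp_density(x, rho):
    """Continuous part of the Marchenko-Pastur density (zero outside (a, b)).

    Always returns a 1-d array, even for scalar input.
    """
    a, b = mp_support(rho)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    out[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * rho * xi)
    return out

def mp_continuous_mass(rho, n=200_000):
    """Midpoint-rule approximation of the integral of the density over [a, b]."""
    a, b = mp_support(rho)
    h = (b - a) / n
    mid = a + (np.arange(n) + 0.5) * h
    return float(np.sum(mp_density(mid, rho)) * h)

print(mp_continuous_mass(0.5))  # close to 1
print(mp_continuous_mass(2.0))  # close to 1 / rho = 0.5
```

The case $\rho = 1$ is left out of the check on purpose: there $a = 0$ and the density has an integrable singularity at the origin, for which a plain midpoint rule converges slowly.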

  16. Marchenko-Pastur theorem
     $\frac{1}{M} \sum_{k=1}^{M} \delta_{\hat\lambda_k^N} \to \nu$ on $\left(a(\rho), b(\rho)\right)$, $M \sim \rho N$
     global regime
     large deviation asymptotics of the spectral measure
     fluctuations of the spectral measure:
     $\sum_{k=1}^{M} \left[ f(\hat\lambda_k^N) - \int_{\mathbb{R}} f \, d\nu \right] \to G$, a Gaussian variable, for $f : \mathbb{R} \to \mathbb{R}$ smooth
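
A worked special case may help explain why the fluctuation statement carries no $\sqrt{M}$ normalization. Take $f(x) = x$ and, as an assumption made only for this example, real standard Gaussian entries, so that $\operatorname{Var}(Y_{11}^2) = 2$:

```latex
\sum_{k=1}^{M} \hat\lambda_k^N
  = \frac{1}{N} \operatorname{Tr}(Y Y^t)
  = \frac{1}{N} \sum_{i=1}^{M} \sum_{j=1}^{N} Y_{ij}^2 ,
\qquad
\int_{\mathbb{R}} x \, d\nu(x) = 1 ,
\\[4pt]
\mathbb{E}\Big[ \sum_{k=1}^{M} \big( \hat\lambda_k^N - 1 \big) \Big] = 0 ,
\qquad
\operatorname{Var}\Big( \sum_{k=1}^{M} \hat\lambda_k^N \Big)
  = \frac{M N \operatorname{Var}(Y_{11}^2)}{N^2}
  = \frac{2M}{N} \longrightarrow 2\rho .
```

The centered sum is a normalized sum of $MN$ iid variables, so the classical central limit theorem gives a Gaussian limit with variance $2\rho$ in this case. The variance stays of order one although the sum has $M$ terms, which reflects the strong dependence between the eigenvalues.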

  20. Marchenko-Pastur theorem
     $\frac{1}{M} \sum_{k=1}^{M} \delta_{\hat\lambda_k^N} \to \nu$ on $\left(a(\rho), b(\rho)\right)$, $M \sim \rho N$
     local regime
     behavior of the individual eigenvalues
     spacings (bulk behavior)
     extremal eigenvalues (edge behavior)

  31. extremal eigenvalues
     largest eigenvalue $\lambda_M^N = \max_{1 \le k \le M} \lambda_k^N$
     $\hat\lambda_M^N = \frac{\lambda_M^N}{N} \to b(\rho) = \left(1 + \sqrt{\rho}\right)^2$, $M \sim \rho N$
     fluctuations around $b(\rho)$
     complex or real Gaussian (Wishart matrices):
     $M^{2/3} \left[ \hat\lambda_M^N - b(\rho) \right] \to C(\rho)\, F_{TW}$, equivalently $M^{2/3} N^{-1} \left[ \lambda_M^N - b(\rho) N \right] \to C(\rho)\, F_{TW}$
     $F_{TW}$: C. Tracy, H. Widom (1994) distribution
     K. Johansson (2000), I. Johnstone (2001)
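
The convergence of the largest normalized eigenvalue to $b(\rho)$ can be observed on a single sample. This is a small simulation sketch, assuming real Gaussian entries and NumPy; the helper `largest_normalized_eigenvalue`, the seed and the sizes are illustrative assumptions. It does not attempt to estimate the constant $C(\rho)$ or the Tracy-Widom law.

```python
import numpy as np

def largest_normalized_eigenvalue(M, N, rng):
    """Largest eigenvalue of (1/N) Y Y^t for Y with iid N(0, 1) entries (assumed model)."""
    Y = rng.standard_normal((M, N))
    # eigvalsh returns ascending eigenvalues, so the last one is the largest
    return float(np.linalg.eigvalsh(Y @ Y.T / N)[-1])

rng = np.random.default_rng(1)
M, N = 200, 400
rho = M / N
b = (1.0 + np.sqrt(rho)) ** 2  # right edge of the Marchenko-Pastur support

top = largest_normalized_eigenvalue(M, N, rng)
print(f"largest eigenvalue {top:.3f}, b(rho) = {b:.3f}, M^(-2/3) = {M ** (-2 / 3):.3f}")
```

The gap between `top` and `b` should be a modest multiple of $M^{-2/3}$, much smaller than the $M^{-1/2}$ scale one would expect from independent summands.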

  33. $F_{TW}$: C. Tracy, H. Widom (1994) distribution
     (complex) $F_{TW}(s) = \exp\left( - \int_s^{\infty} (x - s)\, u(x)^2 \, dx \right)$, $s \in \mathbb{R}$
     $u'' = 2u^3 + xu$: Painlevé II equation
     [figure: density of $F_{TW}$]

  34. [figure: density of $F_{TW}$] (similar for real case)
     mean $\simeq -1.77$
     $F_{TW}(s) \sim e^{-|s|^3/12}$ as $s \to -\infty$
     $1 - F_{TW}(s) \sim e^{-4 s^{3/2}/3}$ as $s \to +\infty$

  35. extremal eigenvalues
     largest eigenvalue $\lambda_M^N = \max_{1 \le k \le M} \lambda_k^N$
     $\hat\lambda_M^N = \frac{\lambda_M^N}{N} \to b(\rho) = \left(1 + \sqrt{\rho}\right)^2$, $M \sim \rho N$
     fluctuations around $b(\rho)$
     complex or real Gaussian (Wishart matrices):
     $M^{2/3} \left[ \hat\lambda_M^N - b(\rho) \right] \to C(\rho)\, F_{TW}$
     $F_{TW}$: C. Tracy, H. Widom (1994) distribution
     K. Johansson (2000), I. Johnstone (2001)

  40. Gaussian (Wishart matrices)
     completely solvable models
     determinantal structure
     orthogonal polynomial analysis
     asymptotics of Laguerre orthogonal polynomials
     C. Tracy, H. Widom (1994)
     K. Johansson (2000), I. Johnstone (2001)

  44. extension to non-Gaussian matrices
     A. Soshnikov (2001-02): moment method, $E\left( \operatorname{Tr}\left( (Y Y^t)^p \right) \right)$
     L. Erdős, H.-T. Yau (2009-12) (and collaborators): local Marchenko-Pastur law
     T. Tao, V. Vu (2010-11): Lindeberg comparison method
     symmetric matrices
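
To give a feel for the trace moments $E(\operatorname{Tr}((Y Y^t)^p))$, the sketch below compares normalized empirical moments $\frac{1}{M} \operatorname{Tr}\big( (\frac{1}{N} Y Y^t)^p \big)$ with the moments of the Marchenko-Pastur law, which are given by the Narayana-number formula $\sum_{k=0}^{p-1} \frac{1}{k+1} \binom{p}{k} \binom{p-1}{k} \rho^k$. This only illustrates fixed small $p$ under an assumed real Gaussian model; the edge results quoted on the slide rely on moments of much higher order and on combinatorics that this sketch does not reproduce. The function names are hypothetical.

```python
import numpy as np
from math import comb

def mp_moment(p, rho):
    """p-th moment of the Marchenko-Pastur law with ratio rho (Narayana numbers)."""
    return sum(comb(p, k) * comb(p - 1, k) / (k + 1) * rho**k for k in range(p))

def empirical_moment(p, M, N, rng):
    """(1/M) Tr(((1/N) Y Y^t)^p) for one sample with iid N(0, 1) entries (assumed model)."""
    Y = rng.standard_normal((M, N))
    lam_hat = np.linalg.eigvalsh(Y @ Y.T / N)
    # The trace of a power is the sum of the powers of the eigenvalues.
    return float(np.mean(lam_hat**p))

rng = np.random.default_rng(3)
M, N = 200, 400
for p in (1, 2, 3):
    print(p, empirical_moment(p, M, N, rng), mp_moment(p, M / N))
```

For $p = 2$ the limit is $1 + \rho$, and for $p = 3$ it is $1 + 3\rho + \rho^2$.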

  49. (brief) survey of recent approaches to non-asymptotic exponential inequalities
     quantify the limit theorems
     spectral measure
     extremal eigenvalues: catch the new rate $(\text{mean})^{1/3}$
     from the Gaussian case to non-Gaussian models

  51. two main questions and objectives
     tail inequalities for the spectral measure:
     $P\left( \sum_{k=1}^{M} f(\hat\lambda_k^N) \ge t \right)$
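
Before any inequality is proved, the tail probability in question can be estimated by simulation. The sketch below is a Monte Carlo illustration only, assuming real Gaussian entries and NumPy. Centering the statistic at its empirical mean is a choice made for this sketch, and the helper names, sample sizes and thresholds are hypothetical.

```python
import numpy as np

def linear_statistic_samples(f, M, N, n_samples, rng):
    """Independent samples of sum_k f(lam_hat_k) for Y with iid N(0, 1) entries (assumed model)."""
    out = np.empty(n_samples)
    for i in range(n_samples):
        Y = rng.standard_normal((M, N))
        lam_hat = np.linalg.eigvalsh(Y @ Y.T / N)
        out[i] = np.sum(f(lam_hat))
    return out

def empirical_tail(samples, t):
    """Fraction of samples whose deviation from the empirical mean is at least t."""
    centered = samples - samples.mean()
    return float(np.mean(centered >= t))

rng = np.random.default_rng(2)
samples = linear_statistic_samples(lambda x: x, M=40, N=80, n_samples=300, rng=rng)
thresholds = (0.0, 1.0, 2.0, 4.0)
tails = [empirical_tail(samples, t) for t in thresholds]
for t, p in zip(thresholds, tails):
    print(f"t = {t:.1f}: empirical tail {p:.3f}")
```

A simulation of this size cannot resolve very small probabilities, which is one reason explicit non-asymptotic bounds are of interest.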

