
Ultra-high dimensional statistics and statistical learning on some applications

Ultra-high dimensional statistics and statistical learning on some applications. Dominique Picard, Université Paris-Diderot, Laboratoire Probabilités et Modèles Aléatoires. Outline: Examples, Linear model, Mathematical ingredients, LOL algo.


  1. Ultra-high dimensional statistics and statistical learning on some applications. Dominique Picard, Université Paris-Diderot, Laboratoire Probabilités et Modèles Aléatoires. M2MO: 20 years!

  2. Plan: Examples, Linear model, Mathematical ingredients, LOL algo

  3. Example 1: prediction of electrical consumption. [Figure: Signal-Prediction. Original signal vs. model for 20071220 (354), with 12 coeff (Group 9); annotation 0.0010%.] M. Mougeot, K. Tribouley, Laurence Maillard, V. Lefieux, D.P.

  4. Examples of days (worst) (2009 07 14). [Figure: original signal vs. model for 20090714 (2387); mape 0.06769.]

  5. Example 3: genomic

  6. Example 4: Estimate a probability density on the sphere

  7. Example 5: CMB

  8. CMB: mask

  9. High frequency signal FBUND. [Figure: FBund, 20091207, against trading time.] E. Bacry

  10. High frequency signal FBUND. [Figure: FBund, 20091208, against trading time.] E. Bacry

  11. High frequency signal FBOBL. [Figure: FBobl, 20091207, against trading time.] E. Bacry

  12. High frequency signal FBOBL. [Figure: FBobl, 20091208, against trading time.] E. Bacry

  13. Model

  14. Linear Model. Observation: $Y = (Y_1, \dots, Y_n)^t$, with $Y = \Phi\alpha + \epsilon$.
      • $\alpha \in \mathbb{R}^p$ is the unknown parameter (to be estimated).
      • $\epsilon = (\epsilon_1, \dots, \epsilon_n)^t$ is a (non-observed) vector of random errors, assumed i.i.d. $N(0, \sigma^2)$.
      • $\Phi$ is a known $n \times p$ matrix.
      High dimension: $p \gg n$.
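
To make the setting concrete, here is a minimal simulation sketch of the model $Y = \Phi\alpha + \epsilon$ with $p \gg n$. The Gaussian design follows the "large random matrices" case mentioned in the slides; the function name `simulate_linear_model`, the sizes, the noise level and the way the nonzero coefficients are drawn are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def simulate_linear_model(n, p, sparsity, sigma, seed=0):
    """Draw (Y, Phi, alpha) from Y = Phi @ alpha + eps, eps i.i.d. N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    # Known design matrix; i.i.d. N(0, 1) entries (the random-matrix case).
    Phi = rng.standard_normal((n, p))
    # Unknown parameter: only `sparsity` coordinates are nonzero (assumption).
    alpha = np.zeros(p)
    support = rng.choice(p, size=sparsity, replace=False)
    alpha[support] = rng.uniform(1.0, 2.0, size=sparsity)
    # Non-observed noise vector.
    eps = sigma * rng.standard_normal(n)
    return Phi @ alpha + eps, Phi, alpha

# High dimension: many more unknowns than observations.
Y, Phi, alpha = simulate_linear_model(n=50, p=1000, sparsity=5, sigma=0.1)
print(Y.shape, Phi.shape, np.count_nonzero(alpha))
print("rank of Phi:", np.linalg.matrix_rank(Phi))
```

Since the rank of $\Phi$ is at most $n < p$, ordinary least squares has no unique solution here, which is why extra conditions on $\alpha$ and $\Phi$ are needed.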

  15. Example: genomic. [Display of the response vector $Y$, with entries 1 and 0, and of the matrix $\Phi$; the entries of $\Phi$ are not recoverable.]

  16. Large random matrices: $\Phi$ is composed of $n \times p$ i.i.d. $N(0, 1)$ random variables.

  17. Signal denoising. [Figure: $Y$ = FBund 20091207 against trading time.] What is $\Phi$ in this case?

  18. Statistical learning, regression estimation: $Y_i = f(X_i) + \epsilon_i + u_i$, $i = 1, \dots, n$.
      • The $\epsilon_i$'s are i.i.d. $N(0, 1)$.
      • The $u_i$'s may or may not be random and need not be i.i.d., but they are 'small'.
      • The $X_i$'s are random, i.i.d., taking values in a compact set of $\mathbb{R}^d$.
      • $f$ is the parameter to be estimated.

  19. To embed this problem in a linear model, we consider a dictionary $\mathcal{D}$ of size $p$, made of real functions defined on $\mathbb{R}^d$. We assume that $f$ can be 'reasonably' well approximated by the dictionary functions $\{g \in \mathcal{D}\}$, i.e. there exist coefficients $\alpha_g$ such that
      $$f = \sum_{g \in \mathcal{D}} \alpha_g\, g + h,$$
      where $h$ is 'small'. Then the model writes
      $$Y_i = \sum_{g \in \mathcal{D}} \alpha_g\, g(X_i) + h(X_i) + \epsilon_i, \quad i = 1, \dots, n,$$
      that is, $Y = \Phi\alpha + u + \epsilon$, if we put $u_i = h(X_i)$ for $i = 1, \dots, n$, $\Phi$ being the matrix with general term $\Phi_{i\ell} = g_\ell(X_i)$.
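
The construction $\Phi_{i\ell} = g_\ell(X_i)$ can be sketched as follows. This is only an illustration in dimension $d = 1$: the cosine dictionary on $[0, 1]$, the uniform design and the helper names `cosine_dictionary` and `design_matrix` are assumptions chosen for the example, not the dictionary used in the talk.

```python
import numpy as np

def cosine_dictionary(p):
    """Return p functions on [0, 1]: a constant and cosines (illustrative choice)."""
    def make(l):
        if l == 0:
            return lambda x: np.ones_like(x)
        return lambda x: np.sqrt(2.0) * np.cos(np.pi * l * x)
    return [make(l) for l in range(p)]

def design_matrix(X, dictionary):
    """Phi[i, l] = g_l(X_i): one row per observation, one column per dictionary function."""
    return np.column_stack([g(X) for g in dictionary])

rng = np.random.default_rng(0)
n, p = 40, 200                       # dictionary much larger than the sample
X = rng.uniform(0.0, 1.0, size=n)    # random i.i.d. design in a compact set
Phi = design_matrix(X, cosine_dictionary(p))
print(Phi.shape)
```

With this matrix in hand, the regression problem has exactly the form of the linear model, with one column per dictionary function.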

  20. Associated problems. $Y = \Phi\alpha + u + \epsilon$, with $n$ observations: $Y$ ($n \times 1$), $\Phi$ ($n \times p$).
      ◮ Estimation: determine $\hat\alpha$.
      ◮ Selection: find the significant coefficients, $\hat\alpha^* = \hat\alpha\, \mathbf{1}_{\{|\hat\alpha| > T\}}$.
      ◮ Prediction: $\hat Y = \Phi\hat\alpha$.
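
The three tasks can be chained in a few lines. The sketch below is not the LOL algorithm: the estimation step uses minimum-norm least squares purely as a stand-in, and the threshold `T`, the sizes and the function names are hypothetical. Only the selection rule $\hat\alpha\,\mathbf{1}_{\{|\hat\alpha| > T\}}$ and the prediction $\Phi\hat\alpha$ follow the slide.

```python
import numpy as np

def estimate_min_norm(Phi, Y):
    """Stand-in estimator (assumption): minimum-norm least-squares solution."""
    alpha_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return alpha_hat

def select(alpha_hat, T):
    """Keep the coefficients whose absolute value exceeds T, zero out the rest."""
    return np.where(np.abs(alpha_hat) > T, alpha_hat, 0.0)

def predict(Phi, alpha):
    """Y_hat = Phi @ alpha."""
    return Phi @ alpha

rng = np.random.default_rng(1)
n, p = 30, 120
Phi = rng.standard_normal((n, p))
alpha = np.zeros(p)
alpha[[3, 17, 58]] = [4.0, -3.0, 5.0]
Y = Phi @ alpha + 0.1 * rng.standard_normal(n)

alpha_hat = estimate_min_norm(Phi, Y)   # estimation
T = 0.5                                 # hypothetical threshold
alpha_star = select(alpha_hat, T)       # selection
Y_hat = predict(Phi, alpha_star)        # prediction with the selected coefficients
print(np.count_nonzero(alpha_star), "coefficients kept out of", p)
```

Without thresholding, the minimum-norm solution reproduces $Y$ exactly when $p > n$ and $\Phi$ has full row rank, so it fits the noise as well; the selection step is what introduces parsimony.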

  21. Conditions generally required to solve the problem:
      • 'sparsity' of the vector $\alpha$
      • good approximation of the 'true function' by the dictionary
      • 'coherence' conditions on the matrix $\Phi$

  22. Approximation: $u_i = h(X_i) = f(X_i) - \sum_{g \in \mathcal{D}} \alpha_g\, g(X_i)$. Asking the $u_i$'s to be small means that $f$ is well approximated by a linear combination of the dictionary functions.

  23. Sparsity conditions: what does it mean to be sparse?

  26. Sparsity conditions
      • $\{\alpha_\ell\}_{\ell \le p}$ is $S$-sparse
      • Strict sparsity: $\#\{\ell \in \{1, \dots, p\},\ |\alpha_\ell| \neq 0\} \le S$
      • More generally: $\sum_\ell |\alpha_\ell|^q \le M$, $0 < q < 1$
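
Both notions of sparsity are easy to compute for a given coefficient vector. The sketch below uses hypothetical helper names (`strict_sparsity`, `lq_sum`) and made-up vectors; it shows that a sequence with no zero entry can still satisfy the weaker $\ell_q$ condition when its coefficients decay fast enough.

```python
import numpy as np

def strict_sparsity(alpha):
    """Number of nonzero coefficients, #{l : alpha_l != 0}."""
    return int(np.count_nonzero(alpha))

def lq_sum(alpha, q):
    """sum_l |alpha_l|^q for 0 < q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie in (0, 1)")
    return float(np.sum(np.abs(alpha) ** q))

# A strictly sparse vector: 3 nonzero entries out of 5.
alpha = np.array([4.0, 0.0, -1.0, 0.0, 0.25])
print(strict_sparsity(alpha), lq_sum(alpha, q=0.5))

# A vector with no zero entry but fast decay, alpha_l = l^(-2).
decaying = 1.0 / np.arange(1, 1001, dtype=float) ** 2
print(strict_sparsity(decaying), lq_sum(decaying, q=0.75))
```

For the decaying vector the sum is $\sum_\ell \ell^{-1.5}$, which stays bounded as $p$ grows, so the $\ell_q$ condition holds with a fixed $M$ even though every coefficient is nonzero.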

  27. The dictionary problem. Of course sparsity is linked with the dictionary.
      • Fourier basis
      • Wavelet basis
      • Needlets
      • Combination of 'bases'

  28. Fourier basis. [Figure: dictionary functions 1, 4 and 30.]

  29. Haar wavelets. [Figure: dictionary functions 48 to 62.]

  30. Functions defined on the sphere

  31. Spherical harmonics on the sphere. [Figure: surface plot over THETA and PHI.]

  32. Needlets on the sphere (Petrushev and co-authors). [Figure: surface plot over PHI and THETA.]

  33. Needlets associated to Jacobi polynomials on [0,1] (Petrushev and co-authors). [Figure: plots of needlets.]

  34. Sparsity conditions and functional approximation spaces. In the wavelet and needlet cases, Besov spaces are especially well suited to reflect sparsity conditions. More complex: how can sparsity conditions for combinations of bases be translated in terms of function spaces? Petrushev, Narcowich, Ward, Xu, Kyriazis; Coulhon, Kerkyacharian, Petrushev

  35. Conditions generally required to solve the problem:
      • 'sparsity' of the vector $\alpha$
      • good approximation of the 'true function' by the dictionary
      • 'coherence' conditions on the matrix $\Phi$

  36. RIP-Coherence. The columns of $\Phi$ are assumed to be normalized. For $C \subset \{1, \dots, p\}$, denote by $\Phi_C$ the matrix $\Phi$ restricted to the columns indexed by $C$, with associated Gram matrix
      $$M(C) := \frac{1}{n} \Phi_C^t \Phi_C.$$
      RIP($m_0$, $\nu$) assumes that $M(C)$ is almost diagonal for any $C$ with $\#(C) \le m_0$, in the following sense: there exist $0 \le \nu < 1$ and $m_0 \ge 1$ such that, with $m = \#(C)$,
      $$\forall x \in \mathbb{R}^m, \quad \|x\|^2_{l_2(m)} (1 - \nu) \le x^t M(C)\, x \le \|x\|^2_{l_2(m)} (1 + \nu).$$
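
The RIP inequality says that every eigenvalue of $M(C)$ lies in $[1 - \nu, 1 + \nu]$. Checking all subsets $C$ is not feasible for large $p$, so the sketch below only samples random subsets and therefore yields a lower bound on the true constant. The normalization $\frac{1}{n}\sum_i \Phi_{i\ell}^2 = 1$, the function names and the sizes are assumptions made for the example.

```python
import numpy as np

def normalize_columns(Phi):
    """Rescale each column so that (1/n) * sum_i Phi[i, l]^2 = 1 (assumed normalization)."""
    n = Phi.shape[0]
    return Phi * np.sqrt(n) / np.linalg.norm(Phi, axis=0)

def empirical_rip_constant(Phi, m0, n_subsets=200, seed=0):
    """Largest eigenvalue deviation of M(C) from 1 over random subsets C with #C = m0.

    Random sampling only gives a lower bound on the true RIP constant.
    """
    rng = np.random.default_rng(seed)
    n, p = Phi.shape
    worst = 0.0
    for _ in range(n_subsets):
        C = rng.choice(p, size=m0, replace=False)
        M_C = Phi[:, C].T @ Phi[:, C] / n      # Gram matrix of the selected columns
        eig = np.linalg.eigvalsh(M_C)          # eigenvalues in ascending order
        worst = max(worst, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return worst

rng = np.random.default_rng(0)
Phi = normalize_columns(rng.standard_normal((200, 1000)))
nu_hat = empirical_rip_constant(Phi, m0=5)
print("empirical lower bound on nu for m0 = 5:", nu_hat)
```

A value well below 1 on the sampled subsets is encouraging but does not prove the property; a certified bound requires an argument that covers all subsets at once.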

  37. Coherence. Introduce the $p \times p$ Gram matrix $M := \frac{1}{n} \Phi^t \Phi$ and the coherence
      $$\tau_n = \sup_{\ell \neq m} |M_{\ell m}| = \sup_{\ell \neq m} \Big| \frac{1}{n} \sum_{i=1}^n \Phi_{i\ell} \Phi_{im} \Big|.$$
      Coherence $\Longrightarrow$ RIP($\lfloor \nu / \tau_n \rfloor$, $\nu$).
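
Unlike the RIP constant, the coherence can be computed directly from the Gram matrix. The sketch below computes $\tau_n$ and the order $\lfloor \nu / \tau_n \rfloor$ given by the implication above; the column normalization $\frac{1}{n}\sum_i \Phi_{i\ell}^2 = 1$, the function names and the choice $\nu = 0.5$ are assumptions for the example.

```python
import numpy as np

def coherence(Phi):
    """tau_n = max over l != m of |(1/n) sum_i Phi[i, l] Phi[i, m]|, columns normalized first."""
    n = Phi.shape[0]
    Phi = Phi * np.sqrt(n) / np.linalg.norm(Phi, axis=0)   # diagonal of M becomes 1
    M = Phi.T @ Phi / n                                    # p x p Gram matrix
    off_diagonal = np.abs(M - np.diag(np.diag(M)))
    return float(off_diagonal.max())

def rip_order_from_coherence(tau, nu):
    """m0 = floor(nu / tau), the subset size up to which RIP(m0, nu) is guaranteed."""
    return int(np.floor(nu / tau))

rng = np.random.default_rng(0)
Phi = rng.standard_normal((400, 800))
tau = coherence(Phi)
m0 = rip_order_from_coherence(tau, nu=0.5)
print("coherence:", tau, "guaranteed RIP order for nu = 0.5:", m0)
```

The implication is consistent with Gershgorin's circle theorem: with a unit diagonal, the eigenvalues of $M(C)$ for $\#(C) = m$ lie within $(m - 1)\tau_n$ of 1, which is below $\nu$ as soon as $m \le \nu / \tau_n$. The guarantee is conservative, so the order it gives is typically small for random designs.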
