the multiresolution criterion and nonparametric regression


  1. The multiresolution criterion and nonparametric regression. Thoralf Mildenberger and Henrike Weinert; joint work with P. L. Davies and U. Gather. SFB 475, Fakultät Statistik, Technische Universität Dortmund. Workshop on current trends and challenges in model selection and related areas, Vienna, July 2008

  2. Outline: Nonparametric Regression; Choosing the smoothing parameter; Simulation Study; The multiresolution norm; Geometric Interpretation; The MR-norm and $\ell_p$-norms


  4. Nonparametric Regression. Model: $y(t_i) = f(t_i) + \varepsilon(t_i)$, $0 \le t_1 < \dots < t_N \le 1$, with $\varepsilon(t_1), \dots, \varepsilon(t_N) \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Goal: Find an estimate $\hat f$ of $f$. Problem: $\hat f$ is usually chosen from a family $(\hat f_h)$ indexed by a smoothing parameter $h$ (bandwidth, size of a partition, penalty etc.). Interpretation: Often $h$ corresponds to the 'complexity' of $\hat f_h$.
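The model can be illustrated by simulating data from it. The following is a minimal sketch, assuming an equispaced design $t_i = i/N$ (the slide only requires ordered design points in $[0, 1]$); the function name `simulate` and the sine test signal are illustrative choices.

```python
import math
import random


def simulate(f, n, sigma, seed=0):
    """Draw y(t_i) = f(t_i) + eps(t_i) with iid N(0, sigma^2) noise.

    Assumption: equispaced design t_i = i / n for i = 1, ..., n.
    """
    rng = random.Random(seed)
    t = [i / n for i in range(1, n + 1)]
    y = [f(ti) + rng.gauss(0.0, sigma) for ti in t]
    return t, y


# Example: a noisy sine curve observed at 64 design points.
t, y = simulate(lambda u: math.sin(2 * math.pi * u), n=64, sigma=0.5)
```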

  5. Choosing the smoothing parameter. Risk-based choice: choose $h$ such that $\hat f_h$ minimizes a risk (e.g. MSE, MISE). The risk has to be estimated from the data, e.g. by asymptotic considerations, plug-in methods, penalized criteria, CV, risk bounds etc. Residual-based choice: Given the data, find the simplest model that 'could have generated' the data, i.e. whose residuals 'look like noise', e.g. the Taut-String Algorithm (Davies and Kovac 2001).


  7. The Multiresolution Criterion. Given some estimate $\hat f$, consider the residuals $r_i := r(t_i) := y(t_i) - \hat f(t_i)$. Accept the residuals as noise iff $\max_{I \in \mathcal{I}} \frac{1}{\sqrt{|I|}} \left| \sum_{i \in I} r_i \right| \le \sigma C$ $(*)$, where $\mathcal{I}$ is the system of all intervals in $\{1, \dots, N\}$. Choose the estimate of smallest complexity such that $(*)$ is fulfilled.
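The criterion and the "smallest complexity" rule can be sketched in a few lines. This is only an illustration under stated assumptions: the statistic is the maximum over all intervals of the absolute residual sum divided by the square root of the interval length, `sigma` and `C` are supplied by the caller (the slides do not fix a value for $C$), and the family of piecewise-constant fits `regressogram` is a hypothetical stand-in for a complexity-ordered family, not one of the methods from the talk.

```python
import math


def mr_statistic(r):
    """Max over all intervals I of |sum_{i in I} r_i| / sqrt(|I|), in O(N^2)."""
    prefix = [0.0]
    for x in r:
        prefix.append(prefix[-1] + x)
    best = 0.0
    for a in range(len(r)):
        for b in range(a + 1, len(r) + 1):
            best = max(best, abs(prefix[b] - prefix[a]) / math.sqrt(b - a))
    return best


def regressogram(y, k):
    """Hypothetical family: mean of y on k contiguous bins (requires k <= len(y))."""
    n = len(y)
    fit = [0.0] * n
    for j in range(k):
        lo, hi = j * n // k, (j + 1) * n // k
        mean = sum(y[lo:hi]) / (hi - lo)
        fit[lo:hi] = [mean] * (hi - lo)
    return fit


def simplest_fit(y, sigma, C):
    """Return the smallest k whose residuals satisfy the criterion, with the fit."""
    for k in range(1, len(y) + 1):
        fit = regressogram(y, k)
        residuals = [yi - fi for yi, fi in zip(y, fit)]
        if mr_statistic(residuals) <= sigma * C:
            return k, fit
    raise ValueError("no fit accepted; sigma * C must be non-negative")


# A noiseless step needs two bins before the residuals are accepted.
step = [0.0] * 8 + [10.0] * 8
k, fit = simplest_fit(step, sigma=1.0, C=2.0)
```

Scanning all intervals costs $O(N^2)$; this is fine for a sketch but would need a sparser interval system for large $N$.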

  8. Residual based methods MR criterion has been combined with different measures of complexity : ◮ Number of local extrema or total variation (Taut-String-Algorithm, Davies and Kovac 2001) ◮ Number of changes between convexity and concavity (Davies, Kovac and Meise 2008) ◮ Smoothness quantified by derivatives (Weighted Smoothing Splines, Davies and Meise 2008) ◮ Number of jumps (Potts smoother, Boysen et al. 2008)

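  9. Taut String Method. Summed process: $y_n^\circ(t) = \frac{1}{n} \sum_{t_i \le t} y(t_i)$. Tube $T\left(y_n^\circ, \frac{C}{\sqrt{n}}\right)$: $y_n^\circ(t) - \frac{C}{\sqrt{n}} \le g(t) \le y_n^\circ(t) + \frac{C}{\sqrt{n}}$. String $S_n$: has smallest length, $\mathrm{length}(S_n) = \int_0^1 \sqrt{1 + s_n(t)^2}\, dt$, with $s_n$ the derivative of $S_n$. Derivative of $S_n$: candidate for $\hat f$. Check if the MR criterion is fulfilled; if not: local squeezing of the tube.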
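The first two ingredients of the method, the summed process and the tube around it, are easy to compute. The sketch below assumes the summed process is the running sum of the observations divided by $n$ and that the tube has half-width $C/\sqrt{n}$; it does not compute the string itself or the local squeezing step, and the function name `tube` is illustrative.

```python
import math


def tube(y, C):
    """Return the summed process and the lower/upper tube boundaries.

    Assumptions: summed process = cumulative sum of y divided by n,
    tube half-width = C / sqrt(n), evaluated at the design points only.
    """
    n = len(y)
    summed, total = [], 0.0
    for value in y:
        total += value
        summed.append(total / n)
    half_width = C / math.sqrt(n)
    lower = [s - half_width for s in summed]
    upper = [s + half_width for s in summed]
    return summed, lower, upper


summed, lower, upper = tube([1.0, 1.0, 1.0, 1.0], C=1.0)
```

Any candidate function $g$ for the string has to stay between `lower` and `upper`; shrinking the half-width locally forces the string closer to the data there.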

  10. Simulation Study (Davies, Gather, Weinert, 2008) ◮ Wavelet-Thresholding (Donoho and Johnstone, 1994) → hard and soft thresholding [H,S] ◮ Unbalanced Haar (Fryzlewicz, 2006) [U] ◮ Minimum-Description-Length (Rissanen, 2000) [M] ◮ Adaptive weights smoothing (Polzehl and Spokoiny, 2003) [A] ◮ Local Plug-in kernel method (Herrmann, 1997) [P] ◮ Taut-string (Davies and Kovac, 2001) [T,V]

  11. Simulation Study [Figure: the six test-bed functions $f$ plotted against $t \in [0, 1]$: Doppler, Bumps, Heavisine, Blocks, Sine, Constant Signal]


  14. Simulation Study. 6 test-bed functions, 4 $\sigma$-values, 5 sample sizes $n$; 1000 simulations at each combination of test-bed function, $\sigma$ and $n$. Mean for 3 performance criteria: $L_\infty$-norm: $\ell(f, \hat f) = \max_{1 \le i \le n} \left| f\left(\frac{i}{n}\right) - \hat f\left(\frac{i}{n}\right) \right|$. $L_2$-norm: $\ell(f, \hat f) = \frac{1}{n} \sum_{i=1}^{n} \left( f\left(\frac{i}{n}\right) - \hat f\left(\frac{i}{n}\right) \right)^2$. Peak-identification loss: $\ell(f, \hat f)$ = number of unidentified extremes of $f$ + number of superfluous extremes of $\hat f$ → overall error in identifying extremes of the true $f$ with extremes of $\hat f$.
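The first two criteria are straightforward to evaluate once $f$ and $\hat f$ are tabulated on the grid $i/n$. A minimal sketch follows, with illustrative function names and the $L_2$ loss taken as the mean squared difference (no square root); the peak-identification loss is left out because the slide does not specify how extremes of $f$ and $\hat f$ are matched.

```python
def loss_sup(f_vals, fhat_vals):
    """L-infinity loss: largest absolute difference on the grid."""
    if len(f_vals) != len(fhat_vals):
        raise ValueError("inputs must have the same length")
    return max(abs(a - b) for a, b in zip(f_vals, fhat_vals))


def loss_l2(f_vals, fhat_vals):
    """L2 loss: mean squared difference on the grid."""
    if len(f_vals) != len(fhat_vals):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(f_vals, fhat_vals)) / len(f_vals)


# Values of f and f-hat on a three-point grid.
f_vals = [0.0, 1.0, 2.0]
fhat_vals = [0.0, 3.0, 2.0]
sup_loss = loss_sup(f_vals, fhat_vals)
l2_loss = loss_l2(f_vals, fhat_vals)
```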

  15. Approximations of Doppler-data [Figure: six panels (Wavelet (hard), Unbalanced Haar, MDL, Kernel Plug-in, AWS, Taut String), each showing the fit to the doppler-data against $t$, $n = 1024$]

  16. Approximations of Blocks-data [Figure: six panels (Wavelet (hard), Unbalanced Haar, MDL, Kernel Plug-in, AWS, Taut String), each showing the fit to the blocks-data against $t$, $n = 1024$]

  17. Approximations of a Constant [Figure: six panels (Wavelet (hard), Unbalanced Haar, MDL, Kernel Plug-in, AWS, Taut String), each showing the fit to pure noise against $t$, $n = 1024$]


  19. Average Ranks [Figure: three panels showing the average rank of each method (WH, WS, UH, MDL, PL, AWS, TS, TV; axis labels H, S, U, M, P, A, T, V) under the $L_\infty$-norm, the $L_2$-norm and PID] MR-based TS algorithm performs well.


  21. MR criterion and Nadaraya-Watson kernel regression. Define $r_{t,h} := \frac{\sum_{i=1}^{N} K_h(t_i - t)\, r_i}{\sqrt{\sum_{i=1}^{N} K_h^2(t_i - t)}}$ if $\sum_{i=1}^{N} K_h^2(t_i - t) \neq 0$, and $r_{t,h} := 0$ if $\sum_{i=1}^{N} K_h^2(t_i - t) = 0$, for all $t \in [0, 1]$, $h > 0$, with $K_h(\cdot) := h^{-1} K(h^{-1} \cdot)$ for the uniform kernel $K := I_{[-0.5, 0.5]}$. Then: ◮ $r_1, \dots, r_N \overset{\text{iid}}{\sim} N(0, \sigma^2) \implies r_{t,h} \sim N(0, \sigma^2)$. ◮ MR criterion: $\sup_{t,h} |r_{t,h}| = \max_{I \in \mathcal{I}} \frac{1}{\sqrt{|I|}} \left| \sum_{i \in I} r_i \right|$.
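The link between the kernel-weighted residual and the interval sums can be checked numerically. The sketch below assumes the uniform kernel on $[-0.5, 0.5]$ with $K_h(u) = K(u/h)/h$, so that every design point within $h/2$ of $t$ receives the same weight $1/h$; the function name `r_th` and the sample values are illustrative.

```python
import math


def r_th(t_pts, r, t, h):
    """Kernel-weighted residual sum, normalised by the root of the squared weights."""
    # Uniform kernel: weight 1/h if |t_i - t| <= h/2, else 0.
    w = [1.0 / h if abs(ti - t) <= h / 2 else 0.0 for ti in t_pts]
    denom = math.sqrt(sum(wi * wi for wi in w))
    if denom == 0.0:
        return 0.0  # no design point inside the window
    return sum(wi * ri for wi, ri in zip(w, r)) / denom


t_pts = [0.1, 0.2, 0.3, 0.4]
r = [1.0, 2.0, 3.0, 4.0]

# The window around t = 0.25 with h = 0.12 covers the points 0.2 and 0.3,
# so the value equals (r_2 + r_3) / sqrt(2).
value = r_th(t_pts, r, t=0.25, h=0.12)
interval_value = (r[1] + r[2]) / math.sqrt(2)
```

Because the weights inside the window are all equal, the factor $1/h$ cancels and only the sum over the covered interval and its length remain, which is why the supremum over $(t, h)$ reduces to a maximum over intervals.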


  23. The Multiresolution Norm (Mildenberger 2008). Consider the data $(y_1, \dots, y_N)$, the estimate $(\hat f_1, \dots, \hat f_N)$ and the residuals $(r_1, \dots, r_N)$ as vectors in $\mathbb{R}^N$ with the multiresolution norm $\|(x_1, \dots, x_N)\|_{MR} := \max_{I \in \mathcal{I}} \frac{1}{\sqrt{|I|}} \left| \sum_{t \in I} x_t \right|$. Then: the multiresolution criterion is fulfilled $\iff \|y - \hat f\|_{MR} \le \sigma C$, i.e. $\hat f$ is contained in the MR-ball of radius $\sigma C$ centered at $y$ or (equivalently) the residuals $r = y - \hat f$ lie in the MR-ball of radius $\sigma C$ around zero.

  24. Multiresolution Norm Unit Ball in $\mathbb{R}^2$ [Figure: unit ball of the MR-norm in the plane, both axes running from -1.5 to 1.5]
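For $N = 2$ the interval system consists of $\{1\}$, $\{2\}$ and $\{1, 2\}$, so the multiresolution norm (maximum over intervals of the absolute sum divided by the square root of the interval length) can be written out explicitly. The following is a worked instance of that definition, not taken from the slides:

```latex
\[
  \|(x_1, x_2)\|_{MR}
  = \max\left\{ |x_1|,\; |x_2|,\; \tfrac{1}{\sqrt{2}}\,|x_1 + x_2| \right\},
\]
\[
  \{ x \in \mathbb{R}^2 : \|x\|_{MR} \le 1 \}
  = \{ |x_1| \le 1 \} \cap \{ |x_2| \le 1 \} \cap \{ |x_1 + x_2| \le \sqrt{2} \}.
\]
```

The unit ball is therefore the square $[-1, 1]^2$ with the corners at $(1, 1)$ and $(-1, -1)$ cut off by the lines $x_1 + x_2 = \pm\sqrt{2}$, i.e. a hexagon with vertices $(1, -1)$, $(1, \sqrt{2} - 1)$, $(\sqrt{2} - 1, 1)$, $(-1, 1)$, $(-1, 1 - \sqrt{2})$ and $(1 - \sqrt{2}, -1)$.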


  26. $\ell_p$-Norms. $\|(x_1, \dots, x_N)\|_p = \left( \sum_{t=1}^{N} |x_t|^p \right)^{1/p}$ $(1 \le p < \infty)$, $\|(x_1, \dots, x_N)\|_\infty = \max\{|x_1|, \dots, |x_N|\}$ invariant w.r.t.:
