
On the Distribution of the Adaptive LASSO Estimator (U. Schneider)



  1. On the Distribution of the Adaptive LASSO Estimator

     U. Schneider (joint with B. M. Pötscher), Universität Wien

     Workshop on Current Trends and Challenges in Model Selection, Vienna, July 24, 2008

     Outline: Introduction, Adaptive LASSO, Consistency, Distributions, Other PMLEs, Simulations, CDF Estimation, Conclusion

  2. Penalized ML Estimators

     Linear regression model $y = X\theta + u$; consider an estimator $\hat\theta$ for $\theta$:

     $$\hat\theta = \arg\min_{\theta \in \mathbb{R}^k} \underbrace{\|y - X\theta\|^2}_{\text{likelihood (LS) part}} + \underbrace{\lambda_n\, p(\theta)}_{\text{penalty}}$$

     where $\lambda_n$ is a tuning parameter.

     - Bridge estimators ($l_p$-type penalties; Frank and Friedman, 1993), with the LASSO as the case $p = 1$ (Tibshirani, 1996).
     - Hard- and soft-thresholding estimators.
     - Smoothly clipped absolute deviation (SCAD) estimator (Fan and Li, 2001).
     - Adaptive LASSO estimator (Zou, 2006).

     These estimators can be viewed as simultaneously performing model selection and parameter estimation ($p \le 1$ for Bridge estimators).

  3. Some terminology

     - Conservative model selection: zero coefficients are found with asymptotic probability less than 1.
     - Consistent model selection: zero coefficients are found with asymptotic probability equal to 1.
     - Oracle property: the asymptotic distribution coincides with that of the unpenalized estimator of the true model.

     In our context, whether model selection is consistent or conservative is driven by the asymptotic choice of the tuning parameter $\lambda_n$ ("sparsely" vs. "non-sparsely" tuned procedures).

  4. Some literature on distributional properties of PMLEs

     - Knight and Fu, 2000. Moving-parameter asymptotics for (non-sparsely tuned) LASSO and Bridge estimators in general.
     - Fan and Li, 2001. Fixed-parameter asymptotics for SCAD.
     - Zou, 2006. Fixed-parameter asymptotics for LASSO and adaptive LASSO.
     - Pötscher and Leeb, 2007. Finite-sample distribution, moving-parameter asymptotics for hard-thresholding, LASSO, and SCAD. Impossibility result for the estimation of the cdf.
     - Pötscher and Schneider, 2007. Analogous results for the adaptive LASSO.
     - Pötscher and Schneider, 2008. Finite-sample and asymptotic coverage probabilities of confidence sets for hard-thresholding, LASSO, and adaptive LASSO.
     - ...

  5. Definition of the adaptive LASSO estimator $\hat\theta_{AL}$

     Linear regression model $y = X\theta + u$, where $X$ is $n \times k$, non-stochastic, with $\operatorname{rk}(X) = k$, and $u \sim N_n(0, \sigma^2 I_n)$.

     Adaptive LASSO estimator (Zou, 2006), with random penalty weights:

     $$\hat\theta_{AL} = \arg\min_{\theta \in \mathbb{R}^k} \|y - X\theta\|^2 + 2 n \mu_n^2 \sum_{j=1}^{k} |\theta_j| / |\hat\theta_{OLS,j}|, \qquad \mu_n > 0$$

     For the theoretical analysis, assume that $\sigma^2$ is known and that $X'X$ is diagonal, in particular $X'X = n I_k$. These assumptions are removed for the simulation results concerning the finite-sample distribution.

  6. Explicit solution in the simplified model

     Wlog consider the Gaussian location model $y_1, \ldots, y_n \sim N(\theta, 1)$. Then $\hat\theta_{OLS} = \bar y$ and

     $$\hat\theta_{AL} = \begin{cases} 0 & \text{if } |\bar y| \le \mu_n \\ \bar y - \mu_n^2 / \bar y & \text{if } |\bar y| > \mu_n \end{cases}$$

     [Figure: $\hat\theta_{AL}$ plotted against $\bar y$, with the threshold $\mu_n$ marked]
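The closed form in the location model can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names `adaptive_lasso_location` and `objective`, the grid, and the example values are all assumptions chosen for the sketch. The brute-force grid search minimizes the penalized criterion directly (up to a constant that does not depend on $\theta$), so the two results can be compared.

```python
def adaptive_lasso_location(y_bar: float, mu_n: float) -> float:
    """Closed-form adaptive LASSO estimate in the Gaussian location model.

    y_bar is the sample mean (the OLS estimate); mu_n > 0 is the tuning parameter.
    """
    if abs(y_bar) <= mu_n:
        return 0.0                      # thresholded to exactly zero
    return y_bar - mu_n**2 / y_bar      # shrunk toward zero, never across it


def objective(theta: float, y_bar: float, n: int, mu_n: float) -> float:
    """Penalized criterion, dropping the additive term that is free of theta.

    Uses sum_i (y_i - theta)^2 = const + n * (y_bar - theta)^2.
    """
    return n * (y_bar - theta) ** 2 + 2 * n * mu_n**2 * abs(theta) / abs(y_bar)


if __name__ == "__main__":
    n, mu_n = 40, 0.05                  # illustrative values (assumption)
    grid = [i / 10_000 for i in range(-3_000, 3_001)]
    for y_bar in (0.03, 0.10, -0.10):
        closed_form = adaptive_lasso_location(y_bar, mu_n)
        brute_force = min(grid, key=lambda t: objective(t, y_bar, n, mu_n))
        print(f"y_bar={y_bar:+.2f}  closed form={closed_form:+.4f}  grid={brute_force:+.4f}")
```

The estimator sets small sample means to exactly zero and shrinks larger ones by $\mu_n^2 / \bar y$, so the amount of shrinkage decreases as $|\bar y|$ grows.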

  7. Consistency of $\hat\theta_{AL}$

     Estimation consistency: the condition $\mu_n \to 0$ is equivalent to the consistency of $\hat\theta_{AL}$. In that case $\hat\theta_{AL}$ is also uniformly consistent for $\theta$, i.e. for all $\varepsilon > 0$

     $$\lim_{n \to \infty} \sup_{\theta \in \mathbb{R}} P_{n,\theta}\big( |\hat\theta_{AL} - \theta| > \varepsilon \big) = 0$$

     Model selection consistency: two possible regimes arise.

     1. The case $\mu_n \to 0$ and $n^{1/2} \mu_n \to m$, $0 \le m < \infty$, corresponds to conservative model selection (non-sparsely tuned).
     2. The case $\mu_n \to 0$ and $n^{1/2} \mu_n \to \infty$ corresponds to consistent model selection (sparsely tuned).
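The two regimes can be illustrated numerically in the Gaussian location model, where $\bar y \sim N(\theta, 1/n)$ and the estimator equals zero exactly when $|\bar y| \le \mu_n$. The sketch below computes that probability; the function name `prob_zero` and the two tuning sequences ($\mu_n = n^{-1/2}$, i.e. $m = 1$, and $\mu_n = n^{-1/3}$) are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def prob_zero(n: int, theta: float, mu_n: float) -> float:
    """P(theta_hat_AL = 0) = P(|y_bar| <= mu_n) when y_bar ~ N(theta, 1/n)."""
    root_n = sqrt(n)
    return PHI(root_n * (mu_n - theta)) - PHI(root_n * (-mu_n - theta))


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        conservative = prob_zero(n, 0.0, 1.0 / sqrt(n))   # n^{1/2} mu_n = 1 for all n
        consistent = prob_zero(n, 0.0, n ** (-1.0 / 3.0)) # n^{1/2} mu_n -> infinity
        print(f"n={n:>7}  conservative={conservative:.4f}  consistent={consistent:.6f}")
```

With $\theta = 0$, the conservative sequence keeps the probability of detecting the zero fixed at $2\Phi(1) - 1$, strictly below 1, while under the sparsely tuned sequence it increases toward 1.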

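  8. The finite-sample distribution of $\hat\theta_{AL}$

     $F_{n,\theta}(x) = P_{n,\theta}\big( n^{1/2} (\hat\theta_{AL} - \theta) \le x \big)$ is given by

     $$\mathbf{1}(n^{1/2}\theta + x \ge 0)\, \Phi\big( z^{(2)}_{n,\theta}(x) \big) + \mathbf{1}(n^{1/2}\theta + x < 0)\, \Phi\big( z^{(1)}_{n,\theta}(x) \big),$$

     where $z^{(2)}_{n,\theta}(x)$ and $z^{(1)}_{n,\theta}(x)$ are

     $$-(n^{1/2}\theta - x)/2 \pm \sqrt{ \big( (n^{1/2}\theta + x)/2 \big)^2 + n \mu_n^2 },$$

     with the plus sign for $z^{(2)}_{n,\theta}(x)$ and the minus sign for $z^{(1)}_{n,\theta}(x)$.

     $$dF_{n,\theta}(x) = \big\{ \Phi\big( n^{1/2}(-\theta + \mu_n) \big) - \Phi\big( n^{1/2}(-\theta - \mu_n) \big) \big\}\, d\delta_{-n^{1/2}\theta}(x) + 0.5 \times \big\{ \mathbf{1}(n^{1/2}\theta + x > 0)\, \phi\big( z^{(2)}_{n,\theta}(x) \big) \big( 1 + t_{n,\theta}(x) \big) + \mathbf{1}(n^{1/2}\theta + x < 0)\, \phi\big( z^{(1)}_{n,\theta}(x) \big) \big( 1 - t_{n,\theta}(x) \big) \big\}\, dx$$

     where

     $$t_{n,\theta}(x) := \frac{n^{1/2}\theta + x}{2} \Big( \big( (n^{1/2}\theta + x)/2 \big)^2 + n \mu_n^2 \Big)^{-1/2},$$

     $\delta_{-n^{1/2}\theta}$ denotes point mass at $-n^{1/2}\theta$, and $\Phi$ and $\phi$ are the cdf and pdf of $N(0, 1)$, respectively.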
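As a sanity check on the cdf formula, the following sketch evaluates $F_{n,\theta}(x)$ in the Gaussian location model and compares it with a simple Monte Carlo estimate. It is an illustration under stated assumptions rather than the authors' code: the function names, the seed, the number of replications, and the evaluation points are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def finite_sample_cdf(x: float, n: int, theta: float, mu_n: float) -> float:
    """F_{n,theta}(x) = P(n^{1/2} (theta_hat_AL - theta) <= x) in the location model."""
    w = sqrt(n) * theta + x
    centre = -(sqrt(n) * theta - x) / 2.0
    radius = sqrt((w / 2.0) ** 2 + n * mu_n**2)
    # z^(2) (plus sign) applies when w >= 0, z^(1) (minus sign) when w < 0.
    z = centre + radius if w >= 0 else centre - radius
    return PHI(z)


def monte_carlo_cdf(x: float, n: int, theta: float, mu_n: float,
                    reps: int = 100_000, seed: int = 0) -> float:
    """Empirical counterpart: draw y_bar ~ N(theta, 1/n) and apply the closed form."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y_bar = rng.gauss(theta, 1.0 / sqrt(n))
        estimate = 0.0 if abs(y_bar) <= mu_n else y_bar - mu_n**2 / y_bar
        if sqrt(n) * (estimate - theta) <= x:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    n, theta, mu_n = 40, 0.05, 0.05
    for x in (-1.0, 0.0, 0.5):
        exact = finite_sample_cdf(x, n, theta, mu_n)
        simulated = monte_carlo_cdf(x, n, theta, mu_n)
        print(f"x={x:+.1f}  formula={exact:.4f}  Monte Carlo={simulated:.4f}")
```

The cdf jumps at $x = -n^{1/2}\theta$; the size of that jump equals the probability that the estimator is exactly zero, which is the point-mass term in $dF_{n,\theta}$.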

  9. The finite-sample distribution of $\hat\theta_{AL}$

     [Figure: finite-sample distribution for $n = 40$, $\theta = 0.05$, $\mu_n = 0.05$]

  10. Fixed-parameter asymptotics, both regimes

      1. Conservative case. $F_{n,\theta}$ converges weakly to

         $$\begin{cases} \mathbf{1}(x \ge 0)\, \Phi\Big( \tfrac{x}{2} + \sqrt{ (\tfrac{x}{2})^2 + m^2 } \Big) + \mathbf{1}(x < 0)\, \Phi\Big( \tfrac{x}{2} - \sqrt{ (\tfrac{x}{2})^2 + m^2 } \Big) & \theta = 0 \\ \Phi(x) & \theta \ne 0 \end{cases}$$

      2. Consistent case. $F_{n,\theta}$ converges weakly to

         $$\begin{cases} \mathbf{1}(x \ge 0) & \theta = 0 \\ \Phi(x + \rho/\theta) & \theta \ne 0 \text{ and } n^{1/2} \mu_n^2 \to \rho \end{cases}$$

      If $n^{1/4} \mu_n \to 0$, then $F_{n,\theta}(x) \to \Phi(x)$ for $\theta \ne 0$ ("oracle property", Zou, 2006).
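A heuristic sketch of where the shift $\rho/\theta$ comes from when $\theta \ne 0$, using the closed form $\hat\theta_{AL} = \bar y - \mu_n^2/\bar y$ from the Gaussian location model. This is an informal argument for finite $\rho$, offered as a reading aid and not as the authors' proof.

```latex
% On the event |\bar y| > \mu_n, whose probability tends to 1
% when \theta \neq 0 and \mu_n \to 0:
\begin{align*}
n^{1/2}(\hat\theta_{AL} - \theta)
  &= n^{1/2}(\bar y - \theta) - \frac{n^{1/2}\mu_n^2}{\bar y}, \\
n^{1/2}(\bar y - \theta) &\sim N(0, 1), \qquad
\frac{n^{1/2}\mu_n^2}{\bar y} \xrightarrow{\;p\;} \frac{\rho}{\theta}, \\
P_{n,\theta}\bigl( n^{1/2}(\hat\theta_{AL} - \theta) \le x \bigr)
  &\to \Phi\!\left( x + \frac{\rho}{\theta} \right).
\end{align*}
```

In the conservative regime $n^{1/2}\mu_n^2 = (n^{1/2}\mu_n)\,\mu_n \to m \cdot 0 = 0$, so the shift vanishes and the limit is $\Phi(x)$. The condition $n^{1/4}\mu_n \to 0$ is the same as $n^{1/2}\mu_n^2 \to 0$, i.e. $\rho = 0$.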

  11. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 1$, $\mu_n = n^{-1/3}$]

  12. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 10$, $\mu_n = n^{-1/3}$]

  13. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 50$, $\mu_n = n^{-1/3}$]

  14. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 100$, $\mu_n = n^{-1/3}$]

  15. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 200$, $\mu_n = n^{-1/3}$]

  16. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 500$, $\mu_n = n^{-1/3}$]

  17. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 1000$, $\mu_n = n^{-1/3}$]

  18. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 2000$, $\mu_n = n^{-1/3}$]

  19. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 5000$, $\mu_n = n^{-1/3}$]

  20. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 10^4$, $\mu_n = n^{-1/3}$]

  21. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 5 \times 10^4$, $\mu_n = n^{-1/3}$]

  22. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 5 \times 10^5$, $\mu_n = n^{-1/3}$]

  23. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 10^6$, $\mu_n = n^{-1/3}$]

  24. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 10^6$, $\mu_n = n^{-1/3}$]

      Is the non-normality of the finite-sample distribution a transient feature as $n \to \infty$?

  25. Fixed-parameter asymptotics, consistent case: [Figure: plot for $n = 10^6$, $\mu_n = n^{-1/3}$]

      Is the non-normality of the finite-sample distribution a transient feature as $n \to \infty$?

      We need to look at moving-parameter asymptotics!
