overview least angle regression

Overview Least Angle Regression Why is LARS important? Tim - PowerPoint PPT Presentation

Least Angle Regression. Tim Hesterberg, Insightful Corp., 16 June 2006. Overview: Why is LARS important? Other packages; GLARS package; Issues; Insightful Research. This is joint work with Chris Fraley, with support from NIH SBIR


  1. Least Angle Regression
     Tim Hesterberg, Insightful Corp.
     16 June 2006
     This is joint work with Chris Fraley, with support from NIH SBIR Phase I 1 R43 GM074313-01

     Overview
     ◮ Why is LARS important?
     ◮ Other packages
     ◮ GLARS package
     ◮ Issues
     ◮ Insightful Research

     Why is LARS important?
     ◮ Variable Selection in Regression
     ◮ Important
     ◮ Many approaches: stagewise, boosting, LASSO, regularization, . . .
     ◮ Least Angle Regression: Efron, Hastie, Johnstone, Tibshirani (2004) Annals (with discussion)
        1. Lasso
        2. Forward stagewise
        3. Least Angle Regression (LAR)
     ◮ Unifying explanation
     ◮ Fast implementation
     ◮ Fast way to choose tuning parameter

     Ridge Regression
     ◮ Minimize $\sum_i (Y_i - \hat{Y}_i)^2 + \lambda \sum_j \hat{\beta}_j^2$
     [Figure: ridge coefficient paths, beta versus theta (log scale, 0.0 to 10.0), for predictors AGE, SEX, BMI, BP, S1 to S6]
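The ridge criterion on the slide above has a closed-form minimizer, which a small example can make concrete. The following is a minimal sketch, assuming a single centered predictor with no intercept, so the minimizer is b = x·y / (x·x + λ). The function name `ridge_coefficient` and the toy data are made up for illustration and do not come from the slides.

```python
def ridge_coefficient(x, y, lam):
    """Ridge estimate for one centered predictor, no intercept.

    Minimizes sum((y_i - b * x_i)**2) + lam * b**2 over b.
    Setting the derivative to zero gives b = x.y / (x.x + lam).
    """
    xy = sum(xi * yi for xi, yi in zip(x, y))
    xx = sum(xi * xi for xi in x)
    return xy / (xx + lam)


# Toy data, invented for illustration: y is exactly 2 * x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]

# A larger penalty shrinks the coefficient toward zero without ever reaching it.
for lam in (0.0, 10.0, 90.0):
    print(lam, ridge_coefficient(x, y, lam))
```

With λ = 0 this reduces to ordinary least squares. As λ grows the coefficient shrinks smoothly toward zero, which is the behavior the ridge coefficient-path plot shows.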

  2. LASSO
     ◮ Minimize $\sum_i (Y_i - \hat{Y}_i)^2 + \lambda \sum_j |\hat{\beta}_j|$
     ◮ Forces small coefficients → 0; gives simpler models.
     ◮ Smaller penalty on large coefficients: less effect on important terms
     ◮ Implementation is more complicated and slower
     [Figure: two panels, LASSO and Ridge Regression; standardized coefficients versus sum( |beta| ), 0 to 3000, for predictors AGE, SEX, BMI, BP, S1 to S6]

     Forward Stagewise Regression
     (Forward Stagewise = Least Squares Boosting)
        1. Initialize: standardize predictors, center $y$, $r = y$, $\beta_1 = \ldots = \beta_p = 0$
        2. Repeat many times
           ◮ Find the predictor $x_j$ most correlated with $r$
           ◮ $\delta = \epsilon \, \mathrm{sign}(r \cdot x_j)$
           ◮ $\hat{\beta}_j \leftarrow \hat{\beta}_j + \delta$
           ◮ $r \leftarrow r - \delta x_j$

     Forward Stagewise and LASSO
     Are LASSO and infinitesimal forward stagewise identical?
     ◮ With orthogonal predictors, yes.
     ◮ Otherwise similar.
     Least Angle Regression provides explanation, and fast implementation.

     Similarity:
     [Figure, from Trevor Hastie, Stanford Statistics, March 2003: Prostate Cancer Data. Two panels, Lasso (coefficients versus $t = \sum_j |\beta_j|$) and Forward Stagewise (coefficients versus iteration), for predictors lcavol, svi, lweight, pgg45, lbph, gleason, age, lcp]
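The forward stagewise steps listed above translate almost line for line into code. The following is a minimal sketch in plain Python, assuming the predictor columns are already standardized (centered, unit norm) and the response is centered, so that the inner product with the residual stands in for the correlation. The function names and the toy data are hypothetical and not taken from the slides or from any package.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def forward_stagewise(columns, y, eps=0.01, n_steps=1000):
    """Forward stagewise regression on pre-standardized predictor columns.

    columns: list of predictor columns (assumed centered, unit norm).
    y: centered response.
    Returns the coefficient list and the final residual.
    """
    beta = [0.0] * len(columns)
    r = list(y)  # residual starts at y
    for _ in range(n_steps):
        corr = [dot(r, xj) for xj in columns]
        # Predictor most correlated (in absolute value) with the residual.
        j = max(range(len(columns)), key=lambda k: abs(corr[k]))
        if corr[j] == 0.0:
            break  # residual is orthogonal to every predictor
        delta = eps if corr[j] > 0 else -eps
        beta[j] += delta
        r = [ri - delta * xij for ri, xij in zip(r, columns[j])]
    return beta, r


# Toy data, invented for illustration: two orthogonal unit-norm predictors
# and a response built as y = 3 * x1 - 1 * x2.
x1 = [0.5, -0.5, 0.5, -0.5]
x2 = [0.5, 0.5, -0.5, -0.5]
y = [1.0, -2.0, 2.0, -1.0]

beta, r = forward_stagewise([x1, x2], y)
print(beta)  # ends close to [3, -1], within a few multiples of eps
```

Because each step moves a coefficient by only ε, the path is traced slowly. This is the cost that the fast least-angle implementation mentioned on the slides avoids.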

  3. Stepwise, Forward Stagewise, Least Angle
     Stepwise regression:
     ◮ Pick predictor most correlated with y
     ◮ Bring predictor completely into model (full LS fit)
     Forward stagewise:
     ◮ Pick predictor most correlated with y
     ◮ Increment coefficient for predictor
     Least Angle Regression:
     ◮ Pick predictor most correlated with y
     ◮ Bring predictor into model only to extent it is better than others
     ◮ Move in least-squares direction until another variable is as correlated

     Least Angle Regression
     [Figure: geometry in the plane of X1 and X2, with labeled points O, A, B, C, D, E]
     C = projection of y onto space spanned by X1 and X2.
     B = first step for least-angle regression
     E = point on stagewise path

     LARS - other packages
     lars: Efron and Hastie (S-PLUS and R)
     ◮ Linear regression
     glmpath: Park and Hastie (R)
     ◮ GLM and Cox Proportional Hazards
     Methods: plot, print, predict, cv, coef

     S+GLARS
     ◮ S-PLUS and R, open source
     ◮ Incorporate lars, glmpath
     ◮ Cleanup, consistent interface
     ◮ Incorporate future work by others; provide framework
     ◮ Extensions
        ◮ Numerically-accurate calculations
        ◮ Factors, splines, polynomials, interactions, . . .
        ◮ Other models (robust regression, . . . ), other penalties
        ◮ Missing data
        ◮ Massive data sets
        ◮ Diagnostics, tools for selecting tuning parameter
     ◮ User-friendly
        ◮ Consistent interface
        ◮ GUI
        ◮ Documentation
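The first least-angle step described above (move along the most correlated predictor only until another predictor is equally correlated with the residual) can be worked through numerically. The following is a minimal sketch, assuming unit-norm predictor columns and using the step-length formula from Efron et al. (2004) specialized to a single active predictor. The function name `first_lar_step` and the toy vectors are hypothetical, chosen only to mirror the two-predictor geometry on the slide.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def first_lar_step(columns, y):
    """First least-angle step for unit-norm predictor columns.

    Returns (index of entering predictor, step length gamma, fitted vector).
    """
    c = [dot(xj, y) for xj in columns]
    j = max(range(len(c)), key=lambda k: abs(c[k]))
    big_c = abs(c[j])                      # largest absolute correlation
    s = 1.0 if c[j] > 0 else -1.0
    u = [s * v for v in columns[j]]        # unit direction of travel
    candidates = []
    for k, xk in enumerate(columns):
        if k == j:
            continue
        a = dot(xk, u)
        # Step lengths at which x_k ties with x_j (positive or negative sign).
        for num, den in ((big_c - c[k], 1.0 - a), (big_c + c[k], 1.0 + a)):
            if den > 0 and num / den > 0:
                candidates.append(num / den)
    # With no competitor, go all the way to the least-squares fit on x_j.
    gamma = min(candidates) if candidates else big_c
    return j, gamma, [gamma * v for v in u]


# Toy vectors, invented for illustration (unit norm, not orthogonal).
x1 = [1.0, 0.0]
x2 = [0.6, 0.8]
y = [3.0, 1.0]

j, gamma, fit = first_lar_step([x1, x2], y)
residual = [yi - fi for yi, fi in zip(y, fit)]
print(j, gamma, dot(x1, residual), dot(x2, residual))
```

After the step, both predictors have the same correlation with the residual, which is the stopping condition in the last Least Angle Regression bullet. A stepwise fit would instead move all the way to the least-squares coefficient on the first predictor.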

  4. Issues
     ◮ Collaboration
     ◮ Money
        ◮ NIH funding: require commercial potential
        ◮ Insightful: indirect benefit
     ◮ Outside contributors
     ◮ Licensing; ability to ship with S-PLUS, I-Miner.

     Insightful Research Department
     ◮ Turn research into software for wide use
     ◮ Higher standards than academic software (ease of use, robustness, testing)
     ◮ Variety: resampling, missing data, group sequential designs, simulation-based econometric software, functional data, stable distributions, proteomics, microarrays, frailty models, causal modeling
     ◮ External funding: SBIR grants (NIH, NSF, . . . )
        ◮ Somewhat easier funding
        ◮ Commercial potential
        ◮ Risk, research element
     ◮ We’re hiring
     ◮ We’re looking for good projects and collaborators
