Screening Rules for Lasso with Non-Convex Sparse Regularizers
A. Rakotomamonjy, joint work with G. Gasso and J. Salmon (ICML 2019)


  1. Screening Rules for Lasso with Non-Convex Sparse Regularizers
A. Rakotomamonjy, joint work with G. Gasso and J. Salmon. ICML 2019.
This work benefited from the support of the project OATMIL ANR-17-CE23-0012 of the French National Research Agency (ANR), the Normandie Projet GRR-DAISI, and European funding FEDER DAISI.

  2. Objective of the paper
Lasso and screening: learning sparsity-induced linear models from high-dimensional data X ∈ R^(n×d), y ∈ R^n:
    min_{w ∈ R^d} (1/2) ‖y − Xw‖²₂ + λ Σ_{j=1}^d |w_j|
Screening rule: identify vanishing variables in w⋆. Example, with (ŵ, ŝ) an intermediate primal-dual solution:
    |x_j⊤ ŝ| + r(ŵ, ŝ) ‖x_j‖ < 1  ⟹  w⋆_j = 0
by exploiting sparsity, convexity and duality.
Extension to non-convex regularizers: non-convex regularizers lead to statistically better models, but how can screening be done when the regularizer is non-convex?
[Figure: the ℓ1, log-sum and MCP penalties plotted over [−2, 2].]
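The convex screening rule sketched above can be written in a few lines of NumPy. The sketch below uses the standard gap-safe construction (residual rescaled into the dual feasible set, sphere radius √(2·gap)/λ); these are conventional choices, not the slide's exact quantities:

```python
import numpy as np

def gap_safe_screen(X, y, w_hat, lam):
    """Gap-safe-style screening for min_w 0.5*||y - Xw||^2 + lam*||w||_1.

    Returns a boolean mask; True certifies w*_j = 0 (sketch)."""
    res = y - X @ w_hat
    # Rescale the residual into the dual feasible set {theta : ||X^T theta||_inf <= 1}.
    scale = max(np.max(np.abs(X.T @ res)) / lam, 1.0)
    theta = res / (lam * scale)
    # Duality gap between primal objective and the Lasso dual objective.
    primal = 0.5 * res @ res + lam * np.abs(w_hat).sum()
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam        # gap-safe sphere radius
    norms = np.linalg.norm(X, axis=0)
    # Screen j when |x_j^T theta| + radius*||x_j|| < 1.
    return np.abs(X.T @ theta) + radius * norms < 1.0
```

With a large regularization (λ above the critical value ‖X⊤y‖_∞), the zero vector is optimal and every variable is screened at gap zero.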

  3. Non-convex Lasso
The problem:
    min_{w ∈ R^d} (1/2) ‖y − Xw‖²₂ + Σ_{j=1}^d r_λ(|w_j|)
with the regularizer r_λ(·) smooth and concave on [0, ∞).
The proposed screening strategy: solve by majorization-minimization,
    w^{k+1} = argmin_{w ∈ R^d} (1/2) ‖y − Xw‖²₂ + (1/(2α)) ‖w − w^k‖²₂ + Σ_{j=1}^d λ_j |w_j|
with λ_j = r′_λ(|w_j^k|).
Screen at two levels:
- within each weighted Lasso;
- propagate screened-variable information between two successive weighted Lassos.
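The MM scheme above can be sketched as follows. For concreteness the sketch assumes the log-sum penalty r_λ(t) = λ log(1 + t/θ) and solves each weighted-Lasso subproblem by proximal gradient (ISTA); the penalty choice, step sizes and iteration counts are illustrative assumptions, not the paper's exact solver:

```python
import numpy as np

def logsum_deriv(t, lam, theta):
    # Derivative of the log-sum penalty r_lam(t) = lam * log(1 + t/theta), t >= 0.
    return lam / (theta + t)

def mm_nonconvex_lasso(X, y, alpha=1.0, lam=0.1, theta=0.5, n_mm=10, n_inner=200):
    """Majorization-minimization sketch for the non-convex Lasso.

    Each MM step linearizes the concave penalty (weights lambda_j = r'(|w_j^k|))
    and solves the resulting weighted Lasso, plus the proximal term
    (1/(2*alpha))||w - w^k||^2, by proximal gradient."""
    d = X.shape[1]
    w = np.zeros(d)
    L = np.linalg.norm(X, 2) ** 2 + 1.0 / alpha  # Lipschitz constant of the smooth part
    for _ in range(n_mm):
        weights = logsum_deriv(np.abs(w), lam, theta)  # lambda_j = r'_lam(|w_j^k|)
        w_k = w.copy()
        z = w_k.copy()
        for _ in range(n_inner):
            grad = X.T @ (X @ z - y) + (z - w_k) / alpha
            u = z - grad / L
            z = np.sign(u) * np.maximum(np.abs(u) - weights / L, 0.0)  # soft-thresholding
        w = z
    return w
```

Because the linearized penalty majorizes the concave one and the inner solver starts at w^k, each MM step can only decrease the true non-convex objective.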

  4. Screening the weighted Lasso
Optimization problem and screening condition:
    min_{w ∈ R^d} (1/2) ‖y − Xw‖²₂ + (1/(2α)) ‖w − w^k‖²₂ + Σ_{j=1}^d λ_j |w_j|
    |x_j⊤ s⋆ − v⋆_j| − λ_j < 0  ⟹  w⋆_j = 0
with s and v the dual variables, s⋆ = y − Xw⋆ and w⋆ − w^k = α v⋆.
Our screening test:
    T_j^{(λ_j)}(ŵ, ŝ, v̂) := |x_j⊤ ŝ − v̂_j| + (‖x_j‖ + 1/α) √(2 G_Λ) < λ_j
given a primal-dual intermediate solution (ŵ, ŝ, v̂), with duality gap G_Λ.
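Evaluating the test T_j^{(λ_j)} for all variables is a single pass of vectorized arithmetic. A minimal sketch, with the grouping of constants as reconstructed from the slide:

```python
import numpy as np

def weighted_lasso_screen(X, s_hat, v_hat, gap, weights, alpha):
    """Screening test for the weighted Lasso subproblem (sketch):
        T_j = |x_j^T s_hat - v_hat_j| + (||x_j|| + 1/alpha) * sqrt(2 * gap)
    and variable j is certified zero when T_j < lambda_j.

    Returns (T, mask); T is kept so it can be reused by the propagation test."""
    norms = np.linalg.norm(X, axis=0)
    T = np.abs(X.T @ s_hat - v_hat) + (norms + 1.0 / alpha) * np.sqrt(2.0 * gap)
    return T, T < weights
```

At an exact solution the gap term vanishes and the test reduces to the optimality condition |x_j⊤ s⋆ − v⋆_j| < λ_j.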

  5. Screened-variables propagation
Setting: after iteration k, we have a weighted Lasso with weights {λ_j} and approximate solutions ŵ, ŝ and v̂. Screened variables are those with
    T_j^{(λ_j)}(ŵ, ŝ, v̂) < λ_j.
Before iteration k+1: change of weights {λ^ν_j}_{j=1,…,d}, new primal-dual triplet (ŵ^ν, ŝ^ν, v̂^ν).
Screening propagation test:
    T_j^{(λ_j)}(ŵ, ŝ, v̂) + ‖x_j‖ (a + √(2b)) + c + (1/α) √(2b) < λ^ν_j
with ‖ŝ − ŝ^ν‖₂ ≤ a, |G_Λ(ŵ, ŝ, v̂) − G_{Λ^ν}(ŵ^ν, ŝ^ν, v̂^ν)| ≤ b and |v̂^ν_j − v̂_j| ≤ c.
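The propagation test above reuses the previous scores T_j and only inflates them by the drift bounds a, b, c. A sketch, with the inequality transcribed from the slide as reconstructed:

```python
import numpy as np

def propagation_screen(T_prev, X, new_weights, a, b, c, alpha):
    """Propagation test sketch: a variable screened for the previous weighted
    Lasso (score T_prev_j) stays screened for the new weights lambda^nu when
        T_prev_j + ||x_j|| * (a + sqrt(2b)) + c + sqrt(2b) / alpha < lambda^nu_j,
    where a, b, c bound the drift of the dual point, the duality gap and the
    v-variable between two successive weighted Lassos."""
    norms = np.linalg.norm(X, axis=0)
    slack = norms * (a + np.sqrt(2.0 * b)) + c + np.sqrt(2.0 * b) / alpha
    return T_prev + slack < new_weights
```

When the weights do not move (a = b = c = 0) the test degenerates to the previous screening decision, so no safe variable is ever lost by propagation.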

  6. Summary
First approach for screening with non-convex regularizers: convexification and propagation.
At poster #190, Pacific Ballroom: more technical details, and experimental results on the computational gain and on the propagation strategy.
[Figures: regularization path (n=50, d=100, p=5, σ=2.00) comparing ncxCD, GIST, MM genuine and MM screening in percentage of ncxCD time; ratio of screened variables (Pre-PWL vs Post-PWL) as a function of the solver tolerance.]
