Faster and still safe: Combining screening techniques and structured dictionaries to accelerate the Lasso Cássio F. DANTAS, Rémi GRIBONVAL cassio.fraga-dantas@inria.fr, remi.gribonval@inria.fr 03/04/18
Accelerate the Lasso optimization by combining two strategies: 1) Safe Screening Rules 2) Fast Structured Dictionaries
Contents 01. Context (Lasso problem) 02. Fast Structured Dictionaries 03. Screening Rules 04. Screening Rules w/ Approx. Dictionaries 05. Results 06. Conclusion
01 Context
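01 Context Lasso problem The $\ell_1$-regularized least squares: $x^\star \in \arg\min_{x} \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$. Denoting: ● $y$ the observation vector; ● $A$ the design matrix (or dictionary), whose columns $a_j$ are the atoms; ● $x$ the sparse representation vector; ● $\lambda > 0$ the parameter controlling the sparsity of the solution.
01 Context Dual Lasso Dual formulation of the Lasso problem: $\theta^\star = \arg\max_{\theta \in \Delta_A} \tfrac{1}{2}\|y\|_2^2 - \tfrac{\lambda^2}{2}\|\theta - y/\lambda\|_2^2$. Denoting: ● $\theta$ the dual variable; ● $\Delta_A = \{\theta : \|A^\top\theta\|_\infty \le 1\}$ the feasible set.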
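The primal and dual objectives can be evaluated numerically as in the minimal sketch below. It assumes the standard scaling of the Lasso dual (feasible set $\|A^\top\theta\|_\infty \le 1$); the function names `lasso_primal`, `lasso_dual` and `dual_scaling` are illustrative and do not come from the slides. Rescaling the residual is one common way to obtain a feasible dual point, not necessarily the one used by the authors.

```python
import numpy as np

def lasso_primal(A, y, x, lam):
    """Primal objective: 0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    residual = y - A @ x
    return 0.5 * residual @ residual + lam * np.abs(x).sum()

def lasso_dual(y, theta, lam):
    """Dual objective: 0.5 * ||y||_2^2 - 0.5 * lam^2 * ||theta - y / lam||_2^2."""
    diff = theta - y / lam
    return 0.5 * y @ y - 0.5 * lam**2 * diff @ diff

def dual_scaling(A, y, x, lam):
    """Rescale the residual so that ||A^T theta||_inf <= 1 (dual feasible)."""
    residual = y - A @ x
    scale = max(lam, np.abs(A.T @ residual).max())
    return residual / scale
```

For any `x` and any feasible `theta`, `lasso_primal(...) - lasso_dual(...)` is the duality gap; it is non-negative and vanishes at the optimum.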
02 Fast Structured Dictionaries
Fast Structured Dictionaries Motivation ● Iterative algorithms are often used to solve the Lasso problem. ● Example: ISTA (Iterative Shrinkage-Thresholding Algorithm). ● Two matrix-vector multiplications at each iteration. ● Quadratic complexity! Can it be reduced?
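To make the two matrix-vector multiplications per iteration explicit, here is a minimal ISTA sketch. It assumes the Lasso objective $\tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$ with a dense NumPy dictionary `A` and a constant step size equal to the inverse of the squared spectral norm of `A`; the names `ista` and `soft_threshold` are illustrative.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (entrywise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, y, lam, n_iter=500):
    """Plain ISTA for the Lasso with a constant step size."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = y - A @ x                 # first matrix-vector product
        correlation = A.T @ residual         # second matrix-vector product
        x = soft_threshold(x + step * correlation, step * lam)
    return x
```

Both products cost on the order of (rows × columns) operations for an unstructured matrix, which is the quadratic complexity mentioned above.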
Fast Approximate Dictionaries Structure ⇒ Acceleration Accelerate matrix-vector multiplications: constrain the dictionary matrix to have a certain type of structure. Examples: – Kronecker product – Sparse factors – Circulant factors – (...)
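As an illustration of how structure accelerates multiplications, the sketch below applies a Kronecker-structured dictionary $B \otimes C$ (and its transpose) without ever forming the full matrix. The function names are illustrative, and the reshape convention assumes NumPy's row-major ordering, which matches `np.kron`.

```python
import numpy as np

def kron_matvec(B, C, x):
    """Compute (B kron C) @ x without forming the Kronecker product."""
    n1, n2 = B.shape[1], C.shape[1]
    X = x.reshape(n1, n2)            # row-major reshape matches np.kron ordering
    return (B @ X @ C.T).ravel()

def kron_rmatvec(B, C, r):
    """Compute (B kron C).T @ r, using (B kron C).T = B.T kron C.T."""
    m1, m2 = B.shape[0], C.shape[0]
    R = r.reshape(m1, m2)
    return (B.T @ R @ C).ravel()
```

With `B` of size m1 × n1 and `C` of size m2 × n2, the dense product costs about m1·m2·n1·n2 multiplications, while the structured one costs about m1·n1·n2 + m1·n2·m2.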
Fast Approximate Dictionaries Structured Approximation If the dictionary matrix $A$ is not structured, a structured approximation $\tilde{A} \approx A$ can be found, where $E$ is the approximation error matrix (the difference between $A$ and $\tilde{A}$) and $e_j$ is its $j$-th column.
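The slides do not say how the structured approximation is computed. As one possible illustration for the Kronecker case, the sketch below uses the classical rearrangement-plus-SVD construction of the nearest Kronecker product in Frobenius norm; this is a swapped-in technique, not necessarily the authors' method, and the names `nearest_kronecker` and `error_column_norms` are hypothetical.

```python
import numpy as np

def nearest_kronecker(A, shape_B, shape_C):
    """Best Frobenius-norm approximation of A by a Kronecker product B kron C."""
    m1, n1 = shape_B
    m2, n2 = shape_C
    # Rearrange A so that an exact Kronecker product becomes a rank-1 matrix.
    R = A.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # The leading singular pair gives the two factors.
    B = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    C = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    return B, C

def error_column_norms(A, A_tilde):
    """Norms ||e_j||_2 of the columns of the approximation error A - A_tilde."""
    return np.linalg.norm(A - A_tilde, axis=0)
```

The column norms of the error are the quantities that the screening tests need later on, so they can be computed once, offline.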
Fast Approximate Dictionaries Algorithm (high-level view) 1) Start the Lasso optimization with the structured approximation $\tilde{A}$, to take advantage of its reduced multiplication cost. 2) As the algorithm approaches the solution, switch back to the original dictionary $A$.
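The two-step strategy can be sketched as follows. This is a simplified illustration: it switches after a fixed number of iterations `n_fast` (a hypothetical rule chosen for brevity) and uses dense matrices for both dictionaries, so it shows the control flow rather than the actual speed-up.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (entrywise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def two_phase_ista(A, A_tilde, y, lam, n_fast=50, n_exact=500):
    """Run ISTA on the approximate dictionary first, then on the original one."""
    # A single step size that is valid for both dictionaries.
    step = 1.0 / max(np.linalg.norm(A, 2), np.linalg.norm(A_tilde, 2)) ** 2
    x = np.zeros(A.shape[1])
    for D, n_iter in ((A_tilde, n_fast), (A, n_exact)):
        for _ in range(n_iter):
            x = soft_threshold(x + step * (D.T @ (y - D @ x)), step * lam)
    return x
```

Because the last phase iterates on the original dictionary, the iterates converge to a solution of the original problem; the first phase only provides a cheap warm start.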
03 Safe Screening Rules
Safe Screening Rules ● Safe screening: rules for identifying inactive dictionary atoms before completely solving the problem. ● Inactive atoms: dictionary columns that receive zero weight in the reconstruction of the input signal (atoms outside the solution support). ● We can eliminate such atoms. ● Zero risk of false eliminations!
Screening Test ● A function $\mu(a_j)$ of the atom $a_j$ such that $\mu(a_j) < 1 \implies a_j$ is surely inactive. Rejection set: $\mathcal{R} = \{j : \mu(a_j) < 1\}$. Preserved set: $\mathcal{R}^c = \{j : \mu(a_j) \ge 1\}$.
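Safe Screening Rules Screening Test - Details Given a region $\mathcal{S}$ (safe region) which contains the dual solution $\theta^\star$. Sphere test: the safe region is a closed $\ell_2$-ball $B(c, r)$ with center $c$ and radius $r$: $\mu(a_j) = \max_{\theta \in B(c,r)} |a_j^\top\theta| = |a_j^\top c| + r\|a_j\|_2$.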
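A minimal sketch of the sphere test, assuming the atoms are the columns of a NumPy array `A` and that a safe ball with center `c` and radius `r` is already available; the function name `sphere_test` is illustrative.

```python
import numpy as np

def sphere_test(A, c, r):
    """Boolean mask: True where the atom is certified inactive by the sphere test."""
    # Largest possible |a_j^T theta| over the ball B(c, r), for every atom at once.
    scores = np.abs(A.T @ c) + r * np.linalg.norm(A, axis=0)
    return scores < 1.0
```

The rejected columns can then be dropped, e.g. `A_reduced = A[:, ~sphere_test(A, c, r)]`, which shrinks every subsequent matrix-vector product.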
04 Screening Rules with Approximate Dictionaries
Extending Screening Rules Guarantee: the rejection set contains only inactive atoms. The same guarantee must hold for the rejection set computed from the approximate dictionary.
Extending Screening Rules Extending sphere tests Suppose a safe sphere $B(c, r)$ is given. Sphere test: $|a_j^\top c| + r\|a_j\|_2 < 1 \implies a_j$ is inactive. Sphere test with approximate dictionary: a certain « security margin » must be added to the test evaluated on the approximate atom $\tilde{a}_j$, to account for the atom approximation error $e_j$.
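One valid security margin follows from the triangle and Cauchy-Schwarz inequalities: if the true atom differs from the approximate atom $\tilde{a}_j$ by an error of norm $\|e_j\|_2$, then $|a_j^\top c| + r\|a_j\|_2 \le |\tilde{a}_j^\top c| + r\|\tilde{a}_j\|_2 + (\|c\|_2 + r)\|e_j\|_2$. The sketch below implements this bound; it is a derivation under these assumptions and may differ from the exact margin shown on the original slide. The name `extended_sphere_test` is illustrative.

```python
import numpy as np

def extended_sphere_test(A_tilde, err_norms, c, r):
    """Sphere test evaluated on approximate atoms, made safe by a margin.

    A_tilde   : approximate dictionary (atoms as columns)
    err_norms : norms ||e_j||_2 of the atom approximation errors
    c, r      : center and radius of a safe sphere
    """
    scores = np.abs(A_tilde.T @ c) + r * np.linalg.norm(A_tilde, axis=0)
    margin = (np.linalg.norm(c) + r) * err_norms   # security margin per atom
    return scores + margin < 1.0
```

Every atom rejected by this test would also be rejected by the exact sphere test, so the screening remains safe; the price is that atoms close to the threshold are kept.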
Extending Screening Rules Obtaining a safe sphere GAP safe sphere: given a primal-dual estimate $(x_t, \theta_t)$ at iteration $t$, the sphere has center $c = \theta_t$ and radius $r = \frac{1}{\lambda}\sqrt{2 G_t}$, with $G_t$ the duality gap at iteration $t$.
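A minimal sketch of GAP safe screening with the original dictionary. It assumes the Lasso objective $\tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$, and it builds the dual point by rescaling the residual, which is a common choice but an assumption here; the function names are illustrative.

```python
import numpy as np

def gap_safe_sphere(A, y, x, lam):
    """Center and radius of the GAP safe sphere at the primal iterate x."""
    residual = y - A @ x
    # Dual scaling: guarantees ||A^T theta||_inf <= 1.
    theta = residual / max(lam, np.abs(A.T @ residual).max())
    primal = 0.5 * residual @ residual + lam * np.abs(x).sum()
    diff = theta - y / lam
    dual = 0.5 * y @ y - 0.5 * lam**2 * diff @ diff
    gap = max(primal - dual, 0.0)      # clip tiny negative values due to rounding
    return theta, np.sqrt(2.0 * gap) / lam

def gap_safe_screen(A, y, x, lam):
    """Boolean mask of atoms certified inactive at the current iterate."""
    c, r = gap_safe_sphere(A, y, x, lam)
    return np.abs(A.T @ c) + r * np.linalg.norm(A, axis=0) < 1.0
```

Since the radius shrinks with the duality gap, the test can be repeated along the iterations and rejects more atoms as the solver converges, which is what makes the screening dynamic.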
Extending Screening Rules Obtaining a safe sphere GAP safe sphere: given a primal-dual estimate at iteration $t$. GAP safe sphere with approximate dictionary: the duality gap cannot be calculated, since the primal objective depends on the original dictionary. Instead, we use a modified primal.
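Dynamic screening GAP safe sphere (safe region) → Rejection set. Guarantee 1: the safe region contains the dual solution. Guarantee 2: the rejection set contains only inactive atoms.
Extended dynamic screening Extended GAP safe sphere (safe region) → Rejection set. Guarantee 1: the safe region contains the dual solution. Guarantee 2: the rejection set contains only inactive atoms.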
We now have safe screening rules that operate on an approximate dictionary. But what is the impact of the numerous security margins? Is it still worth it?
05 Results
Results Running times per iteration ● Fewer inactive atoms are identified by the extended screening. ● BUT the structured dictionary makes the initial iterations much faster.
06 Conclusion
Conclusion ● The proposed approach combines screening rules and fast approximate dictionaries. ● It reduces the execution time even further with respect to screening rules alone. Potential extensions ● Other region types (e.g. domes) ● Problems other than the Lasso (e.g. Group-Lasso, Elastic-Net, Regularized Logistic Regression)
Thank you! Questions? Contact me: cassio.fraga-dantas@inria.fr
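Safe Screening Rules Screening test – Details Dual formulation of the Lasso problem (observation $y$, regularization parameter $\lambda$, atoms $a_j$): $\theta^\star = \arg\max_{\theta \in \Delta_A} \tfrac{1}{2}\|y\|_2^2 - \tfrac{\lambda^2}{2}\|\theta - y/\lambda\|_2^2$. Projection problem! $\theta^\star$ is the projection of $y/\lambda$ onto the feasible region $\Delta_A = \{\theta : |a_j^\top\theta| \le 1 \ \forall j\}$. At the dual solution $\theta^\star$: ➢ Some constraints are active, i.e. $|a_j^\top\theta^\star| = 1$. ➢ The other constraints are inactive, i.e. $|a_j^\top\theta^\star| < 1$.
Safe Screening Rules Screening test – Details ● Every dictionary atom $a_j$ for which $|a_j^\top\theta^\star| < 1$ is inactive. ● Then, simply calculate $|a_j^\top\theta^\star|$ for all $j$ and discard all atoms for which the result is smaller than 1. The dual solution $\theta^\star$ is not known: identify a region $\mathcal{S}$ (safe region) which contains $\theta^\star$. Sufficient condition: $\max_{\theta \in \mathcal{S}} |a_j^\top\theta| < 1 \implies a_j$ is inactive.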
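When the safe region is a ball with center $c$ and radius $r$, the maximum in the sufficient condition has a closed form. The short derivation below writes a point of the ball as $\theta = c + r u$ with $\|u\|_2 \le 1$ (the symbols $c$, $r$ and $u$ are notation assumed for this sketch).

```latex
\begin{align*}
\max_{\theta \in B(c,r)} |a_j^\top \theta|
  &= \max_{\|u\|_2 \le 1} |a_j^\top c + r\, a_j^\top u| \\
  &\le |a_j^\top c| + r \max_{\|u\|_2 \le 1} |a_j^\top u|
   = |a_j^\top c| + r \|a_j\|_2 ,
\end{align*}
```

The bound is attained for $u = \pm a_j / \|a_j\|_2$, with the sign chosen to match that of $a_j^\top c$, so the inequality is in fact an equality and the test reduces to checking $|a_j^\top c| + r\|a_j\|_2 < 1$.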
Extending Screening Rules Switching criterion Reasons to switch back from the approximate dictionary $\tilde{A}$ to the original dictionary $A$: ● Convergence: to avoid converging to the solution of the approximate problem. The higher the approximation error, the sooner we need to switch. ● Screening ratio: the number of active atoms may become so small that the use of $\tilde{A}$ does not pay off anymore.
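The screening-ratio argument can be illustrated with a simple cost comparison. The sketch below is a hypothetical rule, not the criterion used by the authors: it assumes that a product with the screened original dictionary costs (rows × active atoms) multiplications, and that a product with the structured dictionary has a fixed cost `fast_cost` supplied by the user. It does not address the convergence reason.

```python
def should_switch_to_original(n_active, n_rows, fast_cost):
    """Return True when the screened original dictionary is at least as cheap.

    n_active  : number of atoms not yet screened out
    n_rows    : signal dimension (number of rows of the dictionary)
    fast_cost : multiplications per product with the structured dictionary
                (hypothetical constant, depends on the chosen structure)
    """
    dense_cost = n_rows * n_active   # product with the reduced original matrix
    return dense_cost <= fast_cost
```

For example, with 100 rows and a structured product costing 5000 multiplications, the rule recommends switching once 50 or fewer atoms remain active.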
Extending Screening Rules Comparison Fewer inactive atoms are identified by the extended screening.
Extending Screening Rules Switching criterion
Extending Screening Rules Complexity reduction
Simulation Results Impact of the Approximation Error