Faster and still safe: Combining screening techniques and structured dictionaries to accelerate the Lasso

  1. Faster and still safe: Combining screening techniques and structured dictionaries to accelerate the Lasso. Cássio F. DANTAS, Rémi GRIBONVAL (cassio.fraga-dantas@inria.fr, remi.gribonval@inria.fr), 03/04/18

  2. Accelerate the Lasso optimization by combining two strategies: 1) Safe Screening Rules; 2) Fast Structured Dictionaries

  3. Contents: 01. Context (Lasso problem); 02. Fast Structured Dictionaries; 03. Screening Rules; 04. Screening Rules w/ Approx. Dictionaries; 05. Results; 06. Conclusion

  4. 01 Context

  5. 01 Context: Lasso problem. The $\ell_1$-regularized least squares problem: $\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$, where $y$ is the observation vector; $A$ is the design matrix (or dictionary); $x$ is the sparse representation vector; $\lambda > 0$ is the parameter controlling the sparsity of the solution.

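  6. 01 Context: Dual Lasso. Dual formulation of the Lasso problem: $\theta^\star = \arg\max_{\theta \in \Delta_A} \frac{1}{2}\|y\|_2^2 - \frac{\lambda^2}{2}\|\theta - y/\lambda\|_2^2$, where $\theta$ is the dual variable and $\Delta_A = \{\theta : \|A^T\theta\|_\infty \le 1\}$ is the feasible set.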

  7. 02 Fast Structured Dictionaries

  8. Fast Structured Dictionaries: Motivation. Iterative algorithms are often used to solve the Lasso problem. Example: ISTA (Iterative Shrinkage-Thresholding Algorithm), which needs two matrix-vector multiplications at each iteration. Quadratic complexity! Can it be reduced?
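
The two matrix-vector products per ISTA iteration can be seen in a minimal NumPy sketch. The function names, the step size 1/L and the fixed iteration count are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """Primal Lasso objective 0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iter=200):
    """Plain ISTA with constant step 1/L (assumed step-size choice)."""
    L = np.linalg.norm(A, 2) ** 2        # squared spectral norm of A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = y - A @ x             # first matrix-vector product
        correlation = A.T @ residual     # second matrix-vector product
        x = soft_threshold(x + correlation / L, lam / L)
    return x
```

For a dense N x K dictionary, each of the two products costs on the order of N*K operations, which is the cost the structured dictionaries aim to reduce.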

  9. Fast Approximate Dictionaries: Structure ⇒ Acceleration. To accelerate matrix-vector multiplications, constrain the dictionary matrix to have a certain type of structure. Examples: Kronecker product; sparse factors; circulant factors; (...)
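
As an illustration of how structure accelerates the product, here is a small sketch for the Kronecker case. It relies on the standard identity kron(B, C) @ vec(X) = vec(B X C^T) with row-major vectorization; the function name `kron_matvec` is hypothetical.

```python
import numpy as np

def kron_matvec(B, C, x):
    """Compute np.kron(B, C) @ x without forming the Kronecker product.

    x is reshaped (row-major) into a matrix X with B.shape[1] rows and
    C.shape[1] columns, so that the product equals (B @ X @ C.T).ravel().
    """
    X = x.reshape(B.shape[1], C.shape[1])
    return (B @ X @ C.T).ravel()
```

With two square n x n factors, the dense product with the n^2 x n^2 dictionary costs about n^4 operations, while the two small products cost about 2 n^3.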

  10. Fast Approximate Dictionaries: Structured Approximation. If the dictionary matrix $A$ is not structured, a structured approximation $\tilde{A}$ can be found: $A = \tilde{A} + E$, where $E$ is the approximation error matrix and $e_j$ is its $j$-th column.

  11. 01 Context: Algorithm (high-level view). 1) Start the Lasso optimization with the structured approximation $\tilde{A}$, to take advantage of its reduced multiplication cost. 2) As the algorithm approaches the solution, switch back to the original dictionary $A$.
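
The two-step scheme above can be sketched as a two-phase ISTA loop. This is only an outline under stated assumptions: the switching point `n_fast` is a fixed, hypothetical iteration count, and `A_tilde` is a dense array here, so the sketch shows the control flow and not the speed-up of a truly structured operator.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """Primal Lasso objective for the original dictionary A."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def two_phase_ista(A, A_tilde, y, lam, n_fast=50, n_total=300):
    """ISTA using A_tilde for the first n_fast iterations, then A.

    n_fast is an assumed, fixed switching point chosen for illustration.
    """
    step_fast = 1.0 / np.linalg.norm(A_tilde, 2) ** 2
    step_exact = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for t in range(n_total):
        if t < n_fast:
            D, step = A_tilde, step_fast   # phase 1: approximate dictionary
        else:
            D, step = A, step_exact        # phase 2: original dictionary
        x = soft_threshold(x + step * (D.T @ (y - D @ x)), step * lam)
    return x
```

Because the second phase runs on the original dictionary, the iterates end up targeting the original problem and not the approximate one.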

  13. 03 Safe Screening Rules

  18. Safe Screening Rules: rules for identifying inactive dictionary atoms before completely solving the problem. Inactive atoms are the dictionary columns that will receive zero weight in the reconstruction of the input signal, i.e. the columns outside the solution support. We can eliminate such atoms, with zero risk of false eliminations!

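  20. Screening Test: a function $\mu$ of the atom $a_j$ such that $\mu(a_j) < 1 \Rightarrow a_j$ is surely inactive. Rejection set: $\{j : \mu(a_j) < 1\}$. Preserved set: $\{j : \mu(a_j) \ge 1\}$.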

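  21. Safe Screening Rules: Screening Test, details. Given a region (safe region) which contains the dual solution $\theta^\star$. Sphere test: the safe region is a closed $\ell_2$-ball $B(c, r)$ with center $c$ and radius $r$: $\mu(a_j) = |a_j^T c| + r\|a_j\|_2 < 1 \Rightarrow a_j$ is inactive.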
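
A minimal sketch of the sphere test, assuming the standard closed form max over the ball of |a_j^T θ| = |a_j^T c| + r ||a_j||_2. The function name `sphere_test` is illustrative.

```python
import numpy as np

def sphere_test(A, c, r):
    """Return a boolean mask that is True where an atom is surely inactive.

    For a safe ball B(c, r), the largest value of |a_j^T theta| over the
    ball is |a_j^T c| + r * ||a_j||_2; an atom is rejected when it is < 1.
    """
    mu = np.abs(A.T @ c) + r * np.linalg.norm(A, axis=0)
    return mu < 1.0
```

For instance, with the 2 x 2 identity as dictionary, c = (0.5, 0.95) and r = 0.1, the test values are 0.6 and 1.05, so only the first atom is rejected.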

  22. 04 Screening Rules with Approximate Dictionaries

  23. Extending Screening Rules. Rejection set. Guarantee: [formula not recovered]

  29. Extending Screening Rules: Extending sphere tests. Suppose a safe sphere $B(c, r)$ is given. Sphere test: $|a_j^T c| + r\|a_j\|_2 < 1$. With an approximate dictionary, a certain "security margin" must be added to this test to account for the atom approximation error $e_j$, which gives the sphere test with approximate dictionary.
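
One valid security margin follows from the triangle and Cauchy-Schwarz inequalities: writing $a_j = \tilde{a}_j + e_j$, every $\theta$ in $B(c, r)$ satisfies $|a_j^T\theta| \le |\tilde{a}_j^T c| + r\|\tilde{a}_j\|_2 + \|e_j\|_2(\|c\|_2 + r)$. The sketch below implements this bound; it is a derivation made here for illustration and may not be the exact form shown on the slides. The names `approx_sphere_test` and `err_norms` are hypothetical.

```python
import numpy as np

def approx_sphere_test(A_tilde, err_norms, c, r):
    """Sphere test that only touches the approximate dictionary A_tilde.

    err_norms[j] must be (an upper bound on) ||e_j||_2, the norm of the
    j-th column of the approximation error E = A - A_tilde.
    """
    mu = (np.abs(A_tilde.T @ c)
          + r * np.linalg.norm(A_tilde, axis=0)
          + err_norms * (np.linalg.norm(c) + r))   # security margin
    return mu < 1.0
```

The margin makes the test more conservative: every atom it rejects would also be rejected by the exact sphere test, but not the other way around.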

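  30. Extending Screening Rules: Obtaining a safe sphere. GAP safe sphere: given a primal-dual estimate $(x_t, \theta_t)$ at iteration $t$, the ball $B(\theta_t, r_t)$ with $r_t = \frac{1}{\lambda}\sqrt{2\,G(x_t, \theta_t)}$ is safe, with $G(x_t, \theta_t)$ the duality gap at iteration $t$.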
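
A sketch of how a GAP safe sphere can be computed from a primal iterate with the original dictionary. The dual point is obtained by rescaling the residual, a common choice that is assumed here and not stated on the slides; the function name `gap_safe_sphere` is illustrative.

```python
import numpy as np

def gap_safe_sphere(A, y, x, lam):
    """Return (center, radius) of a GAP safe sphere for the Lasso.

    The dual objective is 0.5*||y||^2 - 0.5*lam^2*||theta - y/lam||^2 and
    the feasible set is {theta : ||A^T theta||_inf <= 1}.
    """
    residual = y - A @ x
    # Rescale the residual so that the dual point is feasible.
    theta = residual / max(lam, np.max(np.abs(A.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(x))
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)      # clip tiny negative round-off
    radius = np.sqrt(2.0 * gap) / lam
    return theta, radius
```

As the iterates converge, the duality gap shrinks, so the sphere tightens and more atoms can be rejected from one iteration to the next.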

  34. Extending Screening Rules: Obtaining a safe sphere. GAP safe sphere: given a primal-dual estimate at iteration $t$. GAP safe sphere with approximate dictionary: the duality gap cannot be calculated, since it depends on the original dictionary $A$. Instead, a modified primal is used.

  35. Dynamic screening [diagram: GAP Safe sphere, safe region, rejection set; Guarantee 1 and Guarantee 2]

  36. Extended dynamic screening [diagram: Extended GAP Safe sphere, safe region, rejection set; Guarantee 1 and Guarantee 2]

  37. We now have safe screening rules that work with an approximate dictionary. But what is the impact of the numerous security margins? Is it still worth it?

  38. 05 Results

  39. Results: Running times per iteration. Fewer inactive atoms are identified by the extended screening, BUT the structured dictionary makes the initial iterations much faster.

  43. 06 Conclusion

  44. Conclusion. The proposed approach combines screening rules and fast approximate dictionaries. It reduces the execution time even further compared with screening rules alone. Potential extensions: other region types (e.g. domes); problems other than the Lasso (e.g. Group-Lasso, Elastic-Net, Regularized Logistic Regression).

  45. Thank you! Questions? Contact me: cassio.fraga-dantas@inria.fr

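  46. Safe Screening Rules: Screening test, details. Dual formulation of the Lasso problem: $\theta^\star = \arg\max_{\theta \in \Delta_A} \frac{1}{2}\|y\|_2^2 - \frac{\lambda^2}{2}\|\theta - y/\lambda\|_2^2$. This is a projection problem: $\theta^\star$ is the projection of $y/\lambda$ onto the feasible region $\Delta_A = \{\theta : \|A^T\theta\|_\infty \le 1\}$. At the dual solution, the constraints on some atoms are active, i.e. $|a_j^T\theta^\star| = 1$, while the constraints on the other atoms are inactive, i.e. $|a_j^T\theta^\star| < 1$.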

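  47. Safe Screening Rules: Screening test, details. Every dictionary atom $a_j$ for which $|a_j^T\theta^\star| < 1$ is inactive. So simply calculate $|a_j^T\theta^\star|$ for all $j$ and discard all atoms for which the result is smaller than 1. But the dual solution $\theta^\star$ is not known. Instead, identify a region $\mathcal{R}$ (safe region) which contains $\theta^\star$. Sufficient condition: $\max_{\theta \in \mathcal{R}} |a_j^T\theta| < 1 \Rightarrow a_j$ is inactive.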

  48. Extending Screening Rules: Switching criterion. Reasons to switch back from $\tilde{A}$ to $A$: Convergence: to avoid converging to the solution of the approximate problem; the higher the approximation error, the sooner we need to switch. Screening ratio: the number of active atoms may become so small that the use of $\tilde{A}$ does not pay off anymore.

  49. Extending Screening Rules: Comparison. Fewer inactive atoms are identified by the extended screening.

  50. Extending Screening Rules: Switching criterion

  51. Extending Screening Rules: Complexity reduction

  53. Simulation Results: Impact of the Approximation Error
