Heuristic Optimality Check and Computational Solver Comparison for Basis Pursuit


  1. Heuristic Optimality Check and Computational Solver Comparison for Basis Pursuit. Andreas M. Tillmann, Research Group Optimization, TU Darmstadt, Germany; joint work with Dirk A. Lorenz (TU Braunschweig) and Marc E. Pfetsch (TU Darmstadt). ISMP 2012, Berlin, Germany, 08/21/2012.

  2. Outline: Motivation; Infeasible-Point Subgradient Algorithm ISAL1; Comparison of ℓ1-Solvers: Testset Construction and Computational Results; Improvements with Heuristic Optimality Check; Possible Future Research.

  3. Outline (next section: Motivation)

  4. Sparse Recovery via ℓ1-Minimization
◮ Seek the sparsest solution to an underdetermined linear system ($A \in \mathbb{R}^{m \times n}$, $m < n$): $\min \|x\|_0$ s.t. $Ax = b$
◮ Finding a minimum-support solution is NP-hard. Convex "relaxation": ℓ1-minimization / Basis Pursuit: $\min \|x\|_1$ s.t. $Ax = b$ (L1)
◮ Several conditions (RIP, Nullspace Property, etc.) ensure "ℓ0-ℓ1-equivalence"

  6. Solving the Basis Pursuit Problem
◮ (L1) can be recast as a linear program (see the sketch below)
◮ Broad variety of specialized algorithms for (L1): direct or primal-dual approaches; regularization and penalty methods; further relaxations (e.g. $\|Ax - b\| \le \delta$ instead of $Ax = b$); ...
◮ Which algorithm is "the best"?
◮ A classic algorithm from nonsmooth optimization: the (projected) subgradient method. Is it competitive?
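The LP recast splits $x = u - v$ with $u, v \ge 0$, so that $\|x\|_1 = \mathbf{1}^T u + \mathbf{1}^T v$ and $Ax = b$ becomes $[A, -A][u; v] = b$. A minimal Python sketch using SciPy's linprog (the problem sizes and random data are illustrative assumptions, not the testset from this talk):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 20, 50, 3                        # underdetermined: m < n
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x_true                             # b has a k-sparse preimage

# min 1^T (u + v)  s.t.  [A, -A] [u; v] = b,  u, v >= 0
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=(0, None), method="highs")
x_l1 = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_l1 - x_true))
```

For a sufficiently sparse x_true the printed error is tiny, a toy illustration of the ℓ0-ℓ1-equivalence mentioned above.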

  7. Outline (next section: Infeasible-Point Subgradient Algorithm ISAL1)

  9. Projected Subgradient Methods
$\min f(x)$ s.t. $x \in F$ ($f$, $F$ convex)
Problem: the standard projected subgradient iteration $x^{k+1} = P_F(x^k - \alpha_k h^k)$, with $\alpha_k > 0$ and $h^k \in \partial f(x^k)$, is only reasonable if the projection onto $F$ is "easy".
Idea: replace the exact projection by an approximation, giving the "infeasible" subgradient iteration $x^{k+1} = P^{\varepsilon_k}_F(x^k - \alpha_k h^k)$ with $\|P^{\varepsilon_k}_F(y) - P_F(y)\|_2 \le \varepsilon_k$ (a schematic implementation follows below).
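A minimal sketch of this iteration (not the authors' implementation; the oracles f, subgrad, and approx_project are assumed to be supplied by the caller):

```python
import numpy as np

def isa(x0, f, subgrad, approx_project, alphas, epsilons, iters=500):
    """Infeasible-point subgradient loop:
    x^{k+1} = P_F^{eps_k}(x^k - alpha_k * h^k)."""
    x = np.asarray(x0, dtype=float)
    best_val, best_x = f(x), x
    for _, alpha, eps in zip(range(iters), alphas, epsilons):
        h = subgrad(x)                          # h^k in subdifferential at x^k
        x = approx_project(x - alpha * h, eps)  # inexact projection onto F
        if f(x) < best_val:                     # iterates may be infeasible,
            best_val, best_x = f(x), x          # so track the best one seen
    return best_x
```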

  10. ISA = Infeasible-Point Subgradient Algorithm
◮ ... works for arbitrary convex objectives and constraint sets
◮ ... incorporates adaptive approximate projections $P^{\varepsilon}_F$ such that $\|P^{\varepsilon}_F(y) - P_F(y)\|_2 \le \varepsilon$ for every $\varepsilon \ge 0$
◮ ... converges to optimality (under reasonable assumptions) whenever the projection inaccuracies $(\varepsilon_k)$ are sufficiently small,
    ◮ for stepsizes $\alpha_k > 0$ with $\sum_{k=0}^{\infty} \alpha_k = \infty$ and $\sum_{k=0}^{\infty} \alpha_k^2 < \infty$, or
    ◮ for dynamic stepsizes $\alpha_k = \lambda_k (f(x^k) - \varphi) / \|h^k\|_2^2$ with $\varphi \le \varphi^*$
◮ ... converges to $\varphi$ with dynamic stepsizes using $\varphi \ge \varphi^*$ (both stepsize rules are sketched below)
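Both stepsize rules are short enough to state in code. A hedged sketch (the harmonic sequence, the default lam, and the tiny floor on $\|h^k\|_2^2$ are illustrative choices; phi must bound the optimal value $\varphi^*$ as stated above):

```python
import numpy as np

def diminishing_steps(a=1.0):
    """alpha_k = a / (k + 1): sum alpha_k = inf, sum alpha_k^2 < inf."""
    k = 0
    while True:
        yield a / (k + 1)
        k += 1

def dynamic_step(f_xk, phi, h_k, lam=1.0):
    """alpha_k = lam_k * (f(x^k) - phi) / ||h^k||_2^2."""
    return lam * (f_xk - phi) / max(float(np.dot(h_k, h_k)), 1e-16)
```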

  11. ISAL1 = Specialization of ISA to ℓ1-Minimization
◮ $f(x) = \|x\|_1$, $F = \{x \mid Ax = b\}$, $\mathrm{sign}(x) \in \partial\|x\|_1$
◮ exact projected subgradient step for (L1): $x^{k+1} = P_F(x^k - \alpha_k h^k) = (x^k - \alpha_k h^k) - A^T (A A^T)^{-1} (A(x^k - \alpha_k h^k) - b)$, computed as
    $y^k \leftarrow x^k - \alpha_k h^k$
    $z^k \leftarrow$ solution of $A A^T z = A y^k - b$
    $x^{k+1} \leftarrow y^k - A^T z^k = P_F(y^k)$
◮ $A A^T$ is s.p.d. ⇒ may employ CG to solve the equation system
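A direct sketch of that exact projection (caching a Cholesky factorization of $A A^T$ is an implementation choice, assuming A has full row rank):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def make_exact_projector(A, b):
    """Exact projection onto F = {x : Ax = b}."""
    factor = cho_factor(A @ A.T)          # A A^T is s.p.d.; factor it once
    def project(y):
        z = cho_solve(factor, A @ y - b)  # solve A A^T z = A y - b
        return y - A.T @ z                # P_F(y) = y - A^T z
    return project
```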

  13. ISAL1 = Specialization of ISA to ℓ1-Minimization (inexact step)
◮ inexact projected subgradient step for (L1):
    $y^k \leftarrow x^k - \alpha_k h^k$
    $z^k \leftarrow$ approximate solution of $A A^T z = A y^k - b$
    $x^{k+1} \leftarrow y^k - A^T z^k = P^{\varepsilon_k}_F(y^k)$
◮ Approximation: stop after a few CG iterations (CG residual norm $\le \sigma_{\min}(A) \cdot \varepsilon_k$ ⇒ $P^{\varepsilon_k}_F$ fits the ISA framework); a sketch follows below
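A self-contained sketch of this approximate projection with a hand-rolled truncated CG (the maxiter cap and the one-time computation of $\sigma_{\min}(A)$ are assumptions; the actual ISAL1 code may organize this differently):

```python
import numpy as np

def cg_truncated(matvec, rhs, tol, maxiter=50):
    """Plain CG on an s.p.d. system, stopped once ||residual||_2 <= tol."""
    z = np.zeros_like(rhs)
    r = rhs.copy()                 # residual for the start point z = 0
    p = r.copy()
    rr = r @ r
    for _ in range(maxiter):
        if np.sqrt(rr) <= tol:
            break
        Ap = matvec(p)
        alpha = rr / (p @ Ap)
        z += alpha * p
        r -= alpha * Ap
        rr, rr_old = r @ r, rr
        p = r + (rr / rr_old) * p
    return z

def make_approx_projector(A, b):
    sigma_min = np.linalg.svd(A, compute_uv=False)[-1]  # computed once
    def project(y, eps):
        # Residual norm <= sigma_min(A) * eps ensures
        # ||P_F^eps(y) - P_F(y)||_2 <= eps, as required by ISA.
        z = cg_truncated(lambda v: A @ (A.T @ v), A @ y - b,
                         tol=sigma_min * eps)
        return y - A.T @ z
    return project
```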

  14. Why a simple subgradient scheme?
Drawbacks of standard subgradient algorithms can often be alleviated by bundle methods, especially concerning "excessive" parameter tuning. Experiments for (L1) with two bundle method implementations (E. Hübner's and ConicBundle):
◮ approach 1: choose a basis B s.t. $A_B$ is regular; then, with $d := A_B^{-1} b$ and $D := A_B^{-1} A_{[n] \setminus B}$, (L1) $\Leftrightarrow \min \|z\|_1 + \|d - Dz\|_1$ (checked numerically below)
◮ approach 2: handle the constraint implicitly by using conditional subgradients
◮ tried various parameter settings (bundle size, periodic restarts)
Surprise: very often, these bundle solvers did not reach a solution (but ISA did).
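The equivalence in approach 1 follows by substituting $x_B = d - Dz$, $x_N = z$: then $Ax = b$ holds identically and the objective splits into the two ℓ1 terms. A small numerical check (random data and taking the first m columns as the basis B are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

B, N = np.arange(m), np.arange(m, n)     # assume A_B is regular
d = np.linalg.solve(A[:, B], b)          # d = A_B^{-1} b
D = np.linalg.solve(A[:, B], A[:, N])    # D = A_B^{-1} A_N

z = rng.standard_normal(n - m)           # any z yields a feasible x
x = np.empty(n)
x[B], x[N] = d - D @ z, z
assert np.allclose(A @ x, b)             # feasibility holds identically
assert np.isclose(np.linalg.norm(x, 1),
                  np.linalg.norm(z, 1) + np.linalg.norm(d - D @ z, 1))
```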

  15. Outline (next section: Comparison of ℓ1-Solvers)

  16. Our Testset
◮ 100 matrices A (74 dense, 26 sparse); sizes m × n:
    dense: 512 × {1024, 1536, 2048, 4096}; 1024 × {2048, 3072, 4096, 8192}; 2048 × {4096, 6144, 8192, 12288}
    sparse: 8192 × {16384, 24576, 32768, 49152}
◮ random (e.g., partial Hadamard, random signs, ...)
◮ concatenations of dictionaries (e.g., [Haar, ID, RST], ...)
◮ columns normalized
◮ 4 or 6 vectors x* per matrix such that each resulting (L1) instance (with b := Ax*) has the unique optimum x* (an illustrative construction follows below)
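In the spirit of this construction, a minimal sketch of one instance (the random-signs ensemble; the sparsity level k and the seed are assumptions, and the actual testset additionally uses partial Hadamard matrices and dictionary concatenations):

```python
import numpy as np

rng = np.random.default_rng(42)
m, n, k = 512, 2048, 40
A = rng.choice([-1.0, 1.0], size=(m, n))     # random-signs matrix
A /= np.linalg.norm(A, axis=0)               # normalize columns
x_star = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_star[support] = rng.uniform(-1.0, 1.0, k)  # "low dynamic range" variant
b = A @ x_star                               # right-hand side of (L1)
```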

  17. Constructing Unique Solutions
548 instances with known, unique solution vectors x*:
◮ For each matrix A, choose a support S which obeys $\mathrm{ERC}(A, S) := \max_{j \notin S} \|A_S^{\dagger} a_j\|_1 < 1$ (a sketch of this check follows below):
    1. pick S at random, and
    2. try enlarging S by repeatedly adding the resp. arg max
    3. For dense A's, use L1TestPack to construct another unique solution support (via the optimality condition for (L1))
◮ Entries of $x^*_S$ random with i) high dynamic range $(-10^5, 10^5)$ or ii) low dynamic range $(-1, 1)$
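The ERC is straightforward to evaluate with the pseudoinverse. A direct sketch (dense computation via numpy.linalg.pinv is an assumption; for the large sparse matrices in the testset one would solve least-squares problems instead):

```python
import numpy as np

def erc(A, S):
    """ERC(A, S) = max over j not in S of ||A_S^dagger a_j||_1."""
    S = np.asarray(S)
    off_support = np.ones(A.shape[1], dtype=bool)
    off_support[S] = False
    pinv_AS = np.linalg.pinv(A[:, S])                 # A_S^dagger
    # Columns of pinv_AS @ A[:, off_support] are A_S^dagger a_j;
    # take their l1 norms and the maximum over j not in S.
    return np.abs(pinv_AS @ A[:, off_support]).sum(axis=0).max()

# erc(A, S) < 1 certifies that every x* supported on S is the unique
# solution of (L1) with b = A x*.
```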

  18. Comparison Setup
◮ Only exact solvers for (L1): $\min \|x\|_1$ s.t. $Ax = b$
◮ Tested algorithms: ISAL1, SPGL1, YALL1, ℓ1-Magic, SolveBP (SparseLab), ℓ1-Homotopy, CPLEX (Dual Simplex)
◮ Use default settings (black-box usage)
◮ Solution $\bar{x}$ is "optimal" if $\|\bar{x} - x^*\|_2 \le 10^{-6}$, and "acceptable" if $\|\bar{x} - x^*\|_2 \le 10^{-1}$ (see the helper below)
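The two accuracy classes translate into a tiny helper (the "failed" label for everything above the acceptable threshold is my naming, not the slide's):

```python
import numpy as np

def classify(x_bar, x_star):
    """Accuracy class of a solver's output, per the thresholds above."""
    err = np.linalg.norm(x_bar - x_star)
    if err <= 1e-6:
        return "optimal"
    if err <= 1e-1:
        return "acceptable"
    return "failed"   # hypothetical label for unacceptable solutions
```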

  19. Running Time vs. Distance from Unique Optimum (whole testset). [Scatter plot; x-axis: running times in seconds, $10^{-2}$ to $10^2$ (log scale); y-axis: $\|\bar{x} - x^*\|_2$, $10^{-12}$ to $10^6$ (log scale); solvers: ISAL1, SPGL1, YALL1, CPLEX, ℓ1-Magic, SparseLab/PDCO, Homotopy.]

  20. Running Time vs. Distance from Unique Optimum (high dynamic range). [Scatter plot with the same axes and solvers as slide 19.]

  21. Running Time vs. Distance from Unique Optimum (low dynamic range). [Scatter plot with the same axes and solvers as slide 19.]
