

  1. Relationships between necessary optimality conditions for the ℓ2-ℓ0 minimization problem.
     Emmanuel Soubies, Imaging in Paris Seminar, October 3, 2019.
     Signal and Communication Group, IRIT, Université de Toulouse, CNRS.
     With L. Blanc-Féraud and G. Aubert.


  3. Outline of the talk
     1. Introduction: ℓ2-ℓ0 minimization
     2. Necessary optimality conditions
     3. Relationship between optimality conditions
     4. Quantifying "optimal" points
     5. Algorithms and necessary optimality conditions
     6. Concluding remarks

  4. Introduction: ℓ2-ℓ0 minimization

  5. Formulation
     The ℓ2-ℓ0 minimization problem:
         \hat{x} \in \arg\min_{x \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| Ax - y \|^2 + \lambda \| x \|_0 \Big\}
     ◮ A ∈ ℝ^{M×N} with M ≪ N,
     ◮ Sparsity is modeled with the ℓ0 pseudo-norm: ‖x‖₀ = #{ i ∈ [1, …, N] : x_i ≠ 0 },
     ◮ Non-convex and NP-hard problem [Natarajan, 1995, Nguyen et al., 2019].

  6. Formulation (continued)
     Applications: inverse problems, statistical regression, machine learning, compressed sensing, ...
     (A small code sketch of the objective follows below.)
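To fix notation, here is a minimal NumPy sketch of the objective F0 being minimized; the function name `f0` and the argument names `A`, `y`, `lam` are illustrative choices, not from the slides.

```python
import numpy as np

def f0(x, A, y, lam):
    """l2-l0 objective: F0(x) = 0.5*||Ax - y||^2 + lam*||x||_0.

    `lam` is the trade-off parameter lambda; ||x||_0 counts non-zeros.
    """
    r = A @ x - y
    return 0.5 * r @ r + lam * np.count_nonzero(x)
```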

  7. A Brief Literature Review
     ◮ Convex relaxations
     ◮ Greedy algorithms
     ◮ Iterative thresholding algorithms
     ◮ Global optimization

  8. A Brief Literature Review: Convex relaxations
     ◮ Basis Pursuit De-Noising [Chen et al., 2001], LASSO [Tibshirani, 1996]:
         \hat{x} \in \arg\min_{x \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| Ax - y \|_2^2 + \lambda \| x \|_1 \Big\}
       Under some conditions (RIP [Candès et al., 2006, Candès and Wakin, 2008], incoherence [Donoho, 2006, Gribonval and Nielsen, 2003], ...), ℓ1-minimization achieves exact recovery,
     ◮ The convex non-convex strategy [Selesnick and Farshchian, 2017, Selesnick, 2017].
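As an illustration of how the relaxed ℓ1 problem is typically minimized, here is a plain ISTA (proximal gradient) sketch for the LASSO. This is a textbook scheme, not an algorithm taken from the slides; `n_iter` and the step size 1/L are illustrative choices.

```python
import numpy as np

def ista(A, y, lam, n_iter=500):
    """ISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1 (textbook sketch).

    Iterates x <- soft(x - (1/L) * A^T (Ax - y), lam / L),
    with L = ||A||_2^2 the Lipschitz constant of the smooth part.
    """
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```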

  9. A Brief Literature Review: Greedy algorithms
     Idea: add non-zero components to the solution one by one (see the OMP sketch below):
     ◮ Matching Pursuit (MP) [Mallat and Zhang, 1993], Orthogonal Matching Pursuit (OMP) [Pati et al., 1993], Orthogonal Least Squares (OLS) [Chen et al., 1991], ...
       → under some conditions, optimality guarantees for OMP [Tropp, 2004] and OLS [Soussen et al., 2013],
     ◮ Forward-backward extensions: Single Best Replacement (SBR) [Soussen et al., 2011], ...
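A compact sketch of OMP; the helper name `omp` and the stopping rule (fixed target sparsity `k`) are illustrative, and the cited papers remain the authoritative descriptions.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit, textbook version (target sparsity k >= 1).

    Greedily selects the column most correlated with the residual, then
    re-fits all selected coefficients by least squares on that support.
    """
    support, x_s = [], np.zeros(0)
    residual = y.astype(float)
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        if j in support:                            # nothing new to add: stop
            break
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x
```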

  10. A Brief Literature Review: Iterative thresholding algorithms
     ◮ Iterative Hard Thresholding (IHT) [Blumensath and Davies, 2009] (sketched below),
     ◮ Subspace Pursuit [Dai and Milenkovic, 2009],
     ◮ Hard Thresholding Pursuit [Foucart, 2011],
     ◮ Compressive Sampling Matching Pursuit (CoSaMP) [Needell and Tropp, 2009].
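For concreteness, a sketch of the k-sparse (constrained) variant of IHT; the penalized variant replaces the top-k projection by hard thresholding, which reappears with L-stationarity later. Names and the iteration count are illustrative.

```python
import numpy as np

def iht(A, y, k, n_iter=200):
    """Iterative Hard Thresholding, k-sparse variant (sketch).

    x <- H_k(x + (1/L) * A^T (y - Ax)), where H_k keeps the k largest
    entries in magnitude; L = ||A||_2^2 gives a safe step size.
    """
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x + A.T @ (y - A @ x) / L
        keep = np.argsort(np.abs(z))[-k:]  # indices of the k largest entries
        x = np.zeros_like(z)
        x[keep] = z[keep]
    return x
```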

  11. A Brief Literature Review: Global optimization
     Mixed integer programming together with branch-and-bound algorithms [Bourguignon et al., 2016]
     → limited to moderate-size problems.

  12. A Brief Literature Review: Continuous non-convex relaxations of the ℓ0-norm
         \hat{x} \in \arg\min_{x \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| Ax - y \|^2 + \Phi(x) \Big\}
     ◮ Adaptive Lasso [Zou, 2006],
     ◮ Nonnegative Garrote [Breiman, 1995],
     ◮ Exponential approximation [Mangasarian, 1996],
     ◮ Log-Sum Penalty [Candès et al., 2008],
     ◮ Smoothly Clipped Absolute Deviation (SCAD) [Fan and Li, 2001],
     ◮ Minimax Concave Penalty (MCP) [Zhang, 2010],
     ◮ ℓp-norms, 0 < p < 1 [Chartrand, 2007, Foucart and Lai, 2009],
     ◮ Smoothed ℓ0-norm Penalty (SL0) [Mohimani et al., 2009],
     ◮ Class of smooth non-convex penalties [Chouzenoux et al., 2013],
     ◮ Smoothed norm ratio [Repetti et al., 2015, Cherni et al., 2019].

  13. A Brief Literature Review: Continuous non-convex relaxations of the ℓ0-norm
     [Figure: the ℓ0, ℓ1, capped-ℓ1, ℓ0.5, Log-Sum, SCAD, MCP, and exponential penalties plotted on [−2, 2].]

  14. A Brief Literature Review: Continuous non-convex relaxations of the ℓ0-norm
     There exists a class of penalties Φ that lead to exact continuous relaxations of the ℓ2-ℓ0 functional, in the sense that the global minimizers of the two problems coincide [Soubies et al., 2017, Carlsson, 2019].

  15. Motivation of this work
     NP-hardness implies that
     ◮ one cannot expect, in general, to attain an optimal point,
     ◮ verifying the optimality of a point x̂ is also, in general, intractable.
     ⇒ Hence the interest in studying the "restrictiveness" of tractable necessary (but not sufficient) optimality conditions.

  16. Some notations
     ◮ I_N = {1, …, N},
     ◮ σ_x = { i ∈ I_N : x_i ≠ 0 } denotes the support of x ∈ ℝ^N,
     ◮ x_ω ∈ ℝ^{#ω} is the restriction of x ∈ ℝ^N to the elements indexed by ω,
     ◮ A_ω ∈ ℝ^{M×#ω} is the restriction of A ∈ ℝ^{M×N} to the columns indexed by ω,
     ◮ a_i = A_{{i}} ∈ ℝ^M denotes the i-th column of A.

  17. Necessary optimality conditions

  18. Local optimality
     Definition (Local optimality). A point x ∈ ℝ^N is a local minimizer of F0 (the ℓ2-ℓ0 objective F_0(x) = \tfrac{1}{2}\|Ax - y\|^2 + \lambda\|x\|_0 of slide 5) if and only if
         x \in \arg\min_{u \in \mathbb{R}^N} \big\{ \| Au - y \|^2 \ \text{s.t.}\ \sigma_u \subseteq \sigma_x \big\},
     or, equivalently, if x is such that ⟨a_i, Ax − y⟩ = 0 for all i ∈ σ_x.

  19. Local optimality (continued)
     ◮ Local minimizers of F0 are independent of λ,
     ◮ When rank(A) < N (e.g., M < N), the local minimizers of F0 are uncountable,
     ◮ An important subset is that of strict local minimizers: ∃ ε > 0, ∀ u ∈ B₂(x, ε) \ {x}, F0(x) < F0(u),
     ◮ Indeed, global minimizers of F0 are strict [Nikolova, 2013].
     (A code check of the equivalent condition follows below.)
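The equivalent condition ⟨a_i, Ax − y⟩ = 0 for all i ∈ σ_x is directly checkable; here is a small sketch, where the tolerance `tol` absorbs floating-point round-off and is an implementation choice.

```python
import numpy as np

def is_local_minimizer(x, A, y, tol=1e-10):
    """Check <a_i, Ax - y> = 0 for every i in the support of x.

    This is the slide's equivalent characterization of local optimality
    for F0; it also returns True for x = 0 (empty support).
    """
    support = np.nonzero(x)[0]
    return bool(np.all(np.abs(A[:, support].T @ (A @ x - y)) <= tol))
```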

  20. Local optimality
     Theorem (Strict local optimality for F0 [Nikolova, 2013]). A local minimizer x ∈ ℝ^N of F0 is strict if and only if rank(A_{σ_x}) = #σ_x.

  21. Local optimality (continued)
     ◮ A strict (local) minimizer of F0 can be easily computed:
       1. choose a support ω ∈ Ω_max, where
              \Omega_{\max} = \bigcup_{r=0}^{M} \Omega_r \quad \text{and} \quad \Omega_r := \{ \omega \subseteq I_N : \#\omega = r = \mathrm{rank}(A_\omega) \} \quad (\Omega_0 = \emptyset)
       2. solve (A_ω)^T A_ω x_ω = (A_ω)^T y.
     ⇒ Given A and y, we can compute all the strict (local) minimizers of F0 by solving the restricted normal equations for every ω ∈ Ω_max (see the sketch below).
     ◮ #Ω_max is finite (but huge).
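A sketch of step 2 for a given support; `strict_minimizer_from_support` is an illustrative name, and the full-rank condition rank(A_ω) = #ω from the theorem is taken for granted.

```python
import numpy as np

def strict_minimizer_from_support(A, y, omega):
    """Strict local minimizer of F0 supported on omega.

    Solves the restricted normal equations
    (A_omega^T A_omega) x_omega = A_omega^T y   (A_omega full column rank),
    then embeds x_omega back into R^N.
    """
    omega = list(omega)
    A_w = A[:, omega]
    x_w = np.linalg.solve(A_w.T @ A_w, A_w.T @ y)
    x = np.zeros(A.shape[1])
    x[omega] = x_w
    return x
```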

  22. Support-based optimality conditions
     Definition (Partial support coordinate-wise points [Beck and Hallak, 2018]). A local minimizer x ∈ ℝ^N of F0 is said to be partial support coordinate-wise (CW) optimal for F0 if it verifies
         F_0(x) \le \min \big\{ F_0(u) : u \in \{ u_x^-, u_x^{\mathrm{swap}}, u_x^+ \} \big\},
     where u_x^-, u_x^swap, and u_x^+ are local minimizers of F0 with supports
     - σ(u_x^-) = σ_x \ {i_x},
     - σ(u_x^swap) = (σ_x \ {i_x}) ∪ {j_x},
     - σ(u_x^+) = σ_x ∪ {j_x},
     for i_x ∈ arg min_{k ∈ σ_x} |x_k| and j_x ∈ arg max_{k ∈ (σ_x)^c} |⟨a_k, Ax − y⟩|.
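A sketch constructing the three neighbor supports of the definition; it assumes x is a local minimizer with a nonempty, non-full support, and ties in the arg min/arg max are broken arbitrarily. Checking CW optimality then amounts to computing the local minimizer on each support (e.g., with the normal-equation sketch above) and comparing F0 values.

```python
import numpy as np

def cw_candidate_supports(x, A, y):
    """Supports of u_x^-, u_x^swap, u_x^+ from the CW-optimality definition.

    i_x: smallest-magnitude entry inside the support sigma_x;
    j_x: index outside sigma_x maximizing |<a_k, Ax - y>|.
    Assumes sigma_x is nonempty and not all of {0, ..., N-1}.
    """
    sigma = set(np.nonzero(x)[0])
    outside = [k for k in range(A.shape[1]) if k not in sigma]
    r = A @ x - y
    i_x = min(sigma, key=lambda k: abs(x[k]))
    j_x = max(outside, key=lambda k: abs(A[:, k] @ r))
    return (sigma - {i_x},            # removal:  u_x^-
            (sigma - {i_x}) | {j_x},  # swap:     u_x^swap
            sigma | {j_x})            # addition: u_x^+
```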

  23. L-stationarity
     Definition (L-stationarity [Tropp, 2006, Beck and Hallak, 2018]). A point x ∈ ℝ^N is said to be L-stationary for F0 (L > 0) if
         x \in \arg\min_{u \in \mathbb{R}^N} \Big\{ \tfrac{1}{2} \| T_L(x) - u \|^2 + \tfrac{\lambda}{L} \| u \|_0 \Big\},
     where T_L(x) = x − L^{−1} A^T (Ax − y).

  24. L-stationarity (continued)
     ◮ For L ≥ ‖A‖², L-stationary points are fixed points of the IHT algorithm [Blumensath and Davies, 2009, Attouch et al., 2013].
     (A componentwise check of L-stationarity is sketched below.)
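Since the prox of u ↦ ½‖z − u‖² + (λ/L)‖u‖₀ is componentwise hard thresholding at √(2λ/L), L-stationarity can be checked entry by entry. The characterization coded below is a standard consequence of the definition (it is not spelled out on the slide), and `tol` is an implementation choice.

```python
import numpy as np

def is_L_stationary(x, A, y, lam, L, tol=1e-10):
    """Componentwise check of L-stationarity for F0.

    With g = A^T(Ax - y), x is L-stationary iff
      - g_i = 0 and |x_i| >= sqrt(2*lam/L) for i in the support,
      - |g_i| <= sqrt(2*lam*L)             for i off the support.
    """
    g = A.T @ (A @ x - y)
    on = x != 0
    ok_on = np.all(np.abs(g[on]) <= tol) and \
            np.all(np.abs(x[on]) >= np.sqrt(2 * lam / L) - tol)
    ok_off = np.all(np.abs(g[~on]) <= np.sqrt(2 * lam * L) + tol)
    return bool(ok_on and ok_off)
```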

  25. Conditions based on exact relaxations
     Exact continuous relaxations: motivation. Are there continuous relaxations of F0 of the form
         \tilde{F}(x) = \tfrac{1}{2} \| Ax - y \|^2 + \sum_{i=1}^{N} \phi_i(x_i),
     such that, for all y ∈ ℝ^M,
         \arg\min_{x \in \mathbb{R}^N} \tilde{F}(x) = \arg\min_{x \in \mathbb{R}^N} F_0(x),   (P1)
         x local minimizer of F̃  ⇒  x local minimizer of F0?   (P2)

  26. Conditions based on exact relaxations (continued)
     ◮ Properties (P1) and (P2) imply that local optimality for F̃ is a necessary optimality condition for F0.
     ◮ Moreover, since (P2) has no converse, F̃ can potentially remove local (but not global) minimizers of F0.
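One concrete member of this class of penalties, as I recall it from the cited work [Soubies et al., 2017], is the CEL0 penalty; the formula below should be checked against the paper, and the column-norm computation assumes A has no zero column.

```python
import numpy as np

def cel0_penalty(x, A, lam):
    """CEL0 penalty (recalled from [Soubies et al., 2017]; verify in the paper).

    phi_i(t) = lam - 0.5 * ||a_i||^2 * (|t| - sqrt(2*lam)/||a_i||)^2
               if |t| <= sqrt(2*lam)/||a_i||, and phi_i(t) = lam otherwise.
    Each phi_i is continuous, equals 0 at t = 0 and lam for large |t|;
    used in place of lam*||x||_0, it yields a relaxation F~ of F0
    satisfying (P1)-(P2).
    """
    norms = np.linalg.norm(A, axis=0)        # column norms ||a_i||
    knee = np.sqrt(2.0 * lam) / norms
    phi = np.full(len(x), lam, dtype=float)
    inside = np.abs(x) <= knee
    phi[inside] = lam - 0.5 * norms[inside] ** 2 * (np.abs(x[inside]) - knee[inside]) ** 2
    return float(np.sum(phi))
```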
