Exact ℓ0-norm optimization via branch-and-bound methods
Sébastien Bourguignon
Laboratoire des Sciences du Numérique de Nantes, École Centrale de Nantes
GdR MIA, Thematic day on Non-Convex Sparse Optimization, Toulouse, October 9th 2020
Joint work with:
Ramzi Ben Mhenni (LS2N-ECN, now LITIS, Université de Rouen)
Jordan Ninin (Lab-STICC / ENSTA Bretagne)
Marcel Mongeau (Université de Toulouse, ENAC)
Hervé Carfantan (Université de Toulouse, IRAP)
Outline
1. Why? Exact solutions to ℓ0-norm problems may achieve better estimates.
2. Who? Small to moderate size sparse problems can be solved exactly.
3. How? With a dedicated branch-and-bound strategy.
4. Where? Directions for further work.
S. Bourguignon ℓ0-norm optimization & branch-and-bound 2 / 16
Exactness: exact criterion, exact optimization
• True, unrelaxed ℓ0-"norm" criterion¹: $\|x\|_0 := \mathrm{Card}\{x_p \mid x_p \neq 0\}$
[Figure: some sparsity-enhancing functions $\sum_p \varphi(|x_p|)$ and their unit balls, including $\|x\|_1 = \sum_p |x_p|$ and $\|x\|_q^q = \sum_p |x_p|^q$.]
• Global optimization: optimality is guaranteed by the algorithm.
¹ "On (re)lâche rien!" (French wordplay: "we give up nothing" / "we relax nothing").
Exactness may be worth it . . .
• Natural formulation for many problems:
$\mathcal{P}_{2/0}: \min_{x \in \mathbb{R}^P} \tfrac{1}{2}\|y - Ax\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le K$
$\mathcal{P}_{0/2}: \min_{x \in \mathbb{R}^P} \|x\|_0 \quad \text{s.t.} \quad \tfrac{1}{2}\|y - Ax\|_2^2 \le \epsilon$
$\mathcal{P}_{2+0}: \min_{x \in \mathbb{R}^P} \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_0$
• Global optimum ⇒ better solution [Bertsimas et al., 2016; Bourguignon et al., 2016]
[Figure: five panels comparing estimates on the same data, each with its residual $\|y - Hx\|_2^2$ for the solution $x$ shown in the panel.]
Data and truth: 1.62
OMP: 6.07
ℓ1 relaxation: 2.36
SBR: 2.22
Global optimum: 1.43
Results taken from [Bourguignon et al., 2016].
. . . but exactness has a price
NP-hard²: $\|x\|_0 \le K$ ⇒ $\binom{P}{K}$ possible combinations . . . in the worst-case scenario!
Branch-and-Bound: eliminate (hopefully huge) sets of possible combinations without evaluating them.
Moderate-size problems ($P \sim$ a few hundred, $K \sim$ a few tens):
◮ one-dimensional problems
◮ deconvolution, time series spectral analysis, spectral unmixing, . . .
◮ variable/subset selection in statistics
² "On n'est jamais fort pour ce calcul." (French: "One is never good at this calculation.")
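To give a sense of the worst-case count above, here is a minimal sketch that evaluates the binomial coefficient with Python's standard `math.comb`. The function name `n_supports` and the sizes used are illustrative assumptions, not values from the talk.

```python
from math import comb

def n_supports(P: int, K: int) -> int:
    """Number of candidate supports of size exactly K among P variables."""
    return comb(P, K)

# Hypothetical problem sizes, chosen only to show how fast the count grows.
for P, K in [(20, 5), (100, 10)]:
    print(f"P={P}, K={K}: {n_supports(P, K)} candidate supports")
```

Counting supports of size exactly K is enough for the constraint ‖x‖0 ≤ K, since enlarging a support never increases the least-squares error.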
Mixed Integer Programming (MIP) reformulation (see [Bienstock 1996, Bertsimas et al. 2016, Bourguignon et al. 2016])
Big-M assumption: $\forall p,\ |x_p| \le M$. Then:
$\min_{x \in \mathbb{R}^P} \|y - Ax\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le K,\quad \forall p,\ |x_p| \le M$
⇔
$\min_{b \in \{0,1\}^P,\ x \in \mathbb{R}^P} \|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\quad \forall p,\ |x_p| \le M b_p$
Can be addressed by MIP solvers (CPLEX, GUROBI, . . .), but computation time increases and the approach is limited to small sizes.
Here: no need for a MIP reformulation or binary variables. Specific Branch-and-Bound construction for problems $\mathcal{P}_{2/0}$, $\mathcal{P}_{0/2}$, and $\mathcal{P}_{2+0}$.
Branch-and-Bound resolution [Land & Doig, 1960]
Decision tree for binary variables.
At each node, a lower bound is computed on all subproblems contained in this node: the remaining binary variables are relaxed into $[0, 1]$.
If this bound exceeds the best known solution, the branch is pruned.
⊲ Which variable $b_p$ to branch on? The highest relaxed variable.
⊲ Which side to explore first? $b_p = 1$.
⊲ Which node to explore first? Depth-first search.
⊲ Computation of relaxed solutions? Related to ℓ1-norm optimization . . .
[Figure: decision tree. The root $P^{(0)}$ branches on $b_{p_0}$: $b_{p_0} = 1$ leads to $P^{(1)}$ and $b_{p_0} = 0$ to $P^{(4)}$. $P^{(1)}$ branches on $b_{p_1}$ into $P^{(2)}$ ($b_{p_1} = 1$) and $P^{(3)}$ ($b_{p_1} = 0$). $P^{(4)}$ branches on $b_{p_4}$ into $P^{(5)}$ ($b_{p_4} = 1$) and $P^{(6)}$ ($b_{p_4} = 0$).]
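The search strategy listed above (depth-first, side $b_p = 1$ first, pruning by lower bound) can be sketched in pure Python for the cardinality-constrained problem. This is only an illustration under stated assumptions: the lower bound used here is a simpler one swapped in for brevity (unconstrained least squares over all variables not yet excluded, with no big-M bound), not the ℓ1/box relaxation of the talk; the branching rule picks the largest $|x_p|$ in that relaxed solution; the columns of `A` are assumed linearly independent; and all function names and data are hypothetical.

```python
def solve(G, r):
    """Solve the square system G z = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    aug = [list(map(float, G[i])) + [float(r[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))) / aug[i][i]
    return z

def ls_cost(A, y, T):
    """Return (min over x_T of 0.5*||y - A_T x_T||^2, minimizer) via the normal equations."""
    rows = range(len(y))
    if not T:
        return 0.5 * sum(v * v for v in y), []
    G = [[sum(A[n][p] * A[n][q] for n in rows) for q in T] for p in T]
    r = [sum(A[n][p] * y[n] for n in rows) for p in T]
    x = solve(G, r)
    res = [y[n] - sum(A[n][p] * xp for p, xp in zip(T, x)) for n in rows]
    return 0.5 * sum(v * v for v in res), x

def branch_and_bound(A, y, K):
    """Depth-first branch-and-bound for min 0.5*||y - A x||^2 s.t. ||x||_0 <= K."""
    best = {"cost": float("inf"), "support": []}

    def visit(S1, free):
        if len(S1) == K:                     # budget used up: other variables are set to 0
            free = []
        bound, x = ls_cost(A, y, S1 + free)  # lower bound (simplified, see lead-in)
        if bound >= best["cost"]:
            return                           # prune this branch
        if len(S1) + len(free) <= K:         # relaxed solution is feasible: leaf node
            best["cost"], best["support"] = bound, S1 + free
            return
        x_free = x[len(S1):]
        j = max(range(len(free)), key=lambda i: abs(x_free[i]))
        p, rest = free[j], free[:j] + free[j + 1:]
        visit(S1 + [p], rest)                # explore the side b_p = 1 first
        visit(S1, rest)                      # then the side b_p = 0

    visit([], list(range(len(A[0]))))
    return best["cost"], sorted(best["support"])

# Hypothetical toy data: y is close to 2*a_0 - 3*a_3.
A = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 1]]
y = [-1.1, -2.9, -3.0, 2.1, -3.1]
cost, support = branch_and_bound(A, y, K=2)
print(support, cost)
```

Recursion gives the depth-first order for free; a tighter bound, such as the relaxation discussed in the talk, would typically prune more nodes.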
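MIP continuous relaxation and ℓ1 norm
$\mathcal{P}_{2/0}: \min_{x \in \mathbb{R}^P,\ b \in \{0,1\}^P} \tfrac{1}{2}\|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\quad \forall p,\ -M b_p \le x_p \le M b_p$
$\mathcal{P}^{R}_{2/0}: \min_{x \in \mathbb{R}^P,\ b \in [0,1]^P} \tfrac{1}{2}\|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\quad \forall p,\ -M b_p \le x_p \le M b_p$
[Figure: feasible sets in the $(b_p, x_p)$ plane for both problems, with $b_p$ between 0 and 1 and $x_p$ between $-M$ and $M$.]
We have³:
$\min \mathcal{P}^{R}_{2/0} = \min_{x \in \mathbb{R}^P} \tfrac{1}{2}\|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p |x_p| \le MK,\quad \forall p,\ |x_p| \le M$
³ Proof: for a solution $(x^\star, b^\star)$ of $\mathcal{P}^{R}_{2/0}$, we have $|x^\star| = M b^\star$ . . .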
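One possible way to read the equivalence above: for a fixed $x$ with $|x_p| \le M$, the smallest admissible relaxed $b_p$ is $|x_p| / M$, so a feasible $b$ exists exactly when $\sum_p |x_p| \le MK$. The sketch below checks both feasibility tests on a few hand-picked points; the function names and the test values are assumptions made for illustration.

```python
def smallest_feasible_b(x, M):
    """Smallest b in [0, 1]^P with |x_p| <= M * b_p, assuming |x_p| <= M for all p."""
    return [abs(xp) / M for xp in x]

def feasible_in_relaxation(x, M, K):
    """True if some b in [0, 1]^P satisfies sum(b) <= K and |x_p| <= M * b_p."""
    if any(abs(xp) > M for xp in x):
        return False
    return sum(smallest_feasible_b(x, M)) <= K

def feasible_l1_box(x, M, K):
    """The same test written with l1-norm and box constraints only (no b)."""
    return max(abs(xp) for xp in x) <= M and sum(abs(xp) for xp in x) <= M * K

# Hypothetical points with M = 2 and K = 2.
for x in ([2.0, -1.0, 0.5], [2.0, -2.0, 1.0], [3.0, 0.0, 0.0]):
    print(x, feasible_in_relaxation(x, 2.0, 2), feasible_l1_box(x, 2.0, 2))
```

Both tests agree on these points; exactly on the boundary $\sum_p |x_p| = MK$, floating-point rounding could make them differ.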
Continuous relaxation within the branch-and-bound procedure
At a given node:
◮ $b_{S_0} = 0$ and $x_{S_0} = 0$
◮ $b_{S_1} = 1$ and $|x_{S_1}| \le M$
◮ $b_S$ free and $|x_S| \le M b_S$
$\sum_p b_p = \mathrm{Card}\, S_1 + \sum_{p \in S} b_p$
[Figure: decision tree with nodes $P^{(0)}$ to $P^{(6)}$.]
The relaxed problem at node $i$ reads equivalently:
$\mathcal{P}^{R(i)}_{2/0}: \min_{x_S,\ x_{S_1}} \tfrac{1}{2}\|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \quad \text{s.t.} \quad \|x_S\|_1 \le M(K - \mathrm{Card}\, S_1),\quad \|x_S\|_\infty \le M,\quad \|x_{S_1}\|_\infty \le M$
⊲ Least squares, ℓ1 norm (partially) and box constraints.
⊲ No binary variables!
Optimization with (partial) ℓ1-norm and box constraints
• Homotopy continuation principle
Standard case [Osborne et al. 2000]:
$\min_{x} \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1$
With free variables and box constraints:
$\min_{x_S,\ x_{S_1}} \tfrac{1}{2}\|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 + \lambda \|x_S\|_1 \quad \text{s.t.} \quad \|x_S\|_\infty \le M,\quad \|x_{S_1}\|_\infty \le M$
[Figure: solution paths $x^*$ as functions of $\lambda$. Standard case: breakpoints $\lambda^{(0)}, \dots, \lambda^{(6)}$. With free variables and box constraints: paths bounded between $-M$ and $M$, breakpoints $\lambda^{(0)}, \dots, \lambda^{(4)}$, and a target value $\lambda^*$.]
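To make the solution path concrete, here is a minimal sketch of the simplest special case, assuming that the columns of $A$ are orthonormal. The problem then separates per coordinate and the solution is a clipped soft-thresholding of $A^T y$, with breakpoints at the values $|a_p^T y|$. This is not the general homotopy algorithm of the talk, which handles correlated columns; the function names and numbers are hypothetical.

```python
def soft_threshold(c, lam):
    """Minimizer over x of 0.5*(x - c)**2 + lam*|x|."""
    if abs(c) <= lam:
        return 0.0
    return c - lam if c > 0 else c + lam

def clip(x, M):
    """Project x onto the box [-M, M]."""
    return max(-M, min(M, x))

def path_point(corr_S, corr_S1, lam, M):
    """Solution at a given lam when A has orthonormal columns (assumption).

    corr_S, corr_S1: entries of A^T y for the penalized (S) and
    unpenalized (S1) variables.
    """
    x_S = [clip(soft_threshold(c, lam), M) for c in corr_S]
    x_S1 = [clip(c, M) for c in corr_S1]
    return x_S, x_S1

# Hypothetical correlations; the breakpoints are the sorted values of |c|.
corr_S, corr_S1, M = [3.0, -1.5, 0.5], [2.5], 1.5
for lam in [3.0, 1.5, 0.5, 0.0]:
    print(lam, path_point(corr_S, corr_S1, lam, M))
```

Decreasing `lam` from the largest breakpoint activates the variables one by one, until a component reaches the bound $\pm M$ and stays there.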
Homotopy continuation
It similarly solves the relaxations for the sparsity-constrained problem:
$\mathcal{P}^{R(i)}_{2/0}: \min_{x_S,\ x_{S_1}} \tfrac{1}{2}\|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \quad \text{s.t.} \quad \|x_S\|_1 \le \tau^\star,\quad \|x_S\|_\infty \le M,\quad \|x_{S_1}\|_\infty \le M$
and for the error-constrained problem:
$\mathcal{P}^{R(i)}_{0/2}: \min_{x_S,\ x_{S_1}} \|x_S\|_1 \quad \text{s.t.} \quad \tfrac{1}{2}\|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \le \epsilon^\star,\quad \|x_S\|_\infty \le M,\quad \|x_{S_1}\|_\infty \le M$
[Figure: Pareto curve of $\|x^*_S\|_1$ against $\tfrac{1}{2}\|y - A_{S_1} x^*_{S_1} - A_S x^*_S\|_2^2$, with breakpoints $\lambda^{(0)}, \dots, \lambda^{(4)}$ and the point $\lambda^*$ reached at the targets $\tau^\star$ and $\epsilon^\star$.]
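The link between the target $\tau^\star$ and the corresponding $\lambda^*$ can be illustrated in the same simplified setting as a sketch: assuming orthonormal columns, inactive box constraints and no unpenalized variables, $\|x(\lambda)\|_1 = \sum_p \max(|c_p| - \lambda, 0)$ with $c = A^T y$, and walking through the breakpoints gives the $\lambda$ that meets a given ℓ1 budget. The function names and values are hypothetical, and this does not reproduce the general procedure of the talk.

```python
def l1_norm_at(corr, lam):
    """l1 norm of the soft-thresholded solution at a given lam."""
    return sum(max(abs(c) - lam, 0.0) for c in corr)

def lambda_for_l1_budget(corr, tau):
    """Return lam >= 0 with l1_norm_at(corr, lam) == tau, or 0.0 if the budget is inactive.

    Assumes tau >= 0. Walks the breakpoints |c_p| in decreasing order.
    """
    mags = sorted((abs(c) for c in corr), reverse=True)
    if sum(mags) <= tau:
        return 0.0
    total = 0.0
    for k, m in enumerate(mags, start=1):
        total += m
        lam = (total - tau) / k           # value that meets the budget with k active variables
        nxt = mags[k] if k < len(mags) else 0.0
        if lam >= nxt:                    # the (k+1)-th variable is indeed inactive
            return lam

# Hypothetical correlations and l1 budget.
corr, tau = [3.0, -1.5, 0.5], 2.0
lam_star = lambda_for_l1_budget(corr, tau)
print(lam_star, l1_norm_at(corr, lam_star))
```

In this separable case the ℓ1 norm decreases piecewise linearly with `lam`, so following the path until the budget is met needs at most one pass over the breakpoints.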