A k-norm-based Mixed Integer Programming formulation for sparse optimization
M. Gaudioso, G. Giallombardo, G. Miglionico
DIMES, Università della Calabria, Rende (CS), Italy
GdR MIA Thematic Day on Non-Convex Sparse Optimization, Friday, October 9th, 2020
The issues
- The $\ell_0$ pseudo-norm and the $k$-norm;
- A $k$-norm-based discrete formulation of the sparse optimization problem;
- Continuous relaxation;
- Application to classification.
Outline
1. Sparse Optimization and polyhedral k-norm
2. Two Mixed Integer Programming (MIP) formulations for the Sparse Optimization problem
3. SVM classification, Feature Selection and Sparse Optimization
4. Numerical experiments
5. Bibliography
Sparse optimization
The sparse optimization problem:
$$ f_0^* = \min_{x \in \mathbb{R}^n} \; f(x) + \|x\|_0 \qquad (P_0), $$
with $f : \mathbb{R}^n \to \mathbb{R}$ convex and not necessarily differentiable, and $n \ge 2$.
The $\ell_0$ pseudo-norm $\|\cdot\|_0$ counts the number of non-zero components of $x$.
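As a quick illustration of the $\ell_0$ pseudo-norm (a sketch added here, not part of the original slides), the count of non-zero components can be computed as follows; the tolerance `tol` is an assumption, since in floating point "non-zero" needs a threshold.

```python
import numpy as np

def l0_pseudo_norm(x, tol=1e-10):
    """Number of components of x with |x_i| > tol (the l0 'pseudo-norm')."""
    return int(np.sum(np.abs(x) > tol))

x = np.array([0.0, 3.0, 0.0, -1.5, 0.0])
print(l0_pseudo_norm(x))  # 2
```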
A class of polyhedral norms
The $k$-norm of $x$, $\|x\|_{[k]}$, is the sum of the $k$ maximal components (in modulus) of $x$, $k = 1, \dots, n$. The following hold:
i) $\|x\|_\infty = \|x\|_{[1]} \le \dots \le \|x\|_{[k]} \le \dots \le \|x\|_{[n]} = \|x\|_1$;
ii) $\|x\|_0 \le k \;\Rightarrow\; \|x\|_1 - \|x\|_{[s]} = 0$ for $k \le s \le n$.
In particular, for $1 \le k \le n$,
$$ \|x\|_0 \le k \;\Longleftrightarrow\; \|x\|_1 - \|x\|_{[k]} = 0. $$
This property allows us to replace the cardinality constraint $\|x\|_0 \le k$ with a constraint on a difference of norms (a Difference of Convex, DC, constraint).
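A minimal numerical check of the equivalence above (an illustrative sketch, not from the slides): the gap $\|x\|_1 - \|x\|_{[k]}$ vanishes exactly for $k \ge \|x\|_0$.

```python
import numpy as np

def k_norm(x, k):
    """Sum of the k largest components of x in absolute value."""
    return np.sort(np.abs(x))[::-1][:k].sum()

x = np.array([0.0, 3.0, 0.0, -1.5, 0.0])   # ||x||_0 = 2
for k in range(1, x.size + 1):
    gap = np.abs(x).sum() - k_norm(x, k)   # ||x||_1 - ||x||_[k]
    print(k, gap)                          # gap > 0 for k = 1, gap == 0 for k >= 2
```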
Differential properties of the k-norm
Let $\bar x \in \mathbb{R}^n$ and let $I^{[k]} \triangleq \{i_1, \dots, i_k\}$ be the index set of $k$ maximal components (in modulus) of $\bar x$. A subgradient $\bar g^{[k]}$ of $\|\cdot\|_{[k]}$ at $\bar x$ is
$$ \bar g_i^{[k]} = \begin{cases} 1 & \text{if } i \in I^{[k]} \text{ and } \bar x_i \ge 0 \\ -1 & \text{if } i \in I^{[k]} \text{ and } \bar x_i < 0 \\ 0 & \text{otherwise.} \end{cases} $$
It holds that
$$ \|\bar x\|_{[k]} = \max_{y \in \psi_k} y^\top \bar x, $$
where $\psi_k$ is the subdifferential of $\|\cdot\|_{[k]}$ at the point $0$,
$$ \psi_k = \{ y \in \mathbb{R}^n \mid y = u - v, \; 0 \le u, v \le e, \; (u+v)^\top e = k \}, $$
and $e$ is the vector of $n$ ones.
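A small sketch (illustrative, reusing the `k_norm` helper above) that builds the subgradient $\bar g^{[k]}$ and checks that it attains the maximum in the representation $\|\bar x\|_{[k]} = \max_{y \in \psi_k} y^\top \bar x$.

```python
import numpy as np

def k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

def k_norm_subgradient(x, k):
    """+/-1 on an index set of k components of maximal modulus, 0 elsewhere."""
    g = np.zeros_like(x, dtype=float)
    idx = np.argsort(-np.abs(x))[:k]           # indices of the k largest |x_i|
    g[idx] = np.where(x[idx] >= 0, 1.0, -1.0)
    return g

x = np.array([0.5, -2.0, 1.0, 0.0])
g = k_norm_subgradient(x, k=2)
print(g, g @ x, k_norm(x, 2))                  # g @ x == ||x||_[2] == 3.0
```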
Standard formulation
Introduce a set of binary variables $z_k$, $k = 1, \dots, n$, as "flags" for the non-zero components of $x$:
$$ f_I^* = \min_{x,z} \; f(x) + \sum_{k=1}^n z_k $$
$$ -M z_k \le x_k \le M z_k, \quad k = 1, \dots, n $$
$$ z_k \in \{0,1\}, \quad k = 1, \dots, n, $$
where $M$ is the classic "big-$M$" parameter. At the optimum, $x_k \ne 0 \Leftrightarrow z_k = 1$, hence $\sum_{k=1}^n z_k$ is exactly the $\ell_0$ pseudo-norm of $x$.
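This formulation can be prototyped with an off-the-shelf MIP modeller. Below is a minimal sketch (not the authors' code), assuming the PuLP package is available and taking an LP-representable loss $f(x) = \sum_i |a_i^\top x - b_i|$ chosen purely for illustration; the random data and the value of $M$ are arbitrary.

```python
import numpy as np
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

rng = np.random.default_rng(0)
m, n, M = 8, 5, 100.0                       # M: "big-M" bound on |x_k|
A = rng.normal(size=(m, n)).tolist()
b = rng.normal(size=m).tolist()

prob = LpProblem("sparse_big_M", LpMinimize)
x = [LpVariable(f"x{k}", lowBound=-M, upBound=M) for k in range(n)]
z = [LpVariable(f"z{k}", cat="Binary") for k in range(n)]
t = [LpVariable(f"t{i}", lowBound=0) for i in range(m)]

prob += lpSum(t) + lpSum(z)                 # objective: f(x) + sum_k z_k
for i in range(m):                          # t_i >= |a_i^T x - b_i|
    prob += lpSum(A[i][k] * x[k] for k in range(n)) - b[i] <= t[i]
    prob += b[i] - lpSum(A[i][k] * x[k] for k in range(n)) <= t[i]
for k in range(n):                          # -M z_k <= x_k <= M z_k
    prob += x[k] <= M * z[k]
    prob += -M * z[k] <= x[k]

prob.solve()
print("x =", [v.value() for v in x])
print("z =", [v.value() for v in z])        # flags the non-zero components
```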
Continuous relaxation of the standard formulation
Replace the binary constraint $z_k \in \{0,1\}$ with $0 \le z_k \le 1$, $k = 1, \dots, n$. At the optimum at least one of the pair of constraints
$$ -M z_k \le x_k \le M z_k, \quad k = 1, \dots, n, $$
is satisfied as an equality, and $z_k = \frac{|x_k|}{M}$. Thus we obtain
$$ \sum_{k=1}^n z_k = \frac{1}{M} \|x\|_1. $$
The objective function of the continuous relaxation is finally
$$ F(x) = f(x) + \frac{1}{M} \|x\|_1, $$
which coincides with the classic $\ell_1$ regularization of $f$ (the LASSO approach).
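A quick numerical sanity check of this relaxation (illustrative; the data and $M$ are arbitrary): for a fixed $x$, the smallest feasible relaxed flags are $z_k = |x_k|/M$, so their sum equals $\|x\|_1 / M$.

```python
import numpy as np

M = 100.0
x = np.array([0.0, 3.0, 0.0, -1.5, 0.0])
z = np.abs(x) / M                     # tightest z with -M z_k <= x_k <= M z_k
print(z.sum(), np.abs(x).sum() / M)   # both equal ||x||_1 / M = 0.045
```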
The k-norm formulation
Rewrite the equivalence
$$ \|x\|_0 \le k \;\Longleftrightarrow\; \|x\|_1 - \|x\|_{[k]} = 0 $$
as
$$ \|x\|_0 > k \;\Longleftrightarrow\; \|x\|_1 - \|x\|_{[k]} > 0. $$
Introduce the binary variables $y_k$, $k = 1, \dots, n$, and define
$$ f_I^* = \min_{x,y} \; f(x) + \sum_{k=1}^n y_k $$
$$ \|x\|_1 - \|x\|_{[k]} \le M' y_k, \quad k = 1, \dots, n $$
$$ y_k \in \{0,1\}, \quad k = 1, \dots, n. $$
At the optimum,
$$ \|x\|_1 - \|x\|_{[k]} = 0 \;\Longleftrightarrow\; y_k = 0. $$
Properties
At the optimum $y_n = 0$ and, as long as $x \ne 0$,
$$ \sum_{k=1}^n y_k = \max \left\{ s \;\middle|\; \|x\|_1 - \|x\|_{[s]} > 0 \right\}, $$
thus, since the gap is positive exactly for $s < \|x\|_0$,
$$ \sum_{k=1}^n y_k = \|x\|_0 - 1. $$
Remark: the constraints are of DC type.
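An illustrative check of this count (reusing the `k_norm` helper; data arbitrary): for a given non-zero $x$, the smallest feasible binary flags are $y_k = 1$ exactly when $\|x\|_1 - \|x\|_{[k]} > 0$, and they sum to $\|x\|_0 - 1$.

```python
import numpy as np

def k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

x = np.array([0.0, 3.0, 0.0, -1.5, 2.0])      # ||x||_0 = 3
gaps = np.array([np.abs(x).sum() - k_norm(x, k) for k in range(1, x.size + 1)])
y = (gaps > 1e-12).astype(int)                 # minimal feasible binary flags
print(y, y.sum())                              # [1 1 0 0 0], sum = ||x||_0 - 1 = 2
```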
Continuous relaxation of the k-norm formulation
Replace the binary constraint $y_k \in \{0,1\}$ with $0 \le y_k \le 1$, $k = 1, \dots, n$. At the optimum the constraints
$$ \|x\|_1 - \|x\|_{[k]} \le M' y_k, \quad k = 1, \dots, n, $$
are satisfied as equalities, hence
$$ y_k = \frac{1}{M'} \left( \|x\|_1 - \|x\|_{[k]} \right) $$
and
$$ \sum_{k=1}^n y_k = \frac{1}{M'} \left( n \|x\|_1 - \sum_{k=1}^n \|x\|_{[k]} \right). $$
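To see how this relaxed penalty behaves, a small comparison (an illustration added here, with arbitrary data and $M'$): two vectors with the same $\ell_1$ norm, one sparse and one dense; the penalty is zero on the sparser one.

```python
import numpy as np

def k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

def relaxed_penalty(x, Mp):
    """(1/M') * (n*||x||_1 - sum_k ||x||_[k])."""
    n = x.size
    return (n * np.abs(x).sum() - sum(k_norm(x, k) for k in range(1, n + 1))) / Mp

Mp = 50.0
sparse = np.array([4.0, 0.0, 0.0, 0.0])       # ||.||_1 = 4, one non-zero
dense  = np.array([1.0, 1.0, 1.0, 1.0])       # ||.||_1 = 4, all non-zero
print(relaxed_penalty(sparse, Mp), relaxed_penalty(dense, Mp))  # 0.0 vs 0.12
```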
Formulation
The relaxation is now
$$ \min_{x \in \mathbb{R}^n} \Phi(x), \qquad (1) $$
with
$$ \Phi(x) = f(x) + \frac{\sigma}{M'} \left( n \|x\|_1 - \sum_{k=1}^n \|x\|_{[k]} \right), $$
where $\sigma > 0$ acts as a trade-off parameter. Note that the function $\Phi$ is DC, with $k$-norms embedded in it.
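A minimal sketch of evaluating $\Phi$ through its DC decomposition (convex part $f(x) + \frac{\sigma n}{M'}\|x\|_1$, concave part $-\frac{\sigma}{M'}\sum_k \|x\|_{[k]}$); the quadratic $f$, the data, and the parameter values are arbitrary illustrative choices, not from the slides.

```python
import numpy as np

def k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

def phi(x, f, sigma, Mp):
    """Phi(x) = f(x) + (sigma/M') * (n*||x||_1 - sum_k ||x||_[k])."""
    n = x.size
    convex_part = f(x) + sigma * n * np.abs(x).sum() / Mp
    concave_part = -sigma * sum(k_norm(x, k) for k in range(1, n + 1)) / Mp
    return convex_part + concave_part

f = lambda x: 0.5 * np.sum((x - np.array([1.0, 0.0, -2.0, 0.0])) ** 2)  # toy convex loss
x = np.array([1.0, 0.0, -2.0, 0.0])
print(phi(x, f, sigma=1.0, Mp=50.0))
```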