Operator splitting techniques and their application to embedded optimization problems


  1. Operator splitting techniques and their application to embedded optimization problems
     Puya Latafat (joint work with Panagiotis Patrinos)
     IMT School for Advanced Studies Lucca, puya.latafat@imtlucca.it
     Department of Electrical Engineering (ESAT-STADIUS), KU Leuven, panos.patrinos@esat.kuleuven.be
     September 1, 2016

  2. Outline
     ◮ structured optimization problem
     ◮ monotone operators and the splitting principle
     ◮ a primal-dual algorithm
     ◮ application: distributed optimization

     Based on:
     1. Latafat and Patrinos, "Asymmetric Forward-Backward-Adjoint Splitting for Solving Monotone Inclusions Involving Three Operators," arXiv preprint arXiv:1602.08729 (2016).
     2. Latafat, Stella, and Patrinos, "New Primal-Dual Proximal Algorithms for Distributed Optimization," accepted for the 55th IEEE Conference on Decision and Control (2016).

  3. Structured Optimization Problem

     minimize_{x ∈ ℝⁿ}  f(x) + g(Lx) + h(x)
                        nonsmooth, nonsmooth, smooth

     ◮ f : ℝⁿ → R̄ and g : ℝᵐ → R̄ are proper closed convex functions with easy-to-compute proximal maps
     ◮ L is a linear operator from ℝⁿ to ℝᵐ
     ◮ h is a differentiable function and ∇h(·) is β-Lipschitz

     example 1: MPC formulations, with h the quadratic cost, L encoding the dynamics, and g and f indicator functions for the constraints on states and inputs
     example 2: distributed optimization over graphs (this talk)
     more examples: machine learning and signal processing

     ◮ Goal: find the solution as a fixed point of an operator

  4. Example: Generalized Lasso with Box Constraint

     minimize_{x ∈ ℝⁿ}  ½‖Ax − b‖²₂ + λ‖Lx‖₁
     subject to  l ≤ x ≤ u

     equivalently,

     minimize_{x ∈ ℝⁿ}  ½‖Ax − b‖²₂ + λ‖Lx‖₁ + δ_{l≤x≤u}(x)
                        h(x)           g(Lx)     f(x)

     ◮ we want algorithms that involve only L, L⊤, prox_f, prox_g, ∇h
     ◮ prox_{g∘L} is not trivial (unless L⊤L = αId)
     ◮ no inner loops or linear systems to solve
     ◮ no need to introduce dummy variables
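To make these ingredients concrete, here is a minimal NumPy sketch of the four pieces the algorithms below will call (all data, λ, and the box bounds are made up for illustration; note that prox_g is the prox of λ‖·‖₁ alone, which is exactly why the nontrivial prox of g∘L is worth avoiding):

```python
import numpy as np

# Illustrative problem data for the box-constrained generalized lasso.
rng = np.random.default_rng(0)
m, n = 20, 10
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
L = np.eye(n, k=1)[: n - 1] - np.eye(n)[: n - 1]   # finite-difference operator
lam, l, u = 0.1, -np.ones(n), np.ones(n)

grad_h = lambda x: A.T @ (A @ x - b)               # gradient of h(x) = 1/2 ||Ax - b||^2
prox_f = lambda x, gamma: np.clip(x, l, u)         # prox of the box indicator = projection
prox_g = lambda z, gamma: np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)  # soft-threshold
```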

  5. Subgradients and Monotone Operators
     ◮ the subdifferential of f is the set-valued operator
       ∂f : x ↦ { u ∈ ℝⁿ | ⟨y − x, u⟩ + f(x) ≤ f(y) for all y ∈ dom f }
     example: 0 ∈ ∂f(x⋆) ⇒ f(x⋆) ≤ f(y) for all y ∈ dom f
     example: for differentiable f, ∂f = ∇f
     [figure: a convex f with two affine minorants f(x₁) + ⟨u₁, x − x₁⟩ and f(x₁) + ⟨u₂, x − x₁⟩, both supporting f at the kink x₁]
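A worked one-dimensional instance of the definition, matching the picture of the two supporting lines:

```latex
% Subdifferential of f(x) = |x| on the real line: away from 0 it is the usual
% derivative; at the kink every slope in [-1, 1] gives a valid affine minorant.
\[
\partial |\cdot|(x) =
\begin{cases}
\{\operatorname{sign}(x)\}, & x \neq 0,\\
[-1,\,1], & x = 0,
\end{cases}
\qquad\text{so } 0 \in \partial|\cdot|(0) \text{ certifies that } 0 \text{ is the minimizer.}
\]
```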

  6. ◮ a set-valued mapping A is monotone if
       ⟨x − y, u − v⟩ ≥ 0  for all x, y and all u ∈ Ax, v ∈ Ay
     example: ∂f for a proper convex function f
     ◮ A is maximally monotone if its graph is not properly contained in the graph of another monotone mapping
     example: ∂f for a proper closed convex function f
     [figure: the graph of a maximally monotone operator in the (x, y)-plane]

  8. ◮ proximal mapping of a proper closed convex function f:
       prox_{γf}(x) = argmin_z { f(z) + 1/(2γ) ‖x − z‖² }
     ◮ the minimizer is unique
     ◮ closed-form solutions exist for many functions, such as the ℓ₁ and ℓ₂ norms, quadratics, the log barrier, ...
     example: f = δ_C ⇒ prox_{γf} = P_C
     example: f(x) = ½x⊤Qx + q⊤x ⇒ prox_{γf}(x) = (I + γQ)⁻¹(x − γq)
     ◮ equivalently 0 ∈ z − x + γ∂f(z), i.e. prox_{γf} is the resolvent of ∂f:
       J_{γ∂f} = (Id + γ∂f)⁻¹ = prox_{γf}
     ◮ not every monotone operator can be written as the subdifferential of a function
     example: a linear skew-symmetric map is monotone but is not the subdifferential of any function
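A quick numerical check of the closed-form prox and of the skew example (a sketch; sizes and data are arbitrary):

```python
import numpy as np

n, gamma = 5, 0.5
rng = np.random.default_rng(1)
M = rng.standard_normal((n, n))
Q, q = M.T @ M, rng.standard_normal(n)            # Q symmetric positive semidefinite
x = rng.standard_normal(n)

# prox of f(x) = 1/2 x'Qx + q'x is (I + gamma*Q)^{-1}(x - gamma*q) ...
p = np.linalg.solve(np.eye(n) + gamma * Q, x - gamma * q)
# ... and must satisfy the resolvent condition 0 = p - x + gamma*(Qp + q)
assert np.allclose(p - x + gamma * (Q @ p + q), 0.0)

# a skew-symmetric linear map S is monotone since <z, Sz> = 0 for all z,
# and its resolvent is a plain linear solve, yet S is not a subdifferential
S = np.array([[0.0, 1.0], [-1.0, 0.0]])
J = np.linalg.inv(np.eye(2) + gamma * S)          # resolvent (Id + gamma*S)^{-1}
```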

  9. Operator Splitting Framework

     minimize_{x ∈ ℝⁿ}  f(x) + g(Lx) + h(x)
                        nonsmooth, nonsmooth, smooth

     ◮ unconstrained minimization
     ◮ optimality condition: 0 ∈ ∂f(x) + L*∂g(Lx) + ∇h(x)
     ◮ monotone inclusion form: 0 ∈ Ax + L*BLx + Cx
     ◮ A and B (here ∂f and ∂g) are set-valued; C (here ∇h) is single-valued
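For completeness, the standard argument behind the inclusion, written out (the qualification condition needed for the subdifferential chain rule is assumed to hold):

```latex
\begin{align*}
0 \in \partial f(x) + L^* \partial g(Lx) + \nabla h(x)
&\iff \exists\, y:\; -L^* y - \nabla h(x) \in \partial f(x)
      \;\text{ and }\; y \in \partial g(Lx)\\
&\iff \exists\, y:\; -L^* y - \nabla h(x) \in \partial f(x)
      \;\text{ and }\; Lx \in \partial g^*(y),
\end{align*}
% the last step uses y \in \partial g(z) \iff z \in \partial g^*(y);
% the pair (x, y) is the primal-dual point that the algorithm below tracks.
```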

  10. Initial Value Problem and Euler's Methods

     the path-following problem:  dx(t)/dt = −∇h(x(t)),  x(0) = x₀
     x(t) → x⋆ such that x⋆ minimizes h(·)

     ◮ Euler's forward method (explicit):
       (x(t + Δt) − x(t))/Δt ≈ −∇h(x(t))  ⇒  x_{k+1} = x_k − γ∇h(x_k)
     ◮ Euler's backward method (implicit):
       (x(t + Δt) − x(t))/Δt ≈ −∇h(x(t + Δt))  ⇒  x_{k+1} = x_k − γ∇h(x_{k+1})
     ◮ the implicit method is known to have better stability properties
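A toy comparison on an ill-conditioned quadratic illustrates the stability remark (all numbers are illustrative; with this step size the explicit iteration diverges while the implicit one converges for any γ > 0):

```python
import numpy as np

# Gradient flow of h(x) = 1/2 x'Qx with an ill-conditioned Q.
Q = np.diag([1.0, 100.0])
gamma = 0.05                                       # exceeds 2/L = 0.02 for the stiff mode
x_fwd = x_bwd = np.ones(2)

for _ in range(50):
    x_fwd = x_fwd - gamma * (Q @ x_fwd)            # forward Euler: multiplier 1 - 100*gamma = -4
    x_bwd = np.linalg.solve(np.eye(2) + gamma * Q, x_bwd)  # backward Euler: always a contraction

print(np.linalg.norm(x_fwd), np.linalg.norm(x_bwd))  # forward blows up, backward tends to 0
```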

  11. ◮ the big idea is to generalize this to the inclusion problem
        0 ∈ dx(t)/dt + Tx(t)
     ◮ the forward step (explicit):  x_{k+1} ∈ x_k − γTx_k
     ◮ the backward step (implicit, less sensitive to ill-conditioning):
       x_{k+1} ∈ (Id + γT)⁻¹x_k,  where (Id + γT)⁻¹ is the resolvent
     ◮ the splitting principle: find x such that 0 ∈ Tx
     ◮ the idea of the splitting principle is to combine these basic operations, an idea also borrowed from finite differences
     ◮ the backward step J_{γT} = (Id + γT)⁻¹ might not be easy to compute
     ◮ split T = A + B + ⋯ with one or more terms having an easy-to-compute resolvent

  12. Operator Splittings
     ◮ two-term splittings:
       ◮ forward-backward splitting (Mercier, 1979)
       ◮ Douglas-Rachford splitting (Lions and Mercier, 1979)
       ◮ Tseng's forward-backward-forward splitting (Tseng, 2000)
     ◮ three-term splittings:
       ◮ three-operator splitting (Davis and Yin, 2015)
       ◮ Vũ-Condat's primal-dual algorithm, equivalent to forward-backward splitting in a certain space (Vũ and Condat, 2013)
       ◮ forward-Douglas-Rachford splitting, only when the third operator is a normal cone operator (Briceño-Arias, 2013)
     our proposed method: Asymmetric Forward-Backward-Adjoint splitting (AFBA)

  13. Forward-Backward Splitting
     ◮ monotone inclusion: 0 ∈ Ax + Bx
     ◮ A maximally monotone, B single-valued and cocoercive
     ◮ minimization problem:
       minimize_{x ∈ ℝⁿ}  f(x) + h(x)
                          nonsmooth, smooth
     ◮ forward-backward iteration: x_{n+1} = (Id + γA)⁻¹(Id − γB)x_n
     ◮ with A = ∂f and B = ∇h this is the proximal gradient method: x_{n+1} = prox_{γf}(x_n − γ∇h(x_n))
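A minimal proximal-gradient sketch for ℓ₁-regularized least squares (data and the regularization weight are made up; the step size γ = 1/β with β = ‖A‖² lies in the admissible range (0, 2/β)):

```python
import numpy as np

# minimize 1/2 ||Ax - b||^2 + lam * ||x||_1 by forward-backward splitting.
rng = np.random.default_rng(2)
m, n, lam = 30, 50, 0.1
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

gamma = 1.0 / np.linalg.norm(A, 2) ** 2            # 1/beta, beta = ||A||^2 = Lipschitz const of grad h
x = np.zeros(n)
for _ in range(500):
    z = x - gamma * (A.T @ (A @ x - b))            # forward (explicit gradient) step on h
    x = np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)  # backward step: prox of lam*||.||_1
```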

  14. A Primal-Dual Algorithm

     Algorithm 1
     Inputs: x⁰ ∈ ℝⁿ, y⁰ ∈ ℝᵐ
     for n = 0, 1, ... do
       x̄ⁿ = prox_{γ₁f}(xⁿ − γ₁L⊤yⁿ − γ₁∇h(xⁿ))
       yⁿ⁺¹ = prox_{γ₂g*}(yⁿ + γ₂Lx̄ⁿ)
       xⁿ⁺¹ = x̄ⁿ − γ₁L⊤(yⁿ⁺¹ − yⁿ)

     ◮ Arrow-Hurwicz-type updates
     ◮ it converges if β_h γ₁ < 2 − 2γ₁γ₂‖L‖²
     ◮ 2 matrix-vector products (with L and L⊤) per iteration
     ◮ a new algorithm
     ◮ it generalizes Drori, Sabach, and Teboulle (2015) to include a nonsmooth term f
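A direct transcription of Algorithm 1 as a NumPy routine (a sketch: the caller supplies prox_f, the prox of g*, ∇h, and step sizes assumed to satisfy the condition above; the function and argument names are mine):

```python
import numpy as np

def algorithm1(x, y, L, prox_f, prox_gconj, grad_h, g1, g2, iters=1000):
    """Primal-dual iteration of Algorithm 1; L is a dense matrix here."""
    for _ in range(iters):
        x_bar = prox_f(x - g1 * (L.T @ y) - g1 * grad_h(x), g1)  # primal half-step
        y_new = prox_gconj(y + g2 * (L @ x_bar), g2)             # dual step via prox of g*
        x = x_bar - g1 * (L.T @ (y_new - y))                     # correction with the adjoint
        y = y_new
    return x, y
```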

  15. Example: Generalized Lasso with Box Constraint

     minimize_{x ∈ ℝⁿ}  ½‖Ax − b‖²₂ + λ‖Lx‖₁
     subject to  l ≤ x ≤ u

     i.e. f(x) = δ_{l≤x≤u}(x), g = λ‖·‖₁, h(x) = ½‖Ax − b‖²₂.

     The steps become (writing ȳⁿ = yⁿ + γ₂Lx̄ⁿ; prox_{γ₂g*} is the projection onto [−λ, λ]ᵐ):

       x̄ⁿ = P_{l≤x≤u}(xⁿ − γ₁L⊤yⁿ − γ₁A⊤(Axⁿ − b))
       yⁿ⁺¹ = ȳⁿ − sign(ȳⁿ) max{|ȳⁿ| − λ, 0}
       xⁿ⁺¹ = x̄ⁿ − γ₁L⊤(yⁿ⁺¹ − yⁿ)
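Plugging these pieces into the algorithm1 sketch from the previous slide (data and step sizes are illustrative, and γ₂ is backed out of the step-size condition as read above):

```python
import numpy as np

# Box-constrained generalized lasso instance (made-up data).
rng = np.random.default_rng(3)
m, n, lam = 20, 10, 0.1
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
L = np.eye(n, k=1)[: n - 1] - np.eye(n)[: n - 1]
l, u = -np.ones(n), np.ones(n)

prox_f = lambda v, g: np.clip(v, l, u)            # P_{l<=x<=u}
prox_gconj = lambda v, g: np.clip(v, -lam, lam)   # prox of g* = projection onto [-lam, lam]^m
grad_h = lambda v: A.T @ (A @ v - b)

beta = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of grad h
g1 = 1.0 / beta
g2 = 0.9 * (2 - g1 * beta) / (2 * g1 * np.linalg.norm(L, 2) ** 2)  # keeps beta*g1 < 2 - 2*g1*g2*||L||^2
x, y = algorithm1(np.zeros(n), np.zeros(n - 1), L, prox_f, prox_gconj, grad_h, g1, g2)
```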

  16. Distributed Optimization
     ◮ large networks, each node having its own data processing unit
     ◮ each agent can communicate with its neighbors and is not aware of other agents in the network
     ◮ plug and play, or distributed reconfiguration: with the addition or removal of agents, only the neighbors are affected
     ◮ possibility of asynchronous algorithms, including transmission delays

  17. Application: AFBA and Distributed Optimization

     minimize_{x ∈ ℝⁿ}  Σ_{i=1}^N  fᵢ(x) + gᵢ(Lᵢx) + hᵢ(x)
                        nonsmooth, nonsmooth, smooth

     ◮ N agents, each with only private fᵢ, gᵢ, Lᵢ, hᵢ
     ◮ undirected connected graph G = (V, E)
     ◮ each agent i can communicate with its neighbors j ∈ Nᵢ = { j ∈ V | (i, j) ∈ E }
     [figure: a 6-node example graph; agent 1 holds f₁, g₁, L₁, h₁ and has neighbors N₁ = {2, 4, 6}]
     ◮ goal: minimize the aggregate of private cost functions over a connected graph

     minimize_{xᵢ ∈ ℝⁿ}  Σ_{i=1}^N  fᵢ(xᵢ) + gᵢ(Lᵢxᵢ) + hᵢ(xᵢ)
     subject to  xᵢ = xⱼ,  (i, j) ∈ E
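A sketch of how the consensus constraint becomes a linear operator whose application only requires neighbor-to-neighbor exchanges (the graph, sizes, and function names are made up):

```python
import numpy as np

n_agents, d = 6, 3                                  # 6 agents, each with a local x_i in R^3
edges = [(0, 1), (1, 2), (0, 3), (3, 4), (0, 5)]    # an arbitrary connected graph

def consensus_op(x):
    """Stack the per-edge differences x_i - x_j; x has shape (n_agents, d)."""
    return np.stack([x[i] - x[j] for i, j in edges])

def consensus_op_T(y):
    """Adjoint: scatter each edge value back to its two endpoints."""
    x = np.zeros((n_agents, d))
    for k, (i, j) in enumerate(edges):
        x[i] += y[k]                                # agent i only ever touches its own edges,
        x[j] -= y[k]                                # so both maps are neighbor-local
    return x

# consensus_op(x) = 0 encodes exactly x_i = x_j for all (i, j) in E.
```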
