Convex Optimization 5. Duality Prof. Ying Cui Department of Electrical Engineering Shanghai Jiao Tong University 2018 SJTU Ying Cui 1 / 46
Outline

Lagrange dual function
Lagrange dual problem
Geometric interpretation
Optimality conditions
Perturbation and sensitivity analysis
Examples
Generalized inequalities
Lagrangian

standard form problem (not necessarily convex)

$$
\begin{array}{ll}
\min_{x} & f_0(x) \\
\text{s.t.} & f_i(x) \le 0, \quad i = 1, \dots, m \\
 & h_i(x) = 0, \quad i = 1, \dots, p
\end{array}
$$

domain $\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \,\cap\, \bigcap_{i=1}^{p} \operatorname{dom} h_i$ and optimal value $p^*$

◮ basic idea in Lagrangian duality: take the constraints into account by augmenting the objective function with a weighted sum of the constraint functions

Lagrangian: $L : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$, with $\operatorname{dom} L = \mathcal{D} \times \mathbb{R}^m \times \mathbb{R}^p$,

$$
L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x)
$$

◮ weighted sum of objective and constraint functions
◮ $\lambda_i$ is the Lagrange multiplier associated with $f_i(x) \le 0$
◮ $\nu_i$ is the Lagrange multiplier associated with $h_i(x) = 0$
Lagrange dual function

Lagrange dual function (or dual function): $g : \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$

$$
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu)
= \inf_{x \in \mathcal{D}} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right)
$$

◮ $g$ is concave even when the problem is not convex, as it is the pointwise infimum of a family of affine functions of $(\lambda, \nu)$
  ◮ pointwise minimum or infimum of concave functions is concave
◮ $g$ can be $-\infty$ when $L$ is unbounded below in $x$
Lower bound property

The dual function yields lower bounds on the optimal value of the primal problem, i.e., for any $\lambda \succeq 0$ and any $\nu$,

$$
g(\lambda, \nu) \le p^*
$$

◮ the inequality holds but is vacuous when $g(\lambda, \nu) = -\infty$
◮ the dual function gives a nontrivial lower bound only when $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$, i.e., $g(\lambda, \nu) > -\infty$
◮ refer to $(\lambda, \nu)$ with $\lambda \succeq 0$, $(\lambda, \nu) \in \operatorname{dom} g$ as dual feasible

proof: Suppose $\tilde{x}$ is feasible, i.e., $f_i(\tilde{x}) \le 0$ and $h_i(\tilde{x}) = 0$, and $\lambda \succeq 0$. Then, we have

$$
\sum_{i=1}^{m} \lambda_i f_i(\tilde{x}) + \sum_{i=1}^{p} \nu_i h_i(\tilde{x}) \le 0
\;\Longrightarrow\;
L(\tilde{x}, \lambda, \nu) \le f_0(\tilde{x})
$$

Hence,

$$
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) \le L(\tilde{x}, \lambda, \nu) \le f_0(\tilde{x})
$$

Minimizing over all feasible $\tilde{x}$ gives $p^* \ge g(\lambda, \nu)$.
Examples

Least-norm solution of linear equations

$$
\begin{array}{ll}
\min_{x} & x^T x \\
\text{s.t.} & Ax = b
\end{array}
$$

dual function:
◮ to minimize $L(x, \nu) = x^T x + \nu^T (Ax - b)$ over $x$ (unconstrained convex problem), set the gradient equal to zero:

$$
\nabla_x L(x, \nu) = 2x + A^T \nu = 0 \;\Longrightarrow\; x = -(1/2) A^T \nu
$$

◮ plug into $L(x, \nu)$ to obtain $g$:

$$
g(\nu) = L\big(-(1/2) A^T \nu, \nu\big) = -(1/4)\, \nu^T A A^T \nu - b^T \nu
$$

which is a concave quadratic function of $\nu$, as $-A A^T \preceq 0$

lower bound property: $p^* \ge -(1/4)\, \nu^T A A^T \nu - b^T \nu$ for all $\nu$
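The lower bound property for the least-norm example can be checked numerically on a tiny instance. The following is a minimal sketch in plain Python; the data `A`, `b` and the function names `dual_value` and `primal_value` are illustrative assumptions, not part of the slides.

```python
# Sketch: dual function of  min x^T x  s.t.  Ax = b  on a tiny instance.
# A, b and all function names are illustrative assumptions.

def dual_value(A, b, nu):
    """g(nu) = -(1/4) nu^T A A^T nu - b^T nu."""
    m, n = len(A), len(A[0])
    # A^T nu, computed entry by entry
    At_nu = [sum(A[i][j] * nu[i] for i in range(m)) for j in range(n)]
    quad = sum(v * v for v in At_nu)  # nu^T A A^T nu = ||A^T nu||^2
    return -0.25 * quad - sum(bi * ni for bi, ni in zip(b, nu))

def primal_value(x):
    """f_0(x) = x^T x."""
    return sum(v * v for v in x)

A = [[1.0, 1.0]]  # single equality constraint: x1 + x2 = 2
b = [2.0]

# For this instance the least-norm solution is x = A^T (A A^T)^{-1} b = (1, 1).
x_opt = [1.0, 1.0]
p_star = primal_value(x_opt)

# Every nu yields a lower bound on p*; nu = -2 makes the bound tight here.
bounds = [dual_value(A, b, [nu]) for nu in (-4.0, -2.0, 0.0, 1.0)]
```

On this instance the largest of the sampled dual values coincides with `p_star`, which is consistent with strong duality for this convex problem.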
Examples

Standard form LP

$$
\begin{array}{ll}
\min_{x} & c^T x \\
\text{s.t.} & Ax = b, \quad x \succeq 0
\end{array}
$$

dual function:
◮ Lagrangian

$$
L(x, \lambda, \nu) = c^T x + \nu^T (Ax - b) - \lambda^T x = -b^T \nu + (c + A^T \nu - \lambda)^T x
$$

is affine in $x$ (bounded below only when the coefficient $c + A^T \nu - \lambda$ of $x$ is zero)
◮ dual function

$$
g(\lambda, \nu) = \inf_{x} L(x, \lambda, \nu) =
\begin{cases}
-b^T \nu, & A^T \nu - \lambda + c = 0 \\
-\infty, & \text{otherwise}
\end{cases}
$$

lower bound property: nontrivial only when $\lambda \succeq 0$ and $A^T \nu - \lambda + c = 0$, and hence $p^* \ge -b^T \nu$ if $A^T \nu + c \succeq 0$
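To make the LP bound concrete, here is a small sketch in plain Python that evaluates the dual function after eliminating $\lambda = A^T \nu + c$. The instance `(A, b, c)`, the sampled points, and the name `lp_dual_value` are assumptions chosen for illustration.

```python
# Sketch: dual lower bound for  min c^T x  s.t.  Ax = b, x >= 0.
# The instance (A, b, c) and the function name are illustrative assumptions.

def lp_dual_value(A, b, c, nu):
    """Return -b^T nu if A^T nu + c >= 0 componentwise, else -infinity."""
    m, n = len(A), len(c)
    # lambda = A^T nu + c must be componentwise nonnegative
    lam = [c[j] + sum(A[i][j] * nu[i] for i in range(m)) for j in range(n)]
    if any(l < 0 for l in lam):
        return float("-inf")  # nu is not dual feasible
    return -sum(b[i] * nu[i] for i in range(m))

A = [[1.0, 1.0]]  # x1 + x2 = 1
b = [1.0]
c = [1.0, 2.0]

# A few primal feasible points and their objective values c^T x.
feasible_points = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
objective_values = [sum(cj * xj for cj, xj in zip(c, x)) for x in feasible_points]

# Dual values for several nu; only nu >= -1 is dual feasible for this instance.
dual_values = {nu: lp_dual_value(A, b, c, [nu]) for nu in (-2.0, -1.0, 0.0, 3.0)}
```

Each finite dual value sits below every sampled primal objective value, and the choice `nu = -1` matches the smallest of them.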
Examples

Two-way partitioning problem ($W \in \mathbf{S}^n$)

$$
\begin{array}{ll}
\min_{x} & x^T W x \\
\text{s.t.} & x_i^2 = 1, \quad i = 1, \dots, n
\end{array}
$$

◮ a nonconvex problem with $2^n$ discrete feasible points
◮ find the two-way partition of $\{1, \dots, n\}$ with least total cost
  ◮ $W_{ij}$ is the cost of assigning $i$, $j$ to the same set
  ◮ $-W_{ij}$ is the cost of assigning $i$, $j$ to different sets

dual function:

$$
g(\nu) = \inf_{x} \Big( x^T W x + \sum_{i} \nu_i (x_i^2 - 1) \Big)
= \inf_{x} \; x^T \big(W + \operatorname{diag}(\nu)\big) x - \mathbf{1}^T \nu
= \begin{cases}
-\mathbf{1}^T \nu, & W + \operatorname{diag}(\nu) \succeq 0 \\
-\infty, & \text{otherwise}
\end{cases}
$$

lower bound property: $p^* \ge -\mathbf{1}^T \nu$ if $W + \operatorname{diag}(\nu) \succeq 0$

example: $\nu = -\lambda_{\min}(W) \mathbf{1}$ gives the bound $p^* \ge n \lambda_{\min}(W)$
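The gap between a dual bound and the true optimum can be seen on a small instance by brute force. The sketch below is plain Python and, to avoid an eigenvalue routine, it does not use the slide's choice $\nu = -\lambda_{\min}(W)\mathbf{1}$. It swaps in a different dual feasible point, $\nu_i = \sum_{j \ne i} |W_{ij}| - W_{ii}$, which makes $W + \operatorname{diag}(\nu)$ diagonally dominant with nonnegative diagonal and therefore positive semidefinite. The matrix `W` and all function names are assumptions for illustration.

```python
# Sketch: brute-force optimum vs. a dual lower bound for two-way partitioning.
# W and all names are illustrative; the dual point nu comes from diagonal
# dominance, not from the lambda_min choice shown on the slide.
from itertools import product

def partition_cost(W, x):
    """x^T W x for x in {-1, +1}^n."""
    n = len(W)
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def brute_force_optimum(W):
    """p*: minimum of x^T W x over all 2^n sign vectors."""
    n = len(W)
    return min(partition_cost(W, x) for x in product((-1, 1), repeat=n))

def diagonal_dominance_bound(W):
    """g(nu) = -1^T nu for nu_i = sum_{j != i} |W_ij| - W_ii.

    With this nu, W + diag(nu) is symmetric and diagonally dominant with
    nonnegative diagonal, hence positive semidefinite, so nu is dual feasible.
    """
    n = len(W)
    nu = [sum(abs(W[i][j]) for j in range(n) if j != i) - W[i][i]
          for i in range(n)]
    return -sum(nu)

W = [[0.0, 2.0, 1.0],
     [2.0, 0.0, 3.0],
     [1.0, 3.0, 0.0]]

p_star = brute_force_optimum(W)
lower_bound = diagonal_dominance_bound(W)
```

For this `W` the bound is strictly below `p_star`, so this particular dual point leaves a gap; other dual feasible points may give tighter bounds.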
Lagrange dual function and conjugate function

◮ conjugate $f^*$ of a function $f : \mathbb{R}^n \to \mathbb{R}$:

$$
f^*(y) = \sup_{x \in \operatorname{dom} f} \big( y^T x - f(x) \big)
$$

◮ dual function of

$$
\begin{array}{ll}
\min_{x} & f(x) \\
\text{s.t.} & x = 0
\end{array}
$$

$$
g(\nu) = \inf_{x} \big( f(x) + \nu^T x \big) = -\sup_{x} \big( (-\nu)^T x - f(x) \big)
$$

◮ relationship: $g(\nu) = -f^*(-\nu)$
  ◮ conjugate of any function is convex
  ◮ dual function of any problem is concave
Lagrange dual function and conjugate function

more generally (and more usefully), consider an optimization problem with linear inequality and equality constraints

$$
\begin{array}{ll}
\min_{x} & f_0(x) \\
\text{s.t.} & Ax \preceq b, \quad Cx = d
\end{array}
$$

dual function:

$$
\begin{aligned}
g(\lambda, \nu) &= \inf_{x \in \operatorname{dom} f_0} \Big( f_0(x) + \lambda^T (Ax - b) + \nu^T (Cx - d) \Big) \\
&= \inf_{x \in \operatorname{dom} f_0} \Big( f_0(x) + (A^T \lambda + C^T \nu)^T x \Big) - b^T \lambda - d^T \nu \\
&= -f_0^*(-A^T \lambda - C^T \nu) - b^T \lambda - d^T \nu
\end{aligned}
$$

domain of $g$ follows from domain of $f_0^*$:

$$
\operatorname{dom} g = \{ (\lambda, \nu) \mid -A^T \lambda - C^T \nu \in \operatorname{dom} f_0^* \}
$$

◮ simplifies the derivation of the dual function if the conjugate of $f_0$ is known
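As a consistency check, the conjugate formula can be applied to the least-norm example from the earlier slide, which has objective $x^T x$ and the single constraint $Ax = b$. The worked steps below are a sketch; in it, the matrix $A$ and vector $b$ of the least-norm problem play the roles of $C$ and $d$ above, and there are no inequality constraints.

$$
\begin{aligned}
f_0(x) = x^T x
\;&\Longrightarrow\;
f_0^*(y) = \sup_{x} \big( y^T x - x^T x \big) = \tfrac{1}{4}\, y^T y
\quad (\text{maximizer } x = y/2), \\
g(\nu) &= -f_0^*(-A^T \nu) - b^T \nu
= -\tfrac{1}{4}\, \nu^T A A^T \nu - b^T \nu .
\end{aligned}
$$

This agrees with the dual function obtained there by setting $\nabla_x L = 0$ directly.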
Examples

Equality constrained norm minimization

$$
\begin{array}{ll}
\min_{x} & \|x\| \\
\text{s.t.} & Ax = b
\end{array}
$$

dual function:

$$
g(\nu) = -b^T \nu - f_0^*(-A^T \nu) =
\begin{cases}
-b^T \nu, & \|A^T \nu\|_* \le 1 \\
-\infty, & \text{otherwise}
\end{cases}
$$

◮ conjugate of $f_0 = \|\cdot\|$:

$$
f_0^*(y) =
\begin{cases}
0, & \|y\|_* \le 1 \\
\infty, & \text{otherwise}
\end{cases}
$$

i.e., the indicator function of the dual norm unit ball, where $\|y\|_* = \sup_{\|u\| \le 1} u^T y$ is the dual norm of $\|\cdot\|$
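Lagrange dual problem

$$
\begin{array}{ll}
\max_{\lambda, \nu} & g(\lambda, \nu) \\
\text{s.t.} & \lambda \succeq 0
\end{array}
$$

◮ find the best lower bound on $p^*$ obtainable from the Lagrange dual function
◮ always a convex optimization problem (maximize a concave function over a convex set), regardless of convexity of the primal problem; optimal value denoted by $d^*$
◮ $\lambda, \nu$ are dual feasible if $\lambda \succeq 0$ and $g(\lambda, \nu) > -\infty$ (i.e., $(\lambda, \nu) \in \operatorname{dom} g = \{ (\lambda, \nu) \mid g(\lambda, \nu) > -\infty \}$)
◮ can often be simplified by making the implicit constraint $(\lambda, \nu) \in \operatorname{dom} g$ explicit, e.g.,
  ◮ standard form LP and its dual

$$
\begin{array}{ll}
\min_{x} & c^T x \\
\text{s.t.} & Ax = b, \quad x \succeq 0
\end{array}
\qquad\qquad
\begin{array}{ll}
\max_{\nu} & -b^T \nu \\
\text{s.t.} & A^T \nu + c \succeq 0
\end{array}
$$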
Weak duality and strong duality

weak duality: $d^* \le p^*$
◮ always holds (for convex and nonconvex problems)
◮ can be used to find nontrivial lower bounds for difficult problems, e.g.,
  ◮ solving the SDP

$$
\begin{array}{ll}
\max_{\nu} & -\mathbf{1}^T \nu \\
\text{s.t.} & W + \operatorname{diag}(\nu) \succeq 0
\end{array}
$$

gives a lower bound for the two-way partitioning problem

strong duality: $d^* = p^*$
◮ does not hold in general
◮ (usually) holds for convex problems
◮ conditions that guarantee strong duality in convex problems are called constraint qualifications
  ◮ there exist many types of constraint qualifications
Slater's constraint qualification

One simple constraint qualification is Slater's condition (Slater's constraint qualification): the convex problem is strictly feasible, i.e., there exists an $x \in \operatorname{int} \mathcal{D}$ such that

$$
f_i(x) < 0, \quad i = 1, \dots, m, \qquad Ax = b
$$

◮ can be refined, e.g.,
  ◮ can replace $\operatorname{int} \mathcal{D}$ with $\operatorname{relint} \mathcal{D}$ (interior relative to affine hull)
  ◮ affine inequalities do not need to hold with strict inequality
  ◮ reduces to feasibility when the constraints are all affine equalities and inequalities
◮ implies strong duality for convex problems
◮ implies that the dual value is attained when $d^* > -\infty$, i.e., there exists a dual feasible $(\lambda^*, \nu^*)$ with $g(\lambda^*, \nu^*) = d^* = p^*$
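Examples

Inequality form LP

primal problem:

$$
\begin{array}{ll}
\min_{x} & c^T x \\
\text{s.t.} & Ax \preceq b
\end{array}
$$

dual function:

$$
g(\lambda) = \inf_{x} \Big( (c + A^T \lambda)^T x - b^T \lambda \Big) =
\begin{cases}
-b^T \lambda, & A^T \lambda + c = 0 \\
-\infty, & \text{otherwise}
\end{cases}
$$

dual problem:

$$
\begin{array}{ll}
\max_{\lambda} & -b^T \lambda \\
\text{s.t.} & A^T \lambda + c = 0, \quad \lambda \succeq 0
\end{array}
$$

◮ from the weaker form of Slater's condition: strong duality holds for any LP provided the primal problem is feasible; applying the same argument to the dual, strong duality also holds for LPs if the dual is feasible
◮ in fact, $p^* = d^*$ except when both primal and dual are infeasible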
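Examples

Quadratic program: $P \in \mathbf{S}^n_{++}$

$$
\begin{array}{ll}
\min_{x} & x^T P x \\
\text{s.t.} & Ax \preceq b
\end{array}
$$

dual function:

$$
g(\lambda) = \inf_{x} \Big( x^T P x + \lambda^T (Ax - b) \Big) = -\tfrac{1}{4}\, \lambda^T A P^{-1} A^T \lambda - b^T \lambda
$$

dual problem:

$$
\begin{array}{ll}
\max_{\lambda} & -(1/4)\, \lambda^T A P^{-1} A^T \lambda - b^T \lambda \\
\text{s.t.} & \lambda \succeq 0
\end{array}
$$

◮ from the weaker form of Slater's condition: strong duality holds provided the primal problem is feasible
◮ in fact, $p^* = d^*$ always holds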
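A one-dimensional instance makes the QP dual easy to verify by hand. The sketch below is plain Python and assumes, purely to avoid a matrix inverse, that $P$ is diagonal; the data, the grid over $\lambda$, and the name `qp_dual_value` are illustrative assumptions.

```python
# Sketch: dual of  min x^T P x  s.t.  Ax <= b,  assuming P = diag(p_diag) > 0.
# The instance, the grid, and the function name are illustrative assumptions.

def qp_dual_value(p_diag, A, b, lam):
    """g(lam) = -(1/4) lam^T A P^{-1} A^T lam - b^T lam for diagonal P."""
    m, n = len(A), len(p_diag)
    At_lam = [sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]
    quad = sum(v * v / p for v, p in zip(At_lam, p_diag))  # lam^T A P^{-1} A^T lam
    return -0.25 * quad - sum(bi * li for bi, li in zip(b, lam))

p_diag = [1.0]  # P = [1], so the objective is x^2
A = [[-1.0]]    # -x <= -1, i.e. x >= 1
b = [-1.0]

# The feasible point closest to the origin is x = 1, so p* = 1.
p_star = p_diag[0] * 1.0 ** 2

# Coarse grid over lam >= 0 (0, 0.25, ..., 10) to estimate d*.
grid = [0.25 * k for k in range(41)]
dual_values = [qp_dual_value(p_diag, A, b, [lam]) for lam in grid]
d_star_estimate = max(dual_values)
```

Here $g(\lambda) = -\lambda^2/4 + \lambda$, whose maximum over $\lambda \ge 0$ is attained at $\lambda = 2$ and equals the primal optimum, in line with the claim that $p^* = d^*$ for this class of problems.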
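Examples

A nonconvex problem with strong duality: $A \not\succeq 0$

$$
\begin{array}{ll}
\min_{x} & x^T A x + 2 b^T x \\
\text{s.t.} & x^T x \le 1
\end{array}
$$

dual function:

$$
g(\lambda) = \inf_{x} \Big( x^T (A + \lambda I) x + 2 b^T x - \lambda \Big) =
\begin{cases}
-b^T (A + \lambda I)^{\dagger} b - \lambda, & A + \lambda I \succeq 0, \; b \in \mathcal{R}(A + \lambda I) \\
-\infty, & \text{otherwise}
\end{cases}
$$

dual problem and equivalent SDP:

$$
\begin{array}{ll}
\max_{\lambda} & -b^T (A + \lambda I)^{\dagger} b - \lambda \\
\text{s.t.} & A + \lambda I \succeq 0, \quad b \in \mathcal{R}(A + \lambda I)
\end{array}
\qquad\qquad
\begin{array}{ll}
\max_{\lambda, t} & -t - \lambda \\
\text{s.t.} & \begin{bmatrix} A + \lambda I & b \\ b^T & t \end{bmatrix} \succeq 0
\end{array}
$$

◮ strong duality holds although the primal problem is nonconvex (difficult to show)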
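Strong duality for this nonconvex problem can be observed numerically in the scalar case, where the pseudo-inverse and range condition reduce to simple checks. The sketch below is plain Python; the values of `a` and `b`, the grids, and the function names are assumptions for illustration, and the grid search is only a crude stand-in for solving either problem exactly.

```python
# Sketch: scalar instance of  min a x^2 + 2 b x  s.t.  x^2 <= 1  with a < 0.
# a, b, the grids, and all names are illustrative assumptions.

def dual_value(a, b, lam):
    """Scalar dual function.

    g(lam) = -b^2 / (a + lam) - lam   if a + lam > 0,
           = -lam                     if a + lam == 0 and b == 0,
           = -infinity                otherwise.
    """
    s = a + lam
    if s > 0:
        return -b * b / s - lam
    if s == 0 and b == 0:
        return -lam  # pseudo-inverse of 0 is 0, and b lies in the range {0}
    return float("-inf")

def primal_objective(a, b, x):
    return a * x * x + 2 * b * x

a, b = -1.0, 1.0  # a < 0, so the objective is nonconvex

# Primal: grid over the feasible interval [-1, 1].
xs = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
p_star_estimate = min(primal_objective(a, b, x) for x in xs)

# Dual: grid over lam in [0, 5].
lams = [k / 100 for k in range(501)]
d_star_estimate = max(dual_value(a, b, lam) for lam in lams)
```

For these values the primal minimum is at the boundary point $x = -1$ and the dual maximum is at $\lambda = 2$; both grid estimates give the same value, illustrating $p^* = d^*$ on this instance.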