constrained convex optimization
Virgil Pavlu
convex set
a set X in a vector space is convex if for any w_1, w_2 ∈ X and λ ∈ [0, 1] we have λw_1 + (1 − λ)w_2 ∈ X
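for instance, any interval [a, b] ⊆ ℝ is convex: for w_1, w_2 ∈ [a, b] and λ ∈ [0, 1] the point λw_1 + (1 − λ)w_2 stays between w_1 and w_2. the union of two disjoint intervals is not convex, since a convex combination of points from the two pieces can land in the gap between them.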
convex function
a function f is convex (concave) on X ⊆ Dom(f) if for any w_1, w_2 ∈ X and λ ∈ [0, 1] we have f(λw_1 + (1 − λ)w_2) ≤ (≥) λf(w_1) + (1 − λ)f(w_2)
if f is strictly convex and twice differentiable on X then:
♦ f′ = δ_w f(w) is strictly increasing
♦ f″ ≥ 0
♦ f′(x_0) = 0 ⇔ x_0 is a global minimum
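a quick illustration: f(w) = w² is strictly convex on ℝ, with f″(w) = 2 ≥ 0, f′(w) = 2w strictly increasing, and f′(w) = 0 only at w = 0, the unique global minimum.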
convex and differentiable
if f is convex and differentiable then for any w_1, w_2 we have
1. f′(w_2)(w_2 − w_1) ≥ f(w_2) − f(w_1) ≥ f′(w_1)(w_2 − w_1)
2. ∃ w = λw_1 + (1 − λ)w_2, λ ∈ [0, 1] such that f(w_2) − f(w_1) = f′(w)(w_2 − w_1)
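checking property 1 on f(w) = w²: f(w_2) − f(w_1) − f′(w_1)(w_2 − w_1) = w_2² − w_1² − 2w_1(w_2 − w_1) = (w_2 − w_1)² ≥ 0, and the upper bound follows the same way with the roles of w_1 and w_2 exchanged.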
unconstrained optimization
one variable: interval cutting, Newton's Method
several variables: gradient descent, conjugate gradient descent
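a minimal sketch (not part of the original slides) of two of the listed methods, gradient descent and one-variable Newton's method, applied to a toy quadratic; the step size, iteration counts, and test function are illustrative choices.

import numpy as np

def gradient_descent(grad, w0, lr=0.1, iters=200):
    # basic several-variables update: w <- w - lr * grad(w)
    w = np.asarray(w0, dtype=float)
    for _ in range(iters):
        w = w - lr * grad(w)
    return w

def newton_1d(fprime, fsecond, w0, iters=20):
    # one-variable Newton step: w <- w - f'(w) / f''(w)
    w = float(w0)
    for _ in range(iters):
        w = w - fprime(w) / fsecond(w)
    return w

# toy objective f(w) = (w - 3)^2 with unique minimizer w = 3
print(gradient_descent(lambda w: 2 * (w - 3), w0=[0.0]))     # ~ [3.]
print(newton_1d(lambda w: 2 * (w - 3), lambda w: 2.0, 0.0))  # 3.0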
constrained optimization
given convex functions f, g_1, ..., g_k, h_1, ..., h_m on a convex set X, the problem
minimize f(w)
subject to g_i(w) ≤ 0 for all i
h_j(w) = 0 for all j
has a convex set of solutions. if f is strictly convex the solution is unique (if it exists)
we will assume all the good things one can imagine about the functions f, g_1, ..., g_k, h_1, ..., h_m, like convexity, differentiability, etc. that will still not be enough though....
equality constraints only
minimize f(w)
subject to h_j(w) = 0 for all j
define the lagrangian function
L(w, β) = f(w) + Σ_j β_j h_j(w)
Lagrange theorem: necessary and sufficient conditions for a point ŵ to be an optimum (i.e. a solution for the problem above) are the existence of β̂ such that
δ_w L(ŵ, β̂) = 0 ; δ_{β_j} L(ŵ, β̂) = 0
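a standard worked example (not from the slides): minimize f(w) = w_1² + w_2² subject to h(w) = w_1 + w_2 − 1 = 0. the lagrangian is L(w, β) = w_1² + w_2² + β(w_1 + w_2 − 1); δ_w L = 0 gives 2w_1 + β = 0 and 2w_2 + β = 0, so w_1 = w_2 = −β/2, and δ_β L = 0 recovers the constraint w_1 + w_2 = 1, hence β̂ = −1 and ŵ = (1/2, 1/2).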
[figure: 3-d surface plot, axes w and alpha]
inequality constraints
minimize f(w)
subject to g_i(w) ≤ 0 for all i
h_j(w) = 0 for all j
we can rewrite every equality constraint h_j(w) = 0 as two inequalities h_j(w) ≤ 0 and h_j(w) ≥ 0, so the problem becomes
minimize f(w)
subject to g_i(w) ≤ 0 for all i
Karush Kuhn Tucker theorem
minimize f(w)
subject to g_i(w) ≤ 0 for all i
where g_i are qualified constraints
define the lagrangian function
L(w, α) = f(w) + Σ_i α_i g_i(w)
KKT theorem: necessary and sufficient conditions for a point ŵ to be a solution for the optimization problem are the existence of α̂ such that
δ_w L(ŵ, α̂) = 0 ; α̂_i g_i(ŵ) = 0
g_i(ŵ) ≤ 0 ; α̂_i ≥ 0
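a small worked example (not from the slides): minimize f(w) = w² subject to g(w) = 1 − w ≤ 0. with L(w, α) = w² + α(1 − w), δ_w L = 0 gives 2w − α = 0, so w = α/2. taking α̂ = 0 would give ŵ = 0, which violates g(ŵ) ≤ 0, so complementarity α̂ g(ŵ) = 0 forces g(ŵ) = 0, i.e. ŵ = 1 and α̂ = 2 ≥ 0; all four conditions hold and ŵ = 1 is the solution.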
KKT - sufficiency
assume (ŵ, α̂) satisfies the KKT conditions
δ_w L(ŵ, α̂) = δ_w f(ŵ) + Σ_{i=1}^k α̂_i δ_w g_i(ŵ) = 0
δ_{α_i} L(ŵ, α̂) = g_i(ŵ) ≤ 0
α̂_i g_i(ŵ) = 0 ; α̂_i ≥ 0
then, for any feasible w,
f(w) − f(ŵ) ≥ (δ_w f(ŵ))^T (w − ŵ)
= −Σ_{i=1}^k α̂_i (δ_w g_i(ŵ))^T (w − ŵ)
≥ −Σ_{i=1}^k α̂_i (g_i(w) − g_i(ŵ))
= −Σ_{i=1}^k α̂_i g_i(w) ≥ 0
(the first two inequalities use convexity of f and of the g_i with α̂_i ≥ 0; the last equality uses α̂_i g_i(ŵ) = 0, and the final inequality uses g_i(w) ≤ 0 for feasible w)
so ŵ is a solution
saddle point
minimize f(w)
subject to g_i(w) ≤ 0 for all i
and the lagrangian function
L(w, α) = f(w) + Σ_i α_i g_i(w)
(ŵ, α̂) with α̂_i ≥ 0 is a saddle point if ∀ (w, α) with α_i ≥ 0
L(ŵ, α) ≤ L(ŵ, α̂) ≤ L(w, α̂)
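continuing the small example above: for f(w) = w², g(w) = 1 − w and (ŵ, α̂) = (1, 2) we get L(1, α) = 1 + α·0 = 1 for every α ≥ 0, while L(w, 2) = w² + 2 − 2w = (w − 1)² + 1 ≥ 1, so L(ŵ, α) ≤ L(ŵ, α̂) ≤ L(w, α̂) and (1, 2) is indeed a saddle point.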
[figure: 3-d surface plot, axes w and alpha]
saddle point - sufficiency
minimize f(w)
subject to g_i(w) ≤ 0 for all i
lagrangian function L(w, α) = f(w) + Σ_i α_i g_i(w)
if (ŵ, α̂) is a saddle point
∀ (w, α) with α_i ≥ 0 : L(ŵ, α) ≤ L(ŵ, α̂) ≤ L(w, α̂)
then
1. ŵ is a solution to the optimization problem
2. α̂_i g_i(ŵ) = 0 for all i
saddle point - necessity
minimize f(w)
subject to g_i(w) ≤ 0 for all i
where g_i are qualified constraints
lagrangian function L(w, α) = f(w) + Σ_i α_i g_i(w)
if ŵ is a solution to the optimization problem
then ∃ α̂ ≥ 0 such that (ŵ, α̂) is a saddle point
∀ (w, α) with α_i ≥ 0 : L(ŵ, α) ≤ L(ŵ, α̂) ≤ L(w, α̂)
constraint qualifications
minimize f(w)
subject to g_i(w) ≤ 0 for all i
let Υ be the feasible region Υ = { w | g_i(w) ≤ 0 ∀ i }
then the following additional conditions on the functions g_i are connected by (i) ⇔ (ii) and (iii) ⇒ (i):
(i) there exists w ∈ Υ such that g_i(w) < 0 ∀ i
(ii) for all nonzero α ∈ [0, ∞)^k ∃ w ∈ Υ such that Σ_i α_i g_i(w) < 0
(iii) the feasible region Υ contains at least two distinct elements, and ∃ w ∈ Υ such that all g_i are strictly convex at w w.r.t. Υ
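an example where the qualification fails (not from the slides): minimize f(w) = w subject to g(w) = w² ≤ 0. the only feasible point is ŵ = 0, yet δ_w L(0, α) = 1 + 2α·0 = 1 ≠ 0 for every α ≥ 0, so no multiplier exists; condition (i) fails here since there is no w with w² < 0.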
KKT-gap
assume ŵ is the solution of the optimization problem. then for any (w, α) with α ≥ 0 satisfying
δ_w L(w, α) = 0 ; δ_{α_i} L(w, α) = g_i(w) ≤ 0
we have
f(w) ≥ f(ŵ) ≥ f(w) + Σ_{i=1}^k α_i g_i(w)
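on the toy problem used earlier (f(w) = w², g(w) = 1 − w), every pair (w, α) = (α/2, α) with α ≥ 2 satisfies these conditions; for α = 3, w = 1.5 this gives f(w) = 2.25 ≥ f(ŵ) = 1 ≥ f(w) + α g(w) = 2.25 + 3(1 − 1.5) = 0.75, so the two sides sandwich the optimal value.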
duality
f(w) ≥ f(ŵ) ≥ f(w) + Σ_{i=1}^k α_i g_i(w)
dual maximization problem:
maximize L(w, α) = f(w) + Σ_{i=1}^k α_i g_i(w)
subject to α ≥ 0 ; δ_w L(w, α) = 0
OR set θ(α) = inf_w L(w, α)
maximize θ(α)
subject to α ≥ 0
the primal and dual problems have the same optimal objective value exactly when the gap Σ_{i=1}^k α_i g_i(w) can be driven to zero
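working the dual out on the same toy problem: for f(w) = w², g(w) = 1 − w we get θ(α) = inf_w (w² + α(1 − w)) = α − α²/4 (attained at w = α/2); maximizing over α ≥ 0 gives α̂ = 2 and θ(2) = 1 = f(ŵ), so the gap vanishes. the short script below (a sketch, not part of the slides) checks this numerically with scipy; the solver choices are incidental.

from scipy.optimize import minimize, minimize_scalar

# primal: minimize w^2 subject to 1 - w <= 0 (scipy's "ineq" means fun(w) >= 0)
primal = minimize(lambda w: w[0] ** 2, x0=[0.0],
                  constraints=[{"type": "ineq", "fun": lambda w: w[0] - 1.0}])

# dual: maximize theta(alpha) = alpha - alpha^2 / 4 over alpha >= 0
dual = minimize_scalar(lambda a: -(a - a ** 2 / 4.0), bounds=(0.0, 10.0), method="bounded")

print(primal.fun, -dual.fun)  # both approximately 1.0: the duality gap is zero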