A Primer in Convex Optimization Moritz Diehl partly based on material by Colin Jones, Stephen Boyd and Lieven Vandenberghe
Overview ◮ Convex sets ◮ Convex functions ◮ Operations that preserve convexity ◮ Convex optimization
Convex Sets

A set $S \subseteq \mathbb{R}^n$ is a convex set if for all $x_1, x_2 \in S$ and $\lambda \in [0, 1]$:
$\lambda x_1 + (1 - \lambda) x_2 \in S$
(the set contains the line segment between any two of its points).

A set $S \subseteq \mathbb{R}^n$ is a convex cone if for all $x_1, x_2 \in S$ and $\theta_1, \theta_2 \ge 0$:
$\theta_1 x_1 + \theta_2 x_2 \in S$
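The definition of a convex set can be probed numerically. The following is a minimal sketch, assuming a set is given by a membership function; the helper names (`segment_stays_inside`, `in_disc`, `in_annulus`) are hypothetical. A grid check like this can only disprove convexity for a given pair of points, it cannot prove it.

```python
import math

def segment_stays_inside(contains, x1, x2, steps=100):
    """Check lambda*x1 + (1-lambda)*x2 in S for lambda on a grid in [0, 1]."""
    for i in range(steps + 1):
        lam = i / steps
        point = tuple(lam * a + (1.0 - lam) * b for a, b in zip(x1, x2))
        if not contains(point):
            return False
    return True

def in_disc(p):
    """Membership in the unit disc (a convex set)."""
    return math.hypot(p[0], p[1]) <= 1.0 + 1e-12

def in_annulus(p):
    """Membership in the ring 0.5 <= ||p|| <= 1 (not a convex set)."""
    return 0.5 - 1e-12 <= math.hypot(p[0], p[1]) <= 1.0 + 1e-12

# Both endpoint pairs lie in their respective sets.
disc_ok = segment_stays_inside(in_disc, (1.0, 0.0), (0.0, 1.0))
annulus_ok = segment_stays_inside(in_annulus, (1.0, 0.0), (-1.0, 0.0))
```

For the annulus, the midpoint of the chosen segment is the origin, which lies in the hole, so the check fails there.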
Convex hull

Convex combination of $z_1, \dots, z_k$: any point $z$ of the form
$z = \theta_1 z_1 + \theta_2 z_2 + \dots + \theta_k z_k$ with $\theta_1 + \dots + \theta_k = 1$, $\theta_i \ge 0$.

Convex hull of $S$: set of all convex combinations of points in $S$.
Convex sets: Hyperplanes and Halfspaces

◮ Hyperplane: set of the form $\{x \mid a^\top x = b\}$ ($a \neq 0$)
◮ Halfspace: set of the form $\{x \mid a^\top x \le b\}$ ($a \neq 0$)
◮ Useful representation: $\{x \mid a^\top (x - x_0) \le 0\}$, where $a$ is the normal vector and $x_0$ lies on the boundary
◮ Hyperplanes are affine and convex, halfspaces are convex

[Figures: hyperplane $a^\top x = b$ through $x_0$ with normal vector $a$; the halfspaces $a^\top x \ge b$ and $a^\top x \le b$ on either side of it]
Convex sets: Polyhedra

A polyhedron is the intersection of a finite number of halfspaces:
$P := \{x \mid a_i^\top x \le b_i,\ i = 1, \dots, m\}$
A polytope is a bounded polyhedron.

Often written as $P := \{x \mid Ax \le b\}$ for a matrix $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, where the inequality is understood row-wise and $a_i^\top$ is the $i$-th row of $A$.

[Figure: polyhedron $P$ bounded by halfspaces with normal vectors $a_k$]
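The row-wise reading of $Ax \le b$ translates directly into a membership test. This is a small sketch in plain Python; the function name `in_polyhedron` and the unit-box data are illustrative assumptions, not part of the slides.

```python
def in_polyhedron(A, b, x, tol=1e-9):
    """Return True if a_i^T x <= b_i holds for every row a_i^T of A."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i + tol
        for row, b_i in zip(A, b)
    )

# The box [-1, 1]^2 written as an intersection of four halfspaces.
A_box = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_box = [1.0, 1.0, 1.0, 1.0]

inside = in_polyhedron(A_box, b_box, [0.5, -0.25])
outside = in_polyhedron(A_box, b_box, [1.5, 0.0])
```

Because the box is bounded, it is also a polytope in the sense defined above.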
Operations that preserve convexity of sets

◮ intersection: the intersection of (any number of) convex sets is convex (but the union is generally non-convex)
◮ affine image: the image $f(S) := \{f(x) \mid x \in S\}$ of a convex set $S$ under an affine function $f(x) = Ax + b$ is convex
◮ affine pre-image: the pre-image $f^{-1}(S) := \{x \mid f(x) \in S\}$ of a convex set $S$ under an affine function $f(x) = Ax + b$ is convex
Examples

◮ $\{x \mid x_1 + x_2 t + x_3 t^2 + x_4 t^3 \ge 0 \text{ for all } t \in [0, 1]\}$ is convex (set of polynomials that are nonnegative on the unit interval, an intersection of halfspaces)
◮ $\{a + Pw \mid \|w\|_2 \le 1\}$ is convex (affine image of unit ball)
◮ $\{x \mid \|Ax + b\|_2 \le 1\}$ is convex (affine pre-image of unit ball)
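The cone of positive semidefinite matrices

Definitions
◮ set of symmetric $n \times n$ matrices: $\mathbb{S}^n := \{X \in \mathbb{R}^{n \times n} \mid X = X^\top\}$
◮ $X \succeq 0$: $z^\top X z \ge 0$ holds for all $z \in \mathbb{R}^n$ (all eigenvalues of $X$ are non-negative)
◮ $X \succ 0$: all eigenvalues of $X$ are positive
◮ set of positive semidefinite $n \times n$ matrices: $\mathbb{S}^n_+ := \{X \in \mathbb{S}^n \mid X \succeq 0\}$

Theorem: $\mathbb{S}^n_+$ is a convex set.

Proof: $\mathbb{S}^n_+ = \{X \in \mathbb{S}^n \mid z^\top X z \ge 0 \text{ for all } z \in \mathbb{R}^n\}$ is an intersection of (infinitely many) halfspaces, one for each $z$, because $z^\top X z$ is linear in $X$.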
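The theorem can be illustrated on small matrices. The sketch below is restricted to symmetric 2×2 matrices, where positive semidefiniteness is equivalent to nonnegative diagonal entries and nonnegative determinant; the helper names and the example matrices are assumptions made for illustration.

```python
def is_psd_2x2(X, tol=1e-12):
    """Symmetric X = [[a, b], [b, c]] is PSD iff a >= 0, c >= 0 and a*c - b^2 >= 0."""
    (a, b), (b_lower, c) = X
    if abs(b - b_lower) > tol:
        raise ValueError("X must be symmetric")
    return a >= -tol and c >= -tol and a * c - b * b >= -tol

def mix(X, Y, lam):
    """Entry-wise convex combination lam*X + (1-lam)*Y."""
    return [
        [lam * x + (1.0 - lam) * y for x, y in zip(row_x, row_y)]
        for row_x, row_y in zip(X, Y)
    ]

X = [[2.0, 1.0], [1.0, 1.0]]      # determinant 1, positive definite
Y = [[1.0, -1.0], [-1.0, 1.0]]    # determinant 0, positive semidefinite

# Every convex combination on the grid stays in the PSD cone.
all_psd = all(is_psd_2x2(mix(X, Y, i / 10)) for i in range(11))

# A symmetric matrix with negative determinant is not PSD.
indefinite_is_psd = is_psd_2x2([[1.0, 2.0], [2.0, 1.0]])
```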
Convex function: Definition

◮ Convex function: a function $f: S \to \mathbb{R}$ is convex if $S$ is convex and
$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y \in S$, $\lambda \in [0, 1]$
◮ A function $f: S \to \mathbb{R}$ is strictly convex if $S$ is convex and
$f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y \in S$ with $x \neq y$, $\lambda \in (0, 1)$
◮ A function $f: S \to \mathbb{R}$ is concave if $-f$ is convex.

[Figure: the chord between $(x, f(x))$ and $(y, f(y))$ lies above the graph of $f$]
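The defining inequality can be tested on a grid for functions of one variable. This is only a sketch: the function `find_violation` is a hypothetical helper, and a grid search can find a counterexample to convexity but cannot certify convexity.

```python
import math

def find_violation(f, lo, hi, n=50, tol=1e-9):
    """Search a grid for (x, y, lam) with f(lam*x + (1-lam)*y) > lam*f(x) + (1-lam)*f(y).

    Returns the first violating triple, or None if the grid contains none.
    """
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    lams = [j / 10 for j in range(11)]
    for x in xs:
        for y in xs:
            for lam in lams:
                lhs = f(lam * x + (1.0 - lam) * y)
                rhs = lam * f(x) + (1.0 - lam) * f(y)
                if lhs > rhs + tol:
                    return (x, y, lam)
    return None

square_violation = find_violation(lambda x: x * x, -2.0, 2.0)   # x^2 is convex
exp_violation = find_violation(math.exp, -3.0, 3.0)             # exp is convex
sin_violation = find_violation(math.sin, 0.0, math.pi)          # sin is concave on [0, pi]
```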
First and second order condition for convexity

First-order condition: a differentiable $f$ with convex domain is convex if and only if
$f(y) \ge f(x) + \nabla f(x)^\top (y - x)$ for all $x, y \in \operatorname{dom} f$

[Figure: graph of $f(y)$ and its tangent $f(x) + \nabla f(x)^\top (y - x)$, touching at $(x, f(x))$]

Note: the first-order approximation of $f$ is a global underestimator.

Second-order condition: a twice differentiable $f$ with convex domain is convex if and only if
$\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$
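As a worked example connecting the two conditions, consider the quadratic $f(x) = x^\top B x$ with symmetric $B$ (an illustrative choice, not taken from the slides). Its gradient is $\nabla f(x) = 2Bx$ and its Hessian is $2B$, and the gap in the first-order condition can be computed exactly:

```latex
\begin{aligned}
f(y) - f(x) - \nabla f(x)^\top (y - x)
  &= y^\top B y - x^\top B x - 2 x^\top B (y - x) \\
  &= y^\top B y - 2 x^\top B y + x^\top B x \\
  &= (y - x)^\top B (y - x).
\end{aligned}
```

The gap is nonnegative for all $x, y$ exactly when $B \succeq 0$, which is the second-order condition $\nabla^2 f(x) = 2B \succeq 0$.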
Convex functions – Examples

Examples on $\mathbb{R}$:
◮ exponential: $e^{ax}$, for any $a \in \mathbb{R}$
◮ powers: $x^a$ on $\mathbb{R}_{++}$ (the positive reals) for $a \ge 1$ or $a \le 0$ (otherwise concave)
◮ negative logarithm: $-\log x$ on $\mathbb{R}_{++}$

Examples on $\mathbb{R}^n$:
◮ affine function: $f(x) = a^\top x + b$
◮ norms: $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ for $p \ge 1$; $\|x\|_\infty = \max_k |x_k|$
◮ convex quadratic: $f(x) = x^\top B x + g^\top x + c$ with $B \succeq 0$ ($\nabla^2 f(x) = 2B$)
◮ log-sum-exp: $f(x) = \log\left(\sum_{i=1}^n \exp(x_i)\right)$ ("smoothed max", as $\lim_{s \to 0^+} s f(x / s) = \max\{x_1, \dots, x_n\}$)
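The "smoothed max" property of log-sum-exp can be observed numerically. The sketch below uses the standard max-shift trick to avoid overflow; the function names are illustrative assumptions. It relies on the bound $\max_i x_i \le s f(x/s) \le \max_i x_i + s \log n$ for $s > 0$.

```python
import math

def log_sum_exp(x):
    """Numerically stable log(sum_i exp(x_i)), computed by shifting with max(x)."""
    m = max(x)
    return m + math.log(sum(math.exp(x_i - m) for x_i in x))

def smoothed_max(x, s):
    """Evaluate s * f(x / s) for the log-sum-exp function f and s > 0."""
    return s * log_sum_exp([x_i / s for x_i in x])

x = [1.0, 2.0, 3.0]
# The approximation tightens towards max(x) as s decreases.
approximations = [smoothed_max(x, s) for s in (1.0, 0.1, 0.01)]
```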
Operations that preserve convexity of functions

◮ nonnegative weighted sum: $f(x) = \sum_{j=1}^m \alpha_j f_j(x)$ is convex if $\alpha_j \ge 0$ and all $f_j$ are convex
◮ composition with affine function: $f(x) = g(Ax + b)$ is convex if $g$ is convex
◮ pointwise maximum: $f(x) = \max\{f_1(x), \dots, f_m(x)\}$ is convex if all $f_j$ are convex (this also holds for the supremum over infinitely many functions)
◮ minimization: if $g(x, u)$ is jointly convex in $(x, u)$ then $f(x) = \inf_u g(x, u)$ is convex
◮ convex in monotone convex: $f(x) = h(g(x))$ is convex if $g$ is convex and $h: \mathbb{R} \to \mathbb{R}$ is monotonically non-decreasing and convex.
Proof for smooth functions: $\nabla^2 f(x) = h''(g(x)) \nabla g(x) \nabla g(x)^\top + h'(g(x)) \nabla^2 g(x)$, and both terms are positive semidefinite because $h'' \ge 0$, $h' \ge 0$ and $\nabla^2 g(x) \succeq 0$.
Examples

◮ composition with affine function: $f(x) = \|Ax + b\|_2$
◮ expectation: $f(x) = \mathbb{E}_w\{\|A(w) x + b(w)\|_2\}$ is convex (nonnegative weighted sum)
◮ $f(x) = \exp(c^\top x + d) - \log(a^\top x + b)$ is convex on $\{x \mid a^\top x + b > 0\}$
◮ pointwise maximum: $f(x) = \max_{\|w\|_2 \le 1} (a + Pw)^\top x = a^\top x + \|P^\top x\|_2$ is convex (used for robust LP)
◮ minimization: for $R \succ 0$, regard
$f(x) = \min_u \begin{bmatrix} x \\ u \end{bmatrix}^\top \begin{bmatrix} Q & S^\top \\ S & R \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} = x^\top (Q - S^\top R^{-1} S) x.$
This $f(x)$ is convex if $\begin{bmatrix} Q & S^\top \\ S & R \end{bmatrix} \succeq 0$ (cf. Schur complement)
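The closed form used for the robust LP can be checked numerically. In the sketch below, the maximizing $w$ is taken as $P^\top x / \|P^\top x\|_2$ (valid when $P^\top x \neq 0$); the helper names and the data `a`, `P`, `x` are made up for illustration.

```python
import math

def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def closed_form(a, P, x):
    """a^T x + ||P^T x||_2, the claimed value of the maximum."""
    Ptx = mat_vec(transpose(P), x)
    return dot(a, x) + math.sqrt(dot(Ptx, Ptx))

def value_at(a, P, x, w):
    """(a + P w)^T x for one particular w."""
    Pw = mat_vec(P, w)
    return dot([a_i + p_i for a_i, p_i in zip(a, Pw)], x)

def maximizer(P, x):
    """w* = P^T x / ||P^T x||_2, assuming P^T x is nonzero."""
    Ptx = mat_vec(transpose(P), x)
    norm = math.sqrt(dot(Ptx, Ptx))
    return [v / norm for v in Ptx]

a = [1.0, 2.0]
P = [[1.0, 0.0], [1.0, 2.0]]
x = [3.0, -1.0]

best = closed_form(a, P, x)
attained = value_at(a, P, x, maximizer(P, x))
# Values for unit vectors w on a grid of angles; none should exceed `best`.
on_circle = [
    value_at(a, P, x, [math.cos(t), math.sin(t)])
    for t in [2.0 * math.pi * k / 360 for k in range(360)]
]
```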
Connecting convex sets and functions: sublevel sets

Theorem: The sublevel set $S = \{x \mid f(x) \le c\}$ of a convex function $f$ is a convex set.

Proof: $x, y \in S$ and convexity of $f$ imply for $t \in [0, 1]$ that
$f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y) \le c$.

Note: the sign of the inequality matters. Superlevel sets $\{x \mid f(x) \ge c\}$ of a convex function are in general not convex.
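A one-dimensional example makes the note about superlevel sets concrete. The sketch assumes $f(x) = x^2$ and $c = 1$, chosen only for illustration: the sublevel set is the interval $[-1, 1]$, while the superlevel set is the union of two rays.

```python
def f(x):
    return x * x

def in_sublevel(x, c=1.0):
    """Membership in {x | f(x) <= c}."""
    return f(x) <= c

def in_superlevel(x, c=1.0):
    """Membership in {x | f(x) >= c}."""
    return f(x) >= c

left, right = -1.0, 1.0
midpoint = 0.5 * left + 0.5 * right   # convex combination with weight 1/2

sublevel_keeps_midpoint = in_sublevel(left) and in_sublevel(right) and in_sublevel(midpoint)
superlevel_keeps_midpoint = in_superlevel(left) and in_superlevel(right) and in_superlevel(midpoint)
```

Both endpoints belong to both sets, but the midpoint 0 belongs only to the sublevel set.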
Convex sublevel sets – Examples

◮ norm balls: $\{x \in \mathbb{R}^n \mid \|x - x_c\| \le r\}$ for any norm $\|\cdot\|$, with radius $r > 0$ and center point $x_c$
◮ ellipsoids: $\{x \in \mathbb{R}^n \mid (x - x_c)^\top P^{-1} (x - x_c) \le 1\}$ for any positive definite shape matrix $P \succ 0$
◮ norm cones: $\{(x, t) \in \mathbb{R}^{n+1} \mid \|x\| \le t\}$
Recall: General Optimization Problem

$$\begin{aligned} \underset{z}{\text{minimize}} \quad & f(z) \\ \text{subject to} \quad & g_i(z) = 0, \quad i = 1, \dots, p \\ & h_i(z) \le 0, \quad i = 1, \dots, m \end{aligned}$$

◮ $z = (z_1, \dots, z_n)$: variables
◮ $f: \mathbb{R}^n \to \mathbb{R}$: objective function
◮ $g_i: \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, p$: equality constraint functions
◮ $h_i: \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$: inequality constraint functions
◮ $C := \{z \mid h_i(z) \le 0,\ i = 1, \dots, m,\ g_i(z) = 0,\ i = 1, \dots, p\}$: feasible set

[Figure: feasible set $C$, a level line of $f(z)$, and the minimizer $z^*$]
Optimality

minimal value: smallest possible cost $p^* := \inf\{f(z) \mid z \in C\}$.
minimizer: feasible $z^*$ with $f(z^*) = p^*$; set of all minimizers: $\{z \in C \mid f(z) = p^*\}$

◮ $z \in C$ is locally optimal if, for some $R > 0$, it satisfies
$y \in C,\ \|y - z\| \le R \ \Rightarrow\ f(y) \ge f(z)$
◮ $z \in C$ is globally optimal if it satisfies
$y \in C \ \Rightarrow\ f(y) \ge f(z)$
◮ If $p^* = -\infty$ the problem is unbounded below
◮ If $C$ is empty, then the problem is said to be infeasible (convention: $p^* = \infty$)

[Figures: a locally optimal point $z$ with its neighbourhood of radius $R$ in $C$; a globally optimal point $z$ in $C$]
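Convex optimization problem in standard form

$$\begin{aligned} \underset{z}{\text{minimize}} \quad & f(z) \\ \text{subject to} \quad & h_i(z) \le 0, \quad i = 1, \dots, m \\ & c_i^\top z = b_i, \quad i = 1, \dots, p \end{aligned}$$

◮ $f, h_1, \dots, h_m$ are convex
◮ equality constraints are affine

often rewritten as

$$\begin{aligned} \underset{z}{\text{minimize}} \quad & f(z) \\ \text{subject to} \quad & h(z) \le 0 \\ & Cz = b \end{aligned}$$

where $C \in \mathbb{R}^{p \times n}$ and $h: \mathbb{R}^n \to \mathbb{R}^m$.

Note: With nonlinear equalities, the feasible set would generally not be convex.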
Local and global optimality in convex optimization

Lemma: Any locally optimal point of a convex problem is globally optimal.

Proof: Assume $x$ is locally optimal and that there is a feasible $y$ such that $f(y) < f(x)$. Local optimality of $x$ implies that there exists an $R > 0$ such that every feasible $z$ satisfies
$\|z - x\|_2 \le R \ \Rightarrow\ f(z) \ge f(x)$.
Now take $z = \lambda y + (1 - \lambda) x$ with $\lambda \in (0, 1]$ small enough that $\|z - x\|_2 \le R$. This $z$ is feasible because the feasible set is convex, and convexity of $f$ gives
$f(z) \le \lambda f(y) + (1 - \lambda) f(x) < f(x)$,
which contradicts $f(z) \ge f(x)$.

[Figure: graph of $f$ with points $x$, $z$, $y$; $z$ lies on the segment from $x$ to $y$, within distance $R$ of $x$]
Linear Program (LP)

$$\begin{aligned} \underset{x}{\text{minimize}} \quad & c^\top x \\ \text{subject to} \quad & c_i^\top x + d_i \le 0, \quad i = 1, \dots, m \\ & Ax = b \end{aligned}$$
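LP Example

$$\begin{aligned} \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad & \|Ax + b\|_1 \\ \text{subject to} \quad & Cx + d = 0 \end{aligned}$$

equivalent to

$$\begin{aligned} \underset{x \in \mathbb{R}^n,\ s \in \mathbb{R}^m}{\text{minimize}} \quad & \sum_{i=1}^m s_i \\ \text{subject to} \quad & -s \le Ax + b \le s \\ & Cx + d = 0 \end{aligned}$$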
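The equivalence rests on the fact that, for fixed $x$, the smallest slack vector satisfying $-s \le Ax + b \le s$ is $s_i = |(Ax + b)_i|$, so the optimal slack sum equals $\|Ax + b\|_1$. The sketch below checks this for one fixed $x$ without calling an LP solver; the helper names and the data are illustrative assumptions.

```python
def residual(A, b, x):
    """Compute Ax + b row by row."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i for row, b_i in zip(A, b)]

def tightest_slack(A, b, x):
    """Smallest s with -s <= Ax + b <= s, namely s_i = |(Ax + b)_i|."""
    return [abs(r_i) for r_i in residual(A, b, x)]

def slack_is_feasible(A, b, x, s, tol=1e-12):
    """Check -s <= Ax + b <= s component-wise."""
    return all(-s_i - tol <= r_i <= s_i + tol for r_i, s_i in zip(residual(A, b, x), s))

A = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0]]
b = [-1.0, 0.0, 2.0]
x = [1.0, -1.0]

s = tightest_slack(A, b, x)
l1_norm = sum(abs(r_i) for r_i in residual(A, b, x))

# Shrinking any single slack component makes the constraint infeasible.
shrunk_feasible = [
    slack_is_feasible(A, b, x, [s_j - 0.1 if j == i else s_j for j, s_j in enumerate(s)])
    for i in range(len(s))
]
```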
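Quadratic Program (QP)

$$\begin{aligned} \underset{x}{\text{minimize}} \quad & c^\top x + \tfrac{1}{2} x^\top B x \\ \text{subject to} \quad & c_i^\top x + d_i \le 0, \quad i = 1, \dots, m \\ & Ax = b \end{aligned}$$

◮ convex if $B \succeq 0$
◮ strictly convex if $B \succ 0$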
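In the special case without any constraints (an assumption made here for illustration, not the general QP above), a strictly convex QP with symmetric $B \succ 0$ has the unique minimizer solving $Bx = -c$, since the gradient of the objective is $c + Bx$. A minimal 2×2 sketch using Cramer's rule, with hypothetical helper names and made-up data:

```python
def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def qp_objective(B, c, x):
    """c^T x + 0.5 * x^T B x."""
    return dot(c, x) + 0.5 * dot(x, mat_vec(B, x))

def unconstrained_minimizer_2x2(B, c):
    """Solve B x = -c by Cramer's rule for a symmetric positive definite 2x2 matrix B."""
    (b11, b12), (b21, b22) = B
    det = b11 * b22 - b12 * b21
    if b11 <= 0.0 or det <= 0.0:
        raise ValueError("B must be positive definite")
    r1, r2 = -c[0], -c[1]
    return [(r1 * b22 - b12 * r2) / det, (b11 * r2 - b21 * r1) / det]

B = [[2.0, 1.0], [1.0, 3.0]]
c = [-1.0, -2.0]

x_star = unconstrained_minimizer_2x2(B, c)
gradient = [g_i + c_i for g_i, c_i in zip(mat_vec(B, x_star), c)]
f_star = qp_objective(B, c, x_star)
```

With constraints present, the stationarity condition alone is no longer sufficient and a QP solver would be needed.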
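Quadratically Constrained Quadratic Program (QCQP)

$$\begin{aligned} \underset{x}{\text{minimize}} \quad & x^\top B_0 x + c_0^\top x + r_0 \\ \text{subject to} \quad & x^\top B_i x + c_i^\top x + r_i \le 0, \quad i = 1, \dots, m \\ & Ax = b \end{aligned}$$

◮ convex if $B_0, \dots, B_m \succeq 0$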