
Convex Optimization: Modeling and Algorithms

Lieven Vandenberghe
Electrical Engineering Department, UC Los Angeles

Tutorial lectures, 21st Machine Learning Summer School
Kyoto, August 29–30, 2012

Introduction


  1. Examples (conjugate functions)

     • convex quadratic function (Q ≻ 0):

         f(x) = (1/2) x^T Q x,    f*(y) = (1/2) y^T Q^{-1} y

     • negative entropy:

         f(x) = Σ_{i=1}^n x_i log x_i,    f*(y) = Σ_{i=1}^n e^{y_i − 1}

     • norm:

         f(x) = ‖x‖,    f*(y) = 0 if ‖y‖_* ≤ 1, +∞ otherwise

     • indicator function (C convex):

         f(x) = I_C(x) = 0 if x ∈ C, +∞ otherwise;    f*(y) = sup_{x ∈ C} y^T x

  2. Convex optimization problems

     • linear programming
     • quadratic programming
     • geometric programming
     • second-order cone programming
     • semidefinite programming

  3. Convex optimization problem

         minimize    f_0(x)
         subject to  f_i(x) ≤ 0,  i = 1, ..., m
                     Ax = b

     f_0, f_1, ..., f_m are convex functions

     • feasible set is convex
     • locally optimal points are globally optimal
     • tractable, in theory and practice

  4. Linear program (LP)

         minimize    c^T x + d
         subject to  Gx ≤ h
                     Ax = b

     • inequality is componentwise vector inequality
     • convex problem with affine objective and constraint functions
     • feasible set is a polyhedron

     (figure: polyhedron P with optimal point x⋆ and objective direction −c)
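
     A minimal sketch of this LP in CVXPY (the modeling package discussed
     later in these slides); the problem data and the box bounds that keep
     the problem bounded are placeholder assumptions, not from the lecture.

         import cvxpy as cp
         import numpy as np

         np.random.seed(0)
         m, n, p = 8, 3, 1
         c = np.random.randn(n)
         G = np.random.randn(m, n)
         A = np.random.randn(p, n)
         x0 = 0.5 * np.random.rand(n)   # a point we force to be feasible
         h = G @ x0 + 1.0               # so G x0 < h strictly
         b = A @ x0                     # so A x0 = b

         x = cp.Variable(n)
         prob = cp.Problem(cp.Minimize(c @ x),
                           [G @ x <= h, A @ x == b, x >= -1, x <= 1])
         prob.solve()
         print(prob.value, x.value)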

  5. Piecewise-linear minimization

         minimize  f(x) = max_{i=1,...,m} (a_i^T x + b_i)

     (figure: piecewise-linear f(x) as the maximum of the affine functions a_i^T x + b_i)

     equivalent linear program

         minimize    t
         subject to  a_i^T x + b_i ≤ t,  i = 1, ..., m

     an LP with variables x, t ∈ R

  6. ℓ₁-norm and ℓ∞-norm minimization

     ℓ₁-norm approximation (‖y‖₁ = Σ_k |y_k|) and equivalent LP:

         minimize  ‖Ax − b‖₁

         minimize    Σ_i y_i
         subject to  −y ≤ Ax − b ≤ y

     ℓ∞-norm approximation (‖y‖∞ = max_k |y_k|) and equivalent LP:

         minimize  ‖Ax − b‖∞

         minimize    y
         subject to  −y·1 ≤ Ax − b ≤ y·1

     (1 is the vector of ones)
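
     A sketch comparing the direct ℓ₁ formulation with the explicit LP
     epigraph form, assuming random placeholder data A, b:

         import cvxpy as cp
         import numpy as np

         np.random.seed(0)
         m, n = 20, 5
         A, b = np.random.randn(m, n), np.random.randn(m)

         # direct form
         x = cp.Variable(n)
         p1 = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 1)))
         p1.solve()

         # equivalent LP with auxiliary variable y
         x2, y = cp.Variable(n), cp.Variable(m)
         p2 = cp.Problem(cp.Minimize(cp.sum(y)),
                         [-y <= A @ x2 - b, A @ x2 - b <= y])
         p2.solve()
         print(p1.value, p2.value)   # optimal values agree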

  7. Example: histograms of residuals Ax − b (with A 200 × 80) for

         x_ls = argmin ‖Ax − b‖₂,    x_ℓ₁ = argmin ‖Ax − b‖₁

     (figure: histograms of (Ax_ls − b)_k and (Ax_ℓ₁ − b)_k)

     the ℓ₁-norm residual distribution is wider, with a high peak at zero

  8. Robust regression

     (figure: data and fitted lines f(t) over t ∈ [−10, 10])

     • 42 points t_i, y_i (circles), including two outliers
     • function f(t) = α + βt fitted using the 2-norm (dashed) and the 1-norm

  9. Linear discrimination

     • given a set of points {x_1, ..., x_N} with binary labels s_i ∈ {−1, 1}
     • find hyperplane a^T x + b = 0 that strictly separates the two classes:

         a^T x_i + b > 0  if s_i = 1
         a^T x_i + b < 0  if s_i = −1

     homogeneous in a, b, hence equivalent to the linear inequalities (in a, b)

         s_i (a^T x_i + b) ≥ 1,  i = 1, ..., N

  10. Approximate linear separation of non-separable sets

          minimize  Σ_{i=1}^N max{0, 1 − s_i (a^T x_i + b)}

      • a piecewise-linear minimization problem in a, b; equivalent to an LP
      • can be interpreted as a heuristic for minimizing the number of
        misclassified points
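
      A minimal CVXPY sketch of this hinge-loss problem; the labeled points
      (X, s) are synthetic placeholders:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          N = 50
          X = np.vstack([np.random.randn(N // 2, 2) + 2,
                         np.random.randn(N // 2, 2) - 2])
          s = np.hstack([np.ones(N // 2), -np.ones(N // 2)])

          a, b = cp.Variable(2), cp.Variable()
          margins = cp.multiply(s, X @ a + b)
          loss = cp.sum(cp.pos(1 - margins))  # sum_i max{0, 1 - s_i(a^T x_i + b)}
          cp.Problem(cp.Minimize(loss)).solve()
          print(a.value, b.value)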

  11. Quadratic program (QP)

          minimize    (1/2) x^T P x + q^T x + r
          subject to  Gx ≤ h

      • P ∈ S^n_+, so objective is convex quadratic
      • minimize a convex quadratic function over a polyhedron

      (figure: polyhedron P with optimal point x⋆ and −∇f_0(x⋆))

  12. Linear program with random cost

          minimize    c^T x
          subject to  Gx ≤ h

      • c is a random vector with mean c̄ and covariance Σ
      • hence c^T x is a random variable with mean c̄^T x and variance x^T Σ x

      expected cost–variance trade-off

          minimize    E c^T x + γ var(c^T x) = c̄^T x + γ x^T Σ x
          subject to  Gx ≤ h

      γ > 0 is a risk-aversion parameter
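
      A sketch of the cost–variance trade-off QP; the data (c̄, Σ, box
      constraints) and γ = 0.5 are assumed placeholders:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          n = 4
          cbar = np.random.randn(n)
          F = np.random.randn(n, n)
          Sigma = F @ F.T + np.eye(n)     # positive definite covariance
          G = np.vstack([np.eye(n), -np.eye(n)])
          h = np.ones(2 * n)              # box constraint |x_i| <= 1
          gamma = 0.5

          x = cp.Variable(n)
          objective = cbar @ x + gamma * cp.quad_form(x, Sigma)
          cp.Problem(cp.Minimize(objective), [G @ x <= h]).solve()
          print(x.value)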

  13. Robust linear discrimination

          H_1  = {z | a^T z + b = 1}
          H_{−1} = {z | a^T z + b = −1}

      distance between the hyperplanes is 2/‖a‖₂, so to separate two sets of
      points by maximum margin,

          minimize    ‖a‖₂² = a^T a
          subject to  s_i (a^T x_i + b) ≥ 1,  i = 1, ..., N

      a quadratic program in a, b

  14. Support vector classifier

          minimize  γ ‖a‖₂² + Σ_{i=1}^N max{0, 1 − s_i (a^T x_i + b)}

      (figure: classifiers obtained for γ = 0 and γ = 10)

      equivalent to a quadratic program
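
      A sketch of this support vector classifier QP on synthetic labeled
      data; γ = 10 is one of the values shown on the slide:

          import cvxpy as cp
          import numpy as np

          np.random.seed(1)
          N = 60
          X = np.vstack([np.random.randn(N // 2, 2) + 1.5,
                         np.random.randn(N // 2, 2) - 1.5])
          s = np.hstack([np.ones(N // 2), -np.ones(N // 2)])
          gamma = 10.0

          a, b = cp.Variable(2), cp.Variable()
          hinge = cp.sum(cp.pos(1 - cp.multiply(s, X @ a + b)))
          cp.Problem(cp.Minimize(gamma * cp.sum_squares(a) + hinge)).solve()
          print(a.value, b.value)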

  15. Kernel formulation

          minimize  f(Xa) + ‖a‖₂²

      • variables a ∈ R^n
      • X ∈ R^{N×n} with N ≤ n and rank N

      change of variables

          y = Xa,    a = X^T (XX^T)^{-1} y

      • a is the minimum-norm solution of Xa = y
      • gives a convex problem with N variables y:

          minimize  f(y) + y^T Q^{-1} y

      Q = XX^T is the kernel matrix

  16. Total variation signal reconstruction

          minimize  ‖x̂ − x_cor‖₂² + γ φ(x̂)

      • x_cor = x + v is a corrupted version of unknown signal x, with noise v
      • variable x̂ (reconstructed signal) is an estimate of x
      • φ : R^n → R is a quadratic or total variation smoothing penalty

          φ_quad(x̂) = Σ_{i=1}^{n−1} (x̂_{i+1} − x̂_i)²,
          φ_tv(x̂)   = Σ_{i=1}^{n−1} |x̂_{i+1} − x̂_i|
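
      A sketch of the two penalties on a synthetic piecewise-constant
      signal; the signal, noise level, and γ = 5 are assumed, not from the
      lecture:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          n = 500
          x = np.zeros(n)
          x[100:300], x[300:] = 1.0, -0.5        # piecewise-constant signal
          x_cor = x + 0.2 * np.random.randn(n)   # corrupted version

          xhat = cp.Variable(n)
          gamma = 5.0
          quad = cp.sum_squares(cp.diff(xhat))   # phi_quad
          tv = cp.norm(cp.diff(xhat), 1)         # phi_tv

          for phi in (quad, tv):
              cp.Problem(cp.Minimize(cp.sum_squares(xhat - x_cor)
                                     + gamma * phi)).solve()
              print(np.round(np.linalg.norm(xhat.value - x), 3))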

  17. Example: x_cor, and reconstruction with quadratic and total variation smoothing

      (figure: x_cor, the quadratic reconstruction, and the t.v. reconstruction,
      plotted over i = 0, ..., 2000)

      • quadratic smoothing smooths out noise and sharp transitions in the signal
      • total variation smoothing preserves sharp transitions in the signal

  18. Geometric programming

      posynomial function

          f(x) = Σ_{k=1}^K c_k x_1^{a_{1k}} x_2^{a_{2k}} · · · x_n^{a_{nk}},
          dom f = R^n_{++}

      with c_k > 0

      geometric program (GP)

          minimize    f_0(x)
          subject to  f_i(x) ≤ 1,  i = 1, ..., m

      with f_i posynomial

  19. Geometric program in convex form

      change variables to y_i = log x_i, and take the logarithm of cost and constraints

      geometric program in convex form:

          minimize    log( Σ_{k=1}^K exp(a_{0k}^T y + b_{0k}) )
          subject to  log( Σ_{k=1}^K exp(a_{ik}^T y + b_{ik}) ) ≤ 0,  i = 1, ..., m

      b_{ik} = log c_{ik}

  20. Second-order cone program (SOCP)

          minimize    f^T x
          subject to  ‖A_i x + b_i‖₂ ≤ c_i^T x + d_i,  i = 1, ..., m

      • ‖·‖₂ is the Euclidean norm ‖y‖₂ = (y_1² + · · · + y_n²)^{1/2}
      • constraints are nonlinear, nondifferentiable, convex

      constraints are inequalities w.r.t. the second-order cone:

          { y | (y_1² + · · · + y_{p−1}²)^{1/2} ≤ y_p }

      (figure: second-order cone in R³)
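
      A minimal SOCP sketch with one cone constraint; the data are random
      placeholders and the extra norm ball is an assumption that keeps the
      problem bounded:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          n = 3
          f = np.random.randn(n)
          A, b = np.random.randn(4, n), np.random.randn(4)
          c, d = np.random.randn(n), 10.0   # d large enough that x = 0 is feasible

          x = cp.Variable(n)
          prob = cp.Problem(cp.Minimize(f @ x),
                            [cp.norm(A @ x + b, 2) <= c @ x + d,
                             cp.norm(x, 2) <= 2])
          prob.solve()
          print(prob.value)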

  21. Robust linear program (stochastic)

          minimize    c^T x
          subject to  prob(a_i^T x ≤ b_i) ≥ η,  i = 1, ..., m

      • a_i random and normally distributed with mean ā_i, covariance Σ_i
      • we require that x satisfies each constraint with probability exceeding η

      (figure: feasible sets for η = 10%, η = 50%, η = 90%)

  22. SOCP formulation

      the 'chance constraint' prob(a_i^T x ≤ b_i) ≥ η is equivalent to the constraint

          ā_i^T x + Φ^{-1}(η) ‖Σ_i^{1/2} x‖₂ ≤ b_i

      Φ is the (unit) normal cumulative distribution function

      (figure: Φ(t) with the quantile Φ^{-1}(η) marked)

      robust LP is a second-order cone program for η ≥ 0.5
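
      A sketch of one chance constraint written as the SOC constraint above;
      ā_i, Σ_i, b_i, η, and c are placeholders, and scipy supplies the
      normal quantile Φ^{-1}(η):

          import cvxpy as cp
          import numpy as np
          from scipy.stats import norm
          from scipy.linalg import sqrtm

          np.random.seed(0)
          n = 3
          c = np.random.randn(n)
          abar = np.random.randn(n)
          F = np.random.randn(n, n)
          Sigma = F @ F.T + 0.1 * np.eye(n)
          b_i, eta = 1.0, 0.9

          x = cp.Variable(n)
          S_half = np.real(sqrtm(Sigma))
          chance = abar @ x + norm.ppf(eta) * cp.norm(S_half @ x, 2) <= b_i
          cp.Problem(cp.Minimize(c @ x),
                     [chance, cp.norm(x, 2) <= 1]).solve()
          print(x.value)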

  23. Robust linear program (deterministic)

          minimize    c^T x
          subject to  a_i^T x ≤ b_i for all a_i ∈ E_i,  i = 1, ..., m

      • a_i uncertain but bounded by ellipsoid E_i = { ā_i + P_i u | ‖u‖₂ ≤ 1 }
      • we require that x satisfies each constraint for all possible a_i

      SOCP formulation

          minimize    c^T x
          subject to  ā_i^T x + ‖P_i^T x‖₂ ≤ b_i,  i = 1, ..., m

      follows from

          sup_{‖u‖₂ ≤ 1} (ā_i + P_i u)^T x = ā_i^T x + ‖P_i^T x‖₂

  24. Examples of second-order cone constraints

      convex quadratic constraint (A = LL^T positive definite)

          x^T A x + 2 b^T x + c ≤ 0
              ⇔  ‖L^T x + L^{-1} b‖₂ ≤ (b^T A^{-1} b − c)^{1/2}

      extends to positive semidefinite singular A

      hyperbolic constraint

          x^T x ≤ yz,  y, z ≥ 0
              ⇔  ‖(2x, y − z)‖₂ ≤ y + z,  y, z ≥ 0

  25. Examples of SOC-representable constraints

      positive powers

          x^{1.5} ≤ t, x ≥ 0   ⇔   ∃z : x² ≤ tz,  z² ≤ x,  x, z ≥ 0

      • the two hyperbolic constraints can be converted to SOC constraints
      • extends to powers x^p for rational p ≥ 1

      negative powers

          x^{−3} ≤ t, x > 0   ⇔   ∃z : 1 ≤ xz,  z² ≤ tx,  x, z ≥ 0

      • the two hyperbolic constraints on the r.h.s. can be converted to SOC
        constraints
      • extends to powers x^p for rational p < 0

  26. Semidefinite program (SDP)

          minimize    c^T x
          subject to  x_1 A_1 + x_2 A_2 + · · · + x_n A_n ⪯ B

      • A_1, A_2, ..., A_n, B are symmetric matrices
      • inequality X ⪯ Y means Y − X is positive semidefinite, i.e.,

          z^T (Y − X) z = Σ_{i,j} (Y_{ij} − X_{ij}) z_i z_j ≥ 0  for all z

      • includes many nonlinear constraints as special cases

  27. Geometry

      (figure: the cone of (x, y, z) with [x  y; y  z] ⪰ 0)

      • a nonpolyhedral convex cone
      • feasible set of a semidefinite program is the intersection of the
        positive semidefinite cone in high dimension with planes

  28. Examples

          A(x) = A_0 + x_1 A_1 + · · · + x_m A_m    (A_i ∈ S^n)

      eigenvalue minimization, and equivalent SDP:

          minimize  λ_max(A(x))

          minimize    t
          subject to  A(x) ⪯ tI

      matrix-fractional function, and equivalent SDP:

          minimize    b^T A(x)^{-1} b
          subject to  A(x) ⪰ 0

          minimize    t
          subject to  [ A(x)  b ]
                      [ b^T   t ] ⪰ 0
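
      A sketch of eigenvalue minimization in CVXPY, both directly and via
      the LMI epigraph form; the symmetric data matrices are random
      placeholders:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          n, m = 4, 2
          def sym(M): return (M + M.T) / 2
          A0 = sym(np.random.randn(n, n))
          As = [sym(np.random.randn(n, n)) for _ in range(m)]

          x = cp.Variable(m)
          Ax = A0 + sum(x[i] * As[i] for i in range(m))

          # direct form using lambda_max ...
          p1 = cp.Problem(cp.Minimize(cp.lambda_max(Ax)))
          p1.solve()

          # ... and the explicit LMI epigraph form A(x) <= t I
          t = cp.Variable()
          p2 = cp.Problem(cp.Minimize(t), [Ax << t * np.eye(n)])
          p2.solve()
          print(p1.value, p2.value)   # should agree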

  29. Matrix norm minimization

          A(x) = A_0 + x_1 A_1 + x_2 A_2 + · · · + x_n A_n    (A_i ∈ R^{p×q})

      matrix norm approximation (‖X‖₂ = max_k σ_k(X)), and equivalent SDP:

          minimize  ‖A(x)‖₂

          minimize    t
          subject to  [ tI      A(x) ]
                      [ A(x)^T  tI   ] ⪰ 0

      nuclear norm approximation (‖X‖_* = Σ_k σ_k(X)), and equivalent SDP:

          minimize  ‖A(x)‖_*

          minimize    (tr U + tr V)/2
          subject to  [ U       A(x) ]
                      [ A(x)^T  V    ] ⪰ 0

  30. Semidefinite relaxation

      semidefinite programming is often used

      • to find good bounds for nonconvex polynomial problems, via relaxation
      • as a heuristic for finding good suboptimal points

      example: Boolean least-squares

          minimize    ‖Ax − b‖₂²
          subject to  x_i² = 1,  i = 1, ..., n

      • basic problem in digital communications
      • could check all 2^n possible values of x ∈ {−1, 1}^n ...
      • an NP-hard problem, and very hard in general

  31. Lifting

      Boolean least-squares problem

          minimize    x^T A^T A x − 2 b^T A x + b^T b
          subject to  x_i² = 1,  i = 1, ..., n

      reformulation: introduce new variable Y = xx^T

          minimize    tr(A^T A Y) − 2 b^T A x + b^T b
          subject to  Y = xx^T
                      diag(Y) = 1

      • cost function and second constraint are linear (in the variables Y, x)
      • first constraint is nonlinear and nonconvex

      ... still a very hard problem

  32. Relaxation

      replace Y = xx^T with the weaker constraint Y ⪰ xx^T to obtain the relaxation

          minimize    tr(A^T A Y) − 2 b^T A x + b^T b
          subject to  Y ⪰ xx^T
                      diag(Y) = 1

      • convex; can be solved as a semidefinite program:

          Y ⪰ xx^T   ⇔   [ Y    x ]
                          [ x^T  1 ] ⪰ 0

      • optimal value gives a lower bound for the Boolean LS problem
      • if Y = xx^T at the optimum, we have solved the exact problem
      • otherwise, can use randomized rounding:
        generate z from N(x, Y − xx^T) and take x = sign(z)
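
      A hedged sketch of this relaxation with randomized rounding, using the
      Schur-complement form above; A, b, the size n = 8, the jitter on the
      sampling covariance, and the 100 rounding samples are all assumptions:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          n = 8
          A = np.random.randn(n, n)
          b = np.random.randn(n)

          x = cp.Variable(n)
          Y = cp.Variable((n, n), symmetric=True)
          M = cp.bmat([[Y, cp.reshape(x, (n, 1))],
                       [cp.reshape(x, (1, n)), np.ones((1, 1))]])
          prob = cp.Problem(
              cp.Minimize(cp.trace(A.T @ A @ Y) - 2 * b @ A @ x + b @ b),
              [M >> 0, cp.diag(Y) == 1])
          prob.solve()   # optimal value is a lower bound on the Boolean LS value

          # randomized rounding: sample z ~ N(x, Y - x x^T), take sign(z)
          cov = Y.value - np.outer(x.value, x.value) + 1e-9 * np.eye(n)
          best = min(
              np.linalg.norm(
                  A @ np.sign(np.random.multivariate_normal(x.value, cov)) - b)**2
              for _ in range(100))
          print(prob.value, best)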

  33. Example

      (figure: histogram of ‖Ax − b‖₂ / (SDP bound) for the randomized
      solutions, with the SDP bound and the LS solution marked)

      • n = 100: feasible set has 2^100 ≈ 10^30 points
      • histogram of 1000 randomized solutions from the SDP relaxation

  34. Overview

      1. Basic theory and convex modeling
         • convex sets and functions
         • common problem classes and applications
      2. Interior-point methods for conic optimization
         • conic optimization
         • barrier methods
         • symmetric primal-dual methods
      3. First-order methods
         • (proximal) gradient algorithms
         • dual techniques and multiplier methods

  35. Conic optimization

      • definitions and examples
      • modeling
      • duality

  36. Generalized (conic) inequalities

      conic inequality: a constraint x ∈ K with K a convex cone in R^m

      we require that K is a proper cone:

      • closed
      • pointed: does not contain a line (equivalently, K ∩ (−K) = {0})
      • with nonempty interior: int K ≠ ∅ (equivalently, K + (−K) = R^m)

      notation

          x ⪰_K y  ⇔  x − y ∈ K,        x ≻_K y  ⇔  x − y ∈ int K

      the subscript in ⪰_K is omitted if K is clear from the context

  37. Cone linear program

          minimize    c^T x
          subject to  Ax ⪯_K b

      if K is the nonnegative orthant, this is a (regular) linear program

      widely used in recent literature on convex optimization

      • modeling: a small number of 'primitive' cones is sufficient to express
        most convex constraints that arise in practice
      • algorithms: a convenient problem format when extending interior-point
        algorithms for linear programming to convex optimization

  38. Norm cone

          K = { (x, y) ∈ R^{m−1} × R | ‖x‖ ≤ y }

      (figure: norm cone in R³)

      for the Euclidean norm this is the second-order cone (notation: Q^m)

  39. Second-order cone program

          minimize    c^T x
          subject to  ‖B_{k0} x + d_{k0}‖₂ ≤ B_{k1} x + d_{k1},  k = 1, ..., r

      cone LP formulation: express constraints as Ax ⪯_K b, with

          K = Q^{m_1} × · · · × Q^{m_r}

          A = [ −B_{10}; −B_{11}; ...; −B_{r0}; −B_{r1} ],
          b = [ d_{10}; d_{11}; ...; d_{r0}; d_{r1} ]

      (assuming B_{k0}, d_{k0} have m_k − 1 rows)

  40. Vector notation for symmetric matrices

      • vectorized symmetric matrix: for U ∈ S^p

          vec(U) = ( U_{11}, √2·U_{21}, ..., √2·U_{p1},
                     U_{22}, √2·U_{32}, ..., √2·U_{p2}, ..., U_{pp} )

      • inverse operation: for u = (u_1, u_2, ..., u_n) ∈ R^n with n = p(p+1)/2

          mat(u) = (1/√2) [ √2·u_1   u_2           · · ·  u_p
                            u_2      √2·u_{p+1}    · · ·  u_{2p−1}
                            ...
                            u_p      u_{2p−1}      · · ·  √2·u_{p(p+1)/2} ]

      the coefficients √2 are added so that standard inner products are preserved:

          tr(UV) = vec(U)^T vec(V),    u^T v = tr(mat(u) mat(v))
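
      A NumPy sketch of these scaled vec/mat maps; the column-by-column
      ordering of the lower triangle follows the slide's convention:

          import numpy as np

          def vec(U):
              # stack the lower triangle column by column,
              # scaling off-diagonal entries by sqrt(2)
              p = U.shape[0]
              out = []
              for j in range(p):
                  out.append(U[j, j])
                  out.extend(np.sqrt(2) * U[j + 1:, j])
              return np.array(out)

          def mat(u, p):
              # inverse of vec: rebuild the symmetric matrix
              U = np.zeros((p, p))
              k = 0
              for j in range(p):
                  U[j, j] = u[k]; k += 1
                  U[j + 1:, j] = u[k:k + p - 1 - j] / np.sqrt(2)
                  k += p - 1 - j
              return U + np.tril(U, -1).T

          # check that inner products are preserved: tr(UV) = vec(U)^T vec(V)
          rng = np.random.default_rng(0)
          U = rng.standard_normal((4, 4)); U = U + U.T
          V = rng.standard_normal((4, 4)); V = V + V.T
          print(np.allclose(np.trace(U @ V), vec(U) @ vec(V)),
                np.allclose(mat(vec(U), 4), U))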

  41. Positive semidefinite cone

          S^p = { vec(X) | X ∈ S^p_+ } = { x ∈ R^{p(p+1)/2} | mat(x) ⪰ 0 }

      (figure: the cone S² = { (x, y, z) | [x  y/√2; y/√2  z] ⪰ 0 })

  42. Semidefinite program

          minimize    c^T x
          subject to  x_1 A_{11} + x_2 A_{12} + · · · + x_n A_{1n} ⪯ B_1
                      ...
                      x_1 A_{r1} + x_2 A_{r2} + · · · + x_n A_{rn} ⪯ B_r

      r linear matrix inequalities of order p_1, ..., p_r

      cone LP formulation: express constraints as Ax ⪯_K b, with

          K = S^{p_1} × S^{p_2} × · · · × S^{p_r}

          A = [ vec(A_{11})  vec(A_{12})  · · ·  vec(A_{1n})
                vec(A_{21})  vec(A_{22})  · · ·  vec(A_{2n})
                ...
                vec(A_{r1})  vec(A_{r2})  · · ·  vec(A_{rn}) ],

          b = [ vec(B_1); vec(B_2); ...; vec(B_r) ]

  43. Exponential cone

      the epigraph of the perspective of exp x is a non-proper cone

          K = { (x, y, z) ∈ R³ | y e^{x/y} ≤ z, y > 0 }

      the exponential cone is

          K_exp = cl K = K ∪ { (x, 0, z) | x ≤ 0, z ≥ 0 }

      (figure: the exponential cone)

  44. Geometric program

          minimize    c^T x
          subject to  log Σ_{k=1}^{n_i} exp(a_{ik}^T x + b_{ik}) ≤ 0,  i = 1, ..., r

      cone LP formulation

          minimize    c^T x
          subject to  ( a_{ik}^T x + b_{ik}, 1, z_{ik} ) ∈ K_exp,
                          k = 1, ..., n_i,  i = 1, ..., r
                      Σ_{k=1}^{n_i} z_{ik} ≤ 1,  i = 1, ..., r

  45. Power cone

      definition: for α = (α_1, α_2, ..., α_m) > 0 with Σ_{i=1}^m α_i = 1,

          K_α = { (x, y) ∈ R^m_+ × R | |y| ≤ x_1^{α_1} · · · x_m^{α_m} }

      (figure: examples for m = 2, with α = (1/2, 1/2), α = (2/3, 1/3),
      α = (3/4, 1/4))

  46. Outline

      • definition and examples
      • modeling
      • duality

  47. Modeling software

      modeling packages for convex optimization

      • CVX, YALMIP (MATLAB)
      • CVXPY, CVXMOD (Python)

      assist the user in formulating convex problems, by automating two tasks:

      • verifying convexity from convex calculus rules
      • transforming the problem into the input format required by standard solvers

      related packages

      general-purpose optimization modeling: AMPL, GAMS

  48. CVX example

          minimize    ‖Ax − b‖₁
          subject to  0 ≤ x_k ≤ 1,  k = 1, ..., n

      MATLAB code

          cvx_begin
              variable x(3);
              minimize(norm(A*x - b, 1))
              subject to
                  x >= 0;
                  x <= 1;
          cvx_end

      • between cvx_begin and cvx_end, x is a CVX variable
      • after execution, x is a MATLAB variable with the optimal solution
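
      For comparison, a sketch of the same problem in CVXPY (one of the
      Python packages named on the previous slide); A and b are placeholder
      data:

          import cvxpy as cp
          import numpy as np

          np.random.seed(0)
          A, b = np.random.randn(5, 3), np.random.randn(5)

          x = cp.Variable(3)
          prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 1)),
                            [x >= 0, x <= 1])
          prob.solve()
          print(x.value)   # after solving, x.value holds the optimal solution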

  49. Modeling and conic optimization

      convex modeling systems (CVX, YALMIP, CVXPY, CVXMOD, ...)

      • convert problems stated in standard mathematical notation to cone LPs
      • in principle, any convex problem can be represented as a cone LP
      • in practice, a small set of primitive cones is used (R^n_+, Q^p, S^p)
      • choice of cones is limited by available algorithms and solvers (see later)

      modeling systems implement a set of rules for expressing constraints

          f(x) ≤ t

      as conic inequalities for the implemented cones

  50. Examples of second-order cone representable functions

      • convex quadratic

          f(x) = x^T P x + q^T x + r    (P ⪰ 0)

      • quadratic-over-linear function

          f(x, y) = x^T x / y  with dom f = R^n × R_+    (assume 0/0 = 0)

      • convex powers with rational exponent

          f(x) = |x|^α,    f(x) = x^β for x > 0, +∞ for x ≤ 0

        for rational α ≥ 1 and β ≤ 0

      • p-norm f(x) = ‖x‖_p for rational p ≥ 1

  51. Examples of SD cone representable functions

      • matrix-fractional function

          f(X, y) = y^T X^{-1} y  with dom f = { (X, y) ∈ S^n_+ × R^n | y ∈ R(X) }

      • maximum eigenvalue of a symmetric matrix

      • maximum singular value f(X) = ‖X‖₂ = σ_1(X):

          ‖X‖₂ ≤ t   ⇔   [ tI   X  ]
                          [ X^T  tI ] ⪰ 0

      • nuclear norm f(X) = ‖X‖_* = Σ_i σ_i(X):

          ‖X‖_* ≤ t   ⇔   ∃ U, V :  [ U    X ]
                                     [ X^T  V ] ⪰ 0,   (1/2)(tr U + tr V) ≤ t

  52. Functions representable with exponential and power cone

      exponential cone

      • exponential and logarithm
      • entropy f(x) = x log x

      power cone

      • increasing power of absolute value: f(x) = |x|^p with p ≥ 1
      • decreasing power: f(x) = x^q with q ≤ 0 and domain R_{++}
      • p-norm: f(x) = ‖x‖_p with p ≥ 1

  53. Outline

      • definition and examples
      • modeling
      • duality

  54. Linear programming duality

      primal and dual LP

          (P)  minimize    c^T x        (D)  maximize    −b^T z
               subject to  Ax ≤ b            subject to  A^T z + c = 0
                                                         z ≥ 0

      • primal optimal value is p⋆ (+∞ if infeasible, −∞ if unbounded below)
      • dual optimal value is d⋆ (−∞ if infeasible, +∞ if unbounded above)

      duality theorem

      • weak duality: p⋆ ≥ d⋆, with no exception
      • strong duality: p⋆ = d⋆ if primal or dual is feasible
      • if p⋆ = d⋆ is finite, then primal and dual optima are attained

  55. Dual cone

      definition

          K* = { y | x^T y ≥ 0 for all x ∈ K }

      K* is a proper cone if K is a proper cone

      dual inequality: x ⪰_* y means x ⪰_{K*} y for generic proper cone K

      note: the dual cone depends on the choice of inner product:

          H^{-1} K* is the dual cone for the inner product ⟨x, y⟩ = x^T H y

  56. Examples

      • R^p_+, Q^p, S^p are self-dual: K = K*

      • dual of a norm cone is the norm cone of the dual norm

      • dual of the exponential cone:

          K*_exp = { (u, v, w) ∈ R_− × R × R_+ | −u log(−u/w) + u − v ≤ 0 }

        (with 0 log(0/w) = 0 if w ≥ 0)

      • dual of the power cone:

          K*_α = { (u, v) ∈ R^m_+ × R | |v| ≤ (u_1/α_1)^{α_1} · · · (u_m/α_m)^{α_m} }

  57. Primal and dual cone LP

      primal problem (optimal value p⋆)

          minimize    c^T x
          subject to  Ax ⪯ b

      dual problem (optimal value d⋆)

          maximize    −b^T z
          subject to  A^T z + c = 0
                      z ⪰_* 0

      weak duality: p⋆ ≥ d⋆ (without exception)

  58. Strong duality

          p⋆ = d⋆  if primal or dual is strictly feasible

      • slightly weaker than LP duality (which only requires feasibility)
      • can have d⋆ < p⋆ with finite p⋆ and d⋆

      other implications of strict feasibility

      • if primal is strictly feasible, then dual optimum is attained (if d⋆ is finite)
      • if dual is strictly feasible, then primal optimum is attained (if p⋆ is finite)

  59. Optimality conditions

          primal:  minimize    c^T x        dual:  maximize    −b^T z
                   subject to  Ax + s = b          subject to  A^T z + c = 0
                               s ⪰ 0                           z ⪰_* 0

      optimality conditions

          [ 0 ]   [ 0    A^T ] [ x ]   [ c ]
          [ s ] = [ −A   0   ] [ z ] + [ b ]

          s ⪰ 0,   z ⪰_* 0,   z^T s = 0

      duality gap: the inner product of (x, z) and (0, s) gives

          z^T s = c^T x + b^T z

  60. Barrier methods

      • barrier method for linear programming
      • normal barriers
      • barrier method for conic optimization

  61. History

      • 1960s: Sequentially Unconstrained Minimization Technique (SUMT)

        solves the nonlinear convex optimization problem

            minimize    f_0(x)
            subject to  f_i(x) ≤ 0,  i = 1, ..., m

        via a sequence of unconstrained minimization problems

            minimize  t f_0(x) − Σ_{i=1}^m log(−f_i(x))

      • 1980s: LP barrier methods with polynomial worst-case complexity
      • 1990s: barrier methods for non-polyhedral cone LPs

  62. Logarithmic barrier function for linear inequalities

      • barrier for the nonnegative orthant R^m_+:

            φ(s) = −Σ_{i=1}^m log s_i

      • barrier for the inequalities Ax ≤ b:

            ψ(x) = φ(b − Ax) = −Σ_{i=1}^m log(b_i − a_i^T x)

        convex, ψ(x) → ∞ at boundary of dom ψ = { x | Ax < b }

      gradient and Hessian

          ∇ψ(x) = −A^T ∇φ(s),    ∇²ψ(x) = A^T ∇²φ(s) A

      with s = b − Ax and

          ∇φ(s) = −( 1/s_1, ..., 1/s_m ),    ∇²φ(s) = diag( 1/s_1², ..., 1/s_m² )

  63. Central path for linear program

          minimize    c^T x
          subject to  Ax ≤ b

      central path: minimizers x⋆(t) of

          f_t(x) = t c^T x + φ(b − Ax)

      (figure: central path x⋆(t) approaching x⋆, with objective direction c)

      t is a positive parameter

      optimality conditions: x = x⋆(t) satisfies

          ∇f_t(x) = t c − A^T ∇φ(s) = 0,    s = b − Ax

  64. Central path and duality

      dual feasible point on central path

      • for x = x⋆(t) and s = b − Ax,

            z⋆(t) = −(1/t) ∇φ(s) = ( 1/(t s_1), 1/(t s_2), ..., 1/(t s_m) )

        z = z⋆(t) is strictly dual feasible: c + A^T z = 0 and z > 0

      • can be corrected to account for inexact centering of x ≈ x⋆(t)

      duality gap between x = x⋆(t) and z = z⋆(t) is

          c^T x + b^T z = s^T z = m/t

      gives a bound on suboptimality: c^T x⋆(t) − p⋆ ≤ m/t

  65. Barrier method

      starting with t > 0, strictly feasible x

      • make one or more Newton steps to (approximately) minimize f_t:

            x⁺ = x − α ∇²f_t(x)^{-1} ∇f_t(x)

        step size α is fixed or from line search

      • increase t and repeat until c^T x − p⋆ ≤ ε

      complexity: with proper initialization, step size, update scheme for t,

          #Newton steps = O( √m log(1/ε) )

      result follows from the convergence analysis of Newton's method for f_t
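
      A hedged NumPy sketch of this barrier method for the LP
      minimize c^T x subject to Ax ≤ b, combining the Newton step above with
      the damped step-size rule of slide 72 and the duality-gap bound m/t of
      slide 64; the update factor μ = 10 and the tiny example are assumed
      placeholder choices:

          import numpy as np

          def barrier_lp(c, A, b, x, t=1.0, mu=10.0, eps=1e-6):
              # x must be strictly feasible (Ax < b);
              # returns an eps-suboptimal point
              m = A.shape[0]
              while m / t > eps:                 # duality-gap bound m/t
                  while True:                    # centering: damped Newton on f_t
                      s = b - A @ x
                      grad = t * c + A.T @ (1 / s)        # = t c - A^T grad phi(s)
                      hess = A.T @ ((1 / s**2)[:, None] * A)
                      dx = -np.linalg.solve(hess, grad)
                      lam = np.sqrt(-grad @ dx)           # Newton decrement
                      if lam**2 <= 1e-8:
                          break
                      # damped step keeps x strictly feasible (slide 72 rule)
                      alpha = 1.0 if lam < 0.25 else 1.0 / (1.0 + lam)
                      x = x + alpha * dx
                  t *= mu
              return x

          # tiny example: minimize x1 + x2 over the box 0 <= x <= 1
          A = np.vstack([np.eye(2), -np.eye(2)])
          b = np.array([1.0, 1.0, 0.0, 0.0])
          print(barrier_lp(np.ones(2), A, b, x=np.array([0.5, 0.5])))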

  66. Outline

      • barrier method for linear programming
      • normal barriers
      • barrier method for conic optimization

  67. Normal barrier for proper cone

      φ is a θ-normal barrier for the proper cone K if it is

      • a barrier: smooth, convex, domain int K, blows up at the boundary of K
      • logarithmically homogeneous with parameter θ:

            φ(tx) = φ(x) − θ log t,   ∀ x ∈ int K, t > 0

      • self-concordant: the restriction g(α) = φ(x + αv) to any line satisfies

            g′′′(α) ≤ 2 g′′(α)^{3/2}

      (Nesterov and Nemirovski, 1994)

  68. Examples

      nonnegative orthant: K = R^m_+

          φ(x) = −Σ_{i=1}^m log x_i    (θ = m)

      second-order cone: K = Q^p = { (x, y) ∈ R^{p−1} × R | ‖x‖₂ ≤ y }

          φ(x, y) = −log(y² − x^T x)    (θ = 2)

      semidefinite cone: K = S^m = { x ∈ R^{m(m+1)/2} | mat(x) ⪰ 0 }

          φ(x) = −log det mat(x)    (θ = m)

  69. Examples (continued)

      exponential cone: K_exp = cl{ (x, y, z) ∈ R³ | y e^{x/y} ≤ z, y > 0 }

          φ(x, y, z) = −log( y log(z/y) − x ) − log z − log y    (θ = 3)

      power cone: K = { (x_1, x_2, y) ∈ R_+ × R_+ × R | |y| ≤ x_1^{α_1} x_2^{α_2} }

          φ(x, y) = −log( x_1^{2α_1} x_2^{2α_2} − y² ) − log x_1 − log x_2    (θ = 4)

  70. Central path

      conic LP (with inequality with respect to proper cone K)

          minimize    c^T x
          subject to  Ax ⪯ b

      barrier for the feasible set

          φ(b − Ax)

      where φ is a θ-normal barrier for K

      central path: set of minimizers x⋆(t) (with t > 0) of

          f_t(x) = t c^T x + φ(b − Ax)

  71. Newton step

      centering problem

          minimize  f_t(x) = t c^T x + φ(b − Ax)

      Newton step at x

          Δx = −∇²f_t(x)^{-1} ∇f_t(x)

      Newton decrement

          λ_t(x) = ( Δx^T ∇²f_t(x) Δx )^{1/2} = ( −∇f_t(x)^T Δx )^{1/2}

      useful as a measure of proximity of x to x⋆(t)

  72. Damped Newton method

          minimize  f_t(x) = t c^T x + φ(b − Ax)

      algorithm (with parameters ε ∈ (0, 1/2), η ∈ (0, 1/4])

      select a starting point x ∈ dom f_t; repeat:

      1. compute Newton step Δx and Newton decrement λ_t(x)
      2. if λ_t(x)² ≤ ε, return x
      3. otherwise, set x := x + αΔx with

             α = 1/(1 + λ_t(x))  if λ_t(x) ≥ η,      α = 1  if λ_t(x) < η

      • stopping criterion λ_t(x)² ≤ ε implies f_t(x) − inf f_t(x) ≤ ε
      • alternatively, can use a backtracking line search

  73. Convergence results for damped Newton method

      • damped Newton phase: f_t decreases by at least a positive constant γ

            f_t(x⁺) − f_t(x) ≤ −γ   if λ_t(x) ≥ η

        where γ = η − log(1 + η)

      • quadratic convergence phase: λ_t rapidly decreases to zero

            2 λ_t(x⁺) ≤ (2 λ_t(x))²   if λ_t(x) < η

        implies λ_t(x⁺) ≤ 2η² < η

      conclusion: the number of Newton iterations is bounded by

          ( f_t(x⁽⁰⁾) − inf f_t(x) ) / γ + log₂ log₂(1/ε)

  74. Outline

      • barrier method for linear programming
      • normal barriers
      • barrier method for conic optimization
