1. Optimization (Repetition)

Convexity
• Convex set: S convex ⇔ λx1 + (1 − λ)x2 ∈ S, ∀x1, x2 ∈ S, ∀λ ∈ [0, 1].
• Convex function: f convex ⇔ D_f convex and f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2), ∀x1, x2 ∈ D_f, ∀λ ∈ [0, 1].
• f convex ⇔ epi(f) convex ⇔ f(x + λd) convex in λ, ∀x, d such that x + λd ∈ D_f.

Why is convexity good?
• If f is convex, then
  – loc. min. ⇒ glob. min.
  – stat. point ⇒ glob. min.
  Note that a glob. min. does not always exist for a convex function (e.g. y = e^x).
• For convex min_{x ∈ S} f: a is a glob. min. ⇔ ∇f(a)^T (x − a) ≥ 0, ∀x ∈ S.
• If S = {x ∈ X | g(x) ≤ 0, h(x) = 0} with X convex, f, g convex and h affine, then the problem is convex.
• For convex problems:
  – KKT point ⇒ saddle point ⇒ glob. min.
  – Slater condition: ∃x0 ∈ S with g(x0) < 0 ⇒ no duality gap, i.e. a saddle point exists.

How to check that a set S is convex?
• Picture (n ≤ 3) or definition.
• S1, S2 convex ⇒ S1 ∩ S2 convex.
• f convex ⇒ {x | f(x) ≤ const} convex.

How to check that a function f is convex?
• Graph (n ≤ 2) or definition.
• f1, f2 convex ⇒ f1 + f2 convex and max{f1, f2} convex.
• g convex and increasing, h convex ⇒ g(h(x)) convex.
• g convex, h affine ⇒ g(h(x)) convex.
• For twice differentiable f: f convex ⇔ ∇²f pos.-semidef.
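The definition-based check above can be probed numerically: sample random pairs of points and a random λ, and test the inequality f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2). A minimal sketch (the function names and the sampling box are illustrative choices, not from the course notes); note that random sampling can only disprove convexity, never prove it:

```python
import random

def appears_convex(f, dim, trials=1000, box=5.0, tol=1e-9):
    """Heuristic test of the convexity inequality at random points.
    A single violation proves non-convexity; no violation after
    many trials only suggests (does not prove) convexity."""
    for _ in range(trials):
        x1 = [random.uniform(-box, box) for _ in range(dim)]
        x2 = [random.uniform(-box, box) for _ in range(dim)]
        lam = random.random()
        mid = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
        if f(mid) > lam * f(x1) + (1 - lam) * f(x2) + tol:
            return False
    return True

# x^T x is convex; the saddle x1^2 - x2^2 is not.
convex_q = lambda x: sum(t * t for t in x)
saddle_q = lambda x: x[0] ** 2 - x[1] ** 2
```

For the saddle, roughly half of all random segment directions are concave directions, so a violation is found almost immediately.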

2. Positive-definite and positive-semidefinite matrices

• H pos.-def. ⇔ x^T H x > 0, ∀x ≠ 0.
• H pos.-def. ⇒ x^T H x + c^T x + q strictly convex ⇒ glob. min. unique (if it exists).
• H pos.-def. ⇒ −H∇f is a descent direction.
• Loc. min. ⇒ ∇f = 0 and ∇²f pos.-semidef.
• ∇f = 0 and ∇²f pos.-def. ⇒ loc. min.
• ∇²f pos.-semidef. on S ⇔ f convex on S.

How to check positive-definiteness?
• Sylvester: H pos.-def. ⇔ det(H_k) > 0, ∀k = 1, …, n (leading principal minors).
• H pos.-def. ⇔ all eigenvalues > 0.

How to check positive-semidefiniteness?
• Necessary: H pos.-semidef. ⇒ det(H_k) ≥ 0, ∀k = 1, …, n.
• Sufficient (modified Sylvester): det(H_k) > 0, ∀k = 1, …, n − 1 and det(H) ≥ 0 ⇒ H pos.-semidef.
• Completing the squares: H pos.-semidef. ⇔ f(x) = x^T H x is a sum of squares.
• H pos.-semidef. ⇔ H + εI pos.-def., ∀ε > 0.
• H pos.-semidef. ⇔ all eigenvalues ≥ 0.

Factorizations
• H = C^T C ⇒ H pos.-semidef.
• H = C^T C and det(H) ≠ 0 ⇒ H pos.-def.
• Cholesky: H pos.-def. ⇔ H = LL^T, L lower-triangular, det(L) ≠ 0.
• H pos.-def. ⇔ H = LDL^T, L lower-triangular with L_kk = 1, D diagonal with D_kk > 0.
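Sylvester's criterion translates directly into code: compute the leading principal minors det(H_k) and check that all are strictly positive. A small sketch (function names are illustrative; the cofactor determinant is only sensible for the small matrices that appear in exam exercises):

```python
def det(M):
    """Determinant by Laplace expansion along the first row
    (exponential cost, fine for small hand-sized matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

def is_pos_def(H):
    """Sylvester: H pos.-def. iff every leading principal minor
    det(H_k) > 0 for k = 1, ..., n."""
    n = len(H)
    return all(det([row[:k] for row in H[:k]]) > 0 for k in range(1, n + 1))
```

For example, is_pos_def([[2, 1], [1, 2]]) holds (minors 2 and 3), while [[1, 2], [2, 1]] fails (det = −3). Remember from the bullets above that this criterion decides definiteness only; for semidefiniteness det(H_k) ≥ 0 for the leading minors is merely necessary.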

3. Search methods

Dichotomous vs. golden section:
• GS: fewer function evaluations.
• Unimodal function ⇒ glob. min. on the interval.

Armijo: fast but inexact (normally used in the multi-dimensional case).

Newton vs. modified Newton:
• Newton: faster.
• Modified: always a descent direction, better convergence.

Newton vs. quasi-Newton:
• Newton: uses 2nd derivatives.
• Quasi-Newton: only 1st derivatives.

Conjugate directions vs. quasi-Newton (DFP, BFGS):
• CD: d_new = −∇f + β d_old, β is updated.
• Quasi-Newton: d = −D∇f, D is updated; needs a lot of memory.

Steepest descent vs. conjugate directions:
• SD: zigzags.
• CD: faster.

Convergence for quadratic functions:
• Newton: in one step.
• CD and quasi-Newton: in n steps of the inner loop (= one outer loop).
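The golden-section comparison above rests on one trick: with the golden ratio, one of the two interior points of the shrunken interval coincides with a point already evaluated, so each iteration costs a single new function evaluation (dichotomous search needs two). A minimal sketch under the unimodality assumption (function name and defaults are illustrative):

```python
import math

def golden_section(f, a, b, tol=1e-6):
    """Minimize a unimodal f on [a, b] by golden-section search.
    The golden ratio makes one interior point reusable, so each
    iteration needs only one new function evaluation."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 > f2:                 # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2  # reuse x2 as the new x1
            x2 = a + inv_phi * (b - a)
            f2 = f(x2)
        else:                       # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1  # reuse x1 as the new x2
            x1 = b - inv_phi * (b - a)
            f1 = f(x1)
    return (a + b) / 2
```

On a unimodal function such as (x − 2)² over [0, 5] this converges to the global minimum x = 2, matching the "unimodal ⇒ glob. min." bullet.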

4. LP and Duality

• Particular case:
  – P: min c^T x | Ax ≥ b, x ≥ 0
  – D: max b^T y | A^T y ≤ c, y ≥ 0
• General case:
  – P: "=" in row k ⇔ D: y_k free.
  – P: x_k free ⇔ D: "=" in row k.
• Easy to derive from c^T x − b^T y = (c − A^T y)^T x + y^T (Ax − b) ≥ 0.
• Complementary slackness (CSP): "=" instead of "≥" above.
• Strong duality: finite min in P ⇒ finite max in D and min = max.
• x̄ primal feasible, ȳ dual feasible + CSP ⇒ both are optimal solutions.

Constrained Optimization
• Necessary: loc. min. ⇒ KKT point (or a point where the CQ fails).
• Sufficient:
  – KKT + convexity ⇒ glob. min.
  – KKT + 2nd-order conditions ⇒ loc. min.
  – Saddle point ⇒ glob. min.
• Numerical solution via penalty/barrier function methods:
  – Penalty: infeasible approximations.
  – Barrier: feasible approximations; cannot handle equality constraints.

To check a saddle point via duality:
P: min f(x) | x ∈ X, g(x) ≤ 0, h(x) = 0.
D: max Θ(u, v) | u ≥ 0, where Θ(u, v) = inf_{x ∈ X} L(x, u, v).
1. Find Θ(u, v) and get (if possible) the optimal x = x(u, v).
2. Find max Θ(u, v) and get the optimal ū, v̄.
3. Put x̄ = x(ū, v̄) (or calculate x̄ as the optimal x in Step 1 for the given ū, v̄).
4. If Θ(ū, v̄) = f(x̄), then x̄ is a glob. min.
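The four duality steps above can be walked through on a toy problem (this specific problem is a made-up illustration, not from the course notes): P: min x² s.t. g(x) = 1 − x ≤ 0, with Lagrangian L(x, u) = x² + u(1 − x).

```python
# Step 1: inf_x L(x, u) is attained where dL/dx = 2x - u = 0,
# i.e. at x(u) = u / 2, giving Theta(u) = u - u**2 / 4.
def theta(u):
    return u - u ** 2 / 4

def x_of_u(u):
    return u / 2

# Step 2: maximize Theta over u >= 0.
# Theta'(u) = 1 - u/2 = 0 gives u_bar = 2.
u_bar = 2.0

# Step 3: recover the primal candidate x_bar = x(u_bar) = 1.
x_bar = x_of_u(u_bar)

# Step 4: compare dual and primal values; equality means no
# duality gap, so x_bar is the global minimum.
f = lambda x: x ** 2
no_gap = abs(theta(u_bar) - f(x_bar)) < 1e-12
```

Here Θ(ū) = 2 − 1 = 1 = f(1), so Step 4 confirms x̄ = 1 is the glob. min., exactly as the recipe prescribes.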

5. The course contents

• Ch 2, 3, 9: Numerical methods (except the Nelder–Mead simplex method).
• Ch 4: Convex sets.
• Ch 5: LP (except the Simplex method).
• Ch 6: Convex functions (except Subgradients and Maximization).
• Ch 7: KKT necessary/sufficient conditions (no Quadratic Programming).
• Ch 8: Duality.
