Algorithms for constrained local optimization


  1. Algorithms for constrained local optimization. Fabio Schoen, 2008. http://gol.dsi.unifi.it/users/schoen

  2. Feasible direction methods

  3. Frank–Wolfe method. Let $X$ be a convex set and consider the problem $\min_{x \in X} f(x)$. Given $x_k \in X$, choosing a feasible direction $d_k$ corresponds to choosing a point $x \in X$: $d_k = x - x_k$. The "steepest descent" choice solves $\min_{x \in X} \nabla^T f(x_k)(x - x_k)$ (a linear objective over convex constraints, usually easy to solve). Let $\hat{x}_k$ be an optimal solution of this subproblem.

  4. Frank–Wolfe. If $\nabla^T f(x_k)(\hat{x}_k - x_k) = 0$ then $\nabla^T f(x_k)\, d \ge 0$ for every feasible direction $d$, so the first order necessary conditions hold. Otherwise, letting $d_k = \hat{x}_k - x_k$, this is a descent direction along which a step $\alpha_k \in (0, 1]$ might be chosen according to Armijo's rule.
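
As a concrete illustration, here is a minimal Armijo backtracking sketch in Python; the helper name armijo_step and the parameter values sigma and beta are common textbook choices of mine, not taken from these slides.

```python
import numpy as np

def armijo_step(f, grad_f, x, d, sigma=1e-4, beta=0.5):
    """Return the largest alpha = beta^m in (0, 1] satisfying the Armijo
    sufficient-decrease test f(x + alpha d) <= f(x) + sigma*alpha*grad^T d."""
    alpha = 1.0
    slope = grad_f(x) @ d          # negative if d is a descent direction
    while f(x + alpha * d) > f(x) + sigma * alpha * slope:
        alpha *= beta              # backtrack until the test holds
    return alpha
```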

  5. Convergence of the Frank–Wolfe method. Under mild conditions the method converges to a point satisfying the first order necessary conditions. However, it is usually extremely slow (convergence may be sub-linear). It might find applications in very large scale problems in which solving the direction-finding subproblem is very easy (e.g. when $X$ is a polytope).
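
A minimal Frank–Wolfe sketch under illustrative assumptions of mine: the feasible set $X$ is the unit simplex (a polytope, so the linear subproblem has a closed-form vertex solution), and the classical diminishing step $\alpha_k = 2/(k+2)$ is used in place of Armijo's rule to keep the block self-contained.

```python
import numpy as np

def frank_wolfe(grad_f, x0, max_iter=500, tol=1e-8):
    """Frank-Wolfe over the unit simplex with the classical step 2/(k+2)."""
    x = x0.copy()
    for k in range(max_iter):
        g = grad_f(x)
        # Linear subproblem min_{y in X} g^T y: over the simplex the optimum
        # is the vertex e_i whose gradient component is smallest.
        x_hat = np.zeros_like(x)
        x_hat[np.argmin(g)] = 1.0
        d = x_hat - x                        # feasible direction d_k
        if g @ d >= -tol:                    # first order conditions (approx.)
            break
        x = x + (2.0 / (k + 2)) * d          # step alpha_k in (0, 1]
    return x

# Usage: minimize ||x - c||^2 over the simplex; c itself is feasible here,
# so the iterates approach c (slowly, as the text warns).
c = np.array([0.2, 0.5, 0.3])
print(np.round(frank_wolfe(lambda x: 2 * (x - c), np.ones(3) / 3), 3))
```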

  6. Gradient Projection methods. Generic iteration: $x_{k+1} = x_k + \alpha_k(\bar{x}_k - x_k)$, where the direction $d_k = \bar{x}_k - x_k$ is obtained by computing $\bar{x}_k = [x_k - s_k \nabla f(x_k)]^+$, with $s_k \in \mathbb{R}_+$ and $[\cdot]^+$ denoting projection onto the feasible set.

  7. The method is slightly faster than Frank–Wolfe, with a linear convergence rate similar to that of (unconstrained) steepest descent. It might be applied when projection is relatively cheap, e.g. when the feasible set is a box. A point $x_k$ satisfies the first order necessary conditions ($d^T \nabla f(x_k) \ge 0$ for every feasible direction $d$) iff $x_k = [x_k - s_k \nabla f(x_k)]^+$.
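
A minimal gradient-projection sketch, assuming the box case mentioned above (projection is componentwise clipping); the fixed step $s_k = s$ and the choice $\alpha_k = 1$ are simplifications of mine.

```python
import numpy as np

def projected_gradient(grad_f, x0, lo, hi, s=0.1, max_iter=1000, tol=1e-10):
    """Gradient projection on the box [lo, hi]^n (projection = clipping)."""
    x = np.clip(x0, lo, hi)
    for _ in range(max_iter):
        x_bar = np.clip(x - s * grad_f(x), lo, hi)  # [x - s*grad f(x)]^+
        if np.linalg.norm(x_bar - x) <= tol:        # fixed point <=> FONC hold
            break
        x = x_bar                                   # step alpha_k = 1
    return x

# Usage: minimize ||x - c||^2 over [0, 1]^2 with c outside the box; the
# solution is the projection of c onto the box, here (1, 0).
c = np.array([1.5, -0.5])
print(projected_gradient(lambda x: 2 * (x - c), np.zeros(2), 0.0, 1.0))
```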

  8. Lagrange Multiplier Algorithms

  9. Barrier Methods. Consider $\min f(x)$ s.t. $g_j(x) \le 0$, $j = 1, \dots, r$. A barrier is a continuous function which tends to $+\infty$ whenever $x$ approaches the boundary of the feasible region. Examples of barrier functions:
$$B(x) = -\sum_j \log(-g_j(x)) \quad \text{(logarithmic barrier)}$$
$$B(x) = -\sum_j \frac{1}{g_j(x)} \quad \text{(inverse barrier)}$$
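
The two example barriers written as code for concreteness (an arrangement of mine: gs is a list of constraint functions $g_j$, each assumed strictly negative at x):

```python
import numpy as np

def log_barrier(gs, x):
    """Logarithmic barrier: -sum_j log(-g_j(x)); requires g_j(x) < 0."""
    return -sum(np.log(-g(x)) for g in gs)

def inverse_barrier(gs, x):
    """Inverse barrier: -sum_j 1/g_j(x); also tends to +inf as g_j(x) -> 0-."""
    return -sum(1.0 / g(x) for g in gs)
```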

  10. Barrier Method. Let $\varepsilon_k \downarrow 0$ and let $x_0$ be strictly feasible, i.e. $g_j(x_0) < 0$ for all $j$. Then let $x_k = \arg\min_{x \in \mathbb{R}^n} \left( f(x) + \varepsilon_k B(x) \right)$. Proposition: every limit point of $\{x_k\}$ is a global minimum of the constrained optimization problem.

  11. Analysis of Barrier methods. Special case: a single constraint (the analysis can be generalized). Let $\bar{x}$ be a limit point of $\{x_k\}$ (a global minimum). If the KKT conditions hold, then there exists a unique $\lambda \ge 0$ such that $\nabla f(\bar{x}) + \lambda \nabla g(\bar{x}) = 0$, with $\lambda g(\bar{x}) = 0$. The point $x_k$, being a solution of the barrier problem $\min f(x) + \varepsilon_k B(x)$ s.t. $g(x) < 0$, satisfies $\nabla f(x_k) + \varepsilon_k \nabla B(x_k) = 0$.

  12. . . . If $B(x) = \phi(g(x))$, then $\nabla f(x_k) + \varepsilon_k \phi'(g(x_k)) \nabla g(x_k) = 0$. In the limit, for $k \to \infty$: $\lim \varepsilon_k \phi'(g(x_k)) \nabla g(x_k) = \lambda \nabla g(\bar{x})$. If $\lim_k g(x_k) < 0$, then $\phi'(g(x_k)) \nabla g(x_k) \to K$ (finite) and $K \varepsilon_k \to 0$. If $\lim_k g(x_k) = 0$, then (thanks to the uniqueness of the Lagrange multiplier) $\lambda = \lim_k \varepsilon_k \phi'(g(x_k))$.

  13. Difficulties in Barrier Methods. Strong numerical instability: the condition number of the Hessian matrix grows as $\varepsilon_k \to 0$. Need for an initial strictly feasible point $x_0$. A (partial) remedy: $\varepsilon_k$ is decreased very slowly, and the solution of the $(k+1)$-th problem is obtained by starting an unconstrained optimization from $x_k$.
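
A minimal warm-started barrier loop illustrating this remedy (a sketch of mine, assuming SciPy; it uses the example problem of the next slide, a geometric decrease of $\varepsilon_k$, and the derivative-free Nelder-Mead method so that returning $+\infty$ outside the strictly feasible region is harmless):

```python
import numpy as np
from scipy.optimize import minimize

def barrier_obj(z, eps):
    """f(x) + eps*B(x) for min (x-1)^2 + (y-1)^2  s.t.  x + y <= 1."""
    x, y = z
    slack = 1.0 - x - y
    if slack <= 0:                      # outside the strictly feasible region
        return np.inf
    return (x - 1)**2 + (y - 1)**2 - eps * np.log(slack)

z, eps = np.array([0.0, 0.0]), 1.0      # strictly feasible starting point x_0
for _ in range(12):
    # Warm start: the k-th solution initializes the (k+1)-th minimization.
    z = minimize(lambda w: barrier_obj(w, eps), z, method='Nelder-Mead').x
    eps *= 0.2                          # slowly decrease the barrier weight
print(z)                                # tends to the solution (0.5, 0.5)
```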

  14. Example: $\min (x-1)^2 + (y-1)^2$ s.t. $x + y \le 1$. Logarithmic barrier problem: $\min (x-1)^2 + (y-1)^2 - \varepsilon_k \log(1 - x - y)$, with $x + y - 1 < 0$. Gradient:
$$\begin{pmatrix} 2(x-1) + \dfrac{\varepsilon_k}{1-x-y} \\ 2(y-1) + \dfrac{\varepsilon_k}{1-x-y} \end{pmatrix}$$
Stationary points: $x = y = \frac{3}{4} \pm \frac{\sqrt{1+4\varepsilon_k}}{4}$ (only the "$-$" solution is acceptable).
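
A small numerical check of this example (a sketch of mine, assuming NumPy): it verifies that the "$-$" stationary point makes the barrier gradient vanish, that it tends to the constrained minimizer $(1/2, 1/2)$ as $\varepsilon_k \to 0$, and that the multiplier estimate $\varepsilon_k \phi'(g(x_k)) = \varepsilon_k/(1-x-y)$ from slide 12 tends to the exact $\lambda = 1$.

```python
import numpy as np

def barrier_grad(x, y, eps):
    """Gradient of (x-1)^2 + (y-1)^2 - eps*log(1 - x - y)."""
    t = eps / (1.0 - x - y)
    return np.array([2 * (x - 1) + t, 2 * (y - 1) + t])

for eps in [1.0, 0.1, 0.01, 0.001]:
    x = 0.75 - np.sqrt(1 + 4 * eps) / 4     # the acceptable "-" root
    lam = eps / (1 - 2 * x)                 # multiplier estimate eps*phi'(g)
    g_norm = np.linalg.norm(barrier_grad(x, x, eps))
    print(f"eps={eps:6.3f}  x={x:.4f}  |grad|={g_norm:.1e}  lambda~{lam:.4f}")
# x -> 0.5 and lambda -> 1 as eps -> 0; the gradient is ~0 throughout.
```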

  15. Barrier methods and L.P. Consider $\min c^T x$ s.t. $Ax = b$, $x \ge 0$. Logarithmic barrier on $x \ge 0$: $\min c^T x - \varepsilon \sum_j \log x_j$ s.t. $Ax = b$, $x > 0$.

  16. The central path. The starting point is usually associated with $\varepsilon = \infty$ and is the unique solution (the analytic center) of $\min -\sum_j \log x_j$ s.t. $Ax = b$, $x > 0$. The trajectory $x(\varepsilon)$ of solutions to the barrier problem is called the central path and leads to an optimal solution of the LP.
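
To make the central path concrete, here is a tiny worked instance of my own choosing: for the LP $\min x_1 + 2x_2$ s.t. $x_1 + x_2 = 1$, $x \ge 0$, eliminating $x_2 = 1 - x_1$ turns the barrier problem into $\min -x_1 - \varepsilon \log x_1 - \varepsilon \log(1 - x_1)$ (up to a constant), whose stationarity condition gives the closed form $x_1(\varepsilon) = \left((1 - 2\varepsilon) + \sqrt{4\varepsilon^2 + 1}\right)/2$.

```python
import numpy as np

# Trace the central path x(eps) for the toy LP above using the closed form.
for eps in [10.0, 1.0, 0.1, 0.01, 0.001]:
    x1 = ((1 - 2 * eps) + np.sqrt(4 * eps**2 + 1)) / 2
    print(f"eps={eps:7.3f}  x = ({x1:.4f}, {1 - x1:.4f})")
# Large eps: near the analytic center (0.5, 0.5).
# eps -> 0: approaches (1, 0), the optimal vertex of the LP.
```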

  17. Penalty Methods. Penalized problem: $\min f(x) + \rho P(x)$, where $\rho > 0$ and $P(x) \ge 0$, with $P(x) = 0$ iff $x$ is feasible. Example: for $\min f(x)$ s.t. $h_i(x) = 0$, $i = 1, \dots, m$, a penalized problem might be $\min f(x) + \rho \sum_i h_i(x)^2$.

  18. Convergence of the quadratic penalty method (for equality constrained problems). Let $P(x; \rho) = f(x) + \rho \sum_i h_i(x)^2$. Given $\rho_0 > 0$, $x_0 \in \mathbb{R}^n$, $k = 0$: let $x_{k+1} = \arg\min_x P(x; \rho_k)$ (found with an iterative method initialized at $x_k$); then let $\rho_{k+1} > \rho_k$ and $k := k + 1$. If each $x_{k+1}$ is a global minimizer of $P$ and $\rho_k \to \infty$, then every limit point of $\{x_k\}$ is a global optimum of the constrained problem.
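
A minimal quadratic-penalty sketch (assumptions of mine: SciPy; the test problem $\min (x-1)^2 + (y-1)^2$ s.t. $x + y = 1$; a tenfold increase of $\rho_k$; warm starts as described above):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda z: (z[0] - 1)**2 + (z[1] - 1)**2
h = lambda z: z[0] + z[1] - 1                # equality constraint h(z) = 0

z, rho = np.zeros(2), 1.0
for _ in range(8):
    P = lambda w, r=rho: f(w) + r * h(w)**2  # P(x; rho_k) from this slide
    z = minimize(P, z, method='BFGS').x      # warm start at the previous z
    rho *= 10.0                              # rho_{k+1} > rho_k
print(z, h(z))                               # z -> (0.5, 0.5), h(z) -> 0
```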

  19. Exact penalties. Exact penalties: there exists a finite value of the penalty parameter such that the optimal solution of the penalized problem is the optimal solution of the original one. $\ell_1$ penalty function: $P_1(x; \rho) = f(x) + \rho \sum_i |h_i(x)|$.

  20. Exact penalties for inequality constrained problems. For $\min f(x)$ s.t. $h_i(x) = 0$, $g_j(x) \le 0$, the penalized problem is $P_1(x; \rho) = f(x) + \rho \sum_i |h_i(x)| + \rho \sum_j \max(0, g_j(x))$.
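
A small check of exactness (a sketch of mine, assuming SciPy and reusing the equality-constrained test problem from the quadratic-penalty sketch, whose optimal multiplier is $\lambda^\star = 1$; the $\ell_1$ penalty is exact once $\rho$ exceeds $\lambda^\star$, and Nelder-Mead is used because the penalty is nonsmooth):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda z: (z[0] - 1)**2 + (z[1] - 1)**2
h = lambda z: z[0] + z[1] - 1

for rho in [0.5, 2.0]:                       # below vs. above lambda* = 1
    P1 = lambda w, r=rho: f(w) + r * abs(h(w))
    z = minimize(P1, np.zeros(2), method='Nelder-Mead', tol=1e-10).x
    print(rho, np.round(z, 4), round(h(z), 4))
# rho = 0.5: the minimizer (0.75, 0.75) still violates h = 0;
# rho = 2.0: the constrained optimum (0.5, 0.5) is recovered exactly.
```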

  21. Augmented Lagrangian method. Given an equality constrained problem, reformulate it as: $\min f(x) + \frac{1}{2}\rho \|h(x)\|^2$ s.t. $h(x) = 0$. The Lagrangian function of this problem is called the Augmented Lagrangian: $L_\rho(x; \lambda) = f(x) + \frac{1}{2}\rho \|h(x)\|^2 + \lambda^T h(x)$.

  22. Motivation. $\min_x f(x) + \frac{1}{2}\rho\|h(x)\|^2 + \lambda^T h(x)$.
$$\nabla_x L_\rho(x, \lambda) = \nabla f(x) + \sum_i \lambda_i \nabla h_i(x) + \rho h(x) \nabla h(x) = \nabla_x L(x, \lambda) + \rho h(x) \nabla h(x)$$
$$\nabla^2_{xx} L_\rho(x, \lambda) = \nabla^2 f(x) + \sum_i \lambda_i \nabla^2 h_i(x) + \rho h(x) \nabla^2 h(x) + \rho \nabla h(x) \nabla^T h(x) = \nabla^2_{xx} L(x, \lambda) + \rho h(x) \nabla^2 h(x) + \rho \nabla h(x) \nabla^T h(x)$$

  23. Motivation . . . Let $(x^\star, \lambda^\star)$ be an optimal (primal and dual) solution. Necessarily $\nabla_x L(x^\star, \lambda^\star) = 0$; moreover $h(x^\star) = 0$, thus $\nabla_x L_\rho(x^\star, \lambda^\star) = \nabla_x L(x^\star, \lambda^\star) + \rho h(x^\star) \nabla h(x^\star) = 0$, i.e. $(x^\star, \lambda^\star)$ is a stationary point of the augmented Lagrangian.

  24. Motivation . . . Observe that, since $h(x^\star) = 0$, at $x^\star$: $\nabla^2_{xx} L_\rho(x, \lambda) = \nabla^2_{xx} L(x, \lambda) + \rho h(x) \nabla^2 h(x) + \rho \nabla h(x) \nabla^T h(x) = \nabla^2_{xx} L(x, \lambda) + \rho \nabla h(x) \nabla^T h(x)$. Assume that the sufficient optimality conditions hold: $v^T \nabla^2_{xx} L(x^\star, \lambda^\star) v > 0$ for all $v \ne 0$ such that $v^T \nabla h(x^\star) = 0$.

  25. . . . Let $v \ne 0$ with $v^T \nabla h(x^\star) = 0$. Then
$$v^T \nabla^2_{xx} L_\rho(x^\star, \lambda^\star) v = v^T \nabla^2_{xx} L(x^\star, \lambda^\star) v + \rho\, v^T \nabla h(x^\star) \nabla^T h(x^\star) v = v^T \nabla^2_{xx} L(x^\star, \lambda^\star) v > 0$$

  26. . . . Let $v \ne 0$ with $v^T \nabla h(x^\star) \ne 0$. Then
$$v^T \nabla^2_{xx} L_\rho(x^\star, \lambda^\star) v = v^T \nabla^2_{xx} L(x^\star, \lambda^\star) v + \rho (v^T \nabla h(x^\star))^2$$
where the first term might be negative. However there exists $\bar\rho > 0$ such that $\rho \ge \bar\rho$ implies $v^T \nabla^2_{xx} L_\rho(x^\star, \lambda^\star) v > 0$. Thus, if $\rho$ is large enough, the Hessian of the augmented Lagrangian is positive definite and $x^\star$ is a (strict) local minimum of $L_\rho(\cdot, \lambda^\star)$.
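
These properties motivate the classical method of multipliers, which alternates minimization of $L_\rho(\cdot, \lambda)$ with the first-order update $\lambda \leftarrow \lambda + \rho h(x)$; that update rule is standard but is not derived on these slides. A minimal sketch of mine, assuming SciPy and reusing the earlier test problem (whose exact solution is $(1/2, 1/2)$ with $\lambda^\star = 1$):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda z: (z[0] - 1)**2 + (z[1] - 1)**2
h = lambda z: z[0] + z[1] - 1

z, lam, rho = np.zeros(2), 0.0, 10.0         # rho can stay fixed and moderate
for _ in range(20):
    L = lambda w, l=lam: f(w) + l * h(w) + 0.5 * rho * h(w)**2
    z = minimize(L, z, method='BFGS').x      # minimize L_rho(., lambda)
    lam += rho * h(z)                        # first-order multiplier update
print(np.round(z, 4), round(lam, 4))         # -> (0.5, 0.5) and lambda -> 1
```

Unlike the pure penalty method, $\rho$ need not grow without bound here: the multiplier update absorbs the constraint, avoiding the ill-conditioning discussed for barrier and penalty schemes.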

  27. Inequality constraints. For $\min f(x)$ s.t. $g(x) \le 0$, a nonlinear transformation of inequalities into equalities: $\min_{x,s} f(x)$ s.t. $g_j(x) + s_j^2 = 0$, $j = 1, \dots, p$.

  28. Given the problem $\min f(x)$ s.t. $h_i(x) = 0$, $i = 1, \dots, m$, $g_j(x) \le 0$, $j = 1, \dots, p$, an Augmented Lagrangian problem might be defined as
$$\min_{x,z} L_\rho(x, z; \lambda, \mu) = \min_{x,z} f(x) + \lambda^T h(x) + \frac{1}{2}\rho \|h(x)\|^2 + \sum_j \mu_j \left( g_j(x) + z_j^2 \right) + \frac{1}{2}\rho \sum_j \left( g_j(x) + z_j^2 \right)^2$$
