  1. Unconstrained minimization
     Lectures for the PhD course on Numerical Optimization
     Enrico Bertolazzi
     DIMS – Università di Trento
     November 21 – December 14, 2011

  2. Outline
     1. General iterative scheme
     2. Backtracking Armijo line-search
        - Global convergence of backtracking Armijo line-search
        - Global convergence of steepest descent
     3. Wolfe–Zoutendijk global convergence
        - The Wolfe conditions
        - The Armijo–Goldstein conditions
     4. Algorithms for line-search
        - Armijo parabolic–cubic search
        - Wolfe line-search

  3. The problem (1 / 3)

     Given f : ℝⁿ → ℝ:

         minimize f(x),   x ∈ ℝⁿ.

     The following regularity of f(x) is assumed throughout:

     Assumption (Regularity assumption). We assume f ∈ C¹(ℝⁿ) with Lipschitz continuous gradient, i.e. there exists γ > 0 such that

         ‖∇f(x)ᵀ − ∇f(y)ᵀ‖ ≤ γ ‖x − y‖,   ∀ x, y ∈ ℝⁿ.
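For example, for a quadratic f(x) = ½ xᵀA x − bᵀx with symmetric A we have ∇f(x)ᵀ = Ax − b, hence ‖∇f(x)ᵀ − ∇f(y)ᵀ‖ = ‖A(x − y)‖ ≤ ‖A‖ ‖x − y‖, so the regularity assumption holds with γ = ‖A‖.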

  4. The problem (2 / 3)

     Definition (Global minimum). Given f : ℝⁿ → ℝ, a point x⋆ ∈ ℝⁿ is a global minimum if

         f(x⋆) ≤ f(x),   ∀ x ∈ ℝⁿ.

     Definition (Local minimum). Given f : ℝⁿ → ℝ, a point x⋆ ∈ ℝⁿ is a local minimum if there exists δ > 0 such that

         f(x⋆) ≤ f(x),   ∀ x ∈ B(x⋆; δ).

     Obviously a global minimum is also a local minimum. Finding a global minimum is in general not an easy task; the algorithms presented in the sequel approximate local minima.

  5. The problem (3 / 3)

     Definition (Strict global minimum). Given f : ℝⁿ → ℝ, a point x⋆ ∈ ℝⁿ is a strict global minimum if

         f(x⋆) < f(x),   ∀ x ∈ ℝⁿ \ {x⋆}.

     Definition (Strict local minimum). Given f : ℝⁿ → ℝ, a point x⋆ ∈ ℝⁿ is a strict local minimum if there exists δ > 0 such that

         f(x⋆) < f(x),   ∀ x ∈ B(x⋆; δ) \ {x⋆}.

     Obviously a strict global minimum is also a strict local minimum.

  6. First order necessary condition

     Lemma (First order necessary condition for a local minimum). Let f : ℝⁿ → ℝ satisfy the regularity assumption. If a point x⋆ ∈ ℝⁿ is a local minimum, then ∇f(x⋆)ᵀ = 0.

     Proof. Consider a generic direction d. Since x⋆ is a local minimum, for δ small enough we have

         λ⁻¹ ( f(x⋆ + λd) − f(x⋆) ) ≥ 0,   0 < λ < δ,

     so that

         lim_{λ→0} λ⁻¹ ( f(x⋆ + λd) − f(x⋆) ) = ∇f(x⋆) d ≥ 0.

     Because d is a generic direction (the same inequality holds with d replaced by −d), we must have ∇f(x⋆)ᵀ = 0.
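As a quick numerical sanity check of the lemma (a Python sketch, not part of the slides; the Rosenbrock function with its known minimizer (1, 1) is used as a stand-in test function):

```python
import numpy as np

def f(x):
    # Rosenbrock function: known global minimum at x* = (1, 1)
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

def grad_fd(f, x, h=1e-6):
    # Central finite-difference approximation of the gradient
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x_star = np.array([1.0, 1.0])
print(grad_fd(f, x_star))  # ~ [0, 0]: the necessary condition holds at x*
```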

  7. Some remarks:
     1. The first order necessary condition does not discriminate between maxima, minima, and saddle points.
     2. To discriminate between maxima and minima we need more information, e.g. the second order derivatives of f(x).
     3. With second order derivatives we can build a necessary condition and a sufficient condition for a minimum.
     4. In general, using only the first and second order derivatives at the point x⋆, it is not possible to deduce a condition that is both necessary and sufficient for a minimum.

  8. Second order necessary condition

     Lemma (Second order necessary condition for a local minimum). Given f ∈ C²(ℝⁿ), if a point x⋆ ∈ ℝⁿ is a local minimum, then ∇f(x⋆)ᵀ = 0 and ∇²f(x⋆) is positive semi-definite, i.e.

         dᵀ ∇²f(x⋆) d ≥ 0,   ∀ d ∈ ℝⁿ.

     Example. This condition is only necessary. In fact, consider f(x) = x₁² − x₂³, with

         ∇f(x) = ( 2x₁, −3x₂² ),   ∇²f(x) = [ 2  0 ; 0  −6x₂ ].

     For the point x⋆ = 0 we have ∇f(0) = 0 and ∇²f(0) positive semi-definite, but 0 is a saddle point, not a minimum.
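A small numerical illustration of this counterexample (a sketch, not from the slides): the Hessian at 0 has eigenvalues 0 and 2, so it is positive semi-definite, yet f takes negative values arbitrarily close to 0 along the positive x₂-axis.

```python
import numpy as np

def f(x):
    # The counterexample from the slide: f(x) = x1^2 - x2^3
    return x[0]**2 - x[1]**3

grad0 = np.array([0.0, 0.0])                # (2*x1, -3*x2^2) evaluated at x = 0
hess0 = np.array([[2.0, 0.0], [0.0, 0.0]])  # [[2, 0], [0, -6*x2]] evaluated at x = 0

print(np.linalg.eigvalsh(hess0))       # [0. 2.]: positive semi-definite
print(f(np.array([0.0, 1e-3])) < 0.0)  # True: f < f(0) = 0 near 0, so 0 is a saddle
```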

  9. Proof. The condition ∇f(x⋆)ᵀ = 0 comes from the first order necessary condition. Consider now a generic direction d and the finite difference

         ( f(x⋆ + λd) − 2 f(x⋆) + f(x⋆ − λd) ) / λ² ≥ 0,

     which is nonnegative for λ small enough because x⋆ is a local minimum. By using the Taylor expansion of f(x),

         f(x⋆ ± λd) = f(x⋆) ± λ ∇f(x⋆) d + (λ²/2) dᵀ ∇²f(x⋆) d + o(λ²),

     the previous inequality becomes

         dᵀ ∇²f(x⋆) d + 2 o(λ²)/λ² ≥ 0.

     Taking the limit λ → 0, and from the arbitrariness of d, we conclude that ∇²f(x⋆) must be positive semi-definite.

  10. Second order sufficient condition

     Lemma (Second order sufficient condition for a local minimum). Given f ∈ C²(ℝⁿ), if a point x⋆ ∈ ℝⁿ satisfies
     1. ∇f(x⋆)ᵀ = 0;
     2. ∇²f(x⋆) is positive definite, i.e. dᵀ ∇²f(x⋆) d > 0 for all d ∈ ℝⁿ \ {0};
     then x⋆ is a strict local minimum.

     Remark. Because ∇²f(x⋆) is symmetric, we can write

         λ_min dᵀd ≤ dᵀ ∇²f(x⋆) d ≤ λ_max dᵀd,

     where λ_min and λ_max are the extreme eigenvalues of ∇²f(x⋆). If ∇²f(x⋆) is positive definite we have λ_min > 0.
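Thanks to the remark, checking the sufficient condition in practice reduces to checking λ_min > 0; a minimal sketch (the helper name is illustrative):

```python
import numpy as np

def is_positive_definite(H):
    # For symmetric H, positive definiteness is equivalent to lambda_min > 0;
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(H)[0] > 0.0

print(is_positive_definite(np.array([[2.0, 1.0], [1.0, 3.0]])))  # True
print(is_positive_definite(np.array([[2.0, 0.0], [0.0, 0.0]])))  # False: only semi-definite
```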

  11. Proof. Consider a generic direction d and the Taylor expansion of f(x), recalling that ∇f(x⋆) = 0 by assumption 1:

         f(x⋆ + d) = f(x⋆) + ∇f(x⋆) d + ½ dᵀ ∇²f(x⋆) d + o(‖d‖²)
                   ≥ f(x⋆) + ½ λ_min ‖d‖² + o(‖d‖²)
                   = f(x⋆) + ½ λ_min ‖d‖² ( 1 + o(‖d‖²)/‖d‖² ).

     Choosing ‖d‖ small enough we can write

         f(x⋆ + d) ≥ f(x⋆) + ¼ λ_min ‖d‖² > f(x⋆),   d ≠ 0, ‖d‖ ≤ δ,

     i.e. x⋆ is a strict local minimum.

  12. General iterative scheme

  13. How to find a minimum

     Given f : ℝⁿ → ℝ, minimize f(x) over x ∈ ℝⁿ.

     1. We could solve the problem via the necessary condition, i.e. by solving the nonlinear system ∇f(x)ᵀ = 0.
     2. With such an approach, however, we lose the information carried by the values of f(x).
     3. Moreover, such an approach can find solutions corresponding to maxima or saddle points (see the sketch below).
     4. A better approach is to use all the available information and build a minimizing procedure, i.e. a procedure that, starting from a point x_0, builds a sequence {x_k} such that f(x_{k+1}) ≤ f(x_k). In this way, at least, we avoid converging to a strict maximum.
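A tiny illustration of point 3 (a sketch, not from the slides): for f(x) = x³ − 3x the stationary points are x = ±1, with a minimum at 1 and a maximum at −1; a root finder applied to ∇f(x) = 0 with a bracket around −1 happily returns the maximum, because it never looks at f itself.

```python
from scipy.optimize import brentq

# Gradient of f(x) = x^3 - 3x
grad = lambda x: 3.0 * x**2 - 3.0

# Bracketing root search on the gradient near the maximum of f
print(brentq(grad, -2.0, 0.0))  # -1.0, a local maximum of f
```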

  14. Iterative methods

     In practice it is very rare to be able to provide an explicit minimizer. An iterative method, given a starting guess x_0, generates a sequence {x_k}, k = 1, 2, ...

     AIM: ensure that the sequence (or a subsequence) has some favorable limiting properties:
     - it satisfies the first-order necessary conditions;
     - it satisfies the second-order necessary conditions.

  15. Line-search methods

     A generic iterative minimization procedure can be sketched as follows:
     - calculate a search direction p_k from x_k;
     - ensure that this direction is a descent direction, i.e.

           ∇f(x_k) p_k < 0   whenever ∇f(x_k)ᵀ ≠ 0,

       so that, at least for small steps along p_k, the objective function f(x) will be reduced;
     - use a line-search to calculate a suitable step length α_k > 0 so that f(x_k + α_k p_k) < f(x_k);
     - update the point: x_{k+1} = x_k + α_k p_k.
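For example, the steepest descent direction p_k = −∇f(x_k)ᵀ is always a descent direction, since ∇f(x_k) p_k = −‖∇f(x_k)‖² < 0 whenever ∇f(x_k)ᵀ ≠ 0.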

  16. Generic minimization algorithm

     Written in pseudo-code, the minimization procedure is the following algorithm:

         Given an initial guess x_0, let k = 0;
         while not converged do
             find a descent direction p_k at x_k;
             compute a step size α_k using a line-search along p_k;
             set x_{k+1} = x_k + α_k p_k and increase k by 1;
         end while

     The crucial points which differentiate the algorithms are:
     1. the computation of the direction p_k;
     2. the computation of the step size α_k.
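A minimal Python transcription of this scheme (a sketch, not the lecturer's code; `direction` and `linesearch` are placeholder callables for the two crucial points listed above):

```python
import numpy as np

def line_search_minimize(f, grad, x0, direction, linesearch,
                         tol=1e-8, max_iter=1000):
    # Generic line-search scheme: x_{k+1} = x_k + alpha_k * p_k
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:    # first-order stopping test
            break
        p = direction(x, g)             # must be a descent direction
        alpha = linesearch(f, x, p, g)  # step length ensuring decrease
        x = x + alpha * p
    return x
```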

  17. Practical line-search methods

     The first minimization algorithms developed tried to solve

         α_k = argmin_{α > 0} f(x_k + α p_k)

     by performing an exact line-search, i.e. a univariate minimization; this is rather expensive and certainly not cost effective. Modern methods implement an inexact line-search, which:
     - ensures steps are neither too long nor too short;
     - tries to pick a useful initial step size for fast convergence.
     The best methods are based on:
     - backtracking–Armijo search;
     - Armijo–Goldstein search;
     - Frank–Wolfe search.

  18. Backtracking line-search

     To obtain a monotonically decreasing sequence we can use the following algorithm:

         Given α_init (e.g., α_init = 1);
         given τ ∈ (0, 1), typically τ = 0.5;
         let α^(0) = α_init and ℓ = 0;
         while not f(x_k + α^(ℓ) p_k) < f(x_k) do
             set α^(ℓ+1) = τ α^(ℓ);
             increase ℓ by 1;
         end while
         Set α_k = α^(ℓ).

     To be effective, the previous algorithm must terminate in a finite number of steps. The next lemma assures that if p_k is a descent direction then the algorithm terminates.
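A direct Python transcription of this loop (a sketch; it implements the plain decrease test from the slide, not yet the Armijo sufficient-decrease condition introduced later):

```python
def backtracking(f, x, p, g, alpha_init=1.0, tau=0.5, max_halvings=50):
    # Shrink alpha by tau until f(x + alpha*p) < f(x); the gradient g is
    # unused here but kept so the signature matches the generic scheme above.
    f_x = f(x)
    alpha = alpha_init
    for _ in range(max_halvings):
        if f(x + alpha * p) < f_x:
            break
        alpha *= tau
    return alpha
```

Plugged into the generic `line_search_minimize` sketch above with the steepest descent direction (`direction=lambda x, g: -g`), this already produces a monotonically decreasing sequence f(x_k).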

  19. Backtracking Armijo line-search
