Algorithms for unconstrained local optimization
Fabio Schoen, 2008
http://gol.dsi.unifi.it/users/schoen
Optimization Algorithms

Most common form for optimization algorithms: line search-based methods. Given a starting point $x_0$, a sequence is generated:
$$x_{k+1} = x_k + \alpha_k d_k$$
where $d_k \in \mathbb{R}^n$ is the search direction and $\alpha_k > 0$ is the step. Usually $d_k$ is chosen first and then the step is obtained, often from a one-dimensional optimization.
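As an illustration (not from the slides), a minimal Python sketch of this generic scheme; the rules `choose_direction` and `choose_step` are hypothetical placeholders for the direction and step selection discussed in later slides.

```python
import numpy as np

def line_search_method(f, grad, x0, choose_direction, choose_step,
                       tol=1e-8, max_iter=1000):
    """Generic line search iteration x_{k+1} = x_k + alpha_k d_k:
    the direction d_k is chosen first, then the step alpha_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:          # (approximately) stationary point
            break
        d = choose_direction(x, g)            # e.g. -g for steepest descent
        alpha = choose_step(f, grad, x, d)    # e.g. Armijo backtracking (later slides)
        x = x + alpha * d
    return x

# Tiny usage sketch: steepest descent with a fixed, purely illustrative step.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
print(line_search_method(f, grad, [5.0, 5.0],
                         choose_direction=lambda x, g: -g,
                         choose_step=lambda f, grad, x, d: 0.05))
```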
Trust-region algorithms

A model $m(x)$ and a trust region $U(x_k)$ containing $x_k$ are defined. The new iterate is chosen as the solution of the constrained optimization problem
$$\min_{x \in U(x_k)} m(x)$$
The model and the trust region are possibly updated at each iteration.
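A minimal Python sketch of one trust-region iteration, assuming a quadratic model minimized only approximately via the Cauchy point; the acceptance thresholds and radius updates below are standard textbook choices, not taken from the slide.

```python
import numpy as np

def trust_region_step(f, g, B, x, delta, eta=0.1):
    """One illustrative trust-region iteration: g = grad f(x), B ~ Hessian model.
    The quadratic model m(p) = f(x) + g^T p + 1/2 p^T B p is minimized over
    ||p|| <= delta only approximately, via the Cauchy point."""
    gnorm = np.linalg.norm(g)
    gBg = float(g @ B @ g)
    # step length along -g, clipped so that the step stays inside the region
    tau = delta / gnorm if gBg <= 0 else min(delta / gnorm, gnorm ** 2 / gBg)
    p = -tau * g
    predicted = -(float(g @ p) + 0.5 * float(p @ B @ p))   # m(0) - m(p) > 0
    rho = (f(x) - f(x + p)) / predicted                     # actual vs predicted reduction
    if rho < 0.25:
        delta *= 0.25                                       # poor model: shrink region
    elif rho > 0.75 and np.isclose(tau * gnorm, delta):
        delta *= 2.0                                        # good model, step on boundary: expand
    x_new = x + p if rho > eta else x                       # accept only if enough decrease
    return x_new, delta
```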
Speed measures

Let $x^\star$ be a local optimum. The error at $x_k$ might be measured, e.g., as $e(x_k) = \|x_k - x^\star\|$ or $e(x_k) = |f(x_k) - f(x^\star)|$.

Given $\{x_k\} \to x^\star$, if there exist $q > 0$, $\beta \in (0,1)$ such that (for $k$ large enough)
$$e(x_k) \le q\beta^k$$
then $\{x_k\}$ is linearly convergent, or converges with order 1; $\beta$ is the convergence rate.

A sufficient condition for linear convergence:
$$\limsup_{k\to\infty} \frac{e(x_{k+1})}{e(x_k)} \le \beta$$
Super-linear convergence

If for every $\beta \in (0,1)$ there exists $q$ such that $e(x_k) \le q\beta^k$, then convergence is super-linear.

Sufficient condition:
$$\limsup_{k\to\infty} \frac{e(x_{k+1})}{e(x_k)} = 0$$
Higher order convergence

If, given $p > 1$, there exist $q > 0$, $\beta \in (0,1)$ such that
$$e(x_k) \le q\beta^{(p^k)}$$
then $\{x_k\}$ is said to converge with order at least $p$. If $p = 2$, the convergence is quadratic.

Sufficient condition:
$$\limsup_{k\to\infty} \frac{e(x_{k+1})}{e(x_k)^p} < \infty$$
Examples

$1/k$ converges to 0 with order one (linear convergence)
$1/k^2$ converges to 0 with order 1
$2^{-k}$ converges to 0 with order 1
$k^{-k}$ converges to 0 with order 1; convergence is super-linear
$1/2^{2^k}$ converges to 0 with order 2: quadratic convergence
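A small numerical illustration (assumed test code, not part of the slides): inspecting the ratios $e(x_{k+1})/e(x_k)$ and $e(x_{k+1})/e(x_k)^2$ for these sequences distinguishes linear, super-linear and quadratic behaviour.

```python
# Inspect the ratios e_{k+1}/e_k and e_{k+1}/e_k^2 for the example sequences.
sequences = {
    "1/k":        lambda k: 1.0 / k,
    "2^-k":       lambda k: 2.0 ** (-k),
    "k^-k":       lambda k: float(k) ** (-k),
    "1/2^(2^k)":  lambda k: 1.0 / 2.0 ** (2.0 ** k),
}

for name, e in sequences.items():
    k = 8  # a moderately large index
    r1 = e(k + 1) / e(k)        # -> constant < 1 for linear, -> 0 for super-linear
    r2 = e(k + 1) / e(k) ** 2   # bounded for quadratic convergence
    print(f"{name:12s}  e(k+1)/e(k) = {r1:.3e}   e(k+1)/e(k)^2 = {r2:.3e}")
```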
Descent directions and the gradient

Let $f \in C^1(\mathbb{R}^n)$, $x_k \in \mathbb{R}^n$ with $\nabla f(x_k) \ne 0$, and let $d \in \mathbb{R}^n$. If $d^T \nabla f(x_k) < 0$ then $d$ is a descent direction.

Taylor expansion:
$$f(x_k + \alpha d) - f(x_k) = \alpha\, d^T \nabla f(x_k) + o(\alpha)$$
$$\frac{f(x_k + \alpha d) - f(x_k)}{\alpha} = d^T \nabla f(x_k) + o(1)$$
Thus, if $\alpha$ is small enough, $f(x_k + \alpha d) - f(x_k) < 0$.

NB: $d$ might be a descent direction even if $d^T \nabla f(x_k) = 0$.
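An illustrative check of the Taylor argument, using an assumed test function $f(x) = x_1^2 + 10 x_2^2$: along a direction with $d^T \nabla f(x_k) < 0$ the function decreases once $\alpha$ is small enough.

```python
import numpy as np

f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2      # assumed smooth test function
x_k = np.array([1.0, 1.0])
d = np.array([-1.0, 0.0])                        # d^T grad f(x_k) = -2 < 0: descent direction
for alpha in (3.0, 1.0, 0.1, 0.01):
    decreased = f(x_k + alpha * d) < f(x_k)
    print(alpha, decreased)                      # False for the too-large step, True once alpha is small
```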
Convergence of line search methods

If a sequence $x_{k+1} = x_k + \alpha_k d_k$ is generated in such a way that:
$L_0 = \{x : f(x) \le f(x_0)\}$ is compact;
$d_k \ne 0$ whenever $\nabla f(x_k) \ne 0$;
$f(x_{k+1}) \le f(x_k)$ if $\nabla f(x_k) \ne 0$, for all $k$;
$$\lim_{k\to\infty} \frac{d_k^T \nabla f(x_k)}{\|d_k\|} = 0;$$
if $d_k \ne 0$ then
$$\frac{|d_k^T \nabla f(x_k)|}{\|d_k\|} \ge \sigma(\|\nabla f(x_k)\|)$$
where $\sigma$ is such that $\lim_{k\to\infty} \sigma(t_k) = 0 \Rightarrow \lim_{k\to\infty} t_k = 0$ ($\sigma$ is called a forcing function);
then either there exists a finite index $\bar{k}$ such that $\nabla f(x_{\bar{k}}) = 0$, or otherwise:
$x_k \in L_0$ and all of its limit points are in $L_0$;
$\{f(x_k)\}$ admits a limit;
$\lim_{k\to\infty} \nabla f(x_k) = 0$;
for every limit point $\bar{x}$ of $\{x_k\}$ we have $\nabla f(\bar{x}) = 0$.
Comments on the assumptions

$f(x_{k+1}) \le f(x_k)$: most optimization methods choose $d_k$ as a descent direction. If $d_k$ is a descent direction, choosing $\alpha_k$ "sufficiently small" ensures the validity of the assumption.

$\lim_{k\to\infty} d_k^T \nabla f(x_k)/\|d_k\| = 0$: given a normalized direction $d_k$, the scalar product $d_k^T \nabla f(x_k)$ is the directional derivative of $f$ along $d_k$: it is required that this goes to zero. This can be achieved through exact line searches (choosing the step so that $f$ is minimized along $d_k$).

$|d_k^T \nabla f(x_k)|/\|d_k\| \ge \sigma(\|\nabla f(x_k)\|)$: letting, e.g., $\sigma(t) = ct$ with $c > 0$, if $d_k$ is such that $d_k^T \nabla f(x_k) < 0$, then the condition becomes
$$\frac{d_k^T \nabla f(x_k)}{\|d_k\|\,\|\nabla f(x_k)\|} \le -c$$
Recalling that
$$\cos\theta_k = \frac{d_k^T \nabla f(x_k)}{\|d_k\|\,\|\nabla f(x_k)\|}$$
the condition becomes $\cos\theta_k \le -c$, that is, the angle between $d_k$ and $\nabla f(x_k)$ is bounded away from orthogonality.

[Figure: the angle $\theta_k$ between $d_k$ and $\nabla f(x_k)$]
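A short sketch of this angle condition with $\sigma(t) = ct$; the test vectors below are assumptions used only for illustration.

```python
import numpy as np

def satisfies_angle_condition(grad, d, c=0.1):
    """Check cos(theta_k) = d^T grad f(x_k) / (||d|| ||grad f(x_k)||) <= -c."""
    cos_theta = float(d @ grad) / (np.linalg.norm(d) * np.linalg.norm(grad))
    return cos_theta <= -c

grad = np.array([2.0, 20.0])                       # assumed gradient at x_k
print(satisfies_angle_condition(grad, -grad))      # opposite to the gradient: True
print(satisfies_angle_condition(grad, np.array([10.0, -1.0])))  # orthogonal to it: False
```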
Gradient Algorithms

General scheme:
$$x_{k+1} = x_k - \alpha_k D_k \nabla f(x_k)$$
with $D_k \succ 0$ and $\alpha_k > 0$.

If $\nabla f(x_k) \ne 0$ then $d_k = -D_k \nabla f(x_k)$ is a descent direction. In fact
$$d_k^T \nabla f(x_k) = -\nabla^T f(x_k)\, D_k\, \nabla f(x_k) < 0$$
Steepest Descent

The steepest descent or "gradient" method: $D_k := I$, i.e.
$$x_{k+1} = x_k - \alpha_k \nabla f(x_k).$$
If $\nabla f(x_k) \ne 0$ then $d_k = -\nabla f(x_k)$ is a descent direction. Moreover, it is the steepest (w.r.t. the Euclidean norm): it solves
$$\min_{d \in \mathbb{R}^n,\ \|d\| \le 1} \nabla^T f(x_k)\, d$$
. . .

$$\min_{d \in \mathbb{R}^n,\ \sqrt{d^T d} \le 1} \nabla^T f(x_k)\, d$$
KKT conditions: in the interior $\Rightarrow \nabla^T f(x_k) = 0$; if the constraint is active $\Rightarrow$
$$\nabla f(x_k) + \lambda \frac{d}{\|d\|} = 0, \qquad \sqrt{d^T d} = 1, \qquad \lambda \ge 0$$
$$\Rightarrow\ d = -\frac{\nabla f(x_k)}{\|\nabla f(x_k)\|}.$$
Newton's method

$$D_k := \left[\nabla^2 f(x_k)\right]^{-1}$$
Motivation: Taylor expansion of $f$:
$$f(x) \approx f(x_k) + \nabla^T f(x_k)(x - x_k) + \frac{1}{2}(x - x_k)^T \nabla^2 f(x_k)(x - x_k)$$
Minimizing the approximation:
$$\nabla f(x_k) + \nabla^2 f(x_k)(x - x_k) = 0$$
If the Hessian is non-singular $\Rightarrow$
$$x = x_k - \left[\nabla^2 f(x_k)\right]^{-1} \nabla f(x_k)$$
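A minimal sketch of one Newton step (solving the linear system rather than inverting the Hessian); the quadratic test problem is an assumption used only for illustration.

```python
import numpy as np

def newton_step(x_k, grad, hess):
    """One (pure) Newton step: solve hess(x_k) p = -grad(x_k) for the step p."""
    p = np.linalg.solve(hess(x_k), -grad(x_k))
    return x_k + p

# On a strictly convex quadratic f(x) = 1/2 x^T Q x + c^T x, a single Newton
# step from any point reaches the minimizer -Q^{-1} c exactly.
Q = np.array([[2.0, 0.0], [0.0, 20.0]])
c = np.array([-2.0, -4.0])
grad = lambda x: Q @ x + c
hess = lambda x: Q
print(newton_step(np.array([5.0, 5.0]), grad, hess))   # -> [1.0, 0.2]
```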
Step choice

Given $d_k$, how to choose $\alpha_k$ in $x_{k+1} = x_k + \alpha_k d_k$?

"Optimal" choice (one-dimensional optimization):
$$\alpha_k = \arg\min_{\alpha \ge 0} f(x_k + \alpha d_k).$$
An analytical expression for the optimal step is available only in a few cases, e.g. if $f(x) = \frac{1}{2} x^T Q x + c^T x$ with $Q \succ 0$. Then
$$f(x_k + \alpha d_k) = \frac{1}{2}(x_k + \alpha d_k)^T Q (x_k + \alpha d_k) + c^T (x_k + \alpha d_k) = \frac{1}{2}\alpha^2 d_k^T Q d_k + \alpha (Q x_k + c)^T d_k + \beta$$
where $\beta$ does not depend on $\alpha$.
Minimizing w.r.t. $\alpha$:
$$\alpha\, d_k^T Q d_k + (Q x_k + c)^T d_k = 0 \ \Rightarrow\ \alpha = -\frac{(Q x_k + c)^T d_k}{d_k^T Q d_k} = -\frac{d_k^T \nabla f(x_k)}{d_k^T \nabla^2 f(x_k)\, d_k}$$
E.g., in steepest descent:
$$\alpha_k = \frac{\|\nabla f(x_k)\|^2}{\nabla^T f(x_k)\, \nabla^2 f(x_k)\, \nabla f(x_k)}$$
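A sketch of steepest descent with this exact step on an assumed small quadratic ($Q$ and $c$ below are illustrative data, not from the slides).

```python
import numpy as np

def exact_step(Q, c, x_k, d_k):
    """Exact line-search step for f(x) = 1/2 x^T Q x + c^T x along d_k:
    alpha = -d_k^T grad f(x_k) / (d_k^T Q d_k)."""
    grad = Q @ x_k + c
    return -float(d_k @ grad) / float(d_k @ Q @ d_k)

# Steepest descent with the exact step on a small strictly convex quadratic.
Q = np.array([[2.0, 0.0], [0.0, 20.0]])
c = np.array([-2.0, -4.0])
x = np.array([5.0, 5.0])
for _ in range(50):
    d = -(Q @ x + c)                  # steepest descent direction
    if np.linalg.norm(d) < 1e-10:
        break
    x = x + exact_step(Q, c, x, d) * d
print(x)                              # close to the minimizer [1.0, 0.2]
```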
Approximate step size

Rules for choosing a step size (from the sufficient conditions for convergence):
$$f(x_{k+1}) < f(x_k), \qquad \lim_{k\to\infty} \frac{d_k^T \nabla f(x_k)}{\|d_k\|} = 0$$
Often it is also required that
$$\|x_{k+1} - x_k\| \to 0, \qquad d_k^T \nabla f(x_k + \alpha_k d_k) \to 0$$
In general it is important to ensure a sufficient reduction of $f$ and a sufficiently large step $x_{k+1} - x_k$.
Avoid too large steps

[Figure: iterates with steps that are too large]
Avoid too small steps

[Figure: iterates with steps that are too small]
Armijo's rule

Input: $\delta \in (0,1)$, $\gamma \in (0,1/2)$, $\Delta_k > 0$
  $\alpha := \Delta_k$;
  while $f(x_k + \alpha d_k) > f(x_k) + \gamma \alpha\, d_k^T \nabla f(x_k)$ do
    $\alpha := \delta \alpha$;
  end
  return $\alpha$

Typical values: $\delta \in [0.1, 0.5]$, $\gamma \in [10^{-4}, 10^{-3}]$.

On exit the returned step is such that
$$f(x_k + \alpha d_k) \le f(x_k) + \gamma \alpha\, d_k^T \nabla f(x_k)$$
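A direct transcription of this rule into Python; the test function and direction at the bottom are assumptions used only for illustration.

```python
import numpy as np

def armijo_step(f, grad_fk, x_k, d_k, delta=0.5, gamma=1e-4, delta_k=1.0):
    """Backtracking line search implementing Armijo's rule as on the slide:
    shrink alpha by the factor delta until the sufficient-decrease condition
    f(x_k + alpha d_k) <= f(x_k) + gamma * alpha * d_k^T grad f(x_k) holds."""
    f_k = f(x_k)
    slope = float(d_k @ grad_fk)      # d_k^T grad f(x_k), must be < 0
    alpha = delta_k
    while f(x_k + alpha * d_k) > f_k + gamma * alpha * slope:
        alpha *= delta
    return alpha

# Tiny usage example on f(x) = x_1^2 + 10 x_2^2, steepest descent direction.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
x_k = np.array([1.0, 1.0])
g_k = np.array([2.0, 20.0])
print(armijo_step(f, g_k, x_k, -g_k))
```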
[Figure: acceptable steps $\alpha$, with the lines of slope $\gamma\, d_k^T \nabla f(x_k)$ and $d_k^T \nabla f(x_k)$]
Line search in practice

How to choose the initial step size $\Delta_k$? Let $\varphi(\alpha) = f(x_k + \alpha d_k)$. A possibility is to choose $\Delta_k = \alpha^\star$, the minimizer of a quadratic approximation to $\varphi(\cdot)$.

Example:
$$q(\alpha) = c_0 + c_1 \alpha + \frac{1}{2} c_2 \alpha^2$$
$$q(0) = c_0 := f(x_k), \qquad q'(0) = c_1 := d_k^T \nabla f(x_k)$$
Then $\alpha^\star = -c_1 / c_2$.
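A possible realization in Python; since the slide leaves the choice of $c_2$ open, this sketch assumes $c_2$ is fitted from one extra trial value $\varphi(\alpha_{\text{trial}})$, which is one common choice, not necessarily the author's.

```python
import numpy as np

def initial_step_quadratic(phi, phi0, dphi0, alpha_trial=1.0):
    """Fit q(alpha) = c0 + c1*alpha + 0.5*c2*alpha^2 to phi(0), phi'(0) and one
    extra value phi(alpha_trial); return its minimizer -c1/c2 as Delta_k."""
    c0, c1 = phi0, dphi0                              # q(0), q'(0)
    c2 = 2.0 * (phi(alpha_trial) - c0 - c1 * alpha_trial) / alpha_trial ** 2
    if c2 <= 0.0:                                     # model not convex: fall back
        return alpha_trial
    return -c1 / c2                                   # minimizer of q

# Usage with phi(alpha) = f(x_k + alpha d_k) for f(x) = x_1^2 + 10 x_2^2.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
x_k, g_k = np.array([1.0, 1.0]), np.array([2.0, 20.0])
d_k = -g_k
phi = lambda a: f(x_k + a * d_k)
print(initial_step_quadratic(phi, f(x_k), float(d_k @ g_k)))
```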