Local, Unconstrained Function Optimization
COMPSCI 527 — Computer Vision
Outline
1 Gradient, Hessian, and Convexity
2 A Local, Unconstrained Optimization Template
3 Steepest Descent
4 Termination
5 Convergence Speed of Steepest Descent
6 Newton’s Method
7 Convergence Speed of Newton’s Method
8 Counting Steps versus Clocking
Motivation and Scope
• Most estimation problems are solved by optimization
• Machine learning:
  • Parametric predictor: $h(\mathbf{x}; \mathbf{v}) : \mathbb{R}^d \times \mathbb{R}^m \to Y$
  • Risk: $L_T(\mathbf{v}) = \frac{1}{N} \sum_{n=1}^{N} \ell(y_n, h(\mathbf{x}_n; \mathbf{v}))$, with $L_T : \mathbb{R}^m \to \mathbb{R}$
  • Training: $\hat{\mathbf{v}} = \arg\min_{\mathbf{v} \in \mathbb{R}^m} L_T(\mathbf{v})$
• 3D reconstruction: $I = \pi(C, S)$, where $I$ are the images, $C$ are the camera positions and orientations, and $S$ is the scene shape
  • Given $I$, find $\hat{C}, \hat{S} = \arg\min_{C, S} \|I - \pi(C, S)\|$
• In general, “solving” an equation $E(\mathbf{z}) = \mathbf{0}$ can be viewed as $\hat{\mathbf{z}} = \arg\min_{\mathbf{z}} \|E(\mathbf{z})\|$
Only Local Minimization
$\hat{\mathbf{z}} = \arg\min_{\mathbf{z} \in \mathbb{R}^m} f(\mathbf{z})$
• All we know about $f$ is a “black box” (think Python function)
• For many problems, $f$ has many local minima
• Start somewhere ($\mathbf{z}_0$) and take steps “down”: $f(\mathbf{z}_{k+1}) < f(\mathbf{z}_k)$
• When we get stuck at a local minimum, we declare success
• We would like global minima, but all we get is local ones
• For some problems, $f$ has a unique minimum...
• ... or at least a single connected set of minima
Gradient, Hessian, and Convexity
Gradient
$\nabla f(\mathbf{z}) = \dfrac{\partial f}{\partial \mathbf{z}} = \begin{bmatrix} \frac{\partial f}{\partial z_1} \\ \vdots \\ \frac{\partial f}{\partial z_m} \end{bmatrix}$
• We saw the gradient for the case $\mathbf{z} \in \mathbb{R}^2$
• If $\nabla f(\mathbf{z})$ exists everywhere, the condition $\nabla f(\mathbf{z}) = \mathbf{0}$ is necessary and sufficient for a stationary point (max, min, or saddle)
• Warning: only necessary for a minimum!
• Reduces to the first derivative $\frac{df}{dz}$ for $f : \mathbb{R} \to \mathbb{R}$
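As a concrete sketch (not from the slides): the gradient can be approximated component by component with central differences, which is a handy check on a hand-derived gradient. The test function and step size below are my own illustrative choices.

```python
import numpy as np

def numerical_gradient(f, z, h=1e-6):
    """Central-difference approximation of the gradient of f at z."""
    g = np.zeros_like(z, dtype=float)
    for i in range(z.size):
        e = np.zeros_like(z, dtype=float)
        e[i] = h
        # i-th partial derivative: (f(z + h e_i) - f(z - h e_i)) / (2h)
        g[i] = (f(z + e) - f(z - e)) / (2 * h)
    return g

# Hypothetical example: f(z) = z1^2 + 3 z2^2 has gradient (2 z1, 6 z2)
f = lambda z: z[0]**2 + 3 * z[1]**2
print(numerical_gradient(f, np.array([1.0, 2.0])))  # approx [2. 12.]
```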
First-Order Taylor Expansion
$f(\mathbf{z}) \approx g_1(\mathbf{z}) = f(\mathbf{z}_0) + [\nabla f(\mathbf{z}_0)]^T (\mathbf{z} - \mathbf{z}_0)$
approximates $f(\mathbf{z})$ near $\mathbf{z}_0$ with a (hyper)plane through $\mathbf{z}_0$
[Figure: the tangent plane $g_1$ to the surface $f(z_1, z_2)$ at $\mathbf{z}_0$]
• $\nabla f(\mathbf{z}_0)$ points in the direction of steepest increase of $f$ at $\mathbf{z}_0$
• If we want to find $\mathbf{z}_1$ where $f(\mathbf{z}_1) < f(\mathbf{z}_0)$, going along $-\nabla f(\mathbf{z}_0)$ seems promising
• This is the general idea of steepest descent
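To make the idea concrete, here is a minimal one-step sketch (my own toy example, with a hand-picked step size): moving along $-\nabla f(\mathbf{z}_0)$ does decrease $f$.

```python
import numpy as np

# One steepest-descent step on a toy quadratic (illustrative values)
f     = lambda z: z[0]**2 + 3 * z[1]**2
gradf = lambda z: np.array([2 * z[0], 6 * z[1]])

z0 = np.array([1.0, 2.0])
alpha = 0.1                      # step size, chosen by hand here
z1 = z0 - alpha * gradf(z0)      # move along the negative gradient

print(f(z0), f(z1))              # f decreases: 13.0 -> 2.56
```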
Hessian
$H(\mathbf{z}) = \begin{bmatrix} \frac{\partial^2 f}{\partial z_1^2} & \cdots & \frac{\partial^2 f}{\partial z_1 \partial z_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial z_m \partial z_1} & \cdots & \frac{\partial^2 f}{\partial z_m^2} \end{bmatrix}$
• Symmetric matrix because of Schwarz’s theorem: $\frac{\partial^2 f}{\partial z_i \partial z_j} = \frac{\partial^2 f}{\partial z_j \partial z_i}$
• Eigenvalues are real because of symmetry
• Reduces to $\frac{d^2 f}{dz^2}$ for $f : \mathbb{R} \to \mathbb{R}$
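A matching numerical sketch (again my own illustration, not course code): the Hessian can be approximated by central differences as well, and the symmetry promised by Schwarz’s theorem shows up in the result up to rounding error.

```python
import numpy as np

def numerical_hessian(f, z, h=1e-4):
    """Central-difference approximation of the Hessian of f at z."""
    m = z.size
    H = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            ei = np.zeros(m); ei[i] = h
            ej = np.zeros(m); ej[j] = h
            # mixed second partial d^2 f / (dz_i dz_j)
            H[i, j] = (f(z + ei + ej) - f(z + ei - ej)
                       - f(z - ei + ej) + f(z - ei - ej)) / (4 * h * h)
    return H

# Hypothetical example: f has constant Hessian [[2, 1], [1, 6]]
f = lambda z: z[0]**2 + 3 * z[1]**2 + z[0] * z[1]
print(numerical_hessian(f, np.array([0.0, 0.0])))  # approx [[2, 1], [1, 6]]
```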
Convexity
[Figure: for a convex $f$, the chord value $u\,f(\mathbf{z}) + (1-u)\,f(\mathbf{z}')$ lies above $f(u\mathbf{z} + (1-u)\mathbf{z}')$ between $\mathbf{z}$ and $\mathbf{z}'$]
• Convex everywhere: for all $\mathbf{z}, \mathbf{z}'$ in the (open) domain of $f$ and for all $u \in [0, 1]$,
  $f(u\mathbf{z} + (1-u)\mathbf{z}') \le u\,f(\mathbf{z}) + (1-u)\,f(\mathbf{z}')$
• Convex at $\mathbf{z}_0$: the function $f$ is convex everywhere in some open neighborhood of $\mathbf{z}_0$
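As a one-line sanity check of the definition (my example, not on the slide), $f(z) = z^2$ satisfies the inequality for every $u \in [0, 1]$:

```latex
% Convexity check for f(z) = z^2, with u in [0, 1]:
\[
u z^2 + (1-u) {z'}^2 - \bigl(u z + (1-u) z'\bigr)^2
  = u (1-u) (z - z')^2 \;\ge\; 0 ,
\]
% so the chord never dips below the function.
```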
Convexity and Hessian
• If $H(\mathbf{z})$ is defined at a stationary point $\mathbf{z}$ of $f$, then $\mathbf{z}$ is a minimum iff $H(\mathbf{z}) \succeq 0$
• “$\succeq$” means positive semidefinite: $\mathbf{z}^T H \mathbf{z} \ge 0$ for all $\mathbf{z} \in \mathbb{R}^m$
• The above is the definition of $H(\mathbf{z}) \succeq 0$
• To check computationally: all eigenvalues of $H$ are nonnegative
• $H(\mathbf{z}) \succeq 0$ reduces to $\frac{d^2 f}{dz^2} \ge 0$ for $f : \mathbb{R} \to \mathbb{R}$
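A computational version of the eigenvalue test (a sketch; the tolerance is an assumption to absorb floating-point noise):

```python
import numpy as np

def is_positive_semidefinite(H, tol=1e-10):
    """Test H >= 0 (PSD) by checking that all eigenvalues are nonnegative."""
    # eigvalsh assumes a symmetric matrix and returns real eigenvalues
    return bool(np.all(np.linalg.eigvalsh(H) >= -tol))

print(is_positive_semidefinite(np.array([[2.0, 1.0], [1.0, 6.0]])))   # True
print(is_positive_semidefinite(np.array([[1.0, 0.0], [0.0, -1.0]])))  # False
```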
Second-Order Taylor Expansion
$f(\mathbf{z}) \approx g_2(\mathbf{z}) = f(\mathbf{z}_0) + [\nabla f(\mathbf{z}_0)]^T (\mathbf{z} - \mathbf{z}_0) + \frac{1}{2} (\mathbf{z} - \mathbf{z}_0)^T H(\mathbf{z}_0) (\mathbf{z} - \mathbf{z}_0)$
approximates $f(\mathbf{z})$ near $\mathbf{z}_0$ with a quadratic function through $\mathbf{z}_0$
• For minimization, this is useful only when $H(\mathbf{z}_0) \succeq 0$
• The function then looks locally like a bowl
[Figure: the quadratic bowl $g_2$ fitted to $f(z_1, z_2)$ at $\mathbf{z}_0$]
• If we want to find $\mathbf{z}_1$ where $f(\mathbf{z}_1) < f(\mathbf{z}_0)$, going to the bottom of the bowl seems promising
• This is the general idea of Newton’s method
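A minimal sketch of the “bottom of the bowl” step (my toy example, not course code): on an exact quadratic with $H \succ 0$, solving $H \mathbf{p} = -\nabla f(\mathbf{z}_0)$ and stepping by $\mathbf{p}$ lands on the minimum in one move.

```python
import numpy as np

# One Newton step on a toy quadratic (illustrative values)
f     = lambda z: z[0]**2 + 3 * z[1]**2 + z[0] * z[1]
gradf = lambda z: np.array([2 * z[0] + z[1], 6 * z[1] + z[0]])
H     = np.array([[2.0, 1.0], [1.0, 6.0]])  # constant Hessian of a quadratic

z0 = np.array([1.0, 2.0])
p = np.linalg.solve(H, -gradf(z0))          # solve H p = -grad (no inverse)
z1 = z0 + p                                 # bottom of the local bowl

print(f(z0), f(z1))                         # 15.0 -> 0.0, the exact minimum
```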
A Local, Unconstrained Optimization Template
A Template
• Regardless of method, most local, unconstrained optimization methods fit the following template:

k = 0
while $\mathbf{z}_k$ is not a minimum
    compute step direction $\mathbf{p}_k$ with $\|\mathbf{p}_k\| > 0$
    compute step size $\alpha_k > 0$
    $\mathbf{z}_{k+1} = \mathbf{z}_k + \alpha_k \mathbf{p}_k$
    k = k + 1
end
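The template becomes runnable once the design decisions are filled in. Below is a minimal instance (my sketch, not code from the course): steepest-descent direction, fixed step size, and a small gradient norm standing in for “$\mathbf{z}_k$ is a minimum”; `alpha`, `tol`, and `max_iters` are assumed defaults.

```python
import numpy as np

def minimize(f, gradf, z0, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Template instance: steepest-descent direction, fixed step size."""
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iters):           # k = 0, 1, 2, ...
        g = gradf(z)
        if np.linalg.norm(g) < tol:      # stand-in for "z_k is a minimum"
            break
        p = -g                           # step direction p_k
        z = z + alpha * p                # z_{k+1} = z_k + alpha_k p_k
    return z

f     = lambda z: z[0]**2 + 3 * z[1]**2
gradf = lambda z: np.array([2 * z[0], 6 * z[1]])
print(minimize(f, gradf, [1.0, 2.0]))    # converges near [0, 0]
```

With these choices the loop is plain gradient descent; the later sections in the outline vary the direction $\mathbf{p}_k$ and step size $\alpha_k$ to obtain steepest descent and Newton’s method.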
Design Decisions
• Whether to stop (“while $\mathbf{z}_k$ is not a minimum”)
• In what direction to proceed ($\mathbf{p}_k$)
• How long a step to take in that direction ($\alpha_k$)
• Different decisions for the last two lead to different methods with very different behaviors and computational costs