Computational Optimization: Mathematical Programming Fundamentals, 1/25 (revised)
If you don’t know where you are going, you probably won’t get there. -from some book I read in eighth grade
If you do get there, you won’t know it. -Dr. Bennett’s amendment
Mathematical programming theory tells us how to formulate a model, strategies for solving the model, how to know when we have found an optimal solution, and how hard it is to solve the model. Let’s start with the basics…
Line Segment
Let x ∈ R^n and y ∈ R^n. The points on the line segment joining x and y are { z | z = λx + (1−λ)y, 0 ≤ λ ≤ 1 }.
Convex Sets
A set S is convex if the line segment joining any two points in the set is also in the set, i.e., for any x, y ∈ S, λx + (1−λ)y ∈ S for all 0 ≤ λ ≤ 1.
(Figure: examples of convex and non-convex sets.)
Favorite Convex Sets
Circle (ball) with center c and radius r: { x | ‖x − c‖ ≤ r }.
Linear equalities (a plane): for a matrix A ∈ R^(m×n) and b ∈ R^m, { x ∈ R^n | Ax = b }.
Linear inequalities (a polyhedron): { x ∈ R^n | Ax ≤ b }.
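As a quick illustration (not from the slides), a minimal Python sketch of membership tests for these three sets; the matrix A, vector b, center c, and radius r below are made-up example data:

import numpy as np

# Hypothetical data for illustration.
c, r = np.array([1.0, 2.0]), 1.5          # ball center and radius
A = np.array([[1.0, 1.0], [-1.0, 2.0]])   # constraint matrix
b = np.array([3.0, 4.0])

def in_ball(x):        # {x : ||x - c|| <= r}
    return np.linalg.norm(x - c) <= r

def in_plane(x):       # {x : Ax = b}
    return np.allclose(A @ x, b)

def in_polyhedron(x):  # {x : Ax <= b}
    return np.all(A @ x <= b)

x = np.array([1.0, 1.0])
print(in_ball(x), in_plane(x), in_polyhedron(x))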
Convex Sets
Is the intersection of two convex sets convex? Yes.
Is the union of two convex sets convex? No; for example, the union of two disjoint intervals is not convex.
Convex Functions
A function f is (strictly) convex on a convex set S if and only if for any x, y ∈ S, f(λx + (1−λ)y) ≤ λf(x) + (1−λ)f(y) for all 0 ≤ λ ≤ 1 (with strict inequality for x ≠ y and 0 < λ < 1 in the strictly convex case).
(Figure: the chord from (x, f(x)) to (y, f(y)) lies above the graph of f.)
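A small numerical sanity check of this defining inequality on sampled λ values (a sketch; the test functions x^2 and sin(x) are illustrative choices, not from the slides):

import numpy as np

def convexity_holds(f, x, y, num_lambdas=50):
    # Check f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y) on a grid of lam.
    lams = np.linspace(0.0, 1.0, num_lambdas)
    return all(f(l * x + (1 - l) * y) <= l * f(x) + (1 - l) * f(y) + 1e-12
               for l in lams)

print(convexity_holds(lambda x: x ** 2, -2.0, 3.0))       # True: x^2 is convex
print(convexity_holds(lambda x: np.sin(x), 0.0, np.pi))   # False: sin is not convex on [0, pi]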
Concave Functions
A function f is (strictly) concave on a convex set S if and only if −f is (strictly) convex on S.
(Figure: f and −f.)
(Strictly) Convex, Concave, or None of the Above?
(Figure: five example functions; answers, in order: concave, convex, none of the above, concave, strictly convex.)
Favorite Convex Functions
Linear functions: f(x) = w'x = Σ_{i=1}^n w_i x_i where x ∈ R^n, e.g., f(x1, x2) = 2x1 + x2.
Certain quadratic functions, depending on the choice of Q (the Hessian matrix): f(x) = x'Qx + w'x + c, e.g., f(x1, x2) = 2x1^2 + x2^2.
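For the quadratic form f(x) = x'Qx + w'x + c, convexity is decided by the symmetric part of Q: f is convex when Q + Q' is positive semidefinite and strictly convex when it is positive definite. A minimal sketch of that eigenvalue check (the example matrices are assumptions):

import numpy as np

def classify_quadratic(Q, tol=1e-10):
    # Classify f(x) = x'Qx + w'x + c by the eigenvalues of its Hessian Q + Q'.
    eig = np.linalg.eigvalsh(Q + Q.T)   # the symmetric part decides convexity
    if np.all(eig > tol):
        return "strictly convex"
    if np.all(eig >= -tol):
        return "convex"
    if np.all(eig <= tol):
        return "concave"
    return "neither convex nor concave"

print(classify_quadratic(np.diag([2.0, 1.0])))    # f = 2*x1^2 + x2^2 -> strictly convex
print(classify_quadratic(np.diag([2.0, -1.0])))   # indefinite -> neither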
Convexity of the objective function affects the optimization algorithm.
Convexity of the constraints affects the optimization algorithm: min f(x) subject to x ∈ S.
(Figure: steepest-descent direction when S is convex versus when S is not convex.)
Convex Program
min f(x) subject to x ∈ S, where f and S are convex.
Convexity makes optimization nice. Many practical problems are convex programs, and convex programs are used as subproblems for nonconvex programs.
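For illustration only, a tiny convex program solved with scipy.optimize.minimize; the particular objective and constraint are made-up examples, not from the slides:

import numpy as np
from scipy.optimize import minimize

# min (x1 - 1)^2 + (x2 - 2)^2  subject to  x1 + x2 <= 2  (a convex program)
objective = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
constraints = [{"type": "ineq", "fun": lambda x: 2.0 - x[0] - x[1]}]  # written in g(x) >= 0 form

result = minimize(objective, x0=np.zeros(2), constraints=constraints)
print(result.x)  # approximately [0.5, 1.5], the unique global minimizer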
Theorem: Global Solution of a Convex Program
If x* is a local minimizer of a convex programming problem, then x* is also a global minimizer. Furthermore, if the objective is strictly convex, then x* is the unique global minimizer.
(Figure: proof idea; a point y with f(y) < f(x*) would yield a contradiction.)
Proof by contradiction: Suppose x* is a local but not a global minimizer, i.e., there exists y such that f(y) < f(x*). Then for all 0 < ε < 1, f(εx* + (1−ε)y) ≤ εf(x*) + (1−ε)f(y) < εf(x*) + (1−ε)f(x*) = f(x*). Since εx* + (1−ε)y can be made arbitrarily close to x* by taking ε close to 1, this contradicts x* being a local minimizer. You try the argument for uniqueness in the strict case.
Problems with a Nonconvex Objective
min f(x) subject to x ∈ [a, b].
If f is strictly convex, the problem has a unique global minimum x*. If f is not convex, the problem can have two (or more) local minima, such as x' and x*.
(Figure: the two cases on [a, b].)
Problems with a Nonconvex Set
min f(x) subject to x ∈ [a, b] ∪ [c, d].
(Figure: the feasible set is two disjoint intervals, giving local minimizers x* and x'.)
Multivariate Calculus
For x ∈ R^n, f(x) = f(x1, x2, x3, x4, …, xn).
The gradient of f: ∇f(x) = ( ∂f(x)/∂x1, ∂f(x)/∂x2, …, ∂f(x)/∂xn )'.
The Hessian of f: ∇²f(x) is the n×n matrix whose (i, j) entry is ∂²f(x)/∂xi∂xj, i.e.,
∇²f(x) = [ ∂²f(x)/∂x1∂x1 … ∂²f(x)/∂x1∂xn ; … ; ∂²f(x)/∂xn∂x1 … ∂²f(x)/∂xn∂xn ].
For example, f(x) = e^(3x1) + x1^2 + 4x1x2 + 3x2^4.
∇f(x) = [ 3e^(3x1) + 2x1 + 4x2, 12x2^3 + 4x1 ]'.
∇²f(x) = [ 9e^(3x1) + 2, 4 ; 4, 36x2^2 ].
At x = [0, 1]': ∇f(x) = [7, 12]' and ∇²f(x) = [ 11, 4 ; 4, 36 ].
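A finite-difference check of this gradient at x = [0, 1]' (a minimal sketch using the function written above):

import numpy as np

f = lambda x: np.exp(3 * x[0]) + x[0] ** 2 + 4 * x[0] * x[1] + 3 * x[1] ** 4
grad = lambda x: np.array([3 * np.exp(3 * x[0]) + 2 * x[0] + 4 * x[1],
                           12 * x[1] ** 3 + 4 * x[0]])

def fd_grad(f, x, h=1e-6):
    # Central-difference approximation of the gradient.
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x = np.array([0.0, 1.0])
print(grad(x))        # [ 7. 12.]
print(fd_grad(f, x))  # approximately [ 7. 12.]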
Quadratic Functions
Form: f(x) = (1/2) x'Qx − b'x = (1/2) Σ_{i=1}^n Σ_{j=1}^n Q_ij x_i x_j − Σ_{j=1}^n b_j x_j, where x ∈ R^n, Q ∈ R^(n×n), b ∈ R^n.
Gradient: ∂f(x)/∂x_k = Q_kk x_k + (1/2) Σ_{i≠k} Q_ik x_i + (1/2) Σ_{j≠k} Q_kj x_j − b_k = Σ_{j=1}^n Q_kj x_j − b_k, assuming Q is symmetric.
So ∇f(x) = Qx − b and ∇²f(x) = Q.
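A quick numerical confirmation that ∇f(x) = Qx − b (a sketch; the particular Q, b, and test point are arbitrary choices):

import numpy as np

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite example
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ Q @ x - b @ x

x = np.array([0.3, -0.7])
analytic = Q @ x - b
h = 1e-6
numeric = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(analytic, numeric)   # both approximately [-0.5, -3.8]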
Taylor Series Expansion about x* (1-D Case)
Let x = x* + p. Then
f(x) = f(x* + p) = f(x*) + p f'(x*) + (1/2) p^2 f''(x*) + (1/3!) p^3 f'''(x*) + … + (1/n!) p^n f^(n)(x*) + …
Equivalently,
f(x) = f(x*) + (x − x*) f'(x*) + (1/2)(x − x*)^2 f''(x*) + (1/3!)(x − x*)^3 f'''(x*) + … + (1/n!)(x − x*)^n f^(n)(x*) + …
Taylor Series Example
Let f(x) = exp(−x); compute the Taylor series expansion about x* = 0:
f(x) = f(x*) + (x − x*)f'(x*) + (1/2)(x − x*)^2 f''(x*) + (1/3!)(x − x*)^3 f'''(x*) + … + (1/n!)(x − x*)^n f^(n)(x*) + …
= e^(−x*) − (x − x*)e^(−x*) + ((x − x*)^2 / 2) e^(−x*) − ((x − x*)^3 / 3!) e^(−x*) + … + (−1)^n ((x − x*)^n / n!) e^(−x*) + …
= 1 − x + x^2/2 − x^3/3! + … + (−1)^n x^n/n! + …
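A short check of the truncated series against exp(−x) (a sketch; the evaluation point 0.5 and the term counts are arbitrary):

import math

def taylor_exp_neg(x, n_terms=10):
    # Partial sum of 1 - x + x^2/2 - x^3/3! + ... about x* = 0.
    return sum((-x) ** k / math.factorial(k) for k in range(n_terms))

for n in (2, 4, 8):
    print(n, taylor_exp_neg(0.5, n), math.exp(-0.5))  # converges to exp(-0.5) as n grows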
First-Order Taylor Series Approximation
Let x = x* + p. Then f(x) = f(x* + p) = f(x*) + p'∇f(x*) + ‖p‖ α(x*, p), where lim_{p→0} α(x*, p) = 0.
This says that a linear approximation of a function works well locally:
f(x) = f(x* + p) ≈ f(x*) + p'∇f(x*), i.e., f(x) ≈ f(x*) + (x − x*)'∇f(x*).
(Figure: f and its tangent line at x*.)
Second-Order Taylor Series Approximation
Let x = x* + p. Then f(x) = f(x* + p) = f(x*) + p'∇f(x*) + (1/2) p'∇²f(x*)p + ‖p‖^2 α(x*, p), where lim_{p→0} α(x*, p) = 0.
This says that a quadratic approximation of a function works even better locally:
f(x) ≈ f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*).
(Figure: f and its quadratic model at x*.)
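A sketch of how the two approximation errors shrink as p → 0, using f(x) = e^x about x* = 0 as an assumed example (its first and second derivatives are again e^x):

import math

f, x_star = math.exp, 0.0
f1 = lambda p: f(x_star) + p * f(x_star)            # first-order TSA (f'(0) = 1)
f2 = lambda p: f1(p) + 0.5 * p ** 2 * f(x_star)     # second-order TSA (f''(0) = 1)

for p in (0.1, 0.01, 0.001):
    print(p, abs(f(x_star + p) - f1(p)), abs(f(x_star + p) - f2(p)))
# The first-order error shrinks like p^2, the second-order error like p^3.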
Theorem 2.1 (Taylor's Theorem, version 2)
Suppose f is continuously differentiable. Then f(x + p) = f(x) + ∇f(x + tp)'p for some t ∈ [0, 1].
If f is twice continuously differentiable, then f(x + p) = f(x) + ∇f(x)'p + (1/2) p'∇²f(x + tp)p for some t ∈ [0, 1].
Also called the Mean Value Theorem.
Taylor Series Approximation Exercise
Consider the function f(x1, x2) = x1^3 + 5 x1^2 x2 + 7 x1 x2^2 + 2 x2^2 and x* = [−2, 3]'.
Compute the gradient and Hessian.
What is the first-order TSA about x*?
What is the second-order TSA about x*?
Evaluate both TSAs at y = [−1.9, 3.2]' and compare with f(y).
Exercise
Function: f(x1, x2) = x1^3 + 5 x1^2 x2 + 7 x1 x2^2 + 2 x2^2.
Gradient: ∇f(x) = [ , ]', ∇f(x*) = [ , ]'.
Hessian: ∇²f(x) = [ , ; , ], ∇²f(x*) = [ , ; , ].
First-order TSA: g(x) = f(x*) + (x − x*)'∇f(x*) = .
Second-order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*) = .
|f(y) − g(y)| = , |f(y) − h(y)| = .
Exercise = + + + = − 3 2 2 2 ( , ) 5 7 2 function ( *) 56 f x x x x x x x x f x 1 2 1 1 2 1 2 2 ⎡ ⎤ + + 2 2 3x 10 7 x x x ′ ∇ = ∇ = − 1 1 2 2 ⎢ ⎥ ( ) ( *) [15, 52] gradient f x f x + + ⎢ ⎥ 2 ⎣ 5 14 4 ⎦ x x x x 1 1 2 2 + + ⎡ ⎤ ⎡ ⎤ 6 10 10 14 18 22 x x x x ∇ = ∇ = 1 2 1 2 2 2 ( ) ⎢ ⎥ ( *) ⎢ ⎥ Hessian f x f x + + ⎣ ⎦ ⎣ ⎦ 10 14 14 4 22 -24 x x x 1 2 1
Exercise
First-order TSA: g(x) = f(x*) + (x − x*)'∇f(x*).
Second-order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*).
|f(y) − g(y)| = |−64.811 − (−64.9)| = 0.089.
|f(y) − h(y)| = |−64.811 − (−64.85)| = 0.039.
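A minimal sketch that reproduces these numbers from the function, gradient, and Hessian of the exercise:

import numpy as np

f = lambda x: x[0]**3 + 5*x[0]**2*x[1] + 7*x[0]*x[1]**2 + 2*x[1]**2
grad = lambda x: np.array([3*x[0]**2 + 10*x[0]*x[1] + 7*x[1]**2,
                           5*x[0]**2 + 14*x[0]*x[1] + 4*x[1]])
hess = lambda x: np.array([[6*x[0] + 10*x[1], 10*x[0] + 14*x[1]],
                           [10*x[0] + 14*x[1], 14*x[0] + 4]])

xs = np.array([-2.0, 3.0])            # x*
y = np.array([-1.9, 3.2])
g = f(xs) + grad(xs) @ (y - xs)                               # first-order TSA at y
h = g + 0.5 * (y - xs) @ hess(xs) @ (y - xs)                  # second-order TSA at y
print(f(y), g, h)                     # approximately -64.811, -64.9, -64.85
print(abs(f(y) - g), abs(f(y) - h))   # approximately 0.089, 0.039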
General Optimization Algorithm
Specify some initial guess x^0.
For k = 0, 1, …:
  If x^k is optimal, then stop.
  Determine a descent direction p^k.
  Determine an improved estimate of the solution: x^(k+1) = x^k + λ_k p^k.
The last step is a one-dimensional search problem called a line search (see the sketch below).
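A minimal sketch of this loop, assuming steepest descent for the direction and a backtracking (Armijo) line search for λ_k; the tolerance, backtracking constants, and test problem are assumptions, not from the slides:

import numpy as np

def descent(f, grad, x0, tol=1e-6, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:       # "if x_k is optimal then stop"
            break
        p = -g                            # descent direction: steepest descent
        lam = 1.0
        while f(x + lam * p) > f(x) - 1e-4 * lam * (g @ g):   # backtracking line search
            lam *= 0.5
        x = x + lam * p                   # x_{k+1} = x_k + lambda_k * p_k
    return x

# Example: minimize a simple convex quadratic f(x) = x1^2 + x2^2 + x1.
print(descent(lambda x: x @ x + x[0],
              lambda x: 2 * x + np.array([1.0, 0.0]),
              [2.0, 2.0]))               # approximately [-0.5, 0.]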
Descent Directions
If the directional derivative along d is negative, i.e., ∇f(x)'d < 0, then a line search along d will lead to a decrease in the function.
(Figure: at a point with ∇f(x) = [8, 2]', both d = [0, −1]' and −∇f(x) are descent directions.)
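Checking the figure's example numerically (a sketch; the gradient [8, 2]' and direction [0, −1]' are the vectors labeled in the figure):

import numpy as np

g = np.array([8.0, 2.0])     # gradient at the current point
d = np.array([0.0, -1.0])    # candidate direction
print(g @ d)                 # -2.0 < 0, so d is a descent direction
print(g @ -g)                # -68.0 < 0, the negative gradient is always a descent direction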