Optimality Conditions
Fabio Schoen, 2008
http://gol.dsi.unifi.it/users/schoen
Optimality Conditions: descent directions

Let $S \subseteq \mathbb{R}^n$ be a convex set and consider the problem $\min_{x \in S} f(x)$ where $f : S \to \mathbb{R}$. Let $x_1, x_2 \in S$ and $d = x_2 - x_1$: $d$ is a feasible direction. If there exists $\bar\epsilon > 0$ such that $f(x_1 + \epsilon d) < f(x_1)$ for all $\epsilon \in (0, \bar\epsilon)$, then $d$ is called a descent direction at $x_1$.

Elementary necessary optimality condition: if $x^\star$ is a local optimum, no descent direction may exist at $x^\star$.
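As a small numerical sketch (with an assumed toy objective $f(x) = \|x\|^2$ on $S = \mathbb{R}^2$, where every direction is feasible), the descent-direction definition can be checked by sampling small step sizes $\epsilon \in (0, \bar\epsilon)$:

```python
import numpy as np

# Assumed toy objective: f(x) = ||x||^2, minimized at the origin.
def f(x):
    return float(x @ x)

def is_descent_direction(f, x, d, eps_bar=1e-3, samples=20):
    # Check f(x + eps*d) < f(x) for sampled eps in (0, eps_bar].
    return all(f(x + eps * d) < f(x)
               for eps in np.linspace(eps_bar / samples, eps_bar, samples))

x1 = np.array([1.0, 1.0])
d_descent = np.array([-1.0, -1.0])   # points toward the minimizer 0
d_ascent = np.array([1.0, 1.0])      # points away from it

print(is_descent_direction(f, x1, d_descent))  # True
print(is_descent_direction(f, x1, d_ascent))   # False
```

At the minimizer $x = 0$ no direction passes this test, in line with the necessary condition above.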
Optimality Conditions for Convex Sets

If $x^\star \in S$ is a local optimum for $f$ and there exists a neighborhood $U(x^\star)$ such that $f \in C^1(U(x^\star))$, then
$$d^T \nabla f(x^\star) \ge 0 \qquad \forall d \text{ feasible direction.}$$
Proof

Taylor expansion: $f(x^\star + \epsilon d) = f(x^\star) + \epsilon d^T \nabla f(x^\star) + o(\epsilon)$. $d$ cannot be a descent direction, so, if $\epsilon$ is sufficiently small, $f(x^\star + \epsilon d) \ge f(x^\star)$. Thus $\epsilon d^T \nabla f(x^\star) + o(\epsilon) \ge 0$ and, dividing by $\epsilon$,
$$d^T \nabla f(x^\star) + \frac{o(\epsilon)}{\epsilon} \ge 0.$$
Letting $\epsilon \downarrow 0$ the proof is complete.
Optimality Conditions: tangent cone

General case:
$$\min f(x) \quad \text{s.t. } g_i(x) \le 0,\ i = 1, \dots, m,\quad x \in X \ (X \text{ an open set}).$$
Let $S = \{x \in X : g_i(x) \le 0,\ i = 1, \dots, m\}$. Tangent cone to $S$ at $\bar x$:
$$T(\bar x) = \Big\{ d \in \mathbb{R}^n : \frac{d}{\|d\|} = \lim_{x_k \to \bar x} \frac{x_k - \bar x}{\|x_k - \bar x\|},\ x_k \in S \Big\}.$$
Some examples

- $S = \mathbb{R}^n \Rightarrow T(x) = \mathbb{R}^n$ for all $x$.
- $S = \{x : Ax = b\} \Rightarrow T(x) = \{d : Ad = 0\}$.
- $S = \{x : Ax \le b\}$; let $I$ be the set of constraints active at $\bar x$: $a_i^T \bar x = b_i$ for $i \in I$, $a_i^T \bar x < b_i$ for $i \notin I$.
Let $d = \lim_k (x_k - \bar x)/\|x_k - \bar x\|$. Then for $i \in I$:
$$a_i^T d = \lim_k a_i^T (x_k - \bar x)/\|x_k - \bar x\| = \lim_k (a_i^T x_k - b_i)/\|x_k - \bar x\| \le 0.$$
Thus if $d \in T(\bar x)$ then $a_i^T d \le 0$ for $i \in I$.
Vice versa, let $x_k = \bar x + \alpha_k d$. If $a_i^T d \le 0$ for $i \in I$, then
$$a_i^T x_k = a_i^T(\bar x + \alpha_k d) = b_i + \alpha_k a_i^T d \le b_i \qquad i \in I,$$
$$a_i^T x_k = a_i^T(\bar x + \alpha_k d) = a_i^T \bar x + \alpha_k a_i^T d < b_i \qquad i \notin I, \text{ if } \alpha_k \text{ small enough.}$$
Thus $T(\bar x) = \{d : a_i^T d \le 0\ \forall i \in I\}$.
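The active-set characterization above is easy to test numerically. A minimal sketch, with an assumed toy polyhedron $\{x : x_1 \le 1,\ x_2 \le 1,\ -x_1 - x_2 \le 0\}$ and $\bar x = (1, 1)$ (first two constraints active):

```python
import numpy as np

# Assumed toy polyhedron S = {x : Ax <= b}.
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.array([1.0, 1.0, 0.0])
x_bar = np.array([1.0, 1.0])          # rows 0 and 1 are active here

active = np.isclose(A @ x_bar, b)     # detect active constraints

def in_tangent_cone(d, tol=1e-9):
    # d is in T(x_bar) iff a_i^T d <= 0 for every active i.
    return bool(np.all(A[active] @ d <= tol))

print(in_tangent_cone(np.array([-1.0, -1.0])))  # True: moves into S
print(in_tangent_cone(np.array([1.0, 0.0])))    # False: leaves S through x1 <= 1
```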
Example

Let $S = \{(x, y) \in \mathbb{R}^2 : x^2 - y = 0\}$ (parabola). Tangent cone at $(0,0)$? Let $(x_k, y_k) \to (0,0)$, i.e. $x_k \to 0$, $y_k = x_k^2$:
$$\|(x_k, y_k) - (0,0)\| = \sqrt{x_k^2 + x_k^4} = |x_k| \sqrt{1 + x_k^2}$$
and
$$\lim_{x_k \to 0^+} \frac{x_k}{|x_k|\sqrt{1 + x_k^2}} = 1, \qquad \lim_{x_k \to 0^+} \frac{y_k}{|x_k|\sqrt{1 + x_k^2}} = 0,$$
$$\lim_{x_k \to 0^-} \frac{x_k}{|x_k|\sqrt{1 + x_k^2}} = -1, \qquad \lim_{x_k \to 0^-} \frac{y_k}{|x_k|\sqrt{1 + x_k^2}} = 0,$$
thus $T(0,0) = \{(-1, 0), (1, 0)\}$.
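The two limit directions can be recovered numerically by normalizing points of the parabola close to the origin, matching the computation above:

```python
import numpy as np

# Normalize a point (x, x^2) on S = {(x, y) : y = x^2}.
def limit_direction(xk):
    p = np.array([xk, xk**2])
    return p / np.linalg.norm(p)

# Approach the origin from the right and from the left:
d_plus = limit_direction(1e-8)
d_minus = limit_direction(-1e-8)
print(np.round(d_plus, 6))   # ~ [ 1.  0.]
print(np.round(d_minus, 6))  # ~ [-1.  0.]
```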
Descent direction

$d \in \mathbb{R}^n$ is a feasible direction at $\bar x \in S$ if $\exists \bar\alpha > 0 : \bar x + \alpha d \in S\ \forall \alpha \in [0, \bar\alpha)$. $d$ feasible $\Rightarrow d \in T(\bar x)$, but in general the converse is false. If $f(\bar x + \alpha d) \le f(\bar x)\ \forall \alpha \in (0, \bar\alpha)$, $d$ is a descent direction.
First order necessary optimality condition

Let $\bar x \in S \subseteq \mathbb{R}^n$ be a local optimum for $\min_{x \in S} f(x)$; let $f \in C^1(U(\bar x))$. Then
$$d^T \nabla f(\bar x) \ge 0 \qquad \forall d \in T(\bar x).$$

Proof. Let $d = \lim_k (x_k - \bar x)/\|x_k - \bar x\|$. Taylor expansion:
$$f(x_k) = f(\bar x) + \nabla^T f(\bar x)(x_k - \bar x) + o(\|x_k - \bar x\|) = f(\bar x) + \nabla^T f(\bar x)(x_k - \bar x) + \|x_k - \bar x\|\, o(1).$$
$\bar x$ local optimum $\Rightarrow \exists U(\bar x) : f(x) \ge f(\bar x)\ \forall x \in U(\bar x) \cap S$.
If $k$ is large enough, $x_k \in U(\bar x)$, so $f(x_k) - f(\bar x) \ge 0$, thus
$$\nabla^T f(\bar x)(x_k - \bar x) + \|x_k - \bar x\|\, o(1) \ge 0.$$
Dividing by $\|x_k - \bar x\|$:
$$\nabla^T f(\bar x)(x_k - \bar x)/\|x_k - \bar x\| + o(1) \ge 0$$
and in the limit $\nabla^T f(\bar x)\, d \ge 0$.
Examples: unconstrained problems

Every $d \in \mathbb{R}^n$ belongs to the tangent cone, so at a local optimum
$$\nabla^T f(\bar x)\, d \ge 0 \qquad \forall d \in \mathbb{R}^n.$$
Choosing $d = e_i$ and $d = -e_i$ we get $\nabla f(\bar x) = 0$.

NB: the same holds if $\bar x$ is a local minimum in the relative interior of the feasible region.
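A quick sketch with an assumed strictly convex quadratic $f(x) = \tfrac12 x^T Q x - c^T x$: the unconstrained minimizer solves $Qx = c$, and the gradient indeed vanishes there:

```python
import numpy as np

# Assumed toy quadratic: f(x) = 0.5 x^T Q x - c^T x, Q positive definite.
Q = np.array([[2.0, 0.0],
              [0.0, 4.0]])
c = np.array([2.0, 4.0])

x_star = np.linalg.solve(Q, c)   # stationarity: Q x = c
grad = Q @ x_star - c            # gradient of f at x_star
print(x_star)  # [1. 1.]
print(grad)    # [0. 0.]
```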
Linear equality constraints

$$\min f(x) \quad \text{s.t. } Ax = b$$
Tangent cone: $\{d : Ad = 0\}$. Necessary condition:
$$\nabla^T f(\bar x)\, d \ge 0 \qquad \forall d : Ad = 0.$$
Equivalent statement:
$$\min_d \nabla^T f(\bar x)\, d = 0 \quad \text{s.t. } Ad = 0$$
(a linear program).
Linear equality constraints

From LP duality:
$$\max 0^T \lambda = 0 \quad \text{s.t. } A^T \lambda = \nabla f(\bar x).$$
Thus at a local minimum point there exist Lagrange multipliers: $\exists \lambda : A^T \lambda = \nabla f(\bar x)$.
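The multiplier condition can be verified on an assumed toy problem, $\min \|x\|^2$ s.t. $x_1 + x_2 = 1$: the optimum is the minimum-norm solution of $Ax = b$, and its gradient lies in the range of $A^T$:

```python
import numpy as np

# Assumed toy problem: min ||x||^2  s.t.  A x = b.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# Minimum-norm solution of Ax = b:
x_star = A.T @ np.linalg.solve(A @ A.T, b)   # [0.5, 0.5]
grad = 2 * x_star                            # gradient of ||x||^2 at x_star

# Lagrange multiplier: solve A^T lam = grad (exact here, via least squares):
lam, *_ = np.linalg.lstsq(A.T, grad, rcond=None)
print(x_star)                        # [0.5 0.5]
print(np.allclose(A.T @ lam, grad))  # True: grad f = A^T lam
```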
Linear inequalities

$$\min f(x) \quad \text{s.t. } Ax \le b$$
Tangent cone at a local minimum $\bar x$: $\{d \in \mathbb{R}^n : a_i^T d \le 0\ \forall i \in I(\bar x)\}$. Let $A_I$ be the rows of $A$ associated with the constraints active at $\bar x$. Then
$$\min_d \nabla^T f(\bar x)\, d = 0 \quad \text{s.t. } A_I d \le 0.$$
Linear inequalities

From LP duality:
$$\max 0^T \lambda = 0 \quad \text{s.t. } A_I^T \lambda = \nabla f(\bar x),\ \lambda \le 0.$$
Thus, at a local optimum, the gradient is a non-positive linear combination of the coefficient rows of the active constraints.
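A minimal numerical sketch, with an assumed toy problem $\min (x_1 - 2)^2 + (x_2 - 2)^2$ s.t. $x \le (1, 1)$, where both constraints are active at the optimum $\bar x = (1, 1)$:

```python
import numpy as np

# Assumed toy problem: min (x1-2)^2 + (x2-2)^2  s.t.  x <= 1 (A = I, b = 1).
A_I = np.eye(2)                     # rows of the active constraints at x_bar
x_bar = np.array([1.0, 1.0])        # the constrained optimum
grad = 2 * (x_bar - 2)              # gradient of f at x_bar: [-2, -2]

lam = np.linalg.solve(A_I.T, grad)  # solve A_I^T lam = grad f
print(lam)                          # [-2. -2.]
print(bool(np.all(lam <= 0)))       # True: a non-positive combination
```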
Farkas’ Lemma

Let $A$ be a matrix in $\mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. One and only one of the following two systems is non-empty:
$$A^T y \le 0,\ b^T y > 0 \qquad \text{and} \qquad Ax = b,\ x \ge 0.$$
Geometrical interpretation

[Figure: the cone $\{z : \exists x \ge 0,\ z = Ax\}$ generated by the columns $a_1, a_2$, the vector $b$, and the cone $\{y : A^T y \le 0\}$.]
Proof

1) If $\exists x \ge 0 : Ax = b$, then $b^T y = x^T A^T y$. Thus if $A^T y \le 0$ then $b^T y \le 0$.

2) Premise, separating hyperplane theorem: let $C$ and $D$ be two nonempty convex sets with $C \cap D = \emptyset$. Then there exist $a \ne 0$ and $b$ such that
$$a^T x \le b \quad \forall x \in C, \qquad a^T x \ge b \quad \forall x \in D.$$
If $C$ is a point and $D$ is a closed convex set, the separation is strict, i.e.
$$a^T C < b, \qquad a^T x > b \quad \forall x \in D.$$
Farkas’ Lemma (proof)

2) Let $\{x : Ax = b,\ x \ge 0\} = \emptyset$. Let $S = \{y \in \mathbb{R}^m : \exists x \ge 0,\ Ax = y\}$. $S$ is closed, convex and $b \notin S$. From the separating hyperplane theorem, $\exists \alpha \in \mathbb{R}^m$, $\alpha \ne 0$, and $\beta \in \mathbb{R}$:
$$\alpha^T y \le \beta \quad \forall y \in S, \qquad \alpha^T b > \beta.$$
$0 \in S \Rightarrow \beta \ge 0 \Rightarrow \alpha^T b > 0$; $\alpha^T A x \le \beta$ for all $x \ge 0$, which is possible iff $\alpha^T A \le 0$. Letting $y = \alpha$ we obtain a solution of
$$A^T y \le 0, \qquad b^T y > 0.$$
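A small sketch of the alternative, under an assumed instance $A = I$, $b = (1, 1)$: the primal system $Ax = b$, $x \ge 0$ has the obvious solution $x = b$, and (consistently with part 1 of the proof) random candidates $y$ with $A^T y \le 0$ never certify $b^T y > 0$:

```python
import numpy as np

# Assumed instance: A = I, b = (1, 1).
rng = np.random.default_rng(0)
A = np.eye(2)
b = np.array([1.0, 1.0])

# The system Ax = b, x >= 0 is solved by x = b:
x = b.copy()
feasible_primal = bool(np.allclose(A @ x, b) and np.all(x >= 0))

# Sample many y with A^T y <= 0 (here simply y <= 0) and check b^T y:
Y = -np.abs(rng.normal(size=(1000, 2)))
no_certificate = bool(np.all(Y @ b <= 0))

print(feasible_primal)  # True
print(no_certificate)   # True: no sampled y gives b^T y > 0
```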
First order feasible variations cone

$$G(\bar x) = \{d \in \mathbb{R}^n : \nabla^T g_i(\bar x)\, d \le 0,\ i \in I\}$$
First order variations

$G(\bar x) \supseteq T(\bar x)$. In fact, if $\{x_k\} \subseteq S$ and
$$d = \lim_k \frac{x_k - \bar x}{\|x_k - \bar x\|}$$
then $g_i(x_k) \le 0$, i.e. $g\big(\bar x + \lim_k (x_k - \bar x)\big) \le 0$.
$$g\Big(\bar x + \lim_k \|x_k - \bar x\| \frac{x_k - \bar x}{\|x_k - \bar x\|}\Big) \le 0 \quad\Rightarrow\quad g\Big(\bar x + \lim_k \|x_k - \bar x\|\, d\Big) \le 0.$$
Let $\alpha_k = \|x_k - \bar x\|$; for $\alpha_k \approx 0$:
$$g(\bar x + \alpha_k d) \le 0.$$
$$g_i(\bar x + \alpha_k d) = g_i(\bar x) + \alpha_k \nabla^T g_i(\bar x)\, d + o(\alpha_k)$$
where $\alpha_k > 0$ and $d$ belongs to the tangent cone $T(\bar x)$. If the $i$-th constraint is active, then
$$g_i(\bar x + \alpha_k d) = \alpha_k \nabla^T g_i(\bar x)\, d + o(\alpha_k) \le 0,$$
$$g_i(\bar x + \alpha_k d)/\alpha_k = \nabla^T g_i(\bar x)\, d + o(\alpha_k)/\alpha_k \le 0.$$
Letting $\alpha_k \to 0$ the result is obtained.
Example: $G(\bar x) \ne T(\bar x)$

$$-x^3 + y \le 0, \qquad -y \le 0.$$
At $\bar x = (0,0)$ the gradients of the two (active) constraints are $(0, 1)$ and $(0, -1)$, so $G(\bar x) = \{d : d_2 = 0\}$, while feasibility requires $0 \le y \le x^3$, hence $x \ge 0$, and $T(\bar x)$ contains only the directions with $d_1 \ge 0$, $d_2 = 0$.
KKT necessary conditions (Karush–Kuhn–Tucker)

Let $\bar x \in X \subseteq \mathbb{R}^n$, $X \ne \emptyset$, be a local optimum for
$$\min f(x) \quad \text{s.t. } g_i(x) \le 0,\ i = 1, \dots, m,\quad x \in X.$$
$I$: indices of the constraints active at $\bar x$. If:

1. $f(x), g_i(x) \in C^1(\bar x)$ for $i \in I$;
2. the “constraint qualification” condition $T(\bar x) = G(\bar x)$ holds at $\bar x$;

then there exist Lagrange multipliers $\lambda_i \ge 0$, $i \in I$:
$$\nabla f(\bar x) + \sum_{i \in I} \lambda_i \nabla g_i(\bar x) = 0.$$
Proof

$\bar x$ local optimum $\Rightarrow$ if $d \in T(\bar x)$ then $d^T \nabla f(\bar x) \ge 0$. But $d \in T(\bar x) \Rightarrow d^T \nabla g_i(\bar x) \le 0$, $i \in I$. Thus it is impossible that
$$-\nabla^T f(\bar x)\, d > 0, \qquad \nabla^T g_i(\bar x)\, d \le 0 \quad i \in I.$$
From Farkas’ Lemma there exists a solution of:
$$\sum_{i \in I} \lambda_i \nabla g_i(\bar x) = -\nabla f(\bar x), \qquad \lambda_i \ge 0,\ i \in I.$$
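The KKT system can be verified numerically on an assumed toy problem, $\min (x_1 - 1)^2 + (x_2 - 1)^2$ s.t. $x_1 + x_2 - 1 \le 0$, whose optimum is $\bar x = (0.5, 0.5)$ with the constraint active:

```python
import numpy as np

# Assumed toy problem: min (x1-1)^2 + (x2-1)^2  s.t.  x1 + x2 - 1 <= 0.
x_bar = np.array([0.5, 0.5])      # constrained optimum, g active here
grad_f = 2 * (x_bar - 1)          # [-1, -1]
grad_g = np.array([1.0, 1.0])     # gradient of the active constraint

# Solve grad_f + lam * grad_g = 0 for the single multiplier:
lam = -(grad_g @ grad_f) / (grad_g @ grad_g)   # = 1.0
residual = grad_f + lam * grad_g
print(lam)                        # 1.0  (lam >= 0, as KKT requires)
print(np.allclose(residual, 0))   # True: stationarity holds
```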
Constraint qualifications: examples

- polyhedra: $X = \mathbb{R}^n$ and the $g_i(x)$ are affine functions: $Ax \le b$;
- linear independence: $X$ an open set, $g_i(x)$, $i \notin I$, continuous at $\bar x$, and $\{\nabla g_i(\bar x)\}$, $i \in I$, linearly independent;
- Slater condition: $X$ an open set, $g_i(x)$, $i \in I$, convex functions differentiable at $\bar x$, $g_i(x)$, $i \notin I$, continuous at $\bar x$, and $\exists \hat x \in X$ strictly feasible: $g_i(\hat x) < 0$, $i \in I$.
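The linear-independence qualification is straightforward to test. A sketch on the earlier example $g_1 = -x^3 + y \le 0$, $g_2 = -y \le 0$ (where $G(\bar x) \ne T(\bar x)$ at the origin):

```python
import numpy as np

# Active-constraint gradients at x_bar = (0, 0) for
# g1(x, y) = -x^3 + y and g2(x, y) = -y.
x_bar = np.array([0.0, 0.0])
grad_g1 = np.array([-3 * x_bar[0]**2, 1.0])   # = (0, 1)
grad_g2 = np.array([0.0, -1.0])

G = np.vstack([grad_g1, grad_g2])
rank = np.linalg.matrix_rank(G)
print(rank)  # 1: the two active gradients are linearly dependent,
             # so the linear-independence qualification fails at (0, 0)
```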