Core idea

Goal: solve r(w) = 0... how?

Key idea: guess w, then iterate on the linear model:

    r(w + ∆w) ≈ r(w) + ∇r(w)⊤ ∆w = 0

Algorithm: Newton method
    Input: w, tol
    while ‖r(w)‖∞ ≥ tol do
        Compute r(w) and ∇r(w)
        Compute the Newton direction: ∇r(w)⊤ ∆w = −r(w)
        Newton step, t ∈ ]0, 1]: w ← w + t ∆w
    return w

[Figure: successive Newton iterates on a scalar residual r(w).]

With t = 1 this is a full-step Newton iteration; reduced steps (t < 1) are often needed.
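The algorithm translates almost line-for-line into code. Below is a minimal sketch in Python/NumPy (not from the slides; the test residual r and its Jacobian are illustrative choices):

```python
import numpy as np

def newton(r, jac, w, tol=1e-10, max_iter=50):
    """Full-step Newton iteration for r(w) = 0.

    jac(w) returns the Jacobian of r, i.e. the matrix grad r(w)^T
    in the slides' notation.
    """
    for _ in range(max_iter):
        rw = r(w)
        if np.linalg.norm(rw, np.inf) < tol:
            break
        dw = np.linalg.solve(jac(w), -rw)  # Newton direction: grad r(w)^T dw = -r(w)
        w = w + dw                         # full Newton step (t = 1)
    return w

# Illustrative test: intersect the unit circle with the line w0 = w1
r   = lambda w: np.array([w[0]**2 + w[1]**2 - 1.0, w[0] - w[1]])
jac = lambda w: np.array([[2.0*w[0], 2.0*w[1]], [1.0, -1.0]])
print(newton(r, jac, np.array([2.0, 0.5])))  # ~ [0.7071, 0.7071]
```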
Why reduced steps?

Newton step with t ∈ ]0, 1]:

    ∇r(w)⊤ ∆w = −r(w),    w ← w + t ∆w

[Figure: Newton iterates on a scalar residual r(w); with t = 1 the iterates oscillate around the root with growing amplitude, with t = 0.8 they settle onto it.]

The full-step Newton iteration can be unstable, while the reduced-step Newton iteration is stable.
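A scalar experiment makes this visible. The residual r(w) = arctan(w) is a standard textbook choice for the effect (an illustrative stand-in, not the function plotted above): full steps from w = 1.5 overshoot the root w* = 0 with growing amplitude, while t = 0.8 settles down.

```python
import numpy as np

def damped_newton_arctan(w, t, n_iter=8):
    """Newton iteration on r(w) = arctan(w) with a fixed step size t."""
    trace = [w]
    for _ in range(n_iter):
        dw = -np.arctan(w) * (1.0 + w**2)  # solve r'(w) dw = -r(w), with r'(w) = 1/(1+w^2)
        w = w + t * dw
        trace.append(w)
    return trace

print(damped_newton_arctan(1.5, t=1.0))  # full steps: iterates oscillate and blow up
print(damped_newton_arctan(1.5, t=0.8))  # reduced steps: converges to w* = 0
```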
Does Newton always work?

Is the Newton step ∆w always providing a direction "improving" r(w)? I.e. is there always a t > 0 s.t. ‖r(w + t∆w)‖ < ‖r(w)‖?

Yes... but.

Proof: ‖r(w + t∆w)‖ < ‖r(w)‖ holds for some t > 0 if

    d/dt ‖r(w + t∆w)‖² |_{t=0} < 0

with ‖r(w)‖² differentiable, i.e. if

    2 r(w)⊤ d/dt r(w + t∆w) |_{t=0} < 0.

We have

    d/dt r(w + t∆w) |_{t=0} = ∇r(w)⊤ ∆w = −∇r(w)⊤ ∇r(w)^{−⊤} r(w) = −r(w).

Then

    d/dt ‖r(w + t∆w)‖² |_{t=0} = −2 ‖r(w)‖² < 0.
How to select the step size t ∈ ]0, 1]? Globalization...

- Line search: reduce t until some criterion of progress on ‖r‖ is met (a sketch follows below).
- Trust region: confine the step ∆w within a region where ∇r(w) provides a good model of r(w).
- Filter techniques: monitor progress on specific components of r(w) separately.
- ...

... ensures that progress is made in one way or another. Note: most of these techniques are specific to optimization.
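As a concrete instance of the line-search idea, here is a minimal sketch of a backtracking globalized Newton method in Python/NumPy. The acceptance test ‖r(w + t∆w)‖ ≤ (1 − αt)‖r(w)‖ is one common choice, motivated by the proof above (the directional derivative of ‖r‖² at t = 0 is −2‖r‖²); production solvers use more refined criteria.

```python
import numpy as np

def newton_linesearch(r, jac, w, tol=1e-10, max_iter=100, alpha=1e-4):
    """Newton's method globalized by backtracking on the residual norm."""
    for _ in range(max_iter):
        rw = r(w)
        if np.linalg.norm(rw, np.inf) < tol:
            break
        dw = np.linalg.solve(jac(w), -rw)      # Newton direction
        t, nrm = 1.0, np.linalg.norm(rw)
        # halve t until a sufficient decrease of ||r|| is achieved
        while np.linalg.norm(r(w + t*dw)) > (1.0 - alpha*t) * nrm and t > 1e-12:
            t *= 0.5
        w = w + t * dw                         # reduced Newton step
    return w
```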
But still, Newton can fail...

Solve r(w) = 0.

[Figure: a residual r(w) whose slope vanishes away from the root; the iterates stall where the tangent becomes horizontal.]

Newton stops with r(w) ≠ 0 and ∇r(w) singular, i.e. the Newton direction ∆w given by

    ∇r(w)⊤ ∆w = −r(w)

is undefined...

This is a common failure mode for Newton-based solvers when tackling very nonlinear r and starting from a poor initial guess!
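This failure is easy to reproduce. The cubic below (an illustrative choice, not the residual plotted above) has a real root near w ≈ −2.104, but its derivative vanishes at w = ±1; starting from w = 0, the first Newton step lands exactly on the singular point w = 1, where r(w) ≠ 0 and no Newton direction exists:

```python
r  = lambda w: w**3 - 3.0*w + 3.0   # one real root, near w = -2.104
dr = lambda w: 3.0*w**2 - 3.0       # vanishes at w = +/- 1

w = 0.0                             # poor initial guess
for k in range(5):
    if abs(dr(w)) < 1e-12:
        print(f"iter {k}: r(w) = {r(w):.3f} != 0, but the Jacobian is singular")
        break
    w = w - r(w) / dr(w)            # scalar Newton step
```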
Convergence of full-step Newton methods

Newton method:

    ∇r(w)⊤ ∆w = −r(w),    w ← w + ∆w

yields the iteration, k = 0, 1, ...:

    w_{k+1} ← w_k − ∇r(w_k)^{−⊤} r(w_k)

Newton-type method (Jacobian approximation M_k ≈ ∇r(w_k)⊤):

    M ∆w = −r(w),    w ← w + ∆w

yields the iteration, k = 0, 1, ...:

    w_{k+1} ← w_k − M_k^{−1} r(w_k)

Theorem: assume

- Nonlinearity of r: ‖M_k^{−1} (∇r(w)⊤ − ∇r(w*)⊤)‖ ≤ ω ‖w − w*‖, for w ∈ [w_k, w*]
- Jacobian approximation error: ‖M_k^{−1} (∇r(w_k)⊤ − M_k)‖ ≤ κ_k < 1
- Good initial guess: ‖w_0 − w*‖ ≤ (2/ω)(1 − max{κ_k})

Then w_k → w* with the following linear-quadratic contraction in each iteration:

    ‖w_{k+1} − w*‖ ≤ (κ_k + (ω/2) ‖w_k − w*‖) ‖w_k − w*‖.

What about reduced steps? Slow convergence while t < 1 (damped phase); when full steps become feasible, fast convergence to the solution.
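The linear-quadratic contraction can be observed numerically. The sketch below (an illustrative setup) solves r(w) = w² − 2 with the exact Newton iteration and with a Newton-type iteration that freezes the Jacobian at the initial guess (M_k = r'(w_0) for all k, which keeps κ_k < 1 near the solution): the first error sequence contracts quadratically, the second only linearly.

```python
import numpy as np

r, dr  = lambda w: w**2 - 2.0, lambda w: 2.0*w
w_star = np.sqrt(2.0)

w_exact, w_fixed = 2.0, 2.0
M = dr(2.0)                                   # frozen Jacobian approximation
for k in range(6):
    print(f"k={k}: exact err = {abs(w_exact - w_star):.2e}, "
          f"fixed-M err = {abs(w_fixed - w_star):.2e}")
    w_exact -= r(w_exact) / dr(w_exact)       # quadratic contraction
    w_fixed -= r(w_fixed) / M                 # linear contraction
```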
Newton methods - Short Survival Guide

Exact Newton method:

    ∇r(w)⊤ ∆w = −r(w),    w ← w + t ∆w

Newton-type method:

    M ∆w = −r(w),    w ← w + t ∆w

- The exact Newton direction ∆w improves r for a sufficiently small step size t ∈ ]0, 1].
- An inexact Newton direction ∆w improves r for a sufficiently small step size t ∈ ]0, 1] if M > 0.
- Exact full (t = 1) Newton steps converge quadratically if close enough to the solution.
- Inexact full (t = 1) Newton steps converge linearly if close enough to the solution and if the Jacobian approximation is "sufficiently good".
- The Newton iteration fails if ∇r becomes singular.
- Newton methods with globalization converge in two phases: a damped (slow) phase when reduced steps (t < 1) are needed, then quadratic/linear convergence when full steps are possible.
Outline

1. KKT conditions - Quick Reminder
2. The Newton method
3. Newton on the KKT conditions
4. Sequential Quadratic Programming
5. Hessian approximation
6. Maratos effect
Core idea

A vast majority of solvers try to find a KKT point w, µ, λ, i.e.:

    Primal feasibility:          g(w) = 0,  h(w) ≤ 0
    Dual feasibility:            ∇_w L(w, µ, λ) = 0,  µ ≥ 0
    Complementarity slackness:   µ_i h_i(w) = 0,  i = 1, ...

where L = Φ(w) + λ⊤ g(w) + µ⊤ h(w).

Let's consider for now equality-constrained problems, i.e. find w, λ s.t.:

    ∇_w L(w, λ) = 0,    g(w) = 0

Idea: apply the Newton method to the KKT conditions, i.e. solve

    r(w, λ) = [∇_w L(w, λ); g(w)] = 0

... by iterating

    ∇r(w, λ)⊤ [∆w; ∆λ] = −r(w, λ)
Newton method on the KKT conditions

KKT conditions:

    r(w, λ) = [∇_w L(w, λ); g(w)] = 0

Newton direction:

    ∇r(w, λ)⊤ [∆w; ∆λ] = −r(w, λ)

given by:

    ∇²_w L(w, λ) ∆w + ∇_{w,λ} L(w, λ) ∆λ = −∇_w L(w, λ)
    ∇g(w)⊤ ∆w = −g(w)

Using ∇_w L(w, λ) = ∇Φ(w) + ∇g(w) λ and ∇_{w,λ} L(w, λ) = ∇g(w), the first equation becomes

    ∇²_w L(w, λ) ∆w + ∇g(w) (λ + ∆λ) = −∇Φ(w)

The Newton direction on the KKT conditions:

    [ H(w, λ)    ∇g(w) ] [ ∆w ]     [ ∇Φ(w) ]
    [ ∇g(w)⊤       0   ] [ λ⁺ ] = − [ g(w)  ]

where the matrix on the left is the KKT matrix (symmetric indefinite) and H(w, λ) = ∇²_w L(w, λ) is the Hessian of the problem.

Notes:
- The update of the dual variable is λ⁺ = λ + ∆λ.
- ∇_w L(w, λ) is not needed for computing the Newton step.
- The updated dual variables λ⁺ are readily provided!
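Assembling and solving this linear system is the core of each iteration. A minimal sketch in Python/NumPy (the function names and calling convention are ours, not from the slides):

```python
import numpy as np

def newton_kkt_step(w, lam, grad_phi, hess_lag, g, jac_g):
    """One full Newton step on the KKT conditions of an equality-constrained NLP.

    Solves the symmetric indefinite KKT system
        [ H(w, lam)    grad_g(w) ] [ dw   ]     [ grad_phi(w) ]
        [ grad_g(w)^T      0     ] [ lam+ ] = - [ g(w)        ]
    and returns the new primal point and the updated multipliers lam+.
    """
    H  = hess_lag(w, lam)                     # Hessian of the Lagrangian
    Jg = jac_g(w)                             # grad g(w), shape (n, m)
    n, m = Jg.shape
    KKT = np.block([[H, Jg], [Jg.T, np.zeros((m, m))]])
    rhs = -np.concatenate([grad_phi(w), g(w)])
    sol = np.linalg.solve(KKT, rhs)
    return w + sol[:n], sol[n:]
```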
Newton Iteration for Optimization - Example

    min_w  Φ(w) = ½ w⊤ [2 1; 1 4] w + [1 0] w
    s.t.   g(w) = w⊤w − 1 = 0

[Figure: contour lines of Φ over (w₁, w₂) with the unit circle g(w) = 0; the Newton iterates converge to a KKT point on the circle.]

Iterate:

    [ H    ∇g ] [ ∆w ]     [ ∇Φ ]
    [ ∇g⊤   0 ] [ λ⁺ ] = − [ g  ]

with:

    ∇g(w) = 2w = [2w₁; 2w₂]
    L(w, λ) = Φ(w) + λ g(w)
    ∇_w L(w, λ) = [2 1; 1 4] w + [1; 0] + 2λw
    H(w, λ) = [2 + 2λ   1; 1   4 + 2λ]
    ∇Φ(w) = [2w₁ + w₂ + 1; w₁ + 4w₂]

Algorithm: Newton method
    Input: guess w, λ (e.g. λ = 0), step t = 1
    while ‖∇L‖ or ‖g‖ ≥ tol do
        Compute H(w, λ), ∇g(w), ∇Φ(w), g(w)
        Compute the Newton direction:
            [ H    ∇g ] [ ∆w ]     [ ∇Φ ]
            [ ∇g⊤   0 ] [ λ⁺ ] = − [ g  ]
            ∆λ = λ⁺ − λ
        Newton step, t ∈ ]0, 1]: w ← w + t ∆w,  λ ← λ + t ∆λ
    return w, λ
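For completeness, here is a hedged Python/NumPy transcription of this example, reusing the `newton_kkt_step` sketch from above. The initial guess and the use of full steps (t = 1) are our choices; a poorer guess may require reduced steps, as discussed earlier.

```python
import numpy as np

# Phi(w) = 1/2 w^T Q w + q^T w,  g(w) = w^T w - 1  (data from the slide)
Q = np.array([[2.0, 1.0], [1.0, 4.0]])
q = np.array([1.0, 0.0])

grad_phi = lambda w: Q @ w + q
g        = lambda w: np.array([w @ w - 1.0])
jac_g    = lambda w: (2.0 * w).reshape(-1, 1)           # grad g(w)
hess_lag = lambda w, lam: Q + 2.0 * lam[0] * np.eye(2)  # H(w, lam)

w, lam = np.array([1.0, 1.0]), np.zeros(1)              # guess lambda = 0
for k in range(20):
    grad_L = grad_phi(w) + jac_g(w) @ lam
    if max(np.linalg.norm(grad_L, np.inf), np.linalg.norm(g(w), np.inf)) < 1e-10:
        break
    w, lam = newton_kkt_step(w, lam, grad_phi, hess_lag, g, jac_g)  # full step

print(w, lam)  # a KKT point on the unit circle g(w) = 0
```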