
Nonlinear Optimization for Optimal Control, Part 2
Pieter Abbeel, UC Berkeley EECS

Outline
• From linear to nonlinear
• Model-predictive control (MPC)
• POMDPs

From Linear to Nonlinear

We know how to solve the following problem (assuming $g_t$, $U_t$, $X_t$ convex):

(1) [formula not recovered]

How about nonlinear dynamics? [formula not recovered]

Shooting methods (feasible)
Iterate for i = 1, 2, 3, …
  1. Execute the control sequence (from solving (1))
  2. Linearize around the resulting trajectory
  3. Solve (1) for the current linearization

Collocation methods (infeasible)
Iterate for i = 1, 2, 3, …
  1. (no execution)
  2. Linearize around the current solution of (1)
  3. Solve (1) for the current linearization

Sequential Quadratic Programming (SQP) = either of the above methods, but instead of linearizing everything, linearize the equality constraints and use a convex-quadratic approximation of the objective function.

Example: Shooting [figure not reproduced]
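The formula for problem (1) is missing from this copy of the slides. A plausible reconstruction, assuming the standard finite-horizon convex optimal control problem with linear dynamics, is sketched below. The symbols $A_t$, $B_t$, $f$, and the horizon $T$ are assumptions, not taken from the source.

```latex
\begin{align}
\min_{x,\,u} \quad & \sum_{t=0}^{T} g_t(x_t, u_t) \tag{1} \\
\text{s.t.} \quad & x_{t+1} = A_t x_t + B_t u_t, \qquad u_t \in U_t, \qquad x_t \in X_t \qquad \forall t
\end{align}
% Nonlinear case: the linear dynamics constraint is replaced by
%   x_{t+1} = f(x_t, u_t),
% which generally makes the feasible set nonconvex.
```

Under this reading, both shooting and collocation repeatedly replace $f$ by its first-order expansion, which brings the problem back to the form (1).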

Example: Collocation [figure not reproduced]

Practical Benefits and Issues with Shooting
+ At all times the sequence of controls is meaningful, and the objective function being optimized corresponds directly to the current control sequence.
- For unstable systems, a feedback controller must be run during forward simulation.
  • Why? The open-loop sequence of control inputs computed for the linearized system will not be perfect for the nonlinear system. If the nonlinear system is unstable, open-loop execution would give poor performance.
  • Fixes:
    • Run model-predictive control for the forward simulation.
    • Compute a linear feedback controller from the second-order Taylor expansion at the optimum (exercise: work out the details!).
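The open-loop issue can be seen on a toy problem. The sketch below assumes a hypothetical scalar system `f` that is unstable around the origin and a hand-picked gain `K` designed on its linearization; neither comes from the slides. It replays the controls planned on the linear model in open loop, and compares that with applying the same gain as feedback on the actual state.

```python
import numpy as np

def f(x, u):
    """Hypothetical unstable nonlinear system (assumption for illustration)."""
    return 1.2 * x + 0.05 * x ** 3 + u

K = 1.0  # gain chosen on the linearization x' = 1.2 x + u (assumption)

def simulate(x0=1.0, T=20):
    # Plan on the linearized model: x_lin' = (1.2 - K) x_lin, u = -K x_lin.
    x_lin = x0 * (1.2 - K) ** np.arange(T)
    u_plan = -K * x_lin
    open_loop, closed_loop = [x0], [x0]
    for t in range(T):
        # Open loop: replay the planned controls on the nonlinear system.
        open_loop.append(f(open_loop[-1], u_plan[t]))
        # Closed loop: u_plan[t] - K * (x - x_lin[t]) simplifies to -K * x.
        closed_loop.append(f(closed_loop[-1], -K * closed_loop[-1]))
    return np.array(open_loop), np.array(closed_loop)

if __name__ == "__main__":
    ol, cl = simulate()
    print("open-loop final state:  ", ol[-1])
    print("closed-loop final state:", cl[-1])
```

In this toy case the small model mismatch is amplified by the unstable dynamics under open-loop execution, while the feedback version keeps correcting for it.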

Practical Benefits and Issues with Collocation
+ Can initialize with an infeasible trajectory. Hence, if you have a rough idea of a sequence of states that would form a reasonable solution, you can initialize with this sequence of states without needing to know a control sequence that would lead through them, and without needing to make them consistent with the dynamics.
- The sequence of control inputs and states might never converge onto a feasible sequence.

Iterative LQR versus Sequential Convex Programming
• Both can solve: [formula not recovered]
• Iterative LQR can be run either as a shooting method or as a collocation method; it is just a different way of executing “Solve (1) for current linearization.” In the case of shooting, the sequence of linear feedback controllers found can be used for (closed-loop) execution.
• Iterative LQR might need some outer iterations, adjusting “t” of the log barrier.

Shooting methods (feasible)
Iterate for i = 1, 2, 3, …
  1. Execute the feedback controller (from solving (1))
  2. Linearize around the resulting trajectory
  3. Solve (1) for the current linearization

Collocation methods (infeasible)
Iterate for i = 1, 2, 3, …
  1. (no execution)
  2. Linearize around the current solution of (1)
  3. Solve (1) for the current linearization

Sequential Quadratic Programming (SQP) = either of the above methods, but instead of linearizing everything, linearize the equality constraints and use a convex-quadratic approximation of the objective function.
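A minimal sketch of iterative LQR run as a shooting method, following the three steps above: execute the feedback controller, linearize around the resulting trajectory, solve the LQ approximation. The scalar dynamics `f`, the quadratic cost weights, the time step `DT`, and the backtracking line search are all assumptions made for illustration; there are no state or control constraints here, so no log barrier is needed.

```python
import numpy as np

DT = 0.1  # hypothetical time step

def f(x, u):
    """Hypothetical scalar nonlinear dynamics, unstable around x = 0."""
    return x + DT * (np.sin(x) + u)

def cost(xs, us):
    # Stage cost x_t^2 + 0.1 u_t^2, terminal cost x_T^2.
    return float(np.sum(xs ** 2) + 0.1 * np.sum(us ** 2))

def rollout(x0, us, xs_ref=None, k=None, K=None, alpha=1.0):
    """Forward simulation; if gains are given, runs the feedback controller."""
    xs = np.empty(len(us) + 1)
    xs[0] = x0
    new_us = np.array(us, dtype=float)
    for t in range(len(us)):
        if k is not None:
            new_us[t] = us[t] + alpha * k[t] + K[t] * (xs[t] - xs_ref[t])
        xs[t + 1] = f(xs[t], new_us[t])
    return xs, new_us

def backward_pass(xs, us):
    """Solve the LQ approximation around (xs, us) by a Riccati-style recursion."""
    T = len(us)
    k, K = np.zeros(T), np.zeros(T)
    Vx, Vxx = 2.0 * xs[T], 2.0              # derivatives of the terminal cost
    for t in reversed(range(T)):
        A = 1.0 + DT * np.cos(xs[t])        # df/dx along the trajectory
        B = DT                              # df/du
        Qx, Qu = 2.0 * xs[t] + A * Vx, 0.2 * us[t] + B * Vx
        Qxx, Quu, Qux = 2.0 + A * A * Vxx, 0.2 + B * B * Vxx, A * B * Vxx
        k[t], K[t] = -Qu / Quu, -Qux / Quu  # feedforward and feedback terms
        Vx = Qx - Qux * Qu / Quu
        Vxx = Qxx - Qux * Qux / Quu
    return k, K

def ilqr_shooting(x0, T=30, iters=15):
    xs, us = rollout(x0, np.zeros(T))       # initial rollout with zero controls
    costs = [cost(xs, us)]
    for _ in range(iters):
        k, K = backward_pass(xs, us)        # linearize and solve
        for alpha in 0.5 ** np.arange(10):  # backtracking on the feedforward term
            new_xs, new_us = rollout(x0, us, xs, k, K, alpha)  # execute controller
            if cost(new_xs, new_us) < costs[-1]:
                xs, us = new_xs, new_us
                break
        costs.append(cost(xs, us))
    return xs, us, costs
```

Because every iterate comes from a forward simulation, the trajectory is always dynamically feasible, which is the defining property of shooting.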

Outline n From linear to nonlinear n Model-predictive control (MPC) For an entire semester course on MPC: see Francesco Borrelli n POMDPs Model Predictive Control n Given: n For k=0, 1, 2, …, T n Solve n Execute u k n Observe resulting state, Page 5 �

Initialization n Initialization with solution from iteration k-1 can make solver very fast n can be done most conveniently with infeasible start Newton method Terminal Cost n Re-solving over full horizon can be computationally too expensive given frequency at which one might want to do control n Instead solve Estimate of cost-to-go n Estimate of cost-to-go n If using iterative LQR can use quadratic value function found for time t+H n If using nonlinear optimization for open-loop control sequence à can find quadratic approximation from Hessian at solution (exercise, try to derive it!) Page 6 �

Car Control with MPC: Video
• Prof. Francesco Borrelli (M.E.) and collaborators
• http://video.google.com/videoplay?docid=-8338487882440308275

POMDP Examples n Localization/Navigation à Coastal Navigation n SLAM + robot execution à Active exploration of unknown areas n Needle steering à maximize probability of success n “Ghostbusters” (188) à Can choose to “sense” or “bust” while navigating a maze with ghosts n “Certainty equivalent solution” does not always do well Robotic Needle Steering [from van den Berg, Patil, Alterovitz, Abbeel, Goldberg, WAFR2010] Page 8 �

POMDP: Partially Observable Markov Decision Process
• Belief state $B_t$: $B_t(x) = P(x_t = x \mid z_0, \dots, z_t, u_0, \dots, u_{t-1})$
• If the control input is $u_t$ and the observation is $z_{t+1}$, then
  $B_{t+1}(x') \propto \sum_x B_t(x) \, P(x' \mid x, u_t) \, P(z_{t+1} \mid x')$
  (normalized so that $B_{t+1}$ sums to one)
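For a finite state space, the belief update is a matrix-vector product followed by a reweighting and a normalization. A minimal sketch, where the two-state transition matrix and observation likelihoods are made-up numbers for illustration:

```python
import numpy as np

def belief_update(belief, transition, likelihood):
    """One belief update for a discrete POMDP.

    belief[x]         = B_t(x)
    transition[x, x'] = P(x' | x, u_t) for the control that was applied
    likelihood[x']    = P(z_{t+1} | x') for the observation that was received
    """
    predicted = belief @ transition            # sum over x of B_t(x) P(x' | x, u_t)
    unnormalized = predicted * likelihood      # weight by observation likelihood
    return unnormalized / unnormalized.sum()   # normalize to a distribution

# Made-up two-state example.
belief = np.array([0.5, 0.5])
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
likelihood = np.array([0.8, 0.3])
new_belief = belief_update(belief, transition, likelihood)

if __name__ == "__main__":
    print(new_belief)
```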

POMDP Solution Methods n Value Iteration: n Perform value iteration on the “belief state space” n High-dimensional space, usually impractical n Approximate belief with Gaussian n Just keep track of mean and covariance n Using (extended or unscented) KF, dynamics model, observation model, we get a nonlinear system equation for our new state variables, : n Can now run any of the nonlinear optimization methods for optimal control Example: Nonlinear Optimization for Control in Belief Space using Gaussian Approximations [van den Berg, Patil, Alterovitz, ISSR 2011] Page 10 �

Linear Gaussian System with Quadratic Cost: Separation Principle
• Very special case:
  • Linear Gaussian dynamics
  • Linear Gaussian observation model
  • Quadratic cost
• Fact: the optimal control policy in belief space for the above system consists of running
  • the optimal feedback controller for the same system when the state is fully observed, which we know from earlier lectures is a time-varying linear feedback controller easily found by value iteration
  • a Kalman filter, which feeds its state estimate into the feedback controller
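The two pieces of the separation principle can be sketched on a scalar system: a time-varying LQR gain computed by backward value iteration as if the state were observed, and a Kalman filter whose estimate is fed to that gain. All numerical values (`a`, `b`, `c`, the cost weights, and the noise variances) are hypothetical.

```python
import numpy as np

# Hypothetical scalar system x' = a x + b u + w, z = c x + v (assumptions).
a, b, c = 1.2, 1.0, 1.0
q, r = 1.0, 1.0              # cost weights on x^2 and u^2
w_var, v_var = 0.01, 0.04    # process and observation noise variances

def lqr_gains(T):
    """Time-varying LQR gains from backward value iteration (full observation)."""
    p = q                    # terminal cost q * x_T^2
    gains = []
    for _ in range(T):
        k = a * b * p / (r + b * b * p)
        p = q + a * p * (a - b * k)
        gains.append(k)
    return gains[::-1]       # gains[t] is the gain to use at time t

def run_lqg(T=50, x0=5.0, mu0=4.0, s0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    gains = lqr_gains(T)
    x, mu, s = x0, mu0, s0   # true state, estimated mean, estimated variance
    xs = [x]
    for t in range(T):
        u = -gains[t] * mu   # controller acts on the estimate, not the state
        x = a * x + b * u + rng.normal(0.0, np.sqrt(w_var))
        z = c * x + rng.normal(0.0, np.sqrt(v_var))
        # Kalman filter: predict, then update with the new observation.
        mu_pred = a * mu + b * u
        s_pred = a * s * a + w_var
        kf = s_pred * c / (c * s_pred * c + v_var)
        mu = mu_pred + kf * (z - c * mu_pred)
        s = (1.0 - kf * c) * s_pred
        xs.append(x)
    return np.array(xs)

if __name__ == "__main__":
    print(run_lqg()[-1])
```

Note that the gains never depend on the noise levels and the filter never depends on the cost, which is the practical content of the separation.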
