  1. Optimal Control and Dynamic Programming 4SC000 Q2 2017-2018 Duarte Antunes

  2. Introduction

  3. Outline
  • Course info
  • Introduction to optimal control and applications
  • Dynamic programming algorithm

  4. Course information

  Teaching staff
  • Lecturer: Duarte Antunes (d.antunes@tue.nl)
  • Assistants: Eelco van Horssen (e.p.v.horssen@tue.nl), Ruben di Filippo (r.di.filippo@student.tue.nl)

  Grading
  • 1 exam and 3 homework assignments, to be submitted via MATLAB Cody Coursework*
  • Assignments can be solved individually or in a group (max. 4 people); select your group via Canvas.
  • If solved in a group, presence at BZ 7 and BZ 14 is mandatory to discuss the individual grades for assignments 1 and 2, respectively. Peer assessment of homework 3 should be sent by Feb 4th.
  • Contributions to the final grade: each of the 3 homework assignments, 40/3%; exam, 60%.

  Question hoursº
  Monday     12h30-13h30   Eelco    GEM-Z 0.55
  Tuesday    17h30-18h30   Ruben    GEM-Z 0.55
  Wednesday  15h45-17h30   BZ       Paviljoen U46
  Thursday   12h45-13h45   Duarte   GEM-Z -1.139
  Friday     10h45-12h30   BZ       Paviljoen U46

  * https://coursework.mathworks.com/; students will receive an email to register.
  º Since there are question hours every working day, no further appointments will be scheduled, and we try to avoid answering questions via e-mail.

  5. Course schedule

  Lectures (L): Wednesdays 13h45-15h30 (LUNA 1.050), Fridays 8h45-10h30 (Paviljoen B2).
  Guided self-study (BZ): Wednesdays 15h45-17h30 (Paviljoen U46), Fridays 10h45-12h30 (Paviljoen U46).
  Deadlines: PS I: Dec 5th, 23h45; PS II: Jan 11th, 23h45; PS III: Feb 4th, 23h45.
  Exam: February 1st, 13h30-16h30; retake: April 12th, 18h00-21h00.

  November:  L1/BZ1 (Wed 15), L2/BZ2 (Fri 17), L3/BZ3 (Wed 22), L4/BZ4 (Fri 24), L5/BZ5 (Wed 29)
  December:  L6/BZ6 (Fri 1), L7/BZ7* (Wed 6), L8/BZ8 (Fri 8), L9/BZ9 (Wed 13), L10/BZ10 (Fri 15), L11/BZ11 (Wed 20), L12/BZ12 (Fri 22)
  January:   L13/BZ13 (Wed 10), L14/BZ14* (Fri 12), L15/BZ15 (Wed 17), L16/BZ16 (Fri 19)
  February:  Exam (Thu 1)

  * In BZ7 and BZ14 the grade of each member of a group for homework 1 and 2, respectively, will be discussed.

  6. Course material

  Main course material
  • Slides and problem sets
  • Dynamic Programming and Optimal Control, Dimitri P. Bertsekas, Athena Scientific, Volume I, 2005, ISBN-13: 1-886529-08-6, Chapters 1-6*

  Further reading
  • Applied Optimal Control, Arthur E. Bryson, Jr., and Yu-Chi Ho, CRC Press, 1975, ISBN-13: 978-0891162285, Chapters 1-3
  • Calculus of Variations and Optimal Control Theory, D. Liberzon, Princeton University Press, 2012, ISBN-13: 978-0-691-15187-8, Chapters 1-4
  • Planning Algorithms, Steven M. LaValle, Cambridge University Press, 2016, ISBN-13: 978-0521862059, Chapters 1-12, 14
  • Other reference books [1]-[10]

  * Slides and video lectures available at http://www.athenasc.com/dpbook.html

  7. Outline of the course

  I. Discrete optimization problems
  1. Introduction and the dynamic programming algorithm
  2. Stochastic dynamic programming
  3. Shortest path problems in graphs
  4. Bayes filter and partially observable Markov decision processes

  II. Stage decision problems
  5. State-feedback controller design for linear systems - LQR
  6. Optimal estimation and output feedback - Kalman filter and LQG
  7. Discretization
  8. Discrete-time Pontryagin's maximum principle
  9. Approximate dynamic programming

  III. Continuous-time optimal control problems
  10. Hamilton-Jacobi-Bellman equation and deterministic LQR in continuous time
  11. Linear quadratic control in continuous time - LQR/LQG
  12. Frequency-domain properties of LQR/LQG
  13. Pontryagin's maximum principle I
  14. Pontryagin's maximum principle II
  15 & 16. Revision/sample exam

  8. Position in the MSc programs

  Systems and control oriented programs
  • Systems and Control, Mechanical and Electrical Engineering with control specialisation
  • Clear track: Q1 System Theory for Control, Q2 Optimal Control, Q3 Model Predictive Control
  • Optimal control is one of the cornerstones of control systems theory

  Other programs
  • Optimal control and dynamic programming is a very broad subject and may be useful for you.
  • For example, for Automotive students: optimal control appears in many automotive applications, such as optimization of powertrains, optimal power management in hybrid vehicles, etc.

  9. Background

  Matlab
  • Nice intro: https://matlabacademy.mathworks.com/
  • Best way to learn: read the Matlab documentation and gain experience.

  System theory
  • Basic knowledge of concepts such as state-space representations, observability, and controllability is useful.
  • The course 'System Theory for Control' taught at TU/e is enough.
  • If you have not taken that course, a suggested book is Linear Systems Theory, João Hespanha, 2009.

  Optimization
  • Notions of gradient, convex functions, and constraints; see Appendix B of Bertsekas' book.
  • Advanced book: Convex Optimization, Boyd and Vandenberghe, available at http://stanford.edu/~boyd/cvxbook/

  Probability theory
  • Basic notions; see Appendix C of Bertsekas' book.

  10. Outline
  • Course info
  • Introduction to optimal control and applications
  • Dynamic programming algorithm

  11. Optimal control

  Optimality
  • A useful design principle in many engineering contexts (optimize the efficiency of a refrigerator, minimize the fuel consumption of a car, etc.).
  • Nature is described by laws derived from optimality principles.
  • We optimize every day to make decisions (true?).

  Optimal control
  • Deals with problems in which optimal decisions or control actions are pursued over a time period in order to reach final and intermediate goals.
  • Arises in the control of physical systems (e.g., mechanical, electrical, biological) and in many other contexts (e.g., economics, computer science, and game theory).

  12. Optimal control vs static optimization

  Static optimization
  • Determine one optimal decision.
  • Examples: decide on the price of a product, determine the slope of a straight line which best fits data, etc.

  [Figure: cost J as a function of the decision u, with minimum J(u*) attained at u*.]
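  The line-fitting example above reduces to minimizing a scalar cost J(u) over a single decision u, the slope. A minimal MATLAB sketch, with invented data (not from the course material):

```matlab
% Static optimization: one decision u (the slope), minimizing
% J(u) = sum_i (y_i - u*x_i)^2 for given data (x_i, y_i).
x = (1:10)';                      % regressor data (invented)
y = 2*x + 0.1*randn(10,1);        % noisy measurements, true slope 2
u_star = (x'*y) / (x'*x);         % closed-form least-squares minimizer
J_star = sum((y - u_star*x).^2);  % optimal cost J(u*)
```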

  13. Optimal control vs static optimization

  Optimal control
  • Determine several optimal decisions over time.
  • Decisions are functions of the state, i.e., a control law, to cope with disturbances.
  • Examples: driving a car/bike in a race, positioning the tip of a robot arm in the presence of disturbances, playing chess, etc.

  [Figure: a nominal trajectory θ(0), θ(1), θ(2), θ(3); a disturbance at time t = 2 produces a perturbed trajectory θ̃(t) ≠ θ(t) for t ≥ 2.]

  14. Optimal control formulation

  Dynamic model
  • Specifies the rules of the problem or the equations of the physical system.
  • State: summarizes the information relevant to make future decisions.
  • Control actions: influence the evolution of the state over time.
  • The state evolution may be deterministic or stochastic (driven by disturbances).

  Cost function
  • Encapsulates the goals to be achieved in the problem.
  • Typically additive over time and, by convention, to be minimized.

  Goal: find a control policy which minimizes the cost
  • Policy: a set of functions mapping the state at each instant of time to an action.
  • Related problem: compute an optimal path/trajectory consisting of optimal decisions over time for a given initial state (see the sketch below).
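  A minimal MATLAB sketch of the backward dynamic programming recursion J_k(x) = min_u { g(x,u) + J_{k+1}(f(x,u)) } for this formulation. All model ingredients below (X, U, f, g, gN, h) are invented placeholders, not part of the course material; replace them with a concrete problem.

```matlab
X  = 1:3;                               % finite state space (assumption)
U  = 1:2;                               % finite action set (assumption)
h  = 4;                                 % horizon (number of decision stages)
f  = @(x,u) min(max(x + u - 2, 1), 3);  % example dynamics, clamped to {1,2,3}
g  = @(x,u) (x - 2)^2 + u;              % example additive stage cost
gN = @(x) 0;                            % example terminal cost

J  = arrayfun(gN, X);                   % cost-to-go at the terminal stage h
mu = zeros(h, numel(X));                % policy: mu(k+1,x) = action at stage k
for k = h-1:-1:0                        % backward in time: k = h-1, ..., 0
    Jnew = zeros(size(X));
    for x = X
        costs = arrayfun(@(u) g(x,u) + J(f(x,u)), U);
        [Jnew(x), ustar] = min(costs);  % minimize over actions
        mu(k+1, x) = U(ustar);          % row k+1: MATLAB is 1-based
    end
    J = Jnew;                           % J now holds J_k
end
% J(x0) is the optimal cost from initial state x0; following mu forward
% from x0 yields the related optimal trajectory for that initial state.
```

  Note that the recursion returns a policy (an action for every state at every stage), so the trajectory for any initial state comes out as a by-product.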

  15. Optimal control problems

  Three classes of problems will be considered in the course:

                                              time        state space
  Discrete optimization problems              discrete    discrete
  Stage decision problems                     discrete    general
  Continuous-time optimal control problems    continuous  general

  Some applications are discussed next, and more applications later. However, there are many others - see Appendix B.

  16. Applications

  Traditional process control
  • Controlling an inverted pendulum, mass-spring-damper, double integrator, quadcopter, etc.

  Aerospace
  • Minimum-fuel launch of a satellite, etc.

  Operations research, management, finance
  • Inventory control, control of a queue, control of networks (data, traffic, etc.), etc.

  Computer science
  • Shortest paths in graphs, scheduling, selection problems, among others.

  Other fields
  • Computational biology, automotive, games, and many others.

  The next slides address some applications treated in the course, where we will also consider cases where uncertainty is present.

  17. Discrete Optimization Problems

  Specified by a transition diagram with decision stages 0, 1, ..., h-1 and terminal stage h.

  [Figure: transition diagram; circles are the states at each stage, arrows the possible transitions, each labeled with its cost, from Stage 0 through Stage h.]

  • Dynamic model: circles indicate the states at each stage; arrows indicate the actions available in each state, which lead to states at the next stage.
  • Costs c^k_ij are associated with the action taking state i at stage k to state j at stage k+1; at the terminal stage, the cost c^h_i depends only on the state i.

  18. Discrete Optimization Problems

  Challenges

  (i) Determine an optimal path for a given initial state which minimizes the sum of the costs incurred at every stage (including the terminal stage).

  (ii) Determine an optimal policy specifying, for each state, the first decision of the optimal path from that state to the terminal stage.

  [Figure: two copies of a small staged graph with edge costs; the left highlights an optimal path from a given initial state, the right the optimal first decision at every state.]
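  A minimal MATLAB sketch of backward DP on a staged diagram, addressing both challenges at once. The costs below are invented (the figure's numbers do not carry over to text); stages are indexed 1..h here because MATLAB is 1-based, whereas the slides use 0..h-1.

```matlab
% c{k}(i,j): cost of moving from state i at stage k to state j at stage k+1.
c  = { [1 4; 2 3], [2 1; 5 2] };        % two decision stages, two states each
ch = [0; 1];                            % terminal costs c^h_i
h  = numel(c);                          % number of decision stages

J  = ch;                                % cost-to-go at the terminal stage
mu = zeros(h, numel(ch));               % mu(k,i): best successor of state i
for k = h:-1:1
    [J, idx] = min(c{k} + J.', [], 2);  % J(i) = min_j c{k}(i,j) + J(j)
    mu(k,:) = idx.';                    % record the argmin per state
end

% Challenge (i): optimal path from initial state 1, following mu forward.
path = 1;
for k = 1:h
    path(end+1) = mu(k, path(end));     % next state along the optimal path
end
% Challenge (ii): mu itself is the optimal policy; J(1) is the optimal
% cost from state 1.
```

  Note that challenge (ii) comes for free: the backward recursion computes the optimal first decision for every state, not only for the given initial state.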
