
Iterative Learning Model Predictive Control
Francesco Borrelli (fborrelli@berkeley.edu)
University of California, Berkeley, USA
www.mpc.berkeley.edu
2018 CDC

Acknowledgements


  1. Iteration 1, Step 1: use SS^0 as the terminal set at iteration 1.

  2. Iteration 1, Step 2: use SS^0 as the terminal set at iteration 1.

  3. Iteration 1, Step 3: use SS^0 as the terminal set at iteration 1.

  4. Iteration 1, Step 4: use SS^0 as the terminal set at iteration 1.

  5. Iteration 2: the safe set.

  6. Constructing the terminal set.

  7. Terminal Set: the convex hull of the sampled safe set of a constrained linear dynamical system is a control invariant set (a hedged formalization follows below).
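
As a reading aid (not taken from the slides), a hedged formalization of the sampled safe set and its convex hull in the notation commonly used for LMPC; x_t^i denotes the state at time t of successful iteration i:

    % Sampled safe set after iteration j: all states visited on successful iterations 0, ..., j
    \mathcal{SS}^{j} \;=\; \bigcup_{i=0}^{j} \big\{\, x_t^{i} \;:\; t \ge 0 \,\big\}
    % For constrained linear systems, its convex hull is control invariant and can serve as the MPC terminal set
    \mathcal{CS}^{j} \;=\; \mathrm{conv}\big(\mathcal{SS}^{j}\big)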

  8. Learning Model Predictive Control (LMPC): convergence, performance improvement, local optimality.

  9. Terminal Cost at Iteration 0.

  10. Terminal Cost at Iteration 0: a control Lyapunov "function".

  11. Terminal Cost at the j-th Iteration: define the realized cost-to-go along the stored trajectories and compute the terminal cost from it.

  12. Terminal Cost: barycentric approximation of Q(.), a control Lyapunov function (for constrained linear dynamical systems); a hedged reconstruction follows below.
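
The slide equations are not recoverable from this transcript; the following is a hedged reconstruction of the terminal cost in standard LMPC notation, where h(x, u) is the stage cost and (x_k^i, u_k^i) are the stored state-input pairs:

    % Realized cost-to-go of stored trajectory i from time t
    J_t^{i} \;=\; \sum_{k \ge t} h\big(x_k^{i}, u_k^{i}\big)
    % Terminal cost on the sampled safe set: best stored cost-to-go at that state
    Q^{j}(x) \;=\; \min_{(i,t)\,:\,x_t^{i}=x} J_t^{i}, \qquad x \in \mathcal{SS}^{j}
    % Barycentric extension over the convex hull (the convex multipliers \lambda are decision variables)
    \bar{Q}^{j}(x) \;=\; \min_{\lambda \ge 0} \Big\{ \sum_k \lambda_k\, Q^{j}(x_k) \;:\; \sum_k \lambda_k = 1,\ \sum_k \lambda_k x_k = x,\ x_k \in \mathcal{SS}^{j} \Big\}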

  13. ILMPC Summary. MPC strategy: optimize over inputs and lambdas (for constrained linear systems). Safety guarantee: constraint satisfaction at iteration j implies satisfaction at iteration j+1. Performance improvement guarantee: the closed-loop cost at iteration j is greater than or equal to the cost at iteration j+1, with convergence to the globally optimal solution; constraint qualification conditions are required for the cost decrease. (An optimization sketch follows below.)
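
The talk contains no code; below is a minimal sketch, assuming cvxpy, of one step of an LMPC of the kind summarized above: a finite-horizon problem whose terminal state is a convex combination (the lambdas) of stored safe-set states and whose terminal cost interpolates their realized costs-to-go. The double-integrator dynamics, costs, horizon, and the saturated feedback gain used to generate the iteration-0 data are illustrative placeholders, not values from the talk.

    # Minimal LMPC-step sketch (assumes cvxpy). All numbers are illustrative placeholders.
    import numpy as np
    import cvxpy as cp

    A = np.array([[1.0, 1.0], [0.0, 1.0]])    # assumed double-integrator-like dynamics
    B = np.array([[0.0], [1.0]])
    Q, R, N = np.eye(2), 10 * np.eye(1), 4    # stage cost weights and horizon

    # Iteration 0: a feasible, suboptimal trajectory (saturated linear feedback) seeds the
    # sampled safe set and the realized costs-to-go used as terminal data.
    K = np.array([[0.4, 1.2]])                # assumed stabilizing gain
    xs, us = [np.array([5.0, 0.0])], []
    for _ in range(40):
        u = np.clip(-K @ xs[-1], -1.0, 1.0)
        us.append(u)
        xs.append(A @ xs[-1] + B @ u)
    stage = [float(x @ Q @ x + u @ R @ u) for x, u in zip(xs[:-1], us)]
    ss_states = np.array(xs[:-1]).T                   # sampled safe set, shape (2, 40)
    ss_costs = np.cumsum(stage[::-1])[::-1].copy()    # realized cost-to-go at each stored state

    def lmpc_step(x0):
        x = cp.Variable((2, N + 1))
        u = cp.Variable((1, N))
        lam = cp.Variable(ss_states.shape[1], nonneg=True)   # multipliers over the safe set
        cost = ss_costs @ lam                                # barycentric terminal cost
        cons = [x[:, 0] == x0,
                x[:, N] == ss_states @ lam,                  # terminal state in conv(SS^0)
                cp.sum(lam) == 1]
        for t in range(N):
            cost += cp.quad_form(x[:, t], Q) + cp.quad_form(u[:, t], R)
            cons += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t],
                     cp.abs(u[:, t]) <= 1, cp.abs(x[:, t]) <= 10]
        cp.Problem(cp.Minimize(cost), cons).solve()
        return u[:, 0].value                                 # first input, applied in closed loop

    print(lmpc_step(np.array([5.0, 0.0])))

The iteration-0 inputs make the problem feasible by construction, which mirrors the recursive feasibility argument of the method.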

  14. Performance Improvement Proof: conjecture and notation (closed-loop state and input trajectory at iteration j).

  15. Performance Improvement Proof, Step 1.

  16. Performance Improvement Proof, Step 1 (continued).

  17. Performance Improvement Proof, Step 1 (continued).

  18. Performance Improvement Proof, Step 1 (continued).

  19. Performance Improvement Proof, Steps 1 and 2.

  20. Performance Improvement Proof, Steps 1 and 2 (continued).

  21. Performance Improvement Proof, Steps 1 and 2 (continued).

  22. Performance Improvement Proof (the argument is summarized in the sketch below).
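
The proof equations themselves are not in this transcript; below is a hedged reconstruction of the standard LMPC performance-improvement argument, with J^{LMPC,j}_t denoting the LMPC optimal cost at time t of iteration j and J^j_{0->infinity} the realized closed-loop cost of iteration j:

    % Step 1: the stored iteration-j trajectory (with its safe-set terminal point) is feasible
    % for the LMPC at time 0 of iteration j+1, so the optimal cost is bounded by iteration j's cost
    J^{\mathrm{LMPC},\,j+1}_{0}(x_0) \;\le\; \sum_{k=0}^{N-1} h\big(x_k^{j},u_k^{j}\big) + Q^{j}\big(x_N^{j}\big) \;\le\; J^{j}_{0\to\infty}(x_0)
    % Step 2: along iteration j+1 the optimal cost decreases by at least the realized stage cost
    J^{\mathrm{LMPC},\,j+1}_{t}\big(x_t^{j+1}\big) \;\ge\; h\big(x_t^{j+1},u_t^{j+1}\big) + J^{\mathrm{LMPC},\,j+1}_{t+1}\big(x_{t+1}^{j+1}\big)
    % Summing Step 2 over t (telescoping) and using Step 1 gives the performance improvement
    J^{j+1}_{0\to\infty}(x_0) \;=\; \sum_{t \ge 0} h\big(x_t^{j+1},u_t^{j+1}\big) \;\le\; J^{\mathrm{LMPC},\,j+1}_{0}(x_0) \;\le\; J^{j}_{0\to\infty}(x_0)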

  23. Iterative Learning MPC: optimize over inputs and lambdas; simple proofs; for constrained linear systems, safety and performance improvement guarantees and convergence to the globally optimal solution. Constraint qualification conditions are required for the cost decrease; if the relevant matrix has full column rank, no improvement can be obtained.

  24. Constrained LQR Example: control objective, system dynamics, system constraints, starting position (a generic formulation is sketched below).
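
The specific matrices and constraint sets on the slide are not in this transcript; as an assumption, a generic infinite-horizon constrained LQR formulation of the kind referenced:

    \min_{u_0, u_1, \dots} \;\; \sum_{t \ge 0} \Big( x_t^{\top} Q\, x_t + u_t^{\top} R\, u_t \Big)
    \text{subject to}\quad x_{t+1} = A x_t + B u_t, \qquad x_t \in \mathcal{X}, \quad u_t \in \mathcal{U}, \qquad x_0 = x_S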

  25. Iterative LMPC with horizon N=2: control objective, system dynamics, system constraints, terminal constraint, initial condition. With N=2 this will not work; it will work if one sets N=3.

  26. Comparison with RL? The term RL is too broad. Two good references: the Bertsekas paper connecting MPC and ADP* and the Lewis and Vrabie survey in CSM**; see also the Recht survey (Section 6): https://arxiv.org/abs/1806.09460. ILMPC highlights: a continuous-state formulation; constraint satisfaction and sampled safe sets; the Q-function is constructed (learned) locally from cost/model-driven exploration and past trials; the Q-function at stored states is "exact" and has a lower-bounding property at intermediate points (for convex problems). *Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. **Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control.

  27. About Model Learning in Racing.

  28. Autonomous Racing Control Problem: control objective, start and end position, system dynamics, system constraints, obstacle avoidance.

  29. Learning Model Predictive Control (LMPC).

  30. Learning Process: the lap time decreases until the LMPC converges to a set of trajectories.

  31. Learning Model Predictive Control (LMPC).

  32. Useful Vehicle Model Abstraction: the nonlinear dynamical system.

  33. Useful Vehicle Model Abstraction: nonlinear dynamical system and kinematic equations (a common kinematic abstraction is sketched below).
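
The slide's equations are not in this transcript; as an assumption, a commonly used kinematic bicycle abstraction (l_f and l_r are the distances from the center of mass to the front and rear axles, delta the steering angle, a the acceleration command):

    \dot{x} = v \cos(\psi + \beta), \qquad \dot{y} = v \sin(\psi + \beta), \qquad \dot{\psi} = \frac{v}{l_r}\,\sin\beta
    \dot{v} = a, \qquad \beta = \arctan\!\Big( \frac{l_r}{l_f + l_r} \tan\delta \Big)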

  34. Useful Vehicle Model Abstraction: nonlinear dynamical system, kinematic equations; identifying the dynamical system by linearization around the predicted trajectory.

  35. Useful Vehicle Model Abstraction: nonlinear dynamical system, dynamic equations, kinematic equations; identifying the dynamical system by local linear regression and linearization around the predicted trajectory.

  36. Useful Vehicle Model Abstraction: identifying the dynamical system by local linear regression and linearization around the predicted trajectory. Important design steps: 1. compute the trajectory to linearize around using the previous optimal inputs and inputs in the safe set; 2. enforce model-based sparsity in the local linear regression.

  37. Useful Vehicle Model Abstraction: nonlinear dynamical system. The velocity update is not affected by the position or by the acceleration command.

  38. Useful Vehicle Model Abstraction: identifying the dynamical system by local linear regression and linearization around the predicted trajectory. Important design steps: 1. compute the trajectory to linearize around using the previous optimal inputs and inputs in the safe set; 2. enforce model-based sparsity in the local linear regression; 3. use data close to the current state trajectory for parameter identification; 4. use a kernel K(.) to weight the data as a function of distance to the linearized trajectory. (A regression sketch follows below.)
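
A minimal sketch (not from the talk) of the kernel-weighted local linear regression in steps 3 and 4 above, fitting x+ ≈ A x + B u from data near a linearization point; the synthetic data, Gaussian kernel, and bandwidth are illustrative assumptions:

    # Kernel-weighted local linear regression sketch; data and bandwidth are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    n_x, n_u, n_data = 3, 2, 500
    X = rng.standard_normal((n_data, n_x))            # states logged on previous laps
    U = rng.standard_normal((n_data, n_u))            # inputs applied at those states
    A_true = np.array([[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 0.95]])
    B_true = np.array([[0.1, 0.0], [0.0, 0.1], [0.05, 0.0]])
    X_next = X @ A_true.T + U @ B_true.T + 0.01 * rng.standard_normal((n_data, n_x))

    def local_linear_model(x_bar, u_bar, h=1.0):
        """Fit x+ ~ A x + B u around (x_bar, u_bar) by kernel-weighted least squares."""
        Z = np.hstack([X, U])                          # regressors
        z_bar = np.concatenate([x_bar, u_bar])
        w = np.exp(-np.sum((Z - z_bar) ** 2, axis=1) / (2 * h ** 2))   # Gaussian kernel weights
        W = np.diag(w)                                 # nearby data weighs more (design step 4)
        theta = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ X_next)
        return theta[:n_x].T, theta[n_x:].T            # estimated A, B around the point

    A_hat, B_hat = local_linear_model(np.zeros(n_x), np.zeros(n_u))
    print(np.round(A_hat, 2))

Model-based sparsity (design step 2) would be enforced by dropping regressor columns that the physics says cannot affect a given state, rather than fitting a dense matrix as done here.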

  39. Accelerations.

  40. Results: gain from steering to lateral velocity.

  41. About Model Learning: Ball in a Cup.

  42. Ball in a Cup System with MuJoCo.

  43. Ball in a Cup Control Problem: control objective, start and end position, system dynamics, system constraints, obstacle avoidance.

  44. Learning Model Predictive Control (LMPC).

  45. Useful MuJoCo Model Abstraction: identifying the dynamical system by local linear regression and linearization around the predicted trajectory. Important design steps: 1. compute the trajectory to linearize around using the previous optimal inputs and inputs in the safe set; 2. enforce model-based sparsity in the local linear regression; 3. use data close to the current state trajectory for parameter identification; 4. use a kernel K(.) to weight the data as a function of distance to the linearized trajectory.

  46. Ball in a Cup System: at iteration 0, find a feasible sequence by sampling parametrized input profiles (takes about 5 minutes); then use ILMPC. At iteration 1 the task time is reduced by 10% and the cup height movement is reduced by 35%. (A sampling sketch follows below.)
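
A hedged sketch of what "sampling parametrized input profiles" could look like in code; the sinusoidal profile family and the rollout() score below are toy stand-ins (assumptions), whereas the actual experiment evaluates candidates on the MuJoCo ball-in-a-cup model:

    # Random search over a parametrized input-profile family; all details are illustrative.
    import numpy as np

    rng = np.random.default_rng(1)
    T, dt = 200, 0.01

    def input_profile(params, t):
        """Parametrized command profile: amplitude, frequency, phase."""
        a, f, phi = params
        return a * np.sin(2 * np.pi * f * t + phi)

    def rollout(params):
        """Toy stand-in for the simulator: a scalar score of the resulting rollout."""
        t = np.arange(T) * dt
        u = input_profile(params, t)
        return abs(np.cumsum(u * dt).sum() - 1.0)      # placeholder task metric

    best_params, best_score = None, np.inf
    for _ in range(500):                               # sample candidate profiles at random
        params = rng.uniform([0.5, 0.5, 0.0], [5.0, 3.0, np.pi])
        score = rollout(params)
        if score < best_score:
            best_params, best_score = params, score

    print(best_params, best_score)                     # best profile seeds the iteration-0 safe set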

  47. Back to our main chart.

  48. Three Forms of Learning: skill acquisition, performance improvement, and reducing the computational load of routine execution. How do we do this? Model predictive control + a simple idea + good practices.

  49. Offline π(∙) and Online π(x) Computation. Option 1 (offline based): "complex" offline, "simple" online; π(∙) is often piecewise constant (except for special classes); dynamic programming is one choice; basic rule: n > 5 is impossible. Option 2 (online based): "simple" offline, "complex" online; compute π(x) online with a "sophisticated" algorithm; an interior point solver is one choice; basic rule: avoid using home-made solvers.

  50. One Simple Way: a data-based policy for π(∙).

  51. One Simple Way: a data-based policy for π(∙), built from historical data of converged iterations (a nearest-neighbor sketch follows below).
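
A minimal sketch of a data-based policy of this kind: store the (state, input) pairs of the converged iterations and, at run time, return the input of the nearest stored state. The synthetic data stands in for the logged laps:

    # Nearest-neighbor data-based policy sketch; the logged data here is synthetic.
    import numpy as np

    rng = np.random.default_rng(2)
    states = rng.standard_normal((1000, 4))    # states logged on converged iterations
    inputs = rng.standard_normal((1000, 2))    # inputs the MPC applied at those states

    def policy(x):
        """Return the stored input of the closest logged state."""
        i = np.argmin(np.linalg.norm(states - x, axis=1))
        return inputs[i]

    print(policy(np.zeros(4)))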

  52. Three Forms of Learning, 3: computation load reduction. Lap time at each iteration; average CPU load at each iteration.

  53. Experimental Results: a factor of 10.

  54. Data-Based Policy, Alternatives: nearest neighbor, training a ReLU neural network, local explicit MPC; all continuous piecewise affine policies (a training sketch follows below).
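
A minimal sketch of the "train a ReLU neural network" alternative, assuming PyTorch: fit a small network to the stored (state, input) pairs; the data and architecture are illustrative assumptions:

    # ReLU network imitation of the stored MPC policy; data and sizes are illustrative.
    import torch
    from torch import nn

    torch.manual_seed(0)
    states = torch.randn(1000, 4)              # logged states from converged iterations
    inputs = torch.randn(1000, 2)              # inputs the MPC applied at those states

    net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                        nn.Linear(32, 32), nn.ReLU(),
                        nn.Linear(32, 2))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for _ in range(200):                       # imitation fit: minimize mean squared error
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(states), inputs)
        loss.backward()
        opt.step()

    print(net(torch.zeros(1, 4)))              # approximate pi(x) at x = 0

A network with ReLU hidden layers and a linear output is itself a continuous piecewise affine map, which is why it sits naturally beside explicit MPC on the slide.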

  55. Learning MPC: incorporating data into an advanced model-based controller. What about noise and model uncertainty? In practice they are learned from data; noise and model uncertainty lead to the robust case.

  56. ILMPC: robust and adaptive design.

  57. At Iteration 0: linear system, terminal goal set, a successful iteration.
