Iteration 1, Step 2 Use SS 0 as terminal set at Iteration 1 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 50
Iteration 1, Step 2 Use SS 0 as terminal set at Iteration 1 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 51
Iteration 1, Step 3 Use SS 0 as terminal set at Iteration 1 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 52
Iteration 1, Step 4 Use SS 0 as terminal set at Iteration 1 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 53
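The iteration steps above (use the stored successful trajectory as terminal set, run the controller, store the new trajectory) can be sketched on a toy problem. This is a minimal illustration and not the system from the slides: the integer dynamics x+ = x + u, the input set, the minimum-time cost, and all function names are assumptions made so that the optimization can be solved by plain enumeration.

```python
from itertools import product

INPUTS = (-2, -1, 0, 1, 2)  # assumed finite input set for the toy system x+ = x + u
GOAL = 0                    # assumed target state


def stage_cost(x):
    """Minimum-time stage cost: 1 per step spent away from the goal."""
    return 0 if x == GOAL else 1


def cost_to_go(states):
    """Cost-to-go of each state along one stored closed-loop trajectory."""
    values, total = [], 0
    for x in reversed(states):
        total += stage_cost(x)
        values.append(total)
    return values[::-1]


def update_safe_set(q, states):
    """Add a finished trajectory to the sampled safe set, keeping the best cost-to-go."""
    for x, c in zip(states, cost_to_go(states)):
        q[x] = min(c, q.get(x, c))


def lmpc_input(x, q, horizon):
    """Enumerate input sequences; the predicted terminal state must lie in the safe set."""
    best_cost, best_u = None, None
    for seq in product(INPUTS, repeat=horizon):
        cost, z = 0, x
        for u in seq:
            cost += stage_cost(z)
            z += u
        if z not in q:  # terminal constraint: land on a stored state
            continue
        cost += q[z]    # terminal cost: stored cost-to-go
        if best_cost is None or cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u


def run_iteration(x0, q, horizon):
    """Closed-loop simulation of one iteration, applying the first optimal input each step."""
    states = [x0]
    while states[-1] != GOAL:
        states.append(states[-1] + lmpc_input(states[-1], q, horizon))
    return states


q = {}                                # maps stored state -> best known cost-to-go
iteration0 = list(range(10, -1, -1))  # iteration 0: feasible but slow (u = -1 throughout)
update_safe_set(q, iteration0)
costs = [cost_to_go(iteration0)[0]]
for _ in range(2):
    trajectory = run_iteration(10, q, horizon=2)
    update_safe_set(q, trajectory)
    costs.append(cost_to_go(trajectory)[0])
print(costs)  # closed-loop cost per iteration
```

In this toy setting the closed-loop cost drops after the first learning iteration and then stays constant, which mirrors the non-increasing iteration cost claimed for LMPC.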
Iteration 2 Safe Set Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 54
Constructing the terminal set Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 55
Terminal Set: Convex Hull of the Sampled Safe Set. For constrained linear dynamical systems, it is a control invariant set. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 56
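A small numerical check can make the invariance claim concrete. For linear dynamics, a convex combination of stored states, driven by the same convex combination of the stored inputs, lands on the convex combination of the stored successor states, so it stays in the hull. The sketch below assumes a double integrator and a hand-picked stored trajectory; neither comes from the slides, and constraint satisfaction additionally relies on the constraint sets being convex.

```python
A = [[1.0, 1.0], [0.0, 1.0]]  # assumed double-integrator dynamics x+ = A x + B u
B = [0.0, 1.0]


def step(x, u):
    """One step of the linear dynamics."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]


def convex_combination(weights, vectors):
    """Weighted sum of 2-D vectors; weights are nonnegative and sum to 1."""
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(2)]


# Stored closed-loop trajectory that reaches the origin and stays there with u = 0.
inputs = [-1.0, -1.0, 1.0, 1.0, 0.0]
states = [[4.0, 0.0]]
for u_stored in inputs[:-1]:
    states.append(step(states[-1], u_stored))
successors = states[1:] + [states[-1]]  # the origin maps to itself under u = 0

# Pick any point in the convex hull and apply the matching combination of inputs.
weights = [0.5, 0.25, 0.25, 0.0, 0.0]
x = convex_combination(weights, states)
u = sum(w * ui for w, ui in zip(weights, inputs))

x_next = step(x, u)
expected = convex_combination(weights, successors)  # a point of the hull by construction
print(x_next, expected)
```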
Learning Model Predictive Control (LMPC) • Convergence • Performance improvement • Local optimality Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 57
Terminal Cost at Iteration 0 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 58
Terminal Cost at Iteration 0 A control Lyapunov “function” Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 59
Terminal Cost at the j-th iteration ≡ Define Compute terminal cost as Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 60
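The equations on this slide did not survive extraction. As a sketch only, the usual LMPC definitions can be written as follows, in assumed notation: h is the stage cost, x_t^i and u_t^i are the stored closed-loop state and input at time t of iteration i, and SS^j is the sampled safe set at iteration j.

```latex
J^{i}_{t\to\infty}(x_t^{i}) = \sum_{k=t}^{\infty} h\bigl(x_k^{i}, u_k^{i}\bigr),
\qquad
Q^{j}(x) =
\begin{cases}
\min\limits_{(i,t)\,:\; i \le j,\; x_t^{i} = x} J^{i}_{t\to\infty}(x), & x \in \mathcal{SS}^{j},\\[4pt]
+\infty, & \text{otherwise.}
\end{cases}
```

The first expression is the cost-to-go of a stored state along its own trajectory; the second assigns to each stored state the best cost-to-go seen so far.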
Terminal Cost: Barycentric Approximation of Q() Control Lyapunov Function (for Constrained Linear Dynamical Systems) Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 61
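The barycentric approximation can be read as a small linear program: minimize the weighted sum of stored costs-to-go over nonnegative weights that sum to one and reproduce the query state. The sketch below assumes a scalar state, where an optimal solution uses at most two stored points, so the program can be solved by checking pairs. The function name and the sample data are hypothetical; a general implementation would call an LP solver instead.

```python
def barycentric_q(x, points):
    """Barycentric approximation of Q at a scalar state x.

    points is a list of (stored_state, cost_to_go) pairs. Returns
    min sum(lam_i * J_i) subject to sum(lam_i * x_i) = x, sum(lam_i) = 1,
    lam_i >= 0, or None when x lies outside the convex hull of the stored states.
    """
    best = None
    for xi, ji in points:
        for xk, jk in points:
            if xi == xk:
                if x != xi:
                    continue
                value = min(ji, jk)
            elif min(xi, xk) <= x <= max(xi, xk):
                lam = (xk - x) / (xk - xi)  # weight on point i
                value = lam * ji + (1.0 - lam) * jk
            else:
                continue
            if best is None or value < best:
                best = value
    return best


# Hypothetical stored states with a convex cost-to-go profile.
stored = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0), (3.0, 9.0)]
print(barycentric_q(1.5, stored))  # interpolated between neighboring stored states
print(barycentric_q(2.0, stored))  # matches the stored cost-to-go for this convex data
print(barycentric_q(5.0, stored))  # outside the hull: no feasible weights
```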
ILMPC Summary. MPC strategy: optimize over inputs and lambdas. For constrained linear systems: Safety guarantees: constraint satisfaction at iteration j => satisfaction at iteration j+1. Performance improvement guarantees: closed-loop cost at iteration j >= cost at iteration j+1. Convergence to global optimal solution. Constraint qualification conditions required for cost decrease. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 62
Performance Improvement Proof Conjecture Notation Closed-loop state and input trajectory at iteration j Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 63
Performance Improvement Proof Step 1: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 64
Performance Improvement Proof Step 1: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 65
Performance Improvement Proof Step 1: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 66
Performance Improvement Proof Step 1: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 67
Performance Improvement Proof Step 1: Step 2: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 68
Performance Improvement Proof Step 1: Step 2: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 69
Performance Improvement Proof Step 1: Step 2: Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 70
Performance Improvement Proof Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 71
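The proof equations were lost in extraction. The following is a sketch of the standard argument in assumed notation, and its split into steps may not match the slides exactly. Here x_S is the common initial state, h the stage cost, N the horizon, and J^{LMPC,j}_{t→t+N} the optimal cost of the LMPC problem solved at time t of iteration j.

```latex
% Step 1: the first N steps of the iteration j-1 trajectory, closed with the
% terminal cost Q^{j-1}, are feasible for the LMPC at time 0 of iteration j:
J^{j-1}_{0\to\infty}(x_S) \;\ge\; J^{\mathrm{LMPC},j}_{0\to N}(x_0^{j}).

% Step 2: the optimal cost decreases along the closed loop by at least the stage cost:
J^{\mathrm{LMPC},j}_{t\to t+N}(x_t^{j}) \;\ge\; h(x_t^{j}, u_t^{j})
  + J^{\mathrm{LMPC},j}_{t+1\to t+1+N}(x_{t+1}^{j}).

% Summing Step 2 over t (the optimal cost vanishes as the state converges to the goal):
J^{\mathrm{LMPC},j}_{0\to N}(x_0^{j}) \;\ge\; \sum_{t=0}^{\infty} h(x_t^{j}, u_t^{j})
  \;=\; J^{j}_{0\to\infty}(x_S)
\quad\Longrightarrow\quad
J^{j-1}_{0\to\infty}(x_S) \;\ge\; J^{j}_{0\to\infty}(x_S).
```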
Iterative Learning MPC. Optimize over inputs and lambdas. Simple proofs. For constrained linear systems: safety and performance improvement guarantees. Convergence to global optimal solution (for linear systems). Constraint qualification conditions required for cost decrease. If full column rank, improvement cannot be obtained. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 72
Constrained LQR Example Control objective System dynamics System constraints Starting Position Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 73
Iterative LMPC with horizon N=2 Control objective System dynamics System constraints Terminal Constraint Initial Condition Will not work! Will work if one sets N=3 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 74
Comparison with R.L.? The term "RL" is too broad. Good references: Bertsekas paper connecting MPC and ADP*; Lewis and Vrabie survey in CSM**; Recht survey (section 6): https://arxiv.org/abs/1806.09460. ILMPC highlights: continuous state formulation; constraint satisfaction and sampled safe sets; Q-function constructed (learned) locally based on cost/model-driven exploration and past trials; Q-function at stored states is “exact”, with a lower-bound property at intermediate points (for convex problems). *Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC **Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 75
About Model Learning in Racing Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 76
Autonomous Racing Control Problem Control objective Start & end position System dynamics System constraints Obstacle avoidance Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 77
Learning Model Predictive Control (LMPC) Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 78
Learning Process The lap time decreases until the LMPC converges to a set of trajectories Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 80
Learning Model Predictive Control (LMPC) Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 82
Useful Vehicle Model Abstraction Nonlinear Dynamical System Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 83
Useful Vehicle Model Abstraction Nonlinear Dynamical System Kinematic Equations Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 84
Useful Vehicle Model Abstraction Nonlinear Dynamical System Kinematic Equations Identifying the Dynamical System Linearization around predicted trajectory Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 85
Useful Vehicle Model Abstraction Nonlinear Dynamical System Dynamic Equations Kinematic Equations Identifying the Dynamical System Local Linear Regression Linearization around predicted trajectory Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 86
Useful Vehicle Model Abstraction. Identifying the Dynamical System: Local Linear Regression, Linearization around predicted trajectory. Important Design Steps: 1. Compute the trajectory to linearize around using previous optimal inputs and inputs in the safe set. 2. Enforce model-based sparsity in local linear regression. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 87
Useful Vehicle Model Abstraction Nonlinear Dynamical System The velocity update is not affected by Position and Acceleration command Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 88
Useful Vehicle Model Abstraction. Identifying the Dynamical System: Local Linear Regression, Linearization around predicted trajectory. Important Design Steps: 1. Compute the trajectory to linearize around using previous optimal inputs and inputs in the safe set. 2. Enforce model-based sparsity in local linear regression. 3. Use data close to the current state trajectory for parameter ID. 4. Use a kernel K() to weight data differently as a function of distance to the linearized trajectory. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 89
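Design steps 3 and 4 amount to a kernel-weighted least-squares fit around the point of interest. The sketch below illustrates this for a scalar model x_next ≈ a·x + b·u. The Epanechnikov kernel, the bandwidth, the function names, and the synthetic data are all assumptions for illustration; the slides only say that some kernel K() is used, and the sparsity step is not shown here.

```python
def epanechnikov(distance, bandwidth):
    """Epanechnikov kernel: positive near the query point, zero beyond the bandwidth."""
    d = distance / bandwidth
    return 0.75 * (1.0 - d * d) if abs(d) < 1.0 else 0.0


def local_linear_fit(data, x_query, bandwidth):
    """Weighted least squares for x_next ~ a * x + b * u near x_query.

    data is a list of (x, u, x_next) samples; samples far from x_query get zero weight.
    """
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, y in data:
        w = epanechnikov(abs(x - x_query), bandwidth)
        sxx += w * x * x
        sxu += w * x * u
        suu += w * u * u
        sxy += w * x * y
        suy += w * u * y
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("not enough nearby data to identify the local model")
    # Solve the 2x2 normal equations with Cramer's rule.
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det
    return a, b


# Synthetic samples from a hypothetical system x_next = 0.9 * x + 0.5 * u.
samples = [(x, u, 0.9 * x + 0.5 * u)
           for x in (0.0, 0.5, 1.0, 1.5, 2.0)
           for u in (-1.0, 1.0)]
a_hat, b_hat = local_linear_fit(samples, x_query=1.0, bandwidth=1.0)
print(a_hat, b_hat)
```

Shrinking the bandwidth makes the fit more local at the price of using fewer samples, which is why the function raises an error when the weighted data cannot identify both parameters.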
Accelerations Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 90
Results Gain from steering to lateral velocity Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 91
About Model Learning Ball in Cup Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 95
Ball in a Cup System with MuJoCo Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 96
Ball in a Cup Control Problem Control objective Start & end position System dynamics System constraints Obstacle avoidance Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 97
Learning Model Predictive Control (LMPC) Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 98
Useful MuJoCo Model Abstraction. Identifying the Dynamical System: Local Linear Regression, Linearization around predicted trajectory. Important Design Steps: 1. Compute the trajectory to linearize around using previous optimal inputs and inputs in the safe set. 2. Enforce model-based sparsity in local linear regression. 3. Use data close to the current state trajectory for parameter ID. 4. Use a kernel K() to weight data differently as a function of distance to the linearized trajectory. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 99
Ball in a Cup System. At iteration 0, find a sequence by sampling parametrized input profiles (takes 5 min). Use ILMPC: at iteration 1, time reduced by 10%, cup height movement reduced by 35%. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 100
Back to our main chart.. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 101
Three Forms of Learning: skill acquisition, performance improvement, reduced load for routine execution. How do we do this? Model Predictive Control + A Simple Idea + Good Practices Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 102
Offline π(∙) and Online π(x) Computation. Option 1 (Offline Based): “Complex” offline, “simple” online. π(∙) often piecewise constant (except special classes). Dynamic programming is one choice. Basic rule: n>5 impossible. Option 2 (Online Based): “Simple” offline, “complex” online. Compute π(x) online with a “sophisticated” algorithm. An interior-point solver is one choice. Basic rule: avoid using “home-made” solvers. Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 103
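The "n>5" rule of thumb for dynamic programming follows from how fast a gridded state space grows with the state dimension n. The arithmetic below assumes 100 grid points per dimension, a number chosen only for illustration.

```python
def grid_size(points_per_dimension, n):
    """Number of grid cells a tabular policy or value function must store."""
    return points_per_dimension ** n


for n in (2, 5, 8):
    print(n, grid_size(100, n))
```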
One Simple Way: Data-Based Policy for π(∙) Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 104
One Simple Way: Data-Based Policy for π(∙) Historical data of converged iterations Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 105
Three Forms of Learning 3 - Computation Load Reduction Lap Time at each iteration Average CPU Load at each iteration Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 107
Experimental Results Factor of 10 Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 108
Data Based Policy: Alternatives Nearest Neighbor Train ReLU Neural Network Local Explicit MPC All Continuous Piecewise Affine Policies Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 109
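The nearest-neighbor alternative is the simplest of these to sketch: store the state and input pairs from converged iterations and, online, return the input recorded at the closest stored state. The data and function names below are hypothetical, and a plain lookup like this carries no safety or optimality guarantee by itself.

```python
def make_nearest_neighbor_policy(data):
    """Build a lookup policy from historical (state, input) pairs.

    data is a list of (state_tuple, input) samples from converged iterations.
    """
    def policy(x):
        # Pick the stored sample whose state is closest in squared Euclidean distance.
        _, u = min(data, key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)))
        return u
    return policy


# Hypothetical historical data: (position, velocity) -> applied input.
history = [((0.0, 0.0), 0.0),
           ((1.0, 0.0), -0.5),
           ((2.0, -1.0), 0.3),
           ((4.0, 0.0), -1.0)]
policy = make_nearest_neighbor_policy(history)
print(policy((0.9, 0.1)))
print(policy((3.5, -0.2)))
```

Evaluating such a policy costs one pass over the stored data instead of solving an optimization problem at every step, which is the source of the computational load reduction.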
Learning MPC: Incorporating data in advanced model-based controllers. What about noise and model uncertainty? Learned from data In Practice Noise and model uncertainty: Robust case Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 110
ILMPC – Robust and Adaptive design Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 111
At Iteration 0 Linear System Terminal Goal Set Successful Iteration Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC – Slide 112