Safe Learning-Based Control using Gaussian Processes
Prof. Angela Schoellig
IFAC World Congress 2020 – Learning for Control Tutorial

The Future of Automation
• Large prior uncertainties.
• Active decision making.
• Expect safe and high-performance behavior.

Robots in My Lab
Model uncertainties that limit performance:
• Unknown terrain and topography
• Unknown aerodynamic effects
• Interaction with unknown objects
• Unknown weather conditions

Learning from data can improve performance.
[Figure: baseline closed-loop system (baseline controller, system, system state feedback) driven by an iteratively learned reference input signal; desired output vs. actual output. Plot: output for different trials against the desired trajectory, showing the repetitive error.]
Reference input with earlier and larger amplitude.

Learning from data can improve performance.
[Video, 2x speed. Figure: baseline closed-loop system (baseline controller, system, system state feedback) driven by an iteratively learned reference input signal; desired output vs. actual output.]
Reference input with earlier and larger amplitude.

Learned Triple Flip [ICRA’10] https://youtu.be/bWExDW9J9sA
Learning from data can improve performance.
• Learning a single task through repetition [ECC’09, IROS’12, AURO’12]
  [Figure labels: Desired Output, Iteratively Learned Reference Signal, Baseline Closed-Loop System (Baseline Controller, System, System State), Actual Output]
• Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]
  [Figure labels: Desired Output, Deep Neural Network (Offline Learning), Reference, Baseline Closed-Loop System (Baseline Controller, System, System State), Actual Output]

Mobile Manipulator Control [IROS’20] http://tiny.cc/ball_catch
Learning from data can improve performance.
• Learning a single task through repetition [ECC’09, IROS’12, AURO’12]
• Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]
Properties of these approaches:
• Input-output stability if baseline system is stable
• Baseline controller required
• Training phase
• Acausal corrections possible
• State constraints not considered

Problem Statement
Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.
Considered system dynamics: [equation not preserved]
Key features:
• Nonparametric model
• Improved performance with more data
Compare to (simplified view): [equation not preserved] with a-priori given sets
• Robust control: finds controller that achieves stability and performance for all possible [...]
• Adaptive control: estimates [...] and uses estimate in controller

Approach
• Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
• Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
• Lyapunov analysis (closed-loop safety): defining and analyzing stability of learned models
= safe model-based reinforcement learning
[Figure labels: Stochastic Disturbance Model, Robust Controller, System, Desired Output, Actual Output, System State]

Gaussian Process
Theorem (informally): The function is contained in the scaled Gaussian process confidence intervals with probability at least [...].
Reference: N. Srinivas, A. Krause, S. Kakade, M. Seeger, "Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design," ICML 2010.

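For orientation, the confidence bound in the cited ICML 2010 paper is usually written in the form below. The slide's own symbols are not preserved, so every symbol here is an assumption following that paper's common notation: f is the unknown function, mu_{n-1} and sigma_{n-1} are the GP posterior mean and standard deviation after n-1 observations, beta_n is the scaling factor, D is the input domain, and delta is the allowed failure probability.

```latex
\Pr\Big\{\, |f(x) - \mu_{n-1}(x)| \le \beta_n^{1/2}\, \sigma_{n-1}(x)
\quad \forall x \in \mathcal{D},\ \forall n \ge 1 \,\Big\} \;\ge\; 1 - \delta
```

The scaling factor beta_n depends on delta and on the regularity assumptions placed on f; its exact expression differs between the settings treated in the paper and is not reproduced here.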
Gaussian Process
• Can model arbitrary smooth functions.
• For a given input, it provides an interval in which the function value lies with high probability.
• As more data is gathered, the uncertainty is reduced.
Our model framework for developing reinforcement learning algorithms with safety guarantees.

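The three properties above can be illustrated with a minimal NumPy sketch of GP regression. This is not the code used in the talk: the squared-exponential kernel, its hyperparameters, the test function sin(x), the noise level, and the interval scale of 2.0 are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical data: noisy samples of sin(x), standing in for an unknown model error.
rng = np.random.default_rng(0)
x_all = np.linspace(-3.0, 3.0, 15)
y_all = np.sin(x_all) + 0.1 * rng.standard_normal(x_all.size)
x_query = np.linspace(-3.0, 3.0, 50)
scale = 2.0  # assumed scaling of the confidence interval

# Posterior from 3 data points, then from all 15 (a superset of the 3).
mean_few, std_few = gp_posterior(x_all[::5], y_all[::5], x_query)
mean_many, std_many = gp_posterior(x_all, y_all, x_query)

# Interval at each query input: [mean - scale * std, mean + scale * std].
width_few = 2.0 * scale * std_few.mean()
width_many = 2.0 * scale * std_many.mean()
print(f"average interval width with 3 points:  {width_few:.3f}")
print(f"average interval width with 15 points: {width_many:.3f}")
```

With fixed hyperparameters, adding data can only shrink the posterior standard deviation, which is why the second interval is narrower at every query input.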
Approach
• Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
• Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
  1. Linear
  2. Nonlinear
  3. Nonlinear, predictive
• Lyapunov analysis (closed-loop safety): defining and analyzing stability of learned models
= safe model-based reinforcement learning

Linear Robust Control [ECC’15]
• Gaussian Process Model
• Linear Robust Control
  • Task: stabilization of an operating point
  • Linear robust control: linearization about operating point
• Local Stability Guarantees
  • Local asymptotic stability around true operating point with high probability

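One way to picture the "linearization about operating point" step is to differentiate the GP posterior mean analytically. The sketch below does this for a scalar input with a squared-exponential kernel. It is an illustration under stated assumptions, not the method of [ECC’15]: the kernel, the hypothetical model error sin(x), the operating point 0, and all function names are made up for this example, and the robust controller synthesis itself is not shown.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_gp(x_train, y_train, noise_var=1e-4, length_scale=1.0):
    """Weights alpha such that the posterior mean is k(x, X) @ alpha."""
    K = rbf_kernel(x_train, x_train, length_scale)
    return np.linalg.solve(K + noise_var * np.eye(len(x_train)), y_train)

def gp_mean(x, x_train, alpha, length_scale=1.0):
    """Posterior mean at scalar or array input x (returns an array)."""
    return rbf_kernel(np.atleast_1d(x), x_train, length_scale) @ alpha

def gp_mean_slope(x0, x_train, alpha, length_scale=1.0):
    """Analytic derivative of the posterior mean at the scalar point x0."""
    k = rbf_kernel(np.array([x0]), x_train, length_scale)[0]
    dk = -(x0 - x_train) / length_scale ** 2 * k  # d/dx of k(x, x_i) at x0
    return float(dk @ alpha)

# Hypothetical unknown model error g(x) = sin(x), sampled without noise.
x_train = np.linspace(-2.0, 2.0, 25)
y_train = np.sin(x_train)
alpha = fit_gp(x_train, y_train)

x_op = 0.0  # assumed operating point
offset = gp_mean(x_op, x_train, alpha)[0]
slope = gp_mean_slope(x_op, x_train, alpha)

# Local linear model of the learned error: g(x) ~ offset + slope * (x - x_op)
print(f"offset = {offset:.4f}, slope = {slope:.4f}")
```

A linear robust design would additionally need bounds on how far the true slope can deviate from this estimate; those would come from the GP's uncertainty and are outside this sketch.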
Linear Robust Control [ECC’15] https://youtu.be/YqhLnCm0KXY
Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]
• Model / Assumptions
  • Differentially flat, control-affine real dynamics and prior model
  • Gaussian Process models inverse nonlinear mismatch
• Linear Robust Control
  • Task: high-performance tracking
  • Linear robust control for feedback-linearized system
• Global Tracking Guarantees
  • Tracking error is uniformly ultimately bounded with high probability

Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]
• Model / Assumptions
  • Differentially flat, control-affine real dynamics and prior model
  • Gaussian Process models inverse nonlinear mismatch
• Linear Robust Control
  • Task: high-performance tracking
  • Linear robust control for feedback-linearized system
• Global Tracking Guarantees
  • Tracking error is uniformly ultimately bounded with high probability
[Figure: block diagram with LQR, feedback linearization, Gaussian process with bound, robustness term, inverse nonlinear mismatch, and the differentially flat system (nominal dynamics plus nonlinear mismatch).]

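For readers unfamiliar with the term, "uniformly ultimately bounded" is usually defined as below in its global form. The symbols are assumptions, not notation from the slides: e(t) is the tracking error, t_0 the initial time, b the ultimate bound, and T the time after which the bound holds.

```latex
\exists\, b > 0 \ \text{such that}\ \forall a > 0 \ \exists\, T = T(a, b) \ge 0 :\qquad
\|e(t_0)\| \le a \;\Longrightarrow\; \|e(t)\| \le b \quad \forall t \ge t_0 + T
```

In words: from any initial error, the error enters a ball of radius b within finite time and stays there, with b and T independent of t_0. In the slide's setting this statement holds with high probability rather than with certainty.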
Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]
Cart-pendulum example with model parameter uncertainties: [figure]
Robust, online learning control with global guarantees on tracking error.
• Predictive capabilities
• State constraints

Robust Predictive Control [IJRR’16, JFR’16]
• Gaussian Process Model
• Nonlinear, Robust Model Predictive Control
  • Task: high-performance tracking
  • Approximations in prediction and nonlinear optimization step
• Guarantees [e.g., Tomlin’13, Krause’18, Zeilinger’18]
  • Robustly asymptotically stable
  • Robust constraint satisfaction
  • Recursively guaranteeing the existence of safe control actions
Unscented Transform for prediction.

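The unscented transform mentioned above propagates a mean and covariance through a nonlinear function by evaluating it at a small set of sigma points. The sketch below is a generic textbook version, not the variant used in [IJRR’16, JFR’16]: the parameters alpha, beta, kappa, the two-state (position, heading) example, and the function names are assumptions for illustration.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=2.0, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov)."""
    n = mean.shape[0]
    lam = alpha ** 2 * (n + kappa) - n

    # Sigma points: the mean, plus and minus the columns of a matrix square root.
    S = np.linalg.cholesky((n + lam) * cov)
    sigma_points = np.vstack([mean, mean + S.T, mean - S.T])  # shape (2n+1, n)

    # Weights for the mean (wm) and the covariance (wc).
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha ** 2 + beta)

    # Push each sigma point through f and recombine.
    Y = np.array([f(s) for s in sigma_points])
    y_mean = wm @ Y
    d = Y - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov

def one_step(x, dt=0.1, speed=1.0, turn_rate=0.5):
    """Hypothetical one-step prediction for a (position, heading) state."""
    position, heading = x
    return np.array([position + dt * speed * np.cos(heading),
                     heading + dt * turn_rate])

mu = np.array([0.0, 0.3])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
pred_mean, pred_cov = unscented_transform(mu, Sigma, one_step)
print("predicted mean:", pred_mean)
print("predicted covariance:\n", pred_cov)
```

For a linear function the transform reproduces the exact mean and covariance, which makes a convenient sanity check. In a predictive controller this step would be repeated along the horizon, with the learned model's uncertainty entering the propagated covariance.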
Robust Predictive Control [IJRR’16, JFR’16]
Example: Mobile robot path following
• Problem setup: [figure]
• Learning: [figure panels: Driving too fast; Slow down for safety; Faster driving after learning]

Robust Predictive Control [IJRR’16, JFR’16] https://youtu.be/3xRNmNv5Efk
Summary
Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.
• Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
• Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
  1. Linear: local stability guarantees
  2. Nonlinear: global tracking error guarantees
  3. Nonlinear, predictive: probabilistic constraint satisfaction and stability
• Lyapunov stability (closed-loop safety): defining and analyzing stability of learned models

Acknowledgements
www.dynsyslab.org
Senior collaborators: Andreas Krause, Tim Barfoot, Raffaello D’Andrea
Funding: [logos]

Other Learning Control Results from My Lab
• Systems with changing dynamics [ICRA’17, IROS’18, RAL’18, JACSP’19, RAL’19]
• Transfer learning between similar systems (similarity metric from robust control) [IROS’17, ICRA’17, RAL’18, ACSP’18]
• Collaborative learning of interconnected systems [AURO’19]
• Active learning [ICRA’16, NeurIPS’17, CDC’19]
M. Paton, “Expanding the Limits of Vision-Based Autonomous Path Following,” 2017.