Safe Learning-Based Control using Gaussian Processes



  1. Safe Learning-Based Control using Gaussian Processes. Prof. Angela Schoellig. IFAC World Congress 2020 – Learning for Control Tutorial.

  2. The Future of Automation
     • Large prior uncertainties.
     • Active decision making.
     • Expect safe and high-performance behavior.

  3. Robots in My Lab
     Model uncertainties that limit performance:
     • Unknown terrain and topography
     • Unknown aerodynamic effects
     • Interaction with unknown objects
     • Unknown weather conditions

  4. Learning from data can improve performance.
     [Block diagram labels: iteratively learned reference signal; baseline closed-loop system (baseline controller, system, system state); desired output; actual output.]
     [Plot labels: input and output for different trials; reference; desired trajectory; repetitive error; reference input with earlier and larger amplitude.]
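The idea of correcting the reference signal from trial to trial can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the slides or the cited papers: the first-order closed-loop model in `simulate`, its coefficients, the learning gain, and the simple update "new reference = old reference + gain × last error".

```python
import math

def simulate(reference, a=0.8, b=0.2):
    """Hypothetical stable baseline closed loop: y[t+1] = a*y[t] + b*r[t]."""
    y = [0.0]
    for r in reference:
        y.append(a * y[-1] + b * r)
    return y[1:]  # y[t+1] is the response to r[t]

def learn_reference(desired, trials=30, gain=1.0):
    """Repeat the task, correcting the reference with the previous trial's error."""
    reference = list(desired)  # trial 0: feed the desired output directly
    rms_history = []
    for _ in range(trials):
        actual = simulate(reference)
        error = [d - y for d, y in zip(desired, actual)]
        rms_history.append(math.sqrt(sum(e * e for e in error) / len(error)))
        # Trial-to-trial update of the reference signal.
        reference = [r + gain * e for r, e in zip(reference, error)]
    return reference, rms_history

# Desired output: one period of a sine wave over 50 time steps.
desired = [math.sin(2 * math.pi * t / 50) for t in range(1, 51)]
reference, rms_history = learn_reference(desired)
```

In this sketch the tracking error shrinks over the trials because the repetitive part of the error is fed back into the reference. For this lagging plant the learned reference tends to lead the desired output and to have a larger amplitude.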

  5. Learning from data can improve performance. [Video, 2x speed; same block diagram as on slide 4.]

  6. Learned Triple Flip [ICRA’10] https://youtu.be/bWExDW9J9sA

  7. Learning from data can improve performance.
     • Learning a single task through repetition [ECC’09, IROS’12, AURO’12]
       [Block diagram labels: iteratively learned reference signal; baseline closed-loop system (baseline controller, system, system state); desired output; actual output.]
     • Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]
       [Block diagram labels: desired output; deep neural network; offline learning; reference; baseline closed-loop system (baseline controller, system, system state); actual output.]

  8. Mobile Manipulator Control [IROS’20] http://tiny.cc/ball_catch

  9. Learning from data can improve performance.
     • Learning a single task through repetition [ECC’09, IROS’12, AURO’12]
     • Offline learning of inverse model [ICRA’16, CDC’17, RAL’18, ECC’19]
     Properties noted on the slide:
     • Input-output stability if baseline system is stable
     • Baseline controller required
     • Training phase
     • Acausal corrections possible
     • State constraints not considered

  10. Problem Statement
      Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.
      Considered system dynamics: [equation not recovered from the slide], with a-priori given sets.
      Compare to (simplified view):
      • Robust control: finds controller that achieves stability and performance for all possible [symbol not recovered]
      • Adaptive control: estimates [symbol not recovered] and uses estimate in controller
      Key features:
      • Nonparametric model
      • Improved performance with more data

  11. Approach
      • Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
      • Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
      • Lyapunov analysis (closed-loop stability of learned models): defining and analyzing safety
      = safe model-based reinforcement learning
      [Block diagram labels: stochastic disturbance model; robust controller; system; desired output; actual output; system state.]

  13. Gaussian Process
      Theorem (informally): The function is contained in the scaled Gaussian process confidence intervals with probability at least [value not recovered from the slide].
      Reference: N. Srinivas, A. Krause, S. Kakade, M. Seeger, “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design,” ICML 2010.
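For orientation, the informal statement can be written as a formula. This is a sketch in the notation of the cited ICML 2010 paper, not a transcription of the slide: the posterior mean $\mu_{t-1}$, posterior standard deviation $\sigma_{t-1}$, scaling factor $\beta_t$, failure probability $\delta$, and domain $D$ are symbols assumed here. How $\beta_t$ must be chosen depends on the assumptions made about $f$.

```latex
\Pr\Big[\; |f(x) - \mu_{t-1}(x)| \;\le\; \beta_t^{1/2}\, \sigma_{t-1}(x)
\quad \forall x \in D,\ \forall t \ge 1 \;\Big] \;\ge\; 1 - \delta
```

Read this way, "scaled confidence interval" means the interval $\mu_{t-1}(x) \pm \beta_t^{1/2} \sigma_{t-1}(x)$.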

  14. Gaussian Process
      • Can model arbitrary smooth functions.
      • For a given input, it provides an interval in which the function value lies with high probability.
      • As more data is gathered, the uncertainty is reduced.
      Our model framework for developing reinforcement learning algorithms with safety guarantees.
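The properties listed on this slide can be made concrete with a small Gaussian process regression sketch in NumPy. The squared-exponential kernel, its hyperparameters, the noise level, the fixed interval scaling of 2, and the test function `np.sin` are all illustrative assumptions and do not come from the slides.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def confidence_interval(x_train, y_train, x_query, scale=2.0):
    """Interval mean +/- scale * std; the scale of 2 is an assumed value."""
    mean, std = gp_posterior(x_train, y_train, x_query)
    return mean - scale * std, mean + scale * std

f = np.sin                             # stand-in for the unknown function
x_query = np.array([1.0])
x_few = np.array([-2.0, 0.0, 2.0])     # three samples
x_many = np.linspace(-2.0, 2.0, 9)     # nine samples, including the three above
lo_few, hi_few = confidence_interval(x_few, f(x_few), x_query)
lo_many, hi_many = confidence_interval(x_many, f(x_many), x_query)
```

Comparing `hi_few - lo_few` with `hi_many - lo_many` shows the interval at the query point narrowing once more data is available.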

  15. Approach
      • Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
      • Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
          1. Linear
          2. Nonlinear
          3. Nonlinear, predictive
      • Lyapunov analysis (closed-loop stability of learned models): defining and analyzing safety
      = safe model-based reinforcement learning

  16. Linear Robust Control [ECC’15]
      • Gaussian Process Model
      • Linear Robust Control
          • Task: stabilization of an operating point
          • Linear robust control: linearization about operating point
      • Local Stability Guarantees
          • Local asymptotic stability around true operating point with high probability
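The step "linearization about operating point" can be sketched numerically. The pendulum-like `dynamics` function below is a hypothetical stand-in for a learned model, and the feedback gain `K` is hand-picked. Neither comes from the slides or from [ECC’15]. The check covers only nominal stability of the linearization, not the robust design that accounts for the Gaussian process uncertainty.

```python
import numpy as np

def dynamics(x, u):
    """Hypothetical continuous-time model dx/dt = f(x, u) (stand-in for a learned mean)."""
    theta, omega = x
    return np.array([omega, np.sin(theta) - 0.1 * omega + u[0]])

def linearize(f, x_op, u_op, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at the operating point."""
    n, m = len(x_op), len(u_op)
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x_op + dx, u_op) - f(x_op - dx, u_op)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        B[:, j] = (f(x_op, u_op + du) - f(x_op, u_op - du)) / (2 * eps)
    return A, B

x_op = np.zeros(2)   # operating point (assumed to be an equilibrium for u = 0)
u_op = np.zeros(1)
A, B = linearize(dynamics, x_op, u_op)

K = np.array([[3.0, 2.0]])  # hand-picked state-feedback gain (assumption)
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
# Local asymptotic stability of the linearization: all eigenvalues in the left half-plane.
locally_stable = bool(np.all(closed_loop_eigs.real < 0))
```

For this example the open-loop linearization has an eigenvalue with positive real part, while the closed loop under `u = -K x` does not.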

  17. Linear Robust Control [ECC’15] https://youtu.be/YqhLnCm0KXY

  19. Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]
      • Model / Assumptions
          • Differentially flat, control-affine real dynamics and prior model
          • Gaussian process models inverse nonlinear mismatch
      • Linear Robust Control
          • Task: high-performance tracking
          • Linear robust control for feedback-linearized system
      • Global Tracking Guarantees
          • Tracking error is uniformly ultimately bounded with high probability
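For readers unfamiliar with the term, the standard textbook notion of uniform ultimate boundedness can be stated as follows. The symbols $e(t)$ for the tracking error, $b$ for the ultimate bound, and $a$, $c$, $T$, $t_0$ are assumed here and do not appear in the slide text. The precise statement and bound in [L-CSS’20] may differ.

```latex
\exists\, b, c > 0 \ \text{such that}\ \forall a \in (0, c)\ \ \exists\, T = T(a, b) \ge 0 :
\qquad \|e(t_0)\| \le a \;\Longrightarrow\; \|e(t)\| \le b \quad \forall t \ge t_0 + T .
```

The property is called global if it holds for arbitrarily large $a$. In words: after a finite time $T$, the error stays within the bound $b$, no matter when the motion starts.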

  20. Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20] (text as on slide 19)
      [Block diagram labels: Gaussian process; bound; robustness term; LQR; feedback linearization; inverse nonlinear mismatch; differentially flat system; nominal linear dynamics; nonlinear mismatch; desired input; actual input.]

  21. Nonlinear Robust Control for Differentially Flat Systems [L-CSS’20]
      Cart-pendulum example with model parameter uncertainties: robust, online learning control with global guarantees on tracking error.
      • Predictive capabilities
      • State constraints

  22. Robust Predictive Control [IJRR’16, JFR’16]
      • Gaussian Process Model
      • Nonlinear, Robust Model Predictive Control
          • Task: high-performance tracking
          • Approximations in prediction and nonlinear optimization step
      • Guarantees [e.g., Tomlin’13, Krause’18, Zeilinger’18]
          • Robustly asymptotically stable
          • Robust constraint satisfaction
          • Recursively guaranteeing the existence of safe control actions
      [Figure: Unscented Transform for prediction]
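The slide names the unscented transform as the prediction tool. A minimal sketch of the basic transform is given below, using 2n+1 sigma points and a single tuning parameter `kappa`. The one-step robot model `step`, the covariance values, and the choice `kappa=1.0` are illustrative assumptions. The variant and model used in [IJRR’16, JFR’16] may differ.

```python
import numpy as np

def unscented_transform(mean, cov, func, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through func using 2n+1 sigma points."""
    n = len(mean)
    # Columns of the Cholesky factor of (n + kappa) * cov are the sigma-point offsets.
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    sigma_points = [mean]
    for i in range(n):
        sigma_points.append(mean + sqrt_cov[:, i])
        sigma_points.append(mean - sqrt_cov[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    outputs = np.array([func(p) for p in sigma_points])
    out_mean = weights @ outputs
    centered = outputs - out_mean
    out_cov = (weights[:, None] * centered).T @ centered
    return out_mean, out_cov

def step(state, speed=1.0, dt=0.1):
    """Hypothetical one-step robot model: move along the current heading."""
    x, y, heading = state
    return np.array([x + speed * dt * np.cos(heading),
                     y + speed * dt * np.sin(heading),
                     heading])

mean = np.array([0.0, 0.0, 0.0])
cov = np.diag([0.01, 0.01, 0.04])   # assumed state uncertainty
pred_mean, pred_cov = unscented_transform(mean, cov, step)
```

For a linear map the transform reproduces the exact mean and covariance, which makes a convenient sanity check. For a nonlinear model such as `step` the result is an approximation, in line with the "approximations in prediction" noted on the slide.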

  23. Robust Predictive Control [IJRR’16, JFR’16]
      Example: Mobile robot path following
      • Problem setup: [figure]
      • Learning: [figure labels: driving too fast; slow down for safety; faster driving after learning]

  24. Robust Predictive Control [IJRR’16, JFR’16] https://youtu.be/3xRNmNv5Efk

  25. Summary
      Design a controller for systems with prior uncertainty that learns online and continuously improves performance while satisfying safety constraints.
      • Gaussian processes (reliable confidence intervals): nonparametric model for unknown model error
      • Robust control (stability & performance under uncertainty): algorithm to safely acquire data and optimize task
      • Lyapunov stability (closed-loop stability of learned models): defining and analyzing safety
      1. Linear: local stability guarantees
      2. Nonlinear: global tracking error guarantees
      3. Nonlinear, predictive: probabilistic constraint satisfaction and stability

  26. Acknowledgements
      www.dynsyslab.org
      Senior collaborators: Andreas Krause, Tim Barfoot, Raffaello D’Andrea
      Funding: [sponsor logos on the slide]

  27. Other Learning Control Results from My Lab
      • Systems with changing dynamics [ICRA’17, IROS’18, RAL’18, JACSP’19, RAL’19]
      • Transfer learning between similar systems (similarity metric from robust control) [IROS’17, ICRA’17, RAL’18, ACSP’18]
      • Collaborative learning of interconnected systems [AURO’19]
      • Active learning [ICRA’16, NeurIPS’17, CDC’19]
      M. Paton, “Expanding the Limits of Vision-Based Autonomous Path Following,” 2017.
