Online Control with Adversarial Disturbances
Naman Agarwal, Google AI Princeton
Joint work with Brian Bullins, Elad Hazan, Sham Kakade, Karan Singh
Dynamical Systems with Control
$x_{t+1} = f(x_t, u_t)$, where $x_t$ is the state and $u_t$ is the control
• Robotics
• Autonomous Vehicles
• Data Center Cooling [Cohen et al. '18]
! " : State Our Setting ) " : Control Robustly Control a Noisy Linear Dynamical System • Known Dynamics ! "#$ = &! " + () " + * " • Fully Observable State
! " : State Our Setting ) " : Control Robustly Control a Noisy Linear Dynamical System • Known Dynamics ! "#$ = &! " + () " + * " • Fully Observable State Disturbance + , adversarially chosen ( ||+ , || ≤ / )
! " : State Our Setting ) " : Control Robustly Control a Noisy Linear Dynamical System • Known Dynamics ! "#$ = &! " + () " + * " • Fully Observable State Disturbance 0 1 adversarially chosen ( ||0 1 || ≤ 4 ) • Online and Adversarial Minimize Costs - ∑ , " (! " , ) " ) • General Convex Function
! " : State Our Setting ) " : Control Robustly Control a Noisy Linear Dynamical System • Known Dynamics ! "#$ = &! " + () " + * " • Fully Observable State Disturbance 0 1 adversarially chosen ( ||0 1 || ≤ 4 ) • Online and Adversarial Minimize Costs - ∑ , " (! " , ) " ) • General Convex Function vs. Linear Quadratic Regulator (LQR ): Adversarial vs Random Disturbance Online, Convex Costs vs Known Quadratic Loss
Goal – Minimize Regret
• Fixed time horizon $T$
• Produce actions $u_1, u_2, \ldots, u_T$ to minimize regret with respect to the best policy in hindsight:
$$\sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \min_{K} \sum_{t=1}^{T} c_t\big(x_t(K),\, K x_t(K)\big)$$
• $u_t$ only knows $w_1, \ldots, w_t$, whereas the best linear policy is chosen knowing $w_1, \ldots, w_T$ (linear policies are optimal for LQR)
• Counterfactual regret: the comparator's state $x_t(K)$ depends on $K$
A sketch of how this comparator can be evaluated follows below.
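The comparator in the regret above can be evaluated by replaying the realized disturbances under a fixed linear policy and taking the best $K$ from a candidate set. This is a hedged sketch: the function names and the idea of searching over a finite candidate set are illustrative, not part of the talk (and it uses the $u_t = -K x_t$ sign convention, equivalent to the slide's $K x_t(K)$ up to relabeling $K$).

```python
# Counterfactual comparator: replay the same disturbance sequence w_1..w_T under a
# fixed linear policy, then take the best K in hindsight.
import numpy as np

def policy_cost(K, A, B, disturbances, cost):
    """Counterfactual cost of u_t = -K x_t(K) on the realized disturbance sequence."""
    x, total = np.zeros(A.shape[0]), 0.0
    for w in disturbances:
        u = -(K @ x)
        total += cost(x, u)
        x = A @ x + B @ u + w      # x_{t+1}(K) = (A - B K) x_t(K) + w_t
    return total

def best_in_hindsight(candidates, A, B, disturbances, cost):
    """Best fixed linear policy once the whole disturbance sequence is known."""
    return min(policy_cost(K, A, B, disturbances, cost) for K in candidates)

# Regret = (cost incurred online, where u_t is chosen knowing only w_1..w_t)
#          - best_in_hindsight(...), which sees all of w_1..w_T.
```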
Previous work: $H_\infty$ Control
• A min-max problem against the worst-case perturbation: $\min_{\pi} \max_{w_{1:T}} \sum_t c_t(x_t, u_t)$, with the disturbances $w_{1:T}$ chosen adversarially
• Compute: closed form for quadratic costs, but difficult for general costs
• Adaptivity: $H_\infty$ is pessimistic, whereas regret adapts to a favorable disturbance sequence
Main Result
An efficient online algorithm producing $u_1, \ldots, u_T$ such that
$$\sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \min_{K \in \mathrm{Linear}} \sum_{t=1}^{T} c_t\big(x_t(K),\, K x_t(K)\big) \;\le\; O(\sqrt{T})$$
• Convexity through an improper relaxation
• Efficient → polynomial in the system parameters, logarithmic in $T$
Outline of the approach
1. Improper learning: can we even compute the best-in-hindsight policy? Use a "relaxed" policy class: the next control is a linear function of the previous disturbances $w_t$.
2. Strong stability ⇒ an error-feedback policy: learn the change to the action via a "small horizon" of previous disturbances.
3. Small horizon ⇒ an efficient reduction to Online Convex Optimization (OCO) with memory [Anava et al.]
A simplified code sketch of this disturbance-action policy follows below.
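Below is a simplified sketch, not the paper's exact algorithm, of the disturbance-action policy $u_t = -K x_t + \sum_{i=1}^{H} M_i w_{t-i}$ with the $M_i$ updated by online gradient descent. For brevity the gradient is taken only through $u_t$; the actual reduction to OCO with memory also accounts for how past $M_i$'s affect the current state. All names and hyperparameters ($H$, $\eta$, $Q$, $R$) are assumptions made for illustration.

```python
# Simplified disturbance-action controller: u_t = -K x_t + sum_i M_i w_{t-i},
# with the M_i's updated by online gradient descent (OGD). Not the exact
# algorithm from the paper; the gradient here flows only through u_t.
import numpy as np

def run_disturbance_action_controller(A, B, K, T, H, eta, disturbance, Q, R):
    n, m = B.shape
    M = [np.zeros((m, n)) for _ in range(H)]     # learned disturbance-feedback terms
    past_w = [np.zeros(n) for _ in range(H)]     # window of the last H disturbances
    x, total_cost = np.zeros(n), 0.0
    for t in range(T):
        u = -(K @ x) + sum(M[i] @ past_w[i] for i in range(H))
        total_cost += x @ Q @ x + u @ R @ u      # convex (here quadratic) cost
        x_next = A @ x + B @ u + disturbance(t)  # adversary supplies the bounded w_t
        # w_t is recoverable because A, B are known and the state is fully observed:
        w = x_next - A @ x - B @ u
        grad_u = 2.0 * (R @ u)                   # d cost / d u for the quadratic cost
        for i in range(H):
            M[i] -= eta * np.outer(grad_u, past_w[i])   # OGD step on the surrogate loss
        past_w = [w] + past_w[:-1]               # shift the disturbance window
        x = x_next
    return total_cost
```

With the same $A$, $B$, $K$, and disturbance sequence as in the earlier sketch, the cumulative cost of this controller can be compared directly against any fixed linear policy's counterfactual cost.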
Thank You!
For more details, please visit the poster: Pacific Ballroom #155
namanagarwal@google.com