Learning convex bounds for linear quadratic control policy synthesis
Jack Umenberger, Thomas B. Schön
Learning to control: from data (observations of the dynamical system), learn a controller (e.g., stabilize the upright equilibrium position).

NeurIPS 2018, Thu Dec 6th, 05:00 -- 07:00 PM @ Room 210 & 230, Poster #166
Problem set-up

System dynamics: x_{t+1} = A x_t + B u_t + w_t, with w_t ∼ N(0, Π).

Goal: find a static state-feedback controller, u = Kx, to minimize

lim_{T→∞} (1/T) ∑_{t=0}^{T} E[x_t′ Q x_t + u_t′ R u_t].

Challenge: we don't know the system parameters θ = {A, B, Π}.
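For a fixed model θ = {A, B, Π} and a stabilizing gain K, the infinite-horizon average cost has a closed form: it equals tr(ΠP), where P solves the discrete Lyapunov equation P = Q + K′RK + (A+BK)′P(A+BK). A minimal sketch of this computation (the function name is illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_cost(A, B, K, Q, R, Pi):
    """Infinite-horizon average LQR cost of u = Kx for x+ = Ax + Bu + w, w ~ N(0, Pi).

    Solves P = (A+BK)' P (A+BK) + Q + K'RK; the cost is tr(Pi P).
    Assumes A + BK is stable (spectral radius < 1).
    """
    Acl = A + B @ K
    Qbar = Q + K.T @ R @ K
    # solve_discrete_lyapunov(a, q) returns X satisfying X = a X a' + q
    P = solve_discrete_lyapunov(Acl.T, Qbar)
    return np.trace(Pi @ P)
```

For example, with scalar A = 0.5, B = 1, K = -0.2, Q = R = Π = 1, the closed loop is 0.3 and the cost works out to 1.04/0.91.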
Learning from data

Apply inputs u_{0:T} to the system x_{t+1} = A x_t + B u_t + w_t, w_t ∼ N(0, Π), and record the resulting states x_{0:T}, giving the dataset D := {u_{0:T}, x_{0:T}}.

From this data we can form the posterior belief over model parameters: posterior(θ | D).

Instead of optimizing the cost for fixed parameters, cost(K | θ), we can optimize the expected cost over the posterior:

cost_avg(K) = ∫ cost(K | θ) posterior(θ | D) dθ.
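Under Gaussian noise with known covariance Π and a flat prior, the posterior over the parameter matrix Θ = [A B] given D is matrix normal around the least-squares estimate, so posterior samples are cheap to draw. A sketch under those assumptions (the data-generating system and helper names here are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from a "true" system (illustrative values)
n, m, T = 2, 1, 200
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
Pi = 0.1 * np.eye(n)

x = np.zeros((n, T + 1))
u = rng.standard_normal((m, T))      # exciting input
Lw = np.linalg.cholesky(Pi)
for t in range(T):
    x[:, t + 1] = A_true @ x[:, t] + B_true @ u[:, t] + Lw @ rng.standard_normal(n)

# Least-squares estimate of Theta = [A B] from regressors z_t = [x_t; u_t]
Z = np.vstack([x[:, :T], u])         # (n+m) x T
Y = x[:, 1:]                         # n x T
S = np.linalg.inv(Z @ Z.T)           # posterior column covariance
Theta_hat = Y @ Z.T @ S

def sample_posterior(rng):
    """Draw Theta_i = [A_i B_i] from the matrix-normal posterior
    MN(Theta_hat, Pi, S) (flat prior, known noise covariance Pi)."""
    E = rng.standard_normal((n, n + m))
    Theta = Theta_hat + np.linalg.cholesky(Pi) @ E @ np.linalg.cholesky(S).T
    return Theta[:, :n], Theta[:, n:]

A_i, B_i = sample_posterior(rng)
```

Each call to `sample_posterior` yields one model θ_i = {A_i, B_i, Π}, which is exactly what the Monte Carlo approximation on the next slide consumes.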
Convex upper bounds

Approximate the expected cost by a Monte Carlo average over posterior samples θ_i ∼ posterior(θ | D):

cost_avg(K) ≈ cost_mc(K) := (1/M) ∑_{i=1}^{M} cost(K | θ_i).

At each iterate K^{(k)}, construct a convex upper bound cost_bound(K | K^{(k)}) on cost_mc(K) that is tight at K^{(k)}; minimizing the bound yields the next iterate K^{(k+1)}, and repeating gives a sequence of policies with decreasing surrogate cost.
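The Monte Carlo surrogate is just an average of M closed-form LQR costs, one per posterior sample. A self-contained sketch (the sampled models below are made up for illustration; function names are ours):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_cost(A, B, K, Q, R, Pi):
    """cost(K | theta): tr(Pi P) with P = (A+BK)' P (A+BK) + Q + K'RK."""
    Acl = A + B @ K
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    return np.trace(Pi @ P)

def cost_mc(K, samples, Q, R, Pi):
    """Monte Carlo estimate of the posterior-averaged cost:
    (1/M) * sum_i cost(K | theta_i), with theta_i ~ posterior(theta | D)."""
    return np.mean([lqr_cost(A_i, B_i, K, Q, R, Pi) for A_i, B_i in samples])

# Illustrative: two posterior samples of a scalar system
samples = [(np.array([[0.5]]), np.array([[1.0]])),
           (np.array([[0.6]]), np.array([[1.0]]))]
Q = R = Pi = np.eye(1)
K = np.array([[-0.3]])
print(cost_mc(K, samples, Q, R, Pi))  # ≈ 1.1666
```

Note that `cost_mc` is not convex in K, which is why the paper replaces it with convex upper bounds rather than minimizing it directly.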
Convexification

The crux of the problem is the matrix inequality

[ X_i − Q        (A_i + B_i K)′   K′     ]
[ A_i + B_i K    X_i^{-1}         0      ]  ⪰  0,
[ K              0                R^{-1} ]

where Q, R, A_i, B_i are known quantities and K, X_i are the decision variables.

• Replace the 'problematic' term X_i^{-1} with a (linear) Taylor series approximation.
• This leads to a new linear matrix inequality with a smaller feasible set.
• Hence: a convex upper bound.
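The Taylor-series step can be made explicit. Since X ↦ X^{-1} is matrix convex on positive definite matrices, its first-order expansion about the current iterate X^{(k)} is a global lower bound (this is the standard linearization of the matrix inverse; notation follows the slide):

```latex
\[
  X^{-1} \;\succeq\; 2\,\bigl(X^{(k)}\bigr)^{-1}
  \;-\; \bigl(X^{(k)}\bigr)^{-1}\, X \,\bigl(X^{(k)}\bigr)^{-1},
  \qquad \text{with equality at } X = X^{(k)}.
\]
```

Substituting the right-hand side, which is linear in X, for X_i^{-1} tightens the matrix inequality: any (K, X_i) feasible for the linearized LMI is also feasible for the original inequality, so the feasible set shrinks and the optimal value can only increase. This is what yields a convex upper bound that is tight at the current iterate.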
Performance

[Plot omitted: performance improves as more data is available for learning.]
Poster presentation

Poster #166, today 05:00 -- 07:00 PM @ Room 210 & 230.