

  1. Numerical approximation for optimal control problems via MPC and HJB
     Giulia Fabrini
     Konstanz Women In Mathematics, 15 May 2018
     G. Fabrini (University of Konstanz), Numerical approximation for OCP, 1 / 33

  2. Outline
     1 Introduction and motivations
     2 Hamilton-Jacobi-Bellman
     3 Model Predictive Control
     4 Numerical Tests

  3. Outline (section divider: Introduction and motivations)

  4. Introduction and Motivations
     Some history: the method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s. Control theory analyzes the properties of controlled systems, i.e. dynamical systems on which we can act through a control.
     Aim: bring the system from an initial state to a given final state while satisfying some criteria.
     Applications: robotics, aeronautics, electrical and aerospace engineering, biology and medicine.

  5. Controlled dynamical system
         ẏ(t) = F(y(t), u(t)),  t > 0
         y(0) = x,              x ∈ R^n
     Assumptions on the data:
     - u(·) ∈ U is the control, where U := { u(·) : [0, +∞) → U measurable };
     - F : R^n × U → R^n is the dynamics, which satisfies:
         F is continuous with respect to (y, u),
         F is locally bounded,
         F is Lipschitz with respect to y.
     ⇒ there exists a unique solution y_x(t, u) of the problem (Carathéodory theorem).
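As a concrete illustration of simulating such a controlled system, the trajectory y_x(t, u) can be approximated with an explicit Euler step. The scalar dynamics F(y, u) = -y + u and the zero control used below are illustrative assumptions, not part of the slides.

```python
import numpy as np

# Illustrative dynamics (not from the slides): y' = -y + u.
def F(y, u):
    return -y + u

def simulate(x, u_func, dt=0.01, T=5.0):
    """Explicit Euler approximation of the trajectory y_x(t, u) on [0, T]."""
    n = int(round(T / dt))
    y = np.empty(n + 1)
    y[0] = x
    for k in range(n):
        y[k + 1] = y[k] + dt * F(y[k], u_func(k * dt))
    return y

# With u = 0 the exact solution is y(t) = x e^{-t}, so the Euler
# trajectory should decay toward zero.
traj = simulate(x=2.0, u_func=lambda t: 0.0)
```

With dt small enough, the Euler trajectory stays close to the exact exponential decay; this is the discrete trajectory the schemes later in the deck operate on.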

  6. The infinite horizon problem
     Cost functional:
         J(x, u) := ∫₀^∞ L(y_x(t, u), u(t)) e^{-λt} dt,
     where λ > 0 is the interest (discount) rate and L is the running cost.
     Goal: find an optimal pair (y*, u*) which minimizes the cost functional.
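To make the discounted cost concrete, the integral can be truncated at a large T and evaluated by quadrature. The running cost L(y, u) = y², the closed-form trajectory y(t) = 2e^{-t}, and λ = 1 below are illustrative assumptions; for this choice the integral has the exact value 4/3.

```python
import numpy as np

# Illustrative data (not from the slides): lambda = 1, L(y, u) = y^2,
# and the closed-form trajectory y(t) = 2 e^{-t} of y' = -y, y(0) = 2.
lam = 1.0
dt, T = 1e-3, 20.0                     # truncation horizon; e^{-3T} is negligible
t = np.arange(0.0, T + dt, dt)
y = 2.0 * np.exp(-t)
f = (y ** 2) * np.exp(-lam * t)        # integrand L(y) e^{-lambda t} = 4 e^{-3t}
J = dt * (0.5 * f[0] + f[1:-1].sum() + 0.5 * f[-1])   # trapezoidal rule
# Exact value of the untruncated integral: 4/3
```

The quadrature and truncation errors are both far below 1e-4 here, so J matches the analytic value closely.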

  7. Two possible approaches
     Open-loop control: the control is expressed as a function of time t (necessary conditions: Pontryagin Minimum Principle, or direct methods, e.g. gradient methods). Remark: it cannot take into account errors in the real state of the system due to model errors or external disturbances.
     Feedback control: the control is expressed as a function of the system state (Dynamic Programming, Model Predictive Control). Remark: robust to external perturbations.

  8. Outline (section divider: Hamilton-Jacobi-Bellman)

  9. Value Function
         v(y₀) := inf_{u(·) ∈ U} J(y₀, u(·))
     The value function is the unique viscosity solution of the Bellman equation associated to the control problem via Dynamic Programming.
     Dynamic Programming Principle:
         v(x) = min_{u ∈ U} { ∫₀^t L(y_x(s), u(s)) e^{-λs} ds + v(y_x(t)) e^{-λt} }
     Hamilton-Jacobi-Bellman equation:
         λ v(x) + max_{u ∈ U} { -F(x, u) · ∇v(x) - L(x, u) } = 0,  x ∈ Ω

 10. Feedback Control
     Given v(x) for any x ∈ R^n, we define
         u*(y*_x(t)) = argmin_{u ∈ U} [ F(x, u) · ∇v(x) + L(x, u) ],
     where y*(t) is the solution of
         ẏ*(t) = F(y*(t), u*(t)),  t ∈ (t₀, ∞)
         y*(t₀) = x.
     Technical difficulties: the bottleneck is the approximation of the value function v; however, this remains the main goal, since v allows one to recover feedback controls in a rather simple way.
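The argmin defining the feedback can be evaluated pointwise once an approximation of v is available. In the sketch below, a hypothetical quadratic guess v(x) = ½x² stands in for the computed value function; the 1D dynamics, cost, and discretized control set are likewise illustrative assumptions.

```python
import numpy as np

# Illustrative stand-ins (not from the slides): 1D dynamics and cost,
# and a hypothetical guess v(x) = 0.5 x^2 in place of the true value function.
U = np.linspace(-1.0, 1.0, 21)         # discretized control set, step 0.1

def F(x, u):
    return -x + u

def L(x, u):
    return x ** 2 + u ** 2

def grad_v(x):
    return x                           # gradient of the assumed v(x) = 0.5 x^2

def feedback(x):
    """u*(x) = argmin over u in U of [ F(x, u) * grad_v(x) + L(x, u) ]."""
    vals = [F(x, u) * grad_v(x) + L(x, u) for u in U]
    return U[int(np.argmin(vals))]
```

For this quadratic guess the minimized expression is u² + u·x plus terms independent of u, so the feedback returned on the grid is u*(x) = -x/2, e.g. feedback(1.0) gives -0.5.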

 11. Numerical computation of the value function
     The bottleneck of the DP approach is the computation of the value function, since this requires solving a nonlinear PDE in high dimension. This is a challenging problem due to the huge number of nodes involved and to the singularities of the solution. Another important issue is the choice of the domain.
     Several numerical schemes are available: finite difference, finite volume, and Semi-Lagrangian (obtained from a discrete version of the Dynamic Programming Principle).

 12. Semi-Lagrangian discretization of HJB
     These schemes are based on the direct discretization of the directional derivative F(x, u) · ∇v(x).
     Continuous version:
         λ v(x) = -max_{u ∈ U} { -F(x, u) · Dv(x) - L(x, u) }
     Semi-discrete approximation (Value Iteration), obtained by a time discretization of the continuous control problem (e.g. using the Euler method):
         V^{k+1}(x) = min_{u ∈ U} { e^{-λΔt} V^k(x + Δt F(x, u)) + Δt L(x, u) }

 13. Semi-Lagrangian discretization of HJB
     Fully discrete SL Value Iteration (VI) scheme:
         V_i^{k+1} = T(V^k)_i,  for i = 1, ..., N,
     with
         T(V^k)_i := min_{u ∈ U} { β I₁[V^k](x_i + Δt F(x_i, u)) + Δt L(x_i, u) },  β := e^{-λΔt}.
     - Fixed grid in a bounded Ω ⊂ R^n, steps Δx, nodes {x₁, ..., x_N}.
     - Stability for large time steps Δt.
     - Error estimates: (Falcone/Ferretti '97).
     - Slow convergence, since β = e^{-λΔt} → 1 as Δt → 0.
     PROBLEM: find a reasonable computational domain.

 14. Value Iteration for infinite horizon optimal control (VI)
     Require: mesh G, Δt, initial guess V⁰, tolerance ε.
     1: while ‖V^{k+1} - V^k‖ ≥ ε do
     2:   for x_i ∈ G do
     3:     V_i^{k+1} = min_{u ∈ U} { e^{-λΔt} I₁[V^k](x_i + Δt F(x_i, u)) + Δt L(x_i, u) }
     4:   end for
     5:   k = k + 1
     6: end while
     Remarks: the VI algorithm converges (slowly) for any initial guess V⁰, and an error estimate can be provided.
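A minimal 1D realization of the VI loop above, using np.interp as the linear interpolation operator I₁ (np.interp also clamps points that leave the grid, a crude boundary treatment). The dynamics, running cost, domain, and discount rate are illustrative assumptions.

```python
import numpy as np

# Illustrative problem data (not from the slides).
lam, dt, eps = 1.0, 0.1, 1e-6
G = np.linspace(-2.0, 2.0, 101)        # mesh on Omega = [-2, 2]
U = np.linspace(-1.0, 1.0, 21)         # discretized control set
beta = np.exp(-lam * dt)               # discount factor e^{-lambda dt}

def F(x, u):
    return -x + u

def L(x, u):
    return x ** 2 + u ** 2

V = np.zeros_like(G)                   # initial guess V^0
while True:
    # One sweep: V_i^{k+1} = min_u { beta I1[V^k](x_i + dt F(x_i, u)) + dt L(x_i, u) }
    cand = np.stack([beta * np.interp(G + dt * F(G, u), G, V) + dt * L(G, u)
                     for u in U])
    V_new = cand.min(axis=0)
    if np.max(np.abs(V_new - V)) < eps:
        V = V_new
        break
    V = V_new
```

Since β < 1, the update is a contraction, so the loop terminates for any initial guess; but the contraction factor approaches 1 as Δt → 0, which is exactly the slow-convergence remark on the previous slide.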

 15. Outline (section divider: Model Predictive Control)

 16. Model Predictive Control
     Dynamics:
         ẏ(t) = F(y(t), u(t)),  t > 0
         y(0) = y₀
     Infinite horizon cost functional:
         J_∞(u(·)) = ∫₀^∞ L(y(t; u, y₀)) e^{-λt} dt
     Finite horizon cost functional:
         J_N(u(·)) = ∫_{t₀}^{t_{N₀}} L(y(t; u, y₀)) e^{-λt} dt,  t_{N₀} = t₀ + NΔt,  N ∈ ℕ

 17. MPC trajectories (figure from L. Grüne, J. Pannek, NMPC): black = prediction (obtained with an open-loop optimization); red = MPC closed loop, y(t_n) = y_{μ_N}(t_n).

 18. MPC method
     MPC solves the infinite time horizon problem by means of the iterative solution of finite horizon (N ≥ 2) optimal control problems:
         min J_N(u)  s.t.  u ∈ U^N
     Feedback control: μ_N(y(t)) = u*(t). We obtain a closed-loop representation by applying the map μ_N:
         ẏ = F( y(t), μ_N(y(t)) )
     Optimal trajectories: y*(t₀), ..., y*(t_{N₀}); optimal controls: u*(t₀), ..., u*(t_{N₀}).
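The receding-horizon loop can be sketched by brute force: at each time, minimize the discrete finite-horizon cost over all sequences from a small discrete control set (a stand-in for a real optimizer), apply only the first control μ_N(y), and shift the horizon. All problem data below are illustrative assumptions.

```python
import numpy as np
from itertools import product

# Illustrative data (not from the slides): 1D dynamics, quadratic cost,
# three admissible control values, horizon N = 5.
lam, dt, N = 1.0, 0.1, 5
U = (-1.0, 0.0, 1.0)

def F(y, u):
    return -y + u

def L(y, u):
    return y ** 2 + u ** 2

def J_N(y0, controls):
    """Discrete finite-horizon cost along an Euler-discretized trajectory."""
    J, y = 0.0, y0
    for k, u in enumerate(controls):
        J += dt * L(y, u) * np.exp(-lam * k * dt)
        y = y + dt * F(y, u)
    return J

def mu_N(y):
    """MPC feedback: first element of a finite-horizon minimizing sequence."""
    best = min(product(U, repeat=N), key=lambda seq: J_N(y, seq))
    return best[0]

# Closed-loop simulation: apply mu_N, advance one step, repeat.
y = 2.0
for _ in range(30):
    y = y + dt * F(y, mu_N(y))
```

Note that only the first entry of each minimizing sequence is ever applied; the remaining N - 1 predicted controls are discarded and recomputed at the next step, which is what makes this a feedback law rather than an open-loop plan.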

 19. Advantages and disadvantages
     HJB pros:
     1. Valid for all problems in any dimension, with a-priori error estimates in L∞.
     2. SL schemes can work on structured and unstructured grids.
     3. The computation of feedbacks is almost built in.
     HJB cons:
     1. "Curse of dimensionality".
     2. Computational cost and huge memory allocation.
     MPC pros:
     1. Easy to implement, short computational time.
     2. It can be applied to high-dimensional problems.
     3. Feedback controls.
     MPC cons:
     1. Approximates the feedback just along one trajectory.
     2. With a short horizon, it yields a sub-optimal trajectory.

 20. Idea
     Try to combine the advantages of the two methods in order to obtain an efficient algorithm. The approximation of the HJB equation requires restricting the computational domain Ω to a subset of R^n, and the choice of this domain is otherwise totally arbitrary.
     GOAL: find a reasonable way to compute Ω.
     QUESTION: how do we choose the computational domain?
     SOLUTION: an inexact MPC solver may provide a reference trajectory for our optimal control problem.
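The proposed combination can be sketched in one step: run the (inexact) MPC solver once, then take Ω as a box around the resulting reference trajectory. The stand-in trajectory and the margin below are illustrative assumptions, not the authors' actual construction.

```python
import numpy as np

# Stand-in for an MPC reference trajectory (illustrative): a decaying
# 1D closed-loop state sequence.
traj = np.array([2.0 * 0.9 ** k for k in range(40)])

# Computational domain Omega: bounding box of the trajectory plus a margin,
# so the HJB solver only discretizes the region the dynamics actually visit.
margin = 0.25
omega = (traj.min() - margin, traj.max() + margin)
```

The HJB value iteration would then be run only on this Ω instead of an arbitrarily chosen box, which is the efficiency gain the slide is after.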
