

  1. Output Feedback Optimal Control with Constraints María M. Seron September 2004 Centre for Complex Dynamic Systems and Control

  2. Outline
     1 Introduction
     2 Problem Formulation
     3 Optimal Solutions: Optimal Solution for N = 1; Optimal Solution for N = 2; Discussion on Implementation
     4 Suboptimal Strategies: Certainty Equivalent Control; Partially Stochastic Certainty Equivalent Control
     5 Simulations

  3. Introduction Here we address the problem of constrained optimal control for systems with uncertainty and incomplete state information. We adopt a stochastic description of uncertainty, which associates probability distributions with the uncertain elements, that is, the disturbances and the initial conditions. When state information is incomplete, a popular observer-based control strategy in the presence of stochastic disturbances is to use the certainty equivalence [CE] principle, introduced earlier for deterministic systems. In the stochastic framework, CE consists of estimating the state and then using the estimates as if they were the true state in the control law that would result if the problem were formulated as a deterministic problem (that is, without uncertainty).
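The CE idea above can be sketched in a few lines. This is a minimal illustration, not the method developed in these slides: the scalar system, the observer gain L_obs and the feedback gain K below are assumed values chosen only to show the structure "estimate, then feed the estimate into the deterministic law".

```python
# Certainty-equivalence sketch for an assumed scalar system
# x_{k+1} = A x_k + B u_k + w_k,  y_k = C x_k + v_k.
A, B, C = 0.9, 1.0, 1.0
L_obs = 0.5   # assumed observer (estimator) gain, for illustration only
K = 0.4       # assumed deterministic feedback gain, for illustration only

def ce_step(x_hat, y, u_prev):
    """One CE iteration: propagate the estimate, correct it with the
    new measurement y, then apply the deterministic law u = -K x
    with the true state replaced by its estimate."""
    x_pred = A * x_hat + B * u_prev            # one-step prediction
    x_hat = x_pred + L_obs * (y - C * x_pred)  # measurement correction
    u = -K * x_hat                             # CE: estimate used as if true state
    return x_hat, u
```

The key point, made precise later in the slides, is that the estimator and the control law are designed separately; CE simply composes them.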

  4. Introduction The CE strategy is motivated by the unconstrained problem with a quadratic objective function, for which CE is indeed the optimal solution. Here we analyse the optimality of the CE principle. We will see that CE is not optimal in general. We will also analyse the possibility of obtaining truly optimal solutions for single-input linear systems with input constraints and uncertainty arising from output feedback and stochastic disturbances. We first find the optimal solution for the case of horizon N = 1, and then indicate the complications that arise for horizon N = 2.

  5. Problem Formulation We consider the following time-invariant, discrete-time linear system with disturbances:
     x_{k+1} = A x_k + B u_k + w_k,   (1)
     y_k = C x_k + v_k,
     where x_k, w_k ∈ R^n and u_k, y_k, v_k ∈ R. The control u_k is constrained to take values in the set U = { u ∈ R : −∆ ≤ u ≤ ∆ }, for a given constant ∆ > 0. The disturbances w_k and v_k are i.i.d. random vectors with probability density functions (pdfs) p_w(·) and p_v(·), respectively. The initial state x_0 is characterised by a pdf p_{x_0}(·). We assume that (A, B) is reachable and that (A, C) is observable.
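System (1) with the input constraint u_k ∈ U can be simulated directly. The matrices, the bound ∆ and the Gaussian disturbance pdfs below are assumed for illustration; the slides do not fix particular values.

```python
import numpy as np

# Assumed instance of system (1): n = 2, Gaussian w_k and v_k,
# and input set U = [-Delta, Delta].
rng = np.random.default_rng(0)
n, Delta = 2, 1.0
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.0, 0.1])
C = np.array([1.0, 0.0])

def step(x, u):
    """One step of (1): clip u into U, then propagate state and output."""
    u = float(np.clip(u, -Delta, Delta))   # enforce u_k in U
    w = rng.normal(0.0, 0.01, size=n)      # i.i.d. disturbance w_k ~ p_w
    v = rng.normal(0.0, 0.01)              # i.i.d. measurement noise v_k ~ p_v
    x_next = A @ x + B * u + w
    y = C @ x + v
    return x_next, y
```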

  6. Problem Formulation We further assume that, at time k, the value of the state x_k is not available to the controller. Instead, the following sets of past inputs and outputs, grouped as the information vector I_k, represent all the information available to the controller at time instant k:
     I_0 = { y_0 },
     I_1 = { y_0, y_1, u_0 },
     I_2 = { y_0, y_1, y_2, u_0, u_1 },
     ⋮
     I_{N−1} = { y_0, y_1, …, y_{N−1}, u_0, u_1, …, u_{N−2} }.
     Then I_k ∈ R^{2k+1}, and also I_{k+1} = { I_k, y_{k+1}, u_k }, where I_k ⊂ I_{k+1}.
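A toy sketch of how the information vector grows: with scalar outputs and inputs, I_k holds 2k + 1 numbers, and each step appends the newly applied input and the new output. (The list below stores entries in arrival order rather than the outputs-then-inputs ordering shown above; as a set of known values the content is the same.)

```python
def update_info(I_k, y_next, u_k):
    """I_{k+1} = {I_k, y_{k+1}, u_k}; note I_k is a prefix of I_{k+1}."""
    return I_k + [y_next, u_k]

I = [0.3]                      # I_0 = {y_0}, so len(I_0) = 2*0 + 1 = 1
I = update_info(I, 0.1, -0.5)  # I_1 = {y_0, y_1, u_0}, len(I_1) = 3
```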

  7. Problem Formulation For system (1), under the assumptions made, we formulate the optimisation problem:
     minimise  E{ F(x_N) + Σ_{k=0}^{N−1} L(x_k, u_k) },   (2)
     where F(x_N) = x_N' P x_N and L(x_k, u_k) = x_k' Q x_k + R u_k^2,
     subject to the system equations (1) and the input constraint u_k ∈ U, for k = 0, …, N − 1. Note that, under the stochastic assumptions, the expression F(x_N) + Σ_{k=0}^{N−1} L(x_k, u_k) is a random variable. Hence, it is only meaningful to formulate the minimisation problem in terms of its statistics, for example, its expected value as in (2).
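Since the cost in (2) is a random variable, its expected value can be approximated by Monte Carlo averaging over samples of x_0 and w_k. The sketch below does this for a fixed open-loop input sequence; the matrices, weights and Gaussian pdfs are assumed values, not taken from the slides.

```python
import numpy as np

# Assumed problem data for illustration.
rng = np.random.default_rng(1)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.0, 0.1])
P = np.eye(2)   # terminal weight
Q = np.eye(2)   # per-stage state weight
R = 0.1         # per-stage input weight

def expected_cost(u_seq, n_samples=2000):
    """Monte Carlo estimate of E{F(x_N) + sum_k L(x_k, u_k)} in (2)
    for a fixed input sequence u_seq."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.normal(0.0, 0.1, size=2)           # draw x_0 ~ p_x0 (assumed)
        cost = 0.0
        for u in u_seq:
            cost += x @ Q @ x + R * u**2           # L(x_k, u_k)
            x = A @ x + B * u + rng.normal(0.0, 0.01, size=2)  # w_k ~ p_w
        cost += x @ P @ x                          # F(x_N)
        total += cost
    return total / n_samples
```

Averaging over sample paths is exactly what makes the formulation in (2) meaningful: each path yields a different realised cost.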

  8. Problem Formulation The result of the above minimisation problem will be a sequence of functions { π*_0(·), π*_1(·), …, π*_{N−1}(·) } that enable the controller to calculate the desired optimal control action from the information available to the controller at each time instant k, that is, u*_k = π*_k(I_k). These functions must also ensure that the constraints are always satisfied. We thus make the following definition.

  9. Problem Formulation Definition (Admissible Policies for Incomplete State Information) A policy Π_N is a finite sequence of functions π_k(·) : R^{2k+1} → R for k = 0, 1, …, N − 1, that is, Π_N = { π_0(·), π_1(·), …, π_{N−1}(·) }. A policy Π_N is called an admissible control policy if and only if π_k(I_k) ∈ U for all I_k ∈ R^{2k+1} and k = 0, …, N − 1. Further, the class of all admissible control policies will be denoted by Π̄_N = { Π_N : Π_N is admissible }. Using the above definition, we can then state the optimal control problem of interest as follows.

  10. Problem Formulation Definition (Stochastic Finite Horizon Optimal Control Problem) Given the pdfs p_{x_0}(·), p_w(·) and p_v(·) of the initial state x_0 and the disturbances w_k and v_k, respectively, we seek the optimal control policy Π*_N, belonging to the class of all admissible control policies Π̄_N, which minimises the objective function
     V_N(Π_N) = E_{x_0, w_k, v_k, k=0,…,N−1} { F(x_N) + Σ_{k=0}^{N−1} L(x_k, π_k(I_k)) },   (3)
     subject to the constraints
     x_{k+1} = A x_k + B π_k(I_k) + w_k,
     y_k = C x_k + v_k,
     I_{k+1} = { I_k, y_{k+1}, u_k },
     for k = 0, …, N − 1.

  11. Problem Formulation In (3) the terminal state weighting F(·) and the per-stage weighting L(·,·) are given by
     F(x_N) = x_N' P x_N,   (4)
     L(x_k, π_k(I_k)) = x_k' Q x_k + R π_k^2(I_k),
     with P > 0, R > 0 and Q ≥ 0. The optimal control policy is then
     Π*_N = arg inf_{Π_N ∈ Π̄_N} V_N(Π_N),
     with the following resulting optimal objective function value:
     V*_N = inf_{Π_N ∈ Π̄_N} V_N(Π_N).   (5)

  12. Problem Formulation It is important to recognise that the optimisation problem of Definition 3.2 takes into account the fact that new information will be available to the controller at future time instants. This is called closed loop optimisation, as opposed to open loop optimisation where the control values { u_0, u_1, …, u_{N−1} } are selected all at once, at stage zero. For deterministic systems, in which there is no uncertainty, the distinction between open loop and closed loop optimisation is irrelevant, and the minimisation of the objective function over all sequences of controls or over all control policies yields the same result.

  13. Problem Formulation In what follows, and as done before, the matrix P in (4) will be taken to be the solution of the algebraic Riccati equation
     P = A' P A + Q − K' R̄ K,   (6)
     where
     K ≜ R̄^{−1} B' P A,   R̄ ≜ R + B' P B.   (7)
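The solution P of (6) can be computed numerically by iterating the Riccati recursion to a fixed point, updating K and R̄ from (7) at each pass. The system matrices and weights below are assumed values for illustration.

```python
import numpy as np

# Assumed data: a reachable (A, B) pair and weights Q >= 0, R > 0.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

# Iterate the Riccati recursion; its fixed point satisfies (6)-(7).
P = Q.copy()
for _ in range(2000):
    R_bar = R + B.T @ P @ B                    # \bar R = R + B' P B, from (7)
    K = np.linalg.solve(R_bar, B.T @ P @ A)    # K = \bar R^{-1} B' P A, from (7)
    P = A.T @ P @ A + Q - K.T @ R_bar @ K      # right-hand side of (6)
```

At convergence, P, K and R̄ satisfy (6) and (7) simultaneously, which is what the slides assume of the terminal weighting.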

  14. Optimal Solutions The problem just described belongs to the class of so-called sequential decision problems under uncertainty. A key feature of these problems is that an action taken at a particular stage affects all future stages. Thus, the control action has to be computed taking into account the future consequences of the current decision. The only general approach known to address sequential decision problems is dynamic programming. We next briefly show how dynamic programming is used to solve the stochastic optimal control problem just defined.

  15. Dynamic Programming The dynamic programming algorithm for the case of incomplete state information can be expressed via the following sequential optimisation (sub-)problems [SOP]. For k = N − 1,
     SOP_{N−1}:  J_{N−1}(I_{N−1}) = inf_{u_{N−1} ∈ U} L̃_{N−1}(I_{N−1}, u_{N−1}),   (8)
     subject to: x_N = A x_{N−1} + B u_{N−1} + w_{N−1},
     where
     L̃_{N−1}(I_{N−1}, π_{N−1}(I_{N−1})) = E { F(x_N) + L(x_{N−1}, π_{N−1}(I_{N−1})) | I_{N−1}, π_{N−1}(I_{N−1}) }.

  16. Dynamic Programming For k = 0, …, N − 2,
     SOP_k:  J_k(I_k) = inf_{u_k ∈ U} { L̃_k(I_k, u_k) + E { J_{k+1}(I_{k+1}) | I_k, u_k } },
     subject to: x_{k+1} = A x_k + B u_k + w_k,
                 y_{k+1} = C x_{k+1} + v_{k+1},
                 I_{k+1} = { I_k, y_{k+1}, u_k },
     where
     L̃_k(I_k, π_k(I_k)) = E { L(x_k, π_k(I_k)) | I_k, π_k(I_k) }   for k = 0, …, N − 2.
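The last sub-problem, SOP_{N−1}, can be approximated by brute force: given samples that represent the conditional density of x_{N−1} given I_{N−1}, minimise the sampled expectation of F(x_N) + L(x_{N−1}, u) over a grid of u values in U. The sampling model, grid resolution and all numerical data below are assumptions made only for this sketch; the slides derive the optimal solution analytically for N = 1 later on.

```python
import numpy as np

# Assumed problem data.
rng = np.random.default_rng(2)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.0, 0.1])
P = np.eye(2); Q = np.eye(2); R = 0.1; Delta = 1.0

# Samples standing in for p(x_{N-1} | I_{N-1}) and p_w (assumed Gaussian).
x_samples = rng.normal([1.0, 0.0], 0.1, size=(500, 2))
w_samples = rng.normal(0.0, 0.01, size=(500, 2))

def expected_J(u):
    """Sampled E{F(x_N) + L(x_{N-1}, u) | I_{N-1}, u} for a candidate u."""
    stage = np.einsum('ij,jk,ik->i', x_samples, Q, x_samples) + R * u**2
    x_next = x_samples @ A.T + B * u + w_samples   # x_N = A x + B u + w
    terminal = np.einsum('ij,jk,ik->i', x_next, P, x_next)
    return (stage + terminal).mean()

# Grid search over U = [-Delta, Delta] approximates the inf in (8).
u_grid = np.linspace(-Delta, Delta, 201)
u_star = u_grid[np.argmin([expected_J(u) for u in u_grid])]
```

For the earlier stages SOP_k the same idea nests: the inner term J_{k+1}(I_{k+1}) must itself be evaluated by solving the downstream sub-problem, which is what makes the general recursion computationally demanding.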
