Taylor Expansions of the Value Function Associated with Stabilization Problems



1. Taylor Expansions of the Value Function Associated with Stabilization Problems

Laurent Pfeiffer (Inria-Saclay and CMAP, Ecole Polytechnique). Joint work with Tobias Breiten and Karl Kunisch (U. Graz). ICODE Workshop on numerical solutions of HJB equations, January 8, 2020.

2. Introduction

We consider the following bilinear optimal control problem:
\[
\inf_{u \in L^2(0,\infty)} J(u, y_0), \qquad
J(u, y_0) := \int_0^\infty \frac{1}{2}\|y(t)\|_Y^2 + \frac{\beta}{2}|u(t)|^2 \,\mathrm{d}t, \tag{P(y_0)}
\]
where
\[
\dot y(t) = A y(t) + N y(t)\,u(t) + B u(t), \qquad y(0) = y_0 \in Y,
\]
with associated value function
\[
V(y_0) := \inf_{u \in L^2(0,\infty)} J(u, y_0).
\]

Key ideas:
- The derivatives \(D^j V(0)\) are characterized by a sequence of equations.
- This allows for the numerical approximation of \(V\) and of the optimal feedback law (locally, around 0).
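To fix ideas, here is a minimal finite-dimensional sketch (not from the slides) of the bilinear dynamics and the cost \(J(u, y_0)\). The matrices `A`, `N`, `B`, the horizon truncation `T`, and the Euclidean stand-in for the \(Y\)-norm are all illustrative assumptions; in the talk they come from a PDE discretization.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical small-scale data (assumptions, not the talk's actual operators).
n = 3
A = -np.eye(n)                    # stable linear part
N = 0.1 * np.ones((n, n))         # bilinear coupling
B = np.ones(n)                    # control shape
beta = 1.0

def cost(y0, u_of_t, T=50.0):
    """Approximate J(u, y0) by truncating the infinite horizon at time T.

    The state is augmented with the running cost 1/2 |y|^2 + beta/2 |u|^2,
    so a single ODE solve returns both the trajectory and the cost.
    """
    def rhs(t, z):
        y = z[:-1]
        u = u_of_t(t)
        dy = A @ y + N @ y * u + B * u                      # bilinear dynamics
        return np.concatenate([dy, [0.5 * y @ y + 0.5 * beta * u**2]])
    sol = solve_ivp(rhs, (0.0, T), np.concatenate([y0, [0.0]]), rtol=1e-8)
    return sol.y[-1, -1]

print(cost(np.array([1.0, 0.0, 0.0]), lambda t: 0.0))       # cost of u = 0
```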

3. Assumptions

Functional framework: \(V \subset Y \subset V^*\) is a Gelfand triple of real Hilbert spaces, where the embedding of \(V\) into \(Y\) is dense and compact, and
\[
W(0,\infty) = \{ y \in L^2(0,\infty; V) \mid \dot y \in L^2(0,\infty; V^*) \}.
\]

Assumptions:
(A1) The operator \(-A\) can be associated with a \(V\)-\(Y\) coercive bilinear form \(a: V \times V \to \mathbb{R}\): there exist \(\lambda \in \mathbb{R}\) and \(\delta > 0\) such that
\[
a(v, v) \ge \delta \|v\|_V^2 - \lambda \|v\|_Y^2, \quad \text{for all } v \in V.
\]
(A2) The operator \(N\) satisfies \(N \in \mathcal{L}(V, Y)\) and \(N^* \in \mathcal{L}(V, Y)\).
(A3) [Stabilizability] There exists an operator \(F \in \mathcal{L}(Y, \mathbb{R})\) such that the semigroup \(e^{(A + BF)t}\) is exponentially stable on \(Y\).

Another technical assumption is also needed.

4. Outline

1. Taylor expansions and feedback laws
2. Numeric results
3. Elements of analysis
4. Receding-horizon algorithm

5. Part 1: Taylor expansions and feedback laws

6. Roadmap

The Taylor expansion of order \(k\), denoted \(V_k\), is of the form
\[
V_k(y_0) = \frac{1}{2} T_2(y_0, y_0) + \frac{1}{3!} T_3(y_0, y_0, y_0) + \dots + \frac{1}{k!} T_k(y_0, \dots, y_0),
\]
where \(T_j = D^j V(0)\) is a bounded multilinear form from \(Y^j\) to \(\mathbb{R}\).

Remark: \(V(0) = 0\) and \(DV(0) = 0\), so the expansion starts at order 2.

We formally show that:
- \(T_2\) is the unique solution to an algebraic Riccati equation (ARE);
- \(T_3, T_4, \dots\) are the unique solutions to (linear) generalized Lyapunov equations (GLE).
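To make the expansion concrete, here is a minimal sketch (not from the slides) that evaluates \(V_k(y_0)\) when each form \(T_j\) is stored as a dense order-\(j\) tensor; the function name and the stand-in data are hypothetical.

```python
import numpy as np
from math import factorial

def evaluate_expansion(tensors, y0):
    """Evaluate V_k(y0) = sum_j T_j(y0, ..., y0) / j!.

    `tensors` maps each order j >= 2 to a dense numpy array of shape
    (n,) * j representing the bounded multilinear form T_j = D^j V(0).
    """
    total = 0.0
    for j, T in tensors.items():
        val = T
        for _ in range(j):        # plug y0 into each of the j slots
            val = val @ y0
        total += val / factorial(j)
    return float(total)

# Illustrative data: T_2 from a stand-in Pi, and a zero cubic term T_3.
n = 3
T2 = np.eye(n)                    # D^2 V(0)(z1, z2) = <z1, Pi z2> with Pi = I
T3 = np.zeros((n, n, n))
print(evaluate_expansion({2: T2, 3: T3}, np.array([1.0, 0.0, 0.0])))  # 0.5
```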

7. HJB equation

Proposition. Assume that there exists a neighborhood \(Y_0\) of 0 such that:
1. problem \(P(y_0)\) has a continuous solution \(u\), for all \(y_0 \in D(A) \cap Y_0\);
2. the value function is continuously differentiable on \(Y_0\).

Then, for all \(y_0 \in D(A) \cap Y_0\),
\[
DV(y_0) A y_0 + \frac{1}{2}\|y_0\|_Y^2 - \frac{1}{2\beta} \big( DV(y_0)(N y_0 + B) \big)^2 = 0. \tag{HJB}
\]

Moreover, for all continuous solutions \(\bar u\) to problem \(P(y_0)\),
\[
\bar u(t) = -\frac{1}{\beta}\, DV(\bar y(t))(N \bar y(t) + B), \quad \text{for a.e. } t,
\]
that is, the control is in feedback form!

8. Taylor expansion

The equations characterizing \((T_j)_{j=2,3,\dots}\) are then obtained by successive differentiation of the HJB equation. (Notation: from here on, \(y_0\) is written \(y\).)

First differentiation of (HJB) with respect to \(y\), in some direction \(z_1 \in D(A)\):
\[
D^2 V(y)(A y, z_1) + DV(y) A z_1 + \langle y, z_1 \rangle_Y
- \frac{1}{\beta} \big[ D^2 V(y)(N y + B, z_1) + DV(y) N z_1 \big] \, DV(y)(N y + B) = 0.
\]

9. Taylor expansion

Second differentiation of (HJB):
\[
\begin{aligned}
& D^3 V(y)(A y, z_1, z_2) + D^2 V(y)(A z_2, z_1) + D^2 V(y)(A z_1, z_2) + \langle z_1, z_2 \rangle_Y \\
& \quad - \frac{1}{\beta} \big[ D^2 V(y)(N y + B, z_1) + DV(y) N z_1 \big]\big[ D^2 V(y)(N y + B, z_2) + DV(y) N z_2 \big] \\
& \quad - \frac{1}{\beta}\, D^3 V(y)(N y + B, z_1, z_2) \, DV(y)(N y + B) \\
& \quad - \frac{1}{\beta} \big[ D^2 V(y)(N z_2, z_1) + D^2 V(y)(N z_1, z_2) \big] \, DV(y)(N y + B) = 0.
\end{aligned}
\]

For \(y = 0\), using the representation \(D^2 V(0)(z_1, z_2) = \langle z_1, \Pi z_2 \rangle\), where \(\Pi: Y \to Y\), we obtain an algebraic Riccati equation:
\[
A^* \Pi + \Pi A + \mathrm{Id} - \frac{1}{\beta} \Pi B B^* \Pi = 0. \tag{ARE}
\]
It has a unique self-adjoint and non-negative solution.
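In finite dimensions, (ARE) is the standard continuous-time Riccati equation with \(Q = \mathrm{Id}\) and \(R = \beta\), so \(\Pi\) can be computed with SciPy's `solve_continuous_are`. A sketch with hypothetical matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

n = 3
A = -np.eye(n) + 0.1 * np.random.default_rng(0).standard_normal((n, n))
B = np.ones((n, 1))                 # single scalar control
beta = 1.0

# Solves A^T X + X A - X B R^{-1} B^T X + Q = 0, i.e. (ARE) with Q = I, R = beta.
Pi = solve_continuous_are(A, B, np.eye(n), beta * np.eye(1))

# Sanity check: the Riccati residual should be numerically zero.
res = A.T @ Pi + Pi @ A + np.eye(n) - (1.0 / beta) * Pi @ B @ B.T @ Pi
print(np.linalg.norm(res))
```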

10. Taylor expansion

Third differentiation of (HJB), at \(y = 0\):
\[
\begin{aligned}
& D^3 V(0)(A z_3, z_1, z_2) + D^3 V(0)(A z_2, z_1, z_3) + D^3 V(0)(A z_1, z_2, z_3) \\
& \quad - \frac{1}{\beta} \big[ D^3 V(0)(B, z_1, z_3) + D^2 V(0)(N z_3, z_1) + D^2 V(0)(N z_1, z_3) \big] \, D^2 V(0)(B, z_2) \\
& \quad - \frac{1}{\beta} \big[ D^3 V(0)(B, z_2, z_3) + D^2 V(0)(N z_3, z_2) + D^2 V(0)(N z_2, z_3) \big] \, D^2 V(0)(B, z_1) \\
& \quad - \frac{1}{\beta} \big[ D^3 V(0)(B, z_1, z_2) + D^2 V(0)(N z_2, z_1) + D^2 V(0)(N z_1, z_2) \big] \, D^2 V(0)(B, z_3) = 0.
\end{aligned}
\]

Setting \(A_\Pi = A - \frac{1}{\beta} B B^* \Pi\), we obtain
\[
T_3(A_\Pi z_1, z_2, z_3) + T_3(z_1, A_\Pi z_2, z_3) + T_3(z_1, z_2, A_\Pi z_3) = \frac{1}{2\beta} R_3(z_1, z_2, z_3),
\quad \forall (z_1, z_2, z_3) \in D(A)^3,
\]
where the trilinear form \(R_3: Y^3 \to \mathbb{R}\) is determined by \(\Pi\), \(N\), and \(B\).

11. Taylor expansion

Differentiation of order \(j\) of (HJB), at \(y = 0\):
\[
T_j(A_\Pi z_1, z_2, \dots, z_j) + \dots + T_j(z_1, \dots, z_{j-1}, A_\Pi z_j) = \frac{1}{2\beta} R_j(z_1, \dots, z_j),
\quad \forall (z_1, \dots, z_j) \in D(A)^j. \tag{GLE(j)}
\]

Properties of the derived generalized Lyapunov equations:
- the equation is linear;
- the right-hand side is computable: the multilinear form \(R_j: Y^j \to \mathbb{R}\) is explicitly determined by \(\Pi\), \(D^3 V(0), \dots, D^{j-1} V(0)\), \(N\), and \(B\).

12. Theorem

Theorem. There exists a unique sequence \((T_j)_{j=3,4,\dots}\) of symmetric bounded multilinear forms such that \(T_j: Y^j \to \mathbb{R}\) is a solution to GLE(j).

Proof: representation formula
\[
T_j(z_1, \dots, z_j) = - \int_0^\infty R_j\big( e^{A_\Pi t} z_1, \dots, e^{A_\Pi t} z_j \big) \, \mathrm{d}t.
\]

Remark: the well-posedness of the GLEs can be established without any knowledge of the differentiability of \(V\).
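As a quick sanity check (not on the slides): differentiating under the integral and using the exponential stability of \(e^{A_\Pi t}\), one gets, with the factor \(\frac{1}{2\beta}\) absorbed into \(R_j\),
\[
\sum_{i=1}^{j} T_j(z_1, \dots, A_\Pi z_i, \dots, z_j)
= -\int_0^\infty \frac{\mathrm{d}}{\mathrm{d}t}\, R_j\big(e^{A_\Pi t} z_1, \dots, e^{A_\Pi t} z_j\big)\, \mathrm{d}t
= R_j(z_1, \dots, z_j),
\]
since the boundary term at \(t = \infty\) vanishes; this is exactly GLE(j).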

13. Feedback law

Polynomial \(V_k\) of degree \(k\):
\[
V_k(y) = \sum_{j=2}^{k} \frac{1}{j!} T_j(y, \dots, y).
\]

Feedback law \(u_k\) of order \(k\):
\[
u_k: y \in Y \mapsto u_k(y) = -\frac{1}{\beta}\, D V_k(y)(N y + B).
\]

Closed-loop system of order \(k\):
\[
\dot y_k(t) = A y_k(t) + (N y_k(t) + B)\, u_k(y_k(t)), \qquad y_k(0) = y_0.
\]

Open-loop control \(U_k(y_0)\) generated by the feedback \(u_k\) and \(y_0\):
\[
U_k(y_0; t) = u_k(y_k(t)).
\]
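A minimal closed-loop simulation sketch (illustrative, not the talk's code) for the quadratic case \(k = 2\), where \(D V_2(y) = \Pi y\) and hence \(u_2(y) = -\frac{1}{\beta}\langle \Pi y, N y + B \rangle\); the matrices are the same hypothetical ones as in the earlier sketches.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

n = 3
A = -np.eye(n); N = 0.1 * np.ones((n, n)); B = np.ones(n); beta = 1.0
Pi = solve_continuous_are(A, B.reshape(-1, 1), np.eye(n), beta * np.eye(1))

def u2(y):
    """Feedback of order 2: u_2(y) = -(1/beta) <Pi y, N y + B>."""
    return -(Pi @ y) @ (N @ y + B) / beta

def closed_loop(t, y):
    """Closed-loop bilinear dynamics y' = A y + (N y + B) u_2(y)."""
    return A @ y + (N @ y + B) * u2(y)

sol = solve_ivp(closed_loop, (0.0, 20.0), np.array([1.0, 0.0, 0.0]), rtol=1e-8)
U2 = [u2(sol.y[:, i]) for i in range(sol.y.shape[1])]   # open-loop control U_2(y0; t)
print(np.linalg.norm(sol.y[:, -1]))                     # state should be near 0
```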

14. Part 2: Numeric results

15. Numerical approach

1. Discretize the operators \(A\), \(N\), and \(B\) in such a way that the bilinear structure is preserved (e.g. with finite differences).
2. Find a reduced-order model with a generalization of the balanced truncation method (a simplified sketch follows this list):
\[
\inf_{u \in L^2(0,\infty)} J(u, y_0) := \int_0^\infty \frac{1}{2}\|C_r y_r(t)\|_{\mathbb{R}^n}^2 + \frac{\beta}{2}|u(t)|^2 \, \mathrm{d}t,
\]
where
\[
\dot y_r(t) = A_r y_r(t) + N_r y_r(t)\, u(t) + B_r u(t), \qquad y_r(0) = y_{0,r} \in \mathbb{R}^r.
\]
3. Solve the reduced GLEs with a tensor-calculus technique.
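For orientation, here is a sketch of classical square-root balanced truncation for the linear part \((A, B, C)\). The talk uses a generalization adapted to the bilinear term \(N\) (e.g. via generalized Gramians), which this simplified linear sketch omits; the function name and example data are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Square-root balanced truncation of the linear system (A, B, C) to order r."""
    P = solve_lyapunov(A, -B @ B.T)        # controllability Gramian: A P + P A^T = -B B^T
    Q = solve_lyapunov(A.T, -C.T @ C)      # observability Gramian:  A^T Q + Q A = -C^T C
    R = cholesky(P, lower=True)            # P = R R^T
    Lq = cholesky(Q, lower=True)           # Q = Lq Lq^T
    U, s, Vt = svd(Lq.T @ R)               # s: Hankel singular values
    Sr = np.diag(s[:r] ** -0.5)
    W = Lq @ U[:, :r] @ Sr                 # left projector
    V = R @ Vt[:r].T @ Sr                  # right projector (W^T V = I_r)
    return W.T @ A @ V, W.T @ B, C @ V     # reduced (A_r, B_r, C_r)

# Illustrative example: a stable tridiagonal A, control through the first node.
n = 5
A = np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.eye(n)
Ar, Br, Cr = balanced_truncation(A, B, C, r=2)
# A bilinear term would be reduced with the same projectors: N_r = W^T N V.
```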

16. Lyapunov equations

The associated reduced GLE of order \(k\),
\[
T_{k,r}(A_{\Pi,r} z_1, z_2, \dots, z_k) + \dots + T_{k,r}(z_1, \dots, z_{k-1}, A_{\Pi,r} z_k) = \frac{1}{2\beta} R_{k,r}(z_1, \dots, z_k),
\]
is equivalent to a linear system with \(r^k\) variables. Its solution is
\[
T_{k,r}(z_1, \dots, z_k) = - \int_0^\infty R_{k,r}\big( e^{A_{\Pi,r} t} z_1, \dots, e^{A_{\Pi,r} t} z_k \big) \, \mathrm{d}t.
\]
An approximation of \(T_{k,r}(z_1, \dots, z_k)\) is given by the quadrature
\[
- \sum_{i=-\ell}^{\ell} w_i \, R_{k,r}\big( e^{A_{\Pi,r} t_i} z_1, \dots, e^{A_{\Pi,r} t_i} z_k \big),
\]
for an appropriate choice of points \(t_i\) and weights \(w_i\).
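A sketch of this quadrature for \(k = 3\) (illustrative, not the talk's implementation), representing \(T_{3,r}\) and \(R_{3,r}\) as dense order-3 tensors; the stability of `A_pi`, the random right-hand side, and the crude uniform quadrature rule are all assumptions made for the example.

```python
import numpy as np
from itertools import permutations
from scipy.linalg import expm

def solve_gle_quadrature(A_pi, R3, ts, ws):
    """Approximate T_3 = -int_0^inf R_3(e^{A t} ., e^{A t} ., e^{A t} .) dt.

    R3 is a dense (r, r, r) tensor for the trilinear right-hand side (with the
    1/(2 beta) factor absorbed into it); ts, ws are quadrature points/weights.
    """
    T3 = np.zeros_like(R3)
    for t, w in zip(ts, ws):
        E = expm(A_pi * t)                    # matrix exponential e^{A_pi t}
        T3 -= w * np.einsum('pqs,pa,qb,sc->abc', R3, E, E, E)
    return T3

rng = np.random.default_rng(0)
r = 4
A_pi = -np.eye(r) + 0.2 * rng.standard_normal((r, r))   # assumed stable closed-loop matrix
R3 = rng.standard_normal((r, r, r))
R3 = sum(np.transpose(R3, p) for p in permutations(range(3))) / 6   # symmetrize

ts = np.linspace(0.0, 40.0, 2001)     # crude uniform rule; a sinc-type rule in practice
ws = np.gradient(ts)
T3 = solve_gle_quadrature(A_pi, R3, ts, ws)

# Residual of the GLE: summing A_pi over the three slots should reproduce R3.
lhs = (np.einsum('pa,pbc->abc', A_pi, T3)
       + np.einsum('qb,aqc->abc', A_pi, T3)
       + np.einsum('sc,abs->abc', A_pi, T3))
print(np.linalg.norm(lhs - R3) / np.linalg.norm(R3))    # small if the rule is accurate
```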

17. Fokker-Planck equation

Controlled Fokker-Planck equation:
\[
\begin{aligned}
\frac{\partial \rho}{\partial t} &= \nu \Delta \rho + \nabla \cdot (\rho \nabla G) + u \, \nabla \cdot (\rho \nabla \alpha) && \text{in } \Omega \times (0, \infty), \\
0 &= (\nu \nabla \rho + \rho \nabla G) \cdot \vec n && \text{on } \Gamma \times (0, \infty), \\
\rho(x, 0) &= \rho_0(x) && \text{in } \Omega,
\end{aligned}
\]
where \(\Omega \subset \mathbb{R}^d\) denotes a bounded domain with smooth boundary \(\Gamma\).

For all \(t\), \(\rho(\cdot, t)\) is the probability density function of \(X_t\), solution to
\[
\mathrm{d}X(t) = -\nabla_x V(X(t), t)\, \mathrm{d}t + \sqrt{2\nu}\, \mathrm{d}W_t,
\]
where the potential \(V\) is controlled by \(u\):
\[
V(x, t) = G(x) + u(t)\,\alpha(x), \quad \forall x \in \Omega, \ \forall t \ge 0.
\]
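As an illustration of step 1 of the numerical approach, a crude 1D finite-difference sketch (not the talk's discretization; \(G\), \(\alpha\), the grid, and the boundary closure are all hypothetical) that casts the Fokker-Planck operator into the bilinear form \(\dot y = A y + N y\, u\). Shifting by a stationary density \(\bar\rho\) then produces the \(B u\) term, since \(N(\bar\rho + y)\,u = N\bar\rho\, u + N y\, u\) with \(B := N \bar\rho\).

```python
import numpy as np

# 1D grid on Omega = (0, 1); hypothetical potential G and control shape alpha.
m = 100
x = np.linspace(0.0, 1.0, m)
h = x[1] - x[0]
nu = 0.1
G = np.cos(2 * np.pi * x)        # hypothetical confining potential
alpha = x                        # hypothetical control shape function

def divergence_form(phi):
    """Matrix M with (M rho)_i ~ d/dx (rho dphi/dx), centered differences.

    Boundary rows are left zero; a careful scheme would enforce the
    zero-flux condition (nu rho' + rho G') . n = 0 exactly.
    """
    dphi = np.gradient(phi, h)
    M = np.zeros((m, m))
    for i in range(1, m - 1):
        M[i, i + 1] = dphi[i + 1] / (2 * h)
        M[i, i - 1] = -dphi[i - 1] / (2 * h)
    return M

# Laplacian with a crude homogeneous Neumann-type closure at the boundary.
Lap = (np.diag(-2 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
       + np.diag(np.ones(m - 1), -1)) / h**2
Lap[0, :2] = [-1 / h**2, 1 / h**2]
Lap[-1, -2:] = [1 / h**2, -1 / h**2]

A = nu * Lap + divergence_form(G)    # drift-diffusion part
N = divergence_form(alpha)           # bilinear control part

rho_bar = np.exp(-G / nu); rho_bar /= rho_bar.sum() * h   # stationary density ~ e^{-G/nu}
B = N @ rho_bar                      # shifting y = rho - rho_bar yields B = N rho_bar
```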
