



  1. Unconstrained and Constrained Optimal Control of Piecewise Deterministic Markov Processes
     Oswaldo Costa, François Dufour, Alexey Piunovskiy
     Universidade de São Paulo; Institut de Mathématiques de Bordeaux; INRIA Bordeaux Sud-Ouest; University of Liverpool
     This study has been carried out with financial support from the French State, managed by the French National Research Agency (ANR) in the frame of the "Investments for the future" Programme IdEx Bordeaux - CPU (ANR-10-IDEX-03-02).

  2. Outline
     1. Piecewise deterministic Markov processes
        ◮ Introduction
        ◮ Parameters of the model
        ◮ Construction of the controlled process
        ◮ Admissible strategies
     2. Optimization problems
        ◮ Unconstrained and constrained problems
        ◮ Assumptions
     3. Non-explosion
     4. The unconstrained problem and the dynamic programming approach
     5. The constrained problem and the linear programming approach
     Workshop on switching dynamics & verification - IHP - January 28-29, 2016

  3. Controlled piecewise deterministic Markov processes: Introduction
     Davis (1980s): a general class of non-diffusion stochastic hybrid models, in which a deterministic trajectory is punctuated by random jumps.
     Applications: engineering systems, biology, operations research, management science, economics, dependability and safety, ...

  4. Controlled piecewise deterministic Markov processes: Parameters of the model
     ◮ The state space: $X$, an open subset of $\mathbb{R}^d$ (with boundary $\partial X$).
     ◮ The flow: $\phi(x,t) : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$, satisfying $\phi(x, t+s) = \phi(\phi(x,s), t)$ for all $x \in \mathbb{R}^d$ and $(t,s) \in \mathbb{R}^2$.
       → Active boundary: $\Delta = \{ z \in \partial X : z = \phi(x,t) \text{ for some } x \in X \text{ and } t \in \mathbb{R}_+^* \}$.
       For $x \in \overline{X} := X \cup \Delta$, $t^*(x) = \inf\{ t \in \mathbb{R}_+ : \phi(x,t) \in \Delta \}$.
     ◮ $A$ is the action space, assumed to be a Borel space. $A^g \in \mathcal{B}(A)$ (respectively $A^i \in \mathcal{B}(A)$) is the set of gradual, or continuous (respectively impulsive), actions, satisfying $A = A^i + A^g$.
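A minimal sketch of the flow property and of the exit time $t^*(x)$, under assumptions that are not in the slides: the state space is taken to be $X = ]0,1[$ and the flow is the hypothetical $\phi(x,t) = x e^{t}$. With this choice only the endpoint $1$ is reached from inside $X$ in positive time, so the active boundary is $\Delta = \{1\}$ and $t^*(x) = -\log x$.

```python
import math

def phi(x: float, t: float) -> float:
    """Hypothetical flow on R (an assumption for illustration): phi(x, t) = x * exp(t)."""
    return x * math.exp(t)

def t_star(x: float) -> float:
    """Exit time to the active boundary {1} for x in ]0, 1] under the flow above."""
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in ]0, 1]")
    return -math.log(x)

x, s, t = 0.25, 0.1, 0.3
# Semigroup property: phi(x, t + s) == phi(phi(x, s), t), up to rounding.
semigroup_gap = abs(phi(x, t + s) - phi(phi(x, s), t))
# Following the flow for t_star(x) lands on the active boundary point 1.
exit_point = phi(x, t_star(x))
```

The point $0$ also belongs to $\partial X$, but no trajectory started in $X$ reaches it in positive time, which is why it is excluded from $\Delta$ in this toy example.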

  5. Controlled piecewise deterministic Markov processes: Parameters of the model
     ◮ The set of feasible actions in state $x \in X$ is $A(x) \subset A$. Let us introduce the following sets: $K = K^i \cup K^g$ with
       $$K^g = \{ (x,a) \in X \times A^g : a \in A(x) \}, \qquad K^i = \{ (x,a) \in \Delta \times A^i : a \in A(x) \}.$$
     ◮ The jump intensity $\lambda$, which is an $\mathbb{R}_+$-valued measurable function defined on $K^g$.
     ◮ The stochastic kernel $Q$ on $X$ given $K$, satisfying $Q(X \setminus \{x\} \mid x, a) = 1$ for any $(x,a) \in K^g$. It describes the state of the process after any jump.

  6. Controlled piecewise deterministic Markov processes: Uncontrolled process
     Definition of a PDMP. Parameters: flow $\phi$, intensity of the jumps $\lambda$, transition kernel $Q$.
     [Figure, built up over five animation frames; recoverable labels: $E$, $x_0$, $T_1$, $T_2$, $Q$, $x_1$.]
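The three parameters (flow, jump intensity, transition kernel) can be turned into a simulation of the embedded sequence of jump times and post-jump states. The sketch below is illustrative only and rests on assumptions that are not in the slides: the state space $]0,1[$ with the hypothetical flow $\phi(x,t) = x e^{t}$, a constant intensity `lam`, and a kernel that redraws the state uniformly on $[0.1, 0.9]$ after both random and boundary jumps.

```python
import math
import random

def t_star(x: float) -> float:
    """Exit time to the active boundary {1} for the assumed flow phi(x, t) = x * exp(t)."""
    return -math.log(x)

def simulate(x0: float, lam: float, n_jumps: int, seed: int = 0):
    """Return jump times (T_0 = 0, T_1, ...) and post-jump states (X_0, X_1, ...)."""
    rng = random.Random(seed)
    times, states = [0.0], [x0]
    for _ in range(n_jumps):
        x = states[-1]
        # Next sojourn: the exponential clock or the boundary hit, whichever comes first.
        theta = min(rng.expovariate(lam), t_star(x))
        times.append(times[-1] + theta)
        # Hypothetical kernel Q: uniform restart, used for both kinds of jump.
        states.append(rng.uniform(0.1, 0.9))
    return times, states

times, states = simulate(x0=0.5, lam=2.0, n_jumps=20)
```

Because the intensity is constant here, the random jump time is a plain exponential variable; a state-dependent intensity would need the integrated rate along the flow instead.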

  11. Controlled piecewise deterministic Markov processes: Construction of the controlled process
     The canonical space:
     $$\Omega = \bigcup_{n=1}^{\infty} \Omega_n \,\cup\, \big( X \times (\mathbb{R}_+^* \times X)^{\infty} \big) \quad \text{with} \quad \Omega_n = X \times (\mathbb{R}_+^* \times X)^{n} \times ( \{\infty\} \times \{x_\infty\} )^{\infty}.$$
     Introduce the mappings $X_n : \Omega \to X_\infty = X \cup \{x_\infty\}$ by $X_n(\omega) = x_n$, and $\Theta_n : \Omega \to \mathbb{R}_+^*$ by $\Theta_n(\omega) = \theta_n$; $\Theta_0(\omega) = 0$, where $\omega = (x_0, \theta_1, x_1, \theta_2, x_2, \ldots) \in \Omega$.
     In addition,
     $$T_n(\omega) = \sum_{i=1}^{n} \Theta_i(\omega) = \sum_{i=1}^{n} \theta_i \quad \text{with} \quad T_\infty(\omega) = \lim_{n \to \infty} T_n(\omega).$$
     $\mathbf{H}_n$ is the set of paths up to $n$. $H_n = (X_0, \Theta_1, X_1, \ldots, \Theta_n, X_n)$ is the history of the process up to $n$.

  12. Controlled piecewise deterministic Markov processes: Construction of the process
     The controlled process $(\xi_t)_{t \in \mathbb{R}_+}$:
     $$\xi_t(\omega) = \begin{cases} \phi(X_n, t - T_n) & \text{if } T_n \le t < T_{n+1} \text{ for } n \in \mathbb{N}; \\ x_\infty & \text{if } T_\infty \le t. \end{cases}$$
     The flow is not controlled.
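The definition of $\xi_t$ amounts to locating the last jump time at or before $t$ and flowing from the corresponding post-jump state. A minimal sketch, assuming a finite recorded path (so the cemetery state $x_\infty$ is not represented, and the last state is simply flowed forward) and the hypothetical flow $\phi(x,t) = x e^{t}$; the function name `xi` is introduced here for illustration:

```python
import bisect
import math

def phi(x: float, t: float) -> float:
    """Hypothetical flow (an assumption for illustration): phi(x, t) = x * exp(t)."""
    return x * math.exp(t)

def xi(t: float, times, states, flow=phi) -> float:
    """State at time t, given jump times T_0 = 0 < T_1 < ... and post-jump states X_0, X_1, ..."""
    if t < times[0]:
        raise ValueError("t must be at least T_0")
    # Index n of the last jump time with T_n <= t, so that T_n <= t < T_{n+1}.
    n = bisect.bisect_right(times, t) - 1
    return flow(states[n], t - times[n])

times = [0.0, 0.5, 1.2]
states = [0.5, 0.2, 0.3]
value_between_jumps = xi(0.7, times, states)
value_at_jump = xi(0.5, times, states)
```

Using `bisect_right` makes the path right-continuous at jump times, which matches the convention $T_n \le t < T_{n+1}$.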

  13. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     An admissible control strategy is a sequence $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}}$ such that, for any $n \in \mathbb{N}$,
     ◮ $\pi_n$ is a stochastic kernel on $A^g$ given $\mathbf{H}_n \times \mathbb{R}_+^*$: $\pi_n(da \mid h_n, t) = 1$ for $t \in ]0, t^*(x_n)[$,
     ◮ $\gamma_n$ is a stochastic kernel on $A^i$ given $\mathbf{H}_n$: $\gamma_n(da \mid h_n) = 1$,
     where $h_n = (x_0, \theta_1, x_1, \ldots, \theta_n, x_n) \in \mathbf{H}_n$.
     The set of admissible control strategies is denoted by $\mathcal{U}$.

  14. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     For an admissible control strategy $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}}$, we can equivalently consider the random processes with values in $\mathcal{P}(A^g)$ and $\mathcal{P}(A^i)$, respectively, as
     $$\pi(da \mid t) = \sum_{n \in \mathbb{N}} I_{\{T_n < t \le T_{n+1}\}}\, \pi_n(da \mid H_n, t - T_n)$$
     and
     $$\gamma(da \mid t) = \sum_{n \in \mathbb{N}} I_{\{T_n < t \le T_{n+1}\}}\, \gamma_n(da \mid H_n),$$
     for $t \in \mathbb{R}_+^*$.

  15. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     Interaction of $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}}$ and the parameters of the model:
     ◮ the intensity of jumps
       $$\lambda^u_n(h_n, t) = \int_{A^g} \lambda(\phi(x_n, t), a)\, \pi_n(da \mid h_n, t),$$
       and the corresponding rate of jumps
       $$\Lambda^u_n(h_n, t) = \int_{]0,t]} \lambda^u_n(h_n, s)\, ds;$$
     ◮ the distribution of the state after a (stochastic) jump
       $$Q^{g,u}_n(dx \mid h_n, t) = \frac{1}{\lambda^u_n(h_n, t)} \int_{A^g} Q(dx \mid \phi(x_n, t), a)\, \lambda(\phi(x_n, t), a)\, \pi_n(da \mid h_n, t);$$
     ◮ the distribution of the state after a (boundary) jump
       $$Q^{i,u}_n(dx \mid h_n) = \int_{A^i} Q(dx \mid \phi(x_n, t^*(x_n)), a)\, \gamma_n(da \mid h_n).$$
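The formulas above can be evaluated numerically when the gradual action set is finite. The sketch below is a toy instance under assumptions that are not in the slides: the flow $\phi(x,t) = x e^{t}$, the intensity $\lambda(x,a) = a x$, and a randomized gradual control `pi` that does not depend on the history or on $t$. The last function gives the weight each action receives in $Q^{g,u}_n$, which is proportional to $\lambda\,\pi_n$ rather than to $\pi_n$ alone.

```python
import math

def phi(x: float, t: float) -> float:
    """Hypothetical flow (assumption): phi(x, t) = x * exp(t)."""
    return x * math.exp(t)

def lam(x: float, a: float) -> float:
    """Hypothetical jump intensity (assumption): lambda(x, a) = a * x."""
    return a * x

def lambda_u(x_n: float, t: float, pi: dict) -> float:
    """Intensity under the strategy: sum over actions of lambda(phi(x_n, t), a) * pi(a)."""
    return sum(p * lam(phi(x_n, t), a) for a, p in pi.items())

def Lambda_u(x_n: float, t: float, pi: dict, steps: int = 1000) -> float:
    """Integrated rate over ]0, t], approximated with the trapezoid rule."""
    h = t / steps
    vals = [lambda_u(x_n, k * h, pi) for k in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def jump_weights(x_n: float, t: float, pi: dict) -> dict:
    """Weight of each action in the post-jump distribution Q^{g,u}_n at time t."""
    total = lambda_u(x_n, t, pi)
    return {a: p * lam(phi(x_n, t), a) / total for a, p in pi.items()}

pi = {1.0: 0.5, 3.0: 0.5}           # hypothetical randomized gradual control
rate_at_zero = lambda_u(0.5, 0.0, pi)
integrated = Lambda_u(0.5, 0.5, pi)
weights = jump_weights(0.5, 0.0, pi)
```

In this example the two actions are chosen with equal probability, yet the faster action carries three quarters of the weight in the post-jump distribution, since a jump is more likely to have been triggered while it was in use.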

  16. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     We want the joint distribution of the next sojourn time and state to be given by $G_n$:
     $$\begin{aligned} G_n(\Gamma_1 \times \Gamma_2 \mid h_n) = {} & \Big( I_{\{x_n = x_\infty\}} + e^{-\Lambda^u_n(h_n, +\infty)}\, I_{\{x_n \in X\}}\, I_{\{t^*(x_n) = \infty\}} \Big)\, \delta_{(+\infty, x_\infty)}(\Gamma_1 \times \Gamma_2) \\ & + I_{\{x_n \in X\}} \Big( I_{\{t^*(x_n) < \infty\}}\, \delta_{t^*(x_n)}(\Gamma_1)\, Q^{i,u}_n(\Gamma_2 \mid h_n)\, e^{-\Lambda^u_n(h_n, t^*(x_n))} \\ & \qquad\qquad + \int_{]0, t^*(x_n)[ \,\cap\, \Gamma_1} Q^{g,u}_n(\Gamma_2 \mid h_n, t)\, \lambda^u_n(h_n, t)\, e^{-\Lambda^u_n(h_n, t)}\, dt \Big), \end{aligned}$$
     where $\Gamma_1 \in \mathcal{B}(\mathbb{R}_+^*)$, $\Gamma_2 \in \mathcal{B}(X_\infty)$ and $h_n = (x_0, \theta_1, x_1, \ldots, \theta_n, x_n) \in \mathbf{H}_n$.
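As a sanity check on the expression for $G_n$, one can verify that it has total mass one. The computation below is a sketch for the case $x_n \in X$ and $t^*(x_n) < \infty$, taking $\Gamma_1$ to be the whole time axis (including $+\infty$) and $\Gamma_2 = X_\infty$. It assumes, beyond what the slide states, that $\Lambda^u_n(h_n, \cdot)$ is finite on $]0, t^*(x_n)]$ and that $Q^{i,u}_n$ and $Q^{g,u}_n$ are probability measures.

```latex
\begin{aligned}
G_n\big( ]0,+\infty] \times X_\infty \mid h_n \big)
  &= e^{-\Lambda^u_n(h_n, t^*(x_n))}
     + \int_{]0, t^*(x_n)[} \lambda^u_n(h_n, t)\, e^{-\Lambda^u_n(h_n, t)}\, dt \\
  &= e^{-\Lambda^u_n(h_n, t^*(x_n))}
     + \Big[ -e^{-\Lambda^u_n(h_n, t)} \Big]_{t=0}^{t=t^*(x_n)} \\
  &= e^{-\Lambda^u_n(h_n, t^*(x_n))} + 1 - e^{-\Lambda^u_n(h_n, t^*(x_n))} = 1 .
\end{aligned}
```

The second line uses $\frac{d}{dt} e^{-\Lambda^u_n(h_n,t)} = -\lambda^u_n(h_n,t)\, e^{-\Lambda^u_n(h_n,t)}$ together with $\Lambda^u_n(h_n, 0) = 0$. When $t^*(x_n) = \infty$ the same argument applies with $e^{-\Lambda^u_n(h_n, +\infty)}$, the mass placed on $(+\infty, x_\infty)$, in place of the boundary term.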

  17. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     Consider an admissible strategy $u \in \mathcal{U}$ and an initial state $x_0 \in X$.
     $$P^u_{x_0}\big( (\Theta_{n+1}, X_{n+1}) \in \Gamma_1 \times \Gamma_2 \,\big|\, \mathcal{F}_{T_n} \big) \overset{?}{=} G_n\big( \Gamma_1 \times \Gamma_2 \,\big|\, H_n \big)$$
     $\Longrightarrow$ the conditional distribution of $(\Theta_{n+1}, X_{n+1})$ given $\mathcal{F}_{T_n}$ under $P^u_{x_0}$ is $G_n(\cdot \mid H_n)$ ($\{\mathcal{F}_t\}$ is the natural filtration of the process).

  18. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     Consider an admissible strategy $u \in \mathcal{U}$ and an initial state $x_0 \in X$. There exists a probability $P^u_{x_0}$ on $(\Omega, \mathcal{F})$ such that
     $$P^u_{x_0}\big( \{X_0 = x_0\} \big) = 1$$
     and the positive random measure $\nu$ defined on $\mathbb{R}_+^* \times X$ by
     $$\nu(dt, dx) = \sum_{n \in \mathbb{N}} \frac{G_n(dt - T_n, dx \mid H_n)}{G_n([t - T_n, +\infty] \times X_\infty \mid H_n)}\, I_{\{T_n < t \le T_{n+1}\}}$$
     is the compensator of
     $$\mu(dt, dx) = \sum_{n \ge 1} I_{\{T_n(\omega) < \infty\}}\, \delta_{(T_n(\omega), X_n(\omega))}(dt, dx)$$
     with respect to $P^u_{x_0}$ (Jacod, Multivariate point processes, 1975).
