


  1. Constrained and Unconstrained Optimal Control of Piecewise Deterministic Markov Processes
     Oswaldo Costa, François Dufour, Alexey Piunovskiy
     Universidade de São Paulo; Institut de Mathématiques de Bordeaux; INRIA Bordeaux Sud-Ouest; University of Liverpool
     This study has been carried out with financial support from the French State, managed by the French National Research Agency (ANR) in the frame of the "Investments for the future" Programme IdEx Bordeaux - CPU (ANR-10-IDEX-03-02).

  2. Outline
     1. Controlled piecewise deterministic Markov processes
        ◮ Introduction
        ◮ Parameters of the model
        ◮ Construction of the process
        ◮ Admissible strategies
     2. Optimization problems
        ◮ Unconstrained and constrained problems
        ◮ Assumptions
     3. Non-explosion
     4. The unconstrained problem and the dynamic programming approach
     5. The constrained problem and the linear programming approach
     Workshop Piecewise Deterministic Markov Processes – Montpellier – May 2015

  3. Controlled piecewise deterministic Markov processes: Introduction
     Davis (1980s): a general class of non-diffusion dynamic stochastic hybrid models, in which a deterministic trajectory is punctuated by random jumps.
     Applications: engineering systems, biology, operations research, management science, economics, dependability and safety, ...

  4. Controlled piecewise deterministic Markov processes: Parameters of the model
     ◮ The state space: $X$, an open subset of $\mathbb{R}^d$ (with boundary $\partial X$).
     ◮ The flow: $\phi(x,t) : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$, satisfying $\phi(x, t+s) = \phi(\phi(x,s), t)$ for all $x \in \mathbb{R}^d$ and $(t,s) \in \mathbb{R}^2$.
       → Active boundary: $\Delta = \{ x \in \partial X : x = \phi(y,t) \text{ for some } y \in X \text{ and } t \in \mathbb{R}_+^* \}$.
       For $x \in \overline{X} \doteq X \cup \Delta$, $t^*(x) = \inf\{ t \in \mathbb{R}_+ : \phi(x,t) \in \Delta \}$.
     ◮ $A$ is the action space, assumed to be a Borel space. $A^i \in \mathcal{B}(A)$ (respectively $A^g \in \mathcal{B}(A)$) is the set of impulsive (respectively gradual) actions, satisfying $A = A^i \cup A^g$ with $A^i \cap A^g = \emptyset$.
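As a concrete illustration of the flow property and of $t^*$, here is a minimal sketch on a hypothetical one-dimensional model that is not taken from the slides: $X = (0,1)$ with flow $\phi(x,t) = x e^{t}$. Under this assumption only the boundary point $1$ is reachable from inside $X$ in positive time, so $\Delta = \{1\}$ and $t^*(x) = -\ln x$.

```python
import math

# Hypothetical example (an assumption, not the authors' model):
# X = (0, 1), flow phi(x, t) = x * exp(t), active boundary Delta = {1}.
def phi(x: float, t: float) -> float:
    """Flow of dx/dt = x; satisfies phi(x, t + s) = phi(phi(x, s), t)."""
    return x * math.exp(t)

def t_star(x: float) -> float:
    """First time the flow started at x in (0, 1] reaches Delta = {1}."""
    return -math.log(x)

x, s, t = 0.25, 0.3, 0.2
# Semigroup property of the flow.
assert math.isclose(phi(x, t + s), phi(phi(x, s), t))
# At time t_star(x) the flow sits exactly on the active boundary.
assert math.isclose(phi(x, t_star(x)), 1.0)
```

The boundary point $0$ is not active in this sketch, because no trajectory started in $X$ reaches it in positive time.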

  5. Controlled piecewise deterministic Markov processes: Parameters of the model
     ◮ The set of feasible actions in state $x \in X$ is $A(x) \subset A$. Let us introduce the sets $K = K^i \cup K^g$ with
       $K^g = \{ (x,a) \in X \times A^g : a \in A(x) \} \in \mathcal{B}(X \times A^g)$,
       $K^i = \{ (x,a) \in \Delta \times A^i : a \in A(x) \} \in \mathcal{B}(\Delta \times A^i)$.
     ◮ The controlled jump intensity $\lambda$, an $\mathbb{R}_+$-valued measurable function defined on $K^g$.
     ◮ The stochastic kernel $Q$ on $X$ given $K$, satisfying $Q(X \setminus \{x\} \mid x, a) = 1$ for any $(x,a) \in K^g$. It describes the state of the process after any jump.

  6. Controlled piecewise deterministic Markov processes: Uncontrolled process
     Definition of a PDMP. Parameters: flow $\phi$, intensity of the jumps $\lambda$, transition kernel $Q$.
     [Figure, built up step by step over slides 6 to 10: sample path of a PDMP, with labels $E_{\nu_0}$, $E_{\nu_1}$, $(\nu_0, x_0)$, $T_1$, $Q$, $(\nu_1, x_1)$, $T_2$.]

  11. Controlled piecewise deterministic Markov processes: Construction of the process
     The canonical space is
     $$\Omega = \bigcup_{n=1}^{\infty} \Omega_n \cup \big( X \times (\mathbb{R}_+^* \times X)^\infty \big), \qquad \Omega_n = X \times (\mathbb{R}_+^* \times X)^n \times (\{\infty\} \times \{x_\infty\})^\infty .$$
     Introduce the mappings $X_n : \Omega \to X_\infty = X \cup \{x_\infty\}$ by $X_n(\omega) = x_n$ and $\Theta_n : \Omega \to \mathbb{R}_+^*$ by $\Theta_n(\omega) = \theta_n$; $\Theta_0(\omega) = 0$, where $\omega = (x_0, \theta_1, x_1, \theta_2, x_2, \ldots) \in \Omega$.
     In addition,
     $$T_n(\omega) = \sum_{i=1}^{n} \Theta_i(\omega) = \sum_{i=1}^{n} \theta_i, \qquad T_\infty(\omega) = \lim_{n \to \infty} T_n(\omega).$$
     $\mathbf{H}_n$ is the set of paths up to $n$, and $H_n = (X_0, \Theta_1, X_1, \ldots, \Theta_n, X_n)$ is the $n$-term random history process.

  12. Controlled piecewise deterministic Markov processes: Construction of the process
     The random measure $\mu$ associated with $(\Theta_n, X_n)_{n \in \mathbb{N}}$ is a measure defined on $\mathbb{R}_+^* \times X$ by
     $$\mu(dt, dx) = \sum_{n \ge 1} I_{\{T_n(\omega) < \infty\}}\, \delta_{(T_n(\omega), X_n(\omega))}(dt, dx).$$
     The controlled process $\{\xi_t\}_{t \in \mathbb{R}_+}$:
     $$\xi_t(\omega) = \begin{cases} \phi(X_n, t - T_n) & \text{if } T_n \le t < T_{n+1} \text{ for } n \in \mathbb{N}; \\ x_\infty & \text{if } T_\infty \le t. \end{cases}$$
     For $t \in \mathbb{R}_+$, define $\mathcal{F}_t = \sigma\{H_0\} \vee \sigma\{ \mu(]0,s] \times B) : s \le t,\ B \in \mathcal{B}(X) \}$.
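The definition of $\xi_t$ can be read as a lookup: find the last jump time $T_n \le t$, then follow the flow from $X_n$ for the remaining time $t - T_n$. The sketch below evaluates $\xi_t$ on a finite, hand-written path. The flow `phi` and the sample values are hypothetical, and the sketch assumes no further jump occurs after the last listed one.

```python
import bisect
import math
from itertools import accumulate

def phi(x: float, t: float) -> float:
    """Hypothetical flow (an assumption for illustration): dx/dt = x."""
    return x * math.exp(t)

def xi(t: float, xs: list, thetas: list) -> float:
    """Evaluate xi_t for a path with post-jump states xs = [x_0, ..., x_N]
    and sojourn times thetas = [theta_1, ..., theta_N].

    Assumes t >= 0 and that no jump happens after T_N.
    """
    jump_times = [0.0] + list(accumulate(thetas))   # T_0, T_1, ..., T_N
    n = bisect.bisect_right(jump_times, t) - 1      # largest n with T_n <= t
    return phi(xs[n], t - jump_times[n])

xs = [0.2, 0.5, 0.1]      # x_0, x_1, x_2
thetas = [0.5, 0.25]      # theta_1, theta_2, so T_1 = 0.5 and T_2 = 0.75
print(xi(0.6, xs, thetas))  # flow started at x_1, run for 0.1 time units
```

Using `bisect_right` makes the path right-continuous at jump times, matching the condition $T_n \le t < T_{n+1}$.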

  13. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     An admissible control strategy is a sequence $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}}$ such that, for any $n \in \mathbb{N}$,
     ◮ $\pi_n$ is a stochastic kernel on $A^g$ given $\mathbf{H}_n \times \mathbb{R}_+^*$ satisfying $\pi_n(A(\phi(x_n, t)) \mid h_n, t) = 1$ for $h_n = (x_0, \theta_1, x_1, \ldots, \theta_n, x_n) \in \mathbf{H}_n$ and $t \in\, ]0, t^*(x_n)[$.
     ◮ $\gamma_n$ is a stochastic kernel on $A^i$ given $\mathbf{H}_n$ satisfying $\gamma_n(A(\phi(x_n, t^*(x_n))) \mid h_n) = 1$ for $h_n = (x_0, \theta_1, x_1, \ldots, \theta_n, x_n)$.
     The set of admissible control strategies is denoted by $\mathcal{U}$.

  14. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     When an admissible control strategy $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}}$ is considered, $\pi$ and $\gamma$ denote the random processes with values in $\mathcal{P}(A^g)$ and $\mathcal{P}(A^i)$ respectively, defined by
     $$\pi(da \mid t) = \sum_{n \in \mathbb{N}} I_{\{T_n < t \le T_{n+1}\}}\, \pi_n(da \mid H_n, t - T_n)$$
     and
     $$\gamma(da \mid t) = \sum_{n \in \mathbb{N}} I_{\{T_n < t \le T_{n+1}\}}\, \gamma_n(da \mid H_n),$$
     for $t \in \mathbb{R}_+^*$.

  15. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     For a strategy $u = (\pi_n, \gamma_n)_{n \in \mathbb{N}} \in \mathcal{U}$, the intensity of jumps is
     $$\lambda_n^u(h_n, t) = \int_{A^g} \lambda(\phi(x_n, t), a)\, \pi_n(da \mid h_n, t),$$
     the rate of jumps is
     $$\Lambda_n^u(h_n, t) = \int_{]0,t]} \lambda_n^u(h_n, s)\, ds,$$
     the distribution of the state after a (stochastic) jump is
     $$Q_n^{g,u}(dx \mid h_n, t) = \frac{1}{\lambda_n^u(h_n, t)} \int_{A^g} Q(dx \mid \phi(x_n, t), a)\, \lambda(\phi(x_n, t), a)\, \pi_n(da \mid h_n, t),$$
     and the distribution of the state after a (boundary) jump is
     $$Q_n^{i,u}(dx \mid h_n) = \int_{A^i} Q(dx \mid \phi(x_n, t^*(x_n)), a)\, \gamma_n(da \mid h_n).$$
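When the set of gradual actions is finite, the integral over $A^g$ defining $\lambda_n^u$ becomes a weighted sum, and $\Lambda_n^u$ can be approximated by numerical quadrature. The sketch below does this for a hypothetical model: the flow `phi`, the two actions, the intensity `lam` and the randomized strategy `pi_n` are all assumptions chosen so that the result can be checked in closed form, $\Lambda_n^u(h_n, t) = 2 x_n (e^{t} - 1)$.

```python
import math

def phi(x: float, t: float) -> float:
    """Hypothetical flow: dx/dt = x."""
    return x * math.exp(t)

def lam(x: float, a: str) -> float:
    """Hypothetical jump intensity on K^g: proportional to the state."""
    return {"slow": 1.0, "fast": 3.0}[a] * x

def pi_n(t: float) -> dict:
    """Hypothetical randomized strategy: ignores the history and the time,
    and picks each gradual action with probability 1/2."""
    return {"slow": 0.5, "fast": 0.5}

def lam_u(x_n: float, t: float) -> float:
    """Intensity of jumps: lam averaged over the actions drawn from pi_n."""
    y = phi(x_n, t)
    return sum(lam(y, a) * p for a, p in pi_n(t).items())

def Lam_u(x_n: float, t: float, steps: int = 10_000) -> float:
    """Rate of jumps: integral of lam_u over ]0, t], by the midpoint rule."""
    h = t / steps
    return h * sum(lam_u(x_n, (k + 0.5) * h) for k in range(steps))

x_n, t = 0.5, 1.0
print(Lam_u(x_n, t), 2 * x_n * (math.exp(t) - 1))  # numerical vs closed form
```

In this sketch the history $h_n$ enters only through its last state $x_n$; a history-dependent strategy would need `pi_n` to take the full history as an argument.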

  16. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     Introduce the stochastic kernel $G_n$ on $\mathbb{R}_+^* \times X_\infty$ given $\mathbf{H}_n$,
     $$\begin{aligned} G_n(\Gamma \mid h_n) ={}& \delta_{(+\infty, x_\infty)}(\Gamma) \Big[ I_{\{x_n = x_\infty\}} + e^{-\Lambda_n^u(h_n, +\infty)}\, I_{\{x_n \in X\}}\, I_{\{t^*(x_n) = \infty\}} \Big] \\ &+ I_{\{x_n \in X\}} \Big[ \int_{\mathbb{R}_+^* \times X} I_\Gamma(t,x)\, \delta_{t^*(x_n)}(dt)\, Q_n^{i,u}(dx \mid h_n)\, e^{-\Lambda_n^u(h_n, t^*(x_n))} \\ &\qquad + \int_{]0, t^*(x_n)[ \times X} I_\Gamma(t,x)\, Q_n^{g,u}(dx \mid h_n, t)\, \lambda_n^u(h_n, t)\, e^{-\Lambda_n^u(h_n, t)}\, dt \Big], \end{aligned}$$
     where $\Gamma \in \mathcal{B}(\mathbb{R}_+^* \times X_\infty)$ and $h_n = (x_0, \theta_1, x_1, \ldots, \theta_n, x_n) \in \mathbf{H}_n$.
     $G_n$ is the joint distribution of the next sojourn time and the next state.
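The kernel $G_n$ suggests a direct way to sample the next sojourn time and state when $x_n \in X$ and $t^*(x_n) < \infty$: draw a unit exponential variable $E$, solve $\Lambda_n^u(h_n, \theta) = E$, and declare a stochastic jump at $\theta$ if $\theta < t^*(x_n)$, otherwise a boundary jump at $t^*(x_n)$. The sketch below follows that recipe on a hypothetical model; the flow, the intensity $\lambda_n^u(h_n, t) = \text{rate} \cdot x_n e^{t}$ and both post-jump distributions are assumptions made for illustration, not the authors' example.

```python
import math
import random

def sample_next(x_n: float, rng: random.Random, rate: float = 2.0):
    """Sample (sojourn time, next state) for x_n in X = (0, 1).

    Hypothetical model: flow phi(x, t) = x * exp(t), so t*(x_n) = -log(x_n),
    and Lambda(t) = rate * x_n * (exp(t) - 1), which inverts in closed form.
    """
    t_star = -math.log(x_n)
    e = rng.expovariate(1.0)                      # unit exponential variable
    theta = math.log(1.0 + e / (rate * x_n))      # solves Lambda(theta) = e
    if theta < t_star:
        # Stochastic jump: hypothetical Q^{g,u}, uniform on [0.05, 0.95).
        return theta, rng.uniform(0.05, 0.95)
    # Boundary jump at t*(x_n): hypothetical Q^{i,u}, a reset to 0.1.
    return t_star, 0.1

rng = random.Random(0)
theta, x_next = sample_next(0.5, rng)
print(theta, x_next)
```

Under these assumptions the probability of reaching the boundary before a stochastic jump is $e^{-\Lambda_n^u(h_n, t^*(x_n))} = e^{-\text{rate}\,(1 - x_n)}$, which is the weight of the atom at $t^*(x_n)$ in $G_n$.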

  17. Controlled piecewise deterministic Markov processes: Admissible strategies and conditional distribution
     Consider an admissible strategy $u \in \mathcal{U}$ and an initial state $x_0 \in X$. There exists a probability $P_{x_0}^u$ on $(\Omega, \mathcal{F})$ such that the restriction of $P_{x_0}^u$ to $(\Omega, \mathcal{F}_0)$ is given by
     $$P_{x_0}^u\big( \{x_0\} \times (\mathbb{R}_+^* \times X_\infty)^\infty \big) = 1$$
     and the positive random measure $\nu$ defined on $\mathbb{R}_+^* \times X$ by
     $$\nu(dt, dx) = \sum_{n \in \mathbb{N}} \frac{G_n(dt - T_n, dx \mid H_n)}{G_n([t - T_n, +\infty] \times X_\infty \mid H_n)}\, I_{\{T_n < t \le T_{n+1}\}}$$
     is the predictable projection of $\mu$ with respect to $P_{x_0}^u$.
     → The conditional distribution of $(\Theta_{n+1}, X_{n+1})$ given $\mathcal{F}_{T_n}$ under $P_{x_0}^u$ is determined by $G_n(\cdot \mid H_n)$.
