Reduction of continuous-time control to discrete-time control


  1. Reduction of continuous-time control to discrete-time control
     A. Jean-Marie
     OCOQS Meeting, 24 January 2012

  2. Outline
     1. Problem statement
     2. The basic model
     3. Uniformization
     4. Event model
     5. Application

  3. Progress: section 1, Problem statement

  4. Problem statement
     Consider some continuous-time, discrete-event, infinite-horizon control problem.
     The standard way to analyze such problems is to reduce them to a discrete-time problem, using some embedding of a discrete-time process into the continuous-time one.
     The optimal policy is deduced from the solution of the discrete-time problem.

  5. Problem statement (ctd)
     There are various ways to place the observation points:
     - jump instants,
     - controllable event instants,
     - uniformization instants.
     They may result in different value functions.
     Question: Is there a way to “play” with the embedding process in order to obtain structural properties of the optimal policy?

  6. Progress: section 2, The basic model

  7. A basic continuous-time control model
     As a starting point, consider:
     - a continuous-time, piecewise-constant process $\{X(t);\ t \ge 0\}$ over some discrete state space $\mathcal{X}$;
     - a sequence of decision instants $\{T_n;\ n \in \mathbb{N}\}$, endogenous;
     - a finite set of actions $\mathcal{A}$;
     - at a decision point $t$, given the current state $x = X(t)$, a feasible set of actions $\mathcal{A}_x \subset \mathcal{A}$.
     Assuming that action $a \in \mathcal{A}_x$ is applied:
     - a reward $r(x,a,y)$ is obtained;
     - the state jumps to a random $T_a(x)$ with distribution $P_{xay} = \mathbb{P}(T_a(x) = y)$;
     - given $y$, the next decision point is at $t + \tau$, where $\tau$ has an exponential distribution with parameter $\lambda_y$;
     - between decision points, a reward is accumulated at rate $\ell(X(t))$, piecewise constant by assumption.

  8. Basic model (ctd.)
     Reward criterion: expected total discounted reward. Given $X(0) = x$,
     $$J(x) = \mathbb{E}\left[ \int_0^\infty e^{-\alpha t}\,\ell(X(t))\,\mathrm{d}t + \sum_{n=1}^{\infty} e^{-\alpha T_n}\, r\big(X(T_n^-), A(T_n), X(T_n^+)\big) \right].$$
     The goal is to find the optimal feedback control $d : \mathcal{X} \to \mathcal{A}$ (with the constraint that $d(x) \in \mathcal{A}_x$ for all $x$) to maximize $J$.

  9. Basic embedding
     Features of this model:
     - control is instantaneous and localized in time;
     - evolution is strictly Markovian;
     - immediate generalization to semi-Markov decision/transition instants.
     Two possibilities for the observation of the process:
     - just before a transition/control: $\to V^-(x)$;
     - just after a transition/control: $\to V^+(x)$.
     Question: What is their relation with $J(x)$?

  10. Direct Bellman equations
     Conditioning on $T_1$, the first decision point, we get:
     $$V^+(x) = \frac{1}{\alpha + \lambda_x}\,\big( \ell(x) + \lambda_x V^-(x) \big)$$
     $$V^-(x) = \max_{a \in \mathcal{A}_x} \sum_y P_{xay}\,\big( r(x,a,y) + V^+(y) \big) = \max_{a \in \mathcal{A}_x} \mathbb{E}\big[ r(x,a,T_a(x)) + V^+(T_a(x)) \big].$$
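The factor $1/(\alpha + \lambda_x)$ in the equation for $V^+$ comes from two standard exponential integrals. A short derivation, writing $\tau \sim \mathrm{Exp}(\lambda_x)$ for the time until the first decision point (the symbol $\tau$ is reused here for illustration):

```latex
\begin{align*}
\mathbb{E}\left[e^{-\alpha\tau}\right]
  &= \int_0^\infty \lambda_x\, e^{-(\alpha+\lambda_x)t}\,\mathrm{d}t
   = \frac{\lambda_x}{\alpha+\lambda_x},\\
\mathbb{E}\left[\int_0^{\tau} e^{-\alpha t}\,\ell(x)\,\mathrm{d}t\right]
  &= \ell(x)\,\frac{1-\mathbb{E}\left[e^{-\alpha\tau}\right]}{\alpha}
   = \frac{\ell(x)}{\alpha+\lambda_x},\\
V^+(x)
  &= \mathbb{E}\left[\int_0^{\tau} e^{-\alpha t}\,\ell(x)\,\mathrm{d}t\right]
   + \mathbb{E}\left[e^{-\alpha\tau}\right] V^-(x)
   = \frac{\ell(x) + \lambda_x V^-(x)}{\alpha+\lambda_x}.
\end{align*}
```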

  11. Basic functional equations
     Eliminating $V^+$ or $V^-$ leads to two forms of Bellman’s equation:
     Bellman Equations
     $$V^+(x) = \frac{1}{\alpha + \lambda_x} \left[ \ell(x) + \lambda_x \max_{a \in \mathcal{A}_x} \sum_y P_{xay}\,\big( r(x,a,y) + V^+(y) \big) \right]$$
     $$V^-(x) = \max_{a \in \mathcal{A}_x} \sum_y P_{xay} \left[ r(x,a,y) + \frac{1}{\alpha + \lambda_y}\,\big( \ell(y) + \lambda_y V^-(y) \big) \right].$$
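As an illustration, the $V^+$ form can be solved numerically by successive approximation, since its right-hand side is a contraction with modulus at most $\max_x \lambda_x/(\alpha+\lambda_x) < 1$ when $\alpha > 0$. The sketch below is a minimal example and not the author's method: the function names, the random test model and the simplification $\mathcal{A}_x = \mathcal{A}$ for every state are assumptions made here.

```python
import numpy as np

def solve_v_plus(P, r, ell, lam, alpha, tol=1e-12, max_iter=100_000):
    """Iterate the V+ Bellman equation until successive iterates agree.

    P[x, a, y]: transition probabilities, r[x, a, y]: lump rewards,
    ell[x]: running reward rate, lam[x]: decision rate, alpha: discount rate.
    Assumption: every action is feasible in every state (A_x = A).
    """
    V = np.zeros(len(ell))
    for _ in range(max_iter):
        # Q[x, a] = sum_y P[x, a, y] * (r[x, a, y] + V[y])
        Q = np.einsum("xay,xay->xa", P, r + V[None, None, :])
        V_new = (ell + lam * Q.max(axis=1)) / (alpha + lam)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
    raise RuntimeError("value iteration did not converge")

def v_minus_from_v_plus(P, r, V_plus):
    """Recover V- from V+ through V-(x) = max_a sum_y P_xay (r + V+(y))."""
    Q = np.einsum("xay,xay->xa", P, r + V_plus[None, None, :])
    return Q.max(axis=1)

# Hypothetical random model, used only to exercise the functions.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)          # each P[x, a, :] sums to 1
r = rng.random((n_states, n_actions, n_states))
ell = rng.random(n_states)
lam = 1.0 + rng.random(n_states)
alpha = 0.1

V_plus, policy = solve_v_plus(P, r, ell, lam, alpha)
V_minus = v_minus_from_v_plus(P, r, V_plus)

# Residual of the direct relation V+ = (ell + lam * V-) / (alpha + lam).
residual = np.max(np.abs(V_plus - (ell + lam * V_minus) / (alpha + lam)))
print("max residual of the direct relation:", residual)
```

The array `policy` holds a maximizing action index per state at the last iterate, which plays the role of the feedback control $d$.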

  12. Progress: section 3, Uniformization

  13. Uniformization à la carte
     For each state $x$, define $\nu_x \ge \lambda_x$ and introduce a new, uncontrollable transition point after $\tau \sim \mathrm{Exp}(\nu_x)$.
     Extend the state space to $\mathcal{X} \times \{r, u\}$, where $r$ = regular event and $u$ = uniformization event.
     Table of rewards and transition probabilities:

     $x'$       $a$    $y'$       $r(x',a,y')$    $P_{x'ay'}$
     $(x,r)$    $a$    $(y,r)$    $r(x,a,y)$      $P_{xay}\,\lambda_y/\nu_y$
     $(x,r)$    $a$    $(y,u)$    $r(x,a,y)$      $P_{xay}\,(\nu_y-\lambda_y)/\nu_y$
     $(x,u)$    $*$    $(x,r)$    $0$             $\lambda_x/\nu_x$
     $(x,u)$    $*$    $(x,u)$    $0$             $(\nu_x-\lambda_x)/\nu_x$

     Running reward: $\ell(x,e) = \ell(x)$; transition rate: $\lambda(x,e) = \nu_x$.

  14. Relationships
     Lemma
     Let $V(\cdot)$ be the direct value function and $V_u(\cdot,\cdot)$ be the uniformized value function. Then:
     $$V_u^-(x,r) = V^-(x)$$
     $$V_u^-(x,u) = V^+(x)$$
     $$V_u^+(x,r) = \frac{1}{\alpha + \nu_x}\,\big( \ell(x) + \nu_x V^-(x) \big)$$
     $$V_u^+(x,u) = \frac{1}{\alpha + \nu_x}\,\big( \ell(x) + \nu_x V^+(x) \big).$$
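The identities of the lemma can be cross-checked against the direct equation $(\alpha+\lambda_x)\,V^+(x) = \ell(x) + \lambda_x V^-(x)$. The sketch below assumes that a uniformization event leaves the state $x$ unchanged and that the event following it is regular with probability $\lambda_x/\nu_x$; this reading of the uniformized model is an assumption made for the check.

```latex
\begin{align*}
V_u^-(x,u)
  &= \frac{\lambda_x}{\nu_x}\, V_u^+(x,r)
   + \frac{\nu_x-\lambda_x}{\nu_x}\, V_u^+(x,u) \\
  &= \frac{\lambda_x\big(\ell(x)+\nu_x V^-(x)\big)
         + (\nu_x-\lambda_x)\big(\ell(x)+\nu_x V^+(x)\big)}
          {\nu_x(\alpha+\nu_x)} \\
  &= \frac{\ell(x)+\lambda_x V^-(x) + (\nu_x-\lambda_x)V^+(x)}{\alpha+\nu_x} \\
  &= \frac{(\alpha+\lambda_x)V^+(x) + (\nu_x-\lambda_x)V^+(x)}{\alpha+\nu_x}
   = V^+(x).
\end{align*}
```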

  15. Interpretations
     No uniformization ($\nu_x = \lambda_x$):
     $$V_u^+(x,r) = \frac{1}{\alpha + \lambda_x}\,\big( \ell(x) + \lambda_x V^-(x) \big) = V^+(x)$$
     $$V_u^+(x,u) = \mathbb{E}\left[ \int_0^{T_1} e^{-\alpha u}\,\ell(x)\,\mathrm{d}u + e^{-\alpha T_1}\, V^+(x) \right].$$
     Hyper-frequent uniformization ($\nu_x \to \infty$):
     $$\lim_{\nu_x \to \infty} V_u^+(x,r) = V^-(x) = V_u^-(x,r)$$
     $$\lim_{\nu_x \to \infty} V_u^+(x,u) = V^+(x) = V_u^-(x,u).$$
     No discounting ($\alpha \to 0$):
     $$V_u^+(x,r) \sim \frac{\ell(x)}{\nu_x} + V^-(x)$$
     $$V_u^+(x,u) \sim \frac{\ell(x)}{\nu_x} + V^+(x).$$

  16. Bellman equations for the uniformized process
     Lemma
     The basic value functions $V^+$ and $V^-$ satisfy:
     $$V^+(x) = \frac{1}{\alpha + \nu_x} \left[ \ell(x) + (\nu_x - \lambda_x)\,V^+(x) + \lambda_x \max_{a \in \mathcal{A}_x} \sum_y P_{xay}\,\big( r(x,a,y) + V^+(y) \big) \right]$$
     $$V^-(x) = \frac{1}{\alpha + \nu_x} \left[ (\nu_x - \lambda_x)\,V^-(x) + (\alpha + \lambda_x) \max_{a \in \mathcal{A}_x} \sum_y P_{xay} \left( r(x,a,y) + \frac{1}{\alpha + \lambda_y}\,\big( \ell(y) + \lambda_y V^-(y) \big) \right) \right].$$
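A numerical check of the first equation: iterating the uniformized operator with any rates $\nu_x \ge \lambda_x$ should converge to the same $V^+$ as the basic operator, which is the special case $\nu_x = \lambda_x$. The sketch below is illustrative only; the function names, the random model and the assumption $\mathcal{A}_x = \mathcal{A}$ are introduced here.

```python
import numpy as np

def uniformized_backup(V, P, r, ell, lam, nu, alpha):
    """One application of the uniformized V+ operator (requires nu >= lam).

    With nu == lam the self-loop term vanishes and this is the basic operator.
    Assumption: every action is feasible in every state (A_x = A).
    """
    Q = np.einsum("xay,xay->xa", P, r + V[None, None, :])
    return (ell + (nu - lam) * V + lam * Q.max(axis=1)) / (alpha + nu)

def fixed_point(backup, n_states, tol=1e-12, max_iter=1_000_000):
    """Successive approximation from V = 0 until iterates differ by < tol."""
    V = np.zeros(n_states)
    for _ in range(max_iter):
        V_new = backup(V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    raise RuntimeError("iteration did not converge")

# Hypothetical random model, used only for the comparison.
rng = np.random.default_rng(1)
n_states, n_actions = 5, 2
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)          # each P[x, a, :] sums to 1
r = rng.random((n_states, n_actions, n_states))
ell = rng.random(n_states)
lam = 1.0 + rng.random(n_states)
nu = lam + 5.0 * rng.random(n_states)      # state-dependent, nu_x >= lam_x
alpha = 0.1

V_basic = fixed_point(
    lambda V: uniformized_backup(V, P, r, ell, lam, lam, alpha), n_states)
V_unif = fixed_point(
    lambda V: uniformized_backup(V, P, r, ell, lam, nu, alpha), n_states)

print("max |V_basic - V_unif| =", np.max(np.abs(V_basic - V_unif)))
```

Larger $\nu_x$ leaves the fixed point unchanged but raises the contraction modulus $\nu_x/(\alpha+\nu_x)$, so the uniformized iteration generally needs more sweeps to reach the same tolerance.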

  17. Progress: section 4, Event model

  18. The event model
     If transitions have several “types”, the strictly Markovian model requires extending the state space: $x = (s, e)$, with $s$ the actual system state and $e$ the event type. We get:
     $$V^+(s,e) = \frac{1}{\alpha + \lambda_{s,e}} \left[ \ell(s,e) + \lambda_{s,e} \max_{a \in \mathcal{A}_{s,e}} \sum_{s',e'} P\big((s,e); a; (s',e')\big)\,\Big( r\big((s,e), a, (s',e')\big) + V^+(s',e') \Big) \right]$$
     $$V^-(s,e) = \max_{a \in \mathcal{A}_{s,e}} \sum_{s',e'} P\big((s,e); a; (s',e')\big) \left[ r\big((s,e), a, (s',e')\big) + \frac{\ell(s',e') + \lambda_{s',e'}\, V^-(s',e')}{\alpha + \lambda_{s',e'}} \right].$$

  19. The event model
     Question: Under which conditions is it possible to “get rid” of the event part in the state representation?
     Is it possible that
     $$V^+(s,e) = V^+(s) \quad \forall e\,?$$

  20. Progress: section 5, Application
