  1. Dirac-based Reduction Techniques for Quantitative Analysis of Discrete-time Markov Models Mohmmadsadegh Mohagheghi and Behrang Chaboki Vali-e-Asr University of Rafsanjan

  2. Probabilistic Model Checking

  3. Probabilistic Model Checking • A Markov Decision Process (MDP) is a tuple M = (S, s₀, Act, P, R) where: • S is a set of states, • s₀ is the initial state, • Act is a finite set of actions, • P is a probabilistic transition function, • R is a reward function.
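
To make the later algorithm sketches concrete, here is a minimal illustrative MDP encoding in Python. This is not from the presentation; the dictionary layout and the example states, actions, and numbers are assumptions.

```python
# Minimal illustrative MDP M = (S, s0, Act, P, R) as plain Python dictionaries.
# P maps (state, action) pairs to successor-probability distributions;
# R assigns a reward to each state. All names and numbers are made up.
mdp = {
    "S": ["s0", "s1", "s2", "goal"],
    "s0": "s0",                               # initial state
    "Act": ["a", "b"],
    "P": {
        ("s0", "a"): {"s1": 0.5, "s2": 0.5},
        ("s0", "b"): {"s2": 1.0},             # a Dirac (probability-1) transition
        ("s1", "a"): {"goal": 1.0},           # another Dirac transition
        ("s2", "a"): {"s0": 0.3, "goal": 0.7},
        ("goal", "a"): {"goal": 1.0},
    },
    "R": {"s0": 1.0, "s1": 2.0, "s2": 0.5, "goal": 0.0},
}
```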

  4. Probabilistic Model Checking • A policy is used to resolve the non-deterministic choices of an MDP. • A policy π : S → Act selects one action for each state s. • For an MDP M, every policy π induces a DTMC.
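
Applying a policy to the illustrative `mdp` dictionary above yields the induced DTMC; a minimal sketch (function and variable names are assumptions) simply replaces each state's set of action choices by the distribution of the chosen action:

```python
def induce_dtmc(mdp, policy):
    """Resolve the MDP's nondeterminism with a memoryless policy pi: S -> Act,
    returning the transition function of the induced DTMC (illustrative sketch)."""
    return {s: mdp["P"][(s, policy[s])] for s in mdp["S"]}

# Example: the DTMC induced by the policy that always picks action "a".
dtmc_P = induce_dtmc(mdp, {s: "a" for s in mdp["S"]})
```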

  5. Numeric Computations • Reachability probabilities: the (maximal or minimal, for MDPs) probability of eventually reaching one of the goal states • Expected rewards: the (maximal or minimal) expected accumulated reward until reaching a goal state

  6. Numeric Computations • Extremal reachability probabilities and expected rewards can be computed by: • solving a linear program (exact solutions), or • using iterative methods (in practice)
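
For reference, the linear-programming characterization alluded to here is standard; a sketch for maximal reachability probabilities (assuming states that cannot reach the goal set G have already been removed by precomputation) is:

```latex
% LP whose optimal solution gives x_s = maximal probability of reaching G from s.
\begin{align*}
\text{minimize}\quad & \sum_{s \in S} x_s\\
\text{subject to}\quad & x_s = 1 && \text{for } s \in G,\\
& x_s \ge \sum_{s' \in Post(s,\alpha)} P(s,\alpha,s') \cdot x_{s'} && \text{for } s \in S \setminus G,\ \alpha \in Act(s).
\end{align*}
```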

  7. Jacobi Iterative Method (DTMCs) • Starting from an initial vector V of state values, in each iteration update $V_s$ for every state s ∈ S as:
     Reachability probabilities: $V_s = \sum_{s' \in Post(s)} P(s, s') \cdot V_{s'}$
     Expected rewards: $V_s = R(s) + \sum_{s' \in Post(s)} P(s, s') \cdot V_{s'}$
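
A minimal sketch of this Jacobi update for reachability probabilities on a DTMC, reusing the dictionary-based transition function from the sketches above (function and parameter names are assumptions, not the presented implementation):

```python
def jacobi_reachability(P, states, goal, iters=10000, eps=1e-6):
    """Jacobi iteration for DTMC reachability probabilities (illustrative sketch).
    P[s] is a dict {successor: probability}; goal is the set of goal states."""
    V = {s: (1.0 if s in goal else 0.0) for s in states}
    for _ in range(iters):
        # Jacobi: every state is updated from the previous vector V.
        V_new = {s: 1.0 if s in goal else
                    sum(p * V[t] for t, p in P[s].items())
                 for s in states}
        if max(abs(V_new[s] - V[s]) for s in states) < eps:
            return V_new
        V = V_new
    return V

# Example: probability of eventually reaching "goal" in the induced DTMC.
probs = jacobi_reachability(dtmc_P, mdp["S"], goal={"goal"})
```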

  8. Value Iteration (MDPs) • Starting from an initial vector V of state values, in each iteration update $V_s$ for every state s ∈ S as:
     Reachability probabilities: $V_s = \max_{\alpha \in Act(s)} \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V_{s'}$
     Expected rewards: $V_s = \max_{\alpha \in Act(s)} \big( R(s) + \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V_{s'} \big)$

  9. Value Iteration (MDPs) • Starting from an initial vector V of state values, in each iteration update $V_s$ for every state s ∈ S as:
     $V_s = \max_{\alpha \in Act(s)} \big( R(s) + \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V_{s'} \big)$
   • Termination criterion: $\max_{s \in S} (V_s - V_s^{old}) < \varepsilon$ for some small ε (e.g. $10^{-6}$)
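
A corresponding sketch of value iteration for maximal reachability probabilities, including the termination test above (again assuming the illustrative `mdp` dictionary; this is not the PRISM implementation):

```python
def value_iteration_max_reach(mdp, goal, eps=1e-6):
    """Value iteration for maximal reachability probabilities (illustrative sketch)."""
    S, P = mdp["S"], mdp["P"]
    V = {s: (1.0 if s in goal else 0.0) for s in S}
    while True:
        V_new = {}
        for s in S:
            if s in goal:
                V_new[s] = 1.0
            else:
                # Maximize the expected value over the actions enabled in s.
                V_new[s] = max(
                    sum(p * V[t] for t, p in dist.items())
                    for (src, a), dist in P.items() if src == s
                )
        # Termination criterion: max_s |V_new(s) - V(s)| < eps.
        if max(abs(V_new[s] - V[s]) for s in S) < eps:
            return V_new
        V = V_new

max_probs = value_iteration_max_reach(mdp, goal={"goal"})
```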

  10. Policy Iteration • Select an initial policy π • Repeat: • Compute the values of the induced DTMC • Update π • Until no change in the policy
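
A compact sketch of this loop for maximal reachability probabilities, built on the `induce_dtmc` and `jacobi_reachability` sketches above; the function name and the strict-improvement guard are my assumptions:

```python
def policy_iteration(mdp, goal, eps=1e-6):
    """Policy iteration for maximal reachability probabilities (illustrative sketch)."""
    S, P = mdp["S"], mdp["P"]
    # Start from an arbitrary policy: any enabled action per state.
    policy = {s: next(a for (src, a) in P if src == s) for s in S}
    while True:
        # Compute the values of the DTMC induced by the current policy.
        V = jacobi_reachability(induce_dtmc(mdp, policy), S, goal, eps=eps)
        changed = False
        for s in S:
            best_a = policy[s]
            best_v = sum(p * V[t] for t, p in P[(s, best_a)].items())
            for (src, a), dist in P.items():
                if src != s or a == best_a:
                    continue
                v = sum(p * V[t] for t, p in dist.items())
                if v > best_v + 1e-12:       # switch only on strict improvement
                    best_a, best_v = a, v
            if best_a != policy[s]:
                policy[s], changed = best_a, True
        if not changed:                      # no change in the policy: done
            return policy, V
```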

  11. Dirac-based Reduction Technique • Idea: if P(s, s′) = 1 in a DTMC, the reachability probabilities of s and s′ are equal.

  12. Dirac-based Reduction Technique • Dirac transitions are used to partition S into classes. • The states of each class are connected by Dirac transitions and have the same reachability probability. • The iterative computations are then applied to the reduced DTMC.
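
A sketch of such a reduction for reachability probabilities, reusing the earlier sketches; the chain-following and merging strategy shown here is an illustrative assumption, not necessarily the authors' exact construction:

```python
def dirac_representative(P, s, goal):
    """Follow Dirac (probability-1) transitions from s until a goal state,
    a non-Dirac state, or a cycle is hit; that endpoint represents the class."""
    seen = {s}
    while s not in goal and len(P[s]) == 1:
        (t, p), = P[s].items()
        if p != 1.0 or t in seen:
            break
        seen.add(t)
        s = t
    return s

def reduce_dtmc(P, states, goal):
    """Collapse every Dirac chain onto its representative and redirect all
    transitions to representatives (illustrative sketch)."""
    rep = {s: dirac_representative(P, s, goal) for s in states}
    reps = list(set(rep.values()))
    P_red = {r: {} for r in reps}
    for r in reps:
        for t, p in P[r].items():
            P_red[r][rep[t]] = P_red[r].get(rep[t], 0.0) + p
    return P_red, rep

# Iterate only on the reduced DTMC, then copy values back along Dirac chains.
P_red, rep = reduce_dtmc(dtmc_P, mdp["S"], goal={"goal"})
V_red = jacobi_reachability(P_red, list(P_red), goal={"goal"})
V = {s: V_red[rep[s]] for s in mdp["S"]}
```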

  13. Dirac-based Reduction Technique • DTMC reduction can be used for policy iteration • Time complexity: Linear in the size of DTMCs

  14. Dirac-based Reduction Technique • Expected rewards: if P(s, s′) = 1, then $V_s = R(s) + V_{s'}$. • State rewards therefore have to be modified (accumulated) for the reduced DTMC.
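
Spelling this out (a short derivation using the notation above): the Bellman equation for expected rewards collapses at a state with a Dirac successor, and this telescopes along a whole Dirac chain, which is exactly the reward accumulation the reduced DTMC has to perform:

```latex
% If P(s, s') = 1, the reward equation collapses:
%   V_s = R(s) + \sum_{s''} P(s, s'') V_{s''} = R(s) + V_{s'}.
% Along a Dirac chain s_1 \to s_2 \to \dots \to s_k it telescopes to
\[
  V_{s_1} = R(s_1) + R(s_2) + \cdots + R(s_{k-1}) + V_{s_k},
\]
% so only the representative s_k is computed iteratively, and the state
% rewards of the collapsed states are accumulated into the reduced model.
```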

  15. Experimental Results • We implemented the Dirac-based methods in PRISM. • Available at: https://github.com/sadeghrk/prism/tree/DiracBased-Improving

  16. Experimental Results

  17. Questions?
