Dirac-based Reduction Techniques for Quantitative Analysis of Discrete-time Markov Models
Mohammadsadegh Mohagheghi and Behrang Chaboki
Vali-e-Asr University of Rafsanjan
Probabilistic Model Checking
Probabilistic Model Checking
• A Markov Decision Process (MDP) is a tuple M = (S, s0, Act, P, R) where:
  • S is a set of states,
  • s0 is the initial state,
  • Act is a finite set of actions,
  • P is a probabilistic transition function,
  • R is a reward function.
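A minimal sketch of how such a tuple can be represented in code (Python; the dictionary layout and all names here are illustrative assumptions, not taken from the slides or from any particular tool):

# Minimal MDP representation: states are integers, P[s][a] maps each action a
# enabled in state s to a list of (successor, probability) pairs, R[s] is the
# state reward, and "init" is the initial state s0.
mdp = {
    "states": [0, 1, 2, 3],
    "init": 0,
    "actions": ["a", "b"],
    "P": {
        0: {"a": [(1, 0.5), (2, 0.5)], "b": [(3, 1.0)]},
        1: {"a": [(3, 1.0)]},
        2: {"b": [(2, 1.0)]},
        3: {"a": [(3, 1.0)]},
    },
    "R": {0: 1.0, 1: 2.0, 2: 0.0, 3: 0.0},
}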
Probabilistic Model Checking
• A policy is used to resolve the non-deterministic choices of an MDP.
• A policy $\rho : S \to Act$ selects one action for each state s.
• For an MDP M, every policy $\rho$ induces a DTMC.
Numeric Computations
• Reachability Probabilities: the (maximal or minimal, for MDPs) probability of eventually reaching one of the goal states.
• Expected Rewards: the (maximal or minimal, for MDPs) expected accumulated reward until reaching a goal state.
Numeric Computations
• Extremal Reachability Probabilities & Expected Rewards can be computed by:
  • solving a linear program (exact solutions), or
  • using iterative methods (the approach used in practice).
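For maximal reachability probabilities, one standard linear-programming formulation reads as follows (a sketch for illustration, not taken from the slides; it assumes the states that cannot reach the goal set G have been precomputed):

\begin{align*}
\text{minimize}\quad & \sum_{s \in S} x_s\\
\text{subject to}\quad & x_s = 1 && \text{for } s \in G,\\
& x_s = 0 && \text{for } s \text{ that cannot reach } G,\\
& x_s \ge \sum_{s' \in S} P(s,\alpha,s')\, x_{s'} && \text{for all other } s \text{ and all } \alpha \in Act(s).
\end{align*}

Its optimal solution assigns to each state its maximal reachability probability; for large models, however, iterative methods are preferred in practice.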
Jacobi Iterative Method (DTMCs)
• Starting from an initial vector V of state values, in each iteration update V(s) for every state $s \in S$ as:
  • Reachability probabilities: $V(s) = \sum_{s' \in Post(s)} P(s, s') \cdot V(s')$
  • Expected rewards: $V(s) = R(s) + \sum_{s' \in Post(s)} P(s, s') \cdot V(s')$
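A minimal sketch of this update for reachability probabilities (Python/NumPy; the function and variable names are illustrative assumptions, not part of any tool):

import numpy as np

def jacobi_reachability(P, goal, tol=1e-6, max_iters=100_000):
    # P: (n x n) transition probability matrix of the DTMC,
    # goal: set of goal-state indices. Jacobi style: every state is updated
    # from the value vector of the previous iteration.
    n = P.shape[0]
    V = np.zeros(n)
    V[list(goal)] = 1.0
    for _ in range(max_iters):
        V_new = P @ V                # V_new(s) = sum_{s' in Post(s)} P(s, s') * V(s')
        V_new[list(goal)] = 1.0      # goal states keep probability 1
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

The expected-reward variant adds the state reward R(s) in each update and fixes the goal states at 0 instead of 1.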
Value Iteration (MDPs)
• Starting from an initial vector V of state values, in each iteration update V(s) for every state $s \in S$ as:
  • Reachability probabilities: $V(s) = \max_{\alpha \in Act(s)} \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V(s')$
  • Expected rewards: $V(s) = \max_{\alpha \in Act(s)} \big( R(s) + \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V(s') \big)$
Value Iteration (MDPs)
• Starting from an initial vector V of state values, in each iteration update V(s) for every state $s \in S$ as:
  $V(s) = \max_{\alpha \in Act(s)} \big( R(s) + \sum_{s' \in Post(s,\alpha)} P(s, \alpha, s') \cdot V(s') \big)$
• Termination criterion: $\max_{s \in S} |V(s) - V_{old}(s)| < \varepsilon$ for some small ε (e.g. $10^{-6}$)
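A minimal sketch of value iteration for maximal expected rewards with this termination criterion (Python; the data layout and names are illustrative assumptions, and it assumes the goal is reached with probability 1 so that the values stay finite):

def value_iteration_rewards(S, Act, P, R, goal, eps=1e-6):
    # S: list of states, Act[s]: actions enabled in s,
    # P[(s, a)]: list of (successor, probability) pairs, R[s]: state reward.
    V = {s: 0.0 for s in S}
    while True:
        V_old = dict(V)
        for s in S:
            if s in goal:
                continue  # reward accumulation stops at goal states
            V[s] = max(
                R[s] + sum(p * V_old[t] for t, p in P[(s, a)])
                for a in Act[s]
            )
        # terminate when the maximal change over all states drops below eps
        if max(abs(V[s] - V_old[s]) for s in S) < eps:
            return V

For maximal reachability probabilities, the same loop is used with R(s) dropped and the goal states initialised to (and kept at) 1.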
Policy Iteration
  Select a policy ρ
  repeat
    Compute the values of the induced DTMC
    Update ρ
  until there is no change in the policy
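A minimal sketch of this loop (Python; solve_dtmc is a hypothetical parameter standing for any DTMC solver, e.g. the Jacobi sketch above applied to the DTMC induced by the current policy, and all other names are illustrative):

def policy_iteration(S, Act, P, R, goal, solve_dtmc):
    # Start from an arbitrary policy and improve it greedily until it is stable.
    policy = {s: next(iter(Act[s])) for s in S}
    while True:
        V = solve_dtmc(policy)   # values of the DTMC induced by the current policy
        changed = False
        for s in S:
            if s in goal:
                continue
            # pick the action with the best one-step backup (ties broken by iteration order)
            best = max(Act[s], key=lambda a: R[s] + sum(p * V[t] for t, p in P[(s, a)]))
            if best != policy[s]:
                policy[s] = best
                changed = True
        if not changed:
            return policy, V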
Dirac-based Reduction Technique
• Idea: if P(s, s') = 1 in a DTMC (a Dirac transition), the reachability probabilities of s and s' are equal.
Dirac-based Reduction Technique
• Dirac transitions are used to classify the states of S.
• The states of each class are connected by Dirac transitions and have the same reachability probabilities.
• Apply the iterative computations on the reduced DTMC (one representative per class); see the sketch below.
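A minimal sketch of how the classes can be computed by following Dirac transitions (Python; P[s] is assumed to map the successors of s to their probabilities, and all names are illustrative):

def dirac_classes(S, P):
    # rep[s] is the representative state of the Dirac class of s:
    # s and rep[s] have the same reachability probability.
    rep = {}
    def find_rep(s, seen=frozenset()):
        if s in rep:
            return rep[s]
        for t, prob in P.get(s, {}).items():
            if prob == 1.0 and t != s and t not in seen:
                rep[s] = find_rep(t, seen | {s})   # follow the Dirac transition
                return rep[s]
        rep[s] = s   # no outgoing Dirac transition: s represents its own class
        return s
    for s in S:
        find_rep(s)
    return rep

The iterative method then only updates the representatives; the value of every other state is read off from rep[s].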
Dirac-based Reduction Technique
• The DTMC reduction can be used within policy iteration.
• Time complexity: linear in the size of the DTMC.
Dirac-based Reduction Technique
• Expected rewards: if P(s, s') = 1 then V(s) = V(s') + R(s).
• State rewards must therefore be adjusted in the reduced DTMC.
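Concretely, for a chain of Dirac transitions $s_0 \to s_1 \to \cdots \to s_k$, unfolding this relation gives (a worked equation for illustration, not taken verbatim from the slides):

\[
V(s_0) \;=\; R(s_0) + V(s_1) \;=\; \cdots \;=\; \sum_{i=0}^{k-1} R(s_i) \;+\; V(s_k),
\]

so in the reduced DTMC only the representative $s_k$ is computed iteratively, and every collapsed state adds the accumulated rewards of its prefix as a fixed offset.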
Experimental Results
• We implemented the Dirac-based methods in PRISM.
• Available at: https://github.com/sadeghrk/prism/tree/DiracBased-Improving
Experimental Results
Questions?