PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games


  1. PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games. Pranav Ashok, Jan Křetínský, Maximilian Weininger, Technical University of Munich. Highlights of Logic, Automata and Games, Warsaw, Poland, September 19, 2019. Based on a paper presented at CAV 2019.

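  2. Stochastic Game Reachability. [Figure: example stochastic game with edge labels a, b, c and transition probabilities 0.2 and 0.8.] Objective: one player maximizes P(F), the probability of reaching the target; the other player minimizes P(F). Reachability in limited information stochastic games 2/6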

  5. This work: black-box (limited information) setting, where the successor distributions are unknown. Problem statement: compute $V(s) = \max_{\sigma} \min_{\tau} \mathbb{P}_s^{\sigma,\tau}(\mathsf{F}) = \min_{\tau} \max_{\sigma} \mathbb{P}_s^{\sigma,\tau}(\mathsf{F})$ with guarantees.

  8. Background
     ◮ Seminal paper on stochastic games [Condon 90]: quadratic programming, strategy iteration, value iteration
     ◮ Algorithms not directly applicable to general SG
     ◮ First practical algorithm for general SG giving guarantees [Kelmendi et al. 2018]
     ◮ This work: first algorithm for limited information SG
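To make the value iteration mentioned above concrete, here is a minimal sketch on a white-box game, where the transition probabilities are known. The toy game, the state names and the function `value_iteration` are all hypothetical and are not the example from the slides. The sketch iterates from below, so in general it yields only a lower bound that converges in the limit, without a stopping criterion that gives guarantees.

```python
from typing import Dict, List, Tuple

# Hypothetical toy game: owner of each state, and for each action a list of
# (successor, probability) pairs. Not the example shown in the talk.
OWNER: Dict[str, str] = {"s0": "max", "s1": "min", "goal": "max", "sink": "max"}
ACTIONS: Dict[str, Dict[str, List[Tuple[str, float]]]] = {
    "s0": {"a": [("s1", 1.0)], "b": [("goal", 0.2), ("sink", 0.8)]},
    "s1": {"c": [("goal", 0.5), ("sink", 0.5)], "d": [("goal", 0.9), ("s0", 0.1)]},
    "goal": {"stay": [("goal", 1.0)]},
    "sink": {"stay": [("sink", 1.0)]},
}


def value_iteration(owner, actions, target, iterations=1000):
    """Iterate the Bellman operator from below: 1 in the target, 0 elsewhere."""
    values = {s: (1.0 if s == target else 0.0) for s in actions}
    for _ in range(iterations):
        new_values = {}
        for s, acts in actions.items():
            if s == target:
                new_values[s] = 1.0
                continue
            # Expected value of each action under the current estimates.
            action_values = [sum(p * values[t] for t, p in succ) for succ in acts.values()]
            # The maximizer picks the best action, the minimizer the worst.
            new_values[s] = max(action_values) if owner[s] == "max" else min(action_values)
        values = new_values
    return values


values = value_iteration(OWNER, ACTIONS, "goal")
print(values["s0"], values["s1"])
```

In this toy game the minimizer in `s1` prefers action `c`, so both `s0` and `s1` have value 0.5.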

  13. The Algorithm
      Similar to Kelmendi et al. 2018:
      while U − L is large:
        1. Simulate and estimate
        2. Back-propagate
      The how:
      ◮ Simulation finds the important parts of the state space
      ◮ Simulation computes Hoeffding confidence intervals: a ball around the estimate such that the real probability falls in the ball with high confidence
      ◮ Information is conservatively back-propagated
      ◮ Other tricks ensure fixpoint convergence
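The two steps above can be illustrated for a single state-action pair. This is a sketch under stated assumptions, not the authors' implementation. It uses the textbook two-sided Hoeffding bound, and it assumes that "conservative" means assigning the unconfirmed probability mass to the worst observed successor for the lower bound and to the best one for the upper bound. All function names are hypothetical, and `delta` is the error budget for this one action. A full algorithm would have to split the overall budget across all transitions.

```python
import math
from typing import Dict, Tuple


def hoeffding_half_width(num_samples: int, delta: float) -> float:
    """Radius c with P(|estimate - p| >= c) <= delta (two-sided Hoeffding)."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * num_samples))


def lower_estimates(counts: Dict[str, int], delta: float) -> Dict[str, float]:
    """Lower confidence bound on each successor probability of one action.

    Assumes counts is non-empty, i.e. the action was sampled at least once.
    """
    n = sum(counts.values())
    c = hoeffding_half_width(n, delta)
    return {t: max(0.0, k / n - c) for t, k in counts.items()}


def conservative_bounds(
    counts: Dict[str, int],
    lower: Dict[str, float],
    upper: Dict[str, float],
    delta: float,
) -> Tuple[float, float]:
    """Back-propagate successor bounds through one sampled action."""
    p_low = lower_estimates(counts, delta)
    rest = 1.0 - sum(p_low.values())  # probability mass not yet confirmed
    # Assumption: unconfirmed mass goes to the worst successor for the lower
    # bound and to the best successor for the upper bound.
    lo = sum(p * lower[t] for t, p in p_low.items()) + rest * min(lower[t] for t in counts)
    hi = sum(p * upper[t] for t, p in p_low.items()) + rest * max(upper[t] for t in counts)
    return lo, hi


# Hypothetical data: 1000 simulated steps, 800 reached "goal", 200 reached "sink".
counts = {"goal": 800, "sink": 200}
bounds = {"goal": 1.0, "sink": 0.0}
lo, hi = conservative_bounds(counts, bounds, bounds, delta=0.01)
print(lo, hi)
```

With these counts the empirical estimate 0.8 lies between the two bounds, and the gap between them shrinks as more samples are collected.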

  14. Conclusion
      ◮ Algorithm for reachability in limited information MDP/SG: result $\in [0.6 - \epsilon, 0.6 + \epsilon]$ with probability of going wrong $10^{-8}$
      ◮ Implemented and benchmarked in the PRISM Model Checker
      ◮ First algorithm to do so for SG
      ◮ First practical algorithm for MDPs
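To give a feel for what an error probability of $10^{-8}$ costs, the sketch below inverts the two-sided Hoeffding bound to get the number of samples needed to estimate one transition probability to within ε. This is an illustration only: the function name `samples_needed` is hypothetical, and the sample complexity of the full algorithm is not derived from this formula alone.

```python
import math


def samples_needed(epsilon: float, delta: float) -> int:
    """Smallest n with 2 * exp(-2 * n * epsilon**2) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


# Precision 0.01 with error probability 1e-8, for one transition probability.
print(samples_needed(0.01, 1e-8))
```

For ε = 0.01 and δ = 10^−8 this evaluates to 95570 samples. Because δ enters only through a logarithm, tightening δ is cheap compared with tightening ε.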
