

Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs. V. Bruyère (UMONS), E. Filiot (ULB), M. Randour (UMONS-ULB), J.-F. Raskin (ULB). Grenoble, 05.04.2014. SR 2014: 2nd International Workshop on Strategic Reasoning.



  2. The talk in two slides (1/2)

     Verification and synthesis:
     - a reactive system to control,
     - an interacting environment,
     - a specification to enforce.

     Focus on quantitative properties. Several ways to look at the interactions, and in particular, the nature of the environment.

     Running header: Context, BWC Synthesis, Mean-Payoff, Shortest Path, Conclusion. Running footer: Beyond Worst-Case Synthesis (Bruyère, Filiot, Randour, Raskin), 26 slides.

  4. The talk in two slides (2/2)

     - Games: antagonistic adversary, guarantees on the worst case.
     - MDPs: stochastic adversary, optimize the expected value.
     - Games ∧ MDPs: BWC synthesis, ensure both.
     - Studied value functions: Mean-Payoff and Shortest Path.

  7. Advertisement

     Featured in STACS'14 [BFRR14]. Full paper available on arXiv: abs/1309.5439.

  8. Outline

     1. Context
     2. BWC Synthesis
     3. Mean-Payoff
     4. Shortest Path
     5. Conclusion

  10. Quantitative games on graphs

      - Graph $\mathcal{G} = (S, E, w)$ with $w \colon E \to \mathbb{Z}$.
      - Two-player game $G = (\mathcal{G}, S_1, S_2)$. The figure draws $\mathcal{P}_1$ states and $\mathcal{P}_2$ states with two different node symbols (the symbols did not survive extraction).
      - Plays have values: $f \colon \mathsf{Plays}(\mathcal{G}) \to \mathbb{R} \cup \{-\infty, \infty\}$.
      - Players follow strategies: $\lambda_i \colon \mathsf{Prefs}_i(G) \to \mathcal{D}(S)$.
      - Finite memory $\Rightarrow$ stochastic-output Moore machine $\mathcal{M}(\lambda_i) = (\mathsf{Mem}, m_0, \alpha_u, \alpha_n)$.

      [Figure: example game graph with edge weights 2, 2, 5, −1, 7, −4. The last build of the slide adds the annotation "Then, $(2, 5, 2)^\omega$".]
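As an illustration of a value function $f$ on plays, the following minimal Python sketch computes the mean-payoff of an ultimately periodic play (a finite prefix followed by a cycle repeated forever). The graph, the state names and the helper names are hypothetical and are not the graph drawn on the slide; the sketch only relies on the fact that the long-run average of such a play equals the average weight of its cycle.

```python
from fractions import Fraction

# Hypothetical weighted graph: (source, target) -> integer weight.
# State names and weights are illustrative, not taken from the slide.
WEIGHTS = {
    ("a", "a"): -1,
    ("a", "b"): 2,
    ("b", "c"): 5,
    ("c", "a"): 2,
}


def path_weights(states, weights):
    """Return the weights of the edges between consecutive states."""
    return [weights[(s, t)] for s, t in zip(states, states[1:])]


def mean_payoff_lasso(prefix, cycle, weights):
    """Mean-payoff of the play prefix . cycle^omega.

    `cycle` lists the states of the loop once and must start and end in
    the same state; `prefix` must end where the cycle starts.
    """
    if len(cycle) < 2 or cycle[0] != cycle[-1]:
        raise ValueError("cycle must start and end in the same state")
    if prefix and prefix[-1] != cycle[0]:
        raise ValueError("prefix must end where the cycle starts")
    path_weights(prefix, weights)  # only checks that the prefix edges exist
    cycle_ws = path_weights(cycle, weights)
    # The finite prefix does not influence the long-run average.
    return Fraction(sum(cycle_ws), len(cycle_ws))
```

For instance, `mean_payoff_lasso(["a", "a"], ["a", "b", "c", "a"], WEIGHTS)` averages the cycle weights 2, 5, 2 and ignores the prefix edge of weight −1.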

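  17. Markov decision processes

      - MDP $P = (\mathcal{G}, S_1, S_\Delta, \Delta)$ with $\Delta \colon S_\Delta \to \mathcal{D}(S)$. The figure draws $\mathcal{P}_1$ states and stochastic states with two different node symbols (the symbols did not survive extraction).
      - MDP = game + strategy of $\mathcal{P}_2$: $P = G[\lambda_2]$.

      [Figure: the example graph with edge weights 2, 2, 5, −1, 7, −4 and probabilities $\frac{1}{2}$, $\frac{1}{2}$ on the outgoing edges of a stochastic state.]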

  18. Markov chains

      - MC $M = (\mathcal{G}, \delta)$ with $\delta \colon S \to \mathcal{D}(S)$.
      - MC = MDP + strategy of $\mathcal{P}_1$ = game + both strategies: $M = P[\lambda_1] = G[\lambda_1, \lambda_2]$.
      - Event $A \subseteq \mathsf{Plays}(\mathcal{G})$: probability $\mathbb{P}^{M}_{s_{\text{init}}}(A)$.
      - Measurable $f \colon \mathsf{Plays}(\mathcal{G}) \to \mathbb{R} \cup \{-\infty, \infty\}$: expected value $\mathbb{E}^{M}_{s_{\text{init}}}(f)$.

      [Figure: the example graph with edge weights 2, 2, 5, −1, 7, −4 and edge probabilities $\frac{1}{4}$, $\frac{3}{4}$, $\frac{1}{2}$, $\frac{1}{2}$.]
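The construction $M = G[\lambda_1, \lambda_2]$ can be sketched in Python for the special case of memoryless randomized strategies. This restriction is an assumption made to keep the example short: the strategies on the slides map play prefixes to distributions and may use memory. The arena, the strategies and the function names below are hypothetical.

```python
from fractions import Fraction

# Hypothetical arena: successors of each state and the player owning it.
SUCC = {"s1": ["s2", "s3"], "s2": ["s1", "s3"], "s3": ["s1"]}
OWNER = {"s1": 1, "s2": 2, "s3": 2}

# Memoryless randomized strategies: state -> distribution over successors.
LAMBDA_1 = {"s1": {"s2": Fraction(1, 4), "s3": Fraction(3, 4)}}
LAMBDA_2 = {
    "s2": {"s1": Fraction(1, 2), "s3": Fraction(1, 2)},
    "s3": {"s1": Fraction(1)},
}


def induced_chain(succ, owner, lam1, lam2):
    """Build delta: S -> D(S) of the Markov chain G[lam1, lam2]."""
    delta = {}
    for state, targets in succ.items():
        dist = lam1[state] if owner[state] == 1 else lam2[state]
        if sum(dist.values()) != 1 or not set(dist) <= set(targets):
            raise ValueError(f"invalid distribution in state {state}")
        delta[state] = dict(dist)
    return delta


def prefix_probability(delta, prefix):
    """Probability that a play of the chain starts with `prefix`."""
    prob = Fraction(1)
    for state, target in zip(prefix, prefix[1:]):
        prob *= delta[state].get(target, Fraction(0))
    return prob
```

With `delta = induced_chain(SUCC, OWNER, LAMBDA_1, LAMBDA_2)`, the call `prefix_probability(delta, ["s1", "s3", "s1", "s2"])` multiplies the three transition probabilities along the prefix. Such prefix (cylinder) probabilities are the basic events from which probabilities of sets of plays are built.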

  20. Classical interpretations

      System trying to ensure a specification = $\mathcal{P}_1$, whatever the actions of its environment.

      The environment can be seen as:
      - antagonistic: two-player game, worst-case threshold problem for $\mu \in \mathbb{Q}$:
        $\exists?\, \lambda_1 \in \Lambda_1,\ \forall\, \lambda_2 \in \Lambda_2,\ \forall\, \pi \in \mathsf{Outs}_G(s_{\text{init}}, \lambda_1, \lambda_2),\ f(\pi) \geq \mu$
      - fully stochastic: MDP, expected value threshold problem for $\nu \in \mathbb{Q}$:
        $\exists?\, \lambda_1 \in \Lambda_1,\ \mathbb{E}^{P[\lambda_1]}_{s_{\text{init}}}(f) \geq \nu$
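The two threshold problems can be compared on a toy example. The Python sketch below makes strong simplifying assumptions that are not in the slides: the arena is finite and acyclic, plays stop in terminal states, and $f$ is the total weight of the play. Under these assumptions, backward induction answers both problems. All names, weights and probabilities are hypothetical.

```python
from fractions import Fraction

# Hypothetical acyclic arena (not the graph from the slides).
# EDGES maps a state to {successor: weight}; states without an entry are terminal.
EDGES = {
    "start": {"safe": 0, "risky": 0},
    "safe": {"end": 3},
    "risky": {"good": 10, "bad": 0},
}
P1_STATES = {"start"}  # every other non-terminal state belongs to the environment

# Assumed stochastic model of the environment, used only for expectations.
DELTA = {
    "safe": {"end": Fraction(1)},
    "risky": {"good": Fraction(1, 2), "bad": Fraction(1, 2)},
}


def worst_case_value(state):
    """Best total weight P1 can guarantee against an antagonistic environment."""
    succ = EDGES.get(state)
    if not succ:
        return 0
    options = [w + worst_case_value(t) for t, w in succ.items()]
    return max(options) if state in P1_STATES else min(options)


def expected_value(state):
    """Best expected total weight P1 can obtain against the stochastic model."""
    succ = EDGES.get(state)
    if not succ:
        return Fraction(0)
    if state in P1_STATES:
        return max(w + expected_value(t) for t, w in succ.items())
    return sum(DELTA[state][t] * (w + expected_value(t)) for t, w in succ.items())


def worst_case_threshold(mu):
    """Worst-case threshold problem: can P1 ensure f >= mu on every outcome?"""
    return worst_case_value("start") >= mu


def expected_threshold(nu):
    """Expected value threshold problem: can P1 reach expectation >= nu?"""
    return expected_value("start") >= nu
```

In this toy arena the choice that secures the best worst case (going to `safe`) is not the choice that maximizes the expectation (going to `risky`), so each threshold problem taken alone selects a different strategy.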

  23. Outline

      1. Context
      2. BWC Synthesis
      3. Mean-Payoff
      4. Shortest Path
      5. Conclusion

  24. What if you want both?

      In practice, we want both:
      1. nice expected performance in the everyday situation,
      2. strict (but relaxed) performance guarantees even in the event of very bad circumstances.
