
STV: Model Checking for Strategies under Imperfect Information

STV: Model Checking for Strategies under Imperfect Information. Damian Kurpiewski, Institute of Computer Science, Polish Academy of Sciences (joint work with Wojtek Jamroga and Michał Knapik). LAMAS, 09/05/2020.


  1. STV: Model Checking for Strategies under Imperfect Information Damian Kurpiewski Institute of Computer Science Polish Academy of Sciences (joint work with Wojtek Jamroga and Michał Knapik) LAMAS, 09/05/2020

  2. Outline

  3. Outline

  4. ATL: What Agents Can Achieve • ATL: Alternating-time Temporal Logic [Alur et al. 1997-2002] • Temporal logic meets game theory • Main idea: cooperation modalities. $\langle\langle A \rangle\rangle \Phi$: coalition $A$ has a collective strategy to enforce $\Phi$ • $\Phi$ can include temporal operators: X (next), F (sometime in the future), G (always in the future), U (strong until)
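Under perfect information, the states satisfying $\langle\langle A \rangle\rangle F\, p$ can be computed as a least fixpoint: start from the states where $p$ holds and repeatedly add the states from which the coalition can force the next step into the set found so far. The sketch below illustrates that textbook computation; the transition encoding and the toy model `TOY_TRANS` are assumptions made for this example and are not taken from the slides or from STV.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: trans[(q, a)] is the set of states that may follow
# when the coalition plays action a in state q (the opponents pick which one).
Transitions = Dict[Tuple[str, str], Set[str]]


def can_enforce_next(trans: Transitions, target: Set[str]) -> Set[str]:
    """States with at least one coalition action whose every outcome is in target."""
    return {q for (q, _a), succs in trans.items() if succs and succs <= target}


def can_enforce_eventually(trans: Transitions, goal: Set[str]) -> Set[str]:
    """Least fixpoint of Z = goal ∪ pre(Z): states satisfying <<A>> F goal."""
    reached = set(goal)
    while True:
        extended = reached | can_enforce_next(trans, reached)
        if extended == reached:
            return reached
        reached = extended


# Toy model, invented for illustration only.
TOY_TRANS: Transitions = {
    ("q0", "a"): {"q1"},
    ("q0", "b"): {"q1", "q2"},
    ("q1", "a"): {"win"},
    ("q2", "a"): {"q2"},
    ("win", "a"): {"win"},
}

winning_states = can_enforce_eventually(TOY_TRANS, {"win"})
```

This iteration is only sound when the coalition can observe the current state, which is why it serves as a perfect-information baseline rather than a procedure for the imperfect-information case.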

  9. ATL with incomplete information • Imperfect information ($q \sim_a q'$) • Imperfect recall: agent memory is encoded within the state of the model • Uniform strategies: specify the same choices for indistinguishable states, $q \sim_a q' \implies s_a(q) = s_a(q')$ • Fixpoint equivalences do not hold anymore • Model checking $\mathrm{ATL}_{ir}$ is $\Delta^p_2$-complete
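The uniformity condition $q \sim_a q' \implies s_a(q) = s_a(q')$ can be checked directly once the indistinguishability relation is given as a partition of the states. The following is a minimal sketch; the state names, the partition `CLASSES` and the action set `ACTIONS` are hypothetical. It also enumerates all uniform memoryless strategies, which shows why naive search grows exponentially in the number of indistinguishability classes.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence, Set


def is_uniform(strategy: Dict[str, str], classes: List[Set[str]]) -> bool:
    """True if the strategy picks a single action inside every (non-empty) class."""
    return all(len({strategy[q] for q in cls}) == 1 for cls in classes)


def uniform_strategies(
    classes: List[Set[str]], actions: Sequence[str]
) -> Iterator[Dict[str, str]]:
    """Yield every uniform memoryless strategy: one action per class."""
    for choice in product(actions, repeat=len(classes)):
        yield {q: a for cls, a in zip(classes, choice) for q in cls}


# Hypothetical example: the agent cannot tell q1 and q2 apart.
CLASSES: List[Set[str]] = [{"q0"}, {"q1", "q2"}]
ACTIONS = ["U", "V"]

all_uniform = list(uniform_strategies(CLASSES, ACTIONS))
```

With `len(ACTIONS)` actions and `len(CLASSES)` classes there are `len(ACTIONS) ** len(CLASSES)` candidates, assuming every action is available in every state.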

  10. Example - Simple Model of Voting and Coercion [Figure: model with states q0 to q14. Transitions are labelled with (voter, coercer) action pairs: (wait, −), (vote_1, −), (vote_2, −), (give, −), (ng, −), (−, pun), (−, np). States are labelled with the propositions vote_{i,1}, vote_{i,2}, finish_i and pun_i.]

  14. Example Formulae • $\langle\langle \mathit{coercer} \rangle\rangle F(\neg \mathit{pun}_1 \lor \mathit{vote}_{1,1})$: “Coercer can coerce Voter to vote for first candidate” FALSE • $\langle\langle \mathit{voter}_1 \rangle\rangle G(\neg \mathit{pun}_1 \land \neg \mathit{vote}_{1,1})$: “Voter can avoid punishment without voting for first candidate” TRUE

  15. Outline

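  20. Approximate Verification of Strategic Ability • $M \models_{ir} \varphi$: DIFFICULT! • $M \models LB(\varphi) \;\Rightarrow\; M \models_{ir} \varphi \;\Rightarrow\; M \models UB(\varphi)$ • Upper bound $UB(\varphi)$: perfect information • Lower bound $LB(\varphi)$: our contribution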
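The two bounds combine into a three-valued answer: if the lower bound holds, the formula holds under imperfect information; if the upper bound fails, the formula fails; otherwise the approximation is inconclusive. A minimal sketch of that case analysis follows. The function name and the use of `None` for "inconclusive" are choices made for this example, not part of STV.

```python
from typing import Optional


def approximate_verdict(lower_bound_holds: bool, upper_bound_holds: bool) -> Optional[bool]:
    """Combine LB and UB results into True, False, or None (inconclusive).

    Assumes sound bounds, i.e. LB implies the formula and the formula implies UB,
    so the combination (LB holds, UB fails) never arises.
    """
    if lower_bound_holds:
        return True   # M |= LB(phi) implies M |=_ir phi
    if not upper_bound_holds:
        return False  # M |/= UB(phi) implies M |/=_ir phi
    return None       # the bounds disagree: no conclusion from approximation alone
```

When the result is `None`, an exact method is still needed to settle the question.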

  21. Domino DFS [Figure: example model with initial state q0 and states q1 to q5; transitions are labelled (A, U), (A, V) and (B, ⋆); the bottom states q3, q4, q5 carry the labels p, p, ¬p.]


  26. Outline

  27. Implemented models • Bridge scenario • Castles • TianJi • Drones • Simple Voting

  28. Bridge scenario

  29. Bridge scenario • Typical bridge play scenario, parameterized by two variables: n, k • Each player holds k cards in hand • Deck consists of 4n cards in total • We consider only the endplay • Random deal • Four players: S, W, N, E • Declarer (S) handles his own cards and those of the dummy (N) • Players remember already played cards • Everyone sees the dummy's cards • NoTrump contract
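The parameters n and k can be read as follows: the deck has n ranks in each of the four suits (4n cards), and each of the four players holds k of them at the start of the endplay. The sketch below generates one such random deal. The function `deal_endplay`, the card encoding and the `seed` parameter are assumptions made for illustration; they do not reflect how STV builds its bridge models.

```python
import random
from typing import Dict, List, Optional, Tuple

SUITS = ["spades", "hearts", "diamonds", "clubs"]
PLAYERS = ["S", "W", "N", "E"]
Card = Tuple[str, int]  # (suit, rank), ranks numbered 1..n


def deal_endplay(n: int, k: int, seed: Optional[int] = None) -> Dict[str, List[Card]]:
    """Deal k cards to each of the four players from a shuffled 4n-card deck.

    If k < n, the 4 * (n - k) undealt cards are treated as already played.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n so that 4k cards fit in a 4n-card deck")
    rng = random.Random(seed)
    deck: List[Card] = [(suit, rank) for suit in SUITS for rank in range(1, n + 1)]
    rng.shuffle(deck)
    return {player: deck[i * k:(i + 1) * k] for i, player in enumerate(PLAYERS)}


hands = deal_endplay(n=3, k=2, seed=0)
```

Passing a fixed `seed` makes a deal reproducible, which helps when comparing tools on the same configuration.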

  30. DEMO

  31. Experimental results - Bridge scenario

      | Conf.   | DominoDFS | MCMAS   | Approx. | Approx. opt. |
      |---------|-----------|---------|---------|--------------|
      | (1,1)   | 0.0006    | 0.12    | 0.0008  | < 0.0001     |
      | (2,2)   | 0.01      | 8712∗   | 0.01    | < 0.0001     |
      | (3,3)   | 0.8       | timeout | 0.8     | 0.06         |
      | (4,4)   | 160       | timeout | 384     | 5.5          |
      | (5,5)∗  | 1373      | timeout | 8951    | 39           |
      | (5,5)   | memout    | timeout | memout  | 138          |
      | (6,6)∗  | memout    | timeout | memout  | 4524         |

  32. Experimental results - Castles

      | Conf.   | DominoDFS | MCMAS   | SMC     |
      |---------|-----------|---------|---------|
      | (1,1,1) | 0.3       | 65      | 63      |
      | (2,1,1) | 1.5       | 12898   | 184     |
      | (3,1,1) | 25        | timeout | 6731    |
      | (2,2,1) | 25        | timeout | 4923    |
      | (2,2,2) | 160       | timeout | timeout |
      | (3,2,2) | 2688      | timeout | timeout |
      | (3,3,2) | timeout   | timeout | timeout |

  33. THANK YOU
