Games Where You Can Play Optimally with Arena-Independent Finite Memory
Patricia Bouyer¹, Stéphane Le Roux¹, Youssouf Oualhadj², Mickael Randour³, Pierre Vandenhove¹,³
¹ LSV – CNRS & ENS Paris-Saclay, Université Paris-Saclay, France
² LACL – Université Paris-Est Créteil, France
³ F.R.S.-FNRS & UMONS – Université de Mons, Belgium
June 22, 2020 – MOVEP 2020
Outline
Strategy synthesis for two-player turn-based games
Design optimal controllers for systems interacting with an antagonistic environment.
“Optimal” w.r.t. an objective or a specification.
Goal: interest in “simple” controllers.
Finite-memory determinacy: when do finite-memory controllers suffice?
Inspiration: results by Gimbert and Zielonka [1] about memoryless determinacy.
[1] Gimbert and Zielonka, “Games Where You Can Play Optimally Without Any Memory”, 2005.
1. Memoryless determinacy
2. The need for memory
3. Arena-independent finite memory
Two-player turn-based zero-sum games on graphs
[Figure: example arena with states s₁ to s₆; edges are colored with C = {⊤, ⊥}.]
• Finite two-player arenas: S₁ (circles, for P₁) and S₂ (squares, for P₂), edges E.
• Set C of colors. Edges are colored.
• “Objectives” given by preference relations ⊑ ⊆ C^ω × C^ω (total preorder). Zero-sum: the preference relation of P₂ is the inverse relation ⊑⁻¹.
• A strategy for Pᵢ is a (partial) function σ : E* → E.
Memoryless determinacy
Question: given a preference relation, do “simple” strategies suffice to play optimally in all arenas?
A strategy σ of Pᵢ is memoryless if it is a function Sᵢ → E (instead of E* → E).
[Figure: example arena with states s₁ to s₆ and colors C = {⊤, ⊥}.]
E.g., for reachability, memoryless strategies suffice. They also suffice for safety, Büchi, co-Büchi, parity, mean-payoff, energy, average-energy...
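As a concrete instance of the claim that memoryless strategies suffice for reachability, the following sketch computes the set of states from which P₁ can force a visit to a target set (the classical attractor construction) and extracts a memoryless strategy along the way. The arena encoding (the dictionaries `owner` and `edges`), the function name `attractor_strategy`, and the small example arena are assumptions made for illustration; they are not taken from the slides. A strategy is represented here as a map from P₁ states to a chosen successor, which stands for the chosen edge.

```python
from collections import deque

def attractor_strategy(owner, edges, target):
    """Return (winning_states, strategy) for P1's reachability objective.

    owner:  dict state -> 1 or 2 (the player controlling the state)
    edges:  dict state -> list of successor states
    target: set of states that P1 wants to reach
    """
    # Predecessor lists, and for each state the number of outgoing
    # edges that do not yet lead into the attractor.
    preds = {s: [] for s in owner}
    for s, succs in edges.items():
        for t in succs:
            preds[t].append(s)
    remaining = {s: len(edges[s]) for s in owner}

    win = set(target)
    strategy = {}  # memoryless: depends only on the current state
    queue = deque(target)
    while queue:
        t = queue.popleft()
        for s in preds[t]:
            if s in win:
                continue
            if owner[s] == 1:
                # P1 needs a single edge into the attractor.
                win.add(s)
                strategy[s] = t
                queue.append(s)
            else:
                # P2 is trapped only if all of its edges lead into the attractor.
                remaining[s] -= 1
                if remaining[s] == 0:
                    win.add(s)
                    queue.append(s)
    return win, strategy

# Hypothetical arena: P1 owns a, c, g, d; P2 owns b; the target is g.
owner = {"a": 1, "b": 2, "c": 1, "g": 1, "d": 1}
edges = {"a": ["b", "d"], "b": ["g", "c"], "c": ["g"], "g": ["g"], "d": ["d"]}
win, strategy = attractor_strategy(owner, edges, {"g"})
```

In this example P₁ wins from every state except the sink `d`, and the extracted strategy only looks at the current state, never at the history.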
Memoryless determinacy
Good understanding of memoryless determinacy:
• sufficient conditions to guarantee memoryless optimal strategies for both players [2, 3];
• sufficient conditions to guarantee memoryless optimal strategies for one player [4, 5, 6];
• characterization of the preference relations admitting optimal memoryless strategies for both players [7].
[2] Gimbert and Zielonka, “When Can You Play Positionally?”, 2004.
[3] Aminof and Rubin, “First-cycle games”, 2017.
[4] Kopczynski, “Half-Positional Determinacy of Infinite Games”, 2006.
[5] Gimbert, “Pure Stationary Optimal Strategies in Markov Decision Processes”, 2007.
[6] Gimbert and Kelmendi, “Two-Player Perfect-Information Shift-Invariant Submixing Stochastic Games Are Half-Positional”, 2014.
[7] Gimbert and Zielonka, “Games Where You Can Play Optimally Without Any Memory”, 2005.
Gimbert and Zielonka’s characterization [8]
Let ⊑ be a preference relation. Two results:
1. Characterization of memoryless determinacy w.r.t. properties of ⊑.
2. Corollary (one-to-two-player memoryless lifting): if
   • in all one-player arenas of P₁, P₁ has an optimal memoryless strategy, and
   • in all one-player arenas of P₂, P₂ has an optimal memoryless strategy,
   then both players have an optimal memoryless strategy in all two-player arenas.
Extremely useful in practice: very easy to recover memoryless determinacy of, e.g., mean-payoff and parity games.
[8] Gimbert and Zielonka, “Games Where You Can Play Optimally Without Any Memory”, 2005.
The need for memory
Memoryless strategies do not always suffice.
[Figure: arena with states s₁ and s₂; edge labels A and B, and weight pairs (−1, −1), (1, −1), (−1, 1), (−1, −1).]
• Büchi(A) ∧ Büchi(B): requires finite memory.
[Figure: memory structure with two states m₁ and m₂ and transitions labeled A and B.]
• Mean payoff ≥ 0 in both dimensions: requires infinite memory [9].
→ Combinations of objectives usually require memory.
[9] Chatterjee, Doyen, et al., “Generalized Mean-payoff and Energy Games”, 2010.
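To make the two-state memory for Büchi(A) ∧ Büchi(B) concrete, here is a minimal sketch of a memory structure together with a next-action function. The exact transition labels of the figure cannot be recovered from the text, so the update table below is an assumed reading (m₁ waits for A, m₂ waits for B). The arena is also hypothetical: a single P₁ state with two self-loops, one colored A and one colored B.

```python
# Assumed memory update: in m1 we are waiting to see A, in m2 waiting to see B.
UPDATE = {
    ("m1", "A"): "m2", ("m1", "B"): "m1",
    ("m2", "A"): "m2", ("m2", "B"): "m1",
}

# Next-action function for the hypothetical one-state arena:
# play the color that the current memory state is waiting for.
NEXT_ACTION = {"m1": "A", "m2": "B"}

def play_prefix(steps, memory="m1"):
    """Return the first `steps` colors produced by the finite-memory strategy."""
    colors = []
    for _ in range(steps):
        color = NEXT_ACTION[memory]
        colors.append(color)
        memory = UPDATE[(memory, color)]
    return colors

prefix = play_prefix(6)
```

The strategy alternates between the two colors, so both are seen infinitely often if it is continued forever. A memoryless strategy in this arena would pick the same self-loop at every step and see only one of the two colors. A finite prefix cannot verify a Büchi condition; the sketch only illustrates how the memory drives the choices.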
An attempt at lifting [GZ05] to FM determinacy
• Lack of a good understanding of finite-memory (FM) determinacy.
• Related work: sufficient properties to preserve FM determinacy in Boolean combinations of objectives [10].
• Our approach: hope to extend Gimbert and Zielonka’s results to a one-to-two-player lifting for finite-memory (instead of memoryless) determinacy.
[10] Le Roux, Pauly, and Randour, “Extending Finite-Memory Determinacy by Boolean Combination of Winning Conditions”, 2018.
Counterexample
Let C ⊆ ℤ. P₁ wants to achieve a play π = c₁c₂… ∈ C^ω s.t.
$$\limsup_{n} \sum_{i=0}^{n} c_i = +\infty \quad\text{or}\quad \exists^{\infty} n,\ \sum_{i=0}^{n} c_i = 0.$$
P₁ has optimal FM strategies in one-player arenas, but not in two-player arenas: in the arena below, P₁ wins but needs infinite memory.
[Figure: two-player arena with states s₁ and s₂ and edges colored 1 and −1.]
Intuition: in one-player arenas, P₁ can bound the memory needed in advance. In two-player arenas, P₂ can generate arbitrarily long sequences.
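The two disjuncts of this objective both refer to the running sums of the colors. The helper below is a small sketch for inspecting those sums on a finite prefix: it returns the running sums, the indices at which the sum is zero, and the largest sum reached. The function name `prefix_summary` and the 0-based indexing are choices made for this illustration. Since the objective is about infinite plays, a finite prefix cannot decide it; the helper only shows which quantities the condition tracks.

```python
from itertools import accumulate

def prefix_summary(colors):
    """Summarize the running sums of a non-empty finite prefix of integer colors."""
    sums = list(accumulate(colors))
    # Positions (0-based) where the running sum is exactly zero.
    zero_hits = [n for n, total in enumerate(sums) if total == 0]
    return sums, zero_hits, max(sums)

sums, zero_hits, peak = prefix_summary([1, -1, 1, 1, -1, -1])
```

Informally, P₁ must either keep coming back to a running sum of zero forever or drive the running sums arbitrarily high, which is where the length of the sequences generated by P₂ matters.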
Arena-independent memory
• For Büchi(A) ∧ Büchi(B), this structure suffices to play optimally on all arenas for P₁.
[Figure: memory structure with two states m₁ and m₂ and transitions labeled A and B.]
• The lifting fails on the counterexample because, in one-player arenas, the size of the memory needed depends on the size of the arena.
• Observation: for many objectives, one fixed memory structure suffices for all arenas.
“For all arenas A, does there exist M ...?” → “Does there exist M such that, for all arenas A, ...?”
Method: reproducing the approach of Gimbert and Zielonka given a memory structure M.
Characterization of arena-independent determinacy
Let ⊑ be a preference relation and M be a memory structure.
1. Characterization of “playing with M is sufficient” in terms of properties of ⊑.
2. Corollary (one-to-two-player lifting): if
   • in all one-player arenas of P₁, P₁ has an optimal strategy with memory M₁, and
   • in all one-player arenas of P₂, P₂ has an optimal strategy with memory M₂,
   then both players have an optimal strategy in all two-player arenas with memory M₁ ⊗ M₂.
In short: the study of one-player arenas is sufficient to determine whether playing with arena-independent finite memory suffices.
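The slides do not spell out the operator ⊗. Assuming it denotes the usual synchronized product of two memory structures (pairs of states, updated componentwise on each color), it can be sketched as follows. The `Memory` class, the function names, and the two example structures are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Memory:
    states: frozenset   # finite set of memory states
    initial: object     # initial memory state
    update: dict        # (state, color) -> next state

def memory_product(m1, m2, colors):
    """Synchronized product: both components read the same color at each step."""
    states = frozenset(product(m1.states, m2.states))
    update = {
        ((p, q), c): (m1.update[(p, c)], m2.update[(q, c)])
        for (p, q) in states
        for c in colors
    }
    return Memory(states, (m1.initial, m2.initial), update)

def run(memory, word):
    """Return the memory state reached after reading a finite word of colors."""
    state = memory.initial
    for c in word:
        state = memory.update[(state, c)]
    return state

# Example 1 (assumed): waits alternately for color A, then for color B.
wait_ab = Memory(frozenset({"wA", "wB"}), "wA", {
    ("wA", "A"): "wB", ("wA", "B"): "wA",
    ("wB", "A"): "wB", ("wB", "B"): "wA",
})
# Example 2 (assumed): tracks the parity of the number of colors read.
parity = Memory(frozenset({"even", "odd"}), "even", {
    ("even", "A"): "odd", ("even", "B"): "odd",
    ("odd", "A"): "even", ("odd", "B"): "even",
})
both = memory_product(wait_ab, parity, "AB")
```

The product has one state per pair of component states, so its size is the product of the two sizes and does not depend on any arena.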