Energy and Mean-payoff Games
Laurent Doyen, LSV, ENS Cachan & CNRS
joint work with Aldric Degorre, Raffaella Gentilini, Jean-François Raskin, Szymon Torunczyk
ACTS 2010, Chennai


Synthesis problem
System - Model
Specification: avoid failure, ensure progress, etc.
Correctness relation
Solved as a game: system vs. environment; solution = winning strategy
This talk: quantitative games (resource-constrained systems)

Energy games (staying alive)


Energy games [CdAHS03, BFL+08]
Players: Maximizer and Minimizer; positive weight = reward
play: (1,4) (4,1) (1,4) (4,1) …
weights: −1 +2 −1 +2 …
energy level: 1 0 2 1 3 2 4 3 … (the first value, 1, is the initial credit)
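The energy level along a play is just a running sum that starts at the initial credit. A minimal sketch in Python (the function names are illustrative, not from the talk): with initial credit 1, the weights of the play on the slide reproduce the levels 0 2 1 3 2 4 3.

```python
from itertools import accumulate


def energy_levels(initial_credit, weights):
    """Energy level before the first edge and after each edge of the play."""
    return list(accumulate(weights, initial=initial_credit))


def stays_nonnegative(initial_credit, weights):
    """True if the energy level never drops below zero on this finite prefix."""
    return all(level >= 0 for level in energy_levels(initial_credit, weights))


print(energy_levels(1, [-1, +2, -1, +2, -1, +2, -1]))
# [1, 0, 2, 1, 3, 2, 4, 3]
```

With initial credit 0 the same play fails immediately, since the first edge has weight −1.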

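Energy games
Strategies: one for the Maximizer, one for the Minimizer
play: infinite sequence of edges consistent with both strategies
The outcome is winning if the energy level (initial credit plus the sum of the weights seen so far) is nonnegative after every finite prefix.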



Energy games
Decision problem: decide if there exist an initial credit c_0 and a strategy of the Maximizer that keeps the energy level always nonnegative.
For energy games, memoryless strategies suffice.
A memoryless strategy is winning if all cycles are nonnegative once the strategy is fixed.
(Figure: game graph with states labelled c_0 = 2, c_0 = 2, c_0 = 1, c_0 = 0.)
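The cycle criterion can be checked mechanically: fix the Maximizer's memoryless choices, keep all of the Minimizer's edges, and look for a negative cycle. The sketch below uses Bellman-Ford for that check; the graph encoding and the two-state example are hypothetical and not taken from the slides.

```python
def has_negative_cycle(states, edges):
    """Bellman-Ford with every state at distance 0 (a virtual source)."""
    dist = {q: 0 for q in states}
    for _ in range(len(states) - 1):
        for q, r, w in edges:
            dist[r] = min(dist[r], dist[q] + w)
    # An edge that can still be improved witnesses a negative cycle.
    return any(dist[q] + w < dist[r] for q, r, w in edges)


def is_winning(max_strategy, min_edges):
    """max_strategy: Maximizer state -> chosen (successor, weight).
    min_edges: Minimizer state -> list of (successor, weight)."""
    edges = [(q, r, w) for q, (r, w) in max_strategy.items()]
    edges += [(q, r, w) for q, succ in min_edges.items() for r, w in succ]
    states = set(max_strategy) | set(min_edges)
    return not has_negative_cycle(states, edges)


# Hypothetical game: Maximizer owns "a", Minimizer owns "b".
min_edges = {"b": [("a", +2)]}
print(is_winning({"a": ("b", -1)}, min_edges))  # cycle a -> b -> a has weight +1
print(is_winning({"a": ("a", -2)}, min_edges))  # self-loop on a has weight -2
```

The check is over all states at once, so it answers whether the strategy is winning from every state for some initial credit.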

Algorithm


Algorithm for energy games
Initial credit is useful to survive before a cycle is formed.
Length(AcyclicPath) ≤ Q
Minimum initial credit is at most QW.
Q: #states, E: #edges, W: maximal weight

Algorithm for energy games
The minimum initial credit v(q) is such that:
in Maximizer state q: v(q) + w(q,q') ≥ v(q') for some edge (q,q')
in Minimizer state q: v(q) + w(q,q') ≥ v(q') for all edges (q,q')
Compute successive under-approximations of the minimum initial credit.




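Algorithm for energy games
Fixpoint algorithm:
• start with 0 at every state
• iterate
  at Maximizer states: v(q) ← min over edges (q,q') of max(0, v(q') − w(q,q'))
  at Minimizer states: v(q) ← max over edges (q,q') of max(0, v(q') − w(q,q'))
(Figure: successive values at the four states: 0 1 2, 0 2 2, 0 1 1, 0 0 0.)
Termination argument: monotonic operators and finite codomain
Complexity: O(EQW)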
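A minimal sketch of such a fixpoint iteration in Python, under stated assumptions: the update rule used here is the standard one (the credit needed at a state is the credit needed at the successor minus the edge weight, never below 0; the Maximizer takes the cheapest edge, the Minimizer the most expensive), values above the bound QW are treated as infinite, and every state has at least one outgoing edge. The example game is hypothetical. This naive round-based version illustrates the idea only and does not achieve the O(EQW) bound.

```python
import math


def min_initial_credit(max_states, edges):
    """edges: state -> list of (successor, weight).
    States not listed in max_states belong to the Minimizer."""
    states = list(edges)
    W = max(abs(w) for succ in edges.values() for _, w in succ)
    bound = len(states) * W  # credit above Q*W means the state is losing
    v = {q: 0 for q in states}
    while True:
        new_v = {}
        for q in states:
            needs = [max(0, v[r] - w) for r, w in edges[q]]
            need = min(needs) if q in max_states else max(needs)
            new_v[q] = need if need <= bound else math.inf
        if new_v == v:  # fixpoint reached
            return v
        v = new_v


# Hypothetical game: "a" and "c" are Maximizer states, "b" is a Minimizer state.
game = {"a": [("b", -1)], "b": [("a", +2)], "c": [("c", -1)]}
print(min_initial_credit({"a", "c"}, game))
# {'a': 1, 'b': 0, 'c': inf}
```

State "c" only has a negative self-loop, so its value grows past the bound and is reported as infinite: no initial credit suffices there.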

Mean-payoff games

Mean-payoff games [EM79]
Players: Maximizer and Minimizer; positive weight = reward
play: (1,4) (4,1) (1,4) (4,1) …
weights: −1 +2 −1 +2 …
mean-payoff value: limit of the weight average

Mean-payoff games [EM79]
Mean-payoff value: either the lim inf or the lim sup of the weight averages
Decision problem: given a rational threshold ν, decide if there exists a strategy of the Maximizer to ensure mean-payoff value at least ν.
Note: we can assume ν = 0, e.g. by subtracting ν from all weights.


Mean-payoff games
Mean-payoff value: either the lim inf or the lim sup of the weight averages
Decision problem: given a rational threshold ν, decide if there exists a strategy of the Maximizer to ensure mean-payoff value at least ν.
This problem is log-space equivalent to energy games [BFL+08].
Assuming ν = 0: a memoryless strategy is winning if all cycles are nonnegative once the strategy is fixed.
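For a play that eventually repeats one cycle forever, the limit of the weight average is the average weight of that cycle, so the threshold question reduces to the sign of the cycle after shifting. A small Python sketch (function names are illustrative assumptions), using the cycle −1, +2 from the example play:

```python
from fractions import Fraction


def mean_payoff_of_cycle(cycle_weights):
    """Mean-payoff value of a play that repeats this cycle forever."""
    return Fraction(sum(cycle_weights), len(cycle_weights))


def shift_weights(weights, threshold):
    """Subtract the threshold from every weight, so the new threshold is 0."""
    return [Fraction(w) - threshold for w in weights]


cycle = [-1, +2]
print(mean_payoff_of_cycle(cycle))             # 1/2
shifted = shift_weights(cycle, Fraction(1, 2))
print(sum(shifted) >= 0)                       # True: threshold 1/2 is met
print(sum(shift_weights(cycle, 1)) >= 0)       # False: threshold 1 is not
```

After shifting, "mean payoff at least the threshold" and "cycle weight nonnegative" are the same test, which is why the nonnegative-cycle criterion applies.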

Complexity (deterministic pseudo-polynomial algorithms)

                    Energy games    Mean-payoff games
Decision problem    O(EQW)          O(EQW) (this talk), O(EQ²W) [ZP96]

Outline
► Perfect information
  • Mean-payoff games
  • Energy games
  • Algorithms
► Imperfect information
  • Energy with fixed initial credit
  • Energy with unknown initial credit
  • Mean-payoff

Imperfect information (staying alive in the dark)

Imperfect information: why?
System - Model
Specification: avoid failure, ensure progress, etc.
Correctness relation
• Private variables/internal state
• Noisy sensors
Strategies should not rely on hidden information.

Imperfect information: how?
• Coloring of the state space
• observation = set of states with the same color

Imperfect information: how?
Maximizer states only
Playing the game:
1. Maximizer chooses an action (a or b)
2. Minimizer chooses successor state (compatible with Maximizer's action)
3. The color of the next state is visible to Maximizer


Imperfect information: how?
Observation-based strategies
Goal: all outcomes have nonnegative energy level, or nonnegative mean-payoff value
(Figure legend: Actions, Observations.)

Complexity

                         Energy games    Mean-payoff games
Perfect information      O(EQW)          O(EQW) (this talk), O(EQ²W) [ZP96]
Imperfect information    ?               ?

Imperfect information
Observation-based strategies
Goal: all outcomes have nonnegative energy level, or nonnegative mean-payoff value
Two variants for energy games:
• fixed initial credit
• unknown initial credit


Fixed initial credit
Can you win with initial credit = 3?
Keep track of
• which states can be the current state, and
• what the worst-case energy level is
Initially: (3, ⊥, ⊥)




Example
(Figure: search tree, listed level by level.)
(3, ⊥, ⊥)
(⊥, 2, 2)
(3, ⊥, ⊥)  (⊥, 2, 1)  (⊥, 1, 3)  (3, ⊥, ⊥)
(4, ⊥, ⊥)  (⊥, 1, 0)  (⊥, 1, 4)  (2, ⊥, ⊥)
Stop search whenever:
• negative value, or
• comparable ancestor

Example
Initial credit = 3 is not sufficient!

Example
The search terminates because the set of such tuples is well-quasi ordered.

Example
Upper bound: non-primitive recursive
Lower bound: EXPSPACE-hard
Proof (not shown in this talk): reduction from the infinite execution problem of Petri nets.
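The search in the example slides can be sketched as follows. This is an illustrative reading, not the talk's implementation: a knowledge tuple is stored as a dict from possible states to worst-case energy (absent states play the role of ⊥), a branch is abandoned as losing on a negative value, and it is closed as winning when some ancestor has the same possible states with energy no larger. The game below, its state names and its weights are all hypothetical, and every (state, action) pair is assumed to have at least one transition.

```python
def successors(knowledge, action, trans, obs):
    """One successor knowledge per observation the Minimizer can force."""
    by_obs = {}
    for q, level in knowledge.items():
        for r, w in trans[(q, action)]:
            nxt = by_obs.setdefault(obs[r], {})
            nxt[r] = min(nxt.get(r, level + w), level + w)  # worst case
    return list(by_obs.values())


def leq(f, g):
    """Component-wise order on knowledge tuples with the same support."""
    return f.keys() == g.keys() and all(f[q] <= g[q] for q in f)


def wins(knowledge, actions, trans, obs, ancestors=()):
    if any(level < 0 for level in knowledge.values()):
        return False  # negative value: this branch is lost
    if any(leq(a, knowledge) for a in ancestors):
        return True   # comparable ancestor: the same choices can be repeated
    path = ancestors + (knowledge,)
    return any(
        all(wins(k, actions, trans, obs, path)
            for k in successors(knowledge, act, trans, obs))
        for act in actions
    )


# Hypothetical game: states 1 and 2 share the observation "grey".
obs = {0: "white", 1: "grey", 2: "grey"}
trans = {
    (0, "a"): [(1, -1), (2, -1)], (0, "b"): [(1, -1), (2, -1)],
    (1, "a"): [(0, +2)], (1, "b"): [(0, -2)],
    (2, "a"): [(0, +1)], (2, "b"): [(0, -2)],
}
print(wins({0: 1}, ["a", "b"], trans, obs))  # True
print(wins({0: 0}, ["a", "b"], trans, obs))  # False
```

With credit 1 the Maximizer can always play a and returns to the knowledge {0: 1}; with credit 0 the first move already drives the worst-case energy to −1.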

Complexity

                         Energy games (unknown initial credit)    Mean-payoff games
Perfect information      O(EQW)                                   O(EQW) (this talk), O(EQ²W) [ZP96]
Imperfect information    r.e.                                     ?


Memory requirement
With imperfect information:
Corollary: finite-memory strategies suffice in energy games.
In mean-payoff games:
• infinite memory may be required
• the limsup and liminf definitions do not coincide

Memory requirement

                         Energy games     Mean-payoff games
Perfect information      memoryless       memoryless
Imperfect information    finite memory    infinite memory

Unknown initial credit
Theorem: The unknown initial credit problem for energy games is undecidable (even for blind games).
Proof: by reduction from the halting problem of 2-counter machines.

2-counter machines
• 2 counters c1, c2
• increment, decrement, zero test
q1: inc c1 goto q2
q2: inc c1 goto q3
q3: if c1 == 0 goto q6 else dec c1 goto q4
q4: inc c2 goto q5
q5: inc c2 goto q3
q6: halt
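To make the machine model concrete, here is a small Python interpreter for the program above. The tuple encoding of instructions and the step budget are assumptions of this sketch; the budget is there because halting cannot be decided in general, so the interpreter gives up rather than loop forever.

```python
# Counter indices: 0 stands for c1, 1 stands for c2.
PROGRAM = {
    "q1": ("inc", 0, "q2"),
    "q2": ("inc", 0, "q3"),
    "q3": ("test_dec", 0, "q6", "q4"),  # if c1 == 0 goto q6 else dec c1 goto q4
    "q4": ("inc", 1, "q5"),
    "q5": ("inc", 1, "q3"),
    "q6": ("halt",),
}


def run(program, start="q1", max_steps=1000):
    """Return (location, (c1, c2)) on halt, or None if the budget runs out."""
    counters = [0, 0]
    loc = start
    for _ in range(max_steps):
        instr = program[loc]
        if instr[0] == "halt":
            return loc, tuple(counters)
        if instr[0] == "inc":
            _, c, target = instr
            counters[c] += 1
            loc = target
        else:  # zero test, with decrement on the nonzero branch
            _, c, if_zero, if_positive = instr
            if counters[c] == 0:
                loc = if_zero
            else:
                counters[c] -= 1
                loc = if_positive
    return None


print(run(PROGRAM))  # ('q6', (0, 4))
```

The program sets c1 to 2, then moves it into c2 two units at a time, so it halts with c1 = 0 and c2 = 4.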
