Stochastic games (in Formal Verification)

Antonín Kučera
Masaryk University Brno
SFM-10:QAPL 2010

Sections: Preliminaries; Games; Strategies, plays; Objectives; Reachability objectives; The value; Min strategies; Max strategies; Determinacy; Finite-state games; BPA games; Branching-time objectives; Basic properties; Deciding the winner; Games with time.
Game theory

Game theory studies the behavior of rational "players" who can make choices and attempt to achieve a certain objective. A player's success depends on the choices of the other players.

Stochastic games:
  - the impact of the players' choices is uncertain;
  - the players' choices can be randomized.

Games in computer science:
  - formal semantics;
  - communication protocols;
  - Internet auctions;
  - ... many other things.
Stochastic games in formal verification

Our setting:
  - state space: discrete
  - players: controller, environment
  - objectives: antagonistic
  - choice: turn-based, randomized
  - information: perfect

Is there a strategy for the controller such that the system satisfies a certain property no matter what the environment does?
Outline

  - Preliminaries.
      - Games, strategies, objectives.
  - Stochastic games with reachability objectives.
      - The (non)existence of optimal strategies.
      - Algorithms for finite-state games.
  - Stochastic games with branching-time objectives.
  - Stochastic games with time.
Markov chains

Definition 1 (Markov chain)
$M = (S, \to, \mathit{Prob})$, where
  - $S$ is an at most countable set of states;
  - $\to \subseteq S \times S$ is a transition relation;
  - $\mathit{Prob}$ is a probability assignment.

[Figure: example Markov chain with states s, t, u and probabilistic transitions.]

We want to measure the probability of certain subsets of $\mathit{Run}(s)$. For every finite path $w$ initiated in $s$, we define the probability of $\mathit{Run}(w)$ in the natural way. This assignment can be uniquely extended to the (Borel) $\sigma$-algebra $\mathcal{F}$ generated by all $\mathit{Run}(w)$. Thus, we obtain the probability space $(\mathit{Run}(s), \mathcal{F}, \mathcal{P})$.
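The "natural way" of assigning a probability to Run(w) is usually taken to be the product of the transition probabilities along w. A minimal Python sketch of that reading follows. The chain used here is a hypothetical example (the probabilities in the slide's figure cannot be recovered), and the names `PROB`, `check_chain` and `cylinder_probability` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical chain: the slide's figure is not recoverable, so these
# states and probabilities are illustrative assumptions only.
PROB = {
    "s": {"s": Fraction(1, 4), "t": Fraction(1, 2), "u": Fraction(1, 4)},
    "t": {"s": Fraction(1, 3), "t": Fraction(1, 3), "u": Fraction(1, 3)},
    "u": {"u": Fraction(1)},
}


def check_chain(prob):
    """Every state needs positive outgoing probabilities that sum to 1."""
    for state, successors in prob.items():
        assert all(p > 0 for p in successors.values()), state
        assert sum(successors.values()) == 1, state


def cylinder_probability(prob, path):
    """Probability of Run(w) for a finite path w: the product of the
    probabilities of the transitions taken along w (1 for a one-state path,
    0 if some step is not a transition of the chain)."""
    result = Fraction(1)
    for source, target in zip(path, path[1:]):
        result *= prob[source].get(target, Fraction(0))
    return result


check_chain(PROB)
print(cylinder_probability(PROB, ["s", "t", "u"]))  # 1/2 * 1/3 = 1/6
```

Exact fractions are used so that the products are not blurred by floating-point rounding.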
Turn-based stochastic games

Definition 2 (Turn-based stochastic game)
$G = (V, E, (V_\Box, V_\Diamond, V_\bigcirc), \mathit{Prob})$, where
  - the set $V$ is at most countable;
  - each vertex has a successor;
  - $\mathit{Prob}$ is positive.

$G$ is a Markov decision process (MDP) if $V_\Box = \emptyset$ or $V_\Diamond = \emptyset$.

[Figure: example game; the stochastic edges carry the probabilities 0.2, 0.8, 0.4, 0.6.]
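The conditions of Definition 2 can be checked mechanically for a finite game. The sketch below is one possible encoding, not the author's. The class `Game`, its field names, and the example vertices `a`, `b`, `r1`, `r2` are assumptions; the example reuses the edge probabilities 0.2/0.8 and 0.4/0.6 from the figure but otherwise invents the graph.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """Hypothetical container for a finite turn-based stochastic game."""
    player1: set   # vertices controlled by the first player
    player2: set   # vertices controlled by the second player
    random: set    # stochastic vertices
    edges: dict    # vertex -> list of successor vertices
    prob: dict     # stochastic vertex -> {successor: probability}

    def vertices(self):
        return self.player1 | self.player2 | self.random

    def validate(self):
        # The three vertex sets must be pairwise disjoint.
        total = len(self.player1) + len(self.player2) + len(self.random)
        assert total == len(self.vertices()), "vertex sets overlap"
        for v in self.vertices():
            # Each vertex has a successor.
            assert self.edges.get(v), f"{v} has no successor"
        for v in self.random:
            dist = self.prob[v]
            assert set(dist) == set(self.edges[v])
            # Prob is positive and each distribution sums to 1.
            assert all(p > 0 for p in dist.values())
            assert abs(sum(dist.values()) - 1.0) < 1e-9

    def is_mdp(self):
        # An MDP is a game in which one of the two players has no vertices.
        return not self.player1 or not self.player2


# Illustrative instance; the graph itself is an assumption.
game = Game(
    player1={"a"},
    player2={"b"},
    random={"r1", "r2"},
    edges={"a": ["r1", "b"], "b": ["r2", "a"],
           "r1": ["a", "b"], "r2": ["a", "b"]},
    prob={"r1": {"a": 0.2, "b": 0.8}, "r2": {"a": 0.4, "b": 0.6}},
)
game.validate()
print(game.is_mdp())
```

The countable (infinite) case from the definition is out of scope for this finite encoding.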
Strategies

Definition 3 (Strategy)
Let $G = (V, E, (V_\Box, V_\Diamond, V_\bigcirc), \mathit{Prob})$ be a game. A strategy for player $\Box$ is a function $\sigma$ which to every $wv \in V^{*} V_\Box$ assigns a probability distribution over the set of outgoing edges of $v$.

A strategy for player $\Diamond$ is defined analogously.

We can classify strategies according to
  - memory requirements: history-dependent (H), finite-memory (F), memoryless (M);
  - randomization: randomized (R), deterministic (D).

Thus, we obtain the classes of MD, MR, FD, FR, HD, and HR strategies.
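To make the classification concrete, a strategy can be modelled as a function from histories to distributions. The following Python sketch is illustrative only: the helper names are assumptions, and the two-vertex setting (a player vertex v with successors u and v) is borrowed from Example 5 later in the deck.

```python
import random

# A strategy maps a history (a tuple of vertices ending in one of the
# player's vertices) to a distribution {successor: probability}.


def memoryless(choice):
    """Lift a per-vertex table to a strategy that ignores the history."""
    return lambda history: choice[history[-1]]


def is_deterministic_at(strategy, history):
    """Deterministic at a history: one successor gets probability 1."""
    return max(strategy(history).values()) == 1


def sample_move(strategy, history, rng):
    """Draw one successor according to the strategy's distribution."""
    dist = strategy(history)
    successors = list(dist)
    weights = [dist[s] for s in successors]
    return rng.choices(successors, weights=weights)[0]


# MD: always move from v to u.
md = memoryless({"v": {"u": 1}})
# MR: flip a fair coin in v, regardless of the past.
mr = memoryless({"v": {"u": 0.5, "v": 0.5}})
# HR: the distribution depends on the length of the history.
hr = lambda h: {"u": 2.0 ** -len(h), "v": 1 - 2.0 ** -len(h)}

print(is_deterministic_at(md, ("v",)), is_deterministic_at(mr, ("v",)))
print(hr(("v", "v")))
```

Finite-memory strategies (F) would additionally carry a bounded memory state that is updated along the history; that variant is omitted here for brevity.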
Plays

Definition 4 (Play)
Let $G = (V, E, (V_\Box, V_\Diamond, V_\bigcirc), \mathit{Prob})$ be a game. Each pair $(\sigma, \pi)$ of strategies for player $\Box$ and player $\Diamond$ determines a unique play $G^{(\sigma,\pi)}$, which is a Markov chain where $V^{+}$ is the set of states and transitions are defined accordingly.

Plays are infinite trees.

For a pair of memoryless strategies $(\sigma, \pi)$, the play $G^{(\sigma,\pi)}$ can be depicted as a Markov chain with the set of states $V$.
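For memoryless strategies, the collapse of the play to a Markov chain over V amounts to substituting each player's chosen distribution at that player's vertices. A minimal sketch, assuming a finite game; the function name `induced_chain`, the argument layout, and the example game are all hypothetical.

```python
def induced_chain(player1, player2, prob, sigma, pi):
    """Markov chain on V induced by a pair of memoryless strategies.

    player1, player2: sets of vertices controlled by the two players
    prob:  stochastic vertex -> {successor: probability}
    sigma: player-1 vertex  -> {successor: probability}
    pi:    player-2 vertex  -> {successor: probability}
    """
    chain = {}
    for v in player1:
        chain[v] = dict(sigma[v])   # player 1 resolves the choice in v
    for v in player2:
        chain[v] = dict(pi[v])      # player 2 resolves the choice in v
    for v, dist in prob.items():
        chain[v] = dict(dist)       # stochastic vertices are unchanged
    return chain


# Hypothetical game: a (player 1), b (player 2), r (stochastic).
prob = {"r": {"a": 0.5, "b": 0.5}}
sigma = {"a": {"r": 1.0}}               # deterministic choice
pi = {"b": {"a": 0.3, "r": 0.7}}        # randomized choice

chain = induced_chain({"a"}, {"b"}, prob, sigma, pi)
for state, row in sorted(chain.items()):
    print(state, row)
```

With history-dependent strategies the same substitution has to be made per history rather than per vertex, which is why the general play has state set V+ and unfolds into a tree.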
Plays (2)

Example 5 (A game and its play)

[Figure: a game with vertices v and u and an edge labelled with probability 1.]

Is there a strategy $\sigma$ such that $v \models \mathcal{G}^{>0}(v)$ in $G^{\sigma}$?

Is there a strategy $\sigma$ such that $v \models \mathcal{G}^{>0}(v \wedge \mathcal{F}^{>0} u)$ in $G^{\sigma}$?

Obviously, there is no such MR (or even FR) strategy.

Let $\sigma(wv)$ select $v \xrightarrow{1/2^{|wv|}} u$ and $v \xrightarrow{1 - 1/2^{|wv|}} v$.

The resulting play:
  v --1/2--> vv --3/4--> vvv --7/8--> vvvv --15/16--> ...
  v --1/2--> vu,  vv --1/4--> vvu,  vvv --1/8--> vvvu,  vvvv --1/16--> vvvvu
  each of vu, vvu, vvvu, vvvvu continues with probability 1.
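Why the history-dependent strategy of Example 5 works can be checked numerically. The probability of staying in v for the first n moves is the product of the factors 1 - 1/2^k for k = 1..n, and this product converges to a positive limit (about 0.2888), while after every history consisting only of v the move to u still has positive probability. The sketch below only evaluates these partial products; the function names are assumptions.

```python
from fractions import Fraction


def sigma(history):
    """Strategy of Example 5: after history wv, move to u with
    probability 1/2^|wv| and stay in v otherwise."""
    p_leave = Fraction(1, 2 ** len(history))
    return {"u": p_leave, "v": 1 - p_leave}


def stay_probability(steps):
    """Probability that the play visits only v during the first `steps` moves."""
    history, prob = ("v",), Fraction(1)
    for _ in range(steps):
        prob *= sigma(history)["v"]
        history += ("v",)
    return prob


for n in (1, 2, 3, 4, 50):
    print(n, float(stay_probability(n)))
```

The first four values are 1/2, 3/8, 21/64 and 315/1024, matching the products of the edge labels 1/2, 3/4, 7/8, 15/16 in the play above.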
A taxonomy of objectives

Each play of a game $G$ is assigned a (numerical) yield. The goal of player $\Box$ / $\Diamond$ is to maximize/minimize the yield.

Win-lose objectives assign either 1 or 0 to each play.
  - $\mathcal{P}^{\sim\varrho}\,\varphi$, where $\varphi$ is an LTL formula.
  - PCTL or PCTL* objectives.

Objectives specified by Borel measurable payoffs.
  - $\mathit{yield}(G^{\sigma,\pi}) = E(f^{\sigma,\pi})$, where $f : \mathit{Run}(G) \to \mathbb{R}$ is measurable.
  - Qualitative payoffs assign either 1 or 0 to each run:
    Büchi, parity, Rabin, Streett, Muller, etc.
  - Quantitative payoffs:
    Mean payoff: $MP(w) = \lim_{n \to \infty} \frac{\sum_{i=0}^{n} \mathit{rew}(w(i))}{n}$
    Discounted payoff: $DP(w) = \sum_{i=0}^{\infty} \lambda^{i} \cdot \mathit{rew}(w(i))$
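The two quantitative payoffs are limits over an infinite run, so a program can only evaluate them on a finite prefix. The sketch below computes exactly the truncated expressions from the formulas above. The function names, the alternating reward sequence, and the discount factor 0.5 (with 0 < λ < 1 assumed) are illustrative choices, not part of the slides.

```python
def mean_payoff_prefix(rewards):
    """(sum_{i=0}^{n} rew(w(i))) / n for the prefix w(0), ..., w(n), n >= 1."""
    n = len(rewards) - 1
    if n < 1:
        raise ValueError("need at least two rewards")
    return sum(rewards) / n


def discounted_payoff_prefix(rewards, lam):
    """sum_{i=0}^{n} lam^i * rew(w(i)), a truncation of the infinite sum."""
    total, weight = 0.0, 1.0
    for r in rewards:
        total += weight * r
        weight *= lam
    return total


# Hypothetical reward sequence 1, 0, 1, 0, ... along a run.
alternating = [1, 0] * 5000

print(mean_payoff_prefix(alternating))             # close to 1/2
print(discounted_payoff_prefix(alternating, 0.5))  # close to 4/3
```

For this sequence the mean payoff tends to 1/2, and the discounted payoff with λ = 0.5 is the geometric series 1 + 1/4 + 1/16 + ... = 4/3; the truncation error of the discounted sum shrinks geometrically with the prefix length.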