Fixed points to solve games
Let Q be a set of safe states. The set of states from which Player I can force the game to stay within Q is given by the following fixed point expression: ∪{ R | R = Q ∩ CPre₁(R) }
Fixpoint for a safety game
[Figure: game graph with positions 0100, 0101, 1101, 0000, 1111, 1000, 1010, 1110]
Does Player I, who owns the rounded positions, have a strategy to stay within the set of states Q \ {1111}?
We must compute ∪{ R | R = ((Q₁ ∪ Q₂) \ {1111}) ∩ CPre₁(R) }. To do that, we use the Tarski fixpoint theorem.
Tarski-Kleene Theorem
Let ⟨L, ⊑, ⊔, ⊓, ⊤, ⊥⟩ be a complete lattice and let f be a Scott-continuous function on L. Then
lfp f is the limit of the sequence: f(⊥), f(f(⊥)), ..., f(...f(⊥)...), ...
gfp f is the limit of the sequence: f(⊤), f(f(⊤)), ..., f(...f(⊤)...), ...
(The gfp statement relies on the dual condition, co-continuity. On a finite lattice every monotone f satisfies both conditions.)
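The iteration in the theorem can be sketched in a few lines of Python. This is only an illustration on a finite powerset lattice: the function `f`, the set `UNIVERSE`, and the helper `iterate_to_fixpoint` are hypothetical names chosen for the example, not part of the slides.

```python
from typing import Callable, FrozenSet, TypeVar

T = TypeVar("T")


def iterate_to_fixpoint(f: Callable[[T], T], start: T) -> T:
    """Apply f repeatedly from `start` until the value stops changing."""
    current = start
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt


# Hypothetical lattice: subsets of {0, 1, 2, 3} ordered by inclusion.
UNIVERSE: FrozenSet[int] = frozenset({0, 1, 2, 3})
BOTTOM: FrozenSet[int] = frozenset()


def f(x: FrozenSet[int]) -> FrozenSet[int]:
    # Monotone: always adds 0 and keeps 1 and 3 only if they are already in x.
    return frozenset({0}) | (x & frozenset({1, 3}))


lfp_f = iterate_to_fixpoint(f, BOTTOM)    # iterate from the bottom element
gfp_f = iterate_to_fixpoint(f, UNIVERSE)  # iterate from the top element
```

The same function has two different fixed points here: starting from ⊥ the sequence stabilizes on {0}, while starting from ⊤ it stabilizes on {0, 1, 3}.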
Iterating from the top element Q:
X₀ = (Q \ {1111}) ∩ CPre₁(Q)
X₁ = (Q \ {1111}) ∩ CPre₁(X₀)
X₂ = (Q \ {1111}) ∩ CPre₁(X₁) = X₁
The sequence has stabilized: this is the greatest fixed point.
X₂ is exactly the set of positions from which Player I can avoid entering {1111}, no matter how Player II behaves.
Player I has a positional (memoryless) strategy to win the game.
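The iteration X₀, X₁, X₂ can be sketched in Python as follows. The edges of the slide's graph are not recoverable from the text, so the game below (`PLAYER1`, `PLAYER2`, `EDGES`) is a hypothetical stand-in. `cpre1` assumes the usual turn-based reading of CPre₁: a Player I position needs some successor in the target, a Player II position needs all of them.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical turn-based game, not the graph drawn on the slide.
PLAYER1: Set[str] = {"a", "c"}          # positions owned by Player I
PLAYER2: Set[str] = {"b", "d", "bad"}   # positions owned by Player II
EDGES: Dict[str, Set[str]] = {
    "a": {"b", "d"},
    "b": {"a"},
    "c": {"bad"},
    "d": {"a", "bad"},
    "bad": {"bad"},
}
ALL: FrozenSet[str] = frozenset(EDGES)


def cpre1(target: FrozenSet[str]) -> FrozenSet[str]:
    """Positions from which Player I can force the next position into target."""
    own = {q for q in PLAYER1 if EDGES[q] & target}    # some successor in target
    opp = {q for q in PLAYER2 if EDGES[q] <= target}   # all successors in target
    return frozenset(own | opp)


def safe_region(safe: FrozenSet[str]) -> FrozenSet[str]:
    """Greatest fixed point of X -> safe ∩ CPre1(X), iterated from all positions."""
    x = ALL
    while True:
        nxt = safe & cpre1(x)
        if nxt == x:
            return x
        x = nxt


WIN = safe_region(ALL - {"bad"})
# Positional strategy: in each winning Player I position, pick a successor in WIN.
STRATEGY = {q: min(EDGES[q] & WIN) for q in sorted(PLAYER1 & WIN)}
```

In this toy game the winning region is {a, b}, and the extracted strategy moves from a to b. The choice depends only on the current position, which is what makes it memoryless.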
Reachability game for set Q: µX · Q ∪ CPre₁(X)
Let G = ⟨Q₁, Q₂, ι, δ⟩ be a TGS and let Reach(G, Q) be a reachability game defined on G. Player I has a winning strategy for this game iff ι ∈ ∩{ R | R = Q ∪ CPre₁(R) }.

Safety game for set Q: νX · Q ∩ CPre₁(X)
Let G = ⟨Q₁, Q₂, ι, δ⟩ be a TGS and let Safe(G, Q) be a safety game defined on G. Player I has a winning strategy for this game iff ι ∈ ∪{ R | R = Q ∩ CPre₁(R) }.
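The reachability case is the dual computation: a least fixed point iterated from the empty set. The sketch below assumes the same turn-based reading of CPre₁ as above. The game (`PLAYER1`, `PLAYER2`, `EDGES`) and the position names are hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical turn-based game used only for illustration.
PLAYER1: Set[str] = {"s", "u"}            # positions owned by Player I
PLAYER2: Set[str] = {"t", "v", "goal"}    # positions owned by Player II
EDGES: Dict[str, Set[str]] = {
    "s": {"t", "v"},
    "t": {"goal"},
    "u": {"v"},
    "v": {"u"},
    "goal": {"goal"},
}


def cpre1(target: FrozenSet[str]) -> FrozenSet[str]:
    """Positions from which Player I can force the next position into target."""
    own = {q for q in PLAYER1 if EDGES[q] & target}
    opp = {q for q in PLAYER2 if EDGES[q] <= target}
    return frozenset(own | opp)


def reach_region(target: FrozenSet[str]) -> FrozenSet[str]:
    """Least fixed point of X -> target ∪ CPre1(X), iterated from the empty set."""
    x: FrozenSet[str] = frozenset()
    while True:
        nxt = target | cpre1(x)
        if nxt == x:
            return x
        x = nxt


REACH = reach_region(frozenset({"goal"}))
INITIAL = "s"                        # plays the role of ι
PLAYER1_WINS = INITIAL in REACH      # the test ι ∈ µX · Q ∪ CPre1(X)
```

Here the iterates are ∅, {goal}, {goal, t}, {goal, t, s}, so Player I wins from s but not from u or v, which can cycle forever without reaching the goal.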
Games of imperfect information
Perfect information hypothesis?
[Figure: typical hybrid system]
The temperature is in the interval (c − 1, c + 1).
Finite precision = imperfect information
[Figure: example game graph with states 1, 2, 3, 4, a Bad marking, transitions labeled a and b, and observations Obs 0 and Obs 1]
Player 0 chooses a letter; Player 1 resolves nondeterminism.
Imperfect information: a slight generalization of incomplete information.
When observing Obs 0, there is no unique good choice: memory is necessary.
Games / Strategies
- A game of imperfect information: game structure + observation structure.
- Observation structure: (Obs, γ), where Obs is a finite set of observations and γ maps every observation to a set of states (we require that every state has at least one observation).
- An observation-based strategy is a function that maps every sequence o₁ σ₁ o₂ ... oₙ to a letter in Σ.
Notation: a game structure of imperfect information is a tuple (S, S₀, Σ, →, Obs, γ).
These games generalize games of perfect information, where Obs = S and γ is the identity function.
These games generalize games of incomplete information: in that case Obs partitions the state space S. [Rei84]
Our objective is to find an algorithm to construct observation-based strategies that avoid Bad.
Classical Approaches
• To solve games of perfect information:
  • (elegant) fixed point algorithms using a controllable predecessor operator
• To solve games of imperfect information:
  • [Reif84] builds a game of perfect information using a knowledge-based subset construction and then solves this game using classical techniques
After a finite prefix of a game, Player I has partial knowledge of the current state of the game: a set of states.
We propose here a new solution that avoids the preliminary explicit subset construction.
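The knowledge mentioned above can be illustrated with a small Python sketch. It assumes the standard update rule for a knowledge set s: after playing σ and then observing obs, the new knowledge is Post_σ(s) ∩ γ(obs). The game data (`TRANSITIONS`, `GAMMA`) and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

StateSet = FrozenSet[int]

# Hypothetical game: TRANSITIONS[(state, letter)] is the set of possible successors.
TRANSITIONS: Dict[Tuple[int, str], StateSet] = {
    (1, "a"): frozenset({1, 2}),
    (1, "b"): frozenset({3}),
    (2, "a"): frozenset({3}),
    (2, "b"): frozenset({1}),
    (3, "a"): frozenset({3}),
    (3, "b"): frozenset({3}),
}
# Hypothetical observations: states 1 and 2 cannot be told apart.
GAMMA: Dict[str, StateSet] = {
    "o1": frozenset({1, 2}),
    "o2": frozenset({3}),
}


def post(states: StateSet, letter: str) -> StateSet:
    """All states reachable in one step from `states` by playing `letter`."""
    result = set()
    for state in states:
        result |= TRANSITIONS.get((state, letter), frozenset())
    return frozenset(result)


def update_knowledge(knowledge: StateSet, letter: str, obs: str) -> StateSet:
    """Knowledge after playing `letter` and then receiving observation `obs`."""
    return post(knowledge, letter) & GAMMA[obs]


k0 = frozenset({1})
k1 = update_knowledge(k0, "a", "o1")   # the player cannot tell 1 from 2
k2 = update_knowledge(k1, "b", "o1")   # the observation rules out state 3
```

An explicit subset construction would build one position per reachable knowledge set, which can be exponential in the number of states.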
A fixed point algorithm
We define a controllable predecessor operator for a set q of sets of states:
CPre(q) = { s ⊆ S \ Bad | ∃σ ∈ Σ · ∀obs ∈ Obs · ∃s′ ∈ q : Post_σ(s) ∩ γ(obs) ⊆ s′ }
A set s belongs to CPre(q) when
(i) s does not intersect Bad, and
(ii) there exists a letter σ such that the set of possible successors of s by σ is covered by q
  (a) no matter how the adversary resolves nondeterminism, and
  (b) no matter which compatible observation obs ∈ Obs is received.
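A brute-force reading of this definition can be written directly in Python, enumerating every subset of S \ Bad. This sketch illustrates the definition only and is not an efficient algorithm. The game data (`STATES`, `BAD`, `SIGMA`, `TRANSITIONS`, `GAMMA`) is hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterator, Set, Tuple

StateSet = FrozenSet[int]

# Hypothetical game of imperfect information.
STATES: StateSet = frozenset({1, 2, 3})
BAD: StateSet = frozenset({3})
SIGMA: Tuple[str, ...] = ("a", "b")
TRANSITIONS: Dict[Tuple[int, str], StateSet] = {
    (1, "a"): frozenset({1, 2}), (1, "b"): frozenset({3}),
    (2, "a"): frozenset({3}),    (2, "b"): frozenset({1}),
    (3, "a"): frozenset({3}),    (3, "b"): frozenset({3}),
}
GAMMA: Dict[str, StateSet] = {"o1": frozenset({1, 2}), "o2": frozenset({3})}


def post(states: StateSet, letter: str) -> StateSet:
    """Possible successors of `states` by `letter`."""
    result: Set[int] = set()
    for state in states:
        result |= TRANSITIONS.get((state, letter), frozenset())
    return frozenset(result)


def subsets(universe: StateSet) -> Iterator[StateSet]:
    """Enumerate every subset of `universe`, including the empty set."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def cpre(q: Set[StateSet]) -> Set[StateSet]:
    """All s ⊆ S \\ Bad with a letter whose successors are covered by q."""
    result: Set[StateSet] = set()
    for s in subsets(STATES - BAD):          # condition (i)
        for letter in SIGMA:                 # condition (ii): some letter σ
            succ = post(s, letter)
            # for every observation, some s' in q covers Post_σ(s) ∩ γ(obs)
            if all(any((succ & cell) <= t for t in q) for cell in GAMMA.values()):
                result.add(s)
                break
    return result


one_step = cpre({frozenset({1, 2})})
```

In this toy game `one_step` contains ∅, {1} and {2} but not {1, 2}: letter a is safe from 1 and letter b is safe from 2, yet no single letter is safe from both.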
Example
[Figure: game with states 1, 2, 3, 4, transitions labeled a, b, c, observations Obs 1 and Obs 2, and q = {A, B}; the sets in CPre({A, B}) are drawn in blue]
Maximal sets
If there is a strategy for a set A, there is a strategy for any B included in A. It is enough to keep only the maximal sets:
CPre(q) = ⌈{ s ⊆ S \ Bad | ∃σ ∈ Σ · ∀obs ∈ Obs · ∃s′ ∈ q : Post_σ(s) ∩ γ(obs) ⊆ s′ }⌉
where ⌈·⌉ keeps only the ⊆-maximal sets.
Antichains
Definition 4 [Antichain of sets of states] An antichain on the partially ordered set ⟨2^S, ⊆⟩ is a set q ⊆ 2^S such that for any A, B ∈ q we have A ⊄ B. Let us call L the set of antichains on S.
Definition 5 [⊑] Let q, q′ ∈ 2^(2^S) and define q ⊑ q′ if and only if ∀A ∈ q : ∃A′ ∈ q′ : A ⊆ A′.
lub: q₁ ⊔ q₂ = ⌈{ s | s ∈ q₁ ∨ s ∈ q₂ }⌉
glb: q₁ ⊓ q₂ = ⌈{ s₁ ∩ s₂ | s₁ ∈ q₁ ∧ s₂ ∈ q₂ }⌉
The minimal element is ∅, the maximal element is {S}.
⟨L, ⊑⟩ is a complete lattice.
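These definitions translate almost literally into Python using frozensets. The function names (`maximal`, `leq`, `lub`, `glb`) and the sample antichains are chosen for this sketch and do not come from the slides.

```python
from typing import FrozenSet, Iterable

StateSet = FrozenSet[int]
Antichain = FrozenSet[StateSet]


def maximal(sets: Iterable[StateSet]) -> Antichain:
    """Keep only the ⊆-maximal sets, which yields an antichain."""
    pool = set(sets)
    return frozenset(s for s in pool if not any(s < t for t in pool))


def leq(q1: Antichain, q2: Antichain) -> bool:
    """q1 ⊑ q2: every set of q1 is included in some set of q2."""
    return all(any(a <= b for b in q2) for a in q1)


def lub(q1: Antichain, q2: Antichain) -> Antichain:
    """Least upper bound: maximal sets of the union."""
    return maximal(q1 | q2)


def glb(q1: Antichain, q2: Antichain) -> Antichain:
    """Greatest lower bound: maximal sets among pairwise intersections."""
    return maximal(s1 & s2 for s1 in q1 for s2 in q2)


# Hypothetical antichains over S = {1, 2, 3}.
q1: Antichain = frozenset({frozenset({1, 2}), frozenset({3})})
q2: Antichain = frozenset({frozenset({2, 3})})

upper = lub(q1, q2)   # {3} is dropped because it is included in {2, 3}
lower = glb(q1, q2)   # the intersections {2} and {3} are incomparable
```

Both operands are below `upper` and above `lower` in the ⊑ ordering, while q₁ and q₂ themselves are incomparable.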
CPre over antichains
CPre(q) = ⌈{ s ⊆ S \ Bad | ∃σ ∈ Σ · ∀obs ∈ Obs · ∃s′ ∈ q : Post_σ(s) ∩ γ(obs) ⊆ s′ }⌉
• CPre is a monotone function over the lattice of antichains
• CPre has a least and a greatest fixed point
Advantage: we keep only the information needed to find a strategy.
Main theorem
Let G = ⟨S, S₀, Σ, →, Obs, γ⟩ be a two-player game of imperfect information. Player 1 has a winning observation-based strategy to avoid Bad iff
{ S₀ ∩ γ(obs) | obs ∈ Obs } ⊑ ⊔{ q | q = CPre(q) }.
We can extract a strategy from the fixed point.
[Figure: the example game graph with states 1, 2, 3, 4, a Bad marking, and transitions labeled a and b]
Does Player 0 have an observation-based strategy to avoid Bad?
Let us compute the gfp of CPre over L.
q₀ = ⊤
q₁ = { {1, 2, 3}_{a,b} }
q₂ = CPre({ {1, 2, 3} }) = { {2}_b, {1, 3}_a }
Indeed,
Post_a({1, 3}) ∩ {1, 2, 4} ⊆ {1, 2, 3}
Post_a({1, 3}) ∩ {1, 3} ⊆ {1, 2, 3}
Post_b({2}) ∩ {1, 3} ⊆ {1, 2, 3}
Post_b({2}) ∩ {1, 2, 4} ⊆ {1, 2, 3}
q₃ = CPre({ {2}, {1, 3} })
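The whole computation (CPre over antichains, the gfp iteration from ⊤ = {S}, and the test of the main theorem) can be sketched end to end in Python. The transitions of the slide's example are not recoverable from the text, so the game below is a hypothetical one with the same flavour: the player must remember its last letter to know which of two indistinguishable states it is in. The sketch enumerates subsets by brute force, so it illustrates the definitions only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Iterator, Set, Tuple

StateSet = FrozenSet[int]
Antichain = FrozenSet[StateSet]

# Hypothetical game: states 2 and 3 share an observation, state 4 is Bad.
STATES: StateSet = frozenset({1, 2, 3, 4})
BAD: StateSet = frozenset({4})
INITIAL: StateSet = frozenset({1})
SIGMA: Tuple[str, ...] = ("a", "b")
TRANSITIONS: Dict[Tuple[int, str], StateSet] = {
    (1, "a"): frozenset({2}), (1, "b"): frozenset({3}),
    (2, "a"): frozenset({1}), (2, "b"): frozenset({4}),
    (3, "a"): frozenset({4}), (3, "b"): frozenset({1}),
    (4, "a"): frozenset({4}), (4, "b"): frozenset({4}),
}
GAMMA: Dict[str, StateSet] = {
    "oA": frozenset({1}), "oB": frozenset({2, 3}), "oBad": frozenset({4}),
}


def post(states: StateSet, letter: str) -> StateSet:
    result: Set[int] = set()
    for state in states:
        result |= TRANSITIONS.get((state, letter), frozenset())
    return frozenset(result)


def maximal(sets: Iterable[StateSet]) -> Antichain:
    pool = set(sets)
    return frozenset(s for s in pool if not any(s < t for t in pool))


def leq(q1: Iterable[StateSet], q2: Antichain) -> bool:
    return all(any(a <= b for b in q2) for a in q1)


def subsets(universe: StateSet) -> Iterator[StateSet]:
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def cpre(q: Antichain) -> Antichain:
    """Maximal sets s ⊆ S \\ Bad that have a letter whose successors q covers."""
    good: Set[StateSet] = set()
    for s in subsets(STATES - BAD):
        for letter in SIGMA:
            succ = post(s, letter)
            if all(any((succ & cell) <= t for t in q) for cell in GAMMA.values()):
                good.add(s)
                break
    return maximal(good)


def gfp_cpre() -> Antichain:
    """Iterate CPre from the top element {S} until it stabilizes."""
    q: Antichain = frozenset({STATES})
    while True:
        nxt = cpre(q)
        if nxt == q:
            return q
        q = nxt


WINNING = gfp_cpre()
INITIAL_KNOWLEDGE = {INITIAL & cell for cell in GAMMA.values()}
PLAYER_WINS = leq(INITIAL_KNOWLEDGE, WINNING)
```

For this game the iterates are {S}, {{1, 2, 3}}, {{1, 2}, {1, 3}}, and the last one is the fixed point. The set {2, 3} is not covered: once the player cannot distinguish 2 from 3, every letter may lead to Bad.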