 
Comparators for Quantitative Games
By Suguman Bansal, Swarat Chaudhuri, Moshe Y. Vardi
Dagstuhl Seminar, March 16, 2017

Repeated games
• Infinitely many rounds of a base game
• Each agent receives a reward in every round
• The reward of each agent from the game is an aggregation of its rewards from every round
  • Discounted-sum aggregation
  • Mean-payoff aggregation
  • ...

Automated reasoning about repeated games
• Existence of rational behavior in a repeated game
  • Find one rational behavior in a repeated game
• Find all rational behaviors in a repeated game
  • Finite representation
• What properties hold in a repeated game if agents behave rationally?
  • Properties of rational behaviors in repeated games

Strategies in repeated games [Rubinstein 1986]
• Finite state machines
• A strategy is defined for an agent
  • All other agents comprise the environment
• Transitions denote one round of the repeated game w.r.t. the agent
  • Transition on (agent action, environment action)
  • Weight on a transition is the reward of the agent for that transition
[Figure: tit-for-tat strategy in repeated games, with states $s_1$, $s_2$ and transitions labeled $(C, C), 2$; $(C, D), 0$; $(D, C), 3$; $(D, D), 1$]
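To make the finite-state view concrete, here is a minimal sketch of the tit-for-tat machine as a weighted transition table. The state names `s1`/`s2`, the reading of the labels as cooperate (`C`) and defect (`D`), the assignment of labels to transitions, and the helper `run_strategy` are all assumptions made for illustration; the slide only shows the four labeled transitions.

```python
# Hypothetical encoding of the two-state tit-for-tat machine.
# Key:   (state, (agent_action, env_action))
# Value: (next_state, reward_of_agent)
TIT_FOR_TAT = {
    ("s1", ("C", "C")): ("s1", 2),
    ("s1", ("C", "D")): ("s2", 0),
    ("s2", ("D", "C")): ("s1", 3),
    ("s2", ("D", "D")): ("s2", 1),
}


def run_strategy(machine, start, env_actions):
    """Play the machine against a finite list of environment actions.

    Returns the agent's actions and the agent's reward sequence.
    """
    state, actions, rewards = start, [], []
    for env in env_actions:
        for (src, (agent, env_label)), (dst, reward) in machine.items():
            if src == state and env_label == env:
                actions.append(agent)
                rewards.append(reward)
                state = dst
                break
        else:
            raise ValueError(f"no transition from {state} on {env}")
    return actions, rewards


actions, rewards = run_strategy(TIT_FOR_TAT, "s1", ["C", "D", "D", "C"])
```

In this run the agent's action in each round repeats the environment's action from the previous round, which is the tit-for-tat behavior.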

Game executions [Rubinstein 1986]
[Figure: strategy machines S1 and S2, each with states $s_1$, $s_2$ and transitions labeled $(C, C), 2$; $(C, D), 0$; $(D, C), 3$; $(D, D), 1$, and their product S1 × S2. A transition $(a, b), 2$ of S1 and a transition $(b, a), 3$ of S2 combine into a transition $(a, b), (2, 3)$ of (S1, S2); the resulting execution repeats $(C, C), (2, 2)$.]
• Synchronize on all agent actions
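The synchronization step can be sketched as a product of two strategy machines: a transition of one agent labeled (x, y) is matched with a transition of the other agent labeled (y, x), and the two rewards are paired. The encoding below, the `ALWAYS_DEFECT` machine, and the helper `compose` are illustrative assumptions, not taken from the slides.

```python
# Key: (state, (own_action, other_action)) -> (next_state, own_reward)
TIT_FOR_TAT = {
    ("s1", ("C", "C")): ("s1", 2),
    ("s1", ("C", "D")): ("s2", 0),
    ("s2", ("D", "C")): ("s1", 3),
    ("s2", ("D", "D")): ("s2", 1),
}

# Hypothetical second strategy: defect in every round.
ALWAYS_DEFECT = {
    ("q", ("D", "C")): ("q", 3),
    ("q", ("D", "D")): ("q", 1),
}


def compose(m1, start1, m2, start2, rounds):
    """Return the first `rounds` steps of the joint game execution.

    Each step is ((action1, action2), (reward1, reward2)).
    """
    s1, s2, execution = start1, start2, []
    for _ in range(rounds):
        matches = [
            ((a1, a2), (r1, r2), (d1, d2))
            for (p1, (a1, e1)), (d1, r1) in m1.items()
            for (p2, (a2, e2)), (d2, r2) in m2.items()
            # Synchronize: each agent's view of the other must agree.
            if p1 == s1 and p2 == s2 and a1 == e2 and a2 == e1
        ]
        if not matches:
            raise ValueError("strategies cannot synchronize")
        actions, rewards, (s1, s2) = matches[0]
        execution.append((actions, rewards))
    return execution


both_tft = compose(TIT_FOR_TAT, "s1", TIT_FOR_TAT, "s1", 3)
tft_vs_defect = compose(TIT_FOR_TAT, "s1", ALWAYS_DEFECT, "q", 3)
```

Two tit-for-tat players cooperate in every round, which matches the repeated $(C, C), (2, 2)$ transitions in the product shown on the slide.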

Strategies are trees
[Figure: a strategy drawn as a tree rooted at $s_1$, with nodes labeled $s_1$, $s_2$ and edges labeled $(C, C), 2$; $(C, D), 0$; $(D, C), 3$; $(D, D), 1$]

Strategy trees and their compositions
• Every player has a regular set of strategies
  • Weighted tree automata
• Composition of agent strategies results in a regular set of game executions
• Compare rewards of a player along different game executions

Comparing rewards
• Reward of an agent in a game execution
  • Reward sequence of the agent: $S = (s_0, s_1, \dots)$
  • $s_i$ is the reward received in the $i$-th round of a game execution
  • Aggregate function $f : \mathbb{N}^{\omega} \to \mathbb{R}$
  • Reward of the agent is $f(S)$
• Which game execution results in a greater reward?
  • Reward sequences $A$ and $B$ on different game executions
  • Is $f(A) \le f(B)$?

Outline of the talk ...
• Introduce the comparator [Bansal, Chaudhuri, Vardi (under submission)]
  • A novel automata-theoretic technique to compare the aggregates of reward sequences
  • Comparator for the discounted-sum aggregate function
• Applications of comparators [Ongoing work]

Comparator
• A comparator for an aggregate function $f : \mathbb{N}^{\omega} \to \mathbb{R}$ is a Büchi automaton
  • A pair of bounded reward sequences $(A, B)$ is a word of the comparator for $f$ iff $f(A) \le f(B)$
• Comparator for the limsup aggregate function
  • The limsup of a sequence of natural numbers is the largest number that occurs infinitely often
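As a small illustration of the relation that a limsup comparator recognizes, the sketch below evaluates it on ultimately periodic sequences, written as a finite prefix followed by a cycle repeated forever. This only checks membership of a pair in the relation; it is not the Büchi automaton itself, and the representation and function names are assumptions made for the example.

```python
def limsup(prefix, cycle):
    """Limsup of the sequence prefix . cycle . cycle . ...

    Only values on the cycle occur infinitely often, so the prefix
    does not matter and the limsup is the largest value on the cycle.
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    return max(cycle)


def in_limsup_relation(a, b):
    """True iff limsup(A) <= limsup(B) for lasso-shaped A and B."""
    return limsup(*a) <= limsup(*b)


A = ([9, 9], [1, 2])   # 9, 9, 1, 2, 1, 2, ...
B = ([0], [3, 0])      # 0, 3, 0, 3, 0, ...
result = in_limsup_relation(A, B)
```

Here `A` starts with larger rewards than `B`, but those occur only finitely often, so the pair `(A, B)` is in the relation while `(B, A)` is not.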

Discounted-sum comparator
• The discounted sum of a reward sequence $R$ with discount factor $d > 1$ is
  $DS_d(R) = r_0 + \frac{r_1}{d} + \frac{r_2}{d^2} + \dots$
• Discounted-sum comparator with discount factor $d > 1$
  • A pair of bounded reward sequences $(A, B)$ is a word of the DS comparator with discount factor $d$ iff $DS_d(A) \le DS_d(B)$
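For intuition, the discounted sum of an ultimately periodic reward sequence has a closed form via the geometric series. The following is a minimal sketch using exact rational arithmetic; the lasso representation (finite prefix plus repeated cycle) and the function name `discounted_sum` are assumptions for illustration.

```python
from fractions import Fraction


def discounted_sum(prefix, cycle, d):
    """Exact DS_d of the sequence prefix . cycle . cycle . ...

    An empty cycle means the sequence is the prefix followed by zeros.
    """
    d = Fraction(d)
    if d <= 1:
        raise ValueError("discount factor must be greater than 1")
    head = sum((Fraction(r) / d**i for i, r in enumerate(prefix)), Fraction(0))
    if not cycle:
        return head
    # Value of one pass through the cycle, discounted from position 0.
    one_pass = sum((Fraction(r) / d**j for j, r in enumerate(cycle)), Fraction(0))
    # Geometric series over the repetitions of the cycle.
    tail = one_pass / (1 - d ** -len(cycle))
    return head + tail / d ** len(prefix)


all_ones = discounted_sum([], [1], 2)          # 1 + 1/2 + 1/4 + ...
alternating = discounted_sum([1], [0, 1], 2)   # 1 + 0 + 1/4 + 0 + 1/16 + ...
```

With these two values, the pair (`alternating`, `all_ones`) satisfies the comparator condition for $d = 2$, since 4/3 is at most 2.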

DS Comparator: Core Insight I
• Sequence $A = (a_0, a_1, a_2, \dots)$
• Discount factor $d > 1$
• $DS_d(A) = a_0 + \frac{a_1}{d} + \frac{a_2}{d^2} + \dots = (a_0 . a_1 a_2 \dots)_d = (A)_d$, a number in base $d$ [Chaudhuri, Sankaranarayanan, Vardi, LICS 2013]
• Use lexicographic ordering of sequences
  • Works only if $a_i \le d - 1$ for all $i \ge 0$
  • Further difficulty when the discount factor is non-integral [Akiyama, Frougny, Sakarovitch, IJoM 2008]
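The digit restriction can be seen on a small example. The sketch below, restricted to finite sequences padded with zeros and with hypothetical helper names, compares the lexicographic order with the order on discounted sums for $d = 10$.

```python
from fractions import Fraction


def ds(seq, d):
    """Exact discounted sum of a finite sequence (implicitly zero-padded)."""
    return sum((Fraction(r) / Fraction(d) ** i for i, r in enumerate(seq)),
               Fraction(0))


def lex_leq(a, b):
    """Lexicographic comparison after padding to equal length with zeros."""
    n = max(len(a), len(b))
    pad_a = list(a) + [0] * (n - len(a))
    pad_b = list(b) + [0] * (n - len(b))
    return pad_a <= pad_b


d = 10
# All digits after the first are at most d - 1: both orders agree.
agree = lex_leq([1, 9], [2, 0]) == (ds([1, 9], d) <= ds([2, 0], d))
# The entry 15 exceeds d - 1: lexicographically smaller, but 2.5 > 2.
disagree = lex_leq([1, 15], [2, 0]) != (ds([1, 15], d) <= ds([2, 0], d))
```

The second pair shows why lexicographic comparison alone does not suffice once rewards are allowed to exceed $d - 1$.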

DS Comparator: Core Insight II
• Sequence $A = (a_0, a_1, a_2, \dots)$
• Discount factor $d > 1$
• $DS_d(A) = a_0 + \frac{a_1}{d} + \frac{a_2}{d^2} + \dots = (a_0 . a_1 a_2 \dots)_d = (A)_d$
• $DS_d(A) \le DS_d(B)$ iff $(A)_d \le (B)_d$
• Arithmetic in base $d$
  • Find $C = (c_0, c_1, c_2, \dots)$ such that
    • $DS_d(C) = (C)_d \ge 0$
    • $(A)_d + (C)_d = (B)_d$

DS Comparator: Core Insight II (cont.)
• Consider $d = 10$, with carry sequence $X$:

      X      2    1    0    0    0    0   ...
      A      5   13    6    0    0    0   ...
      C  +   0    8    6    0    0    0   ...
      B      7    2    2    0    0    0   ...

• $i = 0$: $a_0 + c_0 + x_0 = b_0$
• $i > 0$: $a_i + c_i + x_i = b_i + d \cdot x_{i-1}$
• Together these give $DS_d(A) + DS_d(C) = DS_d(B)$
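The worked example can be checked mechanically. Below is a small sketch that tests the two carry equations on finite, non-empty prefixes and confirms the resulting identity on discounted sums; the function names are hypothetical, and the sequences are the first three columns of the example (all later entries are zero).

```python
from fractions import Fraction


def ds(seq, d):
    """Exact discounted sum of a finite sequence (implicitly zero-padded)."""
    return sum((Fraction(r) / Fraction(d) ** i for i, r in enumerate(seq)),
               Fraction(0))


def satisfies_carry_equations(a, b, c, x, d):
    """Check a_0 + c_0 + x_0 = b_0 and a_i + c_i + x_i = b_i + d * x_{i-1}."""
    n = len(a)
    if not (len(b) == len(c) == len(x) == n):
        raise ValueError("sequences must have equal length")
    if a[0] + c[0] + x[0] != b[0]:
        return False
    return all(a[i] + c[i] + x[i] == b[i] + d * x[i - 1] for i in range(1, n))


X = [2, 1, 0]
A = [5, 13, 6]
C = [0, 8, 6]
B = [7, 2, 2]
ok = satisfies_carry_equations(A, B, C, X, 10)
```

Because the last carry is zero, the equations imply that 6.36 + 0.86 = 7.22 when the three sequences are read in base 10.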

DS Comparator: Finite-state memory?
• Upper bound $\mu$ on reward sequences; rational discount factor $d > 1$
• For all bounded reward sequences $A$, $B$
  • Can find sequences $C$ and $X$ that satisfy the equations
  • $0 \le c_i \le \mu \cdot \frac{d}{d-1}$, where $c_i$ ranges over fractions of a fixed form
  • $x_i \le 1 + \frac{\mu}{d-1}$, where $x_i$ ranges over fractions of a fixed form
• Finitely many possibilities for $(x_i, c_i)$ pairs

DS Comparator: Construction
• $a_0 + x_0 + c_0 = b_0$
• $a_i + c_i + x_i = b_i + d \cdot x_{i-1}$
[Figure: automaton fragment. From the start state, reading $(a_0, b_0)$ leads to state $(x_0, c_0)$; from state $(x_{i-1}, c_{i-1})$, reading $(a_i, b_i)$ leads to state $(x_i, c_i)$.]
• The automaton accepts $(A, B)$ iff $DS_d(A) \le DS_d(B)$
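To see where the state components $(x_i, c_i)$ come from, the sketch below computes a witness offline for finite, equal-length sequences and an integer discount factor. It is not the Büchi automaton, which tracks these values on the fly over infinite words; the function name `witness` and the choice of $C$ as the base-$d$ expansion of the difference are assumptions made for this illustration.

```python
from fractions import Fraction


def witness(a, b, d):
    """Return (C, X) satisfying the carry equations, or None.

    Assumes integer d >= 2 and non-empty sequences of equal length
    (implicitly zero-padded). Returns None when DS_d(A) > DS_d(B).
    """
    n = len(a)
    if n == 0 or len(b) != n:
        raise ValueError("sequences must be non-empty and of equal length")
    diff = sum((Fraction(b[i] - a[i]) / d**i for i in range(n)), Fraction(0))
    if diff < 0:
        return None
    # diff * d^(n-1) is a non-negative integer; peel off base-d digits.
    scaled = int(diff * d ** (n - 1))
    c = [0] * n
    for i in range(n - 1, 0, -1):
        scaled, c[i] = divmod(scaled, d)
    c[0] = scaled  # integer part, may exceed d - 1
    # Carries follow from a_i + c_i + x_i = b_i + d * x_{i-1}.
    x = [b[0] - a[0] - c[0]]
    for i in range(1, n):
        x.append(b[i] + d * x[i - 1] - a[i] - c[i])
    return c, x


C, X = witness([5, 13, 6], [7, 2, 2], 10)
```

On the base-10 example this recovers the same $C$ and $X$ as in the table, and swapping the two arguments yields `None` because the difference becomes negative.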

So far ...
✓ Introduce the comparator [Bansal, Chaudhuri, Vardi (under review)]
  ✓ A novel automata-theoretic technique to compare the aggregates of reward sequences
  ✓ Comparator for the discounted-sum aggregate function
• Applications of comparators

Application in quantitative graph games
• Graph game with vertices $V_0$, $V_1$ and edges $E$
• From a vertex $v \in V_i$, agent $P_i$ picks the next vertex
• Each agent receives a reward in every vertex
• The reward of an agent from the game is given by aggregation over its reward sequence
• The objective of every player is to receive a greater reward than the other player
• Find a winning strategy for player $P_i$?

Quantitative graph game: solution
If the aggregate function $f : \mathbb{N}^{\omega} \to \mathbb{R}$ has a comparator
• The objective of an agent is an $\omega$-regular objective
• We know how to solve graph games with $\omega$-regular objectives!
  • Leverages algorithms from the qualitative domain
  • Quantitative objective converted to a qualitative objective
• Generic solutions for aggregate functions
  • As long as their comparator exists

Comparators for repeated games?
• Finite-state representations for strategies in repeated games [Rubinstein 1986]
• Regular set of strategies for each agent
  • Finite-state, weighted-tree-automata-based representation for strategies
• Finite-state representations of rational behaviors in this repeated game
  • Use the comparator

Summary
• Introduced the idea of comparators for qualitative comparison of aggregates of reward sequences
• Comparators have applications in games and beyond
  • Quantitative graph games
  • Discounted-sum inclusion problem
• Generic algorithms for aggregate functions with comparators
  • Leverage algorithms and heuristics from qualitative analysis
• Potential to use comparators to reason about repeated games