Lecture 8 Feb 2, 2010 CS 886

Outline
• Multi-agent systems
• Game theory
• Russell and Norvig: Sect 17.6
CS886 Lecture Slides (c) 2010 P. Poupart

Multi-agent systems
• So far…
  – Single agent optimizing some objectives in a possibly uncertain environment
  – But what if there are several agents?
• Multi-agent systems
  – Two (or more) agents can influence the world
  – How should an agent act given that it shares “control” with other agents?

Multi-agent Systems
• Search techniques for deterministic games with alternating play
  – Minimax algorithm
  – Alpha-beta pruning
• Today:
  – Extend decision theory to multi-agent systems
  – View other agents as sources of uncertainty
  – Framework: Game theory

What is game theory?
• Game theory is a formal way to analyze interactions among a group of rational agents who behave strategically
  – Group: Must have more than 1 decision maker
    • Otherwise you have a decision problem, not a game
• Solitaire is not a game!

What is game theory?
• Game theory is a formal way to analyze interactions among a group of rational agents who behave strategically
  – Interaction: What one agent does directly affects at least one other agent in the group
  – Rational: An agent chooses its best action
  – Strategic: Agents take into account how other agents influence the game

Games
• Examples:
  – Chess, soccer, poker, etc.
  – Elections
  – Auctions, trades
  – Taxation system
  – Negotiation
  – Packet routing protocols
  – Driving laws

Two aspects
• Agent design
  – Given a game, what is a rational strategy?
  – Ex: playing chess, driving, voting, filling out an income tax report, etc.
• Mechanism design
  – Given that agents behave rationally, what should the rules of the game be?
  – Ex: designing driving laws, an election, a taxation system, an auction, etc.

Strategic Games (aka normal form)
• Formally: $\langle I, \{S_i\}, \{U_i\} \rangle$
• Set of agents $I = \{1, 2, \ldots, n\}$
• Each agent $i$ can choose a strategy $s_i \in S_i$
• Outcome of the game is defined by a strategy profile $(s_1, \ldots, s_n) \in S$, where $S = S_1 \times \cdots \times S_n$
• Agents have preferences over the outcomes
  – Utility functions: $U_i(s_1, \ldots, s_n) \in \mathbb{R}$

Example: Election
• Agents: electors
• Strategies: possible votes for different candidates
• Outcome: set of all votes determines a winner (elected candidate)
• Utility fn: preferences for each candidate

Simple Games
• Assumptions:
  – Single decision
  – Deterministic game
  – Fully observable game
  – Simultaneous play
• Possible to relax those assumptions…

Example: Even or Odd

                      Agent 2
                   One       Two
Agent 1   One     2,-2      -3,3
          Two     -3,3      4,-4

• Zero-sum game: $\sum_{i=1}^{n} u_i(o) = 0$ for every outcome $o$
• $I = \{1, 2\}$
• $S_i = \{\text{One}, \text{Two}\}$
• An outcome is (One, Two)
• $U_1((\text{One}, \text{Two})) = -3$ and $U_2((\text{One}, \text{Two})) = 3$
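
The Even or Odd game above can be written down directly as a payoff table in code. The following is a minimal sketch, assuming a dictionary keyed by strategy profiles; the names `EVEN_OR_ODD`, `utility` and `is_zero_sum` are illustrative and do not come from the slides, and agents are indexed from 0 rather than 1.

```python
# Payoffs for the Even or Odd game, keyed by (agent 1 strategy, agent 2 strategy).
# Each value is the pair (u_1, u_2) read off the slide's payoff matrix.
EVEN_OR_ODD = {
    ("One", "One"): (2, -2),
    ("One", "Two"): (-3, 3),
    ("Two", "One"): (-3, 3),
    ("Two", "Two"): (4, -4),
}


def utility(game, agent, profile):
    """Return the utility of `agent` (0-indexed, an assumption) for a strategy profile."""
    return game[profile][agent]


def is_zero_sum(game):
    """True when the agents' utilities sum to zero in every outcome."""
    return all(sum(payoffs) == 0 for payoffs in game.values())


print(utility(EVEN_OR_ODD, 0, ("One", "Two")))  # agent 1's utility for (One, Two)
print(is_zero_sum(EVEN_OR_ODD))
```

Keying the table by the full strategy profile mirrors the definition of a utility function over outcomes, so the same layout extends to more than two agents by using longer tuples.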

Examples of strategic games

Baseball or Soccer (coordination game)

             B        S
    B       2,1      0,0
    S       0,0      1,2

Chicken (anti-coordination game)

             T        C
    T      -1,-1     10,0
    C       0,10     5,5

Example: Prisoner’s Dilemma

                       Confess    Don’t Confess
    Confess             -5,-5        0,-10
    Don’t Confess       -10,0        -1,-1

Playing a game
• We now know how to describe a game
• Next step: playing the game!
• Recall that agents are rational
  – Let $p_i$ be agent $i$’s beliefs about what its opponents will do
  – Agent $i$ is rational if it chooses to play strategy $s_i^*$ where
    $s_i^* = \arg\max_{s_i} \sum_{s_{-i}} u_i(s_i, s_{-i}) \, p_i(s_{-i})$
• Notation: $s_{-i} = (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$
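
The rationality condition above is an expected-utility maximization, which can be sketched in a few lines. This example assumes a two-agent game, so that $s_{-i}$ is just the other agent's strategy; the function name `best_response` and the belief values are hypothetical, while the payoffs are agent 1's utilities from the Even or Odd example.

```python
def best_response(strategies, utility, belief):
    """Return the strategy s_i maximizing sum over s_-i of u_i(s_i, s_-i) * p_i(s_-i).

    `belief` maps each opponent strategy to its believed probability.
    """
    def expected_utility(s_i):
        return sum(prob * utility(s_i, s_other) for s_other, prob in belief.items())

    return max(strategies, key=expected_utility)


# Agent 1's utilities in the Even or Odd game, keyed by (own strategy, opponent strategy).
U1 = {("One", "One"): 2, ("One", "Two"): -3, ("Two", "One"): -3, ("Two", "Two"): 4}


def u1(s1, s2):
    return U1[(s1, s2)]


# Hypothetical belief: agent 1 thinks agent 2 plays One with probability 0.9.
belief = {"One": 0.9, "Two": 0.1}
print(best_response(["One", "Two"], u1, belief))
```

Changing the belief changes the answer: if agent 1 instead believes agent 2 mostly plays Two, the maximizing strategy switches to Two.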

Dominated Strategies
• Definition: A strategy $s_i$ is strictly dominated if $\exists s_i', \forall s_{-i}: \; u_i(s_i, s_{-i}) < u_i(s_i', s_{-i})$
• A rational agent will never play a strictly dominated strategy!
  – This allows us to solve some games!
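
Example: Prisoner’s Dilemma

                       Confess    Don’t Confess
    Confess             -5,-5        0,-10
    Don’t Confess       -10,0        -1,-1

After eliminating the row player’s strictly dominated strategy (Don’t Confess):

                       Confess    Don’t Confess
    Confess             -5,-5        0,-10

After eliminating the column player’s strictly dominated strategy (Don’t Confess):

                       Confess
    Confess             -5,-5

Equilibrium outcome: (Confess, Confess)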
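
The elimination steps in the Prisoner's Dilemma can be automated. The sketch below handles two-agent games only and considers domination by pure strategies only, matching the definition given earlier; the function names are illustrative, not taken from the slides.

```python
PRISONERS_DILEMMA = {
    ("Confess", "Confess"): (-5, -5),
    ("Confess", "Don't Confess"): (0, -10),
    ("Don't Confess", "Confess"): (-10, 0),
    ("Don't Confess", "Don't Confess"): (-1, -1),
}


def payoff(game, agent, mine, theirs):
    """Utility of `agent` (0 = row, 1 = column) when it plays `mine` against `theirs`."""
    profile = (mine, theirs) if agent == 0 else (theirs, mine)
    return game[profile][agent]


def strictly_dominated(game, agent, s, own, other):
    """True if some other remaining strategy beats `s` against every opponent strategy."""
    return any(
        all(payoff(game, agent, alt, t) > payoff(game, agent, s, t) for t in other)
        for alt in own
        if alt != s
    )


def iterated_elimination(game, rows, cols):
    """Repeatedly remove strictly dominated strategies until none remain."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for s in list(rows):
            if strictly_dominated(game, 0, s, rows, cols):
                rows.remove(s)
                changed = True
        for s in list(cols):
            if strictly_dominated(game, 1, s, cols, rows):
                cols.remove(s)
                changed = True
    return rows, cols


strategies = ["Confess", "Don't Confess"]
print(iterated_elimination(PRISONERS_DILEMMA, strategies, strategies))
```

For the Prisoner's Dilemma only Confess survives for each player, which is the equilibrium outcome shown on the slide.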

Strict Dominance does not capture the whole picture

             A        B        C
    A       0,4      4,0      5,3
    B       4,0      0,4      5,3
    C       3,5      3,5      6,6

• What strict dominance eliminations can we do? None…
• So what should the players of this game do?

Nash Equilibrium
• Sometimes an agent’s best response depends on the strategies other agents are playing
• A strategy profile $s^*$ is a Nash equilibrium if no agent has an incentive to deviate from its strategy given that the others do not deviate:
  $\forall i, \forall s_i \in S_i: \; u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*)$

Nash Equilibrium
• Equivalently, $s^*$ is a N.E. iff $\forall i: \; s_i^* \in \arg\max_{s_i} u_i(s_i, s_{-i}^*)$

             A        B        C
    A       0,4      4,0      5,3
    B       4,0      0,4      5,3
    C       3,5      3,5      6,6

• (C,C) is a N.E. because
  $u_1(C, C) = 6 \ge u_1(A, C) = u_1(B, C) = 5$
  AND
  $u_2(C, C) = 6 \ge u_2(C, A) = u_2(C, B) = 5$
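
The no-profitable-deviation check can be run over every strategy profile to list the pure-strategy Nash equilibria of a small game. This is a brute-force sketch for two-agent games; `pure_nash_equilibria` and `THREE_BY_THREE` are illustrative names, and the payoffs are those of the 3x3 game above.

```python
from itertools import product

# The 3x3 game from the slide, keyed by (row strategy, column strategy).
THREE_BY_THREE = {
    ("A", "A"): (0, 4), ("A", "B"): (4, 0), ("A", "C"): (5, 3),
    ("B", "A"): (4, 0), ("B", "B"): (0, 4), ("B", "C"): (5, 3),
    ("C", "A"): (3, 5), ("C", "B"): (3, 5), ("C", "C"): (6, 6),
}


def pure_nash_equilibria(game, rows, cols):
    """Return every profile from which neither agent gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = game[(r, c)]
        # Row agent: no alternative row does strictly better against column c.
        row_ok = all(game[(alt, c)][0] <= u_row for alt in rows)
        # Column agent: no alternative column does strictly better against row r.
        col_ok = all(game[(r, alt)][1] <= u_col for alt in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(THREE_BY_THREE, "ABC", "ABC"))
```

Enumeration costs one pass over all profiles, which is fine for matrix games of this size but grows exponentially with the number of agents.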

Another example: Coordination Game

             B        S
    B       2,1      0,0
    S       0,0      1,2

• 2 pure-strategy Nash equilibria: (B,B) and (S,S)

Yet another example

                      Agent 2
                   One       Two
Agent 1   One     2,-2      -3,3
          Two     -3,3      4,-4

• There is no PURE strategy Nash equilibrium for this game

(Mixed) Nash Equilibria
• Mixed strategy $\sigma_i$:
  – $\sigma_i \in \Sigma_i$ defines a probability distribution over $S_i$
• Strategy profile: $\sigma = (\sigma_1, \ldots, \sigma_n)$
• Expected utility: $u_i(\sigma) = \sum_{s \in S} \left( \prod_j \sigma_j(s_j) \right) u_i(s)$
• Nash equilibrium: $\sigma^*$ is a (mixed) Nash equilibrium if
  $\forall i, \forall \sigma_i \in \Sigma_i: \; u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*)$
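
The expected-utility formula for a mixed strategy profile translates directly into a sum over pure profiles weighted by the product of each agent's probabilities. The sketch below assumes each mixed strategy is a dictionary from pure strategies to probabilities; `expected_utility` is an illustrative name, and the uniform mixed strategies are a hypothetical input, not an equilibrium.

```python
from fractions import Fraction
from itertools import product
from math import prod

EVEN_OR_ODD = {
    ("One", "One"): (2, -2),
    ("One", "Two"): (-3, 3),
    ("Two", "One"): (-3, 3),
    ("Two", "Two"): (4, -4),
}


def expected_utility(game, sigma, agent):
    """Sum over pure profiles s of (product over j of sigma_j(s_j)) * u_agent(s).

    `sigma` is a tuple with one {strategy: probability} dict per agent.
    """
    total = 0
    for profile in product(*(dist.keys() for dist in sigma)):
        weight = prod(dist[s_j] for dist, s_j in zip(sigma, profile))
        total += weight * game[profile][agent]
    return total


# Hypothetical profile: both agents mix uniformly over One and Two.
uniform = {"One": Fraction(1, 2), "Two": Fraction(1, 2)}
print(expected_utility(EVEN_OR_ODD, (uniform, uniform), 0))
```

Using `Fraction` keeps the probabilities exact, which makes it easy to compare expected utilities for equality when checking indifference conditions.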

Yet another example

                        B
                   One       Two
    A     One     2,-2      -3,3
          Two     -3,3      4,-4

• p = Pr(One) and q = Pr(One) are the probabilities with which the two agents play One
• How do we determine p and q?
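
One standard way to determine p and q in a 2x2 game is the indifference condition: each agent mixes so that the other agent's two pure strategies yield the same expected utility. The sketch below applies it to the game above, taking p as agent A's probability of playing One and q as agent B's, which is an assumption about the slide's labelling. It also assumes a fully mixed equilibrium exists, so the denominators are non-zero; `mixed_equilibrium_2x2` is an illustrative name.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b):
    """Solve the indifference conditions of a 2x2 game.

    a[i][j] and b[i][j] are the payoffs to the row and column agent when the
    row agent plays strategy i and the column agent plays strategy j.
    Returns (p, q): the probabilities that the row and column agent play
    their first strategy. Assumes a fully mixed equilibrium exists.
    """
    # p makes the column agent indifferent between its two columns:
    # p*b[0][0] + (1-p)*b[1][0] == p*b[0][1] + (1-p)*b[1][1]
    p = Fraction(b[1][1] - b[1][0], b[0][0] - b[0][1] - b[1][0] + b[1][1])
    # q makes the row agent indifferent between its two rows:
    # q*a[0][0] + (1-q)*a[0][1] == q*a[1][0] + (1-q)*a[1][1]
    q = Fraction(a[1][1] - a[0][1], a[0][0] - a[0][1] - a[1][0] + a[1][1])
    return p, q


# Even or Odd payoffs: rows are A's strategies (One, Two), columns are B's.
A_PAYOFFS = [[2, -3], [-3, 4]]
B_PAYOFFS = [[-2, 3], [3, -4]]

p, q = mixed_equilibrium_2x2(A_PAYOFFS, B_PAYOFFS)
print(p, q)
```

For this game both probabilities come out to 7/12, and agent A's expected utility at that profile is -1/12, so the game slightly favours agent B.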

Exercise

             B        S
    B       2,1      0,0
    S       0,0      1,2

• This game has 3 Nash equilibria (2 pure strategy NE and 1 mixed strategy NE). Find them.

Mixed Nash Equilibrium
• Theorem (Nash, 1950): Every game in which the strategy sets $S_1, \ldots, S_n$ have a finite number of elements has a mixed strategy equilibrium.
• John Nash: Nobel Prize in Economics (1994)

Other Useful Theorems
• Thm: In an n-player pure strategy game $G = (S_1, \ldots, S_n; u_1, \ldots, u_n)$, if iterated elimination of strictly dominated strategies eliminates all but the strategies $(s_1^*, \ldots, s_n^*)$, then these strategies are the unique NE of the game.
• Thm: Any NE will survive iterated elimination of strictly dominated strategies.

Nash Equilibrium
• Interpretations:
  – Focal points, self-enforcing agreements, stable social convention, consequence of rational inference…
• Criticisms:
  – They may not be unique
    • Ways of overcoming this: refinements of the equilibrium concept, mediation, learning
  – They may be hard to find
  – People don’t always behave based on what equilibria would predict (ultimatum games and notions of fairness, …)

Bayesian Games
• What should player A do?

                        Player B
                      L        R
Player A      U      3,?     -2,?
              D      0,?      6,?

• Question: When does such a situation arise?

Bayesian Games
• Hockey lover gets 2 units for watching hockey and 1 unit for watching curling
• Curling lover gets 2 units for watching curling and 1 unit for watching hockey
• Pat is a hockey lover
• Pat thinks that Chris is probably a hockey lover, but is not sure

With 2/3 chance (Chris is a hockey lover):

                   Chris
                 H        C
    Pat   H     2,2      0,0
          C     0,0      1,1

With 1/3 chance (Chris is a curling lover):

                   Chris
                 H        C
    Pat   H     2,1      0,0
          C     0,0      1,2

Bayesian Games
• In a Bayesian game each player has a type
• All players know their own type, but have only a probability distribution over their opponents’ types
• Game G
  – Set of action spaces: $A_1, \ldots, A_n$
  – Set of type spaces: $T_1, \ldots, T_n$
  – Set of beliefs: $P_1, \ldots, P_n$
  – Set of payoff functions: $u_1, \ldots, u_n$
  – $P_i(t_{-i} \mid t_i)$ is the probability distribution over the types of the other players, given that player $i$ has type $t_i$
  – $u_i(a_1, \ldots, a_n; t_i)$ is the utility (payoff) to agent $i$ if each player $j$ chooses action $a_j$ and agent $i$ has type $t_i \in T_i$

Knowledge Assumptions (Who knows what)
• All players know the $A_i$’s, $T_i$’s, $P_i$’s and $u_i$’s
• The $i$’th player knows $t_i$ but not $t_1, t_2, \ldots, t_{i-1}, t_{i+1}, \ldots, t_n$
• All players know that all players know the above
• And they know that they know that they know… (common knowledge)
• Def: A strategy $s_i(t_i)$ in a Bayesian game is a mapping from $T_i$ to $A_i$ (i.e. it specifies what action should be taken for each type)
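
The definition of a strategy as a mapping from types to actions can be made concrete with the Pat and Chris example. The sketch below computes Pat's expected utility for each action, given Pat's belief over Chris's type and a candidate strategy for Chris. The type labels and the particular strategy for Chris (each type goes to the sport it prefers) are hypothetical choices for illustration; the payoffs and the 2/3 and 1/3 probabilities come from the slides.

```python
from fractions import Fraction

# Pat's payoff, keyed by (Pat's action, Chris's action). Pat is a hockey lover,
# so these values are the same whichever type Chris has.
PAT_PAYOFF = {("H", "H"): 2, ("H", "C"): 0, ("C", "H"): 0, ("C", "C"): 1}

# Pat's belief over Chris's type.
BELIEF = {"hockey_lover": Fraction(2, 3), "curling_lover": Fraction(1, 3)}


def pat_expected_utility(pat_action, chris_strategy):
    """Average Pat's payoff over Chris's types, using Chris's type-to-action mapping."""
    return sum(
        prob * PAT_PAYOFF[(pat_action, chris_strategy[chris_type])]
        for chris_type, prob in BELIEF.items()
    )


# Hypothetical strategy for Chris: a mapping from type to action.
chris_strategy = {"hockey_lover": "H", "curling_lover": "C"}

for action in ("H", "C"):
    print(action, pat_expected_utility(action, chris_strategy))
```

Against this particular strategy, hockey gives Pat an expected utility of 4/3 and curling gives 1/3, so hockey is Pat's best response under these assumptions.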