Lecture 8 Feb 2, 2010 CS 886
Outline
• Multi-agent systems
• Game theory
• Russell and Norvig: Sect 17.6
CS886 Lecture Slides (c) 2010 P. Poupart

Multi-agent systems
• So far…
  – Single agent optimizing some objectives in a possibly uncertain environment
  – But what if there are several agents?
• Multi-agent systems
  – Two (or more) agents can influence the world
  – How should an agent act given that it shares “control” with other agents?

Multi-agent Systems
• Search techniques for deterministic games with alternating play
  – Minimax algorithm
  – Alpha-beta pruning
• Today:
  – Extend decision theory to multi-agent systems
  – View other agents as sources of uncertainty
  – Framework: Game theory

What is game theory?
• Game theory is a formal way to analyze interactions among a group of rational agents who behave strategically
  – Group: must have more than 1 decision maker
    • Otherwise you have a decision problem, not a game
Solitaire is not a game!

What is game theory?
• Game theory is a formal way to analyze interactions among a group of rational agents who behave strategically
  – Interaction: what one agent does directly affects at least one other agent in the group
  – Rational: an agent chooses its best action
  – Strategic: agents take into account how other agents influence the game

Games
• Examples:
  – Chess, soccer, poker, etc.
  – Elections
  – Auctions, trades
  – Taxation system
  – Negotiation
  – Packet routing protocols
  – Driving laws

Two aspects
• Agent design
  – Given a game, what is a rational strategy?
  – Ex: playing chess, driving, voting, filling out an income tax return, etc.
• Mechanism design
  – Given that agents behave rationally, what should the rules of the game be?
  – Ex: designing driving laws, an election, a taxation system, an auction, etc.

Strategic Games (aka normal form)
• Formally: $\langle I, \{S_i\}, \{U_i\} \rangle$
• Set of agents $I = \{1, 2, \dots, n\}$
• Each agent i can choose a strategy $s_i \in S_i$
• Outcome of the game is defined by a strategy profile $(s_1, \dots, s_n) \in S$
• Agents have preferences over the outcomes
  – Utility functions: $U_i(s_1, \dots, s_n) \in \mathbb{R}$

Example: Election
• Agents: electors
• Strategies: possible votes for different candidates
• Outcome: set of all votes determines a winner (elected candidate)
• Utility fn: preferences for each candidate

Simple Games
• Assumptions:
  – Single decision
  – Deterministic game
  – Fully observable game
  – Simultaneous play
• Possible to relax those assumptions…

Example: Even or Odd

                    Agent 2
                    One      Two
Agent 1    One      2,-2     -3,3
           Two      -3,3     4,-4

• Zero-sum game: $\sum_{i=1}^{n} u_i(o) = 0$
• $I = \{1, 2\}$
• $S_i = \{\text{One}, \text{Two}\}$
• An outcome is (One, Two)
• $U_1((\text{One},\text{Two})) = -3$ and $U_2((\text{One},\text{Two})) = 3$

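One way to make the tuple of agents, strategy sets and utilities concrete is to write the Even or Odd game as plain Python data. This is only a sketch: the names `agents`, `strategies`, `payoffs`, `utility` and `is_zero_sum` are illustrative choices, not notation from the slides.

```python
from itertools import product

# Assumed encoding of the Even or Odd game as <I, {S_i}, {U_i}>.
agents = (1, 2)
strategies = {1: ("One", "Two"), 2: ("One", "Two")}

# Strategy profile (s_1, s_2) -> (U_1, U_2), copied from the payoff matrix.
payoffs = {
    ("One", "One"): (2, -2),
    ("One", "Two"): (-3, 3),
    ("Two", "One"): (-3, 3),
    ("Two", "Two"): (4, -4),
}

def utility(i, profile):
    """Utility U_i of agent i (1-indexed) for a strategy profile."""
    return payoffs[profile][i - 1]

def is_zero_sum():
    """True if the utilities sum to zero for every strategy profile."""
    all_profiles = product(*(strategies[i] for i in agents))
    return all(sum(utility(i, s) for i in agents) == 0 for s in all_profiles)

print(utility(1, ("One", "Two")), utility(2, ("One", "Two")))  # -3 3
print(is_zero_sum())  # True
```
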
Examples of strategic games

Baseball or Soccer (Coordination Game)

           B        S
   B       2,1      0,0
   S       0,0      1,2

Chicken (Anti-Coordination Game)

           T        C
   T       -1,-1    10,0
   C       0,10     5,5

Example: Prisoner’s Dilemma

                     Confess     Don’t Confess
   Confess           -5,-5       0,-10
   Don’t Confess     -10,0       -1,-1

Playing a game
• We now know how to describe a game
• Next step: playing the game!
• Recall, agents are rational
  – Let $p_i$ be agent i’s beliefs about what its opponents will do
  – Agent i is rational if it chooses to play strategy $s_i^*$ where
    $s_i^* = \arg\max_{s_i} \sum_{s_{-i}} u_i(s_i, s_{-i})\, p_i(s_{-i})$
• Notation: $s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_n)$

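The rationality condition can be sketched for a two-agent game as follows. The function names and the uniform belief are assumptions made for illustration; the utilities are Agent 1's payoffs from the Even or Odd game.

```python
def expected_utility(u_i, s_i, beliefs):
    """Sum over opponent strategies of u_i(s_i, s_-i) * p_i(s_-i)."""
    return sum(prob * u_i[(s_i, s_other)] for s_other, prob in beliefs.items())

def best_response(u_i, own_strategies, beliefs):
    """Strategy maximizing expected utility under the agent's beliefs."""
    return max(own_strategies, key=lambda s_i: expected_utility(u_i, s_i, beliefs))

# Agent 1's utilities in Even or Odd, keyed by (own strategy, opponent strategy).
u_1 = {
    ("One", "One"): 2, ("One", "Two"): -3,
    ("Two", "One"): -3, ("Two", "Two"): 4,
}

# Assumed belief: Agent 1 thinks Agent 2 plays One or Two with equal probability.
beliefs = {"One": 0.5, "Two": 0.5}

print(best_response(u_1, ["One", "Two"], beliefs))  # Two
```

Under this belief, One has expected utility -0.5 and Two has 0.5, so Two is the rational choice.
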
Dominated Strategies
• Definition: A strategy $s_i$ is strictly dominated if
  $\exists s_i',\ \forall s_{-i},\ u_i(s_i, s_{-i}) < u_i(s_i', s_{-i})$
• A rational agent will never play a strictly dominated strategy!
  – This allows us to solve some games!

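The definition translates almost directly into a check. The sketch below assumes a two-agent game with utilities keyed by (own strategy, opponent strategy); the function name and the short labels `C` (Confess) and `D` (Don't Confess) are illustrative.

```python
def is_strictly_dominated(u_i, s_i, own_strategies, other_strategies):
    """True if some alternative strategy is strictly better than s_i
    against every opponent strategy."""
    return any(
        all(u_i[(s_i, s_o)] < u_i[(alt, s_o)] for s_o in other_strategies)
        for alt in own_strategies
        if alt != s_i
    )

# Row player's utilities in the Prisoner's Dilemma.
u_row = {
    ("C", "C"): -5, ("C", "D"): 0,
    ("D", "C"): -10, ("D", "D"): -1,
}

print(is_strictly_dominated(u_row, "D", ["C", "D"], ["C", "D"]))  # True
print(is_strictly_dominated(u_row, "C", ["C", "D"], ["C", "D"]))  # False
```
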
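Example: Prisoner’s Dilemma

Original game:

                     Confess     Don’t Confess
   Confess           -5,-5       0,-10
   Don’t Confess     -10,0       -1,-1

After eliminating the row player’s dominated strategy (Don’t Confess):

                     Confess     Don’t Confess
   Confess           -5,-5       0,-10

After eliminating the column player’s dominated strategy (Don’t Confess):

                     Confess
   Confess           -5,-5

Equilibrium outcome: (Confess, Confess)
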
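The elimination steps can be automated for two-player games. This is a minimal sketch; `iterated_elimination` is a hypothetical helper, and payoffs are keyed by (row strategy, column strategy) with values (row utility, column utility).

```python
def iterated_elimination(payoffs, rows, cols):
    """Repeatedly remove strictly dominated rows and columns
    until no more can be removed."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            # r is dominated if some other row is strictly better in every column.
            if any(all(payoffs[(r, c)][0] < payoffs[(alt, c)][0] for c in cols)
                   for alt in rows if alt != r):
                rows.remove(r)
                changed = True
        for c in cols[:]:
            # c is dominated if some other column is strictly better in every row.
            if any(all(payoffs[(r, c)][1] < payoffs[(r, alt)][1] for r in rows)
                   for alt in cols if alt != c):
                cols.remove(c)
                changed = True
    return rows, cols

pd = {
    ("Confess", "Confess"): (-5, -5), ("Confess", "Don't"): (0, -10),
    ("Don't", "Confess"): (-10, 0), ("Don't", "Don't"): (-1, -1),
}

print(iterated_elimination(pd, ["Confess", "Don't"], ["Confess", "Don't"]))
# (['Confess'], ['Confess'])
```
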
Strict Dominance does not capture the whole picture

           A        B        C
   A       0,4      4,0      5,3
   B       4,0      0,4      5,3
   C       3,5      3,5      6,6

• What strict dominance eliminations can we do? None…
• So what should the players of this game do?

Nash Equilibrium
• Sometimes an agent’s best response depends on the strategies other agents are playing
• A strategy profile $s^*$ is a Nash equilibrium if no agent has an incentive to deviate from its strategy given that the others do not deviate:
  $\forall i,\ \forall s_i,\ u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*)$

Nash Equilibrium
• Equivalently, $s^*$ is a N.E. iff $\forall i,\ s_i^* = \arg\max_{s_i} u_i(s_i, s_{-i}^*)$

           A        B        C
   A       0,4      4,0      5,3
   B       4,0      0,4      5,3
   C       3,5      3,5      6,6

• (C,C) is a N.E. because
  $u_1(C,C) = \max\{u_1(A,C), u_1(B,C), u_1(C,C)\}$
  AND
  $u_2(C,C) = \max\{u_2(C,A), u_2(C,B), u_2(C,C)\}$

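For small two-player games, the best-response condition can be checked by brute force over all strategy profiles. The sketch below is an illustration only; `pure_nash_equilibria` is a hypothetical helper, and payoffs map (row strategy, column strategy) to (row utility, column utility).

```python
from itertools import product

def pure_nash_equilibria(payoffs, rows, cols):
    """Return every profile where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[(r, c)]
        row_ok = all(payoffs[(alt, c)][0] <= u1 for alt in rows)
        col_ok = all(payoffs[(r, alt)][1] <= u2 for alt in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

game = {
    ("A", "A"): (0, 4), ("A", "B"): (4, 0), ("A", "C"): (5, 3),
    ("B", "A"): (4, 0), ("B", "B"): (0, 4), ("B", "C"): (5, 3),
    ("C", "A"): (3, 5), ("C", "B"): (3, 5), ("C", "C"): (6, 6),
}

print(pure_nash_equilibria(game, "ABC", "ABC"))  # [('C', 'C')]
```

The same function applied to the Baseball or Soccer game returns the two profiles in which both agents pick the same sport.
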
Another example

Coordination Game

           B        S
   B       2,1      0,0
   S       0,0      1,2

• 2 pure strategy Nash equilibria

Yet another example

                    Agent 2
                    One      Two
Agent 1    One      2,-2     -3,3
           Two      -3,3     4,-4

• There is no PURE strategy Nash equilibrium for this game

(Mixed) Nash Equilibria
• Mixed strategy $\sigma_i$:
  – $\sigma_i \in \Sigma_i$ defines a probability distribution over $S_i$
• Strategy profile: $\sigma = (\sigma_1, \dots, \sigma_n)$
• Expected utility: $u_i(\sigma) = \sum_{s \in S} \left( \prod_j \sigma_j(s_j) \right) u_i(s)$
• Nash equilibrium: $\sigma^*$ is a (mixed) Nash equilibrium if
  $\forall i,\ \forall \sigma_i \in \Sigma_i,\ u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*)$

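The expected-utility formula sums over every pure strategy profile, weighting each by the product of the probabilities the agents assign to their part of it. A minimal sketch, with agents indexed from 0 and a uniform mixed profile chosen purely as an example:

```python
from itertools import product

def expected_utility(i, sigma, strategies, payoffs):
    """u_i(sigma): sum over profiles s of (prod_j sigma_j(s_j)) * u_i(s)."""
    total = 0.0
    for profile in product(*strategies):
        prob = 1.0
        for j, s_j in enumerate(profile):
            prob *= sigma[j][s_j]
        total += prob * payoffs[profile][i]
    return total

strategies = [("One", "Two"), ("One", "Two")]
payoffs = {
    ("One", "One"): (2, -2), ("One", "Two"): (-3, 3),
    ("Two", "One"): (-3, 3), ("Two", "Two"): (4, -4),
}

# Assumed mixed profile: both agents randomize uniformly.
sigma = [{"One": 0.5, "Two": 0.5}, {"One": 0.5, "Two": 0.5}]

print(expected_utility(0, sigma, strategies, payoffs))  # 0.0
```
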
Yet another example

                 B
                 One      Two
A       One      2,-2     -3,3
        Two      -3,3     4,-4

• p = Pr(One) and q = Pr(One) are the probabilities with which the two agents play One
• How do we determine p and q?

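One standard way to answer this uses indifference: in an equilibrium where an agent mixes over both strategies, the opponent's mix must make both of that agent's pure strategies equally good. The sketch below solves that condition for a 2x2 game. The helper name `indifference_mix` is hypothetical, and taking p as A's probability and q as B's is an assumption (in this game both come out the same). The result is only a valid probability if it lies in [0, 1].

```python
from fractions import Fraction

def indifference_mix(a, b, c, d):
    """Probability x on the mixer's first strategy such that
    x*a + (1-x)*b == x*c + (1-x)*d, i.e. the other agent is indifferent.

    a, b: other agent's payoffs for its first strategy when the mixer
          plays its first / second strategy.
    c, d: the same for the other agent's second strategy.
    """
    denom = (a - b) - (c - d)
    if denom == 0:
        raise ValueError("no unique indifference point")
    return Fraction(d - b, denom)

# p: A's probability of One, chosen so that B is indifferent.
# B's payoffs: B plays One -> (-2 if A plays One, 3 if A plays Two);
#              B plays Two -> (3 if A plays One, -4 if A plays Two).
p = indifference_mix(-2, 3, 3, -4)

# q: B's probability of One, chosen so that A is indifferent.
# A's payoffs: A plays One -> (2, -3); A plays Two -> (-3, 4).
q = indifference_mix(2, -3, -3, 4)

print(p, q)  # 7/12 7/12
```
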
Exercise

           B        S
   B       2,1      0,0
   S       0,0      1,2

• This game has 3 Nash equilibria (2 pure strategy NE and 1 mixed strategy NE). Find them.

Mixed Nash Equilibrium
• Theorem (Nash, 1950): Every game in which the strategy sets $S_1, \dots, S_n$ have a finite number of elements has a mixed strategy equilibrium.
John Nash, Nobel Prize in Economics (1994)

Other Useful Theorems
• Thm: In an n-player pure strategy game $G = (S_1, \dots, S_n; u_1, \dots, u_n)$, if iterated elimination of strictly dominated strategies eliminates all but the strategies $(s_1^*, \dots, s_n^*)$, then this strategy profile is the unique NE of the game.
• Thm: Any NE will survive iterated elimination of strictly dominated strategies.

Nash Equilibrium
• Interpretations:
  – Focal points, self-enforcing agreements, stable social convention, consequence of rational inference…
• Criticisms:
  – They may not be unique
    • Ways of overcoming this: refinements of the equilibrium concept, mediation, learning
  – They may be hard to find
  – People don’t always behave the way equilibria would predict (ultimatum games and notions of fairness, …)

Bayesian Games
• What should player A do?

                    Player B
                    L        R
Player A    U       3,?      -2,?
            D       0,?      6,?

• Question: When does such a situation arise?

Bayesian Games
• Hockey lover gets 2 units for watching hockey and 1 unit for watching curling
• Curling lover gets 2 units for watching curling and 1 unit for watching hockey
• Pat is a hockey lover
• Pat thinks that Chris is probably a hockey lover, but is not sure

With 2/3 chance:

                 Chris
                 H        C
Pat      H       2,2      0,0
         C       0,0      1,1

With 1/3 chance:

                 Chris
                 H        C
Pat      H       2,1      0,0
         C       0,0      1,2

Bayesian Games
• In a Bayesian game each player has a type
• All players know their own type, but have only a probability distribution over their opponents’ types
• Game G
  – Set of action spaces: $A_1, \dots, A_n$
  – Set of type spaces: $T_1, \dots, T_n$
  – Set of beliefs: $P_1, \dots, P_n$
  – Set of payoff functions: $u_1, \dots, u_n$
  – $P_i(t_{-i} \mid t_i)$ is the probability distribution over the types of the other players, given that player i has type $t_i$
  – $u_i(a_1, \dots, a_n; t_i)$ is the utility (payoff) to agent i if each player j chooses action $a_j$ and agent i has type $t_i \in T_i$

Knowledge Assumptions (Who knows what)
• All players know the $A_i$’s, $T_i$’s, $P_i$’s and $u_i$’s
• The i’th player knows $t_i$ but not $t_1, t_2, \dots, t_{i-1}, t_{i+1}, \dots, t_n$
• All players know that all players know the above
• And they know that they know that they know… (common knowledge)
• Def: A strategy $s_i(t_i)$ in a Bayesian game is a mapping from $T_i$ to $A_i$ (i.e. it specifies what action should be taken for each type)

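To illustrate a strategy as a mapping from types to actions, the sketch below uses the Pat and Chris example. The belief (2/3 hockey lover, 1/3 curling lover) and Pat's payoffs come from that example; the particular strategy assigned to Chris is an assumption made only to show the computation, not a claimed equilibrium.

```python
from fractions import Fraction

# Pat's belief over Chris's type.
belief = {"hockey": Fraction(2, 3), "curling": Fraction(1, 3)}

# Pat's payoff, keyed by (Pat's action, Chris's action). Pat is a hockey
# lover, so Pat's payoffs are the same in both matrices.
pat_payoff = {("H", "H"): 2, ("H", "C"): 0, ("C", "H"): 0, ("C", "C"): 1}

# Assumed strategy for Chris: a mapping from type to action.
chris_strategy = {"hockey": "H", "curling": "C"}

def pat_expected_utility(action):
    """Pat's expected payoff for an action, averaging over Chris's types."""
    return sum(
        prob * pat_payoff[(action, chris_strategy[chris_type])]
        for chris_type, prob in belief.items()
    )

print(pat_expected_utility("H"), pat_expected_utility("C"))  # 4/3 1/3
```

Against this assumed strategy, Pat does better by choosing hockey.
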