15-382 COLLECTIVE INTELLIGENCE – S18
LECTURE 27: GAME THEORY 2
INSTRUCTOR: GIANNI A. DI CARO
THE PROFESSOR'S DILEMMA

                          Class
                    Listen         Sleep
Professor
  Make effort     10^6, 10^6      -10, 0
  Slack off         0, -10          0, 0

Simultaneous move · Non-cooperative game · Complete information · Imperfect information
Solution concept: predict how the game will be played with rational agents (Prediction ≡ Solution). Nash: equilibrium concept.
Dominant strategies? No. Listen is not dominant for the Class: if the Professor slacks off, Sleep provides a higher payoff.
(A dominant strategy is one that is best no matter what the other player's strategy is.)
15781 Fall 2016: Lecture 22
NASH EQUILIBRIUM (1951)
Can we find an equilibrium even in the absence of a dominant strategy?
At equilibrium, each player's strategy is a best response to the strategies of the others.
Formally, a Nash equilibrium is a strategy profile $s = (s_1, \dots, s_n) \in S^n$ such that
$$\forall i \in N,\ \forall s_i' \in S:\quad u_i(s) \ge u_i(s_i', s_{-i})$$
John F. Nash, Nobel Prize in Economics, 1994
NASH EQUILIBRIUM
- In equilibrium, each player is playing the strategy that is a "best response" to the strategies of the other players. No one has an incentive to change strategy given the strategy choices of the others.
- A NE is a strategy profile in which each player's strategy is optimal given the strategies of all other players.
- A Nash equilibrium exists when no player has a profitable unilateral deviation.
- Nash equilibria are self-enforcing: at a Nash equilibrium, players have no desire to move because no unilateral move would make them better off → equilibrium in the policy space.
- Dominant strategy ⟹ Nash equilibrium: all solutions in dominant strategies are also Nash equilibria, but the converse is not necessarily (and not usually) true.
NASH EQUILIBRIUM
Equilibrium is not:
- The best possible outcome of the game. The equilibrium in the one-shot prisoner's dilemma is for both players to confess, which is not the best possible outcome (not Pareto optimal).
- A situation where players always choose the same action. Sometimes equilibrium will involve changing action choices (mixed strategy equilibrium).
NASH EQUILIBRIUM
How many Nash equilibria does the Professor's Dilemma have?

                    Listen         Sleep
  Make effort     10^6, 10^6      -10, 0
  Slack off         0, -10          0, 0

Answer: (M, L) and (S, S).
NASH EQUILIBRIA: HOW DO WE FIND THEM?
Nash equilibrium: a play of the game where each strategy is a best reply to the given strategy of the other.
Let's examine all the possible pure strategy profiles and check, for each profile (X, Y), whether one player could improve its payoff given the strategy of the other.
- (M, L)? If Prof plays M, then L is the best reply given M. Neither player can increase its payoff by choosing a different action.
  o (S, L)? If Prof plays S, then S is the best reply given S, not L.
  o (M, S)? If Prof plays M, then L is the best reply given M, not S.
- (S, S)? If Prof plays S, then S is the best reply given S. Neither player can increase its payoff by choosing a different action.
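The profile-by-profile check above can be sketched in a few lines of Python. This is only an illustration: the function name `pure_nash_equilibria` and the dictionary layout of the payoffs are assumptions made for this example, not part of the lecture.

```python
from itertools import product

def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return the pure profiles where neither player gains by deviating alone.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    equilibria = []
    for r, c in product(row_actions, col_actions):
        u_row, u_col = payoffs[(r, c)]
        # Row player: no other row does strictly better against column c.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in row_actions)
        # Column player: no other column does strictly better against row r.
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Professor's dilemma: Professor plays M (make effort) or S (slack off),
# Class plays L (listen) or S (sleep).
professors_dilemma = {
    ("M", "L"): (10**6, 10**6),
    ("M", "S"): (-10, 0),
    ("S", "L"): (0, -10),
    ("S", "S"): (0, 0),
}

print(pure_nash_equilibria(professors_dilemma, ["M", "S"], ["L", "S"]))
```

Enumeration costs one pass over all profiles, which is fine for small matrix games like the ones in this lecture.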
NASH EQUILIBRIUM FOR PRISONER'S DILEMMA

                            Prisoner B
                     Don't confess    Confess
Prisoner A
  Don't confess         -1, -1         -9, 0
  Confess                0, -9         -6, -6
COORDINATION GAME: STAG HUNT
(Originally from J.J. Rousseau)
Two equilibria, at (stag, stag) and (rabbit, rabbit) → players' optimal strategies depend on their expectations of what the other player may do.
This game has been used as an analogy for social cooperation and mutual trust.
In the Prisoner's dilemma, the Nash equilibrium corresponds to defecting, not cooperating!
COMPETITION GAME
Both players simultaneously choose an integer from 0 to 3.
They both win the smaller of the two numbers in points.
In addition, if one player chooses a larger number than the other, then it has to give up two points to the other.
Does the (unique) NE at (0,0) make sense?
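The uniqueness claim can be checked by brute force. The sketch below encodes the rules as read from the slide (both players receive the smaller number, then the player with the larger number transfers two points to the other); the names `payoff` and `is_pure_ne` are hypothetical helpers for this example.

```python
from itertools import product

CHOICES = range(4)  # the integers 0..3

def payoff(a, b):
    """Payoffs (to the player choosing a, to the player choosing b)."""
    low = min(a, b)
    if a == b:
        return (low, low)
    if a > b:
        # The player with the larger number hands two points to the other.
        return (low - 2, low + 2)
    return (low + 2, low - 2)

def is_pure_ne(a, b):
    """True if neither player gains from a unilateral change of number."""
    u_a, u_b = payoff(a, b)
    a_ok = all(payoff(a2, b)[0] <= u_a for a2 in CHOICES)
    b_ok = all(payoff(a, b2)[1] <= u_b for b2 in CHOICES)
    return a_ok and b_ok

equilibria = [(a, b) for a, b in product(CHOICES, CHOICES) if is_pure_ne(a, b)]
print(equilibria)     # pure equilibria found by enumeration
print(payoff(3, 3))   # what both players would earn by jointly choosing 3
```

Comparing the two printed lines shows why the question is posed: the only stable profile pays each player less than jointly choosing 3 would, but from any (k, k) with k > 0 a player gains by undercutting to k - 1.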
ROCK-PAPER-SCISSORS

          R        P        S
  R     0, 0    -1, 1    1, -1
  P     1, -1    0, 0   -1, 1
  S    -1, 1    1, -1    0, 0

Nash equilibrium? Is there a pure strategy as best response?
ROCK-PAPER-SCISSORS

          R        P        S
  R     0, 0    -1, 1    1, -1
  P     1, -1    0, 0   -1, 1
  S    -1, 1    1, -1    0, 0

For every pure strategy profile (X, Y), there is a different strategy choice that increases the payoff of one player.
- E.g., for profile (P, R), player B can get a higher payoff by playing S instead of R.
- E.g., for profile (S, R), player A can get a higher payoff by playing P instead of S.
No (pure) Nash equilibria: no pure strategy profile is stable, because players have an incentive to keep switching their strategy.
Best response: randomize!
MIXED STRATEGIES
Mixed strategy: a probability distribution over (pure) strategies.
The mixed strategy of player $i \in N$ is $x_i$, where $x_i(s_i) = \Pr[i \text{ plays } s_i]$ (e.g., $x_i(R) = 0.3$, $x_i(P) = 0.5$, $x_i(S) = 0.2$).
The (expected) utility of player $i \in N$ is
$$u_i(x_1, \dots, x_n) = \sum_{(s_1, \dots, s_n) \in S^n} u_i(s_1, \dots, s_n) \cdot \prod_{j=1}^{n} x_j(s_j)$$
Here $(x_1, \dots, x_n)$ is the mixed strategy profile, $(s_1, \dots, s_n)$ is a pure strategy profile, $u_i(s_1, \dots, s_n)$ is the utility of the pure strategy profile, and the product is the joint probability of the pure strategy profile given the mixed profile.
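The expected-utility sum translates almost literally into code. The sketch below uses rock-paper-scissors utilities and the example mix (0.3, 0.5, 0.2) from the text. The opponent who always plays Rock is a hypothetical choice made only to have something to evaluate against, and the function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ["R", "P", "S"]
BEATS = {("R", "S"), ("P", "R"), ("S", "P")}  # (winner, loser) pairs

def rps_utility(player, profile):
    """Utility of a pure strategy profile (s_1, s_2) for player 0 or 1."""
    mine, other = profile[player], profile[1 - player]
    if mine == other:
        return 0
    return 1 if (mine, other) in BEATS else -1

def expected_utility(player, mixed_profile, utility=rps_utility):
    """Sum over pure profiles of utility times the product of probabilities."""
    total = Fraction(0)
    for profile in product(ACTIONS, repeat=len(mixed_profile)):
        prob = Fraction(1)
        for x_j, s_j in zip(mixed_profile, profile):
            prob *= x_j[s_j]  # joint probability of this pure profile
        total += utility(player, profile) * prob
    return total

# Example mix from the text, written as exact fractions.
x_i = {"R": Fraction(3, 10), "P": Fraction(1, 2), "S": Fraction(1, 5)}
# Hypothetical opponent that always plays Rock.
always_rock = {"R": Fraction(1), "P": Fraction(0), "S": Fraction(0)}

print(expected_utility(0, [x_i, always_rock]))
```

Exact fractions are used instead of floats so that later equality checks between expected payoffs are not disturbed by rounding.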
EXERCISE: MIXED NE

          R        P        S
  R     0, 0    -1, 1    1, -1
  P     1, -1    0, 0   -1, 1
  S    -1, 1    1, -1    0, 0

- Player 1 plays $(\tfrac12, \tfrac12, 0)$, player 2 plays $(0, \tfrac12, \tfrac12)$. What is $u_1$?
- Both players play $(\tfrac13, \tfrac13, \tfrac13)$. What is $u_1$?
EXERCISE: MIXED NE
Player 1 plays $(\tfrac12, \tfrac12, 0)$, player 2 plays $(0, \tfrac12, \tfrac12)$. What is $u_1$?
In the second case, because of symmetry, the utility is zero: it's a zero-sum game.
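A worked version of the first case, computed from the rock-paper-scissors payoff table and reading each mix in the order (R, P, S), which is assumed from the table layout. Only four pure profiles have non-zero joint probability, each with probability $\tfrac12 \cdot \tfrac12 = \tfrac14$:

```latex
\begin{aligned}
u_1 &= \tfrac14\,u_1(R,P) + \tfrac14\,u_1(R,S) + \tfrac14\,u_1(P,P) + \tfrac14\,u_1(P,S) \\
    &= \tfrac14(-1) + \tfrac14(1) + \tfrac14(0) + \tfrac14(-1) \\
    &= -\tfrac14 .
\end{aligned}
```

Since the game is zero-sum, player 2's expected utility in this profile is $+\tfrac14$.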
MIXED STRATEGIES EQUILIBRIUM IS NASH
The mixed strategy profile $x^*$ in a strategic game is a mixed strategy Nash equilibrium if
$$u_i(x_i^*, x_{-i}^*) \ge u_i(x_i, x_{-i}^*) \quad \forall x_i \text{ and } i$$
$u_i(x)$ is player $i$'s expected utility with mixed strategy profile $x$.
→ Same definition as in the case of pure strategies, where $u_i$ was the utility of a pure strategy instead of a mixed strategy.
MIXED STRATEGIES NASH EQUILIBRIUM
Using best response functions, $x^*$ is a mixed strategy NE iff $x_i^*$ is a best response to $x_{-i}^*$ for every player $i$.
If a mixed strategy $x_i^*$ is a best response, then each of the pure strategies in the mix must itself be a best response: they must all yield the same expected payoff (otherwise it would make sense to choose only the one with the better payoff).
→ If a mixed strategy is a best response for player $i$, then the player must be indifferent among the pure strategies in the mix.
E.g., in the RPS game, if the mixed strategy of player $i$ assigns non-zero probabilities $p_R$ to playing R and $p_P$ to playing P, then $i$'s expected utility for playing R or P has to be the same.
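The indifference condition gives a direct test for a candidate mixed profile in a two-player game: every pure strategy played with positive probability must attain the maximum expected payoff against the other player's mix. A minimal sketch for rock-paper-scissors follows; the function names and the list-based encoding in the order (R, P, S) are assumptions of this example.

```python
from fractions import Fraction as F

# Row player's RPS payoffs, indexed [row action][column action], order R, P, S.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
# Zero-sum game: the column player's payoff is the negative of the row player's.
B = [[-a for a in row] for row in A]

def pure_payoffs_row(A, x2):
    """Expected payoff of each pure row action against the column mix x2."""
    return [sum(a * q for a, q in zip(row, x2)) for row in A]

def pure_payoffs_col(B, x1):
    """Expected payoff of each pure column action against the row mix x1."""
    return [sum(B[i][j] * x1[i] for i in range(len(x1)))
            for j in range(len(B[0]))]

def support_is_best(mix, payoffs):
    """Every action played with positive probability must be a best response."""
    best = max(payoffs)
    return all(u == best for p, u in zip(mix, payoffs) if p > 0)

def is_mixed_ne(A, B, x1, x2):
    """Check the best-response condition for both players."""
    return (support_is_best(x1, pure_payoffs_row(A, x2))
            and support_is_best(x2, pure_payoffs_col(B, x1)))

# Candidate profile: player 1 mixes R and P, player 2 mixes P and S.
x1 = [F(1, 2), F(1, 2), F(0)]
x2 = [F(0), F(1, 2), F(1, 2)]
print(pure_payoffs_row(A, x2))   # player 1's payoff for R, P, S against x2
candidate_is_ne = is_mixed_ne(A, B, x1, x2)
print(candidate_is_ne)
```

Checking only pure deviations is enough, because the payoff of any mixed deviation is a weighted average of the pure-strategy payoffs.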
EXERCISE: MIXED NE
Which is a NE?

          R        P        S
  R     0, 0    -1, 1    1, -1
  P     1, -1    0, 0   -1, 1
  S    -1, 1    1, -1    0, 0

1. $\left((\tfrac12, \tfrac12, 0),\ (\tfrac12, \tfrac12, 0)\right)$
2. $\left((\tfrac12, \tfrac12, 0),\ (\tfrac12, 0, \tfrac12)\right)$
3. $\left((\tfrac13, \tfrac13, \tfrac13),\ (\tfrac13, \tfrac13, \tfrac13)\right)$
4. $\left((\tfrac13, \tfrac23, 0),\ (\tfrac23, 0, \tfrac13)\right)$
Any other NE?
NASH'S THEOREM
Theorem [Nash, 1950]: In any game with a finite number of strategies there exists at least one (possibly mixed) Nash equilibrium.

                  Player B
                Left     Right
Player A
  Up            1, 2     0, 4
  Down          0, 5     3, 2

This game has no pure strategy Nash equilibria, but it does have a Nash equilibrium in mixed strategies. How is it computed?
COMPUTATION OF MS NE

                  Player B
                Left     Right
Player A
  Up            1, 2     0, 4
  Down          0, 5     3, 2

Player A plays Up with probability $p_U$ and plays Down with probability $1 - p_U$.
Player B plays Left with probability $p_L$ and plays Right with probability $1 - p_L$.
COMPUTATION OF MS NE

                        Player B
                    L, p_L     R, 1 - p_L
Player A
  U, p_U             1, 2        0, 4
  D, 1 - p_U         0, 5        3, 2
COMPUTATION OF MS NE
If B plays Left, its expected utility is $2 p_U + 5 (1 - p_U)$.
COMPUTATION OF MS NE
If B plays Right, its expected utility is $4 p_U + 2 (1 - p_U)$.
COMPUTATION OF MS NE
If $2 p_U + 5 (1 - p_U) > 4 p_U + 2 (1 - p_U)$, then B would play only Left, which would be a pure strategy. But there are no (pure) Nash equilibria in which B plays only Left.
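By the indifference argument for mixed best responses, B mixes only if Left and Right give it the same expected utility, and likewise for A with Up and Down. The sketch below solves those two equalities for the game above. It assumes a fully mixed equilibrium exists (so the denominator is non-zero and the result lies between 0 and 1), and `indifference_probability` is a hypothetical helper name.

```python
from fractions import Fraction

# Payoffs indexed [A's action][B's action]; A: 0 = Up, 1 = Down; B: 0 = Left, 1 = Right.
U_A = [[1, 0], [0, 3]]
U_B = [[2, 4], [5, 2]]

def indifference_probability(opponent_payoff):
    """Probability p on a player's first action that makes the opponent indifferent.

    opponent_payoff[k][m] is the opponent's payoff when this player plays
    action k and the opponent plays action m. Solves
        p*u[0][0] + (1-p)*u[1][0] == p*u[0][1] + (1-p)*u[1][1]
    for p. Assumes the denominator is non-zero (interior solution).
    """
    u = opponent_payoff
    numerator = u[1][1] - u[1][0]
    denominator = u[0][0] - u[1][0] - u[0][1] + u[1][1]
    return Fraction(numerator, denominator)

# A's mix must make B indifferent between Left and Right.
p_U = indifference_probability(U_B)
# B's mix must make A indifferent: transpose U_A so it is indexed by B's action first.
U_A_by_B_action = [list(col) for col in zip(*U_A)]
p_L = indifference_probability(U_A_by_B_action)

print(p_U, p_L)
# B's expected utility for Left and for Right at p_U (equal by construction).
print(2 * p_U + 5 * (1 - p_U), 4 * p_U + 2 * (1 - p_U))
```

Note that each player's probability is pinned down by the other player's payoffs, not by its own.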