
Game Theory 2, Instructor: Gianni A. Di Caro



  1. 15-382 Collective Intelligence – S18. Lecture 27: Game Theory 2. Instructor: Gianni A. Di Caro

  2. The Professor's Dilemma
     Payoffs are (Professor, Class); rows are the Professor's actions, columns the Class's.
                      Listen        Sleep
       Make effort    10^6, 10^6    -10, 0
       Slack off      0, -10        0, 0
     • Simultaneous move
     • Non-cooperative game
     • Complete information
     • Imperfect information
     • Solution concept: predict how the game will be played with rational agents
     • Prediction ≡ Solution
     • Nash: equilibrium concept
     Dominant strategies? No. A dominant strategy is one that is best no matter what the other player's strategy is. Listen is not dominant for the Class: if the Professor slacks off, Sleep gives the Class a higher payoff (0 instead of -10).
     15781 Fall 2016: Lecture 22

  3. Nash Equilibrium (1951)
     • Can we find an equilibrium even in the absence of a dominant strategy?
     • At equilibrium, each player's strategy is a best response to the strategies of the others.
     • Formally, a Nash equilibrium is a strategy profile $s = (s_1, \ldots, s_n) \in S^n$ such that
       $\forall i \in N,\ \forall s_i' \in S:\quad u_i(s) \ge u_i(s_i', s_{-i})$
       where $N$ is the set of players, $S$ the set of strategies, $u_i$ the utility of player $i$, and $s_{-i}$ the strategies of all players other than $i$.
     John F. Nash, Nobel Prize in Economics, 1994

  4. Nash Equilibrium
     • In equilibrium, each player is playing the strategy that is a "best response" to the strategies of the other players. No one has an incentive to change strategy given the strategy choices of the others.
     • A NE is an equilibrium where each player's strategy is optimal given the strategies of all other players.
     • A Nash equilibrium exists when no player has a profitable unilateral deviation.
     • Nash equilibria are self-enforcing: when players are at a Nash equilibrium they have no desire to move, because no one can become better off by deviating alone → equilibrium in the policy space.
     • Dominant strategy ⟹ Nash equilibrium: all solutions in dominant strategies are also Nash equilibria, but the converse is not necessarily (and not usually) true.

  5. Nash Equilibrium
     Equilibrium is not:
     • The best possible outcome of the game. The equilibrium in the one-shot prisoner's dilemma is for both players to confess, which is not the best possible outcome (it is not Pareto optimal).
     • A situation where players always choose the same action. Sometimes equilibrium involves changing action choices (mixed strategy equilibrium).

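  6. Nash Equilibrium
     • How many Nash equilibria does the Professor's Dilemma have?
     Payoffs are (Professor, Class); rows are the Professor's actions, columns the Class's.
                      Listen        Sleep
       Make effort    10^6, 10^6    -10, 0
       Slack off      0, -10        0, 0
     Answer (pure strategies): two, (Make effort, Listen) and (Slack off, Sleep).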

  7. Nash Equilibria: How Do We Find Them?
     • Nash equilibrium: a play of the game in which each player's strategy is a best reply to the given strategy of the other.
     • Examine every pure strategy profile (X, Y), where X is the Professor's action (M = Make effort, S = Slack off) and Y is the Class's action (L = Listen, S = Sleep), and check whether one player could improve its payoff given the strategy of the other.
     • (M, L)? If the Professor plays M, then L is the Class's best reply, and M is the Professor's best reply to L. Neither player can increase its payoff by choosing a different action: Nash equilibrium.
     • (S, L)? If the Professor plays S, the Class's best reply is S, not L: not an equilibrium.
     • (M, S)? If the Professor plays M, the Class's best reply is L, not S: not an equilibrium.
     • (S, S)? If the Professor plays S, then S is the Class's best reply, and S is the Professor's best reply to S. Neither player can increase its payoff by choosing a different action: Nash equilibrium.
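The profile-by-profile check for the Professor's Dilemma can be sketched in Python. This is a minimal illustration, not part of the lecture: the dictionary layout and the function name `pure_nash_equilibria` are assumptions, and the payoffs are copied from the slide's table.

```python
from itertools import product

# Payoffs (professor, class), copied from the Professor's Dilemma table.
PAYOFFS = {
    ("Make effort", "Listen"): (10**6, 10**6),
    ("Make effort", "Sleep"): (-10, 0),
    ("Slack off", "Listen"): (0, -10),
    ("Slack off", "Sleep"): (0, 0),
}
PROF_ACTIONS = ("Make effort", "Slack off")
CLASS_ACTIONS = ("Listen", "Sleep")


def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return every pure profile from which no single player gains by deviating."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        u_row, u_col = payoffs[(r, c)]
        # Row player keeps c fixed and tries every alternative row action.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in row_actions)
        # Column player keeps r fixed and tries every alternative column action.
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, PROF_ACTIONS, CLASS_ACTIONS))
```

The check uses weak inequalities, so a player who is merely indifferent between two actions does not break the equilibrium.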

  8. Nash Equilibrium for the Prisoner's Dilemma
     Payoffs are (Prisoner A, Prisoner B); rows are A's actions, columns B's.
                        Don't confess    Confess
       Don't confess    -1, -1           -9, 0
       Confess          0, -9            -6, -6
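For this table, one way to see where the equilibrium lies is to test each prisoner for a strictly dominant action. The sketch below is an illustration under assumed names (`dominant_action`, the `"D"`/`"C"` labels); only the payoffs come from the slide.

```python
# Payoffs (A, B); "D" = don't confess, "C" = confess.
PAYOFFS = {
    ("D", "D"): (-1, -1),
    ("D", "C"): (-9, 0),
    ("C", "D"): (0, -9),
    ("C", "C"): (-6, -6),
}
ACTIONS = ("D", "C")


def dominant_action(payoffs, actions, player):
    """Return the strictly dominant action of `player` (0 = A, 1 = B), or None."""

    def utility(own, other):
        # Profiles are keyed as (A's action, B's action).
        profile = (own, other) if player == 0 else (other, own)
        return payoffs[profile][player]

    for a in actions:
        # `a` is dominant if it beats every alternative against every opponent action.
        if all(utility(a, o) > utility(alt, o)
               for alt in actions if alt != a
               for o in actions):
            return a
    return None


print(dominant_action(PAYOFFS, ACTIONS, 0), dominant_action(PAYOFFS, ACTIONS, 1))
```

Both calls return `"C"`, so (Confess, Confess) is a solution in dominant strategies and therefore a Nash equilibrium, even though (Don't confess, Don't confess) would leave both prisoners better off.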

  9. Coordination Game: Stag Hunt (originally from J.J. Rousseau)
     • Two equilibria, at (stag, stag) and (rabbit, rabbit) → each player's optimal strategy depends on its expectation of what the other player will do.
     • This game has been used as an analogy for social cooperation and mutual trust.
     • In the Prisoner's Dilemma, by contrast, the Nash equilibrium corresponds to mutual defection, not cooperation!

  10. Competition Game
     • Both players simultaneously choose an integer from 0 to 3.
     • They both win the smaller of the two numbers in points.
     • In addition, if one player chooses a larger number than the other, then it has to give up two points to the other.
     Does the (unique) NE at (0,0) make sense?
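The uniqueness claim can be checked by brute force. The sketch below assumes the two-point penalty is a transfer (the player with the larger number loses 2, the other gains 2), which is how the rules above read; the function names are illustrative.

```python
from itertools import product

CHOICES = range(4)  # the integers 0, 1, 2, 3


def payoff(mine, theirs):
    """Points for the player choosing `mine` against `theirs`."""
    base = min(mine, theirs)
    if mine > theirs:
        return base - 2  # chose the larger number: gives up two points
    if mine < theirs:
        return base + 2  # chose the smaller number: receives two points
    return base


def pure_nash_equilibria():
    """Return all pure profiles where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(CHOICES, repeat=2):
        a_ok = all(payoff(a2, b) <= payoff(a, b) for a2 in CHOICES)
        b_ok = all(payoff(b2, a) <= payoff(b, a) for b2 in CHOICES)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria())
```

Under that reading of the rules, undercutting the other player by one always pays, which is why every profile except (0, 0) unravels even though (3, 3) would give both players more.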

  11. Rock-Paper-Scissors
     Payoffs are (row player, column player).
            R        P        S
       R    0, 0     -1, 1    1, -1
       P    1, -1    0, 0     -1, 1
       S    -1, 1    1, -1    0, 0
     Nash equilibrium? Is there a pure strategy that is a best response?

  12. Rock-Paper-Scissors
     Payoffs are (player A, player B); rows are A's actions, columns B's.
            R        P        S
       R    0, 0     -1, 1    1, -1
       P    1, -1    0, 0     -1, 1
       S    -1, 1    1, -1    0, 0
     • For every pure strategy profile (X, Y), there is a different strategy choice that increases the payoff of one of the players.
     • E.g., for profile (P, R), player B can get a higher payoff by playing S instead of R.
     • E.g., for profile (S, R), player A can get a higher payoff by playing P instead of S.
     • No pure strategy profile can settle into an equilibrium: the players have an incentive to keep switching their strategy.
     No (pure) Nash equilibria. Best response: randomize!

  13. Mixed Strategies
     • Mixed strategy: a probability distribution over (pure) strategies.
     • The mixed strategy of player $i \in N$ is $x_i$, where $x_i(s_i) = \Pr[i \text{ plays } s_i]$ (e.g., $x_i(R) = 0.3$, $x_i(P) = 0.5$, $x_i(S) = 0.2$).
     • The (expected) utility of player $i \in N$ is
       $u_i(x_1, \ldots, x_n) = \sum_{(s_1, \ldots, s_n) \in S^n} u_i(s_1, \ldots, s_n) \cdot \prod_{j=1}^{n} x_j(s_j)$
       where $(x_1, \ldots, x_n)$ is the mixed strategy profile, $(s_1, \ldots, s_n)$ is a pure strategy profile, $u_i(s_1, \ldots, s_n)$ is the utility of that pure strategy profile, and $\prod_{j=1}^{n} x_j(s_j)$ is the joint probability of the pure strategy profile given the mixed profile.

  14. Exercise: Mixed NE
     Payoffs are (player 1, player 2); rows are player 1's actions, columns player 2's.
            R        P        S
       R    0, 0     -1, 1    1, -1
       P    1, -1    0, 0     -1, 1
       S    -1, 1    1, -1    0, 0
     • Player 1 plays $(1/2, 1/2, 0)$, player 2 plays $(0, 1/2, 1/2)$. What is $u_1$?
     • Both players play $(1/3, 1/3, 1/3)$. What is $u_1$?

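  15. Exercise: Mixed NE
     Payoffs are (player 1, player 2); rows are player 1's actions, columns player 2's.
            R        P        S
       R    0, 0     -1, 1    1, -1
       P    1, -1    0, 0     -1, 1
       S    -1, 1    1, -1    0, 0
     • Player 1 plays $(1/2, 1/2, 0)$, player 2 plays $(0, 1/2, 1/2)$. What is $u_1$? Only the profiles (R, P), (R, S), (P, P) and (P, S) have positive probability, each $1/4$, so
       $u_1 = \tfrac{1}{4}(-1) + \tfrac{1}{4}(1) + \tfrac{1}{4}(0) + \tfrac{1}{4}(-1) = -\tfrac{1}{4}$
     • In the second case, because of symmetry, the utility is zero: it's a zero-sum game.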
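The expected-utility formula for mixed strategies can be evaluated directly for the two exercise profiles. This is a minimal sketch with assumed names (`U1`, `expected_utility`); it uses exact fractions so the results are not blurred by floating-point rounding.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("R", "P", "S")

# Player 1's payoff for each pure profile (player 1's action, player 2's action).
# Rock-paper-scissors is zero-sum, so player 2's payoff is the negative.
U1 = {
    ("R", "R"): 0, ("R", "P"): -1, ("R", "S"): 1,
    ("P", "R"): 1, ("P", "P"): 0, ("P", "S"): -1,
    ("S", "R"): -1, ("S", "P"): 1, ("S", "S"): 0,
}


def expected_utility(x1, x2):
    """Sum over pure profiles of payoff times the joint probability of the profile."""
    return sum(U1[(a, b)] * x1[a] * x2[b]
               for a, b in product(ACTIONS, repeat=2))


half, third = Fraction(1, 2), Fraction(1, 3)
first = expected_utility({"R": half, "P": half, "S": 0},
                         {"R": 0, "P": half, "S": half})
uniform = {a: third for a in ACTIONS}
second = expected_utility(uniform, uniform)
print(first, second)
```

The first value is -1/4 and the second is 0, matching the hand computation and the symmetry argument.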

  16. Mixed Strategies Equilibrium Is Nash
     • The mixed strategy profile $x^*$ in a strategic game is a mixed strategy Nash equilibrium if
       $\forall i,\ \forall x_i:\quad u_i(x_i^*, x_{-i}^*) \ge u_i(x_i, x_{-i}^*)$
     • $u_i(x)$ is player $i$'s expected utility under the mixed strategy profile $x$.
     • → Same definition as in the case of pure strategies, where $u_i$ was the utility of a pure strategy profile instead of a mixed one.

  17. Mixed Strategies Nash Equilibrium
     • Using best response functions, $x^*$ is a mixed strategy NE iff $x_i^*$ is a best response to $x_{-i}^*$ for every player $i$.
     • If a mixed strategy $x_i^*$ is a best response, then each of the pure strategies in the mix (those played with positive probability) must itself be a best response: they must all yield the same expected payoff (otherwise it would make sense to choose only the one with the better payoff).
     • → If a mixed strategy is a best response for player $i$, then the player must be indifferent among the pure strategies in the mix.
     • E.g., in the RPS game, if the mixed strategy of player $i$ assigns non-zero probabilities $p_R$ to playing R and $p_P$ to playing P, then $i$'s expected utility for playing R or P has to be the same.

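  18. Exercise: Mixed NE
     Payoffs are (player 1, player 2); rows are player 1's actions, columns player 2's.
            R        P        S
       R    0, 0     -1, 1    1, -1
       P    1, -1    0, 0     -1, 1
       S    -1, 1    1, -1    0, 0
     • Which is a NE? Each profile lists player 1's and then player 2's probabilities for (R, P, S).
       1. $\left((1/2, 1/2, 0),\ (1/2, 1/2, 0)\right)$
       2. $\left((1/2, 1/2, 0),\ (1/2, 0, 1/2)\right)$
       3. $\left((1/3, 1/3, 1/3),\ (1/3, 1/3, 1/3)\right)$
       4. $\left((1/3, 2/3, 0),\ (2/3, 0, 1/3)\right)$
     Any other NE?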
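A candidate profile can be tested mechanically: it is a Nash equilibrium exactly when no player can gain by switching to some pure strategy, since any mixed deviation is an average of pure ones. The sketch below applies that test to the four candidates; the names `is_nash` and `CANDIDATES` are illustrative, and strategies are tuples in (R, P, S) order.

```python
from fractions import Fraction as F

# Player 1's payoff matrix, rows and columns in (R, P, S) order.
# The game is zero-sum, so player 2's payoff is the negative.
U1 = [[0, -1, 1],
      [1, 0, -1],
      [-1, 1, 0]]


def u1(x1, x2):
    """Player 1's expected utility under the mixed profile (x1, x2)."""
    return sum(x1[i] * U1[i][j] * x2[j] for i in range(3) for j in range(3))


def is_nash(x1, x2):
    """True if neither player has a profitable pure-strategy deviation."""
    v1 = u1(x1, x2)
    v2 = -v1
    for k in range(3):
        pure = tuple(F(int(i == k)) for i in range(3))
        if u1(pure, x2) > v1:   # player 1 deviates to pure action k
            return False
        if -u1(x1, pure) > v2:  # player 2 deviates to pure action k
            return False
    return True


CANDIDATES = [
    ((F(1, 2), F(1, 2), F(0)), (F(1, 2), F(1, 2), F(0))),
    ((F(1, 2), F(1, 2), F(0)), (F(1, 2), F(0), F(1, 2))),
    ((F(1, 3), F(1, 3), F(1, 3)), (F(1, 3), F(1, 3), F(1, 3))),
    ((F(1, 3), F(2, 3), F(0)), (F(2, 3), F(0), F(1, 3))),
]
print([is_nash(x1, x2) for x1, x2 in CANDIDATES])
```

Only the third candidate, where both players mix uniformly, passes the check. In the fourth, player 1 is indifferent between R and P, but player 2 would gain by switching to P.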

  19. Nash's Theorem
     • Theorem [Nash, 1950]: In any game with a finite number of strategies there exists at least one (possibly mixed) Nash equilibrium.
     Payoffs are (Player A, Player B); rows are A's actions, columns B's.
               Left    Right
       Up      1, 2    0, 4
       Down    0, 5    3, 2
     This game has no pure strategy Nash equilibria, but it does have a Nash equilibrium in mixed strategies. How is it computed?

  20. Computation of MS NE
     Payoffs are (Player A, Player B); rows are A's actions, columns B's.
               Left    Right
       Up      1, 2    0, 4
       Down    0, 5    3, 2
     Player A plays Up with probability $p_U$ and Down with probability $1 - p_U$.
     Player B plays Left with probability $p_L$ and Right with probability $1 - p_L$.

  21. Computation of MS NE
     Payoffs are (Player A, Player B); each action is listed with its probability.
                        L, $p_L$    R, $1 - p_L$
       U, $p_U$         1, 2        0, 4
       D, $1 - p_U$     0, 5        3, 2

  22. Computation of MS NE
     If B plays Left, its expected utility is $2 p_U + 5 (1 - p_U)$.

  23. Computation of MS NE
     If B plays Right, its expected utility is $4 p_U + 2 (1 - p_U)$.

  24. Computation of MS NE
     If $2 p_U + 5 (1 - p_U) > 4 p_U + 2 (1 - p_U)$, then B would play only Left, which would be a pure strategy. But there are no (pure) Nash equilibria in which B plays only Left.
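The same argument rules out B playing only Right, so in a mixed equilibrium B must be indifferent between Left and Right, and by symmetric reasoning A must be indifferent between Up and Down. The sketch below solves those two indifference equations for this 2x2 game. It is an illustration, not the lecture's own derivation: the names are assumptions, and the function assumes a fully mixed equilibrium exists (nonzero denominators, probabilities strictly between 0 and 1).

```python
from fractions import Fraction

# Payoff matrices indexed [row][column]: rows Up/Down (A), columns Left/Right (B).
PAYOFF_A = [[1, 0],
            [0, 3]]
PAYOFF_B = [[2, 4],
            [5, 2]]


def indifference_mix(pa, pb):
    """Return (p_U, p_L) that make the opponent indifferent between its two actions."""
    # B indifferent: p*pb[0][0] + (1-p)*pb[1][0] == p*pb[0][1] + (1-p)*pb[1][1]
    p_u = Fraction(pb[1][1] - pb[1][0],
                   pb[0][0] - pb[1][0] - pb[0][1] + pb[1][1])
    # A indifferent: q*pa[0][0] + (1-q)*pa[0][1] == q*pa[1][0] + (1-q)*pa[1][1]
    p_l = Fraction(pa[1][1] - pa[0][1],
                   pa[0][0] - pa[0][1] - pa[1][0] + pa[1][1])
    return p_u, p_l


p_u, p_l = indifference_mix(PAYOFF_A, PAYOFF_B)
print(p_u, p_l)
```

For this game the sketch gives $p_U = 3/5$ and $p_L = 3/4$: with $p_U = 3/5$ both of B's actions yield $16/5$, and with $p_L = 3/4$ both of A's actions yield $3/4$.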
