
CSC304 Lecture 5 Game Theory: Zero-Sum Games, The Minimax Theorem



  1. CSC304 Lecture 5 Game Theory: Zero-Sum Games, The Minimax Theorem CSC304 - Nisarg Shah

  2. Recap
     • Last lecture
       ➢ Cost-sharing games
         o Price of anarchy (PoA) can be $n$
         o Price of stability (PoS) is $O(\log n)$
       ➢ Potential functions and pure Nash equilibria
       ➢ Congestion games
       ➢ Braess' paradox
       ➢ Updated (slightly more detailed) slides
     • Assignment 1 to be posted
     • Volunteer note-taker

  3. Zero-Sum Games
     • Total reward constant in all outcomes (w.l.o.g. 0)
       ➢ Common term: "zero-sum situation"
       ➢ Psychology literature: "zero-sum thinking"
       ➢ "Strictly competitive games"
     • Focus on two-player zero-sum games (2p-zs)
       ➢ "The more I win, the more you lose"

  4. Zero-Sum Games
     Zero-sum game: Rock-Paper-Scissors (each entry is row player's reward, column player's reward)

                        P2: Rock    P2: Paper   P2: Scissors
       P1: Rock         (0, 0)      (-1, 1)     (1, -1)
       P1: Paper        (1, -1)     (0, 0)      (-1, 1)
       P1: Scissors     (-1, 1)     (1, -1)     (0, 0)

     Non-zero-sum game: Prisoner's dilemma

                            John: Stay Silent   John: Betray
       Sam: Stay Silent     (-1, -1)            (-3, 0)
       Sam: Betray          (0, -3)             (-2, -2)
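
The difference between the two tables can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper name `is_constant_sum` is an assumption, and the tables are copied from the slide as (row player reward, column player reward) pairs.

```python
# Payoff tables from the slide, as (row player reward, column player reward).
ROCK_PAPER_SCISSORS = [
    [(0, 0), (-1, 1), (1, -1)],
    [(1, -1), (0, 0), (-1, 1)],
    [(-1, 1), (1, -1), (0, 0)],
]
PRISONERS_DILEMMA = [
    [(-1, -1), (-3, 0)],
    [(0, -3), (-2, -2)],
]


def is_constant_sum(game):
    """Return True if the two rewards add up to the same total in every outcome.

    Hypothetical helper for illustration; a constant total can be shifted
    to 0 without loss of generality, as the earlier slide notes.
    """
    totals = {r1 + r2 for row in game for (r1, r2) in row}
    return len(totals) == 1


print(is_constant_sum(ROCK_PAPER_SCISSORS))  # True
print(is_constant_sum(PRISONERS_DILEMMA))    # False
```

In the prisoner's dilemma the totals are -2, -3, -3 and -4, so no single constant describes every outcome.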

  5. Zero-Sum Games
     • Why are they interesting?
       ➢ Most games we play are zero-sum: chess, tic-tac-toe, rock-paper-scissors, …
       ➢ (win, lose), (lose, win), (draw, draw)
       ➢ (1, -1), (-1, 1), (0, 0)
     • Why are they technically interesting?
       ➢ Relation between the rewards of P1 and P2
       ➢ P1 maximizes his reward
       ➢ P2 maximizes his reward = minimizes reward of P1

  6. Zero-Sum Games
     • Reward for P2 = - Reward for P1
       ➢ Only need a single matrix $A$: reward for P1
       ➢ P1 wants to maximize, P2 wants to minimize

                        P2: Rock    P2: Paper   P2: Scissors
       P1: Rock          0          -1           1
       P1: Paper         1           0          -1
       P1: Scissors     -1           1           0

  7. Rewards in Matrix Form
     • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$
       ➢ What are the rewards for P1 corresponding to different possible actions of P2?
     [Figure: probabilities $x_{1,1}, x_{1,2}, x_{1,3}, \ldots$ placed beside the reward matrix, with P2's action $s_j$ marked]

  8. Rewards in Matrix Form
     • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$
       ➢ What are the rewards for P1 corresponding to different possible actions of P2?
     [Figure: row vector $(x_{1,1}, x_{1,2}, x_{1,3}, \ldots)$ multiplied into the reward matrix, with P2's action $s_j$ marked]
       ❖ Reward for P1 when P2 chooses $s_j$ = $(x_1^T A)_j$

  9. Rewards in Matrix Form
     • Reward for P1 when…
       ➢ P1 uses mixed strategy $x_1$
       ➢ P2 uses mixed strategy $x_2$
     $\left[(x_1^T A)_1,\ (x_1^T A)_2,\ (x_1^T A)_3,\ \ldots\right] \cdot (x_{2,1}, x_{2,2}, x_{2,3}, \ldots)^T = x_1^T A\, x_2$
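
The two matrix products on these slides can be spelled out in a few lines of plain Python. This is a small sketch for illustration: the function names and the example strategies `x1` and `x2` are assumptions, and `A` is the rock-paper-scissors reward matrix for P1 from the earlier slide.

```python
# Reward matrix A for P1 in rock-paper-scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def rewards_per_column(x1, A):
    """Row vector x1^T A: P1's expected reward against each pure action of P2."""
    return [sum(x1[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def expected_reward(x1, A, x2):
    """Scalar x1^T A x2: P1's expected reward when both players mix."""
    return sum(r * x2[j] for j, r in enumerate(rewards_per_column(x1, A)))


x1 = [0.5, 0.5, 0.0]  # hypothetical: P1 mixes rock and paper equally
x2 = [1.0, 0.0, 0.0]  # hypothetical: P2 always plays rock

print(rewards_per_column(x1, A))   # [0.5, -0.5, 0.0]
print(expected_reward(x1, A, x2))  # 0.5
```

The first result lists what P1 earns against rock, paper and scissors respectively; weighting that list by P2's probabilities gives the single number $x_1^T A x_2$.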

  10. How would the two players act in this zero-sum game? (John von Neumann, 1928)

  11. Maximin Strategy
      • Worst-case thinking by P1…
        ➢ If I choose mixed strategy $x_1$…
        ➢ P2 would choose $x_2$ to minimize my reward (i.e., maximize his reward)
        ➢ Let me choose $x_1$ to maximize this "worst-case reward"
      $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$

  12. Maximin Strategy
      $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$
      • $V_1^*$: maximin value of P1
      • $x_1^*$ (maximizer): maximin strategy of P1
      • "By playing $x_1^*$, I guarantee myself at least $V_1^*$"
      • But if P1 → $x_1^*$, P2's best response → $\hat{x}_2$
        ➢ Will $x_1^*$ be the best response to $\hat{x}_2$?
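
The "worst-case reward" of a given mixed strategy is easy to compute, because a linear function of P2's mixed strategy attains its minimum at one of P2's pure actions. The sketch below assumes the rock-paper-scissors matrix from the earlier slide; the function name `guaranteed_reward` is hypothetical.

```python
# Reward matrix A for P1 in rock-paper-scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def guaranteed_reward(x1, A):
    """Inner minimum of the maximin formula for a fixed x1.

    It is enough to check P2's pure actions (the columns of A), since the
    reward is linear in P2's mixed strategy.
    """
    n_rows, n_cols = len(A), len(A[0])
    return min(sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols))


always_rock = [1.0, 0.0, 0.0]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(guaranteed_reward(always_rock, A))  # -1.0: P2 simply answers with paper
print(guaranteed_reward(uniform, A))      # 0 up to rounding
```

Always playing rock guarantees only -1, while the uniform mix guarantees 0, so the uniform mix is the better choice under worst-case thinking.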

  13. Maximin vs Minimax
      • Player 1: Choose my strategy to maximize my reward, worst-case over P2's response
        $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$, attained at $x_1^*$
      • Player 2: Choose my strategy to minimize P1's reward, worst-case over P1's response
        $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2$, attained at $x_2^*$
      Question: Relation between $V_1^*$ and $V_2^*$?

  14. Maximin vs Minimax
      $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$, attained at $x_1^*$
      $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2$, attained at $x_2^*$
      • What if (P1, P2) play $(x_1^*, x_2^*)$?
        ➢ P1 must get at least $V_1^*$ (ensured by P1)
        ➢ P1 must get at most $V_2^*$ (ensured by P2)
        ➢ $V_1^* \le V_2^*$
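
The argument on this slide can be written as a single chain. This is a sketch in the slide's notation, with $x_1^*$ the maximin strategy and $x_2^*$ the minimax strategy; the first and last equalities are just the definitions of $V_1^*$ and $V_2^*$.

```latex
V_1^* \;=\; \min_{x_2} (x_1^*)^T A\, x_2
      \;\le\; (x_1^*)^T A\, x_2^*
      \;\le\; \max_{x_1} x_1^T A\, x_2^*
      \;=\; V_2^*
```

The middle term is P1's reward when both play their worst-case-optimal strategies: it is at least the minimum over all $x_2$ and at most the maximum over all $x_1$.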

  15. The Minimax Theorem
      • John von Neumann [1928]
      • Theorem: For any 2p-zs game,
        ➢ $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
        ➢ Set of Nash equilibria = $\{(x_1^*, x_2^*) : x_1^* \text{ is maximin for P1},\ x_2^* \text{ is minimax for P2}\}$
      • Corollary: $x_1^*$ is best response to $x_2^*$ and vice versa.
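
The corollary can be checked numerically on rock-paper-scissors. The sketch below is illustrative only: the names `reward` and `is_equilibrium` are assumptions, and it relies on the fact that checking deviations to pure actions is enough, since rewards are linear in each player's mixed strategy.

```python
# Reward matrix A for P1 in rock-paper-scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def reward(x1, A, x2):
    """P1's expected reward x1^T A x2."""
    return sum(
        x1[i] * A[i][j] * x2[j] for i in range(len(A)) for j in range(len(A[0]))
    )


def is_equilibrium(x1, A, x2, tol=1e-9):
    """True if neither player can gain by switching to any pure action."""
    n_rows, n_cols = len(A), len(A[0])
    value = reward(x1, A, x2)
    # P1 maximizes: no row may earn more than `value` against x2.
    best_for_p1 = max(
        sum(A[i][j] * x2[j] for j in range(n_cols)) for i in range(n_rows)
    )
    # P2 minimizes: no column may push P1's reward below `value` against x1.
    best_for_p2 = min(
        sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols)
    )
    return best_for_p1 <= value + tol and best_for_p2 >= value - tol


uniform = [1 / 3, 1 / 3, 1 / 3]
print(is_equilibrium(uniform, A, uniform))                   # True
print(is_equilibrium([1.0, 0.0, 0.0], A, [0.0, 1.0, 0.0]))   # False
```

In the second call P1 plays rock against paper and would rather switch to scissors, so the pair is not an equilibrium.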

  16. The Minimax Theorem
      • John von Neumann [1928]
        "As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved"
      • An unequivocal way to "solve" zero-sum games
        ➢ Optimal strategies for P1 and P2 (up to ties)
        ➢ Optimal rewards for P1 and P2 under a rational play

  17. Proof of the Minimax Theorem
      • Simpler proof using Nash's theorem
        ➢ But the minimax theorem predates Nash's theorem
      • Suppose $(\tilde{x}_1, \tilde{x}_2)$ is a NE
      • P1 gets value $\tilde{v} = \tilde{x}_1^T A\, \tilde{x}_2$
      • $\tilde{x}_1$ is best response for P1: $\tilde{v} = \max_{x_1} x_1^T A\, \tilde{x}_2$
      • $\tilde{x}_2$ is best response for P2: $\tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2$

  18. Proof of the Minimax Theorem
      $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \le \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 \le \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$
      • But we already saw $V_1^* \le V_2^*$
        ➢ $V_1^* = V_2^*$

  19. Proof of the Minimax Theorem
      $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 = \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 = \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$
      • When $(\tilde{x}_1, \tilde{x}_2)$ is a NE, $\tilde{x}_1$ and $\tilde{x}_2$ must be maximin and minimax strategies for P1 and P2, respectively.
      • The reverse direction is also easy to prove.
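
One way to fill in the reverse direction is sketched here. It assumes $V_1^* = V_2^* = V^*$ has already been established, and takes $x_1^*$ to be any maximin strategy and $x_2^*$ any minimax strategy; the primed variables are only dummy names for the optimization.

```latex
% Step 1: sandwich the reward at (x_1^*, x_2^*).
V^* = \min_{x_2} (x_1^*)^T A\, x_2
    \;\le\; (x_1^*)^T A\, x_2^*
    \;\le\; \max_{x_1} x_1^T A\, x_2^* = V^*
\quad\Longrightarrow\quad (x_1^*)^T A\, x_2^* = V^*

% Step 2: no deviation helps either player.
x_1^T A\, x_2^* \;\le\; \max_{x_1'} (x_1')^T A\, x_2^* = V^* \quad \text{for all } x_1,
\qquad
(x_1^*)^T A\, x_2 \;\ge\; \min_{x_2'} (x_1^*)^T A\, x_2' = V^* \quad \text{for all } x_2
```

So each strategy is a best response to the other, which is the definition of a Nash equilibrium.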

  20. Computing Nash Equilibria
      • Can I practically compute a maximin strategy (and thus a Nash equilibrium of the game)?
      • Wasn't it computationally hard even for 2-player games?
      • For 2p-zs games, a Nash equilibrium can be computed in polynomial time using linear programming.
        ➢ Polynomial in #actions of the two players: $m_1$ and $m_2$

  21. Computing Nash Equilibria
      Maximize $v$
      Subject to
        $(x_1^T A)_j \ge v$, $j \in \{1, \ldots, m_2\}$
        $x_{1,1} + \cdots + x_{1,m_1} = 1$
        $x_{1,i} \ge 0$, $i \in \{1, \ldots, m_1\}$
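
For intuition, the LP can be solved by hand when P1 has only two actions. The sketch below is not a general LP solver and is not the lecture's method: it assumes a 2-row reward matrix, and the function name `maximin_two_rows` and the matching-pennies example are hypothetical. With $x_1 = (p, 1-p)$, each constraint is a line in $p$, and the best worst-case value lies at $p = 0$, $p = 1$, or a point where two of the lines cross.

```python
from itertools import combinations


def maximin_two_rows(A):
    """Maximin strategy (p, 1 - p) and value for a game where P1 has two actions.

    Against column j, P1's reward is the line p * A[0][j] + (1 - p) * A[1][j].
    The minimum of these lines is concave and piecewise linear in p, so its
    maximum over [0, 1] is at an endpoint or at a crossing of two lines.
    """
    top, bottom = A
    candidates = [0.0, 1.0]
    for j, k in combinations(range(len(top)), 2):
        slope_gap = (top[j] - bottom[j]) - (top[k] - bottom[k])
        if slope_gap != 0:  # parallel lines never cross
            p = (bottom[k] - bottom[j]) / slope_gap
            if 0.0 <= p <= 1.0:
                candidates.append(p)

    def worst_case(p):
        # Smallest reward over P2's pure actions when P1 plays (p, 1 - p).
        return min(p * t + (1 - p) * b for t, b in zip(top, bottom))

    best_p = max(candidates, key=worst_case)
    return (best_p, 1 - best_p), worst_case(best_p)


# Hypothetical example: matching pennies.
print(maximin_two_rows([[1, -1], [-1, 1]]))  # ((0.5, 0.5), 0.0)
```

For larger games an off-the-shelf LP solver would be the usual choice; the enumeration here only works because the feasible set is one-dimensional.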

  22. Minimax Theorem in Real Life?
      • If you were to play a 2-player zero-sum game (say, as player 1), would you always play a maximin strategy?
      • What if you were convinced your opponent is an idiot?
      • What if you start playing the maximin strategy, but observe that your opponent is not best responding?
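
A small numerical illustration of these questions, using rock-paper-scissors. The opponent's distribution `x2_observed` is a made-up assumption, as are the helper names; the point is only to compare what the maximin strategy earns against such an opponent with what a best response earns.

```python
# Reward matrix A for P1 in rock-paper-scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def rewards_per_row(A, x2):
    """P1's expected reward for each pure action against P2's mixed strategy x2."""
    return [sum(A[i][j] * x2[j] for j in range(len(x2))) for i in range(len(A))]


# Hypothetical opponent who plays rock too often: (rock, paper, scissors).
x2_observed = [0.5, 0.25, 0.25]

per_row = rewards_per_row(A, x2_observed)
maximin_reward = sum(r / 3 for r in per_row)  # uniform mix over P1's actions
best_response_reward = max(per_row)           # here: always play paper

print(per_row)               # [0.0, 0.25, -0.25]
print(best_response_reward)  # 0.25
```

The uniform maximin strategy still earns about 0 here, while always playing paper earns 0.25. The catch is that always playing paper guarantees only -1 in the worst case, so exploiting the opponent means giving up the maximin guarantee.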

  23. Minimax Theorem in Real Life?

                      Goalie: L   Goalie: R
        Kicker: L     0.58        0.95
        Kicker: R     0.93        0.70

      Kicker:
        Maximize $v$
        Subject to
          $0.58\, p_L + 0.93\, p_R \ge v$
          $0.95\, p_L + 0.70\, p_R \ge v$
          $p_L + p_R = 1$
          $p_L \ge 0,\ p_R \ge 0$

      Goalie:
        Minimize $v$
        Subject to
          $0.58\, q_L + 0.95\, q_R \le v$
          $0.93\, q_L + 0.70\, q_R \le v$
          $q_L + q_R = 1$
          $q_L \ge 0,\ q_R \ge 0$

  24. Minimax Theorem in Real Life?

                      Goalie: L   Goalie: R
        Kicker: L     0.58        0.95
        Kicker: R     0.93        0.70

      Kicker:
        Maximin: $p_L = 0.38$, $p_R = 0.62$
        Reality: $p_L = 0.40$, $p_R = 0.60$

      Goalie:
        Maximin: $q_L = 0.42$, $q_R = 0.58$
        Reality: $q_L = 0.423$, $q_R = 0.577$

      Some evidence that people may play minimax strategies.
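
The maximin numbers for the penalty-kick game can be reproduced without an LP solver. The sketch below assumes a 2x2 game with no saddle point in pure strategies (true for this matrix), in which case each player mixes so that the opponent is indifferent between the two actions; the function name `solve_2x2` is hypothetical.

```python
def solve_2x2(A):
    """Mixed equilibrium of a 2x2 zero-sum game, assuming no pure saddle point.

    With A = [[a, b], [c, d]] as rewards for the row player:
      row player puts p on the top row so both columns give the same reward,
      column player puts q on the left column so both rows give the same reward.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    p_top = (d - c) / denom
    q_left = (d - b) / denom
    value = (a * d - b * c) / denom
    return p_top, q_left, value


# Scoring probabilities from the slide (rows: kicker L/R, columns: goalie L/R).
A = [[0.58, 0.95], [0.93, 0.70]]
p_L, q_L, value = solve_2x2(A)

print(round(p_L, 2), round(1 - p_L, 2))  # 0.38 0.62
print(round(q_L, 2), round(1 - q_L, 2))  # 0.42 0.58
print(round(value, 3))                   # 0.796
```

Under these strategies the kicker scores with probability of about 0.796 whichever side the goalie picks, which is the value $v$ of both LPs on the previous slide.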

  25. Minimax Theorem
      • We proved it using Nash's theorem
        ➢ Cheating. Typically, Nash's theorem (for the special case of 2p-zs games) is proved using the minimax theorem.
      • Useful for proving Yao's principle, which provides lower bounds for randomized algorithms
      • Equivalent to linear programming duality
      [Photos: John von Neumann, George Dantzig]

  26. von Neumann and Dantzig
      George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas Lemma and Duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted: "I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games." (Chandru & Rao, 1999)
