CSC304 Lecture 5 Game Theory: Zero-Sum Games, The Minimax Theorem CSC304 - Nisarg Shah 1
Recap
• Last lecture
  • Cost-sharing games
    o Price of anarchy (PoA) can be $n$
    o Price of stability (PoS) is $O(\log n)$
  • Potential functions and pure Nash equilibria
  • Congestion games
  • Braess' paradox
• Updated (slightly more detailed) slides
• Assignment 1 to be posted
• Volunteer note-taker

Zero-Sum Games
• Total reward constant in all outcomes (w.l.o.g. 0)
  • Common term: "zero-sum situation"
  • Psychology literature: "zero-sum thinking"
  • "Strictly competitive games"
• Focus on two-player zero-sum games (2p-zs)
  • "The more I win, the more you lose"

Zero-Sum Games
Zero-sum game: Rock-Paper-Scissor (entries are (reward for P1, reward for P2))

                     P2
               Rock       Paper      Scissor
P1  Rock       (0, 0)     (-1, 1)    (1, -1)
    Paper      (1, -1)    (0, 0)     (-1, 1)
    Scissor    (-1, 1)    (1, -1)    (0, 0)

Non-zero-sum game: Prisoner's dilemma (entries are (reward for Sam, reward for John))

                          John
                    Stay Silent   Betray
Sam  Stay Silent    (-1, -1)      (-3, 0)
     Betray         (0, -3)       (-2, -2)

Zero-Sum Games
• Why are they interesting?
  • Most games we play are zero-sum: chess, tic-tac-toe, rock-paper-scissor, …
  • (win, lose), (lose, win), (draw, draw)
  • (1, -1), (-1, 1), (0, 0)
• Why are they technically interesting?
  • Relation between the rewards of P1 and P2
  • P1 maximizes his reward
  • P2 maximizes his reward = minimizes reward of P1

Zero-Sum Games
• Reward for P2 = - Reward for P1
  • Only need a single matrix $A$: reward for P1
  • P1 wants to maximize, P2 wants to minimize

                     P2
               Rock   Paper   Scissor
P1  Rock        0      -1       1
    Paper       1       0      -1
    Scissor    -1       1       0

Rewards in Matrix Form
• Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$
• What are the rewards for P1 corresponding to different possible actions of P2?
[Figure: reward matrix with the column for P2's action $s_j$ highlighted; rows weighted by $x_{1,1}, x_{1,2}, x_{1,3}, \ldots$]

Rewards in Matrix Form
• Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$
• What are the rewards for P1 corresponding to different possible actions of P2?
• Reward for P1 when P2 chooses $s_j$ = $(x_{1,1}, x_{1,2}, x_{1,3}, \ldots) \cdot A_j = x_1^T A_j$, where $A_j$ is column $j$ of $A$

Rewards in Matrix Form
• Reward for P1 when…
  • P1 uses mixed strategy $x_1$
  • P2 uses mixed strategy $x_2$
$(x_1^T A_1,\ x_1^T A_2,\ x_1^T A_3,\ \ldots) \cdot (x_{2,1}, x_{2,2}, x_{2,3}, \ldots)^T = x_1^T A\, x_2$

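The matrix-form reward can be checked numerically. The following is a minimal sketch in plain Python, using the Rock-Paper-Scissor matrix from the earlier slide; the function names `column_rewards` and `expected_reward` are illustrative choices, not part of the lecture.

```python
# Reward matrix A for P1 in Rock-Paper-Scissor (rows: P1's action, columns: P2's action).
A = [
    [0, -1, 1],   # P1 plays Rock
    [1, 0, -1],   # P1 plays Paper
    [-1, 1, 0],   # P1 plays Scissor
]


def column_rewards(A, x1):
    """Return the row vector x1^T A: P1's expected reward against each pure action of P2."""
    return [sum(x1[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def expected_reward(A, x1, x2):
    """Return x1^T A x2: P1's expected reward when both players use mixed strategies."""
    return sum(r * q for r, q in zip(column_rewards(A, x1), x2))


uniform = [1 / 3, 1 / 3, 1 / 3]
print(column_rewards(A, [1, 0, 0]))          # pure Rock against Rock, Paper, Scissor
print(expected_reward(A, uniform, uniform))  # both players mix uniformly
```

Computing `column_rewards` first mirrors the slides: the intermediate vector is exactly the list of rewards against each pure action $s_j$, and weighting it by $x_2$ gives $x_1^T A\, x_2$.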
How would the two players act in this zero-sum game?
John von Neumann, 1928

Maximin Strategy
• Worst-case thinking by P1…
  • If I choose mixed strategy $x_1$…
  • P2 would choose $x_2$ to minimize my reward (i.e., maximize his reward)
  • Let me choose $x_1$ to maximize this "worst-case reward"
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$

Maximin Strategy
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$
• $V_1^*$: maximin value of P1
• $x_1^*$ (maximizer): maximin strategy of P1
• "By playing $x_1^*$, I guarantee myself at least $V_1^*$"
• But if P1 plays $x_1^*$, P2's best response is some $\hat{x}_2$
• Will $x_1^*$ be the best response to $\hat{x}_2$?

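The inner minimization in the maximin definition is easy to evaluate for a fixed $x_1$: since $x_1^T A\, x_2$ is linear in $x_2$, the minimum is attained at one of P2's pure actions, so it suffices to scan the columns of $A$. A minimal sketch, with the hypothetical helper name `worst_case_reward` and the Rock-Paper-Scissor matrix as the example:

```python
# Reward matrix for P1 in Rock-Paper-Scissor (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def worst_case_reward(A, x1):
    """Return min over P2's pure actions j of (x1^T A)_j, the reward x1 guarantees P1."""
    n_rows, n_cols = len(A), len(A[0])
    return min(sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols))


print(worst_case_reward(A, [1, 0, 0]))        # always Rock: P2 answers with Paper
print(worst_case_reward(A, [1 / 3] * 3))      # uniform mixing
```

Always playing Rock guarantees only -1, while uniform mixing guarantees 0, which is why P1 prefers the latter under worst-case thinking.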
Maximin vs Minimax
• Player 1: Choose my strategy to maximize my reward, worst-case over P2's response
  $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$, with maximizer $x_1^*$
• Player 2: Choose my strategy to minimize P1's reward, worst-case over P1's response
  $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2$, with minimizer $x_2^*$
Question: Relation between $V_1^*$ and $V_2^*$?

Maximin vs Minimax
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$, with maximizer $x_1^*$
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2$, with minimizer $x_2^*$
• What if (P1, P2) play $(x_1^*, x_2^*)$?
  • P1 must get at least $V_1^*$ (ensured by P1)
  • P1 must get at most $V_2^*$ (ensured by P2)
  • $V_1^* \le V_2^*$

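Written out as a chain of inequalities, the argument above looks as follows. This is only a restatement of the slide's reasoning in the same symbols, with each step annotated.

```latex
\begin{align*}
V_1^* &= \min_{x_2} (x_1^*)^T A\, x_2
        && \text{($x_1^*$ attains the outer max)} \\
      &\le (x_1^*)^T A\, x_2^*
        && \text{($x_2^*$ is one possible choice of $x_2$)} \\
      &\le \max_{x_1} x_1^T A\, x_2^*
        && \text{($x_1^*$ is one possible choice of $x_1$)} \\
      &= V_2^*
        && \text{($x_2^*$ attains the outer min)}
\end{align*}
```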
The Minimax Theorem
• John von Neumann [1928]
• Theorem: For any 2p-zs game,
  • $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
  • Set of Nash equilibria = $\{ (x_1^*, x_2^*) : x_1^* = \text{maximin for P1},\ x_2^* = \text{minimax for P2} \}$
• Corollary: $x_1^*$ is best response to $x_2^*$ and vice-versa.

The Minimax Theorem
• John von Neumann [1928]
"As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved"
• An unequivocal way to "solve" zero-sum games
  • Optimal strategies for P1 and P2 (up to ties)
  • Optimal rewards for P1 and P2 under a rational play

Proof of the Minimax Theorem
• Simpler proof using Nash's theorem
  • But the minimax theorem predates Nash's theorem
• Suppose $(\tilde{x}_1, \tilde{x}_2)$ is a NE
• P1 gets value $\tilde{v} = \tilde{x}_1^T A\, \tilde{x}_2$
• $\tilde{x}_1$ is best response for P1: $\tilde{v} = \max_{x_1} x_1^T A\, \tilde{x}_2$
• $\tilde{x}_2$ is best response for P2: $\tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2$

Proof of the Minimax Theorem
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \le \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 \le \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$
• But we already saw $V_1^* \le V_2^*$
  • $V_1^* = V_2^*$

Proof of the Minimax Theorem
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 = \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 = \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$
• When $(\tilde{x}_1, \tilde{x}_2)$ is a NE, $\tilde{x}_1$ and $\tilde{x}_2$ must be maximin and minimax strategies for P1 and P2, respectively.
• The reverse direction is also easy to prove.

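One way the reverse direction might be argued is sketched below; it is a reconstruction in the slides' notation rather than the proof given in lecture. It assumes $x_1^*$ is a maximin strategy, $x_2^*$ is a minimax strategy, and $V_1^* = V_2^* = V^*$ has already been established.

```latex
\begin{align*}
V^* = \min_{x_2} (x_1^*)^T A\, x_2
    \;\le\; (x_1^*)^T A\, x_2^*
    \;\le\; \max_{x_1} x_1^T A\, x_2^* = V^* .
\end{align*}
% Both inequalities are therefore equalities, so
\begin{align*}
(x_1^*)^T A\, x_2^* &= \max_{x_1} x_1^T A\, x_2^*
   && \text{($x_1^*$ is a best response to $x_2^*$)} \\
(x_1^*)^T A\, x_2^* &= \min_{x_2} (x_1^*)^T A\, x_2
   && \text{($x_2^*$ is a best response to $x_1^*$)}
\end{align*}
% Hence (x_1^*, x_2^*) is a Nash equilibrium.
```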
Computing Nash Equilibria
• Can I practically compute a maximin strategy (and thus a Nash equilibrium of the game)?
• Wasn't it computationally hard even for 2-player games?
• For 2p-zs games, a Nash equilibrium can be computed in polynomial time using linear programming.
  • Polynomial in #actions of the two players: $m_1$ and $m_2$

Computing Nash Equilibria
Maximize $v$
Subject to
  $(x_1^T A)_j \ge v$,  $j \in \{1, \ldots, m_2\}$
  $x_{1,1} + \cdots + x_{1,m_1} = 1$
  $x_{1,i} \ge 0$,  $i \in \{1, \ldots, m_1\}$

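For intuition, the LP can be solved by hand-rolled code in the special case where P1 has only two actions ($m_1 = 2$). The sketch below is not a general LP solver: it assumes $A$ has exactly two rows and enumerates the candidate optimal points directly. The function name `solve_two_action_maximin` and the matching-pennies example matrix are illustrative assumptions, not taken from the slides.

```python
def solve_two_action_maximin(A):
    """Maximin strategy for P1 when A has exactly two rows (P1 has two actions).

    With x1 = (p, 1 - p), LP constraint j reads
        p * A[0][j] + (1 - p) * A[1][j] >= v,
    so the best v for a given p is the minimum of these lines in p. That lower
    envelope is concave and piecewise linear, so its maximum over [0, 1] lies at
    an endpoint or at a crossing point of two lines.
    """
    top, bottom = A
    n_cols = len(top)

    def worst_case(p):
        return min(p * top[j] + (1 - p) * bottom[j] for j in range(n_cols))

    candidates = [0.0, 1.0]
    for j in range(n_cols):
        for k in range(j + 1, n_cols):
            slope_j = top[j] - bottom[j]   # slope of line j as a function of p
            slope_k = top[k] - bottom[k]
            if slope_j != slope_k:         # parallel lines never cross
                p = (bottom[k] - bottom[j]) / (slope_j - slope_k)
                if 0.0 <= p <= 1.0:
                    candidates.append(p)

    best_p = max(candidates, key=worst_case)
    return (best_p, 1 - best_p), worst_case(best_p)


# Illustrative example (matching pennies): P1 wins 1 on a match, loses 1 otherwise.
x1, v = solve_two_action_maximin([[1, -1], [-1, 1]])
print(x1, v)
```

For larger games one would hand the same objective and constraints to an off-the-shelf LP solver instead of enumerating crossing points.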
Minimax Theorem in Real Life?
• If you were to play a 2-player zero-sum game (say, as player 1), would you always play a maximin strategy?
• What if you were convinced your opponent is an idiot?
• What if you start playing the maximin strategy, but observe that your opponent is not best responding?

Minimax Theorem in Real Life?

               Goalie
               L       R
Kicker  L      0.58    0.95
        R      0.93    0.70

Kicker (plays L with probability $p_L$, R with probability $p_R$):
Maximize $v$
Subject to
  $0.58\, p_L + 0.93\, p_R \ge v$
  $0.95\, p_L + 0.70\, p_R \ge v$
  $p_L + p_R = 1$
  $p_L \ge 0,\ p_R \ge 0$

Goalie (plays L with probability $q_L$, R with probability $q_R$):
Minimize $v$
Subject to
  $0.58\, q_L + 0.95\, q_R \le v$
  $0.93\, q_L + 0.70\, q_R \le v$
  $q_L + q_R = 1$
  $q_L \ge 0,\ q_R \ge 0$

Minimax Theorem in Real Life?

               Goalie
               L       R
Kicker  L      0.58    0.95
        R      0.93    0.70

Kicker
  Maximin: $p_L = 0.38$, $p_R = 0.62$
  Reality: $p_L = 0.40$, $p_R = 0.60$
Goalie
  Maximin: $q_L = 0.42$, $q_R = 0.58$
  Reality: $q_L = 0.423$, $q_R = 0.577$
Some evidence that people may play minimax strategies.

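The maximin numbers for the penalty-kick game can be reproduced with a short calculation. The sketch below uses the standard indifference argument for a 2x2 game and assumes the equilibrium is fully mixed (no pure-strategy saddle point, nonzero denominator), which holds for this matrix; the function name `solve_2x2_by_indifference` is a hypothetical helper, not something from the lecture.

```python
def solve_2x2_by_indifference(A):
    """Mixed equilibrium of a 2x2 zero-sum game, assuming it is fully mixed.

    A = [[a, b], [c, d]] holds the row player's rewards. The row player picks
    p (probability of row 0) so that the column player is indifferent between
    columns; the column player picks q (probability of column 0) symmetrically.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d      # assumed nonzero
    p = (d - c) / denom        # from a*p + c*(1-p) == b*p + d*(1-p)
    q = (d - b) / denom        # from a*q + b*(1-q) == c*q + d*(1-q)
    v = a * p + c * (1 - p)    # value of the game for the row player
    return p, q, v


# Rows: kicker L, R. Columns: goalie L, R. Entries: probability of scoring.
penalty_kick = [[0.58, 0.95], [0.93, 0.70]]
p_L, q_L, v = solve_2x2_by_indifference(penalty_kick)
print(round(p_L, 2), round(1 - p_L, 2))  # kicker's maximin mix over (L, R)
print(round(q_L, 2), round(1 - q_L, 2))  # goalie's minimax mix over (L, R)
print(round(v, 3))                       # scoring probability under optimal play
```

The rounded probabilities agree with the maximin values listed on the slide for the kicker and the goalie.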
Minimax Theorem
• We proved it using Nash's theorem
  • Cheating. Typically, Nash's theorem (for the special case of 2p-zs games) is proved using the minimax theorem.
• Useful for proving Yao's principle, which provides lower bounds for randomized algorithms
• Equivalent to linear programming duality
[Photos: John von Neumann, George Dantzig]

von Neumann and Dantzig
George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas Lemma and Duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted: "I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games."
- (Chandru & Rao, 1999)