
Finding Optimal Mixed Strategies to Commit to in Security Games



  1. Finding Optimal Mixed Strategies to Commit to in Security Games • Vincent Conitzer, Departments of Computer Science and Economics, Duke University • Co-authors on various parts: @Duke: Dmytro (Dima) Korzhyk, Josh Letchford, Kamesh Munagala, Ron Parr; @USC: Zhengyu Yin, Chris Kiekintveld, Milind Tambe; @CMU: Tuomas Sandholm

  2. What is game theory? • Game theory studies settings where multiple parties (agents) each have – different preferences (utility functions), – different actions that they can take • Each agent’s utility (potentially) depends on all agents’ actions – What is optimal for one agent depends on what other agents do • Very circular! • Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act – Useful for acting as well as predicting behavior of others

  3. Penalty kick example [Figure: actions annotated with probabilities .7, .3, 1, .6, and .4] • Is this a “rational” outcome? If not, what is?

  4. Rock-paper-scissors • Row player (aka. player 1) chooses a row; column player (aka. player 2) chooses a column • Payoffs (row player, column player), with the same action order for rows and columns: row 1: 0, 0 | -1, 1 | 1, -1; row 2: 1, -1 | 0, 0 | -1, 1; row 3: -1, 1 | 1, -1 | 0, 0 • A row or column is called an action or (pure) strategy • Row player’s utility is always listed first, column player’s second • Zero-sum game: the utilities in each entry sum to 0 (or a constant) • Three-player game would be a 3D table with 3 utilities per entry, etc.

  5. Matching pennies (~penalty kick) • Payoffs (row player, column player), columns L | R: row L: 1, -1 | -1, 1; row R: -1, 1 | 1, -1

  6. “Chicken” • Two players drive cars towards each other • If one player goes straight, that player wins • If both go straight, they both die • Payoffs (row player, column player), columns D | S: row D: 0, 0 | -1, 1; row S: 1, -1 | -5, -5 • Not zero-sum

  7. How to play matching pennies • Payoffs (Us = row, Them = column), columns L | R: row L: 1, -1 | -1, 1; row R: -1, 1 | 1, -1 • Assume opponent knows our strategy… – hopeless? • … but we can use randomization • If we play L 60%, R 40%... • … opponent will play R… • … we get .6*(-1) + .4*(1) = -.2 • What’s optimal for us? What about rock-paper-scissors?
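
The calculation on this slide can be checked with a minimal sketch. The function name `value_against_best_response` and the matrix constants are illustrative names, not from the slides; the code assumes a zero-sum game, so the opponent's best response is whichever column minimizes our expected utility.

```python
from fractions import Fraction

# Row player's ("our") utilities; the column player receives the negation.
MATCHING_PENNIES = [[1, -1],   # we play L against their L, R
                    [-1, 1]]   # we play R against their L, R
ROCK_PAPER_SCISSORS = [[0, -1, 1],
                       [1, 0, -1],
                       [-1, 1, 0]]

def value_against_best_response(u, p):
    """Our expected utility when the opponent knows p and picks the column that hurts us most."""
    rows, cols = range(len(u)), range(len(u[0]))
    return min(sum(p[r] * u[r][c] for r in rows) for c in cols)

print(value_against_best_response(MATCHING_PENNIES, [Fraction(3, 5), Fraction(2, 5)]))  # -1/5
print(value_against_best_response(MATCHING_PENNIES, [Fraction(1, 2)] * 2))              # 0
print(value_against_best_response(ROCK_PAPER_SCISSORS, [Fraction(1, 3)] * 3))           # 0
```

Playing each action with equal probability guarantees 0 in both games, which is better than the -.2 that the 60/40 strategy guarantees.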

  8. Matching pennies with a sensitive target • Payoffs (Us = row, Them = column), columns L | R: row L: 1, -1 | -1, 1; row R: -2, 2 | 1, -1 • If we play 50% L, 50% R, opponent will attack L – We get .5*(1) + .5*(-2) = -.5 • What if we play 55% L, 45% R? • Opponent has choice between – L: gives them .55*(-1) + .45*(2) = .35 – R: gives them .55*(1) + .45*(-1) = .1 • We get -.35 > -.5

  9. Matching pennies with a sensitive target • Payoffs (Us = row, Them = column), columns L | R: row L: 1, -1 | -1, 1; row R: -2, 2 | 1, -1 • What if we play 60% L, 40% R? • Opponent has choice between – L: gives them .6*(-1) + .4*(2) = .2 – R: gives them .6*(1) + .4*(-1) = .2 • We get -.2 either way • This is the maximin strategy – Maximizes our minimum utility
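
As a rough check of the maximin claim, the sketch below searches a grid of our probabilities of playing L in steps of 1%. The grid search is a stand-in chosen for brevity, not the method of the slides (a linear program would be the general tool), and the names `worst_case` and `SENSITIVE_TARGET` are illustrative.

```python
from fractions import Fraction

# Our utilities in the sensitive-target game; rows are our L, R and columns their L, R.
SENSITIVE_TARGET = [[1, -1],
                    [-2, 1]]

def worst_case(u, p_left):
    """Our expected utility against the column that is worst for us."""
    p = [p_left, 1 - p_left]
    return min(sum(p[r] * u[r][c] for r in range(2)) for c in range(2))

# Try every probability of playing L from 0% to 100% in 1% steps.
grid = [Fraction(k, 100) for k in range(101)]
best_p = max(grid, key=lambda q: worst_case(SENSITIVE_TARGET, q))

print(worst_case(SENSITIVE_TARGET, Fraction(1, 2)))     # -1/2
print(worst_case(SENSITIVE_TARGET, Fraction(55, 100)))  # -7/20
print(best_p, worst_case(SENSITIVE_TARGET, best_p))     # 3/5 -1/5
```

The three printed values match the slides' figures of -.5, -.35, and -.2 for the 50/50, 55/45, and 60/40 strategies.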

  10. Let’s change roles • Payoffs (Us = row, Them = column), columns L | R: row L: 1, -1 | -1, 1; row R: -2, 2 | 1, -1 • Suppose we know their strategy • If they play 50% L, 50% R, – We play L, we get .5*(1)+.5*(-1) = 0 • If they play 40% L, 60% R, – If we play L, we get .4*(1)+.6*(-1) = -.2 – If we play R, we get .4*(-2)+.6*(1) = -.2 • This is the minimax strategy • von Neumann’s minimax theorem [1927]: maximin value = minimax value (~LP duality)
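
The theorem and the 40/60 strategy above can be written out as a short worked derivation. The symbols are assumptions introduced here for illustration: U is our utility matrix, p our mixed strategy, and q their probability of playing L (or their mixed strategy in the first line).

```latex
\[
\max_{p}\,\min_{q}\; p^{\top} U q \;=\; \min_{q}\,\max_{p}\; p^{\top} U q
\]
% Worked check for the sensitive-target game, with q = Pr[they play L]:
%   we play L: q(1) + (1-q)(-1) = 2q - 1
%   we play R: q(-2) + (1-q)(1) = 1 - 3q
% They minimize our best reply by making us indifferent:
\[
2q - 1 = 1 - 3q \;\Longrightarrow\; q = \tfrac{2}{5},
\qquad \text{value} = 2\cdot\tfrac{2}{5} - 1 = -\tfrac{1}{5}
\]
```

The resulting value of -1/5 equals the -.2 that our 60/40 maximin strategy guarantees, as the theorem predicts.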

  11. Minimax theorem falls apart in nonzero-sum games • Payoffs (row player, column player), columns D | S: row D: 0, 0 | -1, 1; row S: 1, -1 | -5, -5 • Let’s say we play S • Most they could hurt us is by playing S as well • But that is not rational for them • If we can commit to S, they will play D – Commitment advantage

  12. Nash equilibrium [Nash 1950] • A profile (= strategy for each player) so that no player wants to deviate • Payoffs (row player, column player), columns D | S: row D: 0, 0 | -1, 1; row S: 1, -1 | -5, -5 • This game has another Nash equilibrium in mixed strategies – both play D with 80%
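
The definition can be checked mechanically on this game. The following is a minimal sketch with illustrative names (`pure_nash`, `row_utility_vs_mix`): it enumerates the pure-strategy equilibria by testing every unilateral deviation, then confirms that the row player is indifferent when the column player plays D with 80%, which is what a mixed equilibrium requires.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ["D", "S"]
# PAYOFF[(row, col)] = (row player's utility, column player's utility)
PAYOFF = {("D", "D"): (0, 0), ("D", "S"): (-1, 1),
          ("S", "D"): (1, -1), ("S", "S"): (-5, -5)}

def pure_nash(payoff, actions):
    """All pure profiles from which neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(actions, repeat=2):
        row_ok = all(payoff[(r, c)][0] >= payoff[(r2, c)][0] for r2 in actions)
        col_ok = all(payoff[(r, c)][1] >= payoff[(r, c2)][1] for c2 in actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

def row_utility_vs_mix(payoff, r, col_mix):
    """Row player's expected utility for action r against a mixed column strategy."""
    return sum(prob * payoff[(r, c)][0] for c, prob in col_mix.items())

print(pure_nash(PAYOFF, ACTIONS))  # [('D', 'S'), ('S', 'D')]

mix = {"D": Fraction(4, 5), "S": Fraction(1, 5)}
print(row_utility_vs_mix(PAYOFF, "D", mix))  # -1/5
print(row_utility_vs_mix(PAYOFF, "S", mix))  # -1/5
```

Because the game is symmetric, the same indifference holds for the column player, so both mixing 80/20 is an equilibrium.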

  13. The presentation game • Payoffs (audience, presenter), columns Put effort into presentation (E) | Do not put effort into presentation (NE): row Pay attention (A): 2, 2 | -8, -7; row Do not pay attention (NA): 0, -1 | 0, 0 • Pure-strategy Nash equilibria: (A, E), (NA, NE) • Mixed-strategy Nash equilibrium: ((1/10 A, 9/10 NA), (4/5 E, 1/5 NE)) – Utility 0 for audience, -7/10 for presenter – Can see that some equilibria are strictly better for both players than other equilibria, i.e. some equilibria Pareto-dominate other equilibria
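
The equilibrium utilities and the Pareto comparison can be reproduced with a short sketch. The helper names are illustrative, and `pareto_dominates` uses the usual weak definition (at least as good for both, strictly better for one), which is an assumption about how the slide's informal statement would be formalized.

```python
from fractions import Fraction

# PAYOFF[audience action][presenter action] = (audience utility, presenter utility)
PAYOFF = {"A": {"E": (2, 2), "NE": (-8, -7)},
          "NA": {"E": (0, -1), "NE": (0, 0)}}

def expected_utilities(audience_mix, presenter_mix):
    """Expected (audience, presenter) utilities under independent mixed strategies."""
    u_audience = u_presenter = Fraction(0)
    for a, pa in audience_mix.items():
        for e, pe in presenter_mix.items():
            u_audience += pa * pe * PAYOFF[a][e][0]
            u_presenter += pa * pe * PAYOFF[a][e][1]
    return u_audience, u_presenter

def pareto_dominates(x, y):
    """True if x is at least as good for both players and strictly better for one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

mixed = expected_utilities({"A": Fraction(1, 10), "NA": Fraction(9, 10)},
                           {"E": Fraction(4, 5), "NE": Fraction(1, 5)})
good = expected_utilities({"A": 1}, {"E": 1})
lazy = expected_utilities({"NA": 1}, {"NE": 1})

print(mixed)                          # (Fraction(0, 1), Fraction(-7, 10))
print(pareto_dominates(good, lazy))   # True
print(pareto_dominates(good, mixed))  # True
```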

  14. Properties of Nash equilibrium in two-player games • In zero-sum games, same thing as maximin/minimax strategies • Any (finite) game has at least one Nash equilibrium [Nash 1950] • PPAD-complete to compute one Nash equilibrium [Daskalakis, Goldberg, Papadimitriou 2006; Chen & Deng, 2006] • NP-hard & inapproximable to compute the “best” Nash equilibrium [Gilboa & Zemel 1989; Conitzer & Sandholm 2008]

  15. Nash isn’t optimal if one player can commit • Payoffs (player 1, player 2), columns Left | Right: row Up: 2, 1 | 4, 0; row Down: 1, 0 | 3, 1 • (Up, Left) is the unique Nash equilibrium • Suppose the game is played as follows: – Player 1 commits to playing one of the rows, – Player 2 observes the commitment and then chooses a column • Optimal strategy for player 1: commit to Down
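
The commitment argument amounts to backward induction over two steps, sketched below. The row and column labels Up, Down, Left, Right are taken from the game tree on the next slide; the function names are illustrative. Ties in the follower's choice do not arise in this game, so the sketch does not handle them specially.

```python
# PAYOFF[row][col] = (leader utility, follower utility)
PAYOFF = {"Up": {"Left": (2, 1), "Right": (4, 0)},
          "Down": {"Left": (1, 0), "Right": (3, 1)}}

def follower_best_response(row):
    """Column that maximizes the follower's utility once the row is known."""
    return max(PAYOFF[row], key=lambda col: PAYOFF[row][col][1])

def best_pure_commitment():
    """Row that maximizes the leader's utility, anticipating the follower's reply."""
    return max(PAYOFF, key=lambda row: PAYOFF[row][follower_best_response(row)][0])

for row in PAYOFF:
    reply = follower_best_response(row)
    print(row, reply, PAYOFF[row][reply][0])
# Up Left 2
# Down Right 3
print(best_pure_commitment())  # Down
```

Committing to Down yields 3 for the leader, compared with 2 in the Nash equilibrium.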

  16. Commitment as an extensive-form game • For the case of committing to a pure strategy: [Game tree: Player 1 chooses Up or Down; Player 2 then chooses Left or Right. Leaves: Up, Left: 2, 1; Up, Right: 4, 0; Down, Left: 1, 0; Down, Right: 3, 1]

  17. Commitment to mixed strategies • Payoffs (leader, follower), columns Left | Right: row Up: 2, 1 | 4, 0; row Down: 1, 0 | 3, 1 [Figure: the rows annotated with probabilities .49 / .51 and .5 / .5] • Assume follower breaks ties in leader’s favor – In generic games this is the unique SPNE outcome of the extensive-form game [von Stengel & Zamir 2010] – We will also refer to this as a Stackelberg strategy
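
The effect of the tie-breaking assumption can be seen numerically. This is a minimal sketch with an illustrative function name, `leader_value`, that evaluates the leader's expected utility for a given probability of Up, letting the follower pick a best response and resolving ties in the leader's favor.

```python
from fractions import Fraction

LEADER = [[2, 4], [1, 3]]    # rows Up, Down; columns Left, Right
FOLLOWER = [[1, 0], [0, 1]]

def leader_value(p_up):
    """Leader's expected utility when committing to Up with probability p_up."""
    p = [p_up, 1 - p_up]
    follower = [sum(p[r] * FOLLOWER[r][c] for r in range(2)) for c in range(2)]
    leader = [sum(p[r] * LEADER[r][c] for r in range(2)) for c in range(2)]
    best = max(follower)
    # Among the follower's best responses, take the one the leader prefers.
    return max(leader[c] for c in range(2) if follower[c] == best)

print(leader_value(Fraction(0)))         # 3     (pure Down)
print(leader_value(Fraction(49, 100)))   # 349/100
print(leader_value(Fraction(1, 2)))      # 7/2   (tie, broken toward Right)
print(leader_value(Fraction(51, 100)))   # 151/100 (follower switches to Left)
```

The leader's utility rises toward 3.5 as the probability of Up approaches .5 from below and drops sharply just past it, which is why the tie-breaking rule matters for the optimum to be attained.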

  18. Commitment as an extensive-form game… • … for the case of committing to a mixed strategy: [Game tree: Player 1 chooses a distribution such as (1,0) (=Up), (.5,.5), (0,1) (=Down), …; Player 2 then chooses Left or Right. Leaves: (1,0), Left: 2, 1; (1,0), Right: 4, 0; (.5,.5), Left: 1.5, .5; (.5,.5), Right: 3.5, .5; (0,1), Left: 1, 0; (0,1), Right: 3, 1] • Economist: Just an extensive-form game, nothing new here • Computer scientist: Infinite-size game! Representation matters

  19. Computing the optimal mixed strategy to commit to [Conitzer & Sandholm 2006, von Stengel & Zamir 2010] • Separate LP for every possible follower’s action t* [LP figure, with parts labeled: leader utility, distributional constraint, follower optimality] • Choose t* for which the LP is feasible and has the highest objective. The leader plays the corresponding strategy <p_s>.
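
The LP itself did not survive as text on this slide. A plausible reading, assembled from the three labels above and the formulation on the following slide, is sketched below; the notation u_1, u_2 for leader and follower utilities is borrowed from that slide, and the nonnegativity constraint is an assumption folded into the distributional constraint.

```latex
\begin{align*}
\max_{p}\quad & \sum_{s} p_s\, u_1(s, t^*) && \text{(leader utility)}\\
\text{s.t.}\quad & \sum_{s} p_s\, u_2(s, t^*) \;\ge\; \sum_{s} p_s\, u_2(s, t) \quad \forall t && \text{(follower optimality)}\\
& \sum_{s} p_s = 1, \qquad p_s \ge 0 \quad \forall s && \text{(distributional constraint)}
\end{align*}
```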

  20. Easy polynomial-time algorithm for two players [Conitzer & Sandholm 2006; von Stengel & Zamir 2010] • For every column t separately, we solve separately for the best mixed row strategy (defined by $p_s$) that induces player 2 to play t • maximize $\sum_s p_s u_1(s, t)$ • subject to: for any t’, $\sum_s p_s u_2(s, t) \ge \sum_s p_s u_2(s, t')$; $\sum_s p_s = 1$ • (May be infeasible) • Pick the t that is best for player 1
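
The one-LP-per-column procedure can be sketched without an LP library for small games. The code below is an illustration under stated assumptions, not the authors' implementation: instead of calling an LP solver, it enumerates the vertices of each LP's feasible region with exact rational arithmetic (an LP over a bounded region attains its optimum at a vertex), which is only practical for tiny games. The function names `solve` and `optimal_commitment` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def optimal_commitment(u_leader, u_follower):
    """Return (leader value, leader mixed strategy, induced follower column)."""
    m, n = len(u_leader), len(u_leader[0])
    best = None
    for t in range(n):
        # Inequalities of the form g . p >= 0: follower prefers t to each t2, then p_s >= 0.
        ineqs = [[u_follower[s][t] - u_follower[s][t2] for s in range(m)]
                 for t2 in range(n) if t2 != t]
        ineqs += [[int(s == i) for s in range(m)] for i in range(m)]
        # A vertex makes m - 1 inequalities tight, together with sum(p) = 1.
        for tight in combinations(ineqs, m - 1):
            p = solve([[1] * m] + [list(g) for g in tight], [1] + [0] * (m - 1))
            if p is None or any(sum(g[s] * p[s] for s in range(m)) < 0 for g in ineqs):
                continue  # singular system or infeasible point
            value = sum(p[s] * u_leader[s][t] for s in range(m))
            if best is None or value > best[0]:
                best = (value, p, t)
    return best

# The 2x2 commitment game used earlier: rows Up, Down; columns Left, Right.
value, strategy, column = optimal_commitment([[2, 4], [1, 3]], [[1, 0], [0, 1]])
print(value, strategy, column)  # 7/2 [Fraction(1, 2), Fraction(1, 2)] 1
```

For the 2x2 game the sketch recovers the (.5, .5) commitment with leader utility 3.5, inducing the second column (Right). For games of realistic size, a proper LP solver would replace the vertex enumeration.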

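  21. Visualization • Payoffs (leader, follower), columns L | C | R: row U: 0, 1 | 1, 0 | 0, 0; row M: 4, 0 | 0, 1 | 0, 0; row D: 0, 0 | 1, 0 | 1, 1 [Figure: simplex of leader mixed strategies with vertices (1,0,0) = U, (0,1,0) = M, (0,0,1) = D, divided into regions labeled L, C, and R]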

  22. Observations about commitment to a mixed strategy in a two-player game • Coincides with minimax strategies in zero-sum games • Leader’s payoff always at least as good as in any Nash equilibrium (see [von Stengel & Zamir 2010]) – Can simply commit to the Nash equilibrium strategy – Follower breaks ties in your favor – Actually at least as good as any correlated equilibrium – Close relationship to LP for correlated equilibrium [Conitzer 2010 draft] • No equilibrium selection problem • Natural notion of approximation

  23. (a particular kind of) Bayesian games • Leader utilities: row 1: 2 | 4; row 2: 1 | 3 • Follower utilities (type 1), probability .6: row 1: 1 | 0; row 2: 0 | 1 • Follower utilities (type 2), probability .4: row 1: 1 | 0; row 2: 1 | 3
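
With several follower types, the leader's commitment is evaluated in expectation over types, each type best-responding separately. The sketch below illustrates this under loud assumptions: the matrices are one reading of the garbled slide (leader 2, 4 / 1, 3; type 1 follower 1, 0 / 0, 1 with probability .6; type 2 follower 1, 0 / 1, 3 with probability .4), ties are broken in the leader's favor as elsewhere in the deck, and all names are illustrative.

```python
from fractions import Fraction

LEADER = [[2, 4], [1, 3]]  # leader utilities; rows are leader actions, columns follower actions
TYPES = [                  # (type probability, follower utilities), as read from the slide
    (Fraction(6, 10), [[1, 0], [0, 1]]),
    (Fraction(4, 10), [[1, 0], [1, 3]]),
]

def expected_leader_utility(p_first_row):
    """Leader's expected utility over follower types for a given mixed commitment."""
    p = [p_first_row, 1 - p_first_row]
    leader = [sum(p[r] * LEADER[r][c] for r in range(2)) for c in range(2)]
    total = Fraction(0)
    for prob, u_follower in TYPES:
        follower = [sum(p[r] * u_follower[r][c] for r in range(2)) for c in range(2)]
        best = max(follower)
        # Each type best-responds; ties go to the column the leader prefers.
        total += prob * max(leader[c] for c in range(2) if follower[c] == best)
    return total

print(expected_leader_utility(Fraction(1, 2)))  # 7/2
print(expected_leader_utility(Fraction(2, 3)))  # 37/15
print(expected_leader_utility(Fraction(1)))     # 2
```

Under this reading, committing to (.5, .5) keeps both types on the second column and gives the leader 3.5, while pushing the first-row probability to 2/3 loses type 1 and lowers the expected utility.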

  24. Multiple types - visualization [Figure: leader strategy simplexes with vertices (1,0,0), (0,1,0), (0,0,1), with regions labeled L, C, and R, and a combined simplex with regions labeled by pairs such as (R,C)]

  25. LAX techniques [Paruchuri et al. 2008, Pita et al. 2009] • Uses Bayesian games framework • Mixed integer programming formulation for solving Bayesian games optimally – Much faster than converting game to normal form, solving that
