CSC304 Lecture 7 Game Theory: Security games, Applications to security

  1. CSC304 Lecture 7 Game Theory : Security games, Applications to security CSC304 - Nisarg Shah 1

  2. Until now…
     • Simultaneous-move Games
     • All players act simultaneously
     • Nash equilibria = stable outcomes
     • Each player is best responding to the strategies of all other players

  3. Sequential Move Games
     • Focus on two players: “leader” and “follower”
       1. Leader commits to a (possibly mixed) strategy $y_1$
          ➢ Cannot change later
       2. Follower learns about $y_1$
          ➢ Follower must believe that leader’s commitment is credible
       3. Follower chooses the best response $y_2$
          ➢ Can be assumed to be a pure strategy without loss of generality
          ➢ If multiple actions are best responses, break ties in favor of the leader

  4. Sequential Move Games
     • Wait. Does this give us anything new?
       ➢ Can’t I, as player 1, commit to playing $y_1$ in a simultaneous-move game too?
       ➢ Player 2 wouldn’t believe you.
     • Player 1: “I’ll play $y_1$.”
     • Player 2: “No you won’t. I’m playing $y_2$; $y_1$ is not a best response.”
     • Player 1: “Doesn’t matter. I’m committing.”
     • Player 2: “Yeah right.”

  5. That’s unless…
     • You’re as convincing as this guy.

  6. How to represent the game?
     • Extensive form representation
       ➢ Can also represent “information sets”, multiple moves, …
     [Game tree: Player 1 moves at the root; Player 2 moves at each of the two resulting nodes; leaf payoffs, left to right: (1,1), (3,0), (0,0), (2,1)]

  7. A Curious Case
                 P2: Left   P2: Right
     P1: Up      (1, 1)     (3, 0)
     P1: Down    (0, 0)     (2, 1)
     • Q: What are the Nash equilibria of this game?
     • Q: You are P1. What is your reward in Nash equilibrium?

  8. A Curious Case
                 P2: Left   P2: Right
     P1: Up      (1, 1)     (3, 0)
     P1: Down    (0, 0)     (2, 1)
     • Q: As P1, you want to commit to a pure strategy. Which strategy would you commit to?
     • Q: What would your reward be now?

  9. Commitment Advantage
                 P2: Left   P2: Right
     P1: Up      (1, 1)     (3, 0)
     P1: Down    (0, 0)     (2, 1)
     • P1's reward in the unique Nash equilibrium (Up, Left) = 1
     • P1's reward when committing to Down (P2 then responds with Right) = 2
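The two numbers on this slide can be checked by brute force. The following is a minimal sketch, not part of the original lecture: the names `PAYOFFS`, `pure_nash_equilibria`, and `commit_pure` are hypothetical, and the search covers pure strategies only. That is enough for this particular game because Up strictly dominates Down for P1 (1 > 0 and 3 > 2), so P1 plays Up in every Nash equilibrium.

```python
from itertools import product

# Payoff matrix from the slide: PAYOFFS[(p1_action, p2_action)] = (P1 reward, P2 reward).
PAYOFFS = {
    ("Up", "Left"): (1, 1),
    ("Up", "Right"): (3, 0),
    ("Down", "Left"): (0, 0),
    ("Down", "Right"): (2, 1),
}
P1_ACTIONS = ("Up", "Down")
P2_ACTIONS = ("Left", "Right")


def pure_nash_equilibria():
    """Return all pure action profiles where neither player gains by deviating."""
    equilibria = []
    for a1, a2 in product(P1_ACTIONS, P2_ACTIONS):
        r1, r2 = PAYOFFS[(a1, a2)]
        p1_ok = all(PAYOFFS[(d, a2)][0] <= r1 for d in P1_ACTIONS)
        p2_ok = all(PAYOFFS[(a1, d)][1] <= r2 for d in P2_ACTIONS)
        if p1_ok and p2_ok:
            equilibria.append((a1, a2))
    return equilibria


def commit_pure(a1):
    """P1 commits to pure action a1; P2 best-responds, ties broken in P1's favor."""
    best = max(
        P2_ACTIONS,
        key=lambda a2: (PAYOFFS[(a1, a2)][1], PAYOFFS[(a1, a2)][0]),
    )
    return best, PAYOFFS[(a1, best)][0]


print(pure_nash_equilibria())  # [('Up', 'Left')]
print(commit_pure("Down"))     # ('Right', 2)
```

Committing to Up instead gives `commit_pure("Up") == ("Left", 1)`, the same reward as the Nash equilibrium, which is why Down is the better pure commitment.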

  10. Commitment Advantage
                  P2: Left   P2: Right
      P1: Up      (1, 1)     (3, 0)
      P1: Down    (0, 0)     (2, 1)
      • P1 gets an even higher reward by committing to a mixed strategy
        ➢ P1 commits to: Up w.p. $0.5 - \varepsilon$, Down w.p. $0.5 + \varepsilon$
        ➢ P2 is still better off playing Right
        ➢ $\mathbb{E}[\text{Reward}]$ to P1 ≈ 2.5
        ➢ Note: If P1 plays both actions with probability exactly 0.5, we assume P2 plays Right (break ties in favor of leader)
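The effect of a mixed commitment on this game can be sketched in a few lines. This is an illustrative sketch only: `leader_value`, `P1_REWARD`, and `P2_REWARD` are hypothetical names, and exact fractions are used so that the tie at probability exactly 0.5 is detected without floating-point error.

```python
from fractions import Fraction

# Rows: P1 plays Up, Down. Columns: P2 plays Left, Right.
P1_REWARD = [[1, 3], [0, 2]]
P2_REWARD = [[1, 0], [0, 1]]
COLUMNS = ("Left", "Right")


def leader_value(q):
    """P1 commits to Up w.p. q and Down w.p. 1 - q.

    Returns (P2's response, P1's expected reward).
    """
    mix = (q, 1 - q)

    def expected(matrix, col):
        return sum(mix[row] * matrix[row][col] for row in range(2))

    # P2 maximizes its own expected reward; ties are broken in favor of P1.
    best_col = max(
        range(2),
        key=lambda c: (expected(P2_REWARD, c), expected(P1_REWARD, c)),
    )
    return COLUMNS[best_col], expected(P1_REWARD, best_col)


print(leader_value(Fraction(49, 100)))  # ('Right', Fraction(249, 100))
print(leader_value(Fraction(1, 2)))     # ('Right', Fraction(5, 2))
```

With Up played slightly less than half the time, P2 strictly prefers Right and P1 earns just under 2.5; at exactly one half the tie-breaking assumption is what delivers 2.5.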

  11. Stackelberg vs Nash
      • Is committing first always at least as good as playing a simultaneous-move game?
      • Yes!
        ➢ If $(y_1^*, y_2^*)$ is a NE, P1 can always commit to $y_1^*$, ensure that P2 will play $y_2^*$, and achieve the reward in the NE
        ➢ P1 may be able to commit to a better strategy than $y_1^*$
      • Applications to security
        ➢ Law enforcement is better off committing to a mixed patrolling strategy, and announcing the strategy publicly!

  12. Stackelberg in Zero-Sum
      • Recall the minimax theorem:
        $\max_{y_1} \min_{y_2} y_1^\top B\, y_2 = \min_{y_2} \max_{y_1} y_1^\top B\, y_2$
      • P1 goes first → P1 chooses her minimax strategy
      • P2 goes first → P2 chooses her minimax strategy
      • Minimax Theorem: It doesn’t make a difference!
        ➢ Simultaneous-move, P1 going first, and P2 going first are essentially identical scenarios.

  13. Stackelberg in General-Sum
      • 2-player non-zero-sum game with reward matrices $B$ and $C \neq -B$ for the two players
        $\max_{y_1} y_1^\top B\, g(y_1)$, where $g(y_1) = \arg\max_{y_2} y_1^\top C\, y_2$
      • How do we compute this?

  14. Example
                  P2: Left   P2: Right
      P1: Up      (1, 1)     (3, 0)
      P1: Down    (0, 0)     (2, 1)
      • Let us separately maximize the reward of P1 in 2 cases:
        ➢ Strategies that cause P2 to play Left
        ➢ Strategies that cause P2 to play Right
      • Suppose P1 commits to Up w.p. $q$, Down w.p. $1 - q$

  15. Example
                  P2: Left   P2: Right
      P1: Up      (1, 1)     (3, 0)
      P1: Down    (0, 0)     (2, 1)
      • Strategies that cause P2 to play Left
        Max  $q \cdot 1 + (1 - q) \cdot 0$    (reward of P1 assuming P2 plays Left)
        s.t. $q \cdot 1 + (1 - q) \cdot 0 \ge q \cdot 0 + (1 - q) \cdot 1$    (condition that causes P2 to play Left)
             $q \in [0, 1]$

  16. Example
                  P2: Left   P2: Right
      P1: Up      (1, 1)     (3, 0)
      P1: Down    (0, 0)     (2, 1)
      • Strategies that cause P2 to play Left
        Max  $q$
        s.t. $q \ge 1 - q$
             $q \in [0, 1]$
        Answer = 1

  17. Example
                  P2: Left   P2: Right
      P1: Up      (1, 1)     (3, 0)
      P1: Down    (0, 0)     (2, 1)
      • Strategies that cause P2 to play Right
        Max  $q \cdot 3 + (1 - q) \cdot 2$
        s.t. $q \cdot 1 + (1 - q) \cdot 0 \le q \cdot 0 + (1 - q) \cdot 1$
             $q \in [0, 1]$
        Answer = 2.5
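The answer 2.5 for the Right case follows from simplifying the constraint and the objective. The steps below are a worked version of that simplification, using the same $q$ as the slides:

```latex
\begin{align*}
q \cdot 1 + (1-q)\cdot 0 \le q \cdot 0 + (1-q)\cdot 1
  &\iff q \le 1 - q \iff q \le \tfrac{1}{2}, \\
q \cdot 3 + (1-q)\cdot 2 &= 2 + q, \\
\max_{q \in [0,\,1/2]} (2 + q) &= 2 + \tfrac{1}{2} = 2.5
  \quad \text{at } q = \tfrac{1}{2}.
\end{align*}
```

Since the Left case yields at most 1, the better of the two cases is the Right one: P1 commits to Up and Down with probability 1/2 each and earns 2.5, relying on P2 breaking the tie at $q = 1/2$ in favor of the leader.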

  18. Stackelberg via LPs
      • High-level Idea:
        ➢ For each action $t_2^*$ of P2…
        ➢ Write a linear program with the mixed strategy $y_1$ of P1 as the unknown, which…
        ➢ Maximizes the reward of P1 when P1 plays $y_1$, P2 responds with $t_2^*$…
        ➢ Subject to the constraint that $y_1$ in fact incentivizes P2 to play $t_2^*$

  19. Stackelberg via LPs
      • $T_1, T_2$ = sets of actions of leader and follower
      • $|T_1| = n_1$, $|T_2| = n_2$
      • $y_1(t_1)$ = probability of leader playing $t_1$
      • $\rho_1, \rho_2$ = reward functions for leader and follower
      • One LP for each $t_2^*$; take the maximum over all $n_2$ LPs
      • The LP corresponding to $t_2^*$ optimizes over all $y_1$ for which $t_2^*$ is the best response

        $\max \sum_{t_1 \in T_1} y_1(t_1) \cdot \rho_1(t_1, t_2^*)$
        subject to
          $\forall t_2 \in T_2:\ \sum_{t_1 \in T_1} y_1(t_1) \cdot \rho_2(t_1, t_2^*) \ge \sum_{t_1 \in T_1} y_1(t_1) \cdot \rho_2(t_1, t_2)$
          $\sum_{t_1 \in T_1} y_1(t_1) = 1$
          $\forall t_1 \in T_1:\ y_1(t_1) \ge 0$
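The "one LP per follower action" idea can be sketched without an LP library in the special case where the leader has only two actions, because each LP then has a single unknown (the probability of the first action) and its feasible set is an interval. This is a simplified illustration under that assumption, not the general algorithm; `solve_stackelberg_2xn` and its argument names are hypothetical, and a real implementation for larger games would hand each LP to a solver.

```python
from fractions import Fraction


def solve_stackelberg_2xn(leader_reward, follower_reward):
    """Leader has 2 actions (rows), follower has n actions (columns).

    For each follower action t, maximize the leader's expected reward over
    q = P(leader plays row 0), subject to t being a best response.
    Returns (value, q, t) for the best of the n one-variable LPs.
    """
    n = len(follower_reward[0])
    best = None
    for t in range(n):
        lo, hi = Fraction(0), Fraction(1)  # feasible interval for q
        feasible = True
        for other in range(n):
            # Constraint: q*a + (1-q)*b >= 0, where a and b are the follower's
            # gain from playing t instead of `other` against rows 0 and 1.
            a = follower_reward[0][t] - follower_reward[0][other]
            b = follower_reward[1][t] - follower_reward[1][other]
            slope = a - b  # the constraint reads: slope*q + b >= 0
            if slope > 0:
                lo = max(lo, Fraction(-b, slope))
            elif slope < 0:
                hi = min(hi, Fraction(-b, slope))
            elif b < 0:
                feasible = False
        if not feasible or lo > hi:
            continue
        # The objective is linear in q, so an optimum sits at an endpoint.
        for q in (lo, hi):
            value = q * leader_reward[0][t] + (1 - q) * leader_reward[1][t]
            if best is None or value > best[0]:
                best = (value, q, t)
    return best


# The running example: rows Up/Down, columns Left/Right.
print(solve_stackelberg_2xn([[1, 3], [0, 2]], [[1, 0], [0, 1]]))
# (Fraction(5, 2), Fraction(1, 2), 1)
```

On the running example the Left LP is feasible for probabilities in [1/2, 1] with value 1, and the Right LP for [0, 1/2] with value 5/2, so the maximum over both LPs is 5/2 with Up played with probability 1/2 and the follower playing column 1 (Right).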

  20. Real-World Applications
      • Security Games
        ➢ Defender (leader) has $l$ identical patrol units
        ➢ Defender wants to defend a set of $o$ targets $U$
        ➢ In a pure strategy, each resource can protect a subset of targets $T \subseteq U$ from a given collection $\mathcal{T}$
        ➢ A target is covered if it is protected by at least one resource
        ➢ Attacker wants to select a target to attack

  21. Real-World Applications
      • Security Games
        ➢ For each target, the defender and the attacker have two utilities: one if the target is covered, one if it is not.
        ➢ Defender commits to a mixed strategy; attacker follows by choosing a target to attack.

  22. Ah!
      • Q: Because this is a 2-player Stackelberg game, can we just compute the optimal strategy for the defender in polynomial time…?
      • Time is polynomial in the number of pure strategies of the defender
        ➢ In security games, this is $|\mathcal{T}|^l$
        ➢ Exponential in $l$
      • Intricate computational machinery required…
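A quick way to see the blow-up is to enumerate defender pure strategies for a toy instance. The sketch below is illustrative only: the function names and the tiny schedule collection are made up, and it counts ordered assignments of schedules to units, which is the count that grows as the number of schedules raised to the number of units.

```python
from itertools import product


def count_defender_pure_strategies(num_schedules, num_units):
    """Each of the num_units resources independently picks one schedule."""
    return num_schedules ** num_units


def enumerate_coverage(schedules, num_units):
    """Yield the set of covered targets for every assignment of schedules to units."""
    for assignment in product(schedules, repeat=num_units):
        # A target is covered if at least one unit's schedule contains it.
        yield frozenset().union(*assignment)


# Hypothetical toy instance: 4 targets, 3 allowed schedules, 2 patrol units.
toy_schedules = [frozenset({1, 2}), frozenset({2, 3}), frozenset({4})]
print(count_defender_pure_strategies(len(toy_schedules), 2))  # 9
print(count_defender_pure_strategies(10, 5))                  # 100000
```

Even a modest collection of 10 schedules with 5 units already gives 100000 pure strategies, so an LP with one variable per pure strategy quickly becomes impractical.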

  23. LAX

  24. Real-World Applications
      • Protecting entry points to LAX
      • Scheduling air marshals on flights
        ➢ Must return home
      • Protecting the Staten Island Ferry
        ➢ Continuous-time strategies
      • Fare evasion in LA metro
        ➢ Bathroom breaks !!!
      • Wildlife protection in Ugandan forests
        ➢ Poachers are not fully rational
      • Cyber security …
