Game Theory: Lecture #11

Outline:
• Strategic form games
• Best response
• Nash equilibrium

Strategic games
• Setup: Strategic games
  – Set of players, $N = \{1, \dots, n\}$
  – A set of actions $A_i$ for each player $i \in N$
  – This induces the set of action profiles $A = A_1 \times A_2 \times \dots \times A_n$
  – For each player, preferences over action profiles characterized by a function $U_i : A \to \mathbb{R}$
• Descriptive agenda: What is a reasonable prediction of social behavior? Alternatively, what should an agent do in a given game?
• Fundamental challenge: An agent’s desired choice depends heavily on the agent’s model for the other agents in the game
• Previous focus: Zero-sum games
  – Model of other agents: worst-case / adversarial
  – Reasonable choice: Security strategies
  – Expected performance: Security levels
• Are security strategies a reasonable choice beyond zero-sum games?
• Example:

           L       R
      T   2, 2    0, 0
      B   0, 0    ε, ε

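To make the closing question concrete, here is a minimal sketch that computes the row player's security strategy in the 2x2 example by grid search. It assumes mixed security strategies as in the zero-sum setting, and the value EPS = 0.1 for ε and the names `worst_case`, `GRID`, `p_security` are illustrative assumptions, not part of the lecture.

```python
# Row player's payoffs in the 2x2 example; EPS = 0.1 is an assumed value for epsilon.
EPS = 0.1
ROW_PAYOFF = {("T", "L"): 2.0, ("T", "R"): 0.0,
              ("B", "L"): 0.0, ("B", "R"): EPS}

def worst_case(p_top):
    """Row player's guaranteed expected payoff when playing T with probability p_top."""
    against_l = p_top * ROW_PAYOFF[("T", "L")] + (1 - p_top) * ROW_PAYOFF[("B", "L")]
    against_r = p_top * ROW_PAYOFF[("T", "R")] + (1 - p_top) * ROW_PAYOFF[("B", "R")]
    # An adversarial column player picks whichever column hurts the row player more.
    return min(against_l, against_r)

# Coarse grid over the probability of playing T (an assumed resolution).
GRID = [k / 10000 for k in range(10001)]
p_security = max(GRID, key=worst_case)
security_level = worst_case(p_security)
print(p_security, security_level)
```

Under these assumptions the maximizing weight on T sits near ε/(2 + ε), so the adversarial model steers the row player almost entirely toward B, even though (T, L) pays both players 2.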
Nash Equilibrium
• Alternative model: Agents are contingent optimizers
• Definition: The action profile $a^*$ is a Nash equilibrium if for every player $i$,
      $U_i(a^*) = U_i(a_i^*, a_{-i}^*) \ge U_i(a_i, a_{-i}^*)$ for every $a_i \in A_i$.
• Notation:
  – $\{-i\}$ represents all players other than $i$, i.e., $\{-i\} = \{1, \dots, i-1, i+1, \dots, n\}$
  – $a_{-i}$ represents the choice of all players other than $i$, i.e., $a_{-i} = \{a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n\}$
• Compare:
  – Optimization case: An optimizer will play the best action
  – Nash equilibrium: An action profile in which each player is acting as an optimizer
  – The term “rational” implies that an agent is an “optimizer”
• View: A Nash equilibrium is a reasonable outcome associated with rational players
• Alternative definition of Nash equilibrium:
  – Definition: The best response function of player $i$, $B_i(\cdot)$, is
        $B_i(a_{-i}) = \{a_i : U_i(a_i, a_{-i}) \ge U_i(a_i', a_{-i}) \text{ for all } a_i' \in A_i\}$
    Note that the best response “function” is actually a “set”
  – An action profile $a^*$ is a Nash equilibrium if for every player $i$, $a_i^* \in B_i(a_{-i}^*)$, i.e., each player is playing a best response to the actions of the other players
• Nash equilibrium restated: No player has a unilateral incentive to change action

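The definition can be checked mechanically by testing every unilateral deviation. The following is a minimal sketch for finite games; the dictionary representation, the name `pure_nash_equilibria`, and the value EPS = 0.1 in the 2x2 example game are assumptions made for illustration.

```python
from itertools import product

def pure_nash_equilibria(payoffs, action_sets):
    """Return every pure action profile from which no single player gains by deviating.

    payoffs: dict mapping an action profile (tuple) to a tuple of utilities.
    action_sets: list with one list of actions per player (assumed representation).
    """
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, actions in enumerate(action_sets):
            current = payoffs[profile][i]
            for a in actions:
                # Replace only player i's action, keeping a_{-i} fixed.
                deviation = profile[:i] + (a,) + profile[i + 1:]
                if payoffs[deviation][i] > current:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria

EPS = 0.1  # assumed value for epsilon
GAME = {("T", "L"): (2, 2), ("T", "R"): (0, 0),
        ("B", "L"): (0, 0), ("B", "R"): (EPS, EPS)}
ACTIONS = [["T", "B"], ["L", "R"]]
equilibria = pure_nash_equilibria(GAME, ACTIONS)
print(equilibria)
```

Exhaustive checking is only practical for small games, since the number of profiles grows as the product of the action set sizes.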
Descriptive question: What’s the outcome?
• Descriptive agenda: What is a reasonable prediction of social behavior?
  – Does a Nash equilibrium exist?
  – Is a Nash equilibrium unique?
  – Which Nash equilibrium?
  – Why Nash equilibrium?
• Prisoner’s dilemma:
  – Setup: Cooperate vs. Defect?

           C       D
      C   3, 3    0, 4
      D   4, 0    1, 1

  – Also used to model work vs. shirk? arm vs. disarm?
  – What if played several times?
• Bach or Stravinsky: (coordination)

           B       S
      B   2, 1    0, 0
      S   0, 0    1, 2

• Stag hunt: (safety and social cooperation)

              Stag    Hare
      Stag    2, 2    0, 1
      Hare    1, 0    1, 1

• Typewriter: QWERTY vs. Dvorak (social norms and conventions)

             Alt     Std
      Alt    3, 3    0, 0
      Std    0, 0    1, 1

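The existence and uniqueness questions can be explored on the four games of this slide with the best-response characterization $a_i^* \in B_i(a_{-i}^*)$. This is a sketch for two-player games only; the function names and the dictionary layout are hypothetical choices for illustration.

```python
def best_responses(payoffs, player, own_actions, other_action):
    """Set of actions of `player` (0 = row, 1 = column) that maximize its utility."""
    def u(a):
        profile = (a, other_action) if player == 0 else (other_action, a)
        return payoffs[profile][player]
    best = max(u(a) for a in own_actions)
    return {a for a in own_actions if u(a) == best}

def nash_by_best_response(payoffs, rows, cols):
    """Profiles where each action is a best response to the other player's action."""
    return [(r, c) for r in rows for c in cols
            if r in best_responses(payoffs, 0, rows, c)
            and c in best_responses(payoffs, 1, cols, r)]

def bimatrix(rows, cols, table):
    """Build a payoff dict from a row-major list of (u_row, u_col) pairs."""
    cells = iter(table)
    return {(r, c): next(cells) for r in rows for c in cols}

GAMES = {
    "prisoner's dilemma": (["C", "D"], ["C", "D"], [(3, 3), (0, 4), (4, 0), (1, 1)]),
    "bach or stravinsky": (["B", "S"], ["B", "S"], [(2, 1), (0, 0), (0, 0), (1, 2)]),
    "stag hunt": (["Stag", "Hare"], ["Stag", "Hare"], [(2, 2), (0, 1), (1, 0), (1, 1)]),
    "typewriter": (["Alt", "Std"], ["Alt", "Std"], [(3, 3), (0, 0), (0, 0), (1, 1)]),
}

results = {name: nash_by_best_response(bimatrix(rows, cols, table), rows, cols)
           for name, (rows, cols, table) in GAMES.items()}
for name, equilibria in results.items():
    print(name, equilibria)
```

In this sketch the prisoner's dilemma yields a single pure equilibrium, while the three coordination-style games each yield two, which is what makes the "which Nash equilibrium?" question meaningful.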
Nash equilibrium, cont.
• There can be one NE, multiple NE, or no NE
• Examples: Prisoner’s dilemma, BoS, Stag hunt, Typewriter, and matching pennies:

            H        T
      H   1, −1   −1, 1
      T   −1, 1    1, −1

• Curiosity: Equilibrium of what?
• Cournot adjustment process: At stage $k$, player $i$ uses a best response to the move of players $-i$ at stage $k-1$
• Consider matching pennies:
  – Stage 1: (H, H)
  – Stage 2: (H, T)
  – Stage 3: (T, T)
  – Stage 4: (T, H)
  – Stage 5: ...
• A NE is an equilibrium (fixed point) of the Cournot adjustment process
• The Cournot adjustment process need not converge to a NE (e.g., stag hunt)

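The Cournot adjustment process is easy to simulate for two-player games. The sketch below assumes simultaneous updates (both players respond to the previous stage) and breaks ties by taking the first maximizer; the function name `cournot_adjustment` and the chosen starting profiles are illustrative assumptions.

```python
def cournot_adjustment(payoffs, rows, cols, start, stages):
    """Simulate simultaneous best-response updates for a two-player game.

    At each stage, each player best-responds to the other's previous action.
    Ties are broken by taking the first maximizing action (an assumption).
    """
    path = [start]
    for _ in range(stages - 1):
        r_prev, c_prev = path[-1]
        r_next = max(rows, key=lambda r: payoffs[(r, c_prev)][0])
        c_next = max(cols, key=lambda c: payoffs[(r_prev, c)][1])
        path.append((r_next, c_next))
    return path

MATCHING_PENNIES = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
                    ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
STAG_HUNT = {("Stag", "Stag"): (2, 2), ("Stag", "Hare"): (0, 1),
             ("Hare", "Stag"): (1, 0), ("Hare", "Hare"): (1, 1)}

pennies_path = cournot_adjustment(MATCHING_PENNIES, ["H", "T"], ["H", "T"], ("H", "H"), 5)
stag_path = cournot_adjustment(STAG_HUNT, ["Stag", "Hare"], ["Stag", "Hare"], ("Stag", "Hare"), 4)
print(pennies_path)
print(stag_path)
```

Starting the stag hunt from a miscoordinated profile makes the two players swap actions forever, so the process cycles even though the game has pure Nash equilibria; starting at an equilibrium instead leaves the profile unchanged.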
Example: Routing game
[Figure: two parallel roads from S to D, a High road and a Low road]
• Assume $N$ players
• Congestion ($n_H$ and $n_L$ are the numbers of players on each road, $n_H + n_L = N$):
  – High road: $c_H + n_H$
  – Low road: $c_L + n_L$
• Claim: NE when both roads have (almost) the same congestion.
• Characterization of NE:

      High satisfied                          Low satisfied
      $c_H + n_H \le c_L + n_L + 1$           $c_L + n_L \le c_H + n_H + 1$
      $c_H + n_H \le c_L + (N - n_H) + 1$     $c_L + (N - n_H) \le c_H + n_H + 1$
      $2 n_H \le N + c_L - c_H + 1$           $2 n_H \ge N + c_L - c_H - 1$

• For $N = 100$, $c_H = 20$, $c_L = 6$:
      $85 \le 2 n_H \le 87 \Rightarrow n_H = 43$
• For $N = 100$, $c_H = 20$, $c_L = 5$:
      $84 \le 2 n_H \le 86 \Rightarrow 42 \le n_H \le 43$
  Both $n_H = 42$ and $n_H = 43$ are NE

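The characterization can be verified by brute force over every possible split of players. This is a minimal sketch; the function name `routing_equilibria` is hypothetical, and a split counts as a NE when no single player strictly lowers its congestion by switching roads.

```python
def routing_equilibria(n_players, c_high, c_low):
    """Return all values of n_H (players on the high road) that are Nash equilibria."""
    equilibria = []
    for n_high in range(n_players + 1):
        n_low = n_players - n_high
        cost_high = c_high + n_high
        cost_low = c_low + n_low
        # A high-road player who switches would face c_low + n_low + 1.
        high_satisfied = n_high == 0 or cost_high <= cost_low + 1
        # A low-road player who switches would face c_high + n_high + 1.
        low_satisfied = n_low == 0 or cost_low <= cost_high + 1
        if high_satisfied and low_satisfied:
            equilibria.append(n_high)
    return equilibria

print(routing_equilibria(100, 20, 6))
print(routing_equilibria(100, 20, 5))
```

The two calls use the parameter values from the slide, so the printed lists can be compared directly against the inequalities above.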
Example: Routing
[Figure: two parallel roads from S to D; labels on the High road: c(x) = x, c(x) = 2x; label on the Low road: c(x) = 1. The best-response computation on this slide uses $c_H(x) = 2x$ and $c_L(x) = 1$.]
• Setup:
  – Players: Two agents that each control 1/2 unit of splittable traffic.
  – Actions: Each player can split its 1/2 unit of traffic arbitrarily between H and L
  – Cost: The cost of an agent is just the total cost of its traffic
        $J_i(f_1^H, f_2^H) = f_i^H \, c_H(f_1^H + f_2^H) + (0.5 - f_i^H) \, c_L(1 - f_1^H - f_2^H)$
  – Convention: Use $J_i(\cdot)$ for cost and $U_i(\cdot)$ for benefit
• Computing the best response of player 1, $B_1(\cdot)$:
      $B_1(x) = \arg\min_{0 \le f_1^H \le 0.5} \; f_1^H \cdot 2(x + f_1^H) + (0.5 - f_1^H)$
• Taking the derivative and setting it to 0 yields
      $B_1(x) = \frac{1}{4} - \frac{x}{2}$
• Player 1 and player 2 are symmetric, so we have
      $B_2(y) = \frac{1}{4} - \frac{y}{2}$

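The derivative step can be written out explicitly. The sketch below assumes $c_H(x) = 2x$ and $c_L(x) = 1$, as read off the arg-min expression on the slide, and writes $x$ for the other player's high-road flow.

```latex
\begin{align*}
J_1(f_1^H, x) &= 2 f_1^H \,(x + f_1^H) + (0.5 - f_1^H) \\
\frac{\partial J_1}{\partial f_1^H} &= 2x + 4 f_1^H - 1 = 0
  \quad\Longrightarrow\quad f_1^H = \frac{1}{4} - \frac{x}{2} \\
\frac{\partial^2 J_1}{\partial (f_1^H)^2} &= 4 > 0
\end{align*}
```

The positive second derivative indicates the stationary point is a minimum, and for any $x \in [0, 0.5]$ the value $\frac{1}{4} - \frac{x}{2}$ lies in $[0, \frac{1}{4}]$, so the constraint $0 \le f_1^H \le 0.5$ is not binding.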
Example: Routing
• What is the routing profile of the NE?
      $f_1^H = B_1(f_2^H)$
      $f_2^H = B_2(f_1^H)$
• NE is a mutual best response:
      $f_i^H = 1/6$

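The mutual best response is the solution of two linear equations, which can be solved exactly. This is a minimal sketch using Cramer's rule with rational arithmetic; the variable names are illustrative, and the best response $B_i(x) = 1/4 - x/2$ is taken from the routing example.

```python
from fractions import Fraction

def best_response(x):
    """B_i(x) = 1/4 - x/2: best response to an opponent sending x on the high road."""
    return Fraction(1, 4) - Fraction(x) / 2

# Mutual best response: f1 + f2/2 = 1/4 and f1/2 + f2 = 1/4.
a11, a12, b1 = Fraction(1), Fraction(1, 2), Fraction(1, 4)
a21, a22, b2 = Fraction(1, 2), Fraction(1), Fraction(1, 4)

# Solve the 2x2 linear system with Cramer's rule.
det = a11 * a22 - a12 * a21
f1 = (b1 * a22 - a12 * b2) / det
f2 = (a11 * b2 - b1 * a21) / det
print(f1, f2)

# Sanity check: each flow is a best response to the other.
assert best_response(f2) == f1 and best_response(f1) == f2
```

Using `Fraction` avoids floating-point rounding, so the fixed-point check at the end is an exact equality.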
Example: Routing
• Could have also found the NE by iteratively eliminating actions that are not a best response
• Recall: Best response functions
      $f_1^* = B_1(f_2) = \frac{1}{4} - \frac{f_2}{2}$
      $f_2^* = B_2(f_1) = \frac{1}{4} - \frac{f_1}{2}$
• Similar to iterated elimination of strictly dominated strategies:
      $f_i^* \in [0, \frac{1}{2}]$
        ⇓
      $f_i^* \in [0, \frac{1}{4}]$
        ⇓
      $f_i^* \in [\frac{1}{8}, \frac{1}{4}]$
        ⇓
      $f_i^* \in [\frac{1}{8}, \frac{3}{16}]$
        ⇓
      $f_i^* \in [\frac{5}{32}, \frac{3}{16}]$
        ⇓
      $f_i^* \in [\frac{10}{64}, \frac{11}{64}]$
        ...
      $f_i^* = \frac{1}{6}$

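The shrinking intervals can be reproduced by repeatedly mapping the current interval through the best response function. Since $B_i$ is decreasing, the image of $[lo, hi]$ is $[B_i(hi), B_i(lo)]$. This is a sketch with hypothetical names (`shrink`, `intervals`), using exact rational arithmetic.

```python
from fractions import Fraction

def best_response(x):
    """B_i(x) = 1/4 - x/2 from the routing example."""
    return Fraction(1, 4) - Fraction(x) / 2

def shrink(interval):
    """Image of an interval of opponent flows under the (decreasing) best response."""
    lo, hi = interval
    return (best_response(hi), best_response(lo))

# Start from the full action set [0, 1/2] and eliminate non-best-responses five times.
intervals = [(Fraction(0), Fraction(1, 2))]
for _ in range(5):
    intervals.append(shrink(intervals[-1]))

for lo, hi in intervals:
    print(f"[{lo}, {hi}]")
```

`Fraction` reduces 10/64 to 5/32 when printing, so the last interval appears in lowest terms; every interval in the list still contains 1/6.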
Dominated strategies
• Issue: Finding a Nash equilibrium is hard
  – Approach #1: Exhaustively check all joint actions
  – Approach #2: Investigate best response functions
  – Best approach depends on the game of interest
• Certain structures can greatly simplify the analysis
  – Prisoner’s dilemma: Defect did strictly better than the alternative (strict dominance)
  – Second-price sealed bid: Bidding one’s internal valuation did no worse than the alternatives (weak dominance)
• Fact: A strictly dominated strategy cannot be used in a NE
• Why? It is never part of a best response
• Q: Can a weakly dominated strategy be used in a NE? Yes

           L       R
      T   2, 2    1, 1
      B   1, 1    1, 2

  T weakly dominates B, but both (T, L) and (B, R) are NE (see also the auction example)
• Viewpoint: If strictly dominated strategies are not used, we can reduce the game to the remaining strategies.

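The weak-dominance example can be checked directly. The sketch below tests dominance for the row player and the Nash condition for a two-player game; `row_dominates` and `is_nash` are hypothetical helper names introduced for illustration.

```python
GAME = {("T", "L"): (2, 2), ("T", "R"): (1, 1),
        ("B", "L"): (1, 1), ("B", "R"): (1, 2)}
ROWS, COLS = ["T", "B"], ["L", "R"]

def row_dominates(payoffs, a, b, cols, strict):
    """True if row action a dominates row action b (strictly or weakly)."""
    diffs = [payoffs[(a, c)][0] - payoffs[(b, c)][0] for c in cols]
    if strict:
        return all(d > 0 for d in diffs)
    # Weak dominance: never worse, and strictly better against some column.
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

def is_nash(payoffs, rows, cols, profile):
    """True if neither player can strictly gain by deviating alone."""
    r, c = profile
    row_ok = all(payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in rows)
    col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in cols)
    return row_ok and col_ok

print(row_dominates(GAME, "T", "B", COLS, strict=False))
print(row_dominates(GAME, "T", "B", COLS, strict=True))
print(is_nash(GAME, ROWS, COLS, ("B", "R")))
```

The row player is indifferent between T and B when the column player chooses R, which is exactly why the weakly dominated action B can still appear in an equilibrium.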
Iterated elimination of strictly dominated strategies
• Recall example from previous lecture
• Q: What is the NE?
• Successively eliminating dominated strategies can (sometimes) lead to a NE

           L       C       R
      T   4, 3    5, 1    6, 2
      M   2, 1    8, 4    3, 6
      B   3, 0    9, 6    2, 8

• Row player has no (strictly) dominated strategies
• Column player can eliminate C
• Reduced game:

           L       R
      T   4, 3    6, 2
      M   2, 1    3, 6
      B   3, 0    2, 8

• Row player can now eliminate both M and B:

           L       R
      T   4, 3    6, 2

• Column player can now eliminate R
• NE: (T, L) is the sole survivor

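The elimination steps above can be automated. This is a minimal sketch that only considers domination by pure strategies (domination by mixed strategies is not checked); the function names are hypothetical.

```python
def strictly_dominated(payoffs, player, action, own, other):
    """True if another remaining action of `player` beats `action` against
    every remaining action of the opponent (player 0 = row, 1 = column)."""
    def u(a, b):
        profile = (a, b) if player == 0 else (b, a)
        return payoffs[profile][player]
    return any(all(u(alt, b) > u(action, b) for b in other)
               for alt in own if alt != action)

def iterated_elimination(payoffs, rows, cols):
    """Repeatedly remove strictly dominated pure strategies until none remain."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if strictly_dominated(payoffs, 0, r, rows, cols):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if strictly_dominated(payoffs, 1, c, cols, rows):
                cols.remove(c)
                changed = True
    return rows, cols

GAME = {("T", "L"): (4, 3), ("T", "C"): (5, 1), ("T", "R"): (6, 2),
        ("M", "L"): (2, 1), ("M", "C"): (8, 4), ("M", "R"): (3, 6),
        ("B", "L"): (3, 0), ("B", "C"): (9, 6), ("B", "R"): (2, 8)}

survivors = iterated_elimination(GAME, ["T", "M", "B"], ["L", "C", "R"])
print(survivors)
```

When more than one profile survives, the procedure only narrows the search; the remaining profiles would still need to be checked against the Nash condition.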