game theory lecture 12



  1. Game Theory - Lecture #12
     Outline:
     • Randomized actions
     • vNM & Bernoulli payoff functions
     • Mixed strategies & Nash equilibrium
     • Hawk/Dove & mixed strategies

  2. Randomized action profiles
     • Original strategic setup:
       – Set of players, $\{1, 2, \dots, n\}$
       – For each player, a set of actions $A_i$
       – For each player, preferences on action profiles characterized by a payoff function $U_i : A \to \mathbb{R}$
     • Question: How do we extend preferences to lotteries over action profiles?
     • Extension: Strategic game with vNM (von Neumann and Morgenstern) preferences
       – Set of players
       – For each player, a set of actions $A_i$
       – For each player, preferences on lotteries over action profiles characterized by a (vNM) payoff function $U_i : \Delta(A) \to \mathbb{R}$
       Notation: $\Delta(S)$ denotes the set of probability distributions over a set $S$ of outcomes
     • Important special case: vNM preferences given by expected utility over action profiles (Bernoulli payoff)
     • Key observation: Payoff values define preferences over distributions
       – Original setting: preferences ⇔ payoffs over profiles
       – Extension: preferences ⇔ payoffs over profiles ⇒ preferences over distributions
     • Concern: The payoff values now matter beyond the ranking they induce, which moves us further away from the true preferences

  3. Example
     • Original setting: preferences ⇔ payoffs over profiles
       – Fact: Several payoff functions reflect the same preferences

             Left game:              Right game:
                   C       D               C       D
             C   2, 2    0, 3        C   3, 3    0, 4
             D   3, 0    1, 1        D   4, 0    1, 1

       – These are the same game (Prisoner’s dilemma) in terms of the original ordinal preferences
     • Extension: preferences ⇔ payoffs over profiles ⇒ preferences over distributions
       – These are different games in terms of preferences over probability distributions
       – Player 1’s vNM utility depends on the probabilities of $\{CC, CD, DC, DD\}$:
         ∗ Left game: $U_1(p) = p_{CC} \cdot 2 + p_{CD} \cdot 0 + p_{DC} \cdot 3 + p_{DD} \cdot 1$
         ∗ Right game: $U_1(p) = p_{CC} \cdot 3 + p_{CD} \cdot 0 + p_{DC} \cdot 4 + p_{DD} \cdot 1$
         Similar for Player 2
       – Compare the following probability distributions over $(CC, CD, DC, DD)$: $(2/5, 3/5, 0, 0)$ vs $(0, 0, 0, 1)$
     • Payoff values take on heightened importance in the extended setting.
     • Dependence on payoff values can result in peculiar outcomes.
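The comparison at the end of the example can be checked numerically. The sketch below is only an illustration: the names `LEFT_U1`, `RIGHT_U1`, and `expected_payoff` are hypothetical, and exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction as F

# Player 1's payoffs over the joint actions, ordered (CC, CD, DC, DD).
# Names are illustrative, not from the lecture.
LEFT_U1 = (2, 0, 3, 1)
RIGHT_U1 = (3, 0, 4, 1)


def expected_payoff(payoffs, dist):
    """Expected payoff of a probability distribution over (CC, CD, DC, DD)."""
    assert sum(dist) == 1, "probabilities must sum to one"
    return sum(u * p for u, p in zip(payoffs, dist))


lottery = (F(2, 5), F(3, 5), F(0), F(0))  # the first distribution on the slide
sure_dd = (F(0), F(0), F(0), F(1))        # the second distribution: DD for sure

left_lottery = expected_payoff(LEFT_U1, lottery)
left_sure = expected_payoff(LEFT_U1, sure_dd)
right_lottery = expected_payoff(RIGHT_U1, lottery)
right_sure = expected_payoff(RIGHT_U1, sure_dd)

print("left game :", left_lottery, "vs", left_sure)
print("right game:", right_lottery, "vs", right_sure)
```

Under these payoffs, Player 1 ranks the two distributions in opposite orders in the two games, even though both tables encode the same ordinal preferences over pure profiles.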

  4. Expected payoff peculiarities
     • In the new framework, preferences are over probability distributions
     • Issue: Are expected payoffs “reasonable”?
     • Example: Allais paradox
       – Consider the following two lotteries (prizes in millions of dollars):

             Prize:    10      2      0
             A          0      1      0
             a        0.1   0.89   0.01

         Most people prefer A to a.
       – Consider another two lotteries:

             Prize:    10      2      0
             B        0.1      0    0.9
             b          0   0.11   0.89

         Most people prefer B to b.
       – Q: Are there choices of $u(10)$, $u(2)$, $u(0)$ such that expected utilities yield both preferences $A \succ a$ and $B \succ b$?
       – Preference evaluation for $A \succ a$:
         $$u(2) > 0.1\,u(10) + 0.89\,u(2) + 0.01\,u(0)$$
       – Subtract $0.89\,u(2)$ from each side and add $0.89\,u(0)$ to each side:
         $$0.11\,u(2) + 0.89\,u(0) > 0.1\,u(10) + 0.9\,u(0)$$
       – The left side is the expected payoff of lottery b and the right side is that of lottery B, so the expected payoff of b exceeds that of B, contradicting $B \succ b$.
     • Conclusion: A decision maker’s preferences cannot always be represented by an expected payoff function. Nonetheless, we will make use of expected payoffs.
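The algebra in the Allais example says that the expected-utility gap between A and a is always the negative of the gap between B and b, whatever the utility values are. A minimal sketch of that check follows, assuming integer-valued utilities drawn at random purely for illustration; `LOTTERIES`, `expected_utility`, and `gaps` are hypothetical names.

```python
from fractions import Fraction as F
import random

# Probabilities of the prizes (10, 2, 0), in millions, for each lottery.
LOTTERIES = {
    "A": (F(0), F(1), F(0)),
    "a": (F(10, 100), F(89, 100), F(1, 100)),
    "B": (F(10, 100), F(0), F(90, 100)),
    "b": (F(0), F(11, 100), F(89, 100)),
}


def expected_utility(name, u):
    """Expected utility of a lottery for utilities u = (u(10), u(2), u(0))."""
    return sum(p * ui for p, ui in zip(LOTTERIES[name], u))


def gaps(u):
    """Return (EU(A) - EU(a), EU(B) - EU(b))."""
    return (
        expected_utility("A", u) - expected_utility("a", u),
        expected_utility("B", u) - expected_utility("b", u),
    )


# Try many arbitrary utility assignments; the two gaps always cancel,
# so they can never both be strictly positive.
rng = random.Random(0)
for _ in range(1000):
    u = tuple(F(rng.randint(-100, 100)) for _ in range(3))
    gap_Aa, gap_Bb = gaps(u)
    assert gap_Aa == -gap_Bb
```

Exact fractions are used so that the identity holds exactly instead of only up to floating-point error.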

  5. Mixed strategies
     • A mixed strategy is a probability distribution over a player’s actions. Specifically, a player selects $\alpha_i \in \Delta(A_i)$
     • Consequences:
       – Joint action probabilities are products of player probabilities
       – Bernoulli payoff becomes expected utility with independent players
       – New notation: $U_i(\alpha_i, \alpha_{-i})$
     • Continuing previous example:
       – Player 1 chooses $\alpha_1 = (\alpha_{1C}, \alpha_{1D})$
       – Player 2 chooses $\alpha_2 = (\alpha_{2C}, \alpha_{2D})$
       – Resulting probability distribution over joint actions is
         $$(p_{CC}, p_{CD}, p_{DC}, p_{DD}) = (\alpha_{1C}\alpha_{2C},\ \alpha_{1C}\alpha_{2D},\ \alpha_{1D}\alpha_{2C},\ \alpha_{1D}\alpha_{2D})$$
       – Inherited expected utilities:
         ∗ Left game: $U_1(\alpha_1, \alpha_2) = 2 \cdot \alpha_{1C}\alpha_{2C} + 0 \cdot \alpha_{1C}\alpha_{2D} + 3 \cdot \alpha_{1D}\alpha_{2C} + 1 \cdot \alpha_{1D}\alpha_{2D}$
         ∗ Right game: $U_1(\alpha_1, \alpha_2) = 3 \cdot \alpha_{1C}\alpha_{2C} + 0 \cdot \alpha_{1C}\alpha_{2D} + 4 \cdot \alpha_{1D}\alpha_{2C} + 1 \cdot \alpha_{1D}\alpha_{2D}$
         (Likewise for $U_2(\cdot)$)
     • Reconciled viewpoint: New setup is the same as the old setup with
       – Set of players
       – “New” set of actions $\alpha_i \in \Delta(A_i)$
       – “New” payoff functions $U_i(\alpha_i, \alpha_{-i})$, which are the expected values of the original payoff functions assuming independent players
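The product rule and the inherited expected utility can be sketched in a few lines. The mixed strategies chosen below are arbitrary illustrative values, and `joint_distribution` and `expected_utility` are hypothetical helper names; the payoffs are Player 1's from the left game.

```python
from fractions import Fraction as F
from itertools import product

ACTIONS = ("C", "D")

# Player 1's payoffs in the left game, keyed by (row action, column action).
LEFT_U1 = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}


def joint_distribution(alpha1, alpha2):
    """Joint action probabilities when the two players randomize independently."""
    return {
        (a1, a2): alpha1[a1] * alpha2[a2]
        for a1, a2 in product(ACTIONS, repeat=2)
    }


def expected_utility(payoff, alpha1, alpha2):
    """Expected payoff under the product distribution of alpha1 and alpha2."""
    joint = joint_distribution(alpha1, alpha2)
    return sum(payoff[profile] * prob for profile, prob in joint.items())


# Arbitrary example strategies (an assumption for illustration only).
alpha1 = {"C": F(1, 2), "D": F(1, 2)}
alpha2 = {"C": F(1, 4), "D": F(3, 4)}

joint = joint_distribution(alpha1, alpha2)
value = expected_utility(LEFT_U1, alpha1, alpha2)
print(joint)
print(value)
```

Representing each mixed strategy as a dictionary keyed by action keeps the code close to the notation on the slide.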

  6. Mixed strategy best response
     • Define the best response function, $B_i(\cdot)$, as
       $$B_i(\alpha_{-i}) = \{\alpha_i : U_i(\alpha_i, \alpha_{-i}) \ge U_i(\alpha'_i, \alpha_{-i}) \text{ for all } \alpha'_i \in \Delta(A_i)\}$$
       Note that the best response “function” is actually a “set”
     • This definition is exactly as before except:
       – Player actions are replaced with mixed strategies
       – Player utilities are replaced with expected utilities assuming independent players
     • Example: Generic two player/two action game

               L       R
         T   a, A    b, B
         B   c, C    d, D

       – Assume mixed strategies are $\alpha_1 = (p, 1-p)$ for the row player and $\alpha_2 = (q, 1-q)$ for the column player
       – Player 1 must maximize over $p \in [0, 1]$:
         $$p\,\bigl[q \cdot a + (1-q) \cdot b\bigr] + (1-p)\,\bigl[q \cdot c + (1-q) \cdot d\bigr]$$
       – Fact:
         $$B_{\text{row}}(q) = \begin{cases} 1 & q \cdot a + (1-q) \cdot b > q \cdot c + (1-q) \cdot d \\ 0 & q \cdot a + (1-q) \cdot b < q \cdot c + (1-q) \cdot d \\ [0, 1] & q \cdot a + (1-q) \cdot b = q \cdot c + (1-q) \cdot d \end{cases}$$
       – Similar analysis to derive $B_{\text{col}}(p)$
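The three-case formula for the row player's best response translates directly into code. In this sketch the best response set is returned as a closed interval `(low, high)` of values of p, which is a representation chosen here for convenience; `best_response_row` is a hypothetical name, and the matching-pennies payoffs in the usage lines are an illustrative assumption, not an example from the lecture.

```python
from fractions import Fraction as F


def best_response_row(q, a, b, c, d):
    """Best response set for the row player, as an interval (low, high) of p.

    q is the probability the column player puts on L; a, b are the row
    player's payoffs from T against L and R, and c, d those from B.
    """
    payoff_top = q * a + (1 - q) * b
    payoff_bottom = q * c + (1 - q) * d
    if payoff_top > payoff_bottom:
        return (1, 1)  # play T for sure
    if payoff_top < payoff_bottom:
        return (0, 0)  # play B for sure
    return (0, 1)      # indifferent: every p in [0, 1] is a best response


# Illustrative payoffs (matching pennies for the row player): a=1, b=-1, c=-1, d=1.
print(best_response_row(F(3, 4), 1, -1, -1, 1))
print(best_response_row(F(1, 2), 1, -1, -1, 1))
print(best_response_row(F(1, 4), 1, -1, -1, 1))
```

Passing q as an exact fraction avoids missing the indifference case through floating-point rounding.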

  7. Mixed strategy Nash equilibrium
     • The mixed strategy profile $\alpha^* = (\alpha^*_1, \dots, \alpha^*_n)$ is a mixed strategy Nash equilibrium if for every player $i$, $\alpha^*_i \in B_i(\alpha^*_{-i})$
     • Celebrated Nash theorem: Every strategic game with vNM preferences in which each player has finitely many actions has a mixed strategy Nash equilibrium.
     • Nash result due to (advanced) fixed point theory
       – Want to find $(\alpha^*_1, \dots, \alpha^*_n)$ that the best response map sends back to itself: $\alpha^* \in B_1(\alpha^*_{-1}) \times \cdots \times B_n(\alpha^*_{-n})$
       – Illustration: A continuous function $f$ from the closed interval $[0, 1]$ into itself must have a “fixed point”, i.e., an $x \in [0, 1]$ such that $x = f(x)$
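The one-dimensional illustration can be made concrete: if f is continuous and maps [0, 1] into itself, then g(x) = f(x) - x is nonnegative at 0 and nonpositive at 1, so bisection on g locates a fixed point. The sketch below shows only this illustration, not the argument behind Nash's theorem; `fixed_point` is a hypothetical name and cosine is an arbitrary example of such a function.

```python
import math


def fixed_point(f, tol=1e-12):
    """Approximate a fixed point of a continuous f mapping [0, 1] into [0, 1].

    Bisection on g(x) = f(x) - x, keeping g(lo) >= 0 and g(hi) <= 0.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid  # a fixed point lies at or to the right of mid
        else:
            hi = mid  # a fixed point lies to the left of mid
    return (lo + hi) / 2


# cos maps [0, 1] into [cos(1), 1], which is inside [0, 1].
x = fixed_point(math.cos)
print(x, math.cos(x))
```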

  8. Hawk/Dove

               H       D
         H   0, 0    6, 1
         D   1, 6    3, 3

     • Setup:
       – H: hawk = aggressive
       – D: dove = passive
       – Model of the game of “chicken” or a traffic intersection
     • First look: What are the pure (i.e., non-randomized) action NE?
       – Best response function for row player: $B_{\text{row}}(H) = D$ and $B_{\text{row}}(D) = H$
       – Symmetric for column player
       – NE: $(H, D)$ and $(D, H)$
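The pure-action equilibria can also be found by brute force: a profile is a Nash equilibrium when neither player gains from a unilateral deviation. A minimal sketch, with `PAYOFFS` and `pure_nash_equilibria` as hypothetical names:

```python
from itertools import product

ACTIONS = ("H", "D")

# (row payoff, column payoff), keyed by (row action, column action).
PAYOFFS = {
    ("H", "H"): (0, 0),
    ("H", "D"): (6, 1),
    ("D", "H"): (1, 6),
    ("D", "D"): (3, 3),
}


def pure_nash_equilibria(payoffs, actions):
    """List every pure profile from which no player can profitably deviate."""
    equilibria = []
    for row, col in product(actions, repeat=2):
        row_ok = all(
            payoffs[(row, col)][0] >= payoffs[(other, col)][0] for other in actions
        )
        col_ok = all(
            payoffs[(row, col)][1] >= payoffs[(row, other)][1] for other in actions
        )
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
```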

  9. Hawk/Dove: Mixed strategies

               H       D
         H   0, 0    6, 1
         D   1, 6    3, 3

     • Second look: What are the mixed strategy NE?
     • As before, we construct the best response function, but for mixed strategies
       – Row: $\Pr(H) = p$ and $\Pr(D) = 1 - p$
       – Column: $\Pr(H) = q$ and $\Pr(D) = 1 - q$
       – Players select from $\{H, D\}$ independently
     • Best response for row player: Need to maximize expected payoff, i.e.,
       $$\max_{0 \le p \le 1}\ p\,\bigl[0 \cdot q + 6 \cdot (1-q)\bigr] + (1-p)\,\bigl[1 \cdot q + 3 \cdot (1-q)\bigr]$$
       which gives
       $$B_{\text{row}}(q) = \begin{cases} 1 & 0 \cdot q + 6 \cdot (1-q) > 1 \cdot q + 3 \cdot (1-q) \\ [0, 1] & 0 \cdot q + 6 \cdot (1-q) = 1 \cdot q + 3 \cdot (1-q) \\ 0 & 0 \cdot q + 6 \cdot (1-q) < 1 \cdot q + 3 \cdot (1-q) \end{cases}$$
     • Conclusion:
       $$B_{\text{row}}(q) = \begin{cases} 1 & q < 3/4 \\ [0, 1] & q = 3/4 \\ 0 & q > 3/4 \end{cases} \qquad \& \qquad B_{\text{col}}(p) = \begin{cases} 1 & p < 3/4 \\ [0, 1] & p = 3/4 \\ 0 & p > 3/4 \end{cases}$$
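The step from the payoff comparison to the threshold 3/4 is short algebra. As a worked sketch using only the payoffs in the table above, the row player strictly prefers H exactly when:

```latex
\begin{align*}
0 \cdot q + 6(1-q) &> 1 \cdot q + 3(1-q) \\
3(1-q) &> q \\
3 &> 4q \\
q &< \tfrac{3}{4}
\end{align*}
```

Replacing the strict inequality with equality gives the indifference point $q = 3/4$, and the same steps with $p$ in place of $q$ give the column player's threshold by symmetry.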

  10. H/D: Best response plots
     [Figure: best response plots of $B_{\text{row}}(q)$ and $B_{\text{col}}(p)$; horizontal axis $q$, vertical axis $p$, both from 0 to 1]
     • NE occur at the intersections of the best response plots
       – NE of the original pure strategy game are still present
       – New “mixed strategy” NE: $(p^*, q^*) = (3/4, 3/4)$
     • Peculiarity: At the mixed strategy NE, both players are indifferent, i.e.,
       $$B_{\text{row}}(3/4) = [0, 1] \quad \& \quad B_{\text{col}}(3/4) = [0, 1]$$
       i.e., at the NE, any probability mixture of H and D is a best response for each player.
     • Question: Are there other outcomes that could lead to more desirable behavior?
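The indifference at the mixed strategy NE can be verified directly from the Hawk/Dove payoffs: when the column player uses q = 3/4, the row player's expected payoff does not depend on p. In this sketch `row_payoff` is a hypothetical name and the grid of p values is arbitrary.

```python
from fractions import Fraction as F


def row_payoff(p, q):
    """Row player's expected payoff in Hawk/Dove when Pr(H) is p for row, q for column."""
    payoff_hawk = 0 * q + 6 * (1 - q)
    payoff_dove = 1 * q + 3 * (1 - q)
    return p * payoff_hawk + (1 - p) * payoff_dove


q_star = F(3, 4)

# Against q* = 3/4, every p gives the row player the same expected payoff.
values = {row_payoff(F(k, 10), q_star) for k in range(11)}
print(values)

# For comparison: the payoff each player would get if both played D for sure.
both_dove = row_payoff(F(0), F(0))
print(both_dove)
```

Under these payoffs, each player's expected payoff at the mixed NE is lower than what both would receive from (D, D), which is one way to read the closing question on the slide.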
