EC487 Advanced Microeconomics, Part I: Lecture 10 Leonardo Felli 32L.LG.04 1 December 2017
Repeated Games
◮ This is the class of dynamic games that is best understood in game theory.
◮ In each period players face the same normal form stage game.
◮ Players’ payoffs are a weighted discounted average of the payoffs they receive in every stage game.
Leonardo Felli (LSE) EC487 Advanced Microeconomics, Part II 1 December 2017 2 / 66
Repeated Games (cont’d)
Main point of the analysis:
◮ players’ overall payoffs depend on the present and the future stage game payoffs;
◮ the threat of a lower future payoff may therefore induce a player to choose, in the present, a strategy different from the stage game best reply.
Example: the repeated prisoners’ dilemma
◮ Stage game:
    1 \ 2      C         D
    C        1, 1     −1, 2
    D        2, −1     0, 0
◮ The per-period payoff depends on the current action profile: $g_i(a^t)$.
◮ Players’ common discount factor: $\delta$.
◮ It is convenient to label the first period $t = 0$.
Repeated Prisoners’ Dilemma (cont’d)
◮ Since we are going to compare the equilibrium payoffs for different time horizons, we need to re-normalize the payoffs so that they are comparable.
◮ The average discounted payoff for a $T$-period game is:
$$\Pi = \frac{1-\delta}{1-\delta^T} \sum_{t=0}^{T-1} \delta^t g_i(a^t)$$
◮ Clearly, if $g_i(a^t) = 1$ in every period:
$$\Pi = \frac{1-\delta}{1-\delta^T} \sum_{t=0}^{T-1} \delta^t = \frac{1-\delta}{1-\delta^T} \cdot \frac{1-\delta^T}{1-\delta} = 1$$
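The normalization above can be checked numerically. The following is a minimal sketch, assuming $0 \leq \delta < 1$ and a payoff stream given as a Python list; the function name `average_discounted_payoff` is a hypothetical helper, not part of the lecture.

```python
def average_discounted_payoff(stage_payoffs, delta):
    """Normalized discounted average of a finite payoff stream.

    Implements (1 - delta) / (1 - delta**T) * sum_t delta**t * g_t,
    where T is the number of periods. Assumes 0 <= delta < 1.
    """
    T = len(stage_payoffs)
    total = sum(delta**t * g for t, g in enumerate(stage_payoffs))
    return (1 - delta) / (1 - delta**T) * total


# A constant payoff of 1 in every period is normalized back to 1,
# whatever the horizon T.
print(average_discounted_payoff([1] * 5, 0.9))
print(average_discounted_payoff([1] * 50, 0.9))
```

Because of the normalization, payoffs of games with different horizons are expressed in the same units as the stage game payoffs.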
Finitely Repeated Prisoners’ Dilemma
◮ Assume first that the prisoners’ dilemma game is repeated a finite number of times.
◮ Nash equilibrium payoffs of the stage game: $(0, 0)$.
◮ Subgame perfect equilibrium strategies: each player chooses action D independently of the period and of the actions the other player chose in the past.
    1 \ 2      C         D
    C        1, 1     −1, 2
    D        2, −1     0, 0
Proof: backward induction.
◮ Subgame perfection seems to prevent any gain from repeated but finite interaction, but...
Finitely Repeated Game
◮ Consider a different finitely repeated game.
◮ Stage game:
             L        C        R
    T      1, 1     5, 0     0, 0
    M      0, 5     4, 4     0, 0
    B      0, 0     0, 0     3, 3
◮ Nash equilibria of the stage game: $(T, L)$ and $(B, R)$.
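The two equilibria listed for this stage game can be recovered by checking every cell for profitable unilateral deviations. The sketch below covers pure strategies only; the names `PAYOFFS` and `pure_nash_equilibria` are illustrative assumptions.

```python
ROWS = ["T", "M", "B"]   # player 1's actions
COLS = ["L", "C", "R"]   # player 2's actions

# (player 1 payoff, player 2 payoff) for each action profile.
PAYOFFS = {
    ("T", "L"): (1, 1), ("T", "C"): (5, 0), ("T", "R"): (0, 0),
    ("M", "L"): (0, 5), ("M", "C"): (4, 4), ("M", "R"): (0, 0),
    ("B", "L"): (0, 0), ("B", "C"): (0, 0), ("B", "R"): (3, 3),
}


def pure_nash_equilibria(rows, cols, payoffs):
    """Return all profiles where neither player gains by deviating alone."""
    equilibria = []
    for r in rows:
        for c in cols:
            u1, u2 = payoffs[(r, c)]
            # Player 1 cannot do better by switching rows against column c.
            row_ok = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
            # Player 2 cannot do better by switching columns against row r.
            col_ok = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(ROWS, COLS, PAYOFFS))
```

Note that $(M, C)$, with payoffs $(4, 4)$, is not returned: each player would gain by deviating to obtain 5.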
Finitely Repeated Game (cont’d)
Assume the game is played twice and consider the following strategies:
Player 1:
◮ play M in the first period;
◮ in the second period play B if the observed outcome is $(M, C)$;
◮ in the second period play T if the observed outcome is not $(M, C)$.
Player 2:
◮ play C in the first period;
◮ in the second period play R if the observed outcome is $(M, C)$;
◮ in the second period play L if the observed outcome is not $(M, C)$.
Finitely Repeated Game (cont’d)
Proposition: If $\delta \geq \frac{1}{2}$ then these strategies are a subgame perfect equilibrium of the game.
             L        C        R
    T      1, 1     5, 0     0, 0
    M      0, 5     4, 4     0, 0
    B      0, 0     0, 0     3, 3
Proof: Backward induction: in the last period the strategies prescribe a Nash equilibrium. In the first period both player 1 and player 2 conform to the strategies if and only if:
$$\frac{1-\delta}{1-\delta^2}\,[4 + 3\delta] = \frac{4 + 3\delta}{1+\delta} \;\geq\; \frac{1-\delta}{1-\delta^2}\,[5 + \delta] = \frac{5 + \delta}{1+\delta}$$
This is equivalent to $4 + 3\delta \geq 5 + \delta$, so the inequality is satisfied for $\delta \geq \frac{1}{2}$.
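The first-period incentive constraint in the proof can be evaluated directly. This is a minimal sketch, assuming $0 \leq \delta < 1$ and that the most profitable first-period deviation yields 5 today (for example T against C) followed by the $(T, L)$ payoff of 1 tomorrow; `two_period_payoffs` is a hypothetical helper name.

```python
def two_period_payoffs(delta):
    """Average discounted payoffs from conforming and from deviating.

    Conforming yields 4 in the first period and 3 in the second.
    Deviating yields 5 in the first period and 1 in the second.
    Both are normalized by (1 - delta) / (1 - delta**2).
    """
    norm = (1 - delta) / (1 - delta**2)
    conform = norm * (4 + 3 * delta)
    deviate = norm * (5 + delta)
    return conform, deviate


for delta in (0.4, 0.5, 0.6):
    conform, deviate = two_period_payoffs(delta)
    print(delta, conform >= deviate)
```

Below the threshold the one-period gain from deviating outweighs the discounted loss from moving to the worse second-period equilibrium.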
Infinitely Repeated Prisoners’ Dilemma
◮ Consider now the infinitely repeated prisoners’ dilemma: $T = +\infty$.
◮ Stage game:
    1 \ 2      C         D
    C        1, 1     −1, 2
    D        2, −1     0, 0
Proposition: Both players choosing strategy D in every period is an SPE of the repeated game.
◮ Proof: by the one deviation principle. Notice that an infinitely repeated game is continuous at infinity.
Infinitely Repeated Prisoners’ Dilemma (cont’d)
Proposition: The $(D, D)$ equilibrium is the only equilibrium if we restrict players’ strategies to be history independent.
Proposition: If $\delta \geq \frac{1}{2}$ then the following strategy profile $(\sigma_A, \sigma_B)$ is an SPE of the repeated game:
◮ Player $i$ chooses C in the first period.
◮ Player $i$ continues to choose C as long as no player has chosen D in any previous period.
◮ Player $i$ chooses D for the rest of the game if a player has chosen D in the past.
Infinitely Repeated Prisoners’ Dilemma (cont’d)
Proof: If player $i$ conforms to the prescribed strategy the payoff is 1. If a player deviates in one period and conforms to the prescribed strategy from there on (one deviation principle) the continuation payoff is:
$$(1-\delta)(2 + 0 + \dots) = 2(1-\delta)$$
If $\delta \geq \frac{1}{2}$ then $1 \geq 2(1-\delta)$.
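The comparison in the proof can be illustrated by simulating play. The sketch below truncates the infinite horizon at a finite number of periods (an approximation introduced here, harmless because $\delta^t$ vanishes quickly) and represents a strategy as a function from the observed history to an action; the names `grim_trigger`, `deviate_once` and `average_payoff` are illustrative.

```python
# Stage game payoffs (player 1, player 2) of the prisoners' dilemma.
PD = {
    ("C", "C"): (1, 1), ("C", "D"): (-1, 2),
    ("D", "C"): (2, -1), ("D", "D"): (0, 0),
}


def grim_trigger(history):
    """Play C unless some player has chosen D in a previous period."""
    return "D" if any("D" in profile for profile in history) else "C"


def deviate_once(history):
    """Hypothetical deviation: play D in period 0, then follow grim trigger."""
    return "D" if not history else grim_trigger(history)


def average_payoff(strategy1, strategy2, delta, periods=500):
    """Player 1's (1 - delta)-normalized discounted payoff, truncated."""
    history, total = [], 0.0
    for t in range(periods):
        profile = (strategy1(history), strategy2(history))
        total += delta**t * PD[profile][0]
        history.append(profile)
    return (1 - delta) * total


for delta in (0.4, 0.6):
    conform = average_payoff(grim_trigger, grim_trigger, delta)
    deviate = average_payoff(deviate_once, grim_trigger, delta)
    print(delta, round(conform, 6), round(deviate, 6))
```

With $\delta = 0.6$ the deviation payoff $2(1-\delta) = 0.8$ falls short of the cooperative payoff 1, while with $\delta = 0.4$ it equals 1.2 and the deviation is profitable.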
Infinitely Repeated Prisoners’ Dilemma (cont’d)
We still need to check that in the subgame in which both players are choosing D neither player wants to deviate. However, choosing D in every period is an SPE of the entire game, hence it is an SPE of the (punishment) subgames.
Notice that with this type of strategy, choosing $(C, C)$ in every period is not the only SPE outcome: a large number of other SPE outcomes are also achievable.
Infinitely Repeated Prisoners’ Dilemma (cont’d)
Indeed there exists a Folk Theorem.
[Figure: payoff diagram with axes $\Pi_1$ and $\Pi_2$, marking the points $(-1, 2)$, $(1, 1)$, $(0, 0)$ and $(2, -1)$.]
General repeated normal form game
Definition: Let $G$ be a given stage game: a normal form game
$$G = \left\{ N, A_i, g_i(a^t) \right\}$$
Definition: Let $G^\infty$ be the infinitely repeated game associated with the stage game above:
$$G^\infty = \{ N, H, P, U_i(\sigma) \}$$
such that:
◮ $H = \bigcup_{t=0}^{\infty} A^t$ where $A^0 = \emptyset$;
◮ $P(h) = N$ for every $h \in H - Z$;
Repeated Normal form Game (cont’d)
◮ The payoffs for the game $G^\infty$ in the case $\delta < 1$ are:
$$U_i(\sigma) = (1-\delta) \sum_{t=0}^{\infty} \delta^t g_i(\sigma^t(h^t))$$
◮ Denote by $h^t$ the history known to the players at the beginning of period $t$: $h^t = \{ a^0, a^1, \dots, a^{t-1} \}$.
◮ Let $H^t = A^{t-1}$ be the space of all possible period $t$ histories.
◮ A pure strategy for player $i \in \{1, 2\}$ in the game $G^\infty$ is then the infinite sequence of mappings $\{ s_i^t \}_{t=0}^{\infty}$ such that $s_i^t : H^t \to A_i$.
Repeated Normal form Game (cont’d)
In general we will allow players to mix in every possible stage game: let $\Delta(A_i)$ denote the set of probability distributions on $A_i$.
A behavioral mixed strategy in this environment is instead an infinite sequence of mappings $\{ \sigma_i^t \}_{t=0}^{\infty}$ such that $\sigma_i^t : H^t \to \Delta(A_i)$.
Notice that mixed strategies cannot depend on the opponents’ past mixed strategies, but only on their realizations.
The payoffs for the game $G^\infty$ in the case $\delta < 1$ are:
$$U_i = E_\sigma \, (1-\delta) \sum_{t=0}^{\infty} \delta^t g_i(\sigma^t(h^t))$$
Repeated Normal form Game (cont’d)
◮ Notice that $E_\sigma(\cdot)$ is the expectation with respect to the distribution over the infinite histories generated by the profile of mixed behavioral strategies $\{ \sigma_i^t \}_{t=0}^{\infty}$.
◮ Notice that this specification of payoffs allows us to reinterpret the discount factor $\delta$ as the probability that the game will be played in the following period, where these probabilities are assumed to be independent across periods.
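The reinterpretation of $\delta$ as a continuation probability can be made explicit with a short derivation. It is a sketch under two assumptions not stated on the slide: players have no pure time preference, and period 0 is played with certainty.

```latex
% Period t is reached only if the game continues t times in a row,
% each time independently with probability \delta:
\Pr(\text{period } t \text{ is played}) = \delta^t .

% Expected total (undiscounted) payoff under random termination:
E\left[\sum_{t \,\text{played}} g_i(a^t)\right]
  = \sum_{t=0}^{\infty} \delta^t \, g_i(a^t) .

% Multiplying by (1-\delta) gives the payoff used in G^\infty:
U_i = (1-\delta) \sum_{t=0}^{\infty} \delta^t \, g_i(a^t) .
```

Under these assumptions the expected number of periods played is $\sum_{t=0}^{\infty} \delta^t = \frac{1}{1-\delta}$, so the factor $(1-\delta)$ converts the expected total payoff into a per-period average.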
Repeated Normal form Game (cont’d)
We allow the players to coordinate their strategies through the use of a public randomizing device whose realization in period $t$ is $\omega^t$.
Therefore a period $t$ history for player $i$ is:
$$h^t = \{ a^0, \dots, a^{t-1}; \omega^0, \dots, \omega^t \}$$
Proposition: If $\alpha^*$ is a NE strategy profile for the stage game $G$, then the strategy profile “each player $i$ plays $\alpha_i^*$ independently of the history of play” is a NE and an SPE of the infinitely repeated game $G^\infty(\delta)$.