
Learning and Sophistication in Coordination Games



  1. Learning and Sophistication in Coordination Games
     Kyle Hyndman (1), Antoine Terracol (2), Jonathan Vaksmann (3)
     (1) Southern Methodist University, Dallas, TX
     (2) EQUIPPE, Universités de Lille and Centre d’Économie de la Sorbonne, Université Paris 1
     (3) GAINS-TEPP, Université du Maine and Centre d’Économie de la Sorbonne, Université Paris 1
     Workshop AlgeCoFail, Dec. 2009
     Hyndman, Terracol, Vaksmann Learning and Sophistication Paris X Dec. 2009 1 / 22

  2. Introduction I: Motivations
     Behavioral approaches to players’ behavior typically treat people as purely adaptive learners: they best respond to what they have experienced in the past, with no awareness of how their own actions affect their opponents’ behavior.
     Under these approaches, strategic interaction plays no role in games!
     A few recent studies therefore introduce sophistication into players’ behavior: players may realize that their opponents are capable of learning and use this opportunity to play strategically and manipulate them.
     This is how strategic teaching might arise.

  3. Introduction II: Previous Research
     Camerer, Ho and Chong (2002) devised a model of strategic teaching in a population of players: a fraction of them is purely adaptive, as postulated by the usual learning models, and the remaining fraction is fully sophisticated and can teach them.
     Other studies focus on teaching in fixed pairs of players.
     Ehrblatt, Hyndman, Ozbay and Schotter (2009): teaching a rapid learner facilitates convergence to a unique NE.
     Terracol and Vaksmann (2009): more tenacious teachers take the leadership and drive coordination.
     Our goal: to highlight the determinants of strategic behavior.

  4. Experimental Design I: The Experimental Games
     Table: Payoff Matrices (each cell: row player’s payoff, column player’s payoff)

     TP H / TC L                   TP H / TC H
            X        Y                    X        Y
     X    40,45     8,37           X    40,45     0,37
     Y    39,0     12,32           Y    37,0     12,32

     TP L / TC L                   TP L / TC H
            X        Y                    X        Y
     X    20,45     8,37           X    20,45     0,37
     Y    19,0     12,32           Y    17,0     12,32

     Game structure: two pure-strategy Nash equilibria, (X,X) and (Y,Y), and one mixed-strategy Nash equilibrium (MSNE): $\{(0.8, 0.2); (0.8, 0.2)\}$.
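As a rough check of the game structure stated on this slide, the sketch below searches for pure-strategy equilibria and solves the two indifference conditions for the mixed one. It assumes the TP H / TC L payoffs read as (X,X) = 40,45; (X,Y) = 8,37; (Y,X) = 39,0; (Y,Y) = 12,32, row payoff first; the function names are illustrative and not taken from the talk.

```python
# Assumed reading of the TP H / TC L payoff table: entries are indexed
# [row action][column action] with 0 = X and 1 = Y.
ROW = [[40, 8], [39, 12]]   # row player's payoffs
COL = [[45, 37], [0, 32]]   # column player's payoffs


def pure_nash(row, col):
    """List the action pairs from which neither player gains by deviating alone."""
    labels = "XY"
    equilibria = []
    for r in range(2):
        for c in range(2):
            row_stays = row[r][c] >= row[1 - r][c]
            col_stays = col[r][c] >= col[r][1 - c]
            if row_stays and col_stays:
                equilibria.append((labels[r], labels[c]))
    return equilibria


def mixed_nash(row, col):
    """Return (p, q): the probability of X for row and for column in the mixed equilibrium."""
    # p (row's probability of X) makes the column player indifferent between X and Y.
    col_gain_x = col[0][0] - col[0][1]
    col_gain_y = col[1][1] - col[1][0]
    p = col_gain_y / (col_gain_x + col_gain_y)
    # q (column's probability of X) makes the row player indifferent between X and Y.
    row_gain_x = row[0][0] - row[1][0]
    row_gain_y = row[1][1] - row[0][1]
    q = row_gain_y / (row_gain_x + row_gain_y)
    return p, q


print(pure_nash(ROW, COL))
print(mixed_nash(ROW, COL))
```

Under this reading, both players put probability 0.8 on X in the mixed equilibrium, which agrees with the MSNE reported on the slide.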

  5. Experimental Design II: Teaching Incentives
     Teaching as an investment: players are likely to forgo short-run payoffs to teach and get more in the long run.
     Teaching Cost (Optimization Premium for Battalio et al., Ecta 2002):
       $E_i^Y(p) - E_i^X(p) = \theta_i (0.8 - p)$, where $p$ is the probability attached to X
       and $\theta_i = \pi_i(X,X) - \pi_i(X,Y) + \pi_i(Y,Y) - \pi_i(Y,X)$.
     Teaching Premium:
       $\psi_i = \dfrac{\pi_i(X,X) - \pi_i(Y,Y)}{\pi_i(Y,Y)}$, $i$ = Row, Column.

  6. Experimental Design III: Teaching Incentives
     Table: Row Players’ Incentives for Teaching

     Game            $\psi_r$    $\theta_r$
     TP H / TC L     2.33        5
     TP H / TC H     2.33        15
     TP L / TC L     0.67        5
     TP L / TC H     0.67        15

     Column players’ teaching incentives are the same in all games: $\psi_C = 0.4$, $\theta_C = 40$.
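The teaching premium and teaching cost reported for each game can be recomputed from the payoff matrices. The sketch below does so under an assumed reading of those matrices (keys are own action, then opponent's action); the dictionary layout and the function names `psi`, `theta` and `cost_of_playing_x` are illustrative only.

```python
# Assumed reading of the payoff tables: keys are (own action, opponent's action).
ROW_PAYOFFS = {
    "TP H / TC L": {("X", "X"): 40, ("X", "Y"): 8, ("Y", "X"): 39, ("Y", "Y"): 12},
    "TP H / TC H": {("X", "X"): 40, ("X", "Y"): 0, ("Y", "X"): 37, ("Y", "Y"): 12},
    "TP L / TC L": {("X", "X"): 20, ("X", "Y"): 8, ("Y", "X"): 19, ("Y", "Y"): 12},
    "TP L / TC H": {("X", "X"): 20, ("X", "Y"): 0, ("Y", "X"): 17, ("Y", "Y"): 12},
}
# The column player's payoffs are identical in the four games.
COLUMN_PAYOFFS = {("X", "X"): 45, ("X", "Y"): 0, ("Y", "X"): 37, ("Y", "Y"): 32}


def psi(pay):
    """Teaching premium: relative gain of (X,X) over (Y,Y)."""
    return (pay[("X", "X")] - pay[("Y", "Y")]) / pay[("Y", "Y")]


def theta(pay):
    """Scale of the teaching cost, as defined on the slide."""
    return pay[("X", "X")] - pay[("X", "Y")] + pay[("Y", "Y")] - pay[("Y", "X")]


def cost_of_playing_x(pay, p):
    """E^Y(p) - E^X(p), computed directly; p is the probability the opponent plays X."""
    e_x = p * pay[("X", "X")] + (1 - p) * pay[("X", "Y")]
    e_y = p * pay[("Y", "X")] + (1 - p) * pay[("Y", "Y")]
    return e_y - e_x


for game, pay in ROW_PAYOFFS.items():
    print(game, round(psi(pay), 2), theta(pay))
```

The direct payoff difference returned by `cost_of_playing_x` can be compared with $\theta_i (0.8 - p)$ at any belief $p$ as a consistency check on the closed form.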

  7. Experimental Design IV: The Data
     Parisian Experimental Economics Laboratory (LEEP).
     30-40 subjects in each game.
     20 repetitions of each stage game; about 1 hour and €13.5 on average.
     In each period, prior to choosing an action, players are asked (and incentivized) to report their beliefs.

  8. Belief Formation Process (BFP) I
     Precondition for teaching: players might take strategic interactions into account.
     The usual proxies used to describe players’ BFP postulate that strategic considerations play no role.
     Test of a sophistication bias: the impact of players’ previous action on their BFP.

  9. Belief Formation Process (BFP) II: Empirical Strategy
     Usual proxies:
       $B_i^a(t+1) = \dfrac{\mathbf{1}\{a_j(t)=a\} + \sum_{u=1}^{t-1} \gamma^u \, \mathbf{1}\{a_j(t-u)=a\}}{1 + \sum_{u=1}^{t-1} \gamma^u}$, with $0 \le \gamma \le 1$.
       $\gamma = 0 \Rightarrow$ Cournot model; $\gamma = 1 \Rightarrow$ Fictitious Play model.
     Elicited beliefs (using a standard quadratic scoring rule): $b_i^a(t)$.
     Belief differences: $D_i^a(t) = b_i^a(t) - B_i^a(t)$.
     Empirical strategy: a positive impact of $\mathbf{1}\{a_i(t-1)=a\}$ on $D_i^a(t)$ indicates the presence of a sophistication bias.
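The $\gamma$-weighted belief proxy on this slide can be sketched in a few lines. The function below is an illustration only: the name `weighted_belief` and the list-based history format are assumptions, with `history[k]` standing for the opponent's action $a_j(k+1)$.

```python
def weighted_belief(history, action, gamma):
    """Proxy belief B_i^a(t+1) that the opponent plays `action` next period.

    `history` lists the opponent's past actions a_j(1), ..., a_j(t), oldest first.
    An observation that is u periods old receives weight gamma ** u.
    """
    if not history:
        raise ValueError("at least one observed action is required")
    t = len(history)
    numerator = 1.0 if history[-1] == action else 0.0
    denominator = 1.0
    for u in range(1, t):
        weight = gamma ** u
        if history[t - 1 - u] == action:
            numerator += weight
        denominator += weight
    return numerator / denominator


def belief_difference(stated_belief, history, action, gamma):
    """D_i^a(t): elicited belief minus the proxy built from the opponent's history."""
    return stated_belief - weighted_belief(history, action, gamma)


opponent_actions = ["X", "Y", "X", "X"]  # hypothetical history, not experimental data
print(weighted_belief(opponent_actions, "X", 0.0))  # Cournot: only the last action counts
print(weighted_belief(opponent_actions, "X", 1.0))  # fictitious play: empirical frequency
```

With $\gamma = 0$ the proxy is the indicator of the opponent's last action, and with $\gamma = 1$ it is the empirical frequency of the action, matching the two limiting cases named on the slide.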

  10. Belief Formation Process (BFP): Results
     Table: Random-Effects Panel Regression: The Sophistication Bias

                       TP H / TC L   TP H / TC H   TP L / TC L   TP L / TC H
     All               0.149***      0.210***      0.137***      0.187***
                       (0.032)       (0.041)       (0.042)       (0.062)
     Row players       0.138***      0.230***      0.167**       0.173*
                       (0.046)       (0.068)       (0.065)       (0.091)
     Column players    0.163***      0.195***      0.098**       0.199**
                       (0.045)       (0.049)       (0.048)       (0.086)

     * 10% level of significance; ** 5% level of significance; *** 1% level of significance. Robust standard errors in parentheses.

  11. Choice Behavior
     A player over-responds to a given action when he plays this action despite the fact that it is not a best response to his static beliefs.
     Table: Frequency of Choice Behaviour Categorised By Best Response (rows: action played; columns: best response, BR)

     ROW PLAYERS
     TP H / TC L                  TP H / TC H
          BR = X   BR = Y              BR = X   BR = Y
     X    0.25     0.38           X    0.31     0.26
     Y    0.02     0.36           Y    0.01     0.42

     TP L / TC L                  TP L / TC H
          BR = X   BR = Y              BR = X   BR = Y
     X    0.37     0.23           X    0.29     0.17
     Y    0.04     0.36           Y    0.06     0.48

     The numbers in each matrix should sum to 1, modulo rounding.

  12. Choice Behavior
     Table: Frequency of Choice Behaviour Categorised By Best Response (rows: action played; columns: best response, BR)

     COLUMN PLAYERS
     TP H / TC L                  TP H / TC H
          BR = X   BR = Y              BR = X   BR = Y
     X    0.27     0.24           X    0.37     0.19
     Y    0.04     0.45           Y    0.02     0.43

     TP L / TC L                  TP L / TC H
          BR = X   BR = Y              BR = X   BR = Y
     X    0.39     0.18           X    0.29     0.20
     Y    0.03     0.40           Y    0.04     0.47

     The numbers in each matrix should sum to 1, modulo rounding.
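The classification behind these frequency tables can be illustrated as follows. This is a sketch under stated assumptions: the best response is taken to be X whenever the stated belief that the opponent plays X exceeds 0.8 (the mixed-equilibrium probability), ties go to Y by an arbitrary convention, and the observations are hypothetical rather than data from the experiment.

```python
from collections import Counter


def best_response(belief_x, threshold=0.8):
    """Best response to a stated belief that the opponent plays X.

    Assumption: X is the best response when the belief exceeds the threshold;
    a belief exactly at the threshold is assigned to Y, an arbitrary choice.
    """
    return "X" if belief_x > threshold else "Y"


def choice_table(observations):
    """Frequency of each (action played, best response) cell.

    `observations` is a list of (action, belief_x) pairs.
    """
    counts = Counter((action, best_response(belief)) for action, belief in observations)
    total = len(observations)
    return {cell: count / total for cell, count in counts.items()}


# Hypothetical observations: (action played, stated belief that the opponent plays X).
sample = [("X", 0.9), ("X", 0.5), ("Y", 0.3), ("X", 0.7)]
table = choice_table(sample)
# Over-response to X: X is played although Y is the best response.
over_response_to_x = table.get(("X", "Y"), 0.0)
print(table)
print(over_response_to_x)
```

The cell `("X", "Y")` corresponds to the upper-right entry of each matrix: X is played while Y is the best response to stated beliefs.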

  13. Choice Behavior
     Table: Two-sample t-tests Across Games: Frequency of Over-Response to X

     ROW PLAYERS
                     TP H / TC L   TP H / TC H   TP L / TC L   TP L / TC H
     TP H / TC L     -             1.75*         2.79***       4.19***
     TP H / TC H     -             -             0.83          2.03**
     TP L / TC L     -             -             -             1.26
     TP L / TC H     -             -             -             -

     COLUMN PLAYERS
                     TP H / TC L   TP H / TC H   TP L / TC L   TP L / TC H
     TP H / TC L     -             0.94          1.52          0.56
     TP H / TC H     -             -             0.54          0.30
     TP L / TC L     -             -             -             -0.79
     TP L / TC H     -             -             -             -

  14. Choice Behavior: Dynamic pattern
     [Figure: Proportion of over-responses to X, row players. Fitted values (0 to .6) plotted against round (0 to 20), one series per game: TP H / TC L, TP H / TC H, TP L / TC L, TP L / TC H.]

  15. Choice Behavior: Dynamic pattern
     [Figure: Proportion of over-responses to X, column players. Fitted values (0 to .6) plotted against round (0 to 20), one series per game: TP H / TC L, TP H / TC H, TP L / TC L, TP L / TC H.]

  16. Coordination
     [Figure: Proportion of efficient coordination plotted against round (0 to 20), one panel per game: TP H / TC L, TP H / TC H, TP L / TC L, TP L / TC H.]

  17. Tracking players’ behavior I: A Model of Sophisticated Learning I
     Players see their opponent as a $\gamma$-learner: teachers can shape their opponent’s beliefs and actions, and are allowed to re-evaluate their opponent’s responsiveness (the parameter $\gamma$) in each period on the basis of the information gathered.
     $\Rightarrow$ Continuation strategies: $\sigma_i(t) = (a_i(t), a_i(t+1), \ldots, a_i(T))$.

  18. Tracking players’ behavior II: A Model of Sophisticated Learning II
     Players seek to maximize their intertemporal expected payoffs:
       $E_i(\sigma_i^a(t)) = b_i^X(t)\,\pi_i(a,X) + (1 - b_i^X(t))\,\pi_i(a,Y) + \sum_{u=t+1}^{T} \delta^{u-t} \sum_{z=X,Y} b_i^z(u \mid \sigma_i^a(t))\,\pi_i(a,z)$
     When $\delta = 0$, the discounted sum over future periods (the part shown in red on the original slide) vanishes and the model reduces to the adaptive/myopic model.
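The intertemporal payoff can be evaluated numerically once a path of predicted beliefs is available. The sketch below takes that path as an input, since the slide does not show how the beliefs $b_i^z(u \mid \cdot)$ are predicted; the function name, the argument layout and the numbers are hypothetical. It evaluates each future period at the action the continuation strategy prescribes for that period, which coincides with the slide's expression when the strategy repeats the same action $a$.

```python
def intertemporal_payoff(payoff, actions, beliefs_x, delta):
    """Discounted expected payoff of a continuation strategy.

    payoff[a][z]  : stage payoff from own action a against opponent action z.
    actions       : own actions from period t to T (the continuation strategy).
    beliefs_x     : belief that the opponent plays X in each of those periods,
                    assumed to be already conditioned on the strategy.
    delta         : discount factor; delta = 0 keeps only the current period.
    """
    if len(actions) != len(beliefs_x):
        raise ValueError("one belief is required per period")
    total = 0.0
    for k, (action, belief) in enumerate(zip(actions, beliefs_x)):
        stage = belief * payoff[action]["X"] + (1 - belief) * payoff[action]["Y"]
        total += (delta ** k) * stage  # k = u - t, so the current period has weight 1
    return total


# Row player's payoffs in TP H / TC L (assumed reading of the payoff table).
ROW_PAYOFF = {"X": {"X": 40, "Y": 8}, "Y": {"X": 39, "Y": 12}}

# Hypothetical two-period example: playing X twice raises the belief from 0.5 to 0.9.
myopic = intertemporal_payoff(ROW_PAYOFF, ["X", "X"], [0.5, 0.9], delta=0.0)
patient = intertemporal_payoff(ROW_PAYOFF, ["X", "X"], [0.5, 0.9], delta=0.5)
print(myopic, patient)
```

Setting `delta=0.0` leaves only the current-period expected payoff, which is the myopic case described on the slide.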

  19. Tracking players’ behavior III: A Model of Sophisticated Learning III
     As is usual in this kind of model, we assume that players optimize stochastically.
     Players’ choice probabilities:
       $P_i^X(t) = \dfrac{\exp\left(\lambda \left[ E_i(\sigma_i^X(t)) - E_i(\sigma_i^Y(t)) \right]\right)}{1 + \exp\left(\lambda \left[ E_i(\sigma_i^X(t)) - E_i(\sigma_i^Y(t)) \right]\right)}$
       $P_i^Y(t) = 1 - P_i^X(t)$
     where $\lambda > 0$.
     When $\lambda \to 0$, players tend to randomize over the set of actions.
     When $\lambda \to +\infty$, players tend to optimize deterministically.
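The logit choice rule on this slide can be sketched as below. The function name `prob_x` is illustrative, and the two-branch form is only a numerical-stability choice that avoids overflow for large $\lambda$; it computes the same quantity as the formula above.

```python
import math


def prob_x(e_x, e_y, lam):
    """Probability of choosing X given the expected payoffs of the two strategies.

    e_x, e_y : intertemporal expected payoffs of the strategies starting with X and Y.
    lam      : precision parameter; the slide assumes lam > 0, and lam = 0 is
               accepted here only to illustrate the limiting case.
    """
    z = lam * (e_x - e_y)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def prob_y(e_x, e_y, lam):
    """Probability of choosing Y, the complement of prob_x."""
    return 1.0 - prob_x(e_x, e_y, lam)


# Hypothetical expected payoffs: the strategy starting with Y is slightly better.
for lam in (0.0, 0.2, 5.0, 1000.0):
    print(lam, prob_x(24.0, 25.5, lam))
```

As `lam` grows, the probability of the lower-payoff action shrinks toward zero, and at `lam = 0` both actions are equally likely, in line with the two limits stated on the slide.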
