Signaling Games and the Emergence of Linguistic Meaning




  1. Signaling Games, Replicator Dynamics, Reinforcement Learning
     Signaling Games and the Emergence of Linguistic Meaning
     PENG 2012/2013
     Introduction

  2. Language as Convention?
     "A name is a spoken sound significant by convention... I say 'by convention' because no name is a name naturally but only when it has become a symbol."
     Aristotle, De Interpretatione
     "[L]anguages [are] gradually establish'd by human conventions without any explicit promise. In like manner do gold and silver become the common measures of exchange, and are esteem'd sufficient payment for what is of a hundred times their value."
     Hume, Treatise of Human Nature

  3. Language as Convention?
     "[w]e can hardly suppose a parliament of hitherto speechless elders meeting together and agreeing to call a cow a cow and a wolf a wolf."
     Russell, The Analysis of Mind
     "Conventions are like fires: under favourable conditions, a sufficient concentration of heat spreads and perpetuates itself. The nature of the fire does not depend on the original source of heat. Matches may be the best fire starters, but that is no reason to think of fires started otherwise as any the less fires."
     Lewis, Convention

  4. Coordination & Signaling
     Coordination:
               R     L
         R     1     0
         L     0     1
     Signaling:
               a_L   a_S
         t_L   1     0
         t_S   0     1
     Messages: One or two lanterns?
     [Figure: the four pure sender strategies s_1 to s_4 (states t_L, t_S to messages m_1, m_2) and the four pure receiver strategies r_1 to r_4 (messages m_1, m_2 to actions a_L, a_S).]

  5. ◮ a signaling game is a tuple $SG = \langle \{S, R\}, T, \Pr, M, A, U \rangle$
     ◮ a Lewis game is defined by:
       ◮ $T = \{t_L, t_S\}$
       ◮ $M = \{m_1, m_2\}$
       ◮ $A = \{a_L, a_S\}$
       ◮ $\Pr(t_L) = \Pr(t_S) = .5$
       ◮ $U(t_i, a_j) = 1$ if $i = j$, and $0$ otherwise:
               a_L   a_S
         t_L   1     0
         t_S   0     1
     [Figure: game tree. Nature (N) picks t_L or t_S with probability .5 each, the sender S picks m_1 or m_2, and the receiver R picks a_L or a_S. The leaf payoffs are 1 0 1 0 under t_L and 0 1 0 1 under t_S.]

  6. Pure Strategies
     Pure strategies are contingency plans according to which players act.
     ◮ sender strategy: $s : T \to M$
     ◮ receiver strategy: $r : M \to A$
     [Figure: the four pure sender strategies s_1 to s_4 (states to messages) and the four pure receiver strategies r_1 to r_4 (messages to actions).]

  7. Signaling Systems
     ◮ signaling systems are combinations of pure strategies. The Lewis game has two: $L_1 = \langle s_1, r_1 \rangle$ and $L_2 = \langle s_2, r_2 \rangle$
       [Figure: L_1 and L_2 drawn as mappings from the states t_L, t_S via the messages m_1, m_2 to the actions a_L, a_S.]
     ◮ signaling systems are strict Nash equilibria of the EU-table:
               r_1   r_2   r_3   r_4
         s_1   1     0     .5    .5
         s_2   0     1     .5    .5
         s_3   .5    .5    .5    .5
         s_4   .5    .5    .5    .5
     ◮ in signaling systems, messages associate states and actions uniquely
     ◮ signaling systems constitute evolutionarily stable states
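
The EU-table can be reproduced by brute force. The following is a minimal Python sketch, assuming the Lewis game as defined on the earlier slide (two equiprobable states, payoff 1 when action and state match). The names `all_maps`, `expected_utility` and `strict_nash` are illustrative, and the strategies come out in enumeration order, which is not the slides' s_1 to s_4 / r_1 to r_4 numbering.

```python
from itertools import product

STATES = ["t_L", "t_S"]
MESSAGES = ["m_1", "m_2"]
ACTIONS = ["a_L", "a_S"]
PRIOR = {"t_L": 0.5, "t_S": 0.5}


def utility(state, action):
    """Lewis-game payoff: 1 if the action matches the state, else 0."""
    return 1.0 if state[-1] == action[-1] else 0.0


def all_maps(domain, codomain):
    """Enumerate every function domain -> codomain as a dict."""
    return [dict(zip(domain, images))
            for images in product(codomain, repeat=len(domain))]


senders = all_maps(STATES, MESSAGES)     # pure sender strategies s: T -> M
receivers = all_maps(MESSAGES, ACTIONS)  # pure receiver strategies r: M -> A


def expected_utility(s, r):
    """U(s, r) = sum over t of Pr(t) * U(t, r(s(t)))."""
    return sum(PRIOR[t] * utility(t, r[s[t]]) for t in STATES)


eu = [[expected_utility(s, r) for r in receivers] for s in senders]


def strict_nash(table):
    """Cells that are strictly best in both their row and their column."""
    cells = []
    for i, row in enumerate(table):
        for j, value in enumerate(row):
            column = [table[k][j] for k in range(len(table))]
            best_in_row = all(value > v for l, v in enumerate(row) if l != j)
            best_in_col = all(value > v for k, v in enumerate(column) if k != i)
            if best_in_row and best_in_col:
                cells.append((i, j))
    return cells


for i, j in strict_nash(eu):
    print(senders[i], receivers[j])
```

Only the two separating sender strategies, each paired with the receiver strategy that inverts it, pass the strictness check. The pooling strategies earn .5 against everything and are never strictly best.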

  8. Signaling Convention
     "Given the definition of signaling systems, we can define a signaling convention as any convention whereby members of a population P who are involved as communicators or audience in a certain signaling problem S do their parts of a certain signaling system $\langle F_c, F_a \rangle$ by acting according to their respective contingency plans. If such a convention exists, we also call $\langle F_c, F_a \rangle$ a conventional signaling system."
     Lewis, Convention

  9. Asymmetric Static Signaling Game
     Given a signaling game $SG = \langle \{S, R\}, T, M, A, \Pr, C, U' \rangle$ as initially defined, the corresponding asymmetric static signaling game $SSG_a = \langle \{S, R\}, \mathcal{S}, \mathcal{R}, U \rangle$ is defined as follows:
     ◮ $S$ is a sender, $R$ is a receiver
     ◮ $\mathcal{S} = \{ s \mid s \in [T \to M] \}$ is the set of the sender's strategies
     ◮ $\mathcal{R} = \{ r \mid r \in [M \to A] \}$ is the set of the receiver's strategies
     ◮ $U : \mathcal{S} \times \mathcal{R} \to \mathbb{R}$ is the utility function, defined as
       \[ U(s, r) = \sum_{t} \Pr(t) \times U'(t, s(t), r(s(t))) \]
     An $SSG_a$ is asymmetric because sender and receiver have different sets of strategies.

  10. Replicator Dynamics
      Consider a very large (effectively infinite) population of agents who play a symmetric static game $\langle \{P_1, P_2\}, S, U : S \times S \to \mathbb{R} \rangle$ against randomly chosen opponents. We can then define:
      ◮ $p(s_i)$: the proportion of agents in the population playing strategy $s_i$
      ◮ $U(s_i) = \sum_{s_j \in S} p(s_j)\, U(s_i, s_j)$: the expected utility for agents playing $s_i$
      ◮ $\bar{U} = \sum_{s_i \in S} p(s_i)\, U(s_i)$: the average fitness of the whole population
      The replicator dynamics (RD) is defined by the following differential equation:
      \[ \frac{dp(s_i)}{dt} = p(s_i)\,[\,U(s_i) - \bar{U}\,] \]
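
The differential equation can be integrated numerically. Below is a minimal sketch using forward-Euler steps. The step size `dt`, the number of steps, the starting proportions and the 2x2 coordination payoff matrix are illustrative assumptions, not values from the slides.

```python
def replicator_step(p, payoff, dt=0.01):
    """One forward-Euler step of dp_i/dt = p_i * (U_i - U_bar)."""
    n = len(p)
    # U(s_i) = sum_j p(s_j) * U(s_i, s_j)
    fitness = [sum(p[j] * payoff[i][j] for j in range(n)) for i in range(n)]
    # U_bar = sum_i p(s_i) * U(s_i)
    average = sum(p[i] * fitness[i] for i in range(n))
    return [p[i] + dt * p[i] * (fitness[i] - average) for i in range(n)]


def simulate(p, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step and return the final population state."""
    for _ in range(steps):
        p = replicator_step(p, payoff, dt)
    return p


# Assumed example: a symmetric coordination game with two strategies.
COORDINATION = [[1.0, 0.0],
                [0.0, 1.0]]

final = simulate([0.6, 0.4], COORDINATION)
print([round(x, 3) for x in final])
```

Starting with a 60/40 split, the majority strategy earns more than the population average and takes over. A 40/60 start would end at the other convention.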

  11. Replicator Dynamics for Asymmetric Games?
      "In an evolutionary setting, we can either model a situation where senders and receivers belong to different populations or model the case where individuals of the same population at different times assume the role of sender and receiver."
      Skyrms, Evolution of the Social Contract
      ◮ the replicator dynamics is defined for symmetric static games
      ◮ there are two possible ways to apply the replicator dynamics to a signaling game:
        1. use a 'two-population' model (sender population & receiver population)
        2. symmetrize an asymmetric static signaling game into a symmetric static signaling game

  12. Result for a 'two-population' model
      [Figure: plot with axes $p(R_2)$ and $p(S_2)$; the points $\langle S_1, R_1 \rangle$ and $\langle S_2, R_2 \rangle$ are marked.]
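
A two-population run can be sketched as follows. The slides do not spell out the two-population equations, so this sketch assumes the standard form in which each population's strategy shares grow in proportion to their payoff advantage against the current mix of the other population. It is restricted to the strategies $S_1, S_2$ and $R_1, R_2$ named on the plot, and the payoffs, starting points and step size are illustrative assumptions.

```python
def two_population_step(x, y, payoff, dt=0.01):
    """One Euler step; x = sender shares, y = receiver shares.

    payoff[i][j] is the common payoff when sender strategy i meets
    receiver strategy j (sender and receiver share the same utility).
    """
    n, m = len(x), len(y)
    sender_fit = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
    receiver_fit = [sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
    average = sum(x[i] * sender_fit[i] for i in range(n))  # same for both sides
    new_x = [x[i] + dt * x[i] * (sender_fit[i] - average) for i in range(n)]
    new_y = [y[j] + dt * y[j] * (receiver_fit[j] - average) for j in range(m)]
    return new_x, new_y


def simulate(x, y, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step and return the final shares."""
    for _ in range(steps):
        x, y = two_population_step(x, y, payoff, dt)
    return x, y


# Payoffs between the two separating strategies, as in the EU-table:
# <S_1, R_1> and <S_2, R_2> succeed, the mixed pairings fail.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]

x, y = simulate([0.7, 0.3], [0.6, 0.4], PAYOFF)   # p(S_2) = .3, p(R_2) = .4
print(round(x[1], 3), round(y[1], 3))
```

From this starting point both $p(S_2)$ and $p(R_2)$ shrink toward 0, so the populations settle on $\langle S_1, R_1 \rangle$. Mirroring the starting point sends them to $\langle S_2, R_2 \rangle$ instead.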

  13. Symmetric Static Signaling Game
      Given an asymmetric static signaling game $SSG_a = \langle \{S, R\}, \mathcal{S}, \mathcal{R}, U' \rangle$ as defined before, the corresponding symmetric static signaling game $SSG_s = \langle \{S, R\}, \mathcal{L}, U \rangle$ is defined as follows:
      ◮ $S$ is a sender, $R$ is a receiver
      ◮ $\mathcal{L} = \{ L_{ij} \mid L_{ij} = (s_i, r_j)\ \forall s_i \in \mathcal{S}, r_j \in \mathcal{R} \}$ is the set of languages
      ◮ $U : \mathcal{L} \times \mathcal{L} \to \mathbb{R}$ is the utility function over languages, defined as
        \[ U(L_{ij}, L_{kl}) = \tfrac{1}{2}\,\big( U'(s_i, r_l) + U'(s_k, r_j) \big) \]
      The languages, with $L_{ii}$ abbreviated as $L_i$:
                r_1    r_2    r_3    r_4
          s_1   L_1    L_12   L_13   L_14
          s_2   L_21   L_2    L_23   L_24
          s_3   L_31   L_32   L_3    L_34
          s_4   L_41   L_42   L_43   L_4

  14. Utility table $U(L_{ij}, L_{kl})$ of the symmetric static signaling game (rows $L_{ij}$, columns $L_{kl}$):
           L1   L12  L13  L14  L21  L2   L23  L24  L31  L32  L3   L34  L41  L42  L43  L4
      L1   1    .5   .75  .75  .5   0    .25  .25  .75  .25  .5   .5   .75  .25  .5   .5
      L12  .5   0    .25  .25  1    .5   .75  .75  .75  .25  .5   .5   .75  .25  .5   .5
      L13  .75  .25  .5   .5   .75  .25  .5   .5   .75  .25  .5   .5   .75  .25  .5   .5
      L14  .75  .25  .5   .5   .75  .25  .5   .5   .75  .25  .5   .5   .75  .25  .5   .5
      L21  .5   1    .75  .75  0    .5   .25  .25  .25  .75  .5   .5   .25  .75  .5   .5
      L2   0    .5   .25  .25  .5   1    .75  .75  .25  .75  .5   .5   .25  .75  .5   .5
      L23  .25  .75  .5   .5   .25  .75  .5   .5   .25  .75  .5   .5   .25  .75  .5   .5
      L24  .25  .75  .5   .5   .25  .75  .5   .5   .25  .75  .5   .5   .25  .75  .5   .5
      L31  .75  .75  .75  .75  .25  .25  .25  .25  .5   .5   .5   .5   .5   .5   .5   .5
      L32  .25  .25  .25  .25  .75  .75  .75  .75  .5   .5   .5   .5   .5   .5   .5   .5
      L3   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5
      L34  .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5
      L41  .75  .75  .75  .75  .25  .25  .25  .25  .5   .5   .5   .5   .5   .5   .5   .5
      L42  .25  .25  .25  .25  .75  .75  .75  .75  .5   .5   .5   .5   .5   .5   .5   .5
      L43  .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5
      L4   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5   .5
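
The 16 x 16 table follows mechanically from the 4 x 4 EU-table and the symmetrization rule $U(L_{ij}, L_{kl}) = \tfrac{1}{2}(U'(s_i, r_l) + U'(s_k, r_j))$. A minimal sketch, with indices 0 to 3 standing in for the slides' 1 to 4 and `symmetrized_utility` as an illustrative name:

```python
# EU-table U'(s_i, r_j) from the signaling-systems slide (rows s_1..s_4).
EU = [
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
]

# A language L_ij pairs sender strategy s_i with receiver strategy r_j.
# Order: L_1, L_12, L_13, L_14, L_21, L_2, ..., L_4 (as in the table).
languages = [(i, j) for i in range(4) for j in range(4)]


def symmetrized_utility(lang_a, lang_b):
    """U(L_ij, L_kl) = 1/2 * (U'(s_i, r_l) + U'(s_k, r_j))."""
    (i, j), (k, l) = lang_a, lang_b
    return 0.5 * (EU[i][l] + EU[k][j])


table = [[symmetrized_utility(a, b) for b in languages] for a in languages]

# Languages that are strict best replies to themselves:
# U(L, L) > U(L', L) for every other language L'.
strict = [
    a for a in range(16)
    if all(table[a][a] > table[b][a] for b in range(16) if b != a)
]
print([languages[a] for a in strict])
```

Only the two signaling systems $L_1$ and $L_2$ (index pairs `(0, 0)` and `(1, 1)`) are strict best replies to themselves in the symmetrized game.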

  15. Result for a 'one-population' model

  16. Behavioral Strategies
      Behavioral strategies are functions that map choice points to probability distributions over the actions available at that choice point.
      ◮ behavioral sender strategy $\sigma : T \to \Delta(M)$
      ◮ behavioral receiver strategy $\rho : M \to \Delta(A)$
      Example:
      \[ \sigma = \begin{bmatrix} t_1 \mapsto \begin{bmatrix} m_1 \mapsto .9 \\ m_2 \mapsto .1 \end{bmatrix} \\[4pt] t_2 \mapsto \begin{bmatrix} m_1 \mapsto .5 \\ m_2 \mapsto .5 \end{bmatrix} \end{bmatrix} \qquad \rho = \begin{bmatrix} m_1 \mapsto \begin{bmatrix} a_1 \mapsto .33 \\ a_2 \mapsto .67 \end{bmatrix} \\[4pt] m_2 \mapsto \begin{bmatrix} a_1 \mapsto 1 \\ a_2 \mapsto 0 \end{bmatrix} \end{bmatrix} \]
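
To make the example concrete, the expected utility of a pair of behavioral strategies can be computed by summing over every state, message and action. This is a minimal sketch that assumes, beyond what the slide states, equiprobable states and the Lewis-game payoff (1 when the indices of state and action match).

```python
PRIOR = {"t_1": 0.5, "t_2": 0.5}  # assumed: equiprobable states

# The example strategies from the slide.
SIGMA = {
    "t_1": {"m_1": 0.9, "m_2": 0.1},
    "t_2": {"m_1": 0.5, "m_2": 0.5},
}
RHO = {
    "m_1": {"a_1": 0.33, "a_2": 0.67},
    "m_2": {"a_1": 1.0, "a_2": 0.0},
}


def utility(state, action):
    """Assumed Lewis-game payoff: 1 if the indices match, else 0."""
    return 1.0 if state[-1] == action[-1] else 0.0


def expected_utility(prior, sigma, rho):
    """Sum of Pr(t) * sigma(m|t) * rho(a|m) * U(t, a) over all t, m, a."""
    return sum(
        prior[t] * p_m * p_a * utility(t, a)
        for t, message_dist in sigma.items()
        for m, p_m in message_dist.items()
        for a, p_a in rho[m].items()
    )


value = expected_utility(PRIOR, SIGMA, RHO)
print(round(value, 3))  # 0.366
```

Under these assumptions the example pair does worse than chance (.366 versus .5), because the receiver mostly answers $m_1$ with $a_2$ while the sender mostly sends $m_1$ in $t_1$.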

  17. Reinforcement Learning
      Reinforcement learning via Pólya urns:
      ◮ the sender has an urn $\mho_t$ for each $t \in T$, filled with balls of types $m \in M$
      ◮ the receiver has an urn $\mho_m$ for each $m \in M$, filled with balls of types $a \in A$
      The urn contents determine the behavioral strategies:
      \[ \sigma(m \mid t) = \frac{m(\mho_t)}{|\mho_t|} \qquad \rho(a \mid m) = \frac{a(\mho_m)}{|\mho_m|} \]
      where $m(\mho_t)$ is the number of balls of type $m$ in urn $\mho_t$ and $|\mho_t|$ is the total number of balls in it (likewise for $\mho_m$).
      After each round, successful communication is reinforced (by adding 10 balls of the appropriate type and removing 4 balls of other types).

  18. Reinforcement Learning
      [Figure: sender S and receiver R with their urns ℧, linked by the messages m_1 and m_2.]
      Sender:
      ◮ the sender has an urn for each state $t \in T$
      ◮ each urn contains balls of each message $m \in M$
      ◮ the sender decides by drawing from urn $\mho_t$
      Receiver:
      ◮ the receiver has an urn for each message $m \in M$
      ◮ each urn contains balls of each action $a \in A$
      ◮ the receiver decides by drawing from urn $\mho_m$
      Both:
      ◮ successful communication → urn update
      ◮ in general a signaling system emerges over time
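
The urn scheme can be sketched in a few lines of Python. The update sizes (add 10, remove 4) follow the previous slide. Everything else is an assumption made for illustration: the initial urn contents (`start`), the floor of one ball per type (`floor`), removing 4 balls from each other type, equiprobable states, and the function names `draw`, `reinforce` and `play`.

```python
import random

STATES = ["t_L", "t_S"]
MESSAGES = ["m_1", "m_2"]
ACTIONS = ["a_L", "a_S"]


def draw(urn, rng):
    """Draw a ball type with probability proportional to its count."""
    types = list(urn)
    return rng.choices(types, weights=[urn[k] for k in types])[0]


def reinforce(urn, chosen, add=10, remove=4, floor=1):
    """Add balls of the chosen type, remove balls of every other type.

    The floor (an assumption) keeps every type drawable.
    """
    for ball in urn:
        if ball == chosen:
            urn[ball] += add
        else:
            urn[ball] = max(floor, urn[ball] - remove)


def play(rounds=2000, start=10, seed=0):
    """Run the learning process; return both urn sets and the success rate."""
    rng = random.Random(seed)
    sender = {t: {m: start for m in MESSAGES} for t in STATES}
    receiver = {m: {a: start for a in ACTIONS} for m in MESSAGES}
    successes = 0
    for _ in range(rounds):
        t = rng.choice(STATES)      # nature picks a state
        m = draw(sender[t], rng)    # sender draws from urn for t
        a = draw(receiver[m], rng)  # receiver draws from urn for m
        if t[-1] == a[-1]:          # action matches state: success
            successes += 1
            reinforce(sender[t], m)
            reinforce(receiver[m], a)
    return sender, receiver, successes / rounds


sender, receiver, rate = play()
print(sender, receiver, rate)
```

Inspecting the final urns shows which message each state has become associated with. Whether and how fast a run settles on one of the two signaling systems depends on the seed and on the assumed parameters.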
