Hawk vs. Dove
Symmetric normal-form games

Example. Hawk-dove game (share V or threaten [possibly fight: −C]). Row player's payoffs in general form:

          H           D
  H   (V − C)/2       V
  D       0          V/2

With V = 2, C = 6:

          H        D
  H    −2, −2    2, 0
  D     0, 2     1, 1

Other instantiations: prisoner's dilemma, chicken (= hawk-dove), matching pennies, stag hunt.

Definition. A game is symmetric when players have the same action sets and payoffs:

  u_i(a_1, ..., a_i, ..., a_j, ..., a_n) = u_j(a_1, ..., a_j, ..., a_i, ..., a_n)

for all i and j. So a 2-player game G = (A, B) is symmetric iff m = n and B = Aᵀ.
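A minimal sketch of this example (in Python with NumPy; the helper name hawk_dove is illustrative and not from the slides) that builds the row player's payoff matrix and checks the symmetry condition B = Aᵀ:

```python
import numpy as np

def hawk_dove(V, C):
    """Row player's payoff matrix for Hawk-Dove with resource value V and fight cost C."""
    return np.array([[(V - C) / 2, V],
                     [0.0,         V / 2]])

A = hawk_dove(V=2, C=6)   # row player's payoffs: [[-2, 2], [0, 1]]
B = A.T                   # column player's payoffs in a symmetric 2-player game
print(A)

# The game (A, B) is symmetric exactly when B equals A transposed.
assert np.array_equal(B, A.T)
```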
Symmetric equilibrium

Definition. Let p be a strategy in an n-player symmetric game. If the n-vector (p, ..., p) is a NE, p is called a symmetric equilibrium.

■ !! Symmetric equilibria can be identified with strategies !!
■ (Theorem.) Every symmetric game has at least one symmetric equilibrium.
■ (Fact.) Symmetric games can have asymmetric equilibria. For example Hawk-Dove:

          H        D
  H    −2, −2    2, 0
  D     0, 2     1, 1

Two asymmetric equilibria, (H, D) and (D, H), and one symmetric equilibrium in which both players play H with probability 1/3 (the mixed strategy (1/3, 2/3)).
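A quick numerical check (a Python/NumPy sketch, not part of the slides) that playing H with probability V/C = 1/3 makes the opponent indifferent between H and D, which is the defining property of the mixed symmetric equilibrium:

```python
import numpy as np

A = np.array([[-2.0, 2.0],
              [ 0.0, 1.0]])      # Hawk-Dove with V = 2, C = 6 (row player's payoffs)

p = np.array([1/3, 2/3])         # candidate symmetric strategy: P(H) = V/C = 1/3

payoffs = A @ p                  # expected payoff of H and of D against p
print(payoffs)                   # approx. [0.667, 0.667] -> both actions equally good
assert np.allclose(payoffs[0], payoffs[1])
```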
Evolutionary game theory
Evolutionary game theory: the idea

■ There are n, say 5, species. An encounter between individuals of different species yields payoffs for both. For row:

           s_1  s_2  s_3  s_4  s_5
     s_1    6    7    0   −1    0
     s_2   −1    5   −1    4    7
A =  s_3    9    0    8    9    6
     s_4    0   −4   −2    3   −3
     s_5    3    0    6    0   −1

■ The population consists of a very large number of individuals, each playing a pure strategy. Individuals interact randomly.
■ We are interested in proportions: p = (p_1, ..., p_5).
■ The fitness of species i is f_i = ∑_{j=1}^5 p_j A_ij = (Ap)_i.
■ The average fitness is f̄ = ∑_{i=1}^5 p_i f_i = pᵀAp.
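These two formulas are matrix-vector products, as in this Python/NumPy sketch (the uniform proportion vector is an assumption chosen purely for illustration, not taken from the slides):

```python
import numpy as np

A = np.array([[ 6,  7,  0, -1,  0],
              [-1,  5, -1,  4,  7],
              [ 9,  0,  8,  9,  6],
              [ 0, -4, -2,  3, -3],
              [ 3,  0,  6,  0, -1]], dtype=float)

p = np.full(5, 1/5)      # proportions (assumed uniform here for illustration)

f = A @ p                # fitness of each species: f_i = (Ap)_i
f_bar = p @ A @ p        # average fitness: p^T A p
print(f, f_bar)
```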
The replicator equation
History of the replicator equation

■ Defined for a single species by Taylor and Jonker (1978), and named by Schuster and Sigmund (1983): "Several evolutionary models in distinct biological fields—population genetics, population ecology, early biochemical evolution and sociobiology—lead independently to the same class of replicator dynamics."
■ The replicator equation is the first game dynamics studied in connection with evolutionary game theory (as developed by Maynard Smith and Price).

Taylor P.D., Jonker L. "Evolutionarily stable strategies and game dynamics" in: Math. Biosci. 1978; 40(1), pp. 145-156.
Schuster P., Sigmund K. "Replicator dynamics" in: J. Theor. Biol. 1983; 100(3), pp. 533-538.
The replicator equation

■ The replicator equation models how n different species grow (or decline) due to mutual interaction.
■ It is assumed that if an individual of species i interacts with an individual of species j, the expected reward for the individual of type i is a constant a_ij. Summarised in a relative score matrix:

          a_11  ...  a_1n
  A =      .    ...   .
          a_n1  ...  a_nn

Proportions

■ The number of individuals of species i is denoted by q_i, or q_i(t).
■ p_i =_Def q_i / q is the proportion of species i, where q = q_1 + ··· + q_n.
■ So p_i ∝ q_i and p_1 + ··· + p_n = 1.
Fitness

■ The fitness of an individual is its expected reward when it encounters a random individual in the population.
■ Example. Suppose

          1 3 1                0.1
  A =     1 2 3     and  p =   0.4
          4 1 3                0.5

The fitness vector, f, can now be computed as follows:

              1 3 1     0.1       1.8
  f = Ap =    1 2 3     0.4   =   2.4
              4 1 3     0.5       2.3

■ Average fitness: f̄(t) = ∑_{i=1}^3 p_i f_i(t) = p·(Ap) = 2.29.
■ Fitness of species 1: f_1 = ∑_{j=1}^3 p_j a_1j = (Ap)_1 = 1.8. So species 1 does worse than average.
■ Species 2 and 3 have fitness 2.4 and 2.3, respectively.
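The same numbers can be reproduced in a few lines (a Python/NumPy sketch, not part of the original slides):

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])
p = np.array([0.1, 0.4, 0.5])

f = A @ p          # fitness vector
f_bar = p @ f      # average fitness
print(f)           # approx. [1.8, 2.4, 2.3]
print(f_bar)       # approx. 2.29
```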
The continuous replicator equation

The continuous replicator equation has an extremely intuitive reading:

  ṗ_i(t) = p_i(t)[f_i(t) − f̄(t)],

where ṗ_i(t) is shorthand for the change of p_i in time: ṗ_i(t) = p_i′(t) = dp_i(t)/dt.

Example 1. Suppose the proportion of species 7 at time t is p_7(t) = 0.2, the fitness of species 7 at time t is f_7(t) = 6, and the average fitness at time t is f̄(t) = 9. How fast does p_7 grow at time t?

Answer. ṗ_7(t) = p_7(t)[f_7(t) − f̄(t)] = 0.2 (6 − 9) = −0.6. □

Example 2. Suppose p_5(t) = 0.2, f_5(t) = 6, and f̄(t) = 4. Same question.

Answer. ṗ_5(t) = p_5(t)[f_5(t) − f̄(t)] = 0.2 (6 − 4) = 0.4. □
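Both examples reduce to one multiplication, as in this small Python sketch (the function name p_dot is ours, not from the slides):

```python
def p_dot(p_i, f_i, f_bar):
    """Continuous replicator equation for one species: dp_i/dt = p_i * (f_i - f_bar)."""
    return p_i * (f_i - f_bar)

print(p_dot(0.2, 6, 9))   # Example 1: approx. -0.6  (species 7 shrinks)
print(p_dot(0.2, 6, 4))   # Example 2: approx.  0.4  (species 5 grows)
```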
The dynamics of the replicator equation

Relative score matrix and start proportions:

          1 3 1                1/3
  A =     1 2 3   ,    p =     1/3  .
          4 1 3                1/3
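One way to trace these dynamics numerically is a simple forward-Euler loop (a Python/NumPy sketch; the step size and time horizon are arbitrary choices, and this is not the author's code):

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])
p = np.full(3, 1/3)              # start proportions

dt = 0.01
for _ in range(10_000):          # integrate the replicator equation up to t = 100
    f = A @ p                    # fitness of each species
    f_bar = p @ f                # average fitness
    p += dt * p * (f - f_bar)    # Euler step of  dp_i/dt = p_i (f_i - f_bar)
    p /= p.sum()                 # renormalise against numerical drift

print(p)                         # approximate state the trajectory has reached
```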
Phase space of the replicator on the previous page

[Figure: phase portrait.] Circled rest points indicate Nash equilibria of the score matrix, interpreted as the payoff matrix of a symmetric game in normal form.
A replicator dynamic in a higher dimension
Rest point, stable point, asymptotically stable point

The continuous replicator equation:

  ṗ_i(t) = p_i(t)[f_i(t) − f̄(t)]

is a system of differential equations. We have p = (p_1, ..., p_n) ∈ ∆_n and ṗ = (ṗ_1, ..., ṗ_n) ∈ R^n.

Definitions:

■ p is called a rest point if ṗ = 0. ("If at p, then stays at p.")
■ A rest point p is called (Lyapunov) stable if for every neighborhood U of p there is another neighborhood U′ of p such that states in U′, if iterated, remain within U. ("If close to p, then always close to p.")
■ A rest point p is called asymptotically stable if p has a neighborhood U such that all proportion vectors in U, if iterated, converge to p. ("If close to p, then convergence to p.")
Relation with Nash equilibria

State p is a Nash equilibrium:

  ∀q: q·(Ap) ≤ p·(Ap)   ⇔   ∀q: q·f ≤ p·f   ⇔   ∀q: q_1 f_1 + ··· + q_n f_n ≤ f̄.

Taking pure q gives f_i ≤ f̄ for all i. Since f̄ = ∑_i p_i f_i, this forces f_i = f̄ for every i with p_i > 0 (check!), which means we have a rest point. A rest point with f_i ≤ f̄ for all i is called saturated.

■ Nash equilibrium ⇔ saturated rest point. Proof. ⇒: take pure q. ⇐: if f_i ≤ f̄ for all i, then no convex combination of the f_i can exceed f̄. □
■ Nash equilibrium ⇒ rest point. (Trivial.)
■ Fully mixed rest point ⇒ Nash equilibrium. (Because fully mixed implies saturated.)
■ Strict Nash equilibrium ⇒ asymptotically stable.
■ Limit point in the interior of ∆_n ⇒ Nash equilibrium.
■ Asymptotically stable in the interior of ∆_n ⇒ isolated trembling-hand perfect Nash equilibrium.
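The saturated-rest-point test is easy to mechanise, as in this Python/NumPy sketch (the helper name and the two test points are illustrative, not from the slides): p is a rest point when p_i(f_i − f̄) = 0 for every i, and saturated when in addition f_i ≤ f̄ for every i.

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])

def rest_and_saturated(A, p, tol=1e-9):
    f = A @ p
    f_bar = p @ f
    rest = bool(np.all(np.abs(p * (f - f_bar)) < tol))    # p_i (f_i - f_bar) = 0 for all i
    saturated = rest and bool(np.all(f <= f_bar + tol))   # additionally f_i <= f_bar for all i
    return rest, saturated

print(rest_and_saturated(A, np.array([1., 0., 0.])))  # (True, False): rest point, not saturated
print(rest_and_saturated(A, np.array([0., 0., 1.])))  # (True, True):  saturated, hence Nash
```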
Not all Nash equilibria are Lyapunov stable

(1, 0, 0) is Nash but not Lyapunov stable. (The picture is merely suggestive, since it only contains a few traces of the dynamics.)