Hawk vs. Dove
Symmetric normal-form games

Example. Hawk-dove game (share V or threaten [possibly fight: −C]). Row player's payoffs in general form:

          H           D
  H   (V − C)/2       V
  D       0          V/2

With V = 2, C = 6:

          H        D
  H    −2, −2    2, 0
  D     0, 2     1, 1

Other instantiations: prisoner's dilemma, chicken (= hawk-dove), matching pennies, stag hunt.

Definition. A game is symmetric when players have the same action sets and payoffs:

  u_i(a_1, ..., a_i, ..., a_j, ..., a_n) = u_j(a_1, ..., a_j, ..., a_i, ..., a_n)

for all i and j. So a 2-player game G = (A, B) is symmetric iff m = n and B = Aᵀ.
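A minimal sketch of this example (in Python with NumPy; the helper name hawk_dove is illustrative and not from the slides) that builds the row player's payoff matrix and checks the symmetry condition B = Aᵀ:

```python
import numpy as np

def hawk_dove(V, C):
    """Row player's payoff matrix for Hawk-Dove with resource value V and fight cost C."""
    return np.array([[(V - C) / 2, V],
                     [0.0,         V / 2]])

A = hawk_dove(V=2, C=6)   # row player's payoffs: [[-2, 2], [0, 1]]
B = A.T                   # column player's payoffs in a symmetric 2-player game
print(A)

# The game (A, B) is symmetric exactly when B equals A transposed.
assert np.array_equal(B, A.T)
```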
Symmetric equilibrium

Definition. Let p be a strategy in an n-player symmetric game. If the n-vector (p, ..., p) is a NE, p is called a symmetric equilibrium.

■ !! Symmetric equilibria can be identified with strategies !!
■ (Theorem.) Every symmetric game has at least one symmetric equilibrium.
■ (Fact.) Symmetric games can have asymmetric equilibria. For example Hawk-Dove:

          H        D
  H    −2, −2    2, 0
  D     0, 2     1, 1

Two asymmetric equilibria, (H, D) and (D, H), and one symmetric equilibrium in which both players play H with probability 1/3 (the mixed strategy (1/3, 2/3)).
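A quick numerical check (a Python/NumPy sketch, not part of the slides) that playing H with probability V/C = 1/3 makes the opponent indifferent between H and D, which is the defining property of the mixed symmetric equilibrium:

```python
import numpy as np

A = np.array([[-2.0, 2.0],
              [ 0.0, 1.0]])      # Hawk-Dove with V = 2, C = 6 (row player's payoffs)

p = np.array([1/3, 2/3])         # candidate symmetric strategy: P(H) = V/C = 1/3

payoffs = A @ p                  # expected payoff of H and of D against p
print(payoffs)                   # approx. [0.667, 0.667] -> both actions equally good
assert np.allclose(payoffs[0], payoffs[1])
```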
Evolutionary game theory
Evolutionary game theory: the idea

■ There are n, say 5, species. An encounter between individuals of different species yields payoffs for both. For row:

           s_1  s_2  s_3  s_4  s_5
     s_1    6    7    0   −1    0
     s_2   −1    5   −1    4    7
A =  s_3    9    0    8    9    6
     s_4    0   −4   −2    3   −3
     s_5    3    0    6    0   −1

■ The population consists of a very large number of individuals, each playing a pure strategy. Individuals interact randomly.
■ We are interested in proportions: p = (p_1, ..., p_5).
■ The fitness of species i is f_i = ∑_{j=1}^5 p_j A_ij = (Ap)_i.
■ The average fitness is f̄ = ∑_{i=1}^5 p_i f_i = pᵀAp.
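These two formulas are matrix-vector products, as in this Python/NumPy sketch (the uniform proportion vector is an assumption chosen purely for illustration, not taken from the slides):

```python
import numpy as np

A = np.array([[ 6,  7,  0, -1,  0],
              [-1,  5, -1,  4,  7],
              [ 9,  0,  8,  9,  6],
              [ 0, -4, -2,  3, -3],
              [ 3,  0,  6,  0, -1]], dtype=float)

p = np.full(5, 1/5)      # proportions (assumed uniform here for illustration)

f = A @ p                # fitness of each species: f_i = (Ap)_i
f_bar = p @ A @ p        # average fitness: p^T A p
print(f, f_bar)
```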
The replicator equation
History of the replicator equation

■ Defined for a single species by Taylor and Jonker (1978), and named by Schuster and Sigmund (1983): "Several evolutionary models in distinct biological fields—population genetics, population ecology, early biochemical evolution and sociobiology—lead independently to the same class of replicator dynamics."
■ The replicator equation is the first game dynamics studied in connection with evolutionary game theory (as developed by Maynard Smith and Price).

Taylor P.D., Jonker L. "Evolutionarily stable strategies and game dynamics" in: Math. Biosci. 1978; 40(1), pp. 145-156.
Schuster P., Sigmund K. "Replicator dynamics" in: J. Theor. Biol. 1983; 100(3), pp. 533-538.
The replicator equation

■ The replicator equation models how n different species grow (or decline) due to mutual interaction.
■ It is assumed that if an individual of species i interacts with an individual of species j, the expected reward for the individual of type i is a constant a_ij. Summarised in a relative score matrix:

          a_11  ...  a_1n
  A =      .    ...   .
          a_n1  ...  a_nn

Proportions

■ The number of individuals of species i is denoted by q_i, or q_i(t).
■ p_i =_Def q_i / q is the proportion of species i, where q = q_1 + ··· + q_n.
■ So p_i ∝ q_i and p_1 + ··· + p_n = 1.
Fitness

■ The fitness of an individual is its expected reward when it encounters a random individual in the population.
■ Example. Suppose

          1 3 1                0.1
  A =     1 2 3     and  p =   0.4
          4 1 3                0.5

The fitness vector, f, can now be computed as follows:

              1 3 1     0.1       1.8
  f = Ap =    1 2 3     0.4   =   2.4
              4 1 3     0.5       2.3

■ Average fitness: f̄(t) = ∑_{i=1}^3 p_i f_i(t) = p·(Ap) = 2.29.
■ Fitness of species 1: f_1 = ∑_{j=1}^3 p_j a_1j = (Ap)_1 = 1.8. So species 1 does worse than average.
■ Species 2 and 3 have fitness 2.4 and 2.3, respectively.
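The same numbers can be reproduced in a few lines (a Python/NumPy sketch, not part of the original slides):

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])
p = np.array([0.1, 0.4, 0.5])

f = A @ p          # fitness vector
f_bar = p @ f      # average fitness
print(f)           # approx. [1.8, 2.4, 2.3]
print(f_bar)       # approx. 2.29
```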
The continuous replicator equation

The continuous replicator equation has an extremely intuitive reading:

  ṗ_i(t) = p_i(t)[f_i(t) − f̄(t)],

where ṗ_i(t) is shorthand for the change of p_i in time: ṗ_i(t) = p_i′(t) = dp_i(t)/dt.

Example 1. Suppose the proportion of species 7 at time t is p_7(t) = 0.2, the fitness of species 7 at time t is f_7(t) = 6, and the average fitness at time t is f̄(t) = 9. How fast does p_7 grow at time t?

Answer. ṗ_7(t) = p_7(t)[f_7(t) − f̄(t)] = 0.2 (6 − 9) = −0.6. □

Example 2. Suppose p_5(t) = 0.2, f_5(t) = 6, and f̄(t) = 4. Same question.

Answer. ṗ_5(t) = p_5(t)[f_5(t) − f̄(t)] = 0.2 (6 − 4) = 0.4. □
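Both examples reduce to one multiplication, as in this small Python sketch (the function name p_dot is ours, not from the slides):

```python
def p_dot(p_i, f_i, f_bar):
    """Continuous replicator equation for one species: dp_i/dt = p_i * (f_i - f_bar)."""
    return p_i * (f_i - f_bar)

print(p_dot(0.2, 6, 9))   # Example 1: approx. -0.6  (species 7 shrinks)
print(p_dot(0.2, 6, 4))   # Example 2: approx.  0.4  (species 5 grows)
```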
The dynamics of the replicator equation

Relative score matrix and start proportions:

          1 3 1                1/3
  A =     1 2 3   ,    p =     1/3  .
          4 1 3                1/3
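One way to trace these dynamics numerically is a simple forward-Euler loop (a Python/NumPy sketch; the step size and time horizon are arbitrary choices, and this is not the author's code):

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])
p = np.full(3, 1/3)              # start proportions

dt = 0.01
for _ in range(10_000):          # integrate the replicator equation up to t = 100
    f = A @ p                    # fitness of each species
    f_bar = p @ f                # average fitness
    p += dt * p * (f - f_bar)    # Euler step of  dp_i/dt = p_i (f_i - f_bar)
    p /= p.sum()                 # renormalise against numerical drift

print(p)                         # approximate state the trajectory has reached
```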
Phase space of the replicator on the previous page

[Figure: phase portrait.] Circled rest points indicate Nash equilibria of the score matrix, interpreted as the payoff matrix of a symmetric game in normal form.
A replicator dynamic in a higher dimension
Rest point, stable point, asymptotically stable point

The continuous replicator equation:

  ṗ_i(t) = p_i(t)[f_i(t) − f̄(t)]

is a system of differential equations. We have p = (p_1, ..., p_n) ∈ ∆_n and ṗ = (ṗ_1, ..., ṗ_n) ∈ R^n.

Definitions:

■ p is called a rest point if ṗ = 0. ("If at p, then stays at p.")
■ A rest point p is called (Lyapunov) stable if for every neighborhood U of p there is another neighborhood U′ of p such that states in U′, if iterated, remain within U. ("If close to p, then always close to p.")
■ A rest point p is called asymptotically stable if p has a neighborhood U such that all proportion vectors in U, if iterated, converge to p. ("If close to p, then convergence to p.")
Relation with Nash equilibria

State p is a Nash equilibrium:

  ∀q: q·(Ap) ≤ p·(Ap)   ⇔   ∀q: q·f ≤ p·f   ⇔   ∀q: q_1 f_1 + ··· + q_n f_n ≤ f̄.

Taking pure q gives f_i ≤ f̄ for all i. Since f̄ = ∑_i p_i f_i, this forces f_i = f̄ for every i with p_i > 0 (check!), which means we have a rest point. A rest point with f_i ≤ f̄ for all i is called saturated.

■ Nash equilibrium ⇔ saturated rest point. Proof. ⇒: take pure q. ⇐: if f_i ≤ f̄ for all i, then no convex combination of the f_i can exceed f̄. □
■ Nash equilibrium ⇒ rest point. (Trivial.)
■ Fully mixed rest point ⇒ Nash equilibrium. (Because fully mixed implies saturated.)
■ Strict Nash equilibrium ⇒ asymptotically stable.
■ Limit point in the interior of ∆_n ⇒ Nash equilibrium.
■ Asymptotically stable in the interior of ∆_n ⇒ isolated trembling-hand perfect Nash equilibrium.
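The saturated-rest-point test is easy to mechanise, as in this Python/NumPy sketch (the helper name and the two test points are illustrative, not from the slides): p is a rest point when p_i(f_i − f̄) = 0 for every i, and saturated when in addition f_i ≤ f̄ for every i.

```python
import numpy as np

A = np.array([[1., 3., 1.],
              [1., 2., 3.],
              [4., 1., 3.]])

def rest_and_saturated(A, p, tol=1e-9):
    f = A @ p
    f_bar = p @ f
    rest = bool(np.all(np.abs(p * (f - f_bar)) < tol))    # p_i (f_i - f_bar) = 0 for all i
    saturated = rest and bool(np.all(f <= f_bar + tol))   # additionally f_i <= f_bar for all i
    return rest, saturated

print(rest_and_saturated(A, np.array([1., 0., 0.])))  # (True, False): rest point, not saturated
print(rest_and_saturated(A, np.array([0., 0., 1.])))  # (True, True):  saturated, hence Nash
```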
Not all Nash equilibria are Lyapunov stable

(1, 0, 0) is Nash but not Lyapunov stable. (The picture is merely suggestive, since it only contains a few traces of the dynamics.)