
Bounded Rationality in Finite Automata, Christos A. Ioannou



  1. Bounded Rationality in Finite Automata Christos A. Ioannou University of Vienna February 18, 2011

  2. Overview

     Table: Prisoner’s Dilemma Matrix

                    Cooperate   Defect
       Cooperate    3,3         0,5
       Defect       5,0         1,1

     Finite Automata: The strategies of the agents are represented by Moore machines.
     Genetic Algorithm: The GA utilizes Darwinian mechanics.
     Bounded Rationality: Machines commit errors in the implementation of actions. Machines commit errors in the perception of actions.

     Christos A. Ioannou (Southampton) Bounded Rationality in Finite Automata February 18, 2011 2 / 29

  3. Related Literature

     Optimization routines
       Genetic Algorithm - Holland (1975)
       Simulated Annealing - Kirkpatrick, Gelatt & Vecchi (1983)
       Tabu Search - Glover & Laguna (1993)

     Finite Automata
       Abreu & Rubinstein (1988)
       Banks & Sundaram (1990)

     Axelrod’s seminal work
       Computational simulations point to Tit-For-Tat.

     Bendor, Kramer & Stout (1991)
       Is then the evolution of cooperation inevitable?

  4. Objectives

     How does bounded rationality impact the evolution of cooperation?
     How do different levels of errors impact the automata?
     What characteristics do the automata exhibit under these different error-levels?
     What are the prevailing (surviving) automata under these different error-levels?
     What automaton would you pick to play this game?

  5. Results

     1. The evolution of cooperation becomes less likely as implementation and perception errors become more likely.
     2. The study identifies a threshold error-level. At and above the threshold error-level, the prevailing structures converge to the open-loop automaton Always-Defect. Below the threshold, the prevailing automata are closed-loop and diverse.
     3. As the likelihood of errors increases, prevailing automata tend to be less complex and to exhibit low reciprocal cooperation and low tolerance to defections.

  6. Thought Experiment

     30 agents are to play the PD game.
     Each agent initially chooses a strategy at random, and the agents play the game against each other in a round-robin structure.
     Once all round-robin matches are complete, the strategies and scores become common knowledge.
     Based on this information, each agent is allowed to adjust her strategy for the next generation.
     Agents choose their new strategies, and a new generation is initiated.

  7. Moore Machines

     A Moore machine for player $i$ in an infinitely repeated game $G$ is a four-tuple $(Q_i, q_i^0, f_i, \tau_i)$ where
       $Q_i$ is a finite set of internal states,
       $q_i^0$ is specified to be the initial internal state,
       $f_i : Q_i \to A_i$ is an output function that assigns an action to every state,
       $\tau_i : Q_i \times A_{-i} \to Q_i$ is the transition function that assigns a state to every pair of a state and the opponent’s action ($A_{-i}$).

  8. Example

     Figure: Grim-Trigger Machine

     $$Q_i = \{q_C, q_D\}, \qquad q_i^0 = q_C, \qquad f_i(q_C) = C \text{ and } f_i(q_D) = D$$

     $$\tau_i(q, a_{-i}) = \begin{cases} q_C & \text{if } (q, a_{-i}) = (q_C, C) \\ q_D & \text{otherwise} \end{cases}$$
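
The four-tuple definition and the Grim-Trigger example can be written down almost literally in code. The following is a minimal sketch, not the author's implementation; the class name `MooreMachine`, the method `play`, and the state labels `"qC"` and `"qD"` are illustrative choices.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MooreMachine:
    """A Moore machine (Q, q0, f, tau) for one player of a repeated game."""
    states: frozenset                        # Q_i: finite set of internal states
    initial: str                             # q_i^0: initial internal state
    output: Dict[str, str]                   # f_i: state -> own action
    transition: Dict[Tuple[str, str], str]   # tau_i: (state, opponent action) -> state

    def play(self, opponent_actions: List[str]) -> List[str]:
        """Return the actions emitted against a fixed sequence of opponent actions."""
        state, emitted = self.initial, []
        for opponent_action in opponent_actions:
            emitted.append(self.output[state])
            state = self.transition[(state, opponent_action)]
        return emitted


# Grim-Trigger: cooperate until the opponent defects once, then defect forever.
GRIM_TRIGGER = MooreMachine(
    states=frozenset({"qC", "qD"}),
    initial="qC",
    output={"qC": "C", "qD": "D"},
    transition={("qC", "C"): "qC", ("qC", "D"): "qD",
                ("qD", "C"): "qD", ("qD", "D"): "qD"},
)

print(GRIM_TRIGGER.play(["C", "C", "D", "C", "C"]))  # ['C', 'C', 'C', 'D', 'D']
```

The machine reacts with a one-period lag: the opponent's defection in the third period is answered from the fourth period onward.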

  9. Errors

     Implementation Errors: The machine of agent $i$ in the PD game commits an implementation error with probability $\epsilon$ when, for any given state $q$, the machine's output function returns the action $f_i(q)$ with probability $1 - \epsilon$ and draws another action “$f_i(q)$”, where $f_i(q) \neq$ “$f_i(q)$”, otherwise.

     Perception Errors: The machine of agent $i$ in the PD game commits a perception error with probability $\delta$ when, for any given opponent's action $a_{-i}$, the machine inputs the opponent's action $a_{-i}$ into the transition function with probability $1 - \delta$ and inputs the opponent's action “$a_{-i}$” into the transition function, where $a_{-i} \neq$ “$a_{-i}$”, otherwise.
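
Because the PD game has only two actions, "another action" always means the opposite one. The sketch below shows one way the two error types could be simulated; the function names `flip`, `noisy_output`, and `noisy_input` are hypothetical, and drawing the two errors independently is an assumption.

```python
import random


def flip(action: str) -> str:
    """Return the other action of the two-action PD game."""
    return "D" if action == "C" else "C"


def noisy_output(intended: str, epsilon: float, rng: random.Random) -> str:
    """Implementation error: with probability epsilon the opposite action is played."""
    return flip(intended) if rng.random() < epsilon else intended


def noisy_input(observed: str, delta: float, rng: random.Random) -> str:
    """Perception error: with probability delta the opponent's action is misread."""
    return flip(observed) if rng.random() < delta else observed


# Empirical check: the share of mis-implemented actions should be close to epsilon.
rng = random.Random(0)
trials = 100_000
errors = sum(noisy_output("C", 0.04, rng) == "D" for _ in range(trials))
print(errors / trials)
```

An implementation error changes the action actually played (and hence the payoffs), whereas a perception error changes only the input to the transition function.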

  10. The Genetic Algorithm

      The GAs operate on 3 fundamental principles. The algorithms require:
      1. A coding of the parameter set.
      2. The assignment of a measure of performance on each machine.
      3. The imposition of genetic operators onto the machines.

      The selection-dynamics reflect the limited ability of the agents to receive, decode and act upon the information they get in the course of the evolution.

  11. Coding of Strings

      A Moore machine is defined by a string of 25 elements.

      Figure: Tit-For-Tat Machine

      start    state 0  state 1  state 2  state 3  state 4  state 5  state 6  state 7
      0        1 0 1    0 0 1    0 0 0    0 0 0    0 0 0    0 0 0    0 0 0    0 0 0
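
The slide does not spell out what each of the 25 elements means. One reading that reproduces Tit-For-Tat from the string shown is: the first element is the start state, and each of the eight states contributes a triple (action, next state if the opponent cooperates, next state if the opponent defects), with action 1 = Cooperate and 0 = Defect. The sketch below decodes a string under that assumed layout; `decode` and `reachable` are hypothetical helper names.

```python
def decode(genome, n_states=8):
    """Split a 25-element string into a start state and per-state triples.

    Assumed layout (not stated on the slide): element 0 is the start state and
    each state contributes (action, next state on C, next state on D), with
    action 1 = Cooperate and 0 = Defect.
    """
    if len(genome) != 1 + 3 * n_states:
        raise ValueError("expected 1 + 3 * n_states elements")
    start = genome[0]
    table = {}
    for s in range(n_states):
        action, on_c, on_d = genome[1 + 3 * s: 4 + 3 * s]
        table[s] = ("C" if action == 1 else "D", on_c, on_d)
    return start, table


def reachable(start, table):
    """Return the set of states that can be visited from the start state."""
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(table[s][1:])
    return seen


TFT = [0, 1, 0, 1, 0, 0, 1] + [0] * 18
start, table = decode(TFT)
print(start, table[0], table[1], sorted(reachable(start, table)))
# 0 ('C', 0, 1) ('D', 0, 1) [0, 1]
```

Under this reading only states 0 and 1 are reachable, so the remaining six triples are unused padding, which matches the two-state Tit-For-Tat machine.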

  12. The (Pseudo)code

      Specify error-level
      Fix max-periods = 200
      Create initial population: 30 agents (seed randomly)
      Initiate round-robin tournament
      For t = 1 to 500 do
        For all agent-pairs do
          For p = 1 to max-periods do
            Award utils to each agent based on the PD matrix
          End loop
          Output performance score
        End loop
        Apply subroutine for the offspring-population-creation
        Store agent results
      End loop

  13. Subroutine for the Creation of the Offspring Population

      Sort agents based on performance score
      Copy top 20 agents to offspring-population
      Select 10 agent-pairs via probabilities biased by performance scores
      For each of 10 pairs do
        Create new agent as a copy of the winner of the pair’s match
        Mutate new agent by switching one element at random
      End loop
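
The pseudocode and the offspring subroutine can be combined into a small runnable sketch. Several details are assumptions rather than the author's specification: the 25-element layout (start state, then an action and two transitions per state), selection probabilities proportional to scores, the "winner" being the agent with the higher payoff in the pair's own match, and a mutation that redraws one element to a different valid value. All function names are hypothetical.

```python
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
N_STATES = 8


def random_genome(rng):
    # Assumed layout: [start, (action, next_on_C, next_on_D) for each state].
    genome = [rng.randrange(N_STATES)]
    for _ in range(N_STATES):
        genome += [rng.randrange(2), rng.randrange(N_STATES), rng.randrange(N_STATES)]
    return genome


def flip(action):
    return "D" if action == "C" else "C"


def play_match(g1, g2, periods, eps, delta, rng):
    """Total payoffs of two machines over `periods` rounds with both error types."""
    s1, s2, u1, u2 = g1[0], g2[0], 0, 0
    for _ in range(periods):
        a1 = "C" if g1[1 + 3 * s1] == 1 else "D"
        a2 = "C" if g2[1 + 3 * s2] == 1 else "D"
        if rng.random() < eps:                 # implementation error, player 1
            a1 = flip(a1)
        if rng.random() < eps:                 # implementation error, player 2
            a2 = flip(a2)
        p1, p2 = PAYOFF[(a1, a2)]
        u1, u2 = u1 + p1, u2 + p2
        seen_by_1 = flip(a2) if rng.random() < delta else a2   # perception errors
        seen_by_2 = flip(a1) if rng.random() < delta else a1
        s1 = g1[1 + 3 * s1 + (1 if seen_by_1 == "C" else 2)]
        s2 = g2[1 + 3 * s2 + (1 if seen_by_2 == "C" else 2)]
    return u1, u2


def mutate(genome, rng):
    """Switch one randomly chosen element to a different valid value."""
    child = list(genome)
    i = rng.randrange(len(child))
    is_action = i > 0 and (i - 1) % 3 == 0
    limit = 2 if is_action else N_STATES
    child[i] = rng.choice([v for v in range(limit) if v != child[i]])
    return child


def next_generation(pop, scores, match, rng, n_keep=20, n_new=10):
    """Copy the top agents, then add mutated copies of pairwise match winners."""
    idx = range(len(pop))
    order = sorted(idx, key=lambda i: scores[i], reverse=True)
    offspring = [list(pop[i]) for i in order[:n_keep]]
    for _ in range(n_new):
        i = rng.choices(idx, weights=scores)[0]
        j = i
        while j == i:                          # a pair needs two distinct agents
            j = rng.choices(idx, weights=scores)[0]
        winner = i if match[(i, j)] >= match[(j, i)] else j
        offspring.append(mutate(pop[winner], rng))
    return offspring


def evolve(generations=500, periods=200, n_agents=30, eps=0.04, delta=0.04, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(n_agents)]
    for _ in range(generations):
        scores, match = [0] * n_agents, {}
        for i in range(n_agents):              # round-robin tournament
            for j in range(i + 1, n_agents):
                u_i, u_j = play_match(pop[i], pop[j], periods, eps, delta, rng)
                match[(i, j)], match[(j, i)] = u_i, u_j
                scores[i] += u_i
                scores[j] += u_j
        pop = next_generation(pop, scores, match, rng)
    return pop


final = evolve(generations=3, periods=50)      # shortened run for illustration
print(len(final), len(final[0]))               # 30 25
```

The defaults of `evolve` mirror the slide's parameters (500 generations, 200 periods, 30 agents); the demonstration call uses a much shorter run because the full setting takes a long time in pure Python.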

  14. Time-Homogeneous Markov Chains

      A population is inhabited by clones playing the PD game.
      Consider a system that in time-step $n$ can be in one of four possible states in state space $S = (s_1, s_2, s_3, s_4)$.
      Let a strategy have transition rule $p = (p_1, p_2, p_3, p_4)$, where $0 \le p_i \le 1$ denotes the probability of cooperating after the corresponding outcome of the previous period.
      $s_i$ is 1 if the strategy plays Cooperate, and 0 if the strategy plays Defect, after outcome $i$ ($i = 1, 2, 3, 4$).

  15. Transition Rule for $S_{TFT}$ & $S_{ALLD}$ with Errors

      Let $\epsilon$ be the probability of committing an implementation error.
      Let $\delta$ be the probability of committing a perception error.

      $$S_{TFT}: (1, 0, 1, 0) \to \big(1 - \delta - \epsilon(1 - 2\delta),\; \delta + \epsilon(1 - 2\delta),\; 1 - \delta - \epsilon(1 - 2\delta),\; \delta + \epsilon(1 - 2\delta)\big)$$

      $$S_{ALLD}: (0, 0, 0, 0) \to (\epsilon, \epsilon, \epsilon, \epsilon)$$
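
The noisy Tit-For-Tat entries can be checked by enumerating the ways the machine ends up cooperating, assuming the perception and implementation errors are drawn independently. After the opponent cooperates, Tit-For-Tat cooperates either when it perceives and implements correctly, or when it makes both errors at once; after the opponent defects, exactly one of the two errors must occur:

```latex
\begin{align*}
\Pr(C \mid a_{-i} = C) &= (1-\delta)(1-\epsilon) + \delta\,\epsilon
  = 1 - \delta - \epsilon(1 - 2\delta), \\
\Pr(C \mid a_{-i} = D) &= \delta(1-\epsilon) + (1-\delta)\,\epsilon
  = \delta + \epsilon(1 - 2\delta).
\end{align*}
```

For example, at $\epsilon = \delta = 0.04$ these evaluate to $0.9232$ and $0.0768$. Always-Defect ignores its input, so only the implementation error matters and it cooperates with probability $\epsilon$ after every outcome.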

  16. Transition Matrix

      Rule $p = (p_1, p_2, p_3, p_4)$ matched against rule $q = (q_1, q_2, q_3, q_4)$ yields a Markov process where the transition probabilities between the four possible states are given by the following matrix:

      $$\begin{pmatrix}
      p_1 q_1 & p_1 (1 - q_1) & (1 - p_1) q_1 & (1 - p_1)(1 - q_1) \\
      p_2 q_3 & p_2 (1 - q_3) & (1 - p_2) q_3 & (1 - p_2)(1 - q_3) \\
      p_3 q_2 & p_3 (1 - q_2) & (1 - p_3) q_2 & (1 - p_3)(1 - q_2) \\
      p_4 q_4 & p_4 (1 - q_4) & (1 - p_4) q_4 & (1 - p_4)(1 - q_4)
      \end{pmatrix}$$

      The system has time-homogeneous transition probabilities and the Markov property.

  17. Stationary Distribution

      Since $p$ and $q$ are in the interior of the cube, all the entries of the matrix are strictly positive; hence there exists a unique stationary distribution $\pi = (\pi_1, \pi_2, \pi_3, \pi_4)$ for $n \to \infty$.

      The payoff for a player $i$ using $p$ against a player $-i$ using $q$ is given by:

      $$A(p, q) = 3\pi_1 + 5\pi_3 + \pi_4 \tag{1}$$

      Assuming that implementation errors and perception errors are each kept constant at 4%, the invariant distributions of Tit-For-Tat and Always-Defect yield:

      $$A(S_{TFT}, S_{TFT}) = 2.25, \qquad A(S_{ALLD}, S_{ALLD}) = 1.12$$
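
The two payoffs can be reproduced numerically. The sketch below builds the four-state transition matrix with rows ordered $p_k q$, $p_k(1-q)$, $(1-p_k)q$, $(1-p_k)(1-q)$, approximates the stationary distribution by repeated multiplication (a simple stand-in for solving $\pi = \pi M$ exactly), and applies equation (1). The function names are hypothetical.

```python
def noisy_tft(eps, delta):
    """Tit-For-Tat rule (1, 0, 1, 0) after adding both error types."""
    c = 1 - delta - eps * (1 - 2 * delta)   # cooperate after the opponent's C
    d = delta + eps * (1 - 2 * delta)       # cooperate after the opponent's D
    return (c, d, c, d)


def noisy_alld(eps):
    """Always-Defect rule (0, 0, 0, 0) with implementation errors."""
    return (eps, eps, eps, eps)


def transition_matrix(p, q):
    """Rows are the previous outcome, columns the next outcome, ordered s1..s4."""
    # The opponent sees outcomes 2 and 3 with the roles swapped, hence q3 in
    # row 2 and q2 in row 3.
    pairs = [(p[0], q[0]), (p[1], q[2]), (p[2], q[1]), (p[3], q[3])]
    return [[a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)] for a, b in pairs]


def stationary(matrix, steps=5000):
    """Approximate the stationary distribution by repeated multiplication."""
    pi = [1.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        pi = [sum(pi[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return pi


def payoff(p, q):
    """Equation (1): A(p, q) = 3*pi_1 + 5*pi_3 + pi_4."""
    pi = stationary(transition_matrix(p, q))
    return 3 * pi[0] + 5 * pi[2] + pi[3]


tft, alld = noisy_tft(0.04, 0.04), noisy_alld(0.04)
print(round(payoff(tft, tft), 2), round(payoff(alld, alld), 2))  # 2.25 1.12
```

For two noisy Tit-For-Tat machines the stationary distribution is uniform over the four outcomes, which gives $(3 + 0 + 5 + 1)/4 = 2.25$; for two Always-Defect machines the exact value is $1.1184$, which rounds to the $1.12$ reported on the slide.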

  18. Computational Treatments

      Treatment @ 4%: The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 4%.
      Treatment @ 2%: The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 2%.
      Treatment @ 1%: The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 1%.
      Treatment @ 0%: The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 0%.
