Game Theory 1

GAME THEORY 1. INSTRUCTOR: GIANNI A. DI CARO. ICE-CREAM WARS - PowerPoint PPT Presentation



  1. 15-382 COLLECTIVE INTELLIGENCE – S18. LECTURE 26: GAME THEORY 1. INSTRUCTOR: GIANNI A. DI CARO

  2. ICE-CREAM WARS: http://youtu.be/jILgxeNBK_8

  3. GAME THEORY
     - Game theory is the formal study of conflict and cooperation in (rational) multi-agent systems.
     - It covers decision-making where several players must make choices that potentially affect the interests of other players: the effects of the agents' actions are interdependent (and the agents are aware of it).
     - Example: auctioning.
     - In psychology: the theory of social situations.
     (Slide footer: 15781 Fall 2016: Lecture 22)

  4. ELEMENTS OF A GAME
     - The players: how many players are there? Does nature/chance play a role? Players are assumed to be rational.
     - A complete description of what the players can do: the set of all possible actions.

  5. ELEMENTS OF A GAME
     - A description of the payoff / consequences for each player for every possible combination of actions chosen by all players playing the game.
     - A description of all players’ preferences over payoffs: a utility function for each player.

  6. AGENT VS. MECHANISM DESIGN
     - Agent strategy design (ASD): game theory can be used to compute the expected utility of each decision, and to use it to determine the best strategy (and its expected return) against a rational player. Strategy ≡ Policy.
     - System-level mechanism design: define the rules of the game such that the collective utility of the agents is maximized when each agent's strategy is designed to maximize its own utility according to ASD.

  7. MAKING DECISIONS: BASIC DEFINITIONS
     - Decision-making can involve choosing:
       - one single action, or
       - a sequence of actions.
     - Action outcomes can be certain or subject to uncertainty.
     - A set $B$ of alternative actions to choose from is given; it can be either discrete or continuous.
     - Payoff (for a single agent): a function $\rho: B \to \mathbb{R}$ that associates a numerical value with every action in $B$.
     - Optimal action $b^*$ (for a single-agent scenario): $\rho(b^*) \ge \rho(b) \;\; \forall b \in B$.
     - Payoff (for a multi-agent scenario): the payoff of the action $b$ for agent $j$ depends on the actions of the other players: $\rho: B^o \to \mathbb{R}$, where $o$ is the number of agents.
     - Strategy: a rule for choosing an action at every point at which a decision might have to be made (depending or not on the other agents).
     - The strategy defines the behavior of an agent.
     - The observed behavior of an agent following a given strategy is the outcome of the strategy.
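The single-agent definition of an optimal action can be sketched in a few lines of Python. The action labels and payoff values below are invented for illustration; only the rule "pick $b^*$ such that $\rho(b^*) \ge \rho(b)$ for every $b$ in $B$" comes from the slide, and the sketch assumes a finite action set.

```python
# Hypothetical discrete action set B with an illustrative payoff function rho.
payoff = {"b1": 2.0, "b2": 5.0, "b3": 3.5}


def optimal_action(rho):
    """Return an action b* with rho[b*] >= rho[b] for every action b."""
    return max(rho, key=rho.get)


best = optimal_action(payoff)
print(best, payoff[best])
```

For a continuous action set the same definition applies, but finding the maximizer would need an optimization routine rather than a scan over a dictionary.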

  8. PURE VS. RANDOMIZED STRATEGIES
     - Pure strategy: a strategy in which there is no randomization; one specific action is selected with certainty at each decision node.
     - All possible pure strategies define the pure strategy set $T$.
     - A decision tree can be used to represent a sequence of decisions.
     [Figure: decision tree with decision nodes 1, 2, 3 and branches $b_1, b_2$; $c_1, c_2$; $d_1, d_2$]
     - $B_1 = \{b_1, b_2\}$, $B_2 = \{c_1, c_2\}$, $B_3 = \{d_1, d_2\}$: three action sets (actions may be the same) that result in the pure strategy set
       $T = \{b_1 c_1 d_1,\; b_1 c_1 d_2,\; b_1 c_2 d_1,\; b_1 c_2 d_2,\; b_2 c_1 d_1,\; b_2 c_1 d_2,\; b_2 c_2 d_1,\; b_2 c_2 d_2\}$
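The eight-element pure strategy set on this slide is the Cartesian product of the three action sets. A minimal Python sketch, assuming one decision is made from each of the three sets and writing each pure strategy as a concatenated label such as `b1c1d1`:

```python
from itertools import product

# Action sets from the slide, written as plain strings.
B1 = ["b1", "b2"]
B2 = ["c1", "c2"]
B3 = ["d1", "d2"]

# A pure strategy fixes one action from each of the three action sets.
T = ["".join(choice) for choice in product(B1, B2, B3)]

print(len(T))
print(T)
```

`itertools.product` varies the last set fastest, so the strategies come out in the same order as they are listed on the slide.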

  9. PURE VS. RANDOMIZED STRATEGIES
     - In a game, we may observe only a subset of the possible outcomes of a strategy, depending on starting conditions and on the strategies of other agents.
     - Strategies that give the same outcome lead to the same payoff.
     - Reduced strategy set: the set formed by all pure strategies that lead to indistinguishable outcomes.
     - Let the pure strategy set be $\{b_1, b_2\}$; the behavior specifies using $b_1$ with probability $q$ and $b_2$ with probability $1 - q$.
     - A mixed strategy $\gamma$ specifies the probability $q(t)$ with which each pure strategy $t \in T$ is used.
     - Payoff for using $\gamma$ (for a single agent): $\rho(\gamma) = \sum_{b \in B} q(b)\,\rho(b)$.
     - Payoff in an uncertain world: $\rho(\gamma \mid y) = \sum_{b \in B} q(b)\,\rho(b \mid y)$, where $y$ is the state.
     [Figure: decision tree with decision nodes 1, 2, 3 and branches $b_1, b_2$; $c_1, c_2$; $d_1, d_2$]
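The single-agent payoff of a mixed strategy is a probability-weighted sum of pure payoffs. The sketch below illustrates that formula; the payoff numbers and the probability value are made up, and the function name `mixed_payoff` is a hypothetical helper, not something from the lecture.

```python
def mixed_payoff(q, rho):
    """Expected payoff: sum over actions b of q[b] * rho[b]."""
    if abs(sum(q.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(q[b] * rho[b] for b in q)


rho = {"b1": 4.0, "b2": 1.0}          # illustrative payoffs
q1 = 0.25                             # probability of playing b1
gamma = {"b1": q1, "b2": 1 - q1}      # b2 is played with probability 1 - q

print(mixed_payoff(gamma, rho))
```

A pure strategy is the degenerate case in which one action has probability 1, so `mixed_payoff({"b1": 1.0, "b2": 0.0}, rho)` simply returns the payoff of `b1`.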

  10. STRATEGIES (POLICIES)
      - Strategy: tells a player what to do for every possible situation throughout the game (a complete algorithm for playing the game). It can be deterministic or stochastic.
      - Strategy set: the strategies available for the players to play. The set can be finite or infinite (e.g., the beach war game).
      - Strategy profile: a set of strategies for all players which fully specifies all actions in a game. A strategy profile must include one and only one strategy for every player.
      - Pure strategy: one specific element from the strategy set, a single strategy which is played 100% of the time (deterministic).
      - Mixed strategy: an assignment of a probability to each pure strategy (stochastic). A pure strategy is the degenerate case of a mixed strategy.

  11. INFORMATION
      - Complete information game: utility functions, payoffs, strategies and “types” of players are common knowledge.
      - Incomplete information game: players may not possess full information about their opponents (e.g., in auctions, each player knows its own utility but not that of the other players). The “parameters” of the game are not fully known.
      - Perfect information game: each player, when making any decision, is perfectly informed of all the events that have previously occurred (e.g., chess). [Full observability]
      - Imperfect information game: not all information is accessible to the player (e.g., poker, prisoner’s dilemma). [Partial observability]

  12. TURN-TAKING VS. SIMULTANEOUS MOVES
      - Static games
        - All players take actions “simultaneously” (pictured example: Morra)
        - → Imperfect information games
        - Complete information
        - Single-move games
      - Dynamic games
        - Turn-taking games
        - Fully observable ↔ Perfect information games
        - Complete information
        - Repeated moves
      [Figure: game tree with alternating max and min levels and leaf values 10, 10, 9, 100]

  13. (STRATEGIC-) NORMAL-FORM GAME
      - Let’s focus on static games.
      - There is a strategic interaction among players.
      - A game in normal form (payoff matrix) consists of:
        - a set of players $O = \{1, \dots, o\}$;
        - a strategy set $T$;
        - for each $j \in O$, a utility function $v_j$ defined over the set of all possible strategy profiles, $v_j: T^o \to \mathbb{R}$.
      - If each player $k \in O$ plays the strategy $t_k \in T$, the utility of player $j$ is $v_j(t_1, \dots, t_o)$, which is the same as player $j$’s payoff when the strategy profile $(t_1, \dots, t_o)$ is chosen.
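One possible way to hold a small normal-form game in code is a table keyed by strategy profile. The sketch below assumes two players and a shared two-element strategy set; the strategy labels `x`, `y` and all utility values are invented for illustration, while the names `O`, `T` and `v` mirror the slide's symbols.

```python
from itertools import product

O = [1, 2]            # set of players
T = ["x", "y"]        # shared strategy set (hypothetical labels)

# Made-up utilities: profile (t_1, t_2) -> (v_1, v_2).
utilities = {
    ("x", "x"): (3, 3),
    ("x", "y"): (0, 4),
    ("y", "x"): (4, 0),
    ("y", "y"): (1, 1),
}


def v(j, profile):
    """Utility of player j (1-indexed) under a full strategy profile."""
    return utilities[tuple(profile)][j - 1]


# Every profile in T^o must have a utility for every player.
for profile in product(T, repeat=len(O)):
    print(profile, [v(j, profile) for j in O])
```

A dictionary keyed by profile only works for finite strategy sets; a continuous set such as $[0, 1]$ needs a utility function instead of a table.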

  14. THE ICE CREAM WARS
      - $O = \{1, 2\}$
      - $T = [0, 1]$
      - $t_j$ is the fraction of beach
      - …..
      - $v_j(t_j, t_k) = \begin{cases} \dfrac{t_j + t_k}{2}, & t_j < t_k \\ 1 - \dfrac{t_j + t_k}{2}, & t_j > t_k \\ \dfrac{1}{2}, & t_j = t_k \end{cases}$
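The piecewise utility above translates directly into a function. This is a minimal sketch of the formula as written on the slide; the function name and the sample positions are illustrative assumptions.

```python
def ice_cream_utility(t_j, t_k):
    """Utility of the player at position t_j when the other is at t_k."""
    if not (0.0 <= t_j <= 1.0 and 0.0 <= t_k <= 1.0):
        raise ValueError("positions must lie in [0, 1]")
    midpoint = (t_j + t_k) / 2
    if t_j < t_k:
        return midpoint          # first case of the formula
    if t_j > t_k:
        return 1 - midpoint      # second case of the formula
    return 0.5                   # equal positions


print(ice_cream_utility(0.5, 0.75))
print(ice_cream_utility(0.75, 0.5))
print(ice_cream_utility(0.3, 0.3))
```

Whenever the two positions differ, the two utilities add up to 1, which follows from the first two cases of the formula.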

  15. THE PRISONER’S DILEMMA (1962)
      - Two men are charged with a crime.
      - They can’t communicate with each other.
      - They are told that:
        - if one rats out and the other does not, the rat will be freed and the other jailed for 9 years;
        - if both rat out, both will be jailed for 6 years.
      - They also know that if neither rats out, both will be jailed for 1 year.

  16. THE PRISONER’S DILEMMA (1962)

  17. PRISONER’S DILEMMA: PAYOFF MATRIX
      - Don’t confess = don’t rat out: cooperate with each other.
      - Confess = defect: don’t cooperate with each other, act selfishly!

      |                  | B: Don’t confess | B: Confess |
      |------------------|------------------|------------|
      | A: Don’t confess | -1, -1           | -9, 0      |
      | A: Confess       | 0, -9            | -6, -6     |

      Each cell lists A’s payoff first, then B’s. What would you do?

  18. PRISONER’S DILEMMA: PAYOFF MATRIX
      - B doesn’t confess:
        - if A doesn’t confess, B gets -1;
        - if A confesses, B gets -9.
      - B confesses:
        - if A doesn’t confess, B gets 0;
        - if A confesses, B gets -6.

      |                  | B: Don’t confess | B: Confess |
      |------------------|------------------|------------|
      | A: Don’t confess | -1, -1           | -9, 0      |
      | A: Confess       | 0, -9            | -6, -6     |

      Rational agent B opts to confess.
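B's case-by-case comparison can be reproduced mechanically. The sketch below encodes the payoff matrix from the slide (A's payoff first, then B's) and computes B's best reply to each action of A; the dictionary layout, the action labels and the function name are illustrative choices.

```python
# (A's action, B's action) -> (A's payoff, B's payoff), taken from the slide.
PAYOFFS = {
    ("dont", "dont"): (-1, -1),
    ("dont", "confess"): (-9, 0),
    ("confess", "dont"): (0, -9),
    ("confess", "confess"): (-6, -6),
}
ACTIONS = ["dont", "confess"]


def best_reply_for_b(a_action):
    """Action of B that maximizes B's payoff, given A's action."""
    return max(ACTIONS, key=lambda b: PAYOFFS[(a_action, b)][1])


for a in ACTIONS:
    print(a, "->", best_reply_for_b(a))
```

Both lookups return `confess`, matching the slide's conclusion that a rational B confesses whatever A does.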

  19. PRISONER’S DILEMMA
      - Confess (defection, acting selfishly) is a dominant strategy for B: no matter what A plays, the best reply strategy is always to confess.
      - (Strictly) dominant strategy: yields a player a strictly higher payoff, no matter which decision(s) the other player(s) choose.
      - Weakly dominant: ties in some cases.
      - Confess is a dominant strategy for A as well.
      - A will reason as follows: B’s dominant strategy is to confess; therefore, given that we are both rational agents, B will also confess and we will both get 6 years.
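The dominance definitions can be checked against the payoff matrix with a short script. This is a sketch under two assumptions: the matrix is the one from the earlier slides, and "weakly dominant" is read as "never worse and strictly better in at least one case". The helper names `payoff` and `dominance` are hypothetical.

```python
# (A's action, B's action) -> (A's payoff, B's payoff), taken from the slides.
PAYOFFS = {
    ("dont", "dont"): (-1, -1),
    ("dont", "confess"): (-9, 0),
    ("confess", "dont"): (0, -9),
    ("confess", "confess"): (-6, -6),
}
ACTIONS = ["dont", "confess"]


def payoff(player, mine, other):
    """Payoff of `player` (0 = A, 1 = B) playing `mine` against `other`."""
    profile = (mine, other) if player == 0 else (other, mine)
    return PAYOFFS[profile][player]


def dominance(player, action):
    """Return 'strict', 'weak', or None for `action` of `player`."""
    diffs = [
        payoff(player, action, other) - payoff(player, alt, other)
        for other in ACTIONS
        for alt in ACTIONS
        if alt != action
    ]
    if all(d > 0 for d in diffs):
        return "strict"
    if all(d >= 0 for d in diffs) and any(d > 0 for d in diffs):
        return "weak"
    return None


for player, name in enumerate(["A", "B"]):
    for action in ACTIONS:
        print(name, action, dominance(player, action))
```

With these payoffs, `confess` comes out as strictly dominant for both players, and `dont` is not dominant for either.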

  20. PRISONER’S DILEMMA
      - But is the dominant-strategy profile (C, C), i.e. (Confess, Confess), the best one? The payoff matrix is the same as on slide 17.
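One way to probe the question is to look for strategy profiles in which both players do strictly better than under (Confess, Confess). The sketch below does only that comparison on the slide's payoff matrix; the function name is a hypothetical helper, and the comparison criterion is an assumption about what "best" could mean here.

```python
# (A's action, B's action) -> (A's payoff, B's payoff), taken from the slides.
PAYOFFS = {
    ("dont", "dont"): (-1, -1),
    ("dont", "confess"): (-9, 0),
    ("confess", "dont"): (0, -9),
    ("confess", "confess"): (-6, -6),
}


def better_for_both(profile):
    """Profiles where A and B both get strictly more than under `profile`."""
    base_a, base_b = PAYOFFS[profile]
    return [
        other
        for other, (u_a, u_b) in PAYOFFS.items()
        if u_a > base_a and u_b > base_b
    ]


print(better_for_both(("confess", "confess")))
```

The only profile returned is the one where neither player confesses: 1 year each instead of 6, even though confessing is each player's dominant strategy.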
