Zero-Sum Games Are Special
CMPUT 366: Intelligent Systems
S&LB §3.4.1

Lecture Outline
1. Recap
2. Maxmin Strategies and Equilibrium
3. Alpha-Beta Search

Recap: Game Theory
• Game theory studies the interactions of rational agents
• Canonical representation is the normal form game
• Game theory uses solution concepts rather than optimal behaviour
• "Optimal behaviour" is not clear-cut in multiagent settings
• Pareto optimal: no agent can be made better off without making some other agent worse off
• Nash equilibrium: no agent regrets their strategy given the choice of the other agents' strategies
• Zero-sum games are games where the agents are in pure competition

Example games (row player's payoff listed first):

         Ballet   Soccer
Ballet   2, 1     0, 0
Soccer   0, 0     1, 2

         Heads    Tails
Heads    1, -1    -1, 1
Tails    -1, 1    1, -1
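The two payoff tables from the recap can be checked mechanically. The following is a minimal sketch, assuming each game is stored as a nested list of (u1, u2) pairs; the names `pure_nash_equilibria` and `is_zero_sum` are illustrative and not taken from the course material. It only looks for pure-strategy equilibria.

```python
from itertools import product

# Payoff tables from the recap: game[row][col] = (u1, u2).
BATTLE_OF_SEXES = [[(2, 1), (0, 0)],
                   [(0, 0), (1, 2)]]       # rows/cols: Ballet, Soccer
MATCHING_PENNIES = [[(1, -1), (-1, 1)],
                    [(-1, 1), (1, -1)]]    # rows/cols: Heads, Tails


def pure_nash_equilibria(game):
    """Return all (row, col) profiles where neither player gains by deviating."""
    rows, cols = range(len(game)), range(len(game[0]))
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = game[r][c]
        # Row player considers every other row against the fixed column.
        row_ok = all(game[r2][c][0] <= u1 for r2 in rows)
        # Column player considers every other column against the fixed row.
        col_ok = all(game[r][c2][1] <= u2 for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def is_zero_sum(game):
    """True when the two payoffs sum to zero in every cell."""
    return all(u1 + u2 == 0 for row in game for (u1, u2) in row)


print(pure_nash_equilibria(BATTLE_OF_SEXES))
print(pure_nash_equilibria(MATCHING_PENNIES))
print(is_zero_sum(MATCHING_PENNIES), is_zero_sum(BATTLE_OF_SEXES))
```

The Ballet/Soccer game has two pure equilibria (both players coordinating), while the Heads/Tails game has none in pure strategies, which is why mixed strategies matter for zero-sum games.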

Recap: Perfect Information Extensive Form Game
Definition: A finite perfect-information game in extensive form is a tuple G = (N, A, H, Z, χ, ρ, σ, u), where
• N is a set of n players,
• A is a single set of actions,
• H is a set of nonterminal choice nodes,
• Z is a set of terminal nodes (disjoint from H),
• χ : H → 2^A is the action function,
• ρ : H → N is the player function,
• σ : H × A → H ∪ Z is the successor function,
• u = (u_1, u_2, ..., u_n) is a utility function for each player, u_i : Z → ℝ

[Figure 5.1: The Sharing game. Player 1 chooses among 2–0, 1–1, and 0–2; player 2 then answers yes or no. Leaf payoffs: (0, 0), (2, 0), (0, 0), (1, 1), (0, 0), (0, 2).]

Maxmin Strategies
Question: What is the maximum amount that an agent can guarantee themselves in expectation?

Definition: A maxmin strategy for i is a strategy $\bar{s}_i$ that maximizes i's worst-case payoff:
  $\bar{s}_i = \arg\max_{s_i \in S_i} \left[ \min_{s_{-i} \in S_{-i}} u_i(s_i, s_{-i}) \right]$

Definition: The maxmin value of a game for i is the value $\bar{v}_i$ guaranteed by a maxmin strategy:
  $\bar{v}_i = \max_{s_i \in S_i} \left[ \min_{s_{-i} \in S_{-i}} u_i(s_i, s_{-i}) \right]$

Questions:
1. Does a maxmin strategy always exist?
2. Is an agent's maxmin strategy always unique?
3. Why would an agent want to play a maxmin strategy?
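To make the definition concrete, here is a small sketch that computes a maxmin strategy for the row player of the Heads/Tails game, first over pure strategies and then over mixed strategies. It assumes the row player has exactly two pure strategies, so a mixed strategy is a single probability p; the worst-case payoff is then a minimum of straight lines in p, and its maximum sits at p = 0, p = 1, or a crossing point. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Row player's payoffs in the Heads/Tails game (u1 only).
MATCHING_PENNIES_U1 = [[1, -1],
                       [-1, 1]]


def pure_maxmin(u1):
    """Best pure row and the payoff it guarantees against any column."""
    worst = [min(row) for row in u1]
    best = max(range(len(u1)), key=lambda r: worst[r])
    return best, worst[best]


def mixed_maxmin_2rows(u1):
    """Maxmin mixed strategy for a row player with exactly two rows.

    Returns (p, value), where p is the probability of playing row 0.
    """
    cols = range(len(u1[0]))

    def worst_case(p):
        # Expected payoff against each pure column; the opponent picks the worst.
        return min(p * u1[0][c] + (1 - p) * u1[1][c] for c in cols)

    candidates = {Fraction(0), Fraction(1)}
    for c, d in combinations(cols, 2):
        # Column c gives the line u1[1][c] + p * (u1[0][c] - u1[1][c]).
        slope_diff = (u1[0][c] - u1[1][c]) - (u1[0][d] - u1[1][d])
        if slope_diff != 0:
            p = Fraction(u1[1][d] - u1[1][c], slope_diff)
            if 0 <= p <= 1:
                candidates.add(p)
    best = max(candidates, key=worst_case)
    return best, worst_case(best)


print(pure_maxmin(MATCHING_PENNIES_U1))
print(mixed_maxmin_2rows(MATCHING_PENNIES_U1))
```

Restricting to pure strategies, the row player can only guarantee -1; mixing evenly raises the guarantee to 0, which illustrates why the question above asks about payoffs in expectation.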

Minimax Theorem
Theorem: [von Neumann, 1928] In any finite, two-player, zero-sum game, in any Nash equilibrium, each player receives an expected utility $v_i$ equal to both their maxmin and their minmax value.

Proof sketch (writing $\bar{v}_i$ for i's maxmin value and $s^*$ for the equilibrium profile):
1. Suppose that $v_i < \bar{v}_i$. But then i could guarantee a higher payoff by playing their maxmin strategy. So $v_i \ge \bar{v}_i$.
2. $-i$'s equilibrium payoff is $v_{-i} = \max_{s_{-i}} u_{-i}(s^*_i, s_{-i})$.
3. Equivalently, $v_i = \min_{s_{-i}} u_i(s^*_i, s_{-i})$, since the game is zero sum.
4. So $v_i = \min_{s_{-i}} u_i(s^*_i, s_{-i}) \le \max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i}) = \bar{v}_i$.
Steps 1 and 4 together give $v_i = \bar{v}_i$. ∎

Minimax Theorem Implications
In any finite, two-player, zero-sum game:
1. Each player's maxmin value is equal to their minmax value. We call this the value of the game.
2. For both players, the set of maxmin strategies and the set of Nash equilibrium strategies are the same.
3. Any maxmin strategy profile (a profile in which both agents are playing maxmin strategies) is a Nash equilibrium. Therefore, each player gets the same payoff in every Nash equilibrium (namely, their value for the game).
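The first implication can be checked numerically on small games. The sketch below assumes a 2×2 zero-sum game given by player 1's payoffs only, and computes player 1's maxmin and minmax values independently. `HYPOTHETICAL_GAME` is a made-up example and does not come from the slides; the helper names are likewise illustrative.

```python
from fractions import Fraction

MATCHING_PENNIES_U1 = [[1, -1], [-1, 1]]    # from the recap slide
HYPOTHETICAL_GAME = [[3, -1], [-2, 1]]      # made-up payoffs for illustration


def _candidates(lines):
    """0, 1, and the crossing point of two lines given as (intercept, slope)."""
    (i0, s0), (i1, s1) = lines
    points = [Fraction(0), Fraction(1)]
    if s0 != s1:
        x = Fraction(i1 - i0, s0 - s1)
        if 0 <= x <= 1:
            points.append(x)
    return points


def _evaluate(lines, x):
    return [intercept + slope * x for intercept, slope in lines]


def maxmin_value(u1):
    """max over player 1's mix p (weight on row 0) of min over columns."""
    lines = [(u1[1][c], u1[0][c] - u1[1][c]) for c in range(2)]
    return max(min(_evaluate(lines, p)) for p in _candidates(lines))


def minmax_value(u1):
    """min over player 2's mix q (weight on column 0) of max over rows."""
    lines = [(u1[r][1], u1[r][0] - u1[r][1]) for r in range(2)]
    return min(max(_evaluate(lines, q)) for q in _candidates(lines))


for game in (MATCHING_PENNIES_U1, HYPOTHETICAL_GAME):
    print(maxmin_value(game), minmax_value(game))
```

In both games the two numbers coincide, which is the value of the game in the sense of implication 1.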

Nash Equilibrium Safety
[Figure: the Centipede game. Players 1, 2, 1, 2, 1 move in turn, each choosing A (across) or D (down). Going down at the successive nodes yields (1, 0), (0, 2), (3, 1), (2, 4), (4, 3); going across at every node yields (3, 5).]
• Perfect-information extensive form games: Straightforward to compute Nash equilibrium using backward induction
• In the Centipede game, the equilibrium outcome is Pareto dominated
• Question: Can player 2 ever regret playing a Nash equilibrium strategy against a suboptimal player 1 in Centipede?
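Backward induction on the Centipede game is short enough to write out. This is a sketch under the assumption that the game is a simple chain: at each node the mover either goes down (ending the game) or across to the next node. The list-based encoding and the function name are illustrative choices.

```python
# Centipede game from the slide, encoded as a chain of decision nodes.
MOVERS = [0, 1, 0, 1, 0]     # index 0 = player 1, index 1 = player 2
DOWN_PAYOFFS = [(1, 0), (0, 2), (3, 1), (2, 4), (4, 3)]
FINAL_PAYOFF = (3, 5)        # reached only if every mover plays across


def backward_induction():
    """Return (outcome, plan): equilibrium payoffs and the action at each node."""
    continuation = FINAL_PAYOFF
    plan = [None] * len(MOVERS)
    # Work from the last decision node back to the first.
    for k in reversed(range(len(MOVERS))):
        mover = MOVERS[k]
        # The mover compares stopping now with what continuing would yield.
        if DOWN_PAYOFFS[k][mover] >= continuation[mover]:
            plan[k] = "D"
            continuation = DOWN_PAYOFFS[k]
        else:
            plan[k] = "A"
    return continuation, plan


outcome, plan = backward_induction()
print(outcome, plan)
# Outcomes that make both players at least as well off, and someone better off.
dominating = [z for z in DOWN_PAYOFFS + [FINAL_PAYOFF]
              if z != outcome and z[0] >= outcome[0] and z[1] >= outcome[1]]
print(dominating)
```

Every mover goes down, so play ends immediately, and the list of dominating outcomes is non-empty, which is what "Pareto dominated" means here.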

Nash Equilibrium Safety: General Sum Games
[Figure: game tree. Player 1 moves first (A or B), player 2 then chooses X or Y, and player 1 chooses C or D at two further nodes. Leaf payoffs shown: (-1, 7), (4, 2), (1, 1), (9, 9), (4, 5), (5, 4).]
• In a general-sum game, a Nash equilibrium strategy is not always a maxmin strategy
• Question: What is a Nash equilibrium of this game? [(A, D, D), (Y, X)]
• Question: What is player 1's maxmin strategy? (B, D, D)
• Question: Can player 1 ever regret playing a Nash equilibrium against a suboptimal player? Yes, because if player 2 does not follow the same Nash equilibrium, player 1 could get -1 (the worst payoff in the game).
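The answers on this slide can be reproduced by brute force over pure strategies. The exact shape of the tree is not recoverable from the text, so the layout in `play` below is an assumption, chosen to be consistent with the stated answers: A then X ends at (-1, 7), A then Y leads to player 1 choosing between (1, 1) and (9, 9), B then X leads to player 1 choosing between (4, 5) and (5, 4), and B then Y ends at (4, 2).

```python
from itertools import product


def play(s1, s2):
    """Payoffs (u1, u2) under the ASSUMED tree layout described in the lead-in."""
    root, after_ay, after_bx = s1    # player 1: root move, then C/D at each later node
    at_a, at_b = s2                  # player 2: X/Y after A, X/Y after B
    if root == "A":
        if at_a == "X":
            return (-1, 7)
        return (1, 1) if after_ay == "C" else (9, 9)
    if at_b == "Y":
        return (4, 2)
    return (4, 5) if after_bx == "C" else (5, 4)


S1 = list(product("AB", "CD", "CD"))   # player 1's pure strategies
S2 = list(product("XY", "XY"))         # player 2's pure strategies


def worst_case(s1):
    """Player 1's payoff if player 2 plays whatever hurts player 1 most."""
    return min(play(s1, s2)[0] for s2 in S2)


def is_nash(s1, s2):
    u1, u2 = play(s1, s2)
    return (all(play(d, s2)[0] <= u1 for d in S1)
            and all(play(s1, d)[1] <= u2 for d in S2))


equilibrium = (("A", "D", "D"), ("Y", "X"))
print(is_nash(*equilibrium), worst_case(equilibrium[0]))
print(max(worst_case(s1) for s1 in S1), worst_case(("B", "D", "D")))
```

Under this layout the profile from the slide is an equilibrium, but player 1's equilibrium strategy only guarantees -1, whereas (B, D, D) guarantees the maxmin value of 4.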

Nash Equilibrium Safety: Zero-sum Games
[Figure: game tree with the same shape as in the general-sum example. Leaf payoffs shown: (-1, 1), (4, -4), (1, -1), (9, -9), (4, -4), (5, -5).]
• In a zero-sum game, every Nash equilibrium strategy is also a maxmin strategy
• Question: What is player 1's maxmin value for this game? 4 (same as previous game)
• Question: Can player 1 ever regret playing a Nash equilibrium strategy against a suboptimal player? No, because player 1's equilibrium strategy is also their maxmin strategy.
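The same brute-force check illustrates the zero-sum claim. As before, the tree layout is an assumption (the figure did not survive): A then X ends at -1 for player 1, A then Y leads to player 1 choosing between 1 and 9, B then X leads to player 1 choosing between 4 and 5, and B then Y ends at 4. Only player 1's payoff is stored, since player 2's is its negation.

```python
from itertools import product


def play(s1, s2):
    """Player 1's payoff under the ASSUMED layout; player 2 receives the negation."""
    root, after_ay, after_bx = s1
    at_a, at_b = s2
    if root == "A":
        if at_a == "X":
            return -1
        return 1 if after_ay == "C" else 9
    if at_b == "Y":
        return 4
    return 4 if after_bx == "C" else 5


S1 = list(product("AB", "CD", "CD"))
S2 = list(product("XY", "XY"))


def worst_case(s1):
    return min(play(s1, s2) for s2 in S2)


def is_nash(s1, s2):
    u1 = play(s1, s2)
    # Player 1 cannot raise u1; player 2 cannot lower it.
    return (all(play(d, s2) <= u1 for d in S1)
            and all(play(s1, d) >= u1 for d in S2))


value = max(worst_case(s1) for s1 in S1)
nash = [(s1, s2) for s1 in S1 for s2 in S2 if is_nash(s1, s2)]
print(value)
print(all(worst_case(s1) == value for s1, _ in nash))
print(all(play(s1, s2) == value for s1, s2 in nash))
```

Every pure equilibrium found this way has player 1 using a strategy whose worst case equals the maxmin value, and every equilibrium yields that same payoff, in line with the slide's answers.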

Efficient Equilibrium Computation
• Backward induction requires us to examine every leaf node
• However, in a zero-sum game, we can do better by pruning some sub-trees
• Special case of branch and bound
• Intuition: If a player can guarantee at least x starting from a given subtree h, but their opponent can guarantee them getting less than x in an earlier subtree, then the opponent will never allow the player to reach h

Algorithm: Alpha-Beta Search
ALPHABETASEARCH(a choice node h):
    v ← MAXVALUE(h, -∞, ∞)
    return a ∈ χ(h) such that MINVALUE(σ(h, a), -∞, ∞) = v

MAXVALUE(choice node h, max value α, min value β):
    if h ∈ Z: return u(h)
    v ← -∞
    for hʹ ∈ {hʹ | a ∈ χ(h) and σ(h, a) = hʹ}:
        v ← max(v, MINVALUE(hʹ, α, β))
        if v ≥ β: return v
        α ← max(α, v)
    return v

MINVALUE(h, α, β):
    if h ∈ Z: return u(h)
    v ← +∞
    for hʹ ∈ {hʹ | a ∈ χ(h) and σ(h, a) = hʹ}:
        v ← min(v, MAXVALUE(hʹ, α, β))
        if v ≤ α: return v
        β ← min(β, v)
    return v
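A minimal Python sketch of this algorithm follows. It assumes a hypothetical encoding in which a choice node is a dict from action labels to successors and a terminal node is just player 1's payoff; players are assumed to alternate, with the maximizer at the root. The example tree is an assumed layout using the zero-sum payoffs from the earlier slide, and the `stats` counter is added only to show how many leaves get examined.

```python
import math

# ASSUMED layout: dict = choice node (action -> successor), number = player 1's payoff.
TREE = {"A": {"X": -1, "Y": {"C": 1, "D": 9}},
        "B": {"X": {"C": 4, "D": 5}, "Y": 4}}


def max_value(h, alpha, beta, stats):
    if not isinstance(h, dict):          # terminal node
        stats["leaves"] += 1
        return h
    v = -math.inf
    for child in h.values():
        v = max(v, min_value(child, alpha, beta, stats))
        if v >= beta:                    # the minimizer above will avoid this node
            return v
        alpha = max(alpha, v)
    return v


def min_value(h, alpha, beta, stats):
    if not isinstance(h, dict):
        stats["leaves"] += 1
        return h
    v = math.inf
    for child in h.values():
        v = min(v, max_value(child, alpha, beta, stats))
        if v <= alpha:                   # the maximizer above already has better
            return v
        beta = min(beta, v)
    return v


def alpha_beta_search(h):
    """Return (best action at the root, its value, number of leaves examined)."""
    stats = {"leaves": 0}
    best_action, best_value = None, -math.inf
    for action, child in h.items():
        # best_value doubles as alpha: later children only matter if they beat it.
        value = min_value(child, best_value, math.inf, stats)
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value, stats["leaves"]


print(alpha_beta_search(TREE))
```

The root loop tracks the best action directly instead of re-searching for an action whose value equals v; that is a design choice of this sketch. On this tree one of the six leaves is never examined: once C under A, Y returns 1, which already exceeds the -1 player 2 can force via A, X, the sibling D is pruned.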

Randomness
• Sometimes a game will include elements of randomness in the environment
• E.g., dice
• Can handle this by including chance nodes owned by nature
• Alpha-beta search can work in this setting, but it needs some tweaks
• Take expectation at chance nodes instead of min/max
• Pruning based on bounds on the expectation
• Question: What about randomness in the strategies of the players?
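The "take expectation at chance nodes" idea can be sketched as a plain recursive evaluation. This version does no pruning at all (bounding the expectation is left out), and both the node encoding and the example tree are made up for illustration.

```python
def expectiminimax(node):
    """Value of a tree with max, min, and chance nodes (no pruning).

    A leaf is a number (player 1's payoff). An internal node is a pair
    (kind, children) with kind in {"max", "min", "chance"}; chance nodes
    hold (probability, child) pairs.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "chance":
        # Nature's move: average the children by probability.
        return sum(p * expectiminimax(child) for p, child in children)
    values = [expectiminimax(child) for child in children]
    return max(values) if kind == "max" else min(values)


# Hypothetical game: player 1 takes a sure 2, or flips a fair coin after
# which player 2 picks the leaf that is worse for player 1.
GAME = ("max", [
    2,
    ("chance", [(0.5, ("min", [6, 4])),
                (0.5, ("min", [1, 3]))]),
])

print(expectiminimax(GAME))
```

Here the gamble is worth 0.5 · 4 + 0.5 · 1 = 2.5 to player 1, so it is preferred to the sure payoff of 2.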

Alpha-Beta Search: Additional Considerations
• Question: Can this algorithm work with arbitrarily deep game trees? No, because it needs to get to the "bottom" of the tree before it can start pruning.
• Question: Can this algorithm work for non-zero-sum games? No, it relies on the fact that player 1 and player 2 are maximizing and minimizing the same quantity.

Summary
• Maxmin strategies maximize an agent's worst-case payoff
• Nash equilibrium strategies can differ from maxmin strategies in general-sum games
• In zero-sum games, they are the same thing
• It is always safe to play an equilibrium strategy in a zero-sum game
• Alpha-beta search computes equilibria of zero-sum games more efficiently than backward induction
