
Adversarial Search CE417: Introduction to Artificial Intelligence



  1. Adversarial Search. CE417: Introduction to Artificial Intelligence, Sharif University of Technology, Spring 2018, Soleymani. "Artificial Intelligence: A Modern Approach", 3rd Edition, Chapter 5. Most slides have been adopted from Klein and Abbeel, CS188, UC Berkeley.

  2. Outline
     - Game as a search problem
     - Minimax algorithm
     - α-β pruning: ignoring a portion of the search tree
     - Time limit problem
     - Cut-off and evaluation function

  3. Games as search problems
     - Games:
       - Adversarial search problems (goals are in conflict)
       - Competitive multi-agent environments
     - Games in AI are a specialized kind of the games studied in game theory

  4. Adversarial Games

  5. Types of Games
     - Many different kinds of games!
     - Axes:
       - Deterministic or stochastic?
       - One, two, or more players?
       - Zero sum?
       - Perfect information (can you see the state)?
     - Want algorithms for calculating a strategy (policy) which recommends a move from each state

  6. Zero-Sum Games
     - Zero-sum games:
       - Agents have opposite utilities (values on outcomes)
       - Lets us think of a single value that one maximizes and the other minimizes
       - Adversarial, pure competition
     - General games:
       - Agents have independent utilities (values on outcomes)
       - Cooperation, indifference, competition, and more are all possible
       - More later on non-zero-sum games

  7. Primary assumptions
     - We start with these games:
       - Two-player
       - Turn-taking: agents act alternately
       - Zero-sum: agents' goals are in conflict; the sum of utility values at the end of the game is zero or constant
       - Deterministic
       - Perfect information: fully observable
     - Examples: tic-tac-toe, chess, checkers

  8. Deterministic Games
     - Many possible formalizations; one is:
       - States: S (start at s0)
       - Players: P = {1, ..., N} (usually take turns)
       - Actions: A (may depend on player / state)
       - Transition function: S × A → S
       - Terminal test: S → {t, f}
       - Terminal utilities: S × P → R
     - A solution for a player is a policy: S → A
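
As an illustration of the formalization above, here is a minimal sketch of tic-tac-toe written as states, players, actions, a transition function, a terminal test and terminal utilities. The function names, the board encoding and the payoffs (+1 win, -1 loss, 0 draw) are assumptions made for this example and do not come from the slides.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]   # 9 cells, each "X", "O" or " " (assumed encoding)
Action = int              # index of the cell to mark, 0..8

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

S0: State = (" ",) * 9    # start state s0: the empty board


def player(s: State) -> str:
    """Player to move: X starts, then the players alternate."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[Action]:
    """Legal actions in s: every empty cell."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: Action) -> State:
    """Transition function S x A -> S: the player to move marks cell a."""
    board = list(s)
    board[a] = player(s)
    return tuple(board)


def winner(s: State) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """Terminal test S -> {t, f}: someone has won or the board is full."""
    return winner(s) is not None or " " not in s


def utility(s: State, p: str) -> int:
    """Terminal utility S x P -> R, with assumed payoffs +1 / -1 / 0."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

With these payoffs the utilities of the two players always sum to zero at a terminal state, which matches the zero-sum assumption of slide 7.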

  9. Single-Agent Trees [figure: single-agent search tree; leaf values 8, 2, 0, …, 2, 6, …, 4, 6]

  10. Value of a State
      - Value of a state: the best achievable outcome (utility) from that state
      [figure: tree with leaf values 8, 2, 0, …, 2, 6, …, 4, 6, annotated with the rules for non-terminal states and terminal states]

  11. Adversarial Search

  12. Adversarial Game Trees [figure: game tree; values -20, -8, …, -18, -5, …, -10, +4, -20, +8]

  13. Minimax Values [figure: game tree distinguishing states under the agent's control, states under the opponent's control, and terminal states; values shown: -8, -5, -10, +8]

  14. Tic-Tac-Toe Game Tree

  15. Game tree (tic-tac-toe)
      - Two players: P1 and P2 (P1 is now searching to find a good move)
      - Zero-sum games: P1 gets U(t), P2 gets C − U(t) for terminal node t
      - P1 plays X, P2 plays O
      - 1-ply = half move
      - Utilities are from the point of view of P1

  16. Optimal play
      - The opponent is assumed to play optimally
      - The minimax function is used to find the utility of each state
      - MAX/MIN wants to maximize/minimize the terminal payoff
      - MAX gets U(t) for terminal node t

  17. Adversarial Search (Minimax)
      - Minimax search:
        - A state-space search tree
        - Players alternate turns
        - Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
      - Terminal values: part of the game
      [figure: MAX root with value 5; MIN level with values 5 and 2; terminal values 8, 2, 5, 6]

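  18. Minimax
      $$
      \mathrm{MINIMAX}(s) =
      \begin{cases}
      \mathrm{UTILITY}(s, \mathrm{MAX}) & \text{if } \mathrm{TERMINAL\_TEST}(s) \\
      \max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
      \min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
      \end{cases}
      $$
      - MINIMAX(s) is the utility of being in state s: the best achievable outcome from s (assumption: optimal opponent)
      [figure: example tree with node values 3, 3, 2, 2]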

  19. Minimax (Cont.)
      - Optimal strategy: move to the state with the highest minimax value
        - Best achievable payoff against best play
        - Maximizes the worst-case outcome for MAX
      - It works for zero-sum games

  20. Minimax Properties
      - Optimal against a perfect player. Otherwise?
      [figure: MAX root over a MIN level; leaf values 10, 10, 9, 100]
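
The "Otherwise?" question can be made concrete with a small sketch. It assumes the figure is a MAX root with two MIN children whose leaves are (10, 10) and (9, 100); that layout is a guess from the numbers on the slide. It also assumes a hypothetical non-optimal opponent that picks among its moves uniformly at random.

```python
# Assumed layout of the slide's figure: a MAX root with two MIN children.
tree = [[10, 10], [9, 100]]

# Value of each MAX move against an optimal MIN (the minimax value).
worst_case = [min(leaves) for leaves in tree]

# Expected value of each MAX move against a MIN that moves uniformly at random
# (a hypothetical imperfect opponent, not part of the slides).
average_case = [sum(leaves) / len(leaves) for leaves in tree]

minimax_move = max(range(len(tree)), key=lambda i: worst_case[i])
best_vs_random = max(range(len(tree)), key=lambda i: average_case[i])

print("worst case per move:", worst_case, "-> minimax picks move", minimax_move)
print("average per move:   ", average_case, "-> best vs random is move", best_vs_random)
```

Under these assumptions minimax takes the left branch, which guarantees 10, while the right branch is worth 54.5 on average against the random opponent. Minimax stays safe but can leave payoff on the table when the opponent is not perfect.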

  21. Minimax Implementation
        def max_value(state):
            initialize v = -∞
            for each successor of state:
                v = max(v, min_value(successor))
            return v
        def min_value(state):
            initialize v = +∞
            for each successor of state:
                v = min(v, max_value(successor))
            return v

  22. Minimax Implementation (Dispatch)
        def value(state):
            if the state is a terminal state: return the state's utility
            if the next agent is MAX: return max_value(state)
            if the next agent is MIN: return min_value(state)
        def max_value(state):
            initialize v = -∞
            for each successor of state:
                v = max(v, value(successor))
            return v
        def min_value(state):
            initialize v = +∞
            for each successor of state:
                v = min(v, value(successor))
            return v
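
The dispatch pseudocode can be turned into a runnable sketch over an explicit tree. The `Node` class and the `leaf` helper are hypothetical, introduced only for this example, and the tree uses the terminal values 8, 2, 5, 6 from slide 17 with an assumed pairing of leaves under the two MIN nodes.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a leaf has a utility, an inner node has children."""
    agent: str = "MAX"                       # who moves here: "MAX" or "MIN"
    utility: Optional[float] = None          # set only for terminal states
    children: List["Node"] = field(default_factory=list)


def value(node: Node) -> float:
    """Dispatch on the node type, as in the pseudocode."""
    if node.utility is not None:             # terminal state
        return node.utility
    if node.agent == "MAX":
        return max_value(node)
    return min_value(node)


def max_value(node: Node) -> float:
    v = -math.inf
    for successor in node.children:
        v = max(v, value(successor))
    return v


def min_value(node: Node) -> float:
    v = math.inf
    for successor in node.children:
        v = min(v, value(successor))
    return v


def leaf(u: float) -> Node:
    return Node(utility=u)


# Assumed layout: leaves (8, 2) under the first MIN node, (5, 6) under the second.
root = Node("MAX", children=[
    Node("MIN", children=[leaf(8), leaf(2)]),
    Node("MIN", children=[leaf(5), leaf(6)]),
])
print(value(root))  # 5
```

The two MIN nodes evaluate to 2 and 5, so the MAX root takes the second branch and gets 5.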

  23. Minimax algorithm (depth-first search)
        function MINIMAX_DECISION(state) returns an action
            return argmax_{a ∈ ACTIONS(state)} MIN_VALUE(RESULT(state, a))
        function MAX_VALUE(state) returns a utility value
            if TERMINAL_TEST(state) then return UTILITY(state)
            v ← −∞
            for each a in ACTIONS(state) do
                v ← MAX(v, MIN_VALUE(RESULT(state, a)))
            return v
        function MIN_VALUE(state) returns a utility value
            if TERMINAL_TEST(state) then return UTILITY(state)
            v ← ∞
            for each a in ACTIONS(state) do
                v ← MIN(v, MAX_VALUE(RESULT(state, a)))
            return v
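
To see the decision procedure choose a move, here is a sketch on a small take-away game that is not from the slides: players alternately remove 1, 2 or 3 sticks from a pile, and whoever takes the last stick wins. The game, the state encoding and the +1 / -1 payoffs are assumptions for illustration; the three search functions follow the pseudocode.

```python
import math
from typing import List, Tuple

State = Tuple[int, str]   # (sticks left, player to move: "MAX" or "MIN")


def actions(state: State) -> List[int]:
    sticks, _ = state
    return [k for k in (1, 2, 3) if k <= sticks]


def result(state: State, a: int) -> State:
    sticks, to_move = state
    return (sticks - a, "MIN" if to_move == "MAX" else "MAX")


def terminal_test(state: State) -> bool:
    return state[0] == 0


def utility(state: State) -> int:
    # Whoever took the last stick won; that is the player NOT to move now.
    # Utilities are from MAX's point of view.
    return 1 if state[1] == "MIN" else -1


def max_value(state: State) -> float:
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for a in actions(state):
        v = max(v, min_value(result(state, a)))
    return v


def min_value(state: State) -> float:
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for a in actions(state):
        v = min(v, max_value(result(state, a)))
    return v


def minimax_decision(state: State) -> int:
    """Action for MAX whose resulting state has the highest MIN_VALUE."""
    return max(actions(state), key=lambda a: min_value(result(state, a)))


print(minimax_decision((5, "MAX")))
```

With 5 sticks the search recommends taking 1, which leaves the opponent with 4 sticks, a position from which every reply lets MAX take the rest.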

  24. Properties of minimax
      - Complete? Yes (when the tree is finite)
      - Optimal? Yes (against an optimal opponent)
      - Time complexity: O(b^m), with branching factor b and maximum depth m
      - Space complexity: O(bm) (depth-first exploration)
      - For chess, b ≈ 35 and m > 50 for reasonable games
      - Finding the exact solution is completely infeasible
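
A quick back-of-the-envelope computation shows why the exact solution is out of reach for chess. The values b = 35 and m = 50 are taken from the slide (using the lower bound on m); everything else is plain arithmetic.

```python
import math

b, m = 35, 50                 # branching factor and depth from the slide

time_nodes = b ** m           # order of the number of nodes visited, O(b^m)
space_nodes = b * m           # order of the nodes kept in memory, O(bm)

print(f"b^m has {len(str(time_nodes))} decimal digits")
print(f"log10(b^m) = {m * math.log10(b):.1f}")
print(f"b*m = {space_nodes}")
```

The time bound is a 78-digit number, roughly 10^77 nodes, while the depth-first space bound is only 1750 nodes. Memory is not the obstacle; time is.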

  25. Game Tree Pruning

  26. Pruning
      - Correct minimax decision without looking at every node in the game tree
      - α-β pruning:
        - Branch and bound algorithm
        - Prunes away branches that cannot influence the final decision

  27. α-β pruning example

  28. α-β pruning example

  29. α-β pruning example

  30. α-β pruning example

  31. α-β pruning example

  32. α-β pruning
      - Assuming depth-first generation of the tree
      - We prune node n when the player has a better choice m at (the parent or) any ancestor of n
      - Two types of pruning (cuts):
        - pruning of max nodes (α-cuts)
        - pruning of min nodes (β-cuts)

  33. Alpha-Beta Pruning
      - General configuration (MIN version):
        - We're computing the MIN-VALUE at some node n
        - We're looping over n's children
        - Who cares about n's value? MAX
        - Let a be the best value that MAX can get at any choice point along the current path from the root
        - If n becomes worse than a, MAX will avoid it, so we can stop considering n's other children (it's already bad enough that it won't be played)
      - The MAX version is symmetric
      [figure: path from the root through alternating MAX and MIN levels, with a marked at a MAX node and n at a MIN node]
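
The MIN-version argument can be sketched in isolation. The helper below is hypothetical: it scans the children of a single MIN node n, given the value a that MAX can already secure elsewhere on the path, and stops as soon as n is no better for MAX than a. It uses the non-strict test v ≤ a, the same test that appears in the implementation on slide 36.

```python
import math
from typing import Iterable, Tuple


def min_value_with_cutoff(child_values: Iterable[float], a: float) -> Tuple[float, int]:
    """Scan the children of MIN node n and stop once n cannot beat a for MAX.

    Returns the (possibly partial) value of n and how many children were examined.
    """
    v = math.inf
    examined = 0
    for child in child_values:
        examined += 1
        v = min(v, child)
        if v <= a:        # MAX already has an option worth a, so n will not be played
            break
    return v, examined


print(min_value_with_cutoff([2, 4, 6], a=3))   # (2, 1)
print(min_value_with_cutoff([5, 4, 6], a=3))   # (4, 3)
```

In the first call the first child already drops n to 2, below a = 3, so the other two children are never examined. In the second call n stays above a, so every child has to be checked.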

  34. α-β pruning (another example) [figure: game tree annotated with 3, ≤ 2, 3, 2, ≥ 5, 5, 1]

  35. Why is it called α-β?
      - α: value of the best (highest) choice found so far at any choice point along the path for MAX
      - β: value of the best (lowest) choice found so far at any choice point along the path for MIN
      - α and β are updated during the search process
      - For a MAX node, once the value of this node is known to be at least the current β (v ≥ β), its remaining branches are pruned
      - For a MIN node, once the value of this node is known to be at most the current α (v ≤ α), its remaining branches are pruned

  36. Alpha-Beta Implementation
      - α: MAX's best option on path to root
      - β: MIN's best option on path to root
        def max_value(state, α, β):
            initialize v = -∞
            for each successor of state:
                v = max(v, value(successor, α, β))
                if v ≥ β return v
                α = max(α, v)
            return v
        def min_value(state, α, β):
            initialize v = +∞
            for each successor of state:
                v = min(v, value(successor, α, β))
                if v ≤ α return v
                β = min(β, v)
            return v
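
A runnable sketch of the pseudocode above, under a few assumptions made for this example: the game tree is encoded as nested Python lists whose leaves are integer utilities, levels alternate between MAX and MIN starting with MAX at the root, and a module-level list `visited` records which leaves were actually evaluated. The example tree is the well-known three-branch tree from the textbook chapter, not a figure from these slides.

```python
import math

visited = []   # leaves evaluated, in order (bookkeeping for this example only)


def value(tree, alpha, beta, maximizing):
    """Dispatch: a non-list is a terminal utility, a list is an inner node."""
    if not isinstance(tree, list):
        visited.append(tree)
        return tree
    if maximizing:
        return max_value(tree, alpha, beta)
    return min_value(tree, alpha, beta)


def max_value(tree, alpha, beta):
    v = -math.inf
    for successor in tree:
        v = max(v, value(successor, alpha, beta, False))
        if v >= beta:            # MIN above would never allow this: prune
            return v
        alpha = max(alpha, v)
    return v


def min_value(tree, alpha, beta):
    v = math.inf
    for successor in tree:
        v = min(v, value(successor, alpha, beta, True))
        if v <= alpha:           # MAX above already has something at least as good: prune
            return v
        beta = min(beta, v)
    return v


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = value(tree, -math.inf, math.inf, True)
print(root_value, visited)
```

The root value is 3, the same as plain minimax would return, but only 7 of the 9 leaves are evaluated: after seeing the leaf 2 in the middle branch, that MIN node can no longer beat α = 3, so the leaves 4 and 6 are skipped.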
