
Last time: Simulated annealing algorithm. Idea: Escape local extrema



  1. Last time: Simulated annealing algorithm • Idea: escape local extrema by allowing “bad moves,” but gradually decrease their size and frequency. Note: the goal here is to maximize E.

  2. Last time: Simulated annealing algorithm • Idea: escape local extrema by allowing “bad moves,” but gradually decrease their size and frequency. Algorithm when the goal is to minimize E.
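
The minimizing variant can be sketched as follows. This is a minimal illustration, not the algorithm shown on the slide: the geometric cooling schedule, the parameter names (`t0`, `cooling`, `steps`), and the 1-D test function are all assumptions made for the example.

```python
import math
import random

def simulated_annealing(energy, neighbor, x0, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Minimize `energy` starting from x0 (parameter names are illustrative)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t0
    for _ in range(steps):
        cand = neighbor(x, rng)
        cand_e = energy(cand)
        delta = cand_e - e                 # delta > 0 is a "bad move" when minimizing
        # Always accept improvements; accept bad moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = cand, cand_e
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling                       # cooling makes large bad moves increasingly unlikely
    return best_x, best_e

def energy(x):
    # Hypothetical 1-D energy with several local minima.
    return x * x + 10 * math.sin(3 * x)

def neighbor(x, rng):
    # Hypothetical move: a small random step.
    return x + rng.uniform(-0.5, 0.5)

best_x, best_e = simulated_annealing(energy, neighbor, x0=4.0)
print(round(best_x, 3), round(best_e, 3))
```

To maximize E instead, flip the sign of `delta` (or minimize `-E`).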

  3. This time: Outline • Game playing • The minimax algorithm • Resource limitations • Alpha-beta pruning • Elements of chance

  4. What kind of games? • Abstraction: to describe a game we must capture every relevant aspect of the game. Example games: chess, tic-tac-toe, … • Accessible environments: such games are characterized by perfect information

  5. What kind of games? • Search: game playing then consists of a search through possible game positions • Unpredictable opponent: introduces uncertainty, thus game playing must deal with contingency problems

  6. Searching for the next move • Complexity: many games have a huge search space • Chess: $b = 35$, $m = 100$ ⇒ nodes $= 35^{100} \approx 10^{154}$; if each node takes about 1 ns to explore, then each move will take on the order of $10^{135}$ millennia to calculate.
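
The size of this search space can be checked with a few lines of arithmetic. The sketch below works in base-10 logarithms to keep the numbers manageable; the 365.25-day year is an assumption of the example.

```python
import math

b, m = 35, 100                       # branching factor and depth quoted for chess
ns_per_node = 1                      # 1 ns per node, as stated on the slide

log10_nodes = m * math.log10(b)      # log10 of b**m
log10_seconds = log10_nodes + math.log10(ns_per_node * 1e-9)

seconds_per_millennium = 1000 * 365.25 * 24 * 3600   # assumes 365.25-day years
log10_millennia = log10_seconds - math.log10(seconds_per_millennium)

print(f"nodes     ~ 10^{log10_nodes:.1f}")
print(f"seconds   ~ 10^{log10_seconds:.1f}")
print(f"millennia ~ 10^{log10_millennia:.1f}")
```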

  7. Searching for the next move • Resource (e.g., time, memory) limit: an optimal solution is not feasible, thus we must approximate: 1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result. 2. Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.

  8. Two-player games • A game formulated as a search problem: • Initial state: ? • Operators: ? • Terminal state: ? • Utility function: ?
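
One possible way to fill in these four components, using tic-tac-toe as the game, is sketched below. The board encoding (a 9-tuple of `"x"`, `"o"`, `" "`), the function names, and the +1/-1/0 utility convention from x's point of view are all assumptions of the example rather than definitions from the slides.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]   # 9 cells in row-major order: "x", "o", or " "

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def initial_state() -> Board:
    """Initial state: the empty board, x to move."""
    return (" ",) * 9

def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def to_move(board: Board) -> str:
    return "x" if board.count("x") == board.count("o") else "o"

def is_terminal(board: Board) -> bool:
    """Terminal state: someone has three in a row, or the board is full."""
    return winner(board) is not None or " " not in board

def operators(board: Board) -> List[Board]:
    """Operators: the current player marks any empty cell."""
    if is_terminal(board):
        return []
    p = to_move(board)
    return [board[:i] + (p,) + board[i + 1:] for i, c in enumerate(board) if c == " "]

def utility(board: Board) -> int:
    """Utility of a terminal state: +1 if x won, -1 if o won, 0 for a draw."""
    return {"x": 1, "o": -1, None: 0}[winner(board)]

print(len(operators(initial_state())))   # 9 legal opening moves
```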

  9. Game vs. search problem

  10. Example: Tic-Tac-Toe. Question: 1. b (branching factor) = ? 2. m (max depth) = ?

  11. Types of games

  12. The minimax algorithm • Perfect play for deterministic environments with perfect information • Basic idea: choose the move with the highest minimax value = best achievable payoff against best play

  13. The minimax algorithm • Algorithm: 1. Generate the game tree completely. 2. Determine the utility of each terminal state. 3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level. 4. At the root node, use the minimax decision to select the move with the max (of the mins) utility value. • Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.

  14. Generate Game Tree

  15. Generate Game Tree [figure: tic-tac-toe game tree; level labels: 1 move, 1 ply]

  16. A subtree [figure: tic-tac-toe subtree with terminal positions labeled win, lose, draw]

  17. What is a good move? [figure: tic-tac-toe subtree with terminal positions labeled win, lose, draw]

  18. Minimax • Minimize opponent’s chance • Maximize your chance [figure: game tree with leaf values 3 12 8 2 4 6 14 5 2]
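
The leaf values in this figure can be run through a small minimax sketch. The grouping of the nine leaves into three branches of three is an assumption about the figure's layout, and the nested-list tree encoding is chosen only for the example.

```python
def minimax(node, maximizing):
    """Return the minimax value of a tree given as nested lists with integer leaves."""
    if isinstance(node, int):          # terminal state: its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Assumed layout: MAX at the root, three MIN nodes, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

branch_values = [minimax(branch, maximizing=False) for branch in tree]
best_move = max(range(len(tree)), key=lambda i: branch_values[i])

print(branch_values)                    # [3, 2, 2]
print(best_move)                        # 0
print(minimax(tree, maximizing=True))   # 3
```

MIN reduces each branch to its smallest leaf, and MAX then picks the branch whose minimum is largest.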

  19. Minimax = maximum of the minimum [figure labels: 1st ply, 2nd ply]

  20. Minimax Java applet

  21. Minimax: Recursive implementation. Complete: ? Optimal: ? Time complexity: ? Space complexity: ?
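
The time and space questions can be explored empirically. The sketch below runs a recursive minimax over an implicit uniform tree and counts the nodes visited and the deepest recursion level. The uniform branching, the `stats` dictionary, and the made-up `leaf_utility` function are assumptions of the example. The standard analysis for depth-first minimax is $O(b^m)$ time and $O(bm)$ space, which the counts below are consistent with.

```python
def leaf_utility(path):
    # Hypothetical deterministic utility so the example needs no game rules.
    return sum((i + 1) * (k + 1) for k, i in enumerate(path)) % 10

def minimax(depth, max_depth, b, maximizing, path, stats):
    """Depth-first minimax over a uniform tree with branching factor b."""
    stats["nodes"] += 1
    stats["max_depth_reached"] = max(stats["max_depth_reached"], depth)
    if depth == max_depth:                       # terminal test
        return leaf_utility(path)
    values = [minimax(depth + 1, max_depth, b, not maximizing, path + (i,), stats)
              for i in range(b)]
    return max(values) if maximizing else min(values)

b, m = 3, 4
stats = {"nodes": 0, "max_depth_reached": 0}
value = minimax(0, m, b, True, (), stats)

print(stats["nodes"])               # 121 = 1 + 3 + 9 + 27 + 81
print(stats["max_depth_reached"])   # 4: only one root-to-leaf path is on the stack
```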

  22. 1. Move evaluation without complete search • Complete search is too complex and impractical • Evaluation function: evaluates the value of a state using heuristics and cuts off search • New MINIMAX: • CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth limit, etc.) • EVAL: evaluation function to replace the utility function (e.g., number of chess pieces taken)
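
A minimal sketch of this modified minimax follows. The function and parameter names (`h_minimax`, `cutoff_test`, `eval_fn`) mirror CUTOFF-TEST and EVAL but are otherwise invented for the example. The heuristic used here, the mean of the leaves under a node, is a stand-in chosen only so that the code runs on a small nested-list tree.

```python
def h_minimax(state, depth, maximizing, successors, cutoff_test, eval_fn):
    """Minimax with CUTOFF-TEST in place of the terminal test and EVAL in place of utility."""
    if cutoff_test(state, depth):
        return eval_fn(state)
    values = [h_minimax(s, depth + 1, not maximizing, successors, cutoff_test, eval_fn)
              for s in successors(state)]
    return max(values) if maximizing else min(values)

def successors(state):
    return state                         # children of a nested-list node

def leaves(state):
    if isinstance(state, int):
        return [state]
    return [v for child in state for v in leaves(child)]

def eval_fn(state):
    # Hypothetical heuristic: average of the leaf values below this state.
    vals = leaves(state)
    return sum(vals) / len(vals)

def make_cutoff(depth_limit):
    def cutoff_test(state, depth):
        # Stop at true terminals or once the depth limit is reached.
        return isinstance(state, int) or depth >= depth_limit
    return cutoff_test

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
shallow = h_minimax(tree, 0, True, successors, make_cutoff(1), eval_fn)
full = h_minimax(tree, 0, True, successors, make_cutoff(2), eval_fn)
print(round(shallow, 2), full)           # 7.67 3.0
```

With a depth limit of 1 the search returns a heuristic estimate. With a limit that reaches the leaves it returns the exact minimax value.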

  23. Do We Have To Do All That Work? [figure: MAX/MIN game tree; leaf values shown: 3, 12, 8]

  24. Evaluation functions • Weighted linear evaluation function: to combine n heuristics, $f = w_1 f_1 + w_2 f_2 + \dots + w_n f_n$. E.g., • the w’s could be the values of pieces (1 for pawn, 3 for bishop) • the f’s could be the number of each type of piece on the board
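
A weighted linear evaluation of this kind might look like the sketch below. Only the pawn and bishop weights come from the slide. The knight, rook, and queen weights are the conventional textbook values and are assumptions here, as is the choice to use the difference in piece counts between the two sides as each feature.

```python
# Weights w_i: pawn and bishop from the slide; the rest are conventional values (assumed).
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(own_counts, opp_counts, weights=PIECE_VALUES):
    """f = sum_i w_i * f_i, where f_i = (own pieces of type i) - (opponent pieces of type i)."""
    return sum(w * (own_counts.get(piece, 0) - opp_counts.get(piece, 0))
               for piece, w in weights.items())

own = {"pawn": 6, "bishop": 2, "rook": 1}
opp = {"pawn": 5, "knight": 1, "rook": 2}
print(material_eval(own, opp))   # 1*1 + 3*(-1) + 3*2 + 5*(-1) + 9*0 = -1
```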

  25. Note: exact values do not matter; ordering is preserved
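
One reading of this note is that the minimax decision depends only on the relative order of the evaluation values, so any strictly increasing transformation of them leaves the chosen move unchanged. The sketch below checks this on a two-ply tree. The tree values and the two transformations are illustrative assumptions.

```python
import math

def minimax_decision(tree, transform=lambda v: v):
    """Index of MAX's best branch in a two-ply tree after transforming the leaf values."""
    branch_values = [min(transform(v) for v in branch) for branch in tree]
    return max(range(len(tree)), key=lambda i: branch_values[i])

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

original = minimax_decision(tree)
scaled = minimax_decision(tree, lambda v: 10 * v + 7)          # strictly increasing
squashed = minimax_decision(tree, lambda v: math.tanh(v / 10)) # strictly increasing

print(original, scaled, squashed)   # 0 0 0
```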

  26. Minimax with cutoff: viable algorithm? Assume we have 100 seconds per move and can evaluate $10^4$ nodes/s; then we can evaluate $10^6$ nodes/move.
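
The node budget translates into a reachable search depth once a branching factor is fixed. The sketch below reuses the chess figure $b = 35$ from the earlier slide. Applying it here is an assumption of the example.

```python
import math

time_budget_s = 100
nodes_per_s = 10 ** 4
budget = time_budget_s * nodes_per_s          # nodes that can be evaluated per move

b = 35                                        # assumed: chess branching factor from the earlier slide
depth = math.log(budget) / math.log(b)        # solve b**depth = budget

# Largest whole number of plies whose full tree level fits in the budget.
full_plies = 0
while b ** (full_plies + 1) <= budget:
    full_plies += 1

print(budget)            # 1000000
print(round(depth, 2))   # 3.89
print(full_plies)        # 3
```

Under these assumptions plain minimax with cutoff looks only about four plies ahead.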
