Last time: Simulated annealing algorithm
• Idea: Escape local extrema by allowing “bad moves,” but gradually decrease their size and frequency.
• Note: the goal here is to maximize E.
Last time: Simulated annealing algorithm
• Algorithm when the goal is to minimize E.
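As a reminder of how the minimizing variant behaves, here is a minimal sketch. The function name, the geometric cooling schedule, the parameter defaults, and the small integer landscape are all assumptions made for illustration; they are not taken from the lecture.

```python
import math
import random

def simulated_annealing(energy, neighbor, state, t0=1.0, cooling=0.995,
                        steps=5000, seed=0):
    """Minimize energy(state). All names and defaults are illustrative."""
    rng = random.Random(seed)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    t = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = energy(candidate) - current_e  # delta > 0 is a "bad move"
        # Always accept improvements; accept bad moves with
        # probability exp(-delta / T), which shrinks as T falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = candidate, current_e + delta
            if current_e < best_e:
                best, best_e = current, current_e
        t *= cooling  # assumed geometric cooling schedule
    return best, best_e

# Hypothetical landscape: index 1 is a local minimum, index 6 the global one.
landscape = [5, 3, 4, 6, 2, 1, 0, 3]

def energy(i):
    return landscape[i]

def neighbor(i, rng):
    # Step one cell left or right, staying inside the landscape.
    return max(0, min(len(landscape) - 1, i + rng.choice((-1, 1))))

best, best_e = simulated_annealing(energy, neighbor, state=1)
print(best, best_e)
```

Starting in the local minimum at index 1, the search can only reach the global minimum by first accepting uphill moves, which is the behavior the slide describes.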
This time: Outline
• Game playing
• The minimax algorithm
• Resource limitations
• alpha-beta pruning
• Elements of chance
What kind of games?
• Abstraction: To describe a game we must capture every relevant aspect of the game. Such as:
  • Chess
  • Tic-tac-toe
  • …
• Accessible environments: Such games are characterized by perfect information
What kind of games?
• Search: game-playing then consists of a search through possible game positions
• Unpredictable opponent: introduces uncertainty, thus game-playing must deal with contingency problems
Searching for the next move
• Complexity: many games have a huge search space
• Chess: b = 35, m = 100 ⇒ nodes = 35^100 ≈ 10^154; if each node takes about 1 ns to explore, then each move will take on the order of 10^135 millennia to calculate.
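The order of magnitude of this estimate can be checked with a few lines of arithmetic. The snippet below is a sketch that assumes 1 ns per node, as on the slide, and a 365.25-day year; the variable names are illustrative.

```python
import math

b, m = 35, 100                     # branching factor and depth from the slide
nodes = b ** m                     # exact value; Python integers are unbounded
seconds = nodes / 1e9              # 1 ns per node
seconds_per_millennium = 1000 * 365.25 * 24 * 3600
millennia = seconds / seconds_per_millennium

print(f"nodes     ~ 10^{math.log10(nodes):.1f}")      # nodes     ~ 10^154.4
print(f"millennia ~ 10^{math.log10(millennia):.1f}")  # millennia ~ 10^134.9
```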
Searching for the next move
• Resource (e.g., time, memory) limit: optimal solution not feasible/possible, thus must approximate
1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result.
2. Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.
Two-player games
• A game formulated as a search problem:
  • Initial state: ?
  • Operators: ?
  • Terminal state: ?
  • Utility function: ?
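One possible way to fill in these four components, using tic-tac-toe as the game, is sketched below. The board encoding, the function names, and the convention that x moves first and is the player being scored are all assumptions made for this example.

```python
# Board: tuple of 9 cells, each "x", "o", or " " (assumed encoding).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state():
    """Initial state: the empty board."""
    return (" ",) * 9

def to_move(board):
    """x moves first, so x is to move whenever the counts are equal."""
    return "x" if board.count("x") == board.count("o") else "o"

def operators(board):
    """Operators: each empty cell yields one successor board."""
    player = to_move(board)
    return [board[:i] + (player,) + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "]

def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def is_terminal(board):
    """Terminal state: someone has three in a row, or the board is full."""
    return winner(board) is not None or " " not in board

def utility(board):
    """Utility for x at a terminal state: +1 win, -1 loss, 0 draw."""
    return {"x": 1, "o": -1, None: 0}[winner(board)]

print(len(operators(initial_state())))  # 9
```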
Game vs. search problem
Example: Tic-Tac-Toe
Question:
1. b (branching factor) = ?
2. m (max depth) = ?
Type of games
The minimax algorithm
• Perfect play for deterministic environments with perfect information
• Basic idea: choose the move with the highest minimax value = best achievable payoff against best play
The minimax algorithm
• Algorithm:
1. Generate game tree completely
2. Determine utility of each terminal state
3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level
4. At the root node use minimax decision to select the move with the max (of the mins) utility value
• Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.
Generate Game Tree
Generate Game Tree (figure: partial tic-tac-toe game tree, with levels labeled "1 move" and "1 ply")
A subtree (figure: tic-tac-toe subtree with terminal positions labeled win, lose, or draw)
What is a good move? (figure: the same tic-tac-toe subtree with terminal positions labeled win, lose, or draw)
Minimax
• Minimize opponent’s chance
• Maximize your chance
(figure: game tree with leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2)
minimax = maximum of the minimum (figure labels: 1st ply, 2nd ply)
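The "maximum of the minimum" rule can be worked through on the leaf values listed on the Minimax slide. The grouping of the nine leaves into three MIN nodes of three leaves each is an assumption, since the figure itself is not available here.

```python
# Leaf utilities from the slide; the three-per-MIN-node grouping is assumed.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

# 2nd ply: MIN picks the lowest payoff under each of MAX's moves.
min_values = [min(leaves) for leaves in tree]

# 1st ply: MAX picks the move whose guaranteed payoff is highest.
minimax_value = max(min_values)
best_move = min_values.index(minimax_value)

print(min_values, minimax_value, best_move)  # [3, 2, 2] 3 0
```

Under that grouping MAX chooses the first move, because it guarantees a payoff of 3 even against the best reply.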
Minimax Java applet
Minimax: Recursive implementation
• Complete: ?
• Optimal: ?
• Time complexity: ?
• Space complexity: ?
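A recursive implementation might look like the sketch below. To keep it self-contained it uses a toy game that is not from the lecture: players alternately take 1 to 3 sticks from a pile, and whoever takes the last stick wins. The function names and the +1/-1 utilities are likewise assumptions.

```python
def minimax(sticks, max_to_move):
    """Minimax value of a state, from MAX's point of view."""
    if sticks == 0:
        # Terminal state: the previous player took the last stick and won.
        return -1 if max_to_move else 1
    # Recurse on every legal successor (depth-first).
    values = [minimax(sticks - take, not max_to_move)
              for take in (1, 2, 3) if take <= sticks]
    # MAX nodes take the largest child value, MIN nodes the smallest.
    return max(values) if max_to_move else min(values)

def minimax_decision(sticks):
    """Move for MAX at the root with the highest minimax value."""
    moves = [take for take in (1, 2, 3) if take <= sticks]
    return max(moves, key=lambda take: minimax(sticks - take, False))

print(minimax(4, True), minimax(5, True), minimax_decision(5))  # -1 1 1
```

The recursion generates the tree implicitly, so the four steps of the algorithm happen in a single depth-first pass rather than as separate phases.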
1. Move evaluation without complete search
• Complete search is too complex and impractical
• Evaluation function: evaluates value of state using heuristics and cuts off search
• New MINIMAX:
  • CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth-limit, etc.)
  • EVAL: evaluation function to replace utility function (e.g., number of chess pieces taken)
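The two substitutions can be sketched as follows, using a depth limit as the cutoff. The toy game (take 1 to 3 sticks, whoever takes the last stick wins), the function names, and the heuristic are assumptions made for illustration, not the lecture's own code.

```python
def cutoff_test(sticks, depth, limit):
    """Replaces the terminal test: stop at terminal states or at the depth limit."""
    return sticks == 0 or depth >= limit

def eval_state(sticks, max_to_move):
    """Replaces the utility function with a hypothetical heuristic:
    guess that the player to move is losing when sticks % 4 == 0."""
    mover_losing = sticks % 4 == 0
    return -1 if mover_losing == max_to_move else 1

def h_minimax(sticks, max_to_move, depth=0, limit=2):
    """Minimax that searches only `limit` plies and then estimates."""
    if cutoff_test(sticks, depth, limit):
        return eval_state(sticks, max_to_move)
    values = [h_minimax(sticks - take, not max_to_move, depth + 1, limit)
              for take in (1, 2, 3) if take <= sticks]
    return max(values) if max_to_move else min(values)

print(h_minimax(21, True), h_minimax(20, True))  # 1 -1
```

The heuristic here happens to be exact for this toy game, so a two-ply search already gives the right answer. Evaluation functions for games like chess are only estimates, so the quality of play depends on both the depth limit and the quality of EVAL.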
Do We Have To Do All That Work? (figure: partial game tree with MAX and MIN levels and leaf values 3, 12, 8)
Evaluation functions
• Weighted linear evaluation function: to combine n heuristics:
  f = w_1 f_1 + w_2 f_2 + … + w_n f_n
• E.g.,
  • w’s could be the values of pieces (1 for pawn, 3 for bishop)
  • f’s could be the number of each type of piece on the board
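A weighted linear evaluation function is a short computation, sketched below. The pawn and bishop weights are the ones given on the slide; the knight, rook, and queen weights are conventional chess values added for illustration, and the function and dictionary names are assumptions.

```python
# Weights w_i: pawn and bishop are from the slide, the rest are
# conventional values added for illustration.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def evaluate(counts, weights=WEIGHTS):
    """f = w_1 f_1 + ... + w_n f_n, with f_i the number of pieces of type i."""
    return sum(weights[piece] * counts.get(piece, 0) for piece in weights)

# Hypothetical position: one side's full starting material (king excluded).
counts = {"pawn": 8, "knight": 2, "bishop": 2, "rook": 2, "queen": 1}
print(evaluate(counts))  # 8 + 6 + 6 + 10 + 9 = 39
```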
Note: the exact evaluation values do not matter as long as the ordering is preserved.
Minimax with cutoff: viable algorithm?
• Assume we have 100 seconds and evaluate 10^4 nodes/s; we can then evaluate 10^6 nodes/move
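To see what this budget buys, the sketch below solves b^depth = budget for the depth. It assumes the chess branching factor b = 35 quoted earlier in the lecture; the variable names are illustrative.

```python
import math

seconds_per_move = 100
nodes_per_second = 10 ** 4
budget = seconds_per_move * nodes_per_second  # 10^6 nodes per move

b = 35                                        # assumed chess branching factor
depth = math.log(budget) / math.log(b)        # solve b^depth = budget

print(budget, round(depth, 2))                # 1000000 3.89
```

Under these assumptions the search reaches only about 4 ply, which suggests that plain minimax with a cutoff needs further help, such as pruning, to look deeper in the same time.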