Chapter 6
Adversarial Search

20070419 Chap6

Game Theory
• Studied by mathematicians, economists, and people in finance
• In AI we limit games to:
  - deterministic
  - turn-taking
  - two-player
  - zero-sum (a win-lose game: one side's gain is the other side's loss)
  - perfect information
This means deterministic, fully observable environments in which there are two agents whose actions must alternate and in which the utility values at the end of the game are always equal and opposite.

Games as Search Problems
• Games offer pure, abstract competition.
• A chess-playing computer would be an existence proof of a machine doing something generally thought to require intelligence.
• Games are an idealization of worlds in which
  - the world state is fully accessible;
  - the (small number of) actions are well-defined;
  - uncertainty arises from the moves of the opponent and from the complexity of the game.

Types of Games
                         Deterministic                    Chance
Perfect information      Chess, Go, Othello, Checkers     Backgammon, Monopoly
Imperfect information    Blind tic-tac-toe                Bridge, Poker
• Game playing was one of the first tasks undertaken in AI.
• Machines have surpassed humans at checkers and Othello, and have defeated human champions in chess and backgammon.
• In Go, computers perform at the amateur level.

Games as Search Problems (cont.-1)
• Initial State
  - How does the game start?
• Successor Function
  - A list of legal (move, state) pairs for each state
• Terminal Test
  - Determines when the game is over
• Utility Function
  - Provides a numeric value for all terminal states

Games as Search Problems (cont.-2)
• Games are usually much too hard to solve. For example, in a typical chess game:
  - Average branching factor: 35
  - Average moves by each player: 50
  - Total number of nodes in the search tree: $35^{100}$, or about $10^{154}$ (although the total number of different legal positions is only about $10^{40}$)
• Good decisions must be made within time limits.
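The size estimate for the chess search tree can be checked with a little arithmetic. The following is a minimal sketch, assuming that 50 moves by each player means 100 plies in total; the variable names are illustrative and not from the slides.

```python
import math

branching_factor = 35   # average number of legal moves (from the slide)
plies = 2 * 50          # assumption: 50 moves by each of the two players

# log10(35**100) = 100 * log10(35)
exponent = plies * math.log10(branching_factor)
print(f"35^{plies} is about 10^{exponent:.0f}")   # 35^100 is about 10^154
```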
Partial Game Tree
[Figure: partial game tree]

Optimal strategies
• Find the contingent strategy for MAX assuming an infallible MIN opponent.
• Assumption: both players play optimally!
• Given a game tree, the optimal strategy can be determined by using the minimax value of each node:

\[
\text{MinimaxValue}(n) =
\begin{cases}
\text{Utility}(n) & \text{if } n \text{ is a terminal state} \\
\max_{s \in \text{Successors}(n)} \text{MinimaxValue}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{Successors}(n)} \text{MinimaxValue}(s) & \text{if } n \text{ is a MIN node}
\end{cases}
\]

Minimax
• Perfect play for deterministic, perfect-information games
• Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play

Two-Ply Game Tree
[Figure: two-ply game tree]

Two-Ply Game Tree (cont.-1)
[Figure: two-ply game tree, continued]

Two-Ply Game Tree (cont.-2)
[Figure: two-ply game tree, continued]
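The MinimaxValue definition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the textbook pseudocode: the game tree is represented as a nested list (a number is a terminal utility, a list is an internal node), the function names are hypothetical, and the leaf utilities are illustrative values for a two-ply tree.

```python
def minimax_value(node, is_max):
    """Return the minimax value of `node`.

    Assumed representation: a number is a terminal state (its utility),
    a list is an internal node holding its successors.
    """
    if isinstance(node, (int, float)):       # terminal state: Utility(n)
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def minimax_decision(root):
    """Index of the MAX move leading to the highest minimax value."""
    values = [minimax_value(child, is_max=False) for child in root]
    return max(range(len(root)), key=lambda i: values[i])


# Illustrative two-ply tree: MAX moves at the root, three MIN nodes below.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, is_max=True))   # 3
print(minimax_decision(tree))             # 0 (the first move)
```

The three MIN nodes back up the values 3, 2 and 2, so MAX's best achievable payoff against best play is 3.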
Two-Ply Game Tree (cont.-3)
[Figure: two-ply game tree showing the minimax decision]
Minimax maximizes the worst-case outcome for MAX.

Minimax Algorithm
[Figure: the minimax algorithm]

The Minimax Algorithm (cont.)
• Generate the whole game tree.
• Apply the utility function to each terminal state.
• Determine the utility of the nodes one level higher up from the terminal nodes:
  $\text{Utility}(n) = \max/\min\,(n.1, n.2, \ldots, n.b)$
• Continue backing up the values.
• At the root, MAX chooses the move leading to the highest utility value.

Analysis of Minimax
• Complete? Yes, but only if the tree is finite.
• Optimal? Yes, against an optimal opponent. Otherwise?
• Time? $O(b^m)$: minimax is a complete depth-first search (m: maximum depth, b: number of legal moves).
• Space? $O(bm)$ if all successors are generated at once, or $O(m)$ if successors are generated one at a time.
For chess, b ≈ 35 and m ≈ 100 for "reasonable" games ⇒ an exact solution is completely infeasible.

Optimal Decisions in Multiplayer Games
• Extend the minimax idea to multiplayer games.
• Replace the single value for each node with a vector of values.

α-β Pruning
• The problem with minimax search: the number of states to examine is exponential in the number of moves.
• α-β pruning returns the same moves as minimax would, but prunes away branches that cannot possibly influence the final decision.
  - α: the value of the best (highest) choice found so far in the search for MAX
  - β: the value of the best (lowest) choice found so far in the search for MIN
• The order in which successors are considered matters (look at step (f) of Fig. 6.5, p. 168).
  - If possible, consider the best successors first.
α-β Pruning (cont.)
[Figure: the general case for α-β pruning]
If m is better than n for Player, we will never get to n in play, so we can just prune it.

α-β Pruning Example
Do depth-first search until the first leaf. Each node is labelled with its range of possible values.
[Figure: game tree. Root: [−∞, +∞]; MIN node: [−∞, +∞]]

α-β Pruning Example (cont.-1)
[Figure: game tree. Root: [−∞, +∞]; MIN node: [−∞, 3]]

α-β Pruning Example (cont.-2)
[Figure: game tree. Root: [−∞, +∞]; MIN node: [−∞, 3]]

α-β Pruning Example (cont.-3)
[Figure: game tree. Root: [3, +∞]; MIN node: [3, 3]]

α-β Pruning Example (cont.-4)
[Figure: game tree. Root: [3, +∞]; MIN nodes: [3, 3], [−∞, 2]]
This node is worse for MAX.
α-β Pruning Example (cont.-5)
[Figure: game tree. Root: [3, 14]; MIN nodes: [3, 3], [−∞, 2], [−∞, 14]]

α-β Pruning Example (cont.-6)
[Figure: game tree. Root: [3, 5]; MIN nodes: [3, 3], [−∞, 2], [−∞, 5]]

α-β Pruning Example (cont.-7)
[Figure: game tree. Root: [3, 3]; MIN nodes: [3, 3], [−∞, 2], [2, 2]]

α-β Pruning Example (cont.-8)
[Figure: game tree. Root: [3, 3]; MIN nodes: [3, 3], [−∞, 2], [2, 2]]

The α-β Algorithm
[Figure: the α-β algorithm]

The α-β Algorithm (cont.)
[Figure: the α-β algorithm, continued]
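The algorithm figure on these slides did not survive extraction, so here is a minimal Python sketch of α-β pruning under stated assumptions. It is not the textbook pseudocode: the tree is a nested list (a number is a terminal utility), the function name and the `visited` parameter are hypothetical, and the leaf values are illustrative ones chosen to agree with the node ranges in the example above. The `visited` list records which leaves are actually evaluated.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, computed with alpha-beta pruning.

    alpha: best (highest) value found so far for MAX along the path.
    beta:  best (lowest) value found so far for MIN along the path.
    visited: optional list collecting the leaves that get evaluated.
    """
    if isinstance(node, (int, float)):       # terminal state
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            if value >= beta:                # MIN above would never allow this
                break
            alpha = max(alpha, value)
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta, visited))
            if value <= alpha:               # MAX above already has better
                break
            beta = min(beta, value)
    return value


# Illustrative two-ply tree (leaf values are assumptions).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, True, visited=seen), len(seen))   # 3 7

# Same tree with the last MIN node's successors in best-first order.
reordered = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
seen_reordered = []
print(alphabeta(reordered, True, visited=seen_reordered), len(seen_reordered))   # 3 5
```

Both calls return the same value as plain minimax. With the original ordering, 7 of the 9 leaves are evaluated; with the last MIN node's successors ordered 2, 5, 14, only 5 are, because 5 and 14 are pruned.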
Analysis of α-β Algorithm
• Pruning does not affect the final result.
• Entire subtrees can be pruned.
• Good move ordering improves its effectiveness: pruning is highly dependent on the order in which the successors are examined.
⇒ It is worthwhile to try to examine first the successors that are likely to be best.
  e.g. Figure 6.5 (e, f): if the successors of D are 2, 5, 14 (instead of 14, 5, 2), then 5 and 14 can be pruned.

Analysis of α-β Algorithm (cont.)
• If best-move-first,
  - the total number of nodes examined is $O(b^{d/2})$
  - the effective branching factor becomes $b^{1/2}$ (for chess, 6 instead of 35)
  i.e. α-β can look ahead roughly twice as far as minimax in the same amount of time.
• If random ordering,
  - the total number of nodes examined is $O(b^{3d/4})$ for moderate b
• Repeated states are again possible.
  - Store them in memory = transposition table

Imperfect, Real-Time Decisions
• Minimax and alpha-beta pruning require too many leaf-node evaluations.
• May be impractical within a reasonable amount of time.
• Shannon (1950):
  - Apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta)
  - Cut off the search earlier (replacing the terminal test by a cutoff test)

Heuristic Evaluation Functions
• Produce an estimate of the expected utility of the game from a given position.
• Performance depends on the quality of EVAL.
• Requirements
  - EVAL should order terminal nodes in the same way as UTILITY.
  - Computation cannot take too long.
  - For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
• Most evaluation functions work by calculating various features of the state.
  What are the features of chess? e.g. number of pawns possessed, etc.
• Weighted linear function
  $\text{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s)$
  (the addition assumes that each feature is independent of the others)

Heuristic Evaluation Functions (cont.-1)
• Give a material value to each piece (from chess books):
    pawn              1
    knight or bishop  3
    rook              5
    queen             9

Heuristic Evaluation Functions (cont.-2)
Heuristic difficulties (the heuristic counts pieces won)
e.g. Two slightly different chess positions:
  (a) Black has an advantage of a knight and two pawns and will win the game.
  (b) Black will lose after White captures the queen.
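The weighted linear function with the material values from the slides can be sketched as follows. This is a minimal illustration under stated assumptions: a position is reduced to piece counts per side (a hypothetical representation), each feature is the difference in the count of one piece type, and the function name and the example position are made up for illustration.

```python
# Material values from the slide, used as the weights w_i.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def material_eval(white_counts, black_counts):
    """Eval(s) = sum of w_i * f_i(s), with f_i = white count - black count.

    Positive values favour White, negative values favour Black.
    """
    return sum(
        weight * (white_counts.get(piece, 0) - black_counts.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )


# Hypothetical position: Black is ahead by a knight and two pawns.
white = {"pawn": 6, "knight": 1, "bishop": 2, "rook": 2, "queen": 1}
black = {"pawn": 8, "knight": 2, "bishop": 2, "rook": 2, "queen": 1}
print(material_eval(white, black))   # -5
```

A material-only evaluation like this gives the same score to any two positions with the same piece counts, so it cannot tell position (a) from position (b) before the queen is captured. That is the difficulty the slide points out.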