
Game Playing: AI Class 8, Ch. 5.1-5.3, 5.4.1, 5.5



  1. Today’s Class
     • Game playing
     • State of the art and resources
     • Framework
     • Game trees
     • Minimax
     • Alpha-beta pruning
     • Adding randomness

     Game Playing
     AI Class 8, Ch. 5.1-5.3, 5.4.1, 5.5
     We’ve seen multi-agent systems, and search problems where another agent’s moves need to be taken into account. But what if they are actively moving against us?
     Adversarial search = Games!
     Cynthia Matuszek, CMSC 671
     Based on slides by Marie desJardins, Francisco Iacobelli

     Why Games?
     • Clear criteria for success
     • Offer an opportunity to study problems involving {hostile / adversarial / competing} agents
     • Interesting, hard problems which require minimal setup
     • Often define very large search spaces
       • Chess: 35^100 nodes in search tree, 10^40 legal states
     • Historical reasons
     • Fun! (Mostly.)

     State-of-the-art
     • How good are computer game players?
     • Chess:
       • Deep Blue beat Garry Kasparov in 1997
       • Garry Kasparov vs. Deep Junior (Feb 2003): tie!
       • Kasparov vs. X3D Fritz (November 2003): tie!
         http://www.thechessdrum.net/tournaments/Kasparov-X3DFritz/index.html
       • Deep Fritz beat world champion Vladimir Kramnik (2006)
     • Checkers: Chinook (an AI program with a very large endgame database) is the world champion and can provably never be beaten. Retired in 1995
     • Go: Computer players have finally reached tournament-level play
       • AlphaGo beat Ke Jie (No. 1 world player) in 2017
     • Bridge: “Expert-level” computer players exist (but no world champions yet!)
     • Good places to learn more:
       • http://www.cs.ualberta.ca/~games/
       • http://www.cs.unimass.nl/icga

     Chinook
     • World Man-Machine Checkers Champion, developed by researchers at the University of Alberta.
     • Earned this title by competing in human tournaments, winning the right to play for the world championship, and eventually defeating the best players in the world.
     • Play it! http://www.cs.ualberta.ca/~chinook
     • Its developers have fully analyzed the game of checkers, and Chinook can provably never be beaten
       (http://www.sciencemag.org/cgi/content/abstract/1144079v1)

  2. www.wired.com/2017/05/googles-alphago-levels-board-games-power-grids

     Typical Games
     • 2-person game
     • Players alternate moves
     • Zero-sum: one player’s loss is the other’s gain
     • Perfect information: both players have access to complete information about the state of the game. No information is hidden from either player.
     • Deterministic: no chance (e.g., dice) involved
     • Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
     • Not: Bridge, Solitaire, Backgammon, ...

     How to Play (How to Search)
     • Obvious approach:
       • From current game state:
         • Consider all the legal moves you can make
         • Compute the new position resulting from each move
         • Evaluate each resulting position
         • Decide which is best
         • Make that move
         • Wait for your opponent to move and repeat
     • Key problems are:
       • Representing the “board”
       • Generating all legal next boards
       • Evaluating a position

     Evaluation function
     • An evaluation function or static evaluator is used to evaluate the “goodness” of a game position
       • Unlike heuristic search, where the evaluation function is a positive estimate of cost from the start node to a goal, passing through n
     • The zero-sum assumption allows one evaluation function to describe the goodness of a board for both players (how?)
       • f(n) >> 0: position n good for me and bad for you
       • f(n) << 0: position n bad for me and good for you
       • f(n) = 0 ± ε: position n is a neutral position
       • f(n) = +∞: win for me
       • f(n) = -∞: win for you

     Evaluation function examples
     • Example of an evaluation function for Tic-Tac-Toe:
       • f(n) = [#3-lengths open for X] - [#3-lengths open for O]
       • A 3-length is a complete row, column, or diagonal
     • Alan Turing’s function for chess:
       • f(n) = w(n)/b(n)
       • w(n) = sum of the point value of white’s pieces
       • b(n) = sum of the point value of black’s pieces

  3. Evaluation function examples
     • Most evaluation functions are specified as a weighted sum of position features:
       • f(n) = w_1 * feat_1(n) + w_2 * feat_2(n) + ... + w_k * feat_k(n)
     • Example features for chess: piece count, piece placement, squares controlled, ...
     • Deep Blue had over 8000 features in its nonlinear evaluation function!
       • square control, rook-in-file, x-rays, king safety, pawn structure, passed pawns, ray control, outposts, pawn majority, rook on the 7th, blockade, restraint, trapped pieces, color complex, ...

     Game trees
     • Problem spaces for typical games are represented as trees
     • Player must decide the best single move to make next
     • Root node = the current board configuration
     • Arcs = possible legal moves for a player

     Game trees
     • Static evaluator function
       • Rates a board position
       • f(board) = R (a real number), with f > 0 for “white” (me), f < 0 for black (you)
     • If it is my turn to move:
       • Root is labeled “MAX” node
       • Otherwise it is a “MIN” node (opponent’s turn)
     • Each level’s nodes are all MAX or all MIN
     • Nodes at level i are opposite those at level i+1

     Minimax Procedure
     • Create start node: MAX node, current board state
     • Expand nodes down to a depth of lookahead
     • Apply the evaluation function at each leaf node
     • “Back up” values for each non-leaf node until a value is computed for the root node
       • MIN: backed-up value is the lowest of the children’s values
       • MAX: backed-up value is the highest of the children’s values
     • Pick the operator associated with the child node whose backed-up value set the value at the root

     Minimax Algorithm
     [Figure: minimax on a two-ply tree. The leaves carry static evaluator values 2, 7 and 1, 8; the two MIN nodes back up 2 and 1; the MAX root backs up 2.]
     https://www.youtube.com/watch?v=6ELUvkSkCts
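The minimax procedure above can be sketched on the small tree from the figure (leaf values 2, 7 and 1, 8). This is a minimal sketch, assuming the tree has already been expanded to the lookahead depth and is given as nested Python lists whose numbers are static evaluator values; `minimax` and `best_move` are illustrative names, not part of the slides.

```python
def minimax(node, maximizing=True):
    """Back up values from the leaves; a node is either a number (leaf) or a list of children."""
    if not isinstance(node, list):
        return node  # leaf: static evaluator value
    values = [minimax(child, not maximizing) for child in node]
    # MAX takes the highest child value, MIN the lowest.
    return max(values) if maximizing else min(values)


def best_move(root):
    """Index of the root's child whose backed-up value sets the value at the root."""
    values = [minimax(child, maximizing=False) for child in root]
    return max(range(len(values)), key=values.__getitem__)


tree = [[2, 7], [1, 8]]   # MAX root, two MIN children, four leaves
print(minimax(tree))      # 2
print(best_move(tree))    # 0 (the left move)
```

The two MIN nodes back up min(2, 7) = 2 and min(1, 8) = 1, so the MAX root takes 2 and chooses the first move.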

  4. Example: Nim
     • In Nim, there are a certain number of objects (coins, sticks, etc.) on the table; we’ll play 7-coin Nim
     • Each player in turn has to pick up either one or two objects
     • Whoever picks up the last object loses

     Partial Game Tree for Tic-Tac-Toe
     • f(n) = +1 if the position is a win for X.
     • f(n) = -1 if the position is a win for O.
     • f(n) = 0 if the position is a draw.

     Minimax Tree
     [Figure: minimax tree with MAX and MIN nodes; leaves are labeled with their f value, interior nodes with the value computed by minimax.]

     Nim Game Tree
     • In-class exercise:
       • Draw the minimax search tree for 4-coin Nim
     • Things to consider:
       • What’s your start state?
       • What’s the maximum depth of the tree? Minimum?
     • Pick up either one or two objects
     • Whoever picks up the last object loses

     Improving Minimax
     • Basic problem: must examine a number of states that is exponential in d!
     • Solution: judicious pruning of the search tree
     • “Cut off” whole sections that can’t be part of the best solution
       • Or, sometimes, probably won’t
       • Can be a completeness vs. efficiency tradeoff, esp. in stochastic problem spaces

     Alpha-Beta Pruning
     • We can improve on the performance of the minimax algorithm through alpha-beta pruning
     • Basic idea: “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” (Pat Winston)
     [Figure: MAX root (≥ 2) over two MIN nodes. The first MIN node has leaves 2 and 7 (value = 2); the second has leaves 1 and ? (value ≤ 1).]
     • We don’t need to compute the value at the ? node.
     • No matter what it is, it can’t affect the value of the root node.
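The pruning idea in the small example above (leaves 2, 7, 1, ?) can be checked with a short sketch. This is an illustration, not the course's reference implementation: the tree is assumed to be given as nested lists, `UNKNOWN` is an arbitrary stand-in for the "?" leaf, and the `visited` list exists only to show which leaves were actually evaluated.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax with alpha-beta cutoffs on a nested-list tree; numbers are leaves."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record every leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # a MIN ancestor will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if beta <= alpha:  # a MAX ancestor already has something at least as good
            break
    return value


UNKNOWN = 99  # arbitrary placeholder for the "?" leaf; its value never matters
tree = [[2, 7], [1, UNKNOWN]]
seen = []
print(alphabeta(tree, visited=seen))  # 2
print(seen)                           # [2, 7, 1]
```

After the first MIN node backs up 2, the second MIN node sees the leaf 1, so its value is at most 1 and the search stops before looking at the "?" leaf.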

  5. Alpha-Beta Pruning
     • Traverse the search tree in depth-first order
     • At each MAX node n, α(n) = maximum value found so far
     • At each MIN node n, β(n) = minimum value found so far
       • α starts at -∞ and increases, β starts at +∞ and decreases
     • β-cutoff: Given a MAX node n,
       • Cut off search below n (i.e., don’t look at any more of n’s children) if:
       • α(n) ≥ β(i) for some MIN node ancestor i of n
     • α-cutoff:
       • Stop searching below MIN node n if:
       • β(n) ≤ α(i) for some MAX node ancestor i of n

     Alpha-beta Example (b = 3)
     [Figure: MAX root with value 3 over three MIN nodes. First MIN node: leaves 3, 12, 8 (value 3). Second MIN node: first leaf 2, remaining leaves pruned. Third MIN node: leaves 14, 1, remaining leaf pruned.]

     Alpha-Beta Pruning
     [Figure: search tree with alternating MAX, MIN, MAX levels.]

     Effectiveness of Alpha-Beta
     • Alpha-beta is guaranteed to:
       • Compute the same value for the root node as minimax
       • With ≤ computation
     • Worst case: nothing pruned, examine b^d leaf nodes, where each node has b children and a d-ply search is performed
     • Best case: examine only (2b)^(d/2) leaf nodes.
       • Result is you can search twice as deep as minimax!
       • When each player’s best move is the first alternative generated
     • In Deep Blue, empirically, alpha-beta pruning took the average branching factor from ~35 to ~6!

     Games of Chance
     • Backgammon: a two-player game with uncertainty
     • Players roll dice to determine what moves to make
     • White has just rolled 5 and 6 and has four legal moves:
       • 5-10, 5-11
       • 5-11, 19-24
       • 5-10, 10-16
       • 5-11, 11-16
     • Good for decision making in adversarial problems with skill and luck

     Game Trees with Chance
     [Figure: game tree with chance nodes (“Min Rolls”, “Max Rolls”) between the player levels.]
     • Chance nodes (circles) represent random events
     • For a random event with N outcomes:
       • Chance node has N distinct children
       • Each has a probability
     • Example:
       • Rolling 2 dice → 21 distinct outcomes
       • Not all equally likely!
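The "21 distinct outcomes, not all equally likely" claim for two dice can be verified by enumeration. The sketch below treats a roll as an unordered pair, which is an assumption about how the slide counts outcomes. The `chance_value` helper goes one step beyond the slides: it shows one common way to use the probabilities (weighting each child's value by its probability), with hypothetical child values supplied by the caller.

```python
from fractions import Fraction
from itertools import product


def dice_outcomes():
    """Map each unordered roll of two six-sided dice to its probability."""
    outcomes = {}
    for a, b in product(range(1, 7), repeat=2):  # 36 equally likely ordered rolls
        roll = (min(a, b), max(a, b))            # (5, 6) and (6, 5) are the same roll
        outcomes[roll] = outcomes.get(roll, Fraction(0)) + Fraction(1, 36)
    return outcomes


def chance_value(child_values):
    """Probability-weighted value of a chance node (assumed rule, not stated on the slides).

    child_values maps each roll, e.g. (5, 6), to the value of the child reached by it.
    """
    probs = dice_outcomes()
    return sum(probs[roll] * value for roll, value in child_values.items())


outcomes = dice_outcomes()
print(len(outcomes))      # 21
print(outcomes[(6, 6)])   # 1/36
print(outcomes[(5, 6)])   # 1/18
```

Doubles occur one way out of 36, while every other roll occurs two ways, which is why the 21 children of a dice chance node carry different probabilities.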
