Minimax strategies, alpha-beta pruning
Lirong Xia

How to find good heuristics?
• No truly mechanical way
  - More art than science
• General guideline: relax constraints
  - e.g. Pacman can pass through the walls
• Mimic what you would do

Arc Consistency of a CSP
• A simple form of propagation makes sure all arcs are consistent
  - Delete from the tail!
• If V loses a value, neighbors of V need to be rechecked!
• Arc consistency detects failure earlier than forward checking
• Can be run as a preprocessor or after each assignment
• Might be time-consuming
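The propagation loop on this slide can be sketched in the style of the AC-3 algorithm. This is a minimal illustration, not the lecture's code: the names `revise` and `ac3`, the dictionary representation of domains, and the two-variable example are all assumptions.

```python
from collections import deque

def revise(domains, tail, head, ok):
    """Delete tail values that have no supporting value in the head's domain."""
    removed = False
    for x in list(domains[tail]):
        if not any(ok(tail, x, head, y) for y in domains[head]):
            domains[tail].remove(x)  # delete from the tail
            removed = True
    return removed

def ac3(domains, neighbors, ok):
    """Make all arcs consistent; return False if some domain becomes empty."""
    queue = deque((t, h) for t in domains for h in neighbors[t])
    while queue:
        tail, head = queue.popleft()
        if revise(domains, tail, head, ok):
            if not domains[tail]:
                return False  # failure detected
            # tail lost a value, so arcs pointing at tail are rechecked
            for other in neighbors[tail]:
                if other != head:
                    queue.append((other, tail))
    return True

# Hypothetical example: two variables that must take different values.
domains = {"A": {1}, "B": {1, 2}}
neighbors = {"A": ["B"], "B": ["A"]}
different = lambda v1, x, v2, y: x != y
consistent = ac3(domains, neighbors, different)
```

In this small example the value 1 is deleted from B's domain because A can only be 1, which leaves a single consistent assignment.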

Limitations of Arc Consistency
• After running arc consistency:
  - Can have one solution left
  - Can have multiple solutions left
  - Can have no solutions left (and not know it)

“Sum to 2” game
• Player 1 moves, then player 2, finally player 1 again
• Move = 0 or 1
• Player 1 wins if and only if all moves together sum to 2
• Player 1’s utility is in the leaves; player 2’s utility is the negative of this

Game tree, listed as moves (player 1, player 2, player 1) and player 1’s utility at the leaf:
  0, 0, 0 : -1
  0, 0, 1 : -1
  0, 1, 0 : -1
  0, 1, 1 : +1
  1, 0, 0 : -1
  1, 0, 1 : +1
  1, 1, 0 : +1
  1, 1, 1 : -1

Today’s schedule
• Adversarial games
• Minimax search
• Alpha-beta pruning algorithm

Adversarial Games
• Deterministic, zero-sum games:
  - Tic-tac-toe, chess, checkers
  - The MAX player maximizes the result
  - The MIN player minimizes the result
• Minimax search:
  - A search tree
  - Players alternate turns
  - Each node has a minimax value: the best achievable utility against a rational adversary

Computing Minimax Values
• This is DFS
• Two recursive functions:
  - max-value takes the max of the values of successors
  - min-value takes the min of the values of successors
• Def value(state):
    If the state is a terminal state: return the state’s utility
    If the agent at the state is MAX: return max-value(state)
    If the agent at the state is MIN: return min-value(state)
• Def max-value(state):
    Initialize max = -∞
    For each successor of state:
        Compute value(successor)
        Update max accordingly
    Return max
• Def min-value(state): similar to max-value
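The recursion above can be sketched in Python and run on the “Sum to 2” game. The state representation (a tuple of the moves made so far) and the helper names `is_terminal`, `utility`, `agent`, and `successors` are assumptions made for this illustration.

```python
def is_terminal(moves):
    # The game ends after three moves: player 1, player 2, player 1.
    return len(moves) == 3

def utility(moves):
    # Player 1 (MAX) wins if and only if all moves sum to 2.
    return 1 if sum(moves) == 2 else -1

def agent(moves):
    # Player 1 (MAX) moves first and third; player 2 (MIN) moves second.
    return "MAX" if len(moves) % 2 == 0 else "MIN"

def successors(moves):
    return [moves + (m,) for m in (0, 1)]

def value(state):
    if is_terminal(state):
        return utility(state)
    if agent(state) == "MAX":
        return max_value(state)
    return min_value(state)

def max_value(state):
    best = float("-inf")
    for s in successors(state):
        best = max(best, value(s))
    return best

def min_value(state):
    best = float("inf")
    for s in successors(state):
        best = min(best, value(s))
    return best

root_value = value(())  # minimax value of the empty game
```

Here `value((0,))` is -1 and `value((1,))` is 1, so the minimax value at the root is 1: player 1 can force a win by opening with 1.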

Minimax Example (figure: game tree annotated with minimax values 3, 3, 2, 2)

Tic-tac-toe Game Tree (figure)

Renju
• 15×15 board
• Five in a row horizontally, vertically, or diagonally wins
• No double-3 or double-4 moves for black
• Otherwise black’s winning strategy was computed
  - L. Victor Allis 1994 (PhD thesis)

Minimax Properties
• Time complexity?
  - $O(b^m)$
• Space complexity?
  - $O(bm)$
• For chess, $b \approx 35$, $m \approx 100$
  - Exact solution is completely infeasible
  - But, do we need to explore the whole tree?
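To get a feel for why the exact solution is infeasible, the short calculation below estimates $b^m$ for the chess figures quoted above. It is only a back-of-the-envelope sketch; the variable names are illustrative.

```python
import math

b, m = 35, 100                  # branching factor and depth quoted for chess
exponent = m * math.log10(b)    # log10 of b**m
leaves = b ** m                 # exact value; Python integers are unbounded
digits = len(str(leaves))       # number of decimal digits in b**m
```

The exponent is a little over 154, so a full minimax search would visit on the order of $10^{154}$ leaves.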

Resource Limits
• Cannot search to the leaves
• Depth-limited search
  - Instead, search to a limited depth of the tree
  - Replace terminal utilities with an evaluation function for non-terminal positions
• Guarantee of optimal play is gone

Evaluation Functions
• Functions that score non-terminal states
• Ideal function: returns the minimax utility of the position
• In practice: typically a weighted linear sum of features:
  $\mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s)$
• e.g. $f_1(s) = \#\text{white queens} - \#\text{black queens}$, etc.
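A weighted linear evaluation function of this form might look like the sketch below. The state layout (piece counts per side), the second feature `pawn_diff`, and the weights 9 and 1 are assumptions chosen purely for illustration.

```python
def evaluate(state, features, weights):
    """Eval(s) = w_1 f_1(s) + ... + w_n f_n(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical state: piece counts for each side.
state = {
    "white": {"queen": 1, "pawn": 6},
    "black": {"queen": 0, "pawn": 7},
}

def queen_diff(s):
    # f_1(s) = # white queens - # black queens
    return s["white"]["queen"] - s["black"]["queen"]

def pawn_diff(s):
    # An additional, assumed feature: pawn count difference.
    return s["white"]["pawn"] - s["black"]["pawn"]

weights = [9.0, 1.0]  # assumed weights, not taken from the slides
score = evaluate(state, [queen_diff, pawn_diff], weights)
```

With these assumed numbers the score is 9·1 + 1·(-1) = 8, which favors white.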

Minimax with limited depth
• Suppose you are the MAX player
• Given a depth d and the current state
• Compute value(state, d), which searches down to depth d
  - At depth d, use an evaluation function to estimate the value if the state is non-terminal
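A depth-limited version of minimax can be sketched as follows. The tiny tree, its leaf utilities, and the heuristic estimates are hypothetical, and `depth` here counts the remaining levels to search rather than the depth already reached.

```python
# Hypothetical game tree: internal nodes map to children, leaves have utilities.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
UTILITY = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
ESTIMATE = {"root": 0, "a": 4, "b": 6}  # assumed evaluation function values

def value(state, depth, maximizing):
    if state in UTILITY:          # terminal state: exact utility
        return UTILITY[state]
    if depth == 0:                # depth limit reached: estimate instead
        return ESTIMATE[state]
    child_values = [value(c, depth - 1, not maximizing) for c in TREE[state]]
    return max(child_values) if maximizing else min(child_values)

shallow = value("root", 1, True)  # cuts off at nodes a and b
full = value("root", 2, True)     # reaches every leaf
```

With depth 1 the search trusts the estimates and returns 6 (preferring b), while the full search returns the true minimax value 3 (preferring a). This illustrates how an imperfect evaluation function can change the chosen move.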

Improving minimax: pruning

Pruning in Minimax Search
• An ancestor is a MAX node
  - It already has an option better than my current solution
  - My future solution can only be smaller

Alpha-beta pruning
• Pruning = cutting off parts of the search tree (because you realize you don’t need to look at them)
  - When we considered A* we also pruned large parts of the search tree
• Maintain
  - α = value of the best option for the MAX player along the path so far
  - β = value of the best option for the MIN player along the path so far
  - Initialized to α = -∞ and β = +∞
• Maintain and update α and β for each node
  - α is updated at the MAX player’s nodes
  - β is updated at the MIN player’s nodes

Alpha-Beta Pruning
• General configuration
  - We’re computing the MIN-VALUE at n
  - We’re looping over n’s children
  - n’s value estimate is dropping
  - α is the best value that MAX can get at any choice point along the current path
  - If n becomes worse than α, MAX will avoid it, so we can stop considering n’s other children
  - Define β similarly for MIN
  - α is usually smaller than β
    • Once α ≥ β, return to the upper layer

Alpha-Beta Pruning Example
• α is MAX’s best alternative here or above
• β is MIN’s best alternative here or above

Alpha-Beta Pruning Example (figure: step-by-step trace on a game tree, starting from α = -∞ and β = +∞, raising α at MAX nodes and lowering β at MIN nodes)
• α is MAX’s best alternative here or above
• β is MIN’s best alternative here or above

Alpha-Beta Pseudocode
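The pseudocode on this slide did not survive extraction. The sketch below is one common way to write alpha-beta pruning and is not necessarily the version shown in the lecture. The tree encoding (nested lists, with numbers as leaf utilities) and the example tree are assumptions.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Return the minimax value of node, skipping branches that cannot matter."""
    if not isinstance(node, list):   # leaf: its utility
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)  # alpha is updated at MAX nodes
            if alpha >= beta:         # MIN above will never let play reach here
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)        # beta is updated at MIN nodes
        if alpha >= beta:             # MAX above already has a better option
            break
    return best

# Hypothetical tree: a MAX root over three MIN nodes with three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = alphabeta(tree)
```

On this tree the second MIN node is cut off after its first leaf (2), because MAX already has an option worth 3, and the root value is 3.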

Alpha-Beta Pruning Properties
• This pruning has no effect on the final result at the root
• Values of intermediate nodes might be wrong!
  - Important: children of the root may have the wrong value
• Good ordering of children improves the effectiveness of pruning
• With “perfect ordering”:
  - Time complexity drops to $O(b^{m/2})$
  - Doubles the solvable depth!
  - Your actions look smarter: more forward-looking with a good evaluation function
  - Full search of, e.g., chess is still hopeless…
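The effect of child ordering can be illustrated by counting how many leaves alpha-beta evaluates on the same tree under two orderings. This is a small sketch on a hypothetical tree encoded as nested lists; the function names are illustrative.

```python
import math

def alphabeta_count(node, alpha, beta, maximizing, counter):
    """Alpha-beta search that also counts evaluated leaves in counter[0]."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        v = alphabeta_count(child, alpha, beta, not maximizing, counter)
        if maximizing:
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if alpha >= beta:   # prune the remaining children
            break
    return best

def leaves_visited(tree):
    counter = [0]
    root = alphabeta_count(tree, -math.inf, math.inf, True, counter)
    return root, counter[0]

# The same hypothetical game, with children listed in different orders.
good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
bad_order = [[2, 5, 14], [6, 4, 2], [8, 12, 3]]
```

Both orderings give the root value 3, but the favorable ordering evaluates 5 of the 9 leaves while the unfavorable one evaluates all 9.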

Project 2
• Q1: write an evaluation function for (state, action) pairs
  - The evaluation function is for this question only
• Q2: minimax search with arbitrary depth and multiple MIN players (ghosts)
  - An evaluation function on states has been implemented for you
• Q3: alpha-beta pruning with arbitrary depth and multiple MIN players (ghosts)
