CS 730/830: Intro AI
Adversarial Search
1 handout: slides
You think you know when you can learn, are more sure when you can write, even more when you can teach, but certain when you can program.
Wheeler Ruml (UNH) Lecture 7, CS 730 – 1 / 19

EOLQs

Adversarial Search
■ Problems
■ Different!
■ Minimax
■ Tic-tac-toe
■ Improvements
■ Break
■ α-β Pruning
■ α-β Pseudo-code
■ Why α-β?
■ Progress
■ EOLQs

Planning Problems
Observability: complete, partial, hidden
State: discrete, continuous
Actions: deterministic, stochastic, discrete, continuous
Nature: static, deterministic, stochastic
Interaction: one decision, sequential
Time: static/off-line, on-line, discrete, continuous
Percepts: discrete, continuous, uncertain
Others: solo, cooperative, competitive

Multi-agent is Different
■ Shortest-path (M&C, vacuum, tile puzzle)
◆ want least-cost path to goal at unknown depth
■ Decisions with an adversary (chess, tic-tac-toe)
◆ adversary might prevent path to best goal
◆ want best assured outcome assuming rational opponent
◆ an irrational opponent can only do worse (so we do at least as well)

Adversarial Search: Minimax
Each ply corresponds to half a move.
Terminal states are labeled with value.
incorrect version by Zermelo (1912)
full treatment by von Neumann and Morgenstern (1944)
Can also bound depth and use a static evaluation function on non-terminal states.
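The depth-bounded minimax described above can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the names `minimax`, `children`, and `evaluate`, and the nested-list toy tree, are assumptions made for the example. Here `evaluate` stands for the terminal value at terminal states and for a static evaluation function at the depth bound.

```python
def minimax(state, depth, maximizing, children, evaluate):
    """Depth-bounded minimax value of `state`.

    children(state) -> list of successor states (empty at terminal states)
    evaluate(state) -> exact value at terminal states, heuristic estimate
                       (static evaluation function) at the depth bound
    All names are illustrative assumptions, not the lecture's code.
    """
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)
    values = [minimax(child, depth - 1, not maximizing, children, evaluate)
              for child in successors]
    # Max picks the largest backed-up value, Min the smallest.
    return max(values) if maximizing else min(values)


# Toy two-ply tree: inner lists are positions, numbers are terminal values.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]


def children(state):
    return state if isinstance(state, list) else []


def evaluate(state):
    # Terminal states carry their value; cut-off positions get a neutral 0.
    return state if not isinstance(state, list) else 0


print(minimax(tree, 2, True, children, evaluate))  # prints 3
```

With Max to move, the three Min nodes back up 3, 2, and 2, so the root value is 3. Calling the same function with `depth=1` stops at the Min nodes and returns the placeholder evaluation instead.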

Evaluation for Tic-tac-toe
A 3-length is a complete row, column, or diagonal.
value of position = ∞ if win for me,
or = −∞ if a win for you,
otherwise = # 3-lengths open for me − # 3-lengths open for you
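A possible Python rendering of this evaluation function is sketched below. The slide does not define "open", so the sketch assumes a 3-length is open for a player when it contains none of the opponent's marks. The board encoding (a 9-character string with `.` for empty cells) and the names `LINES` and `evaluate` are likewise assumptions for illustration.

```python
import math

# The eight 3-lengths of a 3x3 board stored as 9 cells, row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]


def evaluate(board, me="X", you="O"):
    """Static evaluation of `board` from the point of view of `me`."""
    for line in LINES:
        marks = [board[i] for i in line]
        if all(m == me for m in marks):
            return math.inf       # win for me
        if all(m == you for m in marks):
            return -math.inf      # win for you
    # Assumption: a 3-length is open for a player if the opponent
    # has no mark anywhere on it.
    open_for_me = sum(1 for line in LINES
                      if all(board[i] != you for i in line))
    open_for_you = sum(1 for line in LINES
                       if all(board[i] != me for i in line))
    return open_for_me - open_for_you


print(evaluate("....X...."))  # X in the center: 8 - 4, prints 4
print(evaluate("O...X...."))  # O answers in a corner: 5 - 4, prints 1
```

Under this reading, the center is the best opening square (value 4), ahead of a corner (3) or an edge (2).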

Tic-tac-toe: two-ply search

Tic-tac-toe: second move

Tic-tac-toe: third move

Improving the Search
■ partial expansion, SEF
■ symmetry (‘transposition tables’)
■ search more ply as we have time (De Groot figure)
■ avoid unnecessary evaluations
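The slide only names the symmetry idea, so the following is one hypothetical way it could look for tic-tac-toe: map each board to a canonical representative of its eight rotations and reflections, and use that as the key of a lookup table. The function names (`rotate`, `reflect`, `canonical`, `lookup_or_compute`) and the flat 9-cell board encoding are assumptions for illustration.

```python
def rotate(board):
    """Rotate a flat 3x3 board 90 degrees clockwise."""
    return tuple(board[3 * (2 - c) + r] for r in range(3) for c in range(3))


def reflect(board):
    """Mirror a flat 3x3 board left to right."""
    return tuple(board[3 * r + (2 - c)] for r in range(3) for c in range(3))


def canonical(board):
    """Smallest of the 8 symmetric variants, used as a table key."""
    variants = []
    for start in (tuple(board), reflect(tuple(board))):
        current = start
        for _ in range(4):
            variants.append(current)
            current = rotate(current)
    return min(variants)


def lookup_or_compute(board, table, compute):
    """Return a cached value for any board symmetric to one seen before."""
    key = canonical(board)
    if key not in table:
        table[key] = compute(key)
    return table[key]


# The nine possible first moves collapse to corner, edge, and center.
first_moves = ["." * i + "X" + "." * (8 - i) for i in range(9)]
print(len({canonical(b) for b in first_moves}))  # prints 3
```

This only pays off when the cached value is itself unchanged by rotation and reflection, as game values in tic-tac-toe are. A real search would also need the side to move to be part of the key or recoverable from the board.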

Break
■ asst 3
■ asst 4
■ projects! talk with me well before break

Which Values are Necessary?

α-β Pruning
α: best outcome Max can force at previous decision on this path (init to −∞)
β: best outcome Min can force at previous decision on this path (init to ∞)
α and β values are copied down the tree (but not up).
Minimax values are passed up the tree, as usual.
John McCarthy (1956 but never published)
simple version used by Newell, Shaw, and Simon (1958)
published by Hart and Edwards (1961)
proved correct and analyzed by Knuth and Moore (1975)
proved optimal by Pearl (1982)

α-β Pseudo-code
Max-value(state, α, β):
  when depth-cutoff(state), return SEF(state)
  for each child of state
    α ← max(α, Min-value(child, α, β))
    when α ≥ β, return α
  return α
Min-value(state, α, β):
  when depth-cutoff(state), return SEF(state)
  for each child of state
    β ← min(β, Max-value(child, α, β))
    when β ≤ α, return β
  return β
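A direct Python transcription of the pseudo-code above might look like this. The helpers `children`, `depth_cutoff`, and `sef` are passed in as parameters, and the nested-list toy tree and the `evaluated` log are assumptions added for the example so the pruning can be observed.

```python
import math


def max_value(state, alpha, beta, children, depth_cutoff, sef):
    if depth_cutoff(state):
        return sef(state)
    for child in children(state):
        alpha = max(alpha, min_value(child, alpha, beta,
                                     children, depth_cutoff, sef))
        if alpha >= beta:      # Min already has a better option elsewhere
            return alpha
    return alpha


def min_value(state, alpha, beta, children, depth_cutoff, sef):
    if depth_cutoff(state):
        return sef(state)
    for child in children(state):
        beta = min(beta, max_value(child, alpha, beta,
                                   children, depth_cutoff, sef))
        if beta <= alpha:      # Max already has a better option elsewhere
            return beta
    return beta


# Toy two-ply tree: inner lists are positions, numbers are leaf values.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
evaluated = []                 # records every leaf the search looks at


def sef(leaf):
    evaluated.append(leaf)
    return leaf


result = max_value(tree, -math.inf, math.inf,
                   children=lambda s: s,
                   depth_cutoff=lambda s: not isinstance(s, list),
                   sef=sef)
print(result)     # prints 3
print(evaluated)  # prints [3, 12, 8, 2, 14, 5, 2]
```

The leaves 4 and 6 are never evaluated: once the second Min node sees 2, its value cannot exceed α = 3, so the remaining children are skipped. The third Min node is searched in full only because its small value comes last, which shows how move ordering affects the amount of pruning.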

α-β in action

Why α-β?
Time complexity of α-β is about O(b^(d/2)) with good move ordering, versus O(b^d) for plain minimax (branching factor b, depth d).
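To make the b^(d/2) figure concrete, the sketch below counts leaf evaluations of α-β on a uniform tree and compares them with the best-case count b^⌈d/2⌉ + b^⌊d/2⌋ − 1 from Knuth and Moore's analysis. As a stand-in for perfect move ordering, it assumes every leaf has the same value, so each cutoff test succeeds as early as possible. The function name `alphabeta_leaf_count` is an assumption for the example.

```python
import math


def alphabeta_leaf_count(b, d):
    """Count leaf evaluations by alpha-beta on a uniform tree.

    b is the branching factor and d the depth. All leaves evaluate to 0,
    an assumption that mimics the best-case move ordering.
    """
    count = 0

    def max_value(depth, alpha, beta):
        nonlocal count
        if depth == 0:
            count += 1
            return 0
        for _ in range(b):
            alpha = max(alpha, min_value(depth - 1, alpha, beta))
            if alpha >= beta:
                return alpha
        return alpha

    def min_value(depth, alpha, beta):
        nonlocal count
        if depth == 0:
            count += 1
            return 0
        for _ in range(b):
            beta = min(beta, max_value(depth - 1, alpha, beta))
            if beta <= alpha:
                return beta
        return beta

    max_value(d, -math.inf, math.inf)
    return count


for b, d in [(2, 4), (3, 4), (10, 4)]:
    best_case = b ** math.ceil(d / 2) + b ** (d // 2) - 1
    # columns: b, d, leaves seen by alpha-beta, best-case formula, b^d
    print(b, d, alphabeta_leaf_count(b, d), best_case, b ** d)
```

For b = 10 and d = 4, this evaluates 199 leaves where plain minimax would evaluate 10,000. Put differently, in the best case α-β can search roughly twice as deep for the same effort.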

Progress on Games
Computers best: chess, checkers, backgammon, Scrabble, Jeopardy, Go
Computers competitive: bridge, crosswords, poker
Computers amateur: soccer?

EOLQs
Please write down the most pressing question you have about the course material covered so far and put it in the box on your way out.
Thanks!
