Adversarial Search - CE417: Introduction to Artificial Intelligence

Adversarial Search
CE417: Introduction to Artificial Intelligence
Sharif University of Technology, Spring 2017
Soleymani
"Artificial Intelligence: A Modern Approach", 3rd Edition, Chapter 5

Outline
- Game as a search problem
- Minimax algorithm
- α-β pruning: ignoring a portion of the search tree
- Time limit problem
- Cutoff and evaluation function

Games as search problems
- Games
  - Adversarial search problems (goals are in conflict)
  - Competitive multi-agent environments
- Games in AI are a specialized kind of game (in the game-theory sense)

Primary assumptions
- Common games in AI are:
  - Two-player
  - Turn-taking: agents act alternately
  - Zero-sum: agents' goals are in conflict; the sum of utility values at the end of the game is zero or constant
  - Deterministic
  - Perfect information: fully observable

Game as a kind of search problem
- Initial state $S_0$, set of states (each state also records whose turn it is), $\mathrm{ACTIONS}(s)$ and $\mathrm{RESULT}(s, a)$, as in standard search
- $\mathrm{PLAYER}(s)$: defines which player moves in state $s$
- $\mathrm{TERMINAL\_TEST}(s)$: true when the game has ended
- $\mathrm{UTILITY}(s, p)$: utility or payoff function $U: S \times P \to \mathbb{R}$ (how good terminal state $s$ is for player $p$)
  - Zero-sum (constant-sum) game: the total payoff to all players is zero (or constant) for every terminal state
- Utilities are assigned at the end of the game, instead of summing action costs along the path
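The components above can be sketched for a toy game. This is a minimal illustration only: the game (players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins) and all function names are assumptions chosen for the example, not part of the lecture.

```python
from typing import List, Tuple

# A state records the stones left and whose turn it is (hypothetical toy game).
State = Tuple[int, str]

def initial_state(stones: int = 4) -> State:
    """S_0: MAX moves first."""
    return (stones, "MAX")

def player(s: State) -> str:
    """PLAYER(s): which player moves in state s."""
    return s[1]

def actions(s: State) -> List[int]:
    """ACTIONS(s): take 1 or 2 stones, if that many remain."""
    return [k for k in (1, 2) if k <= s[0]]

def result(s: State, a: int) -> State:
    """RESULT(s, a): remove a stones and pass the turn."""
    stones, p = s
    return (stones - a, "MIN" if p == "MAX" else "MAX")

def terminal_test(s: State) -> bool:
    """TERMINAL_TEST(s): the game ends when the pile is empty."""
    return s[0] == 0

def utility(s: State, p: str) -> int:
    """UTILITY(s, p): +1 if p took the last stone, otherwise -1 (zero-sum)."""
    winner = "MIN" if s[1] == "MAX" else "MAX"  # the player who just moved
    return 1 if p == winner else -1
```

For every terminal state of this toy game, `utility(s, "MAX") + utility(s, "MIN")` is zero, which is the zero-sum property stated above.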

Game tree (tic-tac-toe)
- Two players: $P_1$ and $P_2$ ($P_1$ is now searching for a good move)
- Zero-sum games: $P_1$ gets $U(t)$, $P_2$ gets $C - U(t)$ for terminal node $t$
- $P_1$ plays X, $P_2$ plays O; tree levels alternate between $P_1$ and $P_2$
- 1 ply = a half move
- Utilities are shown from the point of view of $P_1$ (MAX)


Optimal play
- The opponent is assumed to play optimally
- The minimax function is used to find the utility of each state
- MAX wants to maximize the terminal payoff; MIN wants to minimize it
- MAX gets $U(t)$ for terminal node $t$

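Minimax

$$
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s, \mathrm{MAX}) & \text{if } \mathrm{TERMINAL\_TEST}(s) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
$$

- $\mathrm{MINIMAX}(s)$ is the utility of being in state $s$: the best achievable outcome from $s$ (assumption: optimal opponent)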

Minimax (cont.)
- Optimal strategy: move to the state with the highest minimax value
  - Best achievable payoff against best play
  - Maximizes the worst-case outcome for MAX
- It works for zero-sum games

Minimax algorithm
- Depth-first search

    function MINIMAX_DECISION(state) returns an action
        return arg max over a in ACTIONS(state) of MIN_VALUE(RESULT(state, a))

    function MAX_VALUE(state) returns a utility value
        if TERMINAL_TEST(state) then return UTILITY(state)
        v ← −∞
        for each a in ACTIONS(state) do
            v ← MAX(v, MIN_VALUE(RESULT(state, a)))
        return v

    function MIN_VALUE(state) returns a utility value
        if TERMINAL_TEST(state) then return UTILITY(state)
        v ← +∞
        for each a in ACTIONS(state) do
            v ← MIN(v, MAX_VALUE(RESULT(state, a)))
        return v
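The pseudocode above can be sketched in Python. As a simplifying assumption for this example, the game is given as an explicit tree of nested lists: a number is a terminal utility for MAX, a list holds the successor states, and MAX moves at the root. The helper names are illustrative.

```python
import math

def is_terminal(node):
    """A bare number is a terminal state carrying MAX's utility."""
    return not isinstance(node, list)

def max_value(node):
    """Value of a node where MAX is to move."""
    if is_terminal(node):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child))
    return v

def min_value(node):
    """Value of a node where MIN is to move."""
    if is_terminal(node):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child))
    return v

def minimax_decision(node):
    """Index of the move at the root with the highest minimax value."""
    return max(range(len(node)), key=lambda i: min_value(node[i]))

# Two-ply example: MAX chooses a subtree, then MIN chooses a leaf.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree), minimax_decision(tree))  # prints "3 0"
```

The three MIN nodes are worth 3, 2 and 2, so MAX's best move is the first one, with value 3.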

Properties of minimax
- Complete? Yes (when the tree is finite)
- Optimal? Yes (against an optimal opponent)
- Time complexity: $O(b^m)$, where $b$ is the branching factor and $m$ is the maximum depth of the tree
- Space complexity: $O(bm)$ (depth-first exploration)
- For chess, $b \approx 35$ and $m > 50$ for reasonable games
  - Finding the exact solution is completely infeasible
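As a rough worked estimate of the chess figures above, taking $m = 50$ (the lower bound quoted on the slide) as an illustrative value:

```latex
b^m = 35^{50} = 10^{\,50 \log_{10} 35} \approx 10^{\,50 \times 1.544} \approx 10^{77}
```

An exhaustive minimax search would therefore have to visit on the order of $10^{77}$ nodes, which is why an exact solution is out of reach.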

Pruning
- Find the correct minimax decision without looking at every node in the game tree
- α-β pruning
  - A branch-and-bound algorithm
  - Prunes away branches that cannot influence the final decision

[Figure: α-β pruning example]

[Figure: α-β progress]

α-β pruning
- Assumes depth-first generation of the tree
- Node $n$ is pruned when the player has a better choice $m$ at the parent or at any ancestor of $n$
- Two types of pruning (cuts):
  - pruning of max nodes (α-cuts)
  - pruning of min nodes (β-cuts)

Why is it called α-β?
- α: value of the best (highest) choice found so far at any choice point along the path for MAX
- β: value of the best (lowest) choice found so far at any choice point along the path for MIN
- α and β are updated during the search
- For a MAX node, once its value $v$ is known to be at least the current β ($v \ge \beta$), its remaining branches are pruned
- For a MIN node, once its value $v$ is known to be at most the current α ($v \le \alpha$), its remaining branches are pruned

[Figure: α-β pruning (another example)]

    function ALPHA_BETA_SEARCH(state) returns an action
        v ← MAX_VALUE(state, −∞, +∞)
        return the action in ACTIONS(state) with value v

    function MAX_VALUE(state, α, β) returns a utility value
        if TERMINAL_TEST(state) then return UTILITY(state)
        v ← −∞
        for each a in ACTIONS(state) do
            v ← MAX(v, MIN_VALUE(RESULT(state, a), α, β))
            if v ≥ β then return v
            α ← MAX(α, v)
        return v

    function MIN_VALUE(state, α, β) returns a utility value
        if TERMINAL_TEST(state) then return UTILITY(state)
        v ← +∞
        for each a in ACTIONS(state) do
            v ← MIN(v, MAX_VALUE(RESULT(state, a), α, β))
            if v ≤ α then return v
            β ← MIN(β, v)
        return v
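A minimal Python sketch of the α-β pseudocode above. It assumes the game is an explicit tree of nested lists (a number is a terminal utility for MAX, a list holds successors, MAX moves at the root). Unlike the pseudocode, this illustrative version returns the root value plus the list of leaves it actually evaluated, so the effect of pruning is visible.

```python
import math

def alpha_beta_search(tree):
    """Return (minimax value of the root, leaves evaluated in visiting order)."""
    visited = []

    def max_value(node, alpha, beta):
        if not isinstance(node, list):  # terminal state
            visited.append(node)
            return node
        v = -math.inf
        for child in node:
            v = max(v, min_value(child, alpha, beta))
            if v >= beta:               # MIN above would never allow this
                return v
            alpha = max(alpha, v)
        return v

    def min_value(node, alpha, beta):
        if not isinstance(node, list):  # terminal state
            visited.append(node)
            return node
        v = math.inf
        for child in node:
            v = min(v, max_value(child, alpha, beta))
            if v <= alpha:              # MAX above already has something better
                return v
            beta = min(beta, v)
        return v

    value = max_value(tree, -math.inf, math.inf)
    return value, visited

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value, visited = alpha_beta_search(tree)
print(value, visited)  # prints "3 [3, 12, 8, 2, 14, 5, 2]"
```

The value matches plain minimax, but the leaves 4 and 6 are never evaluated: once the second MIN node is known to be worth at most 2, it cannot beat the 3 that MAX already has.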

Order of moves
- Good move ordering improves the effectiveness of pruning
- Best order: time complexity is $O(b^{m/2})$
- Random order: time complexity is about $O(b^{3m/4})$ for moderate $b$
- α-β pruning only partly reduces the search time
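The effect of move ordering can be seen on a small example. The sketch below is illustrative only: it assumes nested-list game trees (numbers are terminal utilities for MAX, MAX moves at the root) and counts how many leaves α-β evaluates for two orderings of the same game. The two example trees are made up for the demonstration.

```python
import math

def count_leaves(tree):
    """Run alpha-beta on a nested-list tree; return (value, leaves evaluated)."""
    leaves = 0

    def search(node, alpha, beta, maximizing):
        nonlocal leaves
        if not isinstance(node, list):  # terminal state
            leaves += 1
            return node
        v = -math.inf if maximizing else math.inf
        for child in node:
            child_v = search(child, alpha, beta, not maximizing)
            if maximizing:
                v = max(v, child_v)
                if v >= beta:
                    return v
                alpha = max(alpha, v)
            else:
                v = min(v, child_v)
                if v <= alpha:
                    return v
                beta = min(beta, v)
        return v

    return search(tree, -math.inf, math.inf, True), leaves

# Same game, different move order (best move first vs. best move last).
good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
bad_order = [[14, 5, 2], [6, 4, 2], [12, 8, 3]]
print(count_leaves(good_order))  # prints "(3, 5)"
print(count_leaves(bad_order))   # prints "(3, 9)"
```

Both orderings give the same minimax value, but the good ordering evaluates 5 of the 9 leaves while the bad ordering evaluates all 9.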

Computational time limit (example)
- 100 seconds are allowed for each move (game rule)
- $10^4$ nodes/sec (processor speed)
- So only $10^6$ nodes can be explored per move
- $b^m = 10^6$, $b = 35 \Rightarrow m \approx 4$ (a 4-ply look-ahead makes a hopeless chess player!)
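The arithmetic in this example can be checked with a few lines of Python. The variable names are illustrative; the numbers are the ones on the slide.

```python
import math

seconds_per_move = 100    # game rule
nodes_per_second = 10**4  # processor speed
branching_factor = 35     # typical for chess

# Total nodes that can be examined for one move.
node_budget = seconds_per_move * nodes_per_second

# Solve b^m = node_budget for the reachable depth m.
depth = math.log(node_budget) / math.log(branching_factor)

print(node_budget, round(depth, 2))  # prints "1000000 3.89"
```

The reachable depth is just under 4 plies, matching the slide's $m \approx 4$.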

Computational time limit: solution
- We must make a decision even when finding the optimal move is infeasible
- Cut off the search and apply a heuristic evaluation function
  - Cutoff test: turns non-terminal nodes into terminal leaves
    - Use a cutoff test instead of the terminal test (e.g., a depth limit)
  - Evaluation function: estimated desirability of a state
    - Use a heuristic evaluation function instead of the utility function
- This approach does not guarantee optimality

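Heuristic minimax

$$
\mathrm{H\_MINIMAX}(s, d) =
\begin{cases}
\mathrm{EVAL}(s, \mathrm{MAX}) & \text{if } \mathrm{CUTOFF\_TEST}(s, d) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{H\_MINIMAX}(\mathrm{RESULT}(s, a), d + 1) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{H\_MINIMAX}(\mathrm{RESULT}(s, a), d + 1) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
$$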
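A minimal sketch of depth-limited (heuristic) minimax following the definition above. Several things here are assumptions for the example: each node is a pair `(estimate, children)`, the stored estimate stands in for EVAL, terminal nodes store their true utility, the cutoff test is a plain depth limit, and the estimates in the sample tree are invented.

```python
def cutoff_test(node, depth, limit):
    """Stop at the depth limit or at a terminal node (no children)."""
    _, children = node
    return depth >= limit or not children

def evaluate(node):
    """EVAL: here simply the heuristic estimate stored in the node."""
    return node[0]

def h_minimax(node, depth, limit, maximizing=True):
    """Depth-limited minimax; MAX moves at the root."""
    if cutoff_test(node, depth, limit):
        return evaluate(node)
    values = [h_minimax(child, depth + 1, limit, not maximizing)
              for child in node[1]]
    return max(values) if maximizing else min(values)

def leaf(u):
    """Terminal node whose estimate equals its true utility."""
    return (u, [])

# Internal estimates (4, 3, 6) are made-up heuristic guesses.
tree = (0, [
    (4, [leaf(3), leaf(12), leaf(8)]),
    (3, [leaf(2), leaf(4), leaf(6)]),
    (6, [leaf(14), leaf(5), leaf(2)]),
])

print(h_minimax(tree, 0, limit=1))  # prints "6" (trusts the estimates)
print(h_minimax(tree, 0, limit=2))  # prints "3" (true minimax value)
```

With a depth limit of 1 the search trusts the estimates and prefers the third move, even though full-depth search shows the first move is best. This is the sense in which cutting off the search does not guarantee optimality.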

Evaluation functions
- For terminal states, it should order them in the same way as the true utility function
- For non-terminal states, it should be strongly correlated with the actual chances of winning
- It must be cheap to compute
