1 - Pruning Pseudocode - Pruning Properties Pruning has no - PDF document

Introduction to Artificial Intelligence
V22.0472-001, Fall 2009
Lecture 6: Expectimax Search
Rob Fergus – Dept of Computer Science, Courant Institute, NYU
Many slides from Dan Klein, Stuart Russell or Andrew Moore

Recap: Resource Limits
• Cannot search to leaves
• Depth-limited search
  • Instead, search a limited depth of tree
  • Replace terminal utilities with an eval function for non-terminal positions
• Guarantee of optimal play is gone
• Replanning agents:
  • Search to choose next action
  • Replan each new turn in response to new state
[Figure: depth-limited minimax tree; max root with value 4, min nodes with values -2 and 4, leaves -1, -2, 4, 9, unexplored subtrees marked "?"]

Evaluation for Pacman
[Figure: Pacman evaluation example]

Iterative Deepening
Iterative deepening uses DFS as a subroutine:
1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)
2. If “1” failed, do a DFS which only searches paths of length 2 or less.
3. If “2” failed, do a DFS which only searches paths of length 3 or less.
…and so on.
This works for single-agent search as well!
Why do we want to do this for multiplayer games?

α-β Pruning Example
[Figure: α-β pruning example tree]

α-β Pruning
• General configuration
  • α is the best value that MAX can get at any choice point along the current path
  • If n becomes worse than α, MAX will avoid it, so can stop considering n’s other children
  • Define β similarly for MIN
[Figure: alternating Player / Opponent levels, with α at a Player node and n at a deeper Opponent node]
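The α-β rule described above can be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's own pseudocode: the nested-list tree, the `children` and `evaluate` helpers, and the `visited` log are all assumptions made for the example. The depth argument plays the role of the depth limit from the resource-limits recap.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax value of `node` with alpha-beta pruning."""
    kids = children(node)
    if depth == 0 or not kids:
        # Leaf or depth cutoff: fall back on the evaluation function.
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            if value >= beta:
                break  # MIN above already has a better option; prune.
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        if value <= alpha:
            break  # MAX above already has a better option; prune.
        beta = min(beta, value)
    return value

# Hypothetical example tree: inner lists are MIN nodes, numbers are leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []  # records every node handed to the evaluation function

def children(node):
    return node if isinstance(node, list) else []

def evaluate(node):
    visited.append(node)
    # Placeholder heuristic (0) for non-terminal nodes hit by the depth cutoff.
    return node if isinstance(node, (int, float)) else 0

best = alphabeta(tree, 2, -math.inf, math.inf, True, children, evaluate)
print(best, visited)
```

In this toy tree the second MIN node is abandoned after its first leaf, since that leaf is already worse for MAX than the value found under the first MIN node. The root value is the same as plain minimax would return.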

α-β Pruning Pseudocode
[Figure: α-β pruning pseudocode, using the variables β and v]

α-β Pruning Properties
• Pruning has no effect on final result
• Good move ordering improves effectiveness of pruning
• With “perfect ordering”:
  • Time complexity drops to O(b^(m/2))
  • Doubles solvable depth
  • Full search of, e.g. chess, is still hopeless!
• A simple example of metareasoning, here reasoning about which computations are relevant

Expectimax Search Trees
• What if we don’t know what the result of an action will be? E.g.,
  • In solitaire, next card is unknown
  • In minesweeper, mine locations
  • In pacman, the ghosts act randomly
• Can do expectimax search
  • Chance nodes, like min nodes, except the outcome is uncertain
  • Calculate expected utilities
  • Max nodes as in minimax search
  • Chance nodes take average (expectation) of value of children
• Later, we’ll learn how to formalize the underlying problem as a Markov Decision Process
[Figure: max node over chance nodes, with leaf values 10, 4, 5, 7]

Maximum Expected Utility
• Why should we average utilities? Why not minimax?
• Principle of maximum expected utility: an agent should choose the action which maximizes its expected utility, given its knowledge
  • General principle for decision making
  • Often taken as the definition of rationality
  • We’ll see this idea over and over in this course!
• Let’s decompress this definition…

Reminder: Probabilities
• A random variable represents an event whose outcome is unknown
• A probability distribution is an assignment of weights to outcomes
• Example: traffic on freeway?
  • Random variable: T = whether there’s traffic
  • Outcomes: T in {none, light, heavy}
  • Distribution: P(T=none) = 0.25, P(T=light) = 0.55, P(T=heavy) = 0.20
• Some laws of probability (more later):
  • Probabilities are always non-negative
  • Probabilities over all possible outcomes sum to one
• As we get more evidence, probabilities may change:
  • P(T=heavy) = 0.20, P(T=heavy | Hour=8am) = 0.60
  • We’ll talk about methods for reasoning and updating probabilities later

What are Probabilities?
• Objectivist / frequentist answer:
  • Averages over repeated experiments
  • E.g. empirically estimating P(rain) from historical observation
  • Assertion about how future experiments will go (in the limit)
  • New evidence changes the reference class
  • Makes one think of inherently random events, like rolling dice
• Subjectivist / Bayesian answer:
  • Degrees of belief about unobserved variables
  • E.g. an agent’s belief that it’s raining, given the temperature
  • E.g. pacman’s belief that the ghost will turn left, given the state
  • Often learn probabilities from past experiences (more later)
  • New evidence updates beliefs (more later)
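As a small illustration of the frequentist reading and of the two laws of probability listed for the traffic example, a distribution can be estimated from counts and then checked. The observation list below is hypothetical, chosen only so that the estimate reproduces the slide's numbers. The helper names `empirical_distribution` and `is_distribution` are likewise made up for this sketch.

```python
import math
from collections import Counter

def empirical_distribution(observations):
    """Estimate P(outcome) as the relative frequency of each outcome."""
    counts = Counter(observations)
    total = len(observations)
    return {outcome: n / total for outcome, n in counts.items()}

def is_distribution(p, tol=1e-9):
    """Check the two laws: weights are non-negative and sum to one."""
    non_negative = all(w >= 0 for w in p.values())
    sums_to_one = math.isclose(sum(p.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# Hypothetical record of 20 past commutes (assumed data, not from the slides).
observations = ["none"] * 5 + ["light"] * 11 + ["heavy"] * 4
P_T = empirical_distribution(observations)

print(P_T)
print(is_distribution(P_T))
```

Conditioning on evidence such as the hour of day would use the same estimate restricted to the matching observations, which is one way to read "new evidence changes the reference class".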

Uncertainty Everywhere
• Not just for games of chance!
  • I’m snuffling: am I sick?
  • Email contains “FREE!”: is it spam?
  • Tooth hurts: have cavity?
  • 60 min enough to get to the airport?
  • Robot rotated wheel three times, how far did it advance?
  • Safe to cross street? (Look both ways!)
• Why can a random variable have uncertainty?
  • Inherently random process (dice, etc)
  • Insufficient or weak evidence
  • Ignorance of underlying processes
  • Unmodeled variables
  • The world’s just noisy!
• Compare to fuzzy logic, which has degrees of truth, rather than just degrees of belief

Reminder: Expectations
• Often a quantity of interest depends on a random variable
• The expected value of a function is its average output, weighted by a given distribution over inputs
• Example: How late if I leave 60 min before my flight?
  • Lateness is a function of traffic: L(none) = -10, L(light) = -5, L(heavy) = 15
• What is my expected lateness?
  • Need to specify some belief over T to weight the outcomes
  • Say P(T) = {none: 2/5, light: 2/5, heavy: 1/5}
  • The expected lateness: E[L(T)] = (2/5)(-10) + (2/5)(-5) + (1/5)(15) = -3

Expectations
• Real valued functions of random variables: f : X → R
• Expectation of a function of a random variable: E[f(X)] = Σ_x f(x) P(x)
• Example: Expected value of a fair die roll

  X    P     f
  1    1/6   1
  2    1/6   2
  3    1/6   3
  4    1/6   4
  5    1/6   5
  6    1/6   6

  E[f(X)] = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5

Utilities
• Utilities are functions from outcomes (states of the world) to real numbers that describe an agent’s preferences
• Where do utilities come from?
  • In a game, may be simple (+1/-1)
  • Utilities summarize the agent’s goals
  • Theorem: any set of preferences between outcomes can be summarized as a utility function (provided the preferences meet certain conditions)
• In general, we hard-wire utilities and let actions emerge (why don’t we let agents decide their own utilities?)
• More on utilities soon…

Expectimax Search
• In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state
  • Model could be a simple uniform distribution (roll a die)
  • Model could be sophisticated and require a great deal of computation
  • We have a node for every outcome out of our control: opponent or environment
  • The model might say that adversarial actions are likely!
• For now, assume for any state we magically have a distribution to assign probabilities to opponent actions / environment outcomes
• Having a probabilistic belief about an agent’s action does not mean that agent is flipping any coins!
[Figure: expectimax tree with leaf values 8, 4, 5, 6]

Expectimax Pseudocode

    def value(s)
        if s is a max node return maxValue(s)
        if s is an exp node return expValue(s)
        if s is a terminal node return evaluation(s)

    def maxValue(s)
        values = [value(s’) for s’ in successors(s)]
        return max(values)

    def expValue(s)
        values = [value(s’) for s’ in successors(s)]
        weights = [probability(s, s’) for s’ in successors(s)]
        return expectation(values, weights)
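The expectimax pseudocode can be turned into a small runnable Python sketch. The node encoding used here is an assumption made for the example, not something the slides specify: a number is a terminal value, `("max", [children])` is a max node, and `("exp", [(probability, child), ...])` is a chance node. The example tree and its 50/50 probabilities are hypothetical.

```python
def value(node):
    """Expectimax value of a node in the tuple encoding described above."""
    if isinstance(node, (int, float)):
        return node  # terminal node: its utility / evaluation
    kind, branches = node
    if kind == "max":
        return max_value(branches)
    if kind == "exp":
        return exp_value(branches)
    raise ValueError(f"unknown node kind: {kind!r}")

def max_value(children):
    """Max node: best value among the children."""
    return max(value(child) for child in children)

def exp_value(weighted_children):
    """Chance node: probability-weighted average of the children."""
    return sum(p * value(child) for p, child in weighted_children)

# Hypothetical tree: a max node choosing between two fair coin-flip outcomes.
tree = ("max", [
    ("exp", [(0.5, 8), (0.5, 4)]),   # expected value 6.0
    ("exp", [(0.5, 5), (0.5, 6)]),   # expected value 5.5
])

print(value(tree))
```

Folding the weights into each chance node keeps `exp_value` a one-line expectation. A separate `probability(s, s')` function, as in the pseudocode, would be the natural choice when the model is computed on demand.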

Expectimax for Pacman
• Notice that we’ve gotten away from thinking that the ghosts are trying to minimize pacman’s score
• Instead, they are now a part of the environment
• Pacman has a belief (distribution) over how they will act
• Quiz: Can we see minimax as a special case of expectimax?
• Quiz: what would pacman’s computation look like if we assumed that the ghosts were doing 1-ply minimax and taking the result 80% of the time, otherwise moving randomly?
• If you take this further, you end up calculating belief distributions over your opponents’ belief distributions over your belief distributions, etc…
  • Can get unmanageable very quickly!

Expectimax Pruning?
[Figure: expectimax pruning example]

Expectimax Evaluation
• For minimax search, evaluation function insensitive to monotonic transformations
  • We just want better states to have higher evaluations (get the ordering right)
• For expectimax, we need the magnitudes to be meaningful as well
  • E.g. must know whether a 50% / 50% lottery between A and B is better than 100% chance of C
  • 100 or -10 vs 0 is different than 10 or -100 vs 0

Mixed Layer Types
• E.g. Backgammon
• Expectiminimax
  • Environment is an extra player that moves after each agent
  • Chance nodes take expectations, otherwise like minimax

Stochastic Two-Player
• Dice rolls increase b: 21 possible rolls with 2 dice
  • Backgammon ≈ 20 legal moves
  • Depth 4 = 20 x (21 x 20)^3 ≈ 1.5 x 10^9
• As depth increases, probability of reaching a given node shrinks
  • So value of lookahead is diminished
  • So limiting depth is less damaging
  • But pruning is less possible…
• TDGammon uses depth-2 search + very good eval function + reinforcement learning: world-champion level play

Non-Zero-Sum Games
• Similar to minimax:
  • Utilities are now tuples
  • Each player maximizes their own entry at each node
  • Propagate (or back up) nodes from children
• Can give rise to cooperation and competition dynamically…
[Figure: three-player game tree with leaf utility tuples (1,2,6), (4,3,2), (6,1,2), (7,4,1), (5,1,1), (1,5,2), (7,7,1), (5,4,5)]
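The backup rule for non-zero-sum games, where each player maximizes their own entry of the utility tuple, might be sketched as below. The leaf tuples are the ones listed on the slide, but the tree shape is an assumption: a complete binary tree in which players 0, 1 and 2 move in turn from the root. The function name `backup` is also made up for the sketch.

```python
def backup(node, player, num_players):
    """Return the utility tuple backed up to `node`.

    Leaves are tuples of utilities (one entry per player); internal nodes are
    lists of children. The player to move picks the child whose backed-up
    tuple has the largest entry for that player.
    """
    if isinstance(node, tuple):
        return node
    next_player = (player + 1) % num_players
    child_values = [backup(child, next_player, num_players) for child in node]
    # Ties are broken in favor of the first child (behavior of Python's max).
    return max(child_values, key=lambda utilities: utilities[player])

# Assumed shape: root (player 0) -> player 1 -> player 2 -> leaves.
tree = [
    [[(1, 2, 6), (4, 3, 2)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]

print(backup(tree, 0, 3))
```

Under this assumed shape, player 0 ends up indifferent between the two root moves, since both backed-up tuples give it a utility of 1. No player is explicitly minimizing anyone else's score, which is how cooperation or competition can emerge from the utilities alone.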
