Monte Carlo Tree Search 2-15-16
Reading Quiz

What is the relationship between Monte Carlo tree search and upper confidence bound applied to trees?
a) MCTS is a type of UCB
b) UCB is a type of MCTS
c) both (they are the same algorithm)
d) neither (they are different algorithms)
Consider hex on an NxN board.
    branching factor ≤ N^2
    2N ≤ depth ≤ N^2

board size | max branching factor | min depth | tree size | depth of 10^10-node tree
6x6        |  36                  | 12        | > 10^17   | 7
8x8        |  64                  | 16        | > 10^28   | 6
11x11      | 121                  | 22        | > 10^44   | 5
19x19      | 361                  | 38        | > 10^96   | 4
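The table's numbers are easy to sanity-check. A quick script (mine, not from the slides) reproduces the bounds, lower-bounding the tree size by letting the branching factor shrink by one per ply, since each Hex move fills one empty cell:

    import math

    for n in (6, 8, 11, 19):
        cells = n * n                     # max branching factor
        min_depth = 2 * n                 # per the slide's bound: 2N <= depth
        log_size = sum(math.log10(cells - i) for i in range(min_depth))
        # depth at which a uniform tree of branching N^2 reaches 10^10 nodes
        depth_1e10 = math.ceil(10 / math.log10(cells))
        print(f"{n}x{n}: tree size > 10^{int(log_size)}, "
              f"10^10-node depth = {depth_1e10}")

Even the smallest board is far beyond exhaustive search, which is the point of the slide.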
Heuristics are hard.

Think about your board evaluation heuristics for Hex.
● Lots of human effort goes into designing a good heuristic.
● That effort isn't transferable to other domains.
Monte Carlo simulations

Idea: evaluate states by playing out random games.

function MC_BoardEval(state):
    wins = 0
    losses = 0
    for i = 1:NUM_SAMPLES:
        next_state = state
        while non_terminal(next_state):
            next_state = random_legal_move(next_state)
        if next_state.winner == state.turn:
            wins++
        else:
            losses++   # needs slight modification if draws possible
    return (wins - losses) / (wins + losses)
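Here is a runnable Python version of the same evaluator. A sketch under assumptions: the slides give no game API, so I substitute a tiny Nim-like game where players alternately take 1-3 stones and taking the last stone wins.

    import random

    # Toy game (my substitute for a real Hex interface).
    # state = (stones_left, player_to_move)

    def legal_moves(state):
        return [m for m in (1, 2, 3) if m <= state[0]]

    def apply_move(state, move):
        return (state[0] - move, 1 - state[1])

    def is_terminal(state):
        return state[0] == 0

    def winner(state):
        return 1 - state[1]   # the player who just took the last stone

    def mc_board_eval(state, num_samples=1000):
        """Estimate the state's value for the player to move via random playouts."""
        to_move = state[1]
        wins = losses = 0
        for _ in range(num_samples):
            s = state
            while not is_terminal(s):
                s = apply_move(s, random.choice(legal_moves(s)))
            if winner(s) == to_move:
                wins += 1
            else:
                losses += 1   # a draw branch would be needed in games with draws
        return (wins - losses) / (wins + losses)

    print(mc_board_eval((10, 0)))   # in [-1, 1]; positive is good for player 0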
Monte Carlo board evaluation

Advantages:
● simple
● domain independent
● anytime

Disadvantages:
● slow
● nondeterministic
● not great for alpha-beta pruning
Improving MC_BoardEval

Consider one level up. Suppose we're doing minimax search with a depth limit of 4 and using MC_BoardEval as our heuristic. What's happening at depth 3? At a depth-3 node, minimax picks the best move for the opponent, while MC_BoardEval plays out NUM_SAMPLES random games from each of that node's children. Many of those samples are spent evaluating moves the opponent would never choose.

Objective: allocate samples more effectively.
Multi-armed bandit problem

Given a row of slot machines ("bandits") with different, unknown probabilities of paying out a jackpot, use a fixed number of quarters to win as many jackpots as possible.
Upper confidence bound (UCB)

Pick each node with probability proportional to:

    value + C * sqrt(ln(parent.visits) / visits)

where value is the node's value estimate, visits is its visit count, parent.visits is its parent's visit count, and C is a tunable parameter.

● probability is decreasing in the number of visits (explore)
● probability is increasing in a node's value (exploit)
● always tries every option once
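To see the explore/exploit tradeoff concretely, here is a small self-contained bandit simulation. The payout probabilities are invented, and it uses the standard UCB1 rule of pulling the arm with the maximum score, rather than sampling proportionally as the slide suggests:

    import math, random

    p_true = [0.2, 0.5, 0.6]      # hidden payout probabilities (made up)
    value = [0.0, 0.0, 0.0]       # running average reward per arm
    visits = [0, 0, 0]
    C = 1.4                       # tunable exploration constant

    def ucb(i, t):
        if visits[i] == 0:
            return float("inf")   # ensures every arm is tried at least once
        return value[i] + C * math.sqrt(math.log(t) / visits[i])

    total = 0
    for t in range(1, 1001):
        arm = max(range(3), key=lambda i: ucb(i, t))
        reward = 1 if random.random() < p_true[arm] else 0
        visits[arm] += 1
        value[arm] += (reward - value[arm]) / visits[arm]   # incremental mean
        total += reward

    print(visits, total)   # most pulls should go to arm 2, the best one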
Why do this at only one level?

Extend to deeper levels?
+ more value out of every random playout
- more information to keep track of (how can we alleviate this?)

Extend to shallower levels?
+ guide the search to explore better paths first
- lose the optimality of minimax (is this a big deal?)
- never completely prune branches (is this a big deal?)
The Monte Carlo tree search algorithm
Selection
● Used for nodes we've seen before.
● Pick according to UCB.

Expansion
● Used when we reach the frontier.
● Add one node per playout.
Simulation
● Used beyond the search frontier.
● Don't bother with UCB, just play randomly.

Backpropagation
● After reaching a terminal node.
● Update value and visits for states expanded in selection and expansion.
Basic MCTS pseudocode

function MCTS_sample(state):
    state.visits++
    if all children of state expanded:
        next_state = UCB_sample(state)
        winner = MCTS_sample(next_state)
    else:
        if some children of state expanded:
            next_state = expand(random unexpanded child of state)
        else:
            next_state = state
        winner = random_playout(next_state)
    update_value(state, winner)
    return winner   # so callers higher up the tree can update too
MCTS helper functions

function UCB_sample(state):
    weights = []
    for child of state:
        w = child.value + C * sqrt(ln(state.visits) / child.visits)
        weights.append(w)
    distribution = [w / sum(weights) for w in weights]
    return a child sampled according to distribution

function random_playout(state):
    if is_terminal(state):
        return winner(state)
    else:
        return random_playout(random_move(state))
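One detail the pseudocode glosses over: the raw UCB weights can be negative, so dividing by their sum does not generally give a valid probability distribution. A softmax is one fix (my choice; the slides leave the normalization unspecified):

    import math, random

    def sample_proportional(children, weights):
        # Exponentiating maps arbitrary real UCB weights to positive values;
        # random.choices then normalizes them into a distribution.
        exps = [math.exp(w) for w in weights]
        return random.choices(children, weights=exps)[0]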
MCTS helper functions

function expand(state):
    state.visits = 1
    state.value = 0
    return state   # hand back the newly initialized node

function update_value(state, winner):
    # Depends on the application. The following would work for hex.
    if winner == state.turn:
        state.value += 1
    else:
        state.value -= 1
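Putting the pieces together, here is a compact runnable translation of the pseudocode, again on the Nim-like toy game. Two deliberate deviations from the slides: selection takes the argmax of the UCB score (standard UCT) instead of sampling, and the value update credits the player who just moved into a state, so a parent maximizing over its children favors its own wins.

    import math, random

    C = 1.4   # exploration constant (tunable)

    # --- toy Nim-like game, as in the earlier sketch ---
    def legal_moves(state):            # state = (stones_left, player_to_move)
        return [m for m in (1, 2, 3) if m <= state[0]]

    def apply_move(state, m):
        return (state[0] - m, 1 - state[1])

    def is_terminal(state):
        return state[0] == 0

    def winner(state):
        return 1 - state[1]            # whoever took the last stone

    # --- MCTS proper ---
    class Node:
        def __init__(self, state):
            self.state = state
            self.visits = 0
            self.value = 0
            self.children = None       # created on first visit

    def update_value(node, w):
        # Credit the player who just moved into this state (sign flipped
        # from the slides, so argmax at the parent favors its own wins).
        node.value += 1 if w == 1 - node.state[1] else -1

    def ucb_select(node):
        # argmax of average value plus exploration bonus (standard UCT)
        def score(c):
            return (c.value / c.visits
                    + C * math.sqrt(math.log(node.visits) / c.visits))
        return max(node.children, key=score)

    def random_playout(state):
        while not is_terminal(state):
            state = apply_move(state, random.choice(legal_moves(state)))
        return winner(state)

    def mcts_sample(node):
        """One playout: selection, expansion, simulation, backpropagation."""
        node.visits += 1
        if is_terminal(node.state):
            w = winner(node.state)
        else:
            if node.children is None:
                node.children = [Node(apply_move(node.state, m))
                                 for m in legal_moves(node.state)]
            unexpanded = [c for c in node.children if c.visits == 0]
            if unexpanded:             # expansion, then simulation
                child = random.choice(unexpanded)
                child.visits = 1
                w = random_playout(child.state)
                update_value(child, w)
            else:                      # selection: descend via UCB
                w = mcts_sample(ucb_select(node))
        update_value(node, w)          # backpropagate as the recursion unwinds
        return w

    root = Node((10, 0))               # 10 stones, player 0 to move
    for _ in range(2000):
        mcts_sample(root)
    best = max(root.children, key=lambda c: c.visits)
    print("best move leaves", best.state[0], "stones")

In Nim with moves of 1-3 stones, the losing positions are the multiples of 4, so from 10 stones the most-visited root move should converge to the one leaving 8.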
Note: reading assignments
● Wednesday's reading has been updated to include sections 3.2-3.3.
● Friday's reading has been updated to include miscellaneous short sections.
● Next week's reading may change. I'll send out an email if it does.