Mastering the game of Go with deep neural networks and tree search

Nov 06, 2023

Mastering the game of Go with deep neural networks and tree search (Nature, January 2016)
Roadmap: What is this paper about?
• Deep learning
• Search problem
• How to explore a huge tree (graph)
AlphaGo Video https://www.youtube.com/watch?v=53YLZBSS0cc https://www.youtube.com/watch?v=g-dKXOlsf98
AlphaGo vs. European Champion (Fan Hui, 2-dan) • October 5–9, 2015 <Official match> • Time limit: 1 hour • AlphaGo wins (5:0)
AlphaGo vs. World Champion (Lee Sedol, 9-dan) • March 9–15, 2016 <Official match> • Time limit: 2 hours • Venue: Seoul, Four Seasons Hotel
Lee Sedol (wiki) • Photo source: Maeil Economics, 2013/04
Lee Sedol
[Figure labels: = multiple machines; European champion]
The Game
Go Elo Ranking http://www.goratings.org/history/
Lee Sedol VS Ke Jie
How about Other Games?
Tic Tac Toe
Chess
Chess (1996)
Deep Blue (1996)
Is AlphaGo Skynet?
Go Game
Simple Rules
High Complexity
High Complexity
Different Games
Search Problem (the search space)
Tic Tac Toe
Tic Tac Toe
The “Tree” in Tic Tac Toe
The “Tree” of Chess
The “Tree” of the Go Game
Search Problem (how to search)
MiniMax in Tic Tac Toe
Adversarial Search – MiniMax [Figure: game tree with node values 1, 0, and -1]
Adversarial Search – MiniMax (continued) [Figure: larger game tree with node values 1, 0, and -1]
What is the problem?
1. Generate the search tree
2. Use minimax search
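The two steps above (generate the search tree, then run minimax over it) can be sketched for Tic Tac Toe in a few lines. This is only an illustrative sketch, not code from the slides: the board encoding (a 9-character string with "." for empty cells) and the names `winner`, `minimax`, and `best_move` are assumptions made for this example. The scores 1, 0, and -1 match the node values shown in the minimax figures.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8 (row-major).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` from X's point of view: 1 = X wins, 0 = draw, -1 = O wins.

    `player` is the side to move. The recursion implicitly generates the
    search tree; X maximizes the value and O minimizes it.
    """
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0  # board full, nobody won
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."]
    return max(values) if player == "X" else min(values)

def best_move(board, player):
    """Pick the empty cell whose resulting position has the best minimax value."""
    other = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == "."]

    def score(i):
        return minimax(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == "X" else min(moves, key=score)

if __name__ == "__main__":
    print(minimax("." * 9, "X"))        # value of the empty board
    print(best_move("XX.OO....", "X"))  # X can complete the top row
```

Exhaustive search like this is only feasible because the Tic Tac Toe tree is tiny; the same recursion is hopeless for chess or Go without pruning or sampling.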
The Size of the Tree
• Tic Tac Toe: b = 9, d = 9
• Chess: b = 35, d = 80
• Go: b = 250, d = 150
b: number of legal moves per position; d: depth (game length)
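To get a feel for these numbers, the tree size can be approximated as b^d. The sketch below only evaluates that approximation for the three (b, d) pairs listed above; the function name `log10_tree_size` is made up for this example, and b^d is a rough estimate rather than an exact count of positions.

```python
import math

# (b, d) pairs taken from the slide: branching factor and game length.
GAMES = {"Tic Tac Toe": (9, 9), "Chess": (35, 80), "Go": (250, 150)}

def log10_tree_size(b, d):
    """Return log10(b**d), i.e. the order of magnitude of the b^d estimate."""
    return d * math.log10(b)

if __name__ == "__main__":
    for name, (b, d) in GAMES.items():
        print(f"{name}: b^d is about 10^{log10_tree_size(b, d):.1f}")
```

Working in logarithms keeps the output readable: 9^9 is 387,420,489, while the estimates for chess and Go exceed 10^123 and 10^359 respectively, far beyond what exhaustive minimax can visit.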
One Grain of Rice https://www.youtube.com/watch?v=byk3pA1GPgU
The “Space” of the Go Game
How about other games?
• Flappy Bird?
• Angry Birds?
• StarCraft?
• Learning a language
• Writing a paper
• Getting an MS/PhD degree
• Finding a job
• Life
For comparison: Tic Tac Toe: b = 9, d = 9 • Chess: b = 35, d = 80 • Go: b = 250, d = 150
How to solve?
Chess (1996)
Monte Carlo
Las Vegas
Monte Carlo Tree Search [Figure: tree search compared with Monte Carlo search]
Monte Carlo Tree Search
• Tree search + Monte Carlo method
– Selection
– Expansion
– Simulation
– Back-propagation
[Figure: search tree whose nodes are labeled "white wins / total": 3/5, 2/3, 1/2, 1/1, 1/2, 1/1, 0/1, 1/1, 0/1]
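The four phases (selection, expansion, simulation, back-propagation) can be sketched on a toy game. Everything specific here is an assumption for illustration and does not come from the slides: the game is a Nim variant (take 1 to 3 stones, whoever takes the last stone wins), selection uses the common UCB1 rule with an exploration constant of 1.4, and simulations are uniformly random playouts. AlphaGo's own search replaces these pieces with neural networks, so this is not its algorithm. Each node keeps a wins / visits pair, like the "white wins / total" labels in the figure.

```python
import math
import random

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins

class Node:
    def __init__(self, pile, player, parent=None, move=None):
        self.pile = pile        # stones left
        self.player = player    # player to move here (0 or 1)
        self.parent = parent
        self.move = move        # move that led to this node
        self.children = []
        self.untried = [m for m in MOVES if m <= pile]
        self.wins = 0.0         # wins for the player who moved INTO this node
        self.visits = 0

def uct_child(node, c=1.4):
    """UCB1: trade off a child's win rate against how rarely it was tried."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(pile, player, rng):
    """Play uniformly random moves to the end and return the winner."""
    while True:
        pile -= rng.choice([m for m in MOVES if m <= pile])
        if pile == 0:
            return player       # this player took the last stone
        player = 1 - player

def mcts(pile, iterations=1000, seed=0):
    """Return the most-visited move for player 0; assumes pile >= 1."""
    rng = random.Random(seed)
    root = Node(pile, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.pile - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout (a terminal node is already decided).
        if node.pile == 0:
            winner = 1 - node.player
        else:
            winner = rollout(node.pile, node.player, rng)
        # 4. Back-propagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            if winner != node.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

if __name__ == "__main__":
    print(mcts(10, iterations=2000))
```

Returning the most-visited child rather than the one with the highest win rate is a common choice because visit counts are less noisy than win rates after few simulations.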
