Recent Advances in Reinforcement Learning (with a focus on AlphaGo and its successors)
Patrick Scholz, Division of Computer Assisted Medical Interventions, 01/29/2020
Taxonomic position of RL
Basics of RL
Markov Decision Process (S, A, P, R):
● S: states
● A: possible actions
● P: transition probabilities
● R: immediate reward
A policy maps states to actions; the goal is to maximize the cumulative (discounted) reward.
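A minimal sketch of these quantities, using a toy two-state, two-action MDP; all numbers are illustrative and not from the talk:

import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],          # P[s, a, s'] transition probabilities
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],                        # R[s, a] immediate reward
              [0.0, 2.0]])
policy = np.array([0, 1])                        # deterministic policy: one action per state

state, ret, discount = 0, 0.0, 1.0
for t in range(20):                              # cumulative discounted reward G = sum_t gamma^t * r_t
    action = policy[state]
    ret += discount * R[state, action]
    discount *= gamma
    state = rng.choice(n_states, p=P[state, action])
print(ret)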
Deep RL within the last years
Timeline 2015–2019: AlphaGo (2016), AlphaGo Zero (2017), AlphaZero (2018), MuZero (2019)
"Deep" learning and reinforcement learning
Mnih, V., Kavukcuoglu, K., Silver, D. et al. 'Human-level control through deep reinforcement learning'. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
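A minimal sketch of the one-step target used by deep Q-learning (Mnih et al., 2015): the online network is regressed towards r + gamma * max_a' Q_target(s', a'), bootstrapped from a periodically frozen copy. The tiny linear Q-function below is an illustrative stand-in for the paper's convolutional network:

import numpy as np

rng = np.random.default_rng(0)
n_state, n_action, gamma, alpha = 4, 2, 0.99, 0.01

W = rng.normal(size=(n_state, n_action))         # online Q-network (toy linear stand-in)
W_target = W.copy()                              # periodically synced target network

def q_values(weights, state):
    return state @ weights                       # Q(s, .) for all actions

# one (s, a, r, s', done) transition, here just random placeholder data
s, a, r, s_next, done = rng.normal(size=n_state), 1, 0.5, rng.normal(size=n_state), False

# TD target, bootstrapped from the frozen target network
y = r + (0.0 if done else gamma * q_values(W_target, s_next).max())
td_error = y - q_values(W, s)[a]

# semi-gradient update of the online weights for the taken action only
W[:, a] += alpha * td_error * s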
"Go" as the next holy grail
● Using expert moves for supervised learning
● Playing against earlier versions to generate data
Defeated Lee Sedol (world champion) 4:1 in a regular match (using 48 TPUs)
Silver, D., Huang, A., Maddison, C. et al. 'Mastering the game of Go with deep neural networks and tree search'. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
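A minimal sketch of the supervised-learning step AlphaGo starts from: a policy network trained with cross-entropy to predict recorded expert moves. The tiny linear softmax model and random "positions" are illustrative stand-ins for the paper's 19x19 board features and convolutional policy network:

import numpy as np

rng = np.random.default_rng(0)
n_features, n_moves, lr = 32, 361, 0.1           # 361 ~ one logit per board point

W = np.zeros((n_features, n_moves))              # toy linear policy network

def policy(x):
    logits = x @ W
    e = np.exp(logits - logits.max())
    return e / e.sum()                           # softmax over move logits

for _ in range(100):                             # loop over (position, expert move) pairs
    x = rng.normal(size=n_features)              # stand-in for board features
    expert_move = rng.integers(n_moves)          # stand-in for the recorded expert move
    p = policy(x)
    grad = p.copy()
    grad[expert_move] -= 1.0                     # d(cross-entropy)/d(logits)
    W -= lr * np.outer(x, grad)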
"Go" as the next holy grail
Monte Carlo Tree Search
Silver, D., Huang, A., Maddison, C. et al. 'Mastering the game of Go with deep neural networks and tree search'. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
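A minimal sketch of UCT-style Monte Carlo Tree Search: repeated select, expand, simulate, backpropagate. The toy single-player "game" (pick numbers, reward is their sum) is an illustrative stand-in; AlphaGo's MCTS additionally uses the policy and value networks to guide selection and evaluation:

import math, random

ACTIONS = [0, 1, 2]
HORIZON = 4

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = {}, 0, 0.0

def is_terminal(state):
    return len(state) == HORIZON

def rollout(state):
    # simulation: finish the episode with random moves, return total reward
    while not is_terminal(state):
        state = state + (random.choice(ACTIONS),)
    return sum(state)

def select_child(node, c=1.4):
    # UCT: exploit high mean value, explore rarely visited children
    return max(node.children.values(),
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(root_state, n_simulations=500):
    root = Node(root_state)
    for _ in range(n_simulations):
        node = root
        # 1. selection: walk down fully expanded nodes
        while not is_terminal(node.state) and len(node.children) == len(ACTIONS):
            node = select_child(node)
        # 2. expansion: add one untried child
        if not is_terminal(node.state):
            a = random.choice([a for a in ACTIONS if a not in node.children])
            node.children[a] = Node(node.state + (a,), parent=node)
            node = node.children[a]
        # 3. simulation
        reward = rollout(node.state)
        # 4. backpropagation
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # act: pick the most-visited move at the root
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

print(mcts(()))  # usually picks 2, the highest-reward first move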
Dropping initial human input
Major design changes:
● using the MCTS action distribution as the policy training target
● combining policy and value network into one
● switching to a ResNet architecture
● no hand-crafted input features any more
Defeated AlphaGo 100:0 after 72 h of training under the same conditions (using 4 TPUs)
Silver, D., Schrittwieser, J., Simonyan, K. et al. 'Mastering the game of Go without human knowledge'. Nature 550, 354–359 (2017). https://doi.org/10.1038/nature24270
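A minimal sketch of the resulting training target: the single network outputs a move distribution p and a value v, trained so that p matches the MCTS visit-count distribution pi and v matches the game outcome z, with loss (z - v)^2 - pi . log p (weight regularization omitted). The arrays below are illustrative stand-ins for real network outputs:

import numpy as np

n_moves = 362                                            # 361 board points + pass

pi = np.random.default_rng(0).dirichlet(np.ones(n_moves))  # MCTS visit distribution (target)
z = 1.0                                                  # game outcome from this player's view

p = np.full(n_moves, 1.0 / n_moves)                      # network policy head (stand-in)
v = 0.1                                                  # network value head (stand-in)

value_loss = (z - v) ** 2
policy_loss = -np.sum(pi * np.log(p + 1e-12))            # cross-entropy against MCTS targets
print(value_loss + policy_loss)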
Generalizing the input/output representation
Major design changes:
● including draws as a possible outcome
● no exploitation of board symmetries (data augmentation) any more
● continuously updating a single network instead of choosing a winner after each iteration
● always the same hyperparameters across games
Silver, D. et al. 'A General Reinforcement Learning Algorithm That Masters Chess, Shogi, and Go through Self-Play'. Science 362, 1140–1144 (2018).
Leaving perfect-information environments
MuZero plans with a learned model built from three functions:
● representation function h (observation → hidden state)
● dynamics function g (hidden state, action → next hidden state, reward)
● prediction function f (hidden state → policy, value)
[Figure: A: planning, B: acting, C: training]
Schrittwieser, J. et al. 'Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model'. arXiv:1911.08265 (2019). http://arxiv.org/abs/1911.08265
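A minimal sketch of how the three learned functions fit together during planning: h encodes the observation into a hidden state, g rolls that state forward for an imagined action (predicting a reward), and f predicts policy and value at each imagined state. The linear maps below are toy stand-ins for the paper's neural networks:

import numpy as np

rng = np.random.default_rng(0)
obs_dim, hidden_dim, n_actions = 8, 16, 4

W_h = rng.normal(size=(obs_dim, hidden_dim))                      # representation h
W_g = rng.normal(size=(hidden_dim + n_actions, hidden_dim + 1))   # dynamics g
W_f = rng.normal(size=(hidden_dim, n_actions + 1))                # prediction f

def h(obs):                       # observation -> hidden state
    return np.tanh(obs @ W_h)

def g(state, action):             # (hidden state, action) -> (next hidden state, reward)
    one_hot = np.eye(n_actions)[action]
    out = np.tanh(np.concatenate([state, one_hot]) @ W_g)
    return out[:-1], out[-1]

def f(state):                     # hidden state -> (policy logits, value)
    out = state @ W_f
    return out[:-1], out[-1]

# imagine a short action sequence without ever querying the real environment
state = h(rng.normal(size=obs_dim))
for action in [2, 0, 1]:
    state, reward = g(state, action)
    policy_logits, value = f(state)
    print(action, round(float(reward), 3), round(float(value), 3))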
Leaving perfect-information environments
MuZero learns the game dynamics on its own; no rules are given to the planner.
Compared against: Stockfish (chess), Elmo (shogi), AlphaZero (Go), R2D2 (Atari)
Schrittwieser, J. et al. 'Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model'. arXiv:1911.08265 (2019). http://arxiv.org/abs/1911.08265
Some other advances
● Hide and Seek: multiple agents in an open environment
● AlphaStar: StarCraft II

Approximate values:   Chess   Go      StarCraft II
breadth               35      250     10^26
depth                 80      150     1000s
Thank you for your attention!
Any questions?