Reinforcement Learning Environments Fully-observable vs - PowerPoint PPT Presentation

Sep 13, 2023


Reinforcement Learning
Environments
• Fully-observable vs partially-observable
• Single agent vs multiple agents
• Deterministic vs stochastic
• Episodic vs sequential
• Static or dynamic
• Discrete or continuous
What is reinforcement learning?
• Three machine learning paradigms:
  – Supervised learning
  – Unsupervised learning (overlaps with data mining)
  – Reinforcement learning
• In reinforcement learning, the agent receives incremental pieces of feedback, called rewards, that it uses to judge whether it is acting correctly or not.
Examples of real-life RL
• Learning to play chess.
• Animals learning to walk.
• Driving to school or work in the morning.
• Key idea: Most RL tasks are episodic, meaning they repeat many times.
  – So unlike in other AI problems where you have one shot to get it right, in RL, it's OK to take time to try different things to see what's best.
Episodes, exploration, and exploitation
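The slide gives only the title, so the following is a minimal sketch of one common way to trade off exploration and exploitation across many repeated episodes. The technique (epsilon-greedy), the function names `epsilon_greedy` and `run_episodes`, the three-action reward probabilities, and the value `epsilon=0.1` are all assumptions made for illustration and do not come from the slides.

```python
import random

def epsilon_greedy(estimates, epsilon, rng):
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest estimate (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda a: estimates[a])

def run_episodes(true_means, num_episodes=2000, epsilon=0.1, seed=0):
    """Repeat a one-step task many times, updating a running average
    of the reward observed for each action."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(num_episodes):
        action = epsilon_greedy(estimates, epsilon, rng)
        # Hypothetical environment: reward 1 with probability true_means[action].
        reward = 1.0 if rng.random() < true_means[action] else 0.0
        counts[action] += 1
        # Incremental mean: new = old + (sample - old) / n
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_episodes([0.2, 0.5, 0.8])
print(estimates, counts)
```

Because the task repeats, the agent can afford the occasional random choice: those exploratory episodes are what let it discover that a different action pays better than its current favorite.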
RL problems
• Every RL problem is structured similarly.
• We have an environment, which consists of a set of states, and actions that can be taken in various states.
  – Environment is often stochastic (there is an element of chance).
• Our RL agent wishes to learn a policy, π, a function that maps states to actions.
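To make these pieces concrete, here is a small sketch of a stochastic environment and a policy. The two-state world, the names `STATES`, `ACTIONS`, `step`, and `slip_prob`, and the particular policy are hypothetical and chosen only to illustrate the structure described above.

```python
import random

# Hypothetical two-state environment, invented for illustration.
STATES = ["left", "right"]
ACTIONS = ["stay", "move"]

def step(state, action, rng, slip_prob=0.1):
    """Return the next state. With probability slip_prob the chosen
    action is swapped for the other one (the element of chance)."""
    if rng.random() < slip_prob:
        action = "stay" if action == "move" else "move"
    if action == "move":
        return "right" if state == "left" else "left"
    return state

# A policy maps each state to an action; a dict is the simplest form.
policy = {"left": "move", "right": "stay"}

rng = random.Random(0)
state = "left"
for _ in range(5):
    action = policy[state]          # the agent consults its policy
    state = step(state, action, rng)  # the environment responds
    print(action, "->", state)
```

A lookup table works here because the state set is tiny; for larger problems the policy would typically be some other kind of function from states to actions.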
What is the goal in RL?
• In other AI problems, the "goal" is to get to a certain state. Not in RL!
• An RL environment gives feedback every time the agent takes an action. This is called a reward.
  – Rewards are usually numbers.
  – Goal: Agent wants to maximize the amount of reward it gets over time.
  – Critical point: Rewards are given by the environment, not the agent.
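The critical point above can be sketched as an interaction loop in which only the environment computes the reward and the agent merely accumulates it. The class `CoinFlipEnv`, the helper `run`, and the biased-coin task are hypothetical and serve only as an illustration.

```python
import random

class CoinFlipEnv:
    """Hypothetical environment: the agent guesses the outcome of a
    biased coin and is rewarded when the guess is right."""

    def __init__(self, p_heads=0.7, seed=0):
        self.p_heads = p_heads
        self.rng = random.Random(seed)

    def step(self, action):
        outcome = "heads" if self.rng.random() < self.p_heads else "tails"
        # The reward is decided here, inside the environment.
        return 1.0 if action == outcome else 0.0

def run(env, choose_action, num_steps):
    """Let the agent act num_steps times and return the total reward."""
    total = 0.0
    for _ in range(num_steps):
        action = choose_action()
        reward = env.step(action)  # numeric feedback after every action
        total += reward
    return total

def always_heads():
    return "heads"

total = run(CoinFlipEnv(), always_heads, 100)
print(total)
```

Note that the agent never sees `p_heads`; it only observes the rewards, which is why it has to learn from experience which action is better.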
Mathematics of rewards
• Assume our rewards are r_0, r_1, r_2, …
• What expression represents our total rewards?
• How do we maximize this? Is this a good idea?
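The slide leaves these questions open. One standard way to answer them, written out as a sketch: the symbols G, G_γ, γ, and r_max below are notation assumed here, not taken from the slides. The plain total is an infinite sum, which can grow without bound when the task never ends, so a discount factor is commonly introduced to keep it finite.

```latex
% Total (undiscounted) reward: may diverge if the rewards never stop.
G = r_0 + r_1 + r_2 + \dots = \sum_{t=0}^{\infty} r_t

% Discounted total with an assumed discount factor 0 \le \gamma < 1.
G_\gamma = r_0 + \gamma r_1 + \gamma^2 r_2 + \dots = \sum_{t=0}^{\infty} \gamma^t r_t

% If |r_t| \le r_{\max} for all t, the discounted sum is bounded:
|G_\gamma| \le \sum_{t=0}^{\infty} \gamma^t r_{\max} = \frac{r_{\max}}{1 - \gamma}
```

For example, with a constant reward of 1 at every step the undiscounted sum is infinite, while with γ = 0.9 the discounted sum is 1 / (1 − 0.9) = 10.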
