N-step bootstrapping
Robert Platt, Northeastern University
"I'd love to use my experiences more efficiently..."
Motivation
Left: path taken by the agent in a grid world; reward is zero everywhere except at the goal state, where it is positive.
Middle: 1-step SARSA updates only the penultimate state-action pair.
Problem: standard Q-learning/SARSA propagates reward information only one step backward per update.
– n-step bootstrapping is one way to address this problem
– we will see other ways in subsequent slide decks
TD and MC are two extremes of a continuum
What are these? TD(0) updates from a single transition; MC waits for the full return at the end of the episode. The n-step methods fill in the continuum between them.

TD(0) update equation:
$V(S_t) \leftarrow V(S_t) + \alpha \left[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \right]$
The quantity $R_{t+1} + \gamma V(S_{t+1})$ is called the target of the update.

What's the target for MC? The full return:
$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots + \gamma^{T-t-1} R_T$

What's the target for the n-step case? The n-step return, which truncates after n rewards and bootstraps from the current value estimate:
$G_{t:t+n} = R_{t+1} + \gamma R_{t+2} + \dots + \gamma^{n-1} R_{t+n} + \gamma^n V(S_{t+n})$
Complete update equation:
$V(S_t) \leftarrow V(S_t) + \alpha \left[ G_{t:t+n} - V(S_t) \right]$

Notice that you can't do an n-step update until time step t+n (e.g., a 3-step update can't happen until t+3):
– the TD update happens on the next time step
– the MC update happens at the end of the episode
– the n-step TD update happens on time step t+n
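To make the n-step target concrete, here is a minimal Python sketch (the helper name and argument layout are my own, not from the slides) that computes $G_{t:t+n}$ from a recorded trajectory:

```python
def n_step_return(rewards, v_along_traj, t, n, gamma, T):
    """Compute the n-step return G_{t:t+n}.

    rewards[k]      -- R_k, the reward received entering step k (rewards[0] unused)
    v_along_traj[k] -- current estimate V(S_k) at step k of the trajectory
    T               -- terminal time step of the episode
    """
    G = 0.0
    # Sum the discounted rewards R_{t+1} ... R_{min(t+n, T)}
    for k in range(t + 1, min(t + n, T) + 1):
        G += gamma ** (k - t - 1) * rewards[k]
    # Bootstrap from V(S_{t+n}) only if the episode hasn't ended by then
    if t + n < T:
        G += gamma ** n * v_along_traj[t + n]
    return G
```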
How well does this work? This comparison is for:
– a 19-state random walk
– n-step TD policy evaluation
(The figure plots RMS error against the step size α for several values of n; intermediate values of n perform best.)
n-step TD algorithm
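A minimal Python sketch of n-step TD policy evaluation, following the standard pseudocode from Sutton & Barto chapter 7. The environment interface (`env.reset()` returning an integer state, `env.step(a)` returning `(state, reward, done)`) and the function names are assumptions of this sketch, not part of the original slides:

```python
import numpy as np

def n_step_td(env, policy, num_states, n=4, alpha=0.1, gamma=1.0,
              num_episodes=500):
    """n-step TD policy evaluation (sketch after Sutton & Barto, ch. 7).

    Assumes integer-encoded states and a simple env interface:
    reset() -> state, step(a) -> (state, reward, done).
    """
    V = np.zeros(num_states)
    for _ in range(num_episodes):
        states, rewards = [env.reset()], [0.0]   # rewards[k] pairs with R_k
        T = float('inf')                         # episode length, unknown yet
        t = 0
        while True:
            if t < T:
                s, r, done = env.step(policy(states[t]))
                states.append(s)
                rewards.append(r)
                if done:
                    T = t + 1
            tau = t - n + 1                      # time whose estimate updates
            if tau >= 0:
                # n-step return: discounted rewards plus bootstrapped value
                G = sum(gamma ** (k - tau - 1) * rewards[k]
                        for k in range(tau + 1, min(tau + n, T) + 1))
                if tau + n < T:
                    G += gamma ** n * V[states[tau + n]]
                V[states[tau]] += alpha * (G - V[states[tau]])
            if tau == T - 1:
                break
            t += 1
    return V
```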
n-step SARSA
Same idea as in n-step TD:
– how is this backup diagram different from that of n-step TD?
– why is it different?
– why does the backup diagram start with a dot rather than a circle? (In these diagrams a dot is an action node and a circle is a state node: the SARSA backup starts and ends with state-action pairs because it updates Q(s, a) rather than V(s).)
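In code terms, the difference shows up in the bootstrap term: n-step SARSA bootstraps from Q(S_{t+n}, A_{t+n}) rather than V(S_{t+n}), so actions must be stored alongside states. A minimal sketch of a single update (names are mine, not from the slides):

```python
def n_step_sarsa_update(Q, states, actions, rewards, tau, n, T,
                        alpha, gamma):
    """One n-step SARSA update for the state-action pair at time tau.

    Q is indexed as Q[state][action]; states/actions/rewards hold the
    trajectory so far, with rewards[k] = R_k (rewards[0] unused).
    """
    # n-step return: discounted rewards, then bootstrap from Q, not V
    G = sum(gamma ** (k - tau - 1) * rewards[k]
            for k in range(tau + 1, min(tau + n, T) + 1))
    if tau + n < T:
        G += gamma ** n * Q[states[tau + n]][actions[tau + n]]
    Q[states[tau]][actions[tau]] += alpha * (G - Q[states[tau]][actions[tau]])
```

The surrounding control loop matches the n-step TD sketch above, except that actions are chosen (e.g., ε-greedily) from Q and recorded so the bootstrap term can use them.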
n-step SARSA
Left: path taken by the agent in a grid world; reward is zero everywhere except at the goal state, where it is positive.
Middle: 1-step SARSA updates only the penultimate state-action pair.
Right: 10-step SARSA updates the last 10 state-action pairs.