Department of Computer Science
CSCI 5622: Machine Learning
Chenhao Tan
Lecture 21: Reinforcement learning I
Slides adapted from Jordan Boyd-Graber, Chris Ketelsen
Administrivia
• Poster printing
• Email your poster to inkspot.umc@colorado.edu with subject "Tan Poster Project" by Thursday noon
• Poster size: A1
• Check Piazza for details
• Light refreshments will be provided; invite your friends
• Poster session: DLC 1B70 on Dec 13
Learning objectives
• Understand the formulation of reinforcement learning
• Understand the definition of a policy and the optimal policy
• Learn about value iteration
• Most of these two lectures are based on Richard S. Sutton and Andrew G. Barto's book
Supervised learning: data X, labels Y
Unsupervised learning: data X, latent structure Z
An agent learns to behave in an environment
Reinforcement learning examples
• Mnih et al. 2013
• https://www.youtube.com/watch?v=V1eYniJ0Rnk
Reinforcement learning
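Following Sutton and Barto's formulation of the agent-environment loop:
• At each time step t, the agent observes the state s_t and selects an action a_t
• The environment responds with a reward r_{t+1} and the next state s_{t+1}
• The agent's goal is to choose actions that maximize cumulative reward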
Markov decision processes
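A Markov decision process (MDP) is specified by:
• A set of states S
• A set of actions A
• Transition probabilities P(s' | s, a) = Pr(S_{t+1} = s' | S_t = s, A_t = a)
• A reward function R(s, a, s')
• A discount factor γ ∈ [0, 1)
The Markov property: the next state and reward depend only on the current state and action, not on the rest of the history.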
A few examples
• Grid world
A few examples
• Atari games (bonus: try a Google image search for "atari breakout")
A few examples
• Go
Goal
• Episodic tasks: end at a terminal state, e.g., one play of a game
• Continuing tasks: interaction goes on without limit, with an infinite number of steps
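In both cases the objective is to maximize the expected (discounted) return
G_t = R_{t+1} + γ R_{t+2} + γ^2 R_{t+3} + ... = Σ_{k=0}^∞ γ^k R_{t+k+1}
with discount factor γ ∈ [0, 1); for episodic tasks the sum stops at the terminal step T.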
Policy
• The agent's rule for selecting actions
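Formally, a policy π maps states to actions: deterministic, a = π(s), or stochastic, π(a | s) = Pr(A_t = a | S_t = s).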
Value function
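The state-value function of a policy π is the expected return from following π starting in state s:
V^π(s) = E_π[ G_t | S_t = s ] = E_π[ Σ_{k=0}^∞ γ^k R_{t+k+1} | S_t = s ]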
Action-value function (Q-function)
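The action-value function is the expected return from taking action a in state s and following π thereafter:
Q^π(s, a) = E_π[ G_t | S_t = s, A_t = a ]
The two are related by V^π(s) = Σ_a π(a | s) Q^π(s, a).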
Optimal policy and optimal value function
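• A policy π* is optimal if V^{π*}(s) ≥ V^π(s) for every state s and every policy π
• The optimal value functions satisfy the Bellman optimality equations:
  V*(s) = max_a Σ_{s'} P(s' | s, a) [ R(s, a, s') + γ V*(s') ]
  Q*(s, a) = Σ_{s'} P(s' | s, a) [ R(s, a, s') + γ max_{a'} Q*(s', a') ]
• Given Q*, acting greedily is optimal: π*(s) = argmax_a Q*(s, a)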
A concrete grid example
• Grid world
A concrete grid example
• Rewards can be positive or negative
• Delayed reward: you might not receive any reward until you reach the goal
• You might receive negative reward at every step until you reach the goal
A concrete grid example
Take-away: the optimal policy depends heavily on the details of the reward function.
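For instance, in the classic 4×3 grid world of Russell and Norvig (not necessarily the grid in these slides), a small step cost such as -0.04 yields a cautious policy that takes the long route around the -1 pit, while a large step cost such as -2 makes the agent head straight for the nearest exit, even the bad one.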
Value Iteration
Punchline: discounting makes the infinite-horizon value function finite, which lets us actually compare the values of different action sequences.
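Concretely, if every reward is bounded, |R_t| ≤ R_max, then for γ ∈ [0, 1) the return is bounded by a geometric series:
|G_t| ≤ Σ_{k=0}^∞ γ^k R_max = R_max / (1 - γ) < ∞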
Value Iteration
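Value iteration turns the Bellman optimality equation into an iterative update, starting from an arbitrary V_0 and sweeping over all states until the values stop changing:
V_{k+1}(s) = max_a Σ_{s'} P(s' | s, a) [ R(s, a, s') + γ V_k(s') ]
The iteration converges to V* because the update is a γ-contraction. A minimal Python sketch on a deterministic grid world follows; the grid layout, step cost, discount, and stopping threshold are illustrative assumptions, not values from the slides.

import numpy as np

# Illustrative assumptions (not from the slides): a 3x4 deterministic
# grid world with a +1 terminal goal, a -1 terminal pit, one blocked
# cell, a small per-step cost, and discount gamma = 0.9.
ROWS, COLS = 3, 4
GOAL, PIT, WALL = (0, 3), (1, 3), (1, 1)
STEP_COST, GAMMA, THRESHOLD = -0.04, 0.9, 1e-6
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(state, action):
    # Deterministic transition: move one cell if legal, otherwise stay put.
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < ROWS and 0 <= nc < COLS and (nr, nc) != WALL:
        return (nr, nc)
    return state

def value_iteration():
    # Terminal states hold their fixed values; all others start at zero.
    V = np.zeros((ROWS, COLS))
    V[GOAL], V[PIT] = 1.0, -1.0
    while True:
        delta = 0.0
        for r in range(ROWS):
            for c in range(COLS):
                s = (r, c)
                if s in (GOAL, PIT, WALL):
                    continue
                # Bellman optimality backup: best one-step lookahead.
                best = max(STEP_COST + GAMMA * V[step(s, a)] for a in ACTIONS)
                delta = max(delta, abs(best - V[s]))
                V[s] = best
        if delta < THRESHOLD:  # values have converged
            return V

def greedy_policy(V):
    # Extract the optimal policy by acting greedily with respect to V.
    return {
        (r, c): max(ACTIONS, key=lambda a: STEP_COST + GAMMA * V[step((r, c), a)])
        for r in range(ROWS) for c in range(COLS)
        if (r, c) not in (GOAL, PIT, WALL)
    }

V = value_iteration()
print(np.round(V, 3))
print(greedy_policy(V))

Once V has converged, the optimal policy is read off by acting greedily with respect to V, as greedy_policy does above.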