  1. Dynamic Programming
  Prof. Kuan-Ting Lai
  2020/4/10

  2. Dynamic Programming
  • Dynamic Programming is for problems with two properties:
    1. Optimal substructure
    • The optimal solution can be decomposed into subproblems
    2. Overlapping subproblems
    • Subproblems recur many times
    • Solutions can be cached and reused
  • Examples:
    − Shortest path, Tower of Hanoi, …
    − Markov Decision Processes
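To make the two properties concrete, here is a minimal, illustrative sketch (not from the slides): a memoized Fibonacci, where the recursion shows optimal substructure and the cache exploits the overlapping subproblems.

```python
# Illustrative sketch only: memoized Fibonacci showing both DP properties.
from functools import lru_cache

@lru_cache(maxsize=None)   # overlapping subproblems: cache and reuse solutions
def fib(n: int) -> int:
    if n < 2:              # base cases
        return n
    # optimal substructure: fib(n) decomposes into smaller subproblems,
    # and those subproblems recur (fib(n-1) itself needs fib(n-2))
    return fib(n - 1) + fib(n - 2)

print(fib(50))             # 12586269025, computed in linear time thanks to the cache
```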

  3. Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning series), p. 189

  4. Dynamic Programming for MDPs
  • The Bellman equation gives a recursive decomposition
  • The value function stores and reuses solutions
  • Dynamic programming assumes full knowledge of the MDP
  • Used for model-based planning

  5. Policy Evaluation (Prediction)
  • Calculate the state-value function $v_\pi$ for an arbitrary policy $\pi$
  • Can be solved iteratively:
    $v_{k+1}(s) \leftarrow \mathbb{E}_\pi[R_{t+1} + \gamma\, v_k(S_{t+1}) \mid S_t = s]$
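A minimal sketch of this iterative update, assuming a tabular MDP stored as P[s][a] = [(prob, next_state, reward), ...] and a stochastic policy given as action probabilities per state; this representation and all names are illustrative, not from the slides.

```python
import numpy as np

def policy_evaluation(P, policy, gamma=1.0, theta=1e-8):
    """Sweep v(s) <- sum_a pi(a|s) sum_{s'} p * (r + gamma * v(s')) until stable."""
    v = np.zeros(len(P))
    while True:
        delta = 0.0
        for s in range(len(P)):
            v_new = sum(pi_a * sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a])
                        for a, pi_a in policy[s].items())
            delta = max(delta, abs(v_new - v[s]))
            v[s] = v_new
        if delta < theta:        # stop once the largest update is negligible
            return v
```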

  6. Policy Evaluation in a Small Grid World
  • One terminal state (shown twice, as the shaded squares)
  • Actions leading out of the grid leave the state unchanged
  • Reward is -1 until the terminal state is reached
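Putting the update to work on this grid world, a sketch under the slide's assumptions (4x4 grid, uniform random policy, undiscounted, reward -1 per step); the code itself is illustrative.

```python
import numpy as np

N = 4                                    # 4x4 grid; states 0 and 15 are terminal
TERMINAL = {0, N * N - 1}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(s, a):
    r, c = divmod(s, N)
    dr, dc = a
    r2, c2 = r + dr, c + dc
    if not (0 <= r2 < N and 0 <= c2 < N):      # moving off the grid leaves the state unchanged
        r2, c2 = r, c
    return r2 * N + c2, -1.0                   # reward is -1 on every transition

v = np.zeros(N * N)
for _ in range(1000):                          # sweep until (approximately) converged
    v_new = np.zeros_like(v)
    for s in range(N * N):
        if s in TERMINAL:
            continue                           # terminal value stays 0
        # uniform random policy: each action with probability 1/4, undiscounted
        v_new[s] = sum(0.25 * (r + v[s2]) for s2, r in (step(s, a) for a in ACTIONS))
    v = v_new
print(v.reshape(N, N).round(1))                # first row matches the classic result: 0, -14, -20, -22
```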

  7. How to Improve a Policy
  1. Evaluate the policy:
     $v_\pi(s) = \mathbb{E}_\pi[R_{t+1} + \gamma R_{t+2} + \cdots \mid S_t = s]$
  2. Improve the policy by acting greedily with respect to $v_\pi$:
     $\pi' = \mathrm{greedy}(v_\pi)$
  • This process of policy iteration always converges to the optimal policy $\pi^*$
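A sketch of step 2, greedy improvement, reusing the illustrative P[s][a] = [(prob, next_state, reward), ...] format assumed above; the function name greedy mirrors the slide's notation, but the implementation is a sketch.

```python
def greedy(P, v, gamma=1.0):
    """pi'(s) = argmax_a q(s, a), with q(s, a) = sum_{s'} p * (r + gamma * v(s'))."""
    policy = {}
    for s in range(len(P)):
        q = {a: sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a]) for a in P[s]}
        policy[s] = max(q, key=q.get)   # act greedily with respect to v
    return policy
```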

  8. Policy Iteration
  • Policy evaluation: estimate $v_\pi$
  • Policy improvement: generate $\pi' \geq \pi$
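Combining the two steps gives the full loop. This sketch reuses the policy_evaluation and greedy functions from the sketches above and stops when the policy no longer changes, at which point it is optimal.

```python
def policy_iteration(P, gamma=1.0):
    # start from an arbitrary deterministic policy: the first action in each state
    policy = {s: next(iter(P[s])) for s in range(len(P))}
    while True:
        probs = {s: {a: 1.0} for s, a in policy.items()}   # deterministic -> probabilities
        v = policy_evaluation(P, probs, gamma)             # evaluate pi
        improved = greedy(P, v, gamma)                     # improve greedily w.r.t. v_pi
        if improved == policy:                             # stable policy => pi*, v*
            return policy, v
        policy = improved
```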

  9. Jack’s Car Rental

  10. Policy Improvement (1)

  11. Policy Improvement (2)

  12. Modified Policy Iteration
  • Do we need to evaluate iteratively until $v_\pi$ converges?
  • Can we simply stop after k iterations?
    − Example: the small grid world reaches the optimal policy after k = 3 iterations
  • Update the policy every iteration? ⇒ Value Iteration
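Under the assumption that evaluation is simply cut off after k sweeps, the only change to the earlier policy_evaluation sketch is the stopping rule:

```python
def truncated_evaluation(P, policy, gamma=1.0, k=3):
    v = np.zeros(len(P))
    for _ in range(k):             # k sweeps instead of sweeping to convergence
        for s in range(len(P)):
            v[s] = sum(pi_a * sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a])
                       for a, pi_a in policy[s].items())
    return v
```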

  13. Value Iteration
  • Update the value function $v$ only; the policy function $\pi$ is not computed explicitly
  • The policy is implicitly built from $v$
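A minimal sketch of value iteration over the same illustrative MDP format: only v is updated, using the Bellman optimality backup, and a policy is read off greedily only at the end.

```python
import numpy as np

def value_iteration(P, gamma=1.0, theta=1e-8):
    v = np.zeros(len(P))
    while True:
        delta = 0.0
        for s in range(len(P)):
            # Bellman optimality backup: v(s) <- max_a sum_{s'} p * (r + gamma * v(s'))
            best = max(sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a]) for a in P[s])
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < theta:
            break
    # the policy is implicit: read it off greedily from the converged v
    policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * v[s2])
                                             for p, s2, r in P[s][a]))
              for s in range(len(P))}
    return v, policy
```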

  14. Shortest Path Example

  15. Policy Iteration vs. Value Iteration
  • Policy iteration: alternates full policy evaluation with greedy policy improvement, maintaining an explicit policy
  • Value iteration: applies the Bellman optimality backup directly to $v$, with no explicit policy until the end

  16. References
  • David Silver, Lecture 3: Planning by Dynamic Programming (https://www.youtube.com/watch?v=Nd1-UUMVfz4&list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ&index=3)
  • Richard S. Sutton and Andrew G. Barto, "Reinforcement Learning: An Introduction," 2nd edition, Nov. 2018, Chapter 4
