Optimal Planning and Shortcut Learning: An Unfulfilled Promise

Erez Karpas and Carmel Domshlak
Faculty of Industrial Engineering and Management, Technion — Israel Institute of Technology
May 28, 2013
Outline
1. Background
2. Learning Shortcut Rules
3. Empirical Evaluation
STRIPS

A STRIPS planning problem with action costs is a 5-tuple Π = ⟨P, s₀, G, A, C⟩:
- P is a set of Boolean propositions
- s₀ ⊆ P is the initial state
- G ⊆ P is the goal
- A is a set of actions; each action is a triple a = ⟨pre(a), add(a), del(a)⟩
- C : A → ℝ₀⁺ assigns a cost to each action

Applying an action sequence ρ = ⟨a₀, a₁, ..., aₙ⟩ at state s leads to the state s[[ρ]]. The cost of ρ is ∑_{i=0}^{n} C(aᵢ).
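To make the formalism concrete, here is a minimal Python sketch of the 5-tuple and of applying an action sequence. All names (Action, Problem, apply_sequence) are illustrative, not from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset     # pre(a): propositions required
    add: frozenset     # add(a): propositions added
    delete: frozenset  # del(a): propositions deleted
    cost: float        # C(a)

@dataclass
class Problem:
    propositions: frozenset  # P
    init: frozenset          # s0 ⊆ P
    goal: frozenset          # G ⊆ P
    actions: tuple           # A

def apply_sequence(state, rho):
    """Return (s[[rho]], total cost), or (None, None) if some action is inapplicable."""
    cost = 0.0
    for a in rho:
        if not a.pre <= state:
            return None, None            # precondition violated in the current state
        state = (state - a.delete) | a.add
        cost += a.cost
    return state, cost
```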
Intended Effects

Chicken logic
Why did the chicken cross the road? To get to the other side.

Observation
Every action along an optimal plan is there for a reason:
- to achieve a precondition for another action, or
- to achieve a goal
Intended Effects — Example

[Figure: trucks t₁ and t₂ and package o at location A; applying load-o-t₁ puts o in t₁; B is the other location]

If ⟨load-o-t₁⟩ is the beginning of an optimal plan, then:
- There must be a reason for applying load-o-t₁
- load-o-t₁ achieves o-in-t₁
- Any continuation of this path to an optimal plan must use some action which requires o-in-t₁
Intended Effects — Formal Definition

Given a path π = ⟨a₀, a₁, ..., aₙ⟩, a set of propositions X ⊆ s₀[[π]] is an intended effect of π iff there exists a path π′ such that π · π′ is an optimal plan and π′ consumes exactly X, i.e., p ∈ X iff there is a causal link ⟨aᵢ, p, aⱼ⟩ in π · π′ with aᵢ ∈ π and aⱼ ∈ π′.

IE(π) denotes the set of all intended effects of π.
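The "consumes exactly X" condition can be made operational. Below is a rough Python sketch, using the Action class from the earlier sketch, that computes which propositions a suffix consumes from a prefix. It attributes each precondition to its most recent achiever; that attribution policy, and the function name, are assumptions made for illustration.

```python
def consumed_facts(init, pi1, pi2):
    """Propositions consumed by suffix pi2 via causal links from actions in prefix pi1."""
    plan = list(pi1) + list(pi2)
    k = len(pi1)
    achiever = {p: None for p in init}   # None marks "provided by s0", not by an action
    consumed = set()
    for j, a in enumerate(plan):
        for p in a.pre:
            i = achiever.get(p)
            if i is not None and i < k <= j:
                consumed.add(p)          # causal link <a_i, p, a_j> crosses the boundary
        for p in a.delete:
            achiever.pop(p, None)
        for p in a.add:
            achiever[p] = j
    return consumed
```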
Intended Effects: Complexity

Hard to Find Exactly
Finding the intended effects of a path π is PSPACE-hard.

Sound Approximation
We can use supersets of IE(π) to derive constraints on any continuation of π.
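Stated compactly, the approximation claim reads as follows (a LaTeX restatement of the slide's claim, nothing beyond it):

```latex
\[
S \supseteq \mathrm{IE}(\pi)
  \;\Longrightarrow\;
  \forall \pi' .\; \bigl( \pi \cdot \pi' \text{ is an optimal plan}
  \;\rightarrow\; \exists X \in S .\; \pi' \text{ consumes exactly } X \bigr)
\]
```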
Shortcuts and Approximate Intended Effects

Intuition
X cannot be an intended effect of π if there is a cheaper way to achieve X.

[Figure: from s₀, path π reaches state s, while a cheaper path π′ with C(π′) < C(π) reaches state s′]

Any continuation of π into an optimal plan must use some fact in s \ s′.
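Here is a minimal sketch of this pruning constraint, reusing apply_sequence from the STRIPS sketch above; shortcut_constraint is an illustrative name, not the paper's API, and π is assumed applicable from s₀.

```python
def shortcut_constraint(init, pi, pi_prime):
    """If pi_prime is a strictly cheaper shortcut, return the set of facts
    at least one of which every optimal continuation of pi must use;
    return None if pi_prime yields no constraint."""
    s, c = apply_sequence(init, pi)                  # s = s0[[pi]], c = C(pi)
    s_prime, c_prime = apply_sequence(init, pi_prime)
    if s_prime is None or c_prime >= c:
        return None                                  # not applicable or not cheaper
    return s - s_prime                               # must use some fact in s \ s'
```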
Shortcuts and Approximate Intended Effects: Example

[Figure: trucks t₁ and t₂ at location A; B is the other location]

π = ⟨drive-t₁-A-B, drive-t₂-A-B⟩
- π′ = ⟨drive-t₂-A-B⟩ is a cheaper way to achieve t₂-at-B, so t₂-at-B cannot be an intended effect of π: we must use t₁-at-B
- π′′ = ⟨drive-t₁-A-B⟩ is a cheaper way to achieve t₁-at-B, so t₁-at-B cannot be an intended effect of π: we must use t₂-at-B
- Hence we must use both t₁-at-B and t₂-at-B
Finding Shortcuts

Where do the shortcuts come from?
- They can be dynamically generated for each path
- Our previous paper used the causal structure of the current path: a graph whose nodes are action occurrences, with an edge from aᵢ to aⱼ if there is a causal link in which aᵢ provides some proposition for aⱼ
- Previous shortcut rules attempted to remove some actions, according to the causal structure, to obtain a shortcut (see the sketch below)
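A sketch of this causal structure, again under the most-recent-achiever link policy; leaf_removal_candidates illustrates one simple removal rule, not the paper's exact shortcut rules.

```python
def causal_structure(init, path):
    """Edges (i, j): action occurrence a_i provides some precondition of a_j."""
    edges = set()
    achiever = {p: None for p in init}   # None: provided by the initial state
    for j, a in enumerate(path):
        for p in a.pre:
            i = achiever.get(p)
            if i is not None:
                edges.add((i, j))        # causal link <a_i, p, a_j>
        for p in a.delete:
            achiever.pop(p, None)
        for p in a.add:
            achiever[p] = j
    return edges

def leaf_removal_candidates(init, path):
    """Occurrences that support no later action; dropping one yields a
    candidate shortcut (cheaper whenever the dropped action has cost > 0)."""
    supporters = {i for i, _ in causal_structure(init, path)}
    return [j for j in range(len(path)) if j not in supporters]
```

In the trucks example, neither drive action supports the other, so both are removal candidates; removing one or the other yields exactly the shortcuts π′ and π′′ above.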
Shortcuts Example — Causal Structure

[Figure: trucks t₁ and t₂ at location A; locations A, B, C]

π = ⟨drive-t₁-A-B, drive-t₂-A-B⟩
Causal structure: two nodes, drive-t₁-A-B and drive-t₂-A-B