Learning and Representation for Generalized Planning
Hector Geffner
ICREA & Universitat Pompeu Fabra, Barcelona, Spain
Research thread is joint work with Blai Bonet, Guillem Francès, Giuseppe de Giacomo, ...
Latest in thread: Learning Features and Abstract Actions for Computing Generalized Plans. B. Bonet, G. Francès, H. Geffner. AAAI 2019.
Planning and Generalized Planning
• Planning is about solving single planning instances
  ⊲ E.g., find plan to achieve on(A, B) for particular configuration of blocks
• Generalized planning is about solving multiple planning instances at once. E.g., find general strategy for
  1. go to target location (x∗, y∗) in empty square grid of any size
  2. pick objects spread in 2D grid, any number, size, locations
  3. achieve goal on(x, y) in Blocks, any number of blocks and configuration
  4. achieve any goal in Blocks, any number of blocks, any configuration, ...
Srivastava et al, 2008; Bonet et al, 2009; Hu and De Giacomo 2011, ...
H. Geffner, Learning and Representation for Generalized Planning, Hybris Workshop, Freiburg 11/2018
Two big questions
• How to represent general plans?
• How to compute them?
Methodological point: seek general methods that build on existing models and solvers; avoid creation of new, ad-hoc algorithms as much as possible.
An Empirical Observation
[Figure: two-state controller with states q0 and q1; transitions labeled TC/Right, –B/Up, TB/Up, –B/Down, –C/Down, TB/Right]
• Task: move 'eye' (mark) one cell at a time until green block found
• Observables: whether marked cell contains a green block (G), non-green block (B), or neither (C); and whether on table (T) or not (–)
• Controller derived using classical planner over transformed problem where
• Generality: derived controller solves not just given instance but any instance; i.e., any number of blocks and any configuration
• True one-shot generalization! Why? How to understand and extend result?
Generalized Planning: Motivation
• Broaden scope of planners: General strategies for playing Atari games?
• Insight into representations: What representations are adequate and why?
• Connections with (deep) learning: How to learn general features and plans?
General Plans in Deep Learning
From BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop. Y. Bengio et al. 10/2018
Task: Pick up grey box behind you, then go to grey key and open door. Green door near the bottom left needs to be unlocked with green key, but this is not explicit in the instruction. Red triangle represents agent; light-grey area, its field of view.
Actually open-ended tasks in natural language (!). See also Mazebase papers and follow-ups.
Outline of Rest of the Talk
• Generalized planning: basic formulation (Hu and de Giacomo, IJCAI 2011)
• Extended formulation: abstract actions (BG, IJCAI 2018)
• Learning features and abstract actions (Bonet, Francès, G., AAAI 2019)
• Wrap Up, Future
Talk is a bit technical. If something is not clear, please stop me and ask. Formulations are key.
Generalized Planning: Basic Formulation
• Generalized problem Q is a set of planning instances P sharing actions and features
• Features f represent state functions φ_f(s) over finite domains (observations)
• Policy for Q is mapping π from feature valuations into actions
• Solutions: π solves Q iff π solves each P ∈ Q
Example
• Task Q_hall: clean cells in 1 × n hall, starting from left, any n and any distribution of dirt
• Features d, e: whether current cell is dirty, whether current cell is last
• Actions move, clean: move right, clean current cell
• Solution: Policy "If d, clean", "If ¬d and ¬e, move"
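The Q_hall policy above can be executed directly. The following is a minimal sketch, assuming a hall is encoded as a list of booleans (True = dirty) and the agent starts at the leftmost cell; the names `features`, `policy` and `run` are illustrative and not from the talk.

```python
from typing import List, Optional, Tuple

def features(hall: List[bool], pos: int) -> Tuple[bool, bool]:
    """Return (d, e): current cell is dirty, current cell is last."""
    return hall[pos], pos == len(hall) - 1

def policy(d: bool, e: bool) -> Optional[str]:
    """The two rules from the slide; None means no rule fires."""
    if d:
        return "clean"
    if not e:
        return "move"
    return None  # not d and e: last cell reached and clean

def run(hall: List[bool]) -> Tuple[List[str], List[bool]]:
    """Apply the policy from the leftmost cell until no rule fires."""
    hall, pos, trace = list(hall), 0, []
    while True:
        action = policy(*features(hall, pos))
        if action is None:
            return trace, hall
        if action == "clean":
            hall[pos] = False
        else:  # "move": one cell to the right
            pos += 1
        trace.append(action)

trace, final_hall = run([True, False, True])
```

The same two rules clean halls of any length and any dirt distribution, which is what makes π a solution to Q_hall rather than to a single instance.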
Questions
How to compute such generalized policies?
• Inductively:
  1. Draw some instances P_1, ..., P_n from Q
  2. Look for mapping π of features into actions that solves all P_i (finite)
  3. Hope that policy π will generalize to other instances P in Q
• Deductively:
  1. Define suitable, sound abstraction Q′ of Q
  2. Solution of abstraction Q′ guaranteed to solve Q
More critically: What about problems Q with no pool of common actions?
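The inductive route can be illustrated on the hall-cleaning task by brute force. This is only a sketch under assumptions not stated in the talk: halls are lists of booleans (True = dirty), the goal is that no cell is dirty, `move` is inapplicable at the last cell, and `clean` on a clean cell is a no-op. The names `solves` and `induce` are hypothetical.

```python
from itertools import product

ACTIONS = ("clean", "move", None)  # None: no rule fires
VALUATIONS = [(d, e) for d in (True, False) for e in (True, False)]

def solves(policy, hall, max_steps=50):
    """Check whether the policy cleans the whole hall within max_steps."""
    hall, pos = list(hall), 0
    for _ in range(max_steps):
        if not any(hall):          # goal: no dirty cell left
            return True
        last = pos == len(hall) - 1
        action = policy[(hall[pos], last)]
        if action == "clean":
            hall[pos] = False
        elif action == "move" and not last:
            pos += 1
        else:                      # no rule fires, or move at last cell
            return False
    return not any(hall)

def induce(samples):
    """Step 2: enumerate all mappings from valuations of (d, e) to actions."""
    found = []
    for choice in product(ACTIONS, repeat=len(VALUATIONS)):
        policy = dict(zip(VALUATIONS, choice))
        if all(solves(policy, hall) for hall in samples):
            found.append(policy)
    return found

# Step 1: a few sampled instances; step 3 is then checked on a larger one.
candidates = induce([[True, False, True], [False, True], [True]])
```

The surviving candidates agree on every feature valuation that is reached before the goal, and differ only on the valuation (¬d, e), which is never reached while dirt remains. Whether they generalize beyond the samples is exactly the "hope" in step 3; the enumeration itself gives no guarantee.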
Generalized Planning: Extended Formulation, BG 2018
• Generalized problem Q is a set of planning instances P sharing set of features F
• Features can be boolean p or numerical n, with functions φ_p(s) and φ_n(s)
• Boolean feature valuation assigns truth values to atoms X_p = true and X_n = 0
• Sound abstract actions on feature variables that track value of features
• Policy π for Q maps boolean feature valuations into abstract actions
• Solutions: π solves Q iff π solves each P ∈ Q
Example Q_clear
• Generalized problem Q_clear: clear block x, STRIPS instances P
• Features F: {H, n(x)}; holding a block, number of blocks above x
• Abstract actions A_F: {pick-above-x, put-aside}
  ⊲ pick-above-x: ¬H, n(x) > 0 ↦ H, n(x)↓
  ⊲ put-aside: H ↦ ¬H
• Abstract actions ā ∈ A_F are sound
  ⊲ If ā applicable in s over instance P of Q_clear, then ∃ action b in P applicable in s with same effects over features F
• Solution for Q_clear is policy given by rules
  ⊲ If ¬H, n(x) > 0 do pick-above-x
  ⊲ If H, n(x) > 0 do put-aside
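To see the Q_clear policy at work on a concrete instance, here is a small sketch. It assumes a Blocks state is a list of towers (each a list from bottom to top) plus an optional held block, that x is not the held block, and that the abstract actions are realized by unstacking the top block of x's tower and by putting the held block on the table; these encodings and the names `n_above` and `clear_block` are illustrative only.

```python
def n_above(towers, x):
    """Feature n(x): number of blocks above x."""
    for tower in towers:
        if x in tower:
            return len(tower) - 1 - tower.index(x)
    raise ValueError(f"block {x!r} is not in any tower")

def clear_block(towers, x, holding=None):
    """Follow the two policy rules until n(x) = 0."""
    towers = [list(t) for t in towers]
    trace = []
    while n_above(towers, x) > 0:
        if holding is None:
            # Rule: not H and n(x) > 0 -> pick-above-x (H becomes true, n(x) decreases)
            tower = next(t for t in towers if x in t)
            holding = tower.pop()
            trace.append("pick-above-x")
        else:
            # Rule: H and n(x) > 0 -> put-aside (H becomes false, n(x) unchanged)
            towers.append([holding])
            holding = None
            trace.append("put-aside")
    return trace, towers, holding

trace, towers, holding = clear_block([["x", "A", "B"], ["C"]], "x")
```

The loop depends only on the values of H and n(x), never on block names or on how many towers exist, which is why the same rules apply to every instance in Q_clear.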
Next
• Language of abstraction
• Properties of abstraction: soundness and completeness
• Computation: Compilation of abstraction into FOND problem
• Learning: Features and abstract actions provided by hand, then learned
Abstract Actions: Language
• Features F will refer to state functions φ_f(s) in instances P but to state (feature) variables in abstraction
• Abstract action ā = Pre ↦ Eff defined over feature variables:
  ⊲ Boolean preconditions and effects: p, ¬p
  ⊲ Numerical preconditions: n = 0, n > 0
  ⊲ Numerical effects: n↑, n↓ (increments and decrements by unspecified amounts)
• Language of qualitative numerical problems (QNPs), Srivastava et al, 2011
  ⊲ Sufficiently expressive for abstraction
  ⊲ Compiles into fully observable non-deterministic (FOND) planning:
  ⊲ n = 0 becomes a boolean variable and n > 0 its negation; effect n↑ becomes n > 0, and effect n↓ becomes the non-deterministic effect n = 0 | n > 0. The non-deterministic effects n = 0 | n > 0, however, are conditionally fair
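The translation of a single abstract action into a FOND action can be sketched as follows. The dictionary encoding (True/False for boolean features, "=0"/">0" for numerical preconditions, "inc"/"dec" for numerical effects) and the function name `compile_action` are assumptions made for this illustration. The sketch covers only the syntactic step and does not model conditional fairness.

```python
from itertools import product

def compile_action(pre, eff):
    """Translate one abstract action into (FOND precondition, list of outcomes).

    Each numerical variable n is represented by the boolean atom "n=0".
    """
    fond_pre = {}
    for var, cond in pre.items():
        if cond in ("=0", ">0"):
            fond_pre[f"{var}=0"] = (cond == "=0")
        else:
            fond_pre[var] = cond

    deterministic, uncertain = {}, []
    for var, change in eff.items():
        if change == "inc":       # n increases: n > 0 holds afterwards
            deterministic[f"{var}=0"] = False
        elif change == "dec":     # n decreases: n = 0 | n > 0
            uncertain.append(f"{var}=0")
        else:                     # boolean effect p or not p
            deterministic[var] = change

    outcomes = []
    for values in product([True, False], repeat=len(uncertain)):
        outcome = dict(deterministic)
        outcome.update(zip(uncertain, values))
        outcomes.append(outcome)
    return fond_pre, outcomes

# pick-above-x: not H, n(x) > 0  |->  H, n(x) decreases
pre, outcomes = compile_action({"H": False, "n": ">0"}, {"H": True, "n": "dec"})
```

Each decrement doubles the number of outcomes, since the resulting value may or may not be zero. A FOND planner run on the compiled problem would additionally need to account for the fact that a variable that keeps decreasing without being increased must eventually reach zero.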
Abstract Actions: Soundness and Completeness
Abstract actions enable us to reason about all instances P in parallel, since they operate at the level of features
Def: Action b and abstract action ā = Pre ↦ Eff have the same effects over F in state s if both are applicable in s and, for s′ = f(b, s) and all features p and n in F:
  1. p ∈ Eff iff φ_p(s′) true and φ_p(s) false;
  2. ¬p ∈ Eff iff φ_p(s′) false and φ_p(s) true;
  3. n↑ ∈ Eff iff φ_n(s′) > φ_n(s);
  4. n↓ ∈ Eff iff φ_n(s′) < φ_n(s)
Def: Abstract actions A_F are sound in Q iff for any s over instance P of Q, if ā in A_F is applicable in s, there is an action b in P with the same effects as ā in s.
Def: Abstract actions A_F are complete in Q iff for any s over instance P of Q, if action b in P is applicable in s, there is an abstract action ā in A_F with the same effects as b in s.
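The "same effects" test compares the qualitative change in feature values between s and s′ = f(b, s) with the declared effects of ā. A minimal sketch, assuming feature valuations are dictionaries with bool values for boolean features and int values for numerical ones, and that applicability of both actions is checked separately; the names `observed_effects` and `same_effects` are hypothetical.

```python
def observed_effects(before, after):
    """Qualitative change in feature values from s (before) to s' (after)."""
    effects = {}
    for feature, old in before.items():
        new = after[feature]
        if isinstance(old, bool):        # boolean feature: record flips only
            if new != old:
                effects[feature] = new   # True: p in Eff, False: not-p in Eff
        elif new > old:
            effects[feature] = "inc"     # n increases
        elif new < old:
            effects[feature] = "dec"     # n decreases
    return effects

def same_effects(abstract_eff, before, after):
    """True iff the declared effects match exactly the observed changes."""
    return observed_effects(before, after) == abstract_eff

# Feature values before and after a concrete action that picks a block above x.
before = {"H": False, "n(x)": 2}
after = {"H": True, "n(x)": 1}
matches = same_effects({"H": True, "n(x)": "dec"}, before, after)
```

Because the conditions are "iff", an abstract action that declares an effect that does not occur, or omits one that does, fails the test. Checking soundness would repeat this test over the states where an abstract action is applicable, looking for at least one matching concrete action.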
Example
• Let s be state of some instance P where
  ⊲ on(A, B) and clear(A) are true,
  ⊲ A is above x
• ā = ¬H, n(x) > 0 ↦ H, n(x)↓, with F = {H, n(x)}
• b = unstack(A, B)
Abstract action ā and action b have the same effects over features in s:
• Both ā and b applicable in s
• Both make H true in s′ = f(b, s), and both decrease n(x); i.e.,
  ⊲ φ_H(s′) = true, φ_n(x)(s′) < φ_n(x)(s), and
  ⊲ Eff(ā) = {H, n(x)↓}
Abstract action ā is indeed sound in Q_clear