
Chapter 5: Deliberation with Nondeterministic Domain Models



  1. Chapter 5: Deliberation with Nondeterministic Domain Models
     Automated Planning and Acting, by Malik Ghallab, Dana Nau and Paolo Traverso (http://www.laas.fr/planning)
     Dana S. Nau, University of Maryland
     Last update: May 5, 2020
     Nau – Lecture slides for Automated Planning and Acting. Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

  2. Motivation
     ● We’ve assumed action a in state s has just one possible outcome
       ▸ γ(s,a)
     ● Often more than one possible outcome
       ▸ Unintended outcomes
       ▸ Exogenous events
       ▸ Inherent uncertainty
     [Figure: applying grasp(c) to a stack of blocks a, b, c]

  3. Nondeterministic Planning Domains
     ● 3-tuple (S, A, γ)
       ▸ S and A: finite sets of states and actions
       ▸ γ : S × A → 2^S
     ● γ(s,a) = {all possible “next states” after applying action a in state s}
       ▸ a is applicable in state s iff γ(s,a) ≠ ∅
     ● Applicable(s) = {all actions applicable in s} = {a ∈ A | γ(s,a) ≠ ∅}
     ● One action representation: n mutually exclusive “effects” lists
         a(z_1, …, z_k)
           pre: p_1, …, p_m
           eff_1: e_11, e_12, …
           eff_2: e_21, e_22, …
           …
           eff_n: e_n1, e_n2, …
       ▸ Problem: n may be combinatorially large
         • Suppose a can cause any possible combination of effects e_1, e_2, …, e_k
         • Need eff_1, eff_2, …, eff_(2^k), one for each combination
         • Section 5.4: a way to alleviate this
       ▸ For now, ignore most of that
         • states, actions ⇔ nodes, edges in a graph
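The definitions of γ and Applicable above can be sketched in Python as follows. This is a minimal illustration, not the book's representation: the names `GAMMA`, `gamma`, and `applicable` are assumptions, and the table holds only the two transitions that the harbor example in these slides states explicitly.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = str
Action = str

# gamma as a lookup table: (state, action) -> set of possible next states.
# Only these two entries are stated explicitly in the slides; a missing key
# stands for gamma(s, a) = empty set, i.e. "a is not applicable in s".
GAMMA: Dict[Tuple[State, Action], FrozenSet[State]] = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
}


def gamma(s: State, a: Action) -> FrozenSet[State]:
    """All possible next states after applying a in s (empty if inapplicable)."""
    return GAMMA.get((s, a), frozenset())


def applicable(s: State) -> Set[Action]:
    """Applicable(s) = {a in A | gamma(s, a) is nonempty}."""
    return {a for (state, a), outcomes in GAMMA.items() if state == s and outcomes}
```

Storing one outcome set per (state, action) pair is the "nodes and edges" view from the last bullet; it sidesteps the combinatorial effects lists but only works when S is small enough to enumerate.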

  4. Nondeterministic Planning Domains
     ● For deterministic planning problems, search space was a graph
     ● Now it’s an AND/OR graph
       ▸ OR branch:
         • several applicable actions, which one to choose?
       ▸ AND branch:
         • multiple possible outcomes
         • must handle all of them
     ● Analogy to PSP
       ▸ OR branch ⇔ action selection
       ▸ AND branch ⇔ flaw selection
     [Figure: harbor domain state-transition graph]

  5. Example
     ● Very simple harbor management domain
       ▸ Unload a single item from a ship
       ▸ Move it around a harbor
     [Figure: harbor domain state-transition graph. States: on_ship, at_harbor, parking1, parking2, transit1, transit2, transit3, gate1, gate2. Edge labels: unload, park, deliver, move, back]

  6. Example
     ● One state variable: pos(item)
     ● Five actions
       ▸ Two deterministic:
         • unload, back
       ▸ Three nondeterministic:
         • park, move, deliver
     ● Simplified names for states
       ▸ For {pos(item)=on_ship} write on_ship
     [Figure: harbor domain state-transition graph]

  7. Actions
       ▸ park
           pre: pos(item) = at_harbor
           eff_1: pos(item) ← parking1
           eff_2: pos(item) ← parking2
           eff_3: pos(item) ← transit1
     ● Three possible outcomes
       ▸ put item in parking1 or parking2 if one of them has space
       ▸ or in transit1 if there’s no parking space
     [Figure: harbor domain state-transition graph]

  8. Plans → Policies
     ● Need something more general than a sequence of actions
       ▸ After park, what do we do next?
     ● Policy: a partial function π : S ⇸ A
       ▸ i.e., Dom(π) ⊆ S
       ▸ For every s ∈ Dom(π), require π(s) ∈ Applicable(s)
     ● Meaning:
       ▸ perform π(s) whenever we’re in state s
     ● π_1 = {(on_ship, unload), (at_harbor, park), (parking1, deliver)}
     [Figure: harbor domain state-transition graph]
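A policy maps naturally onto a dictionary from states to actions, and the requirement π(s) ∈ Applicable(s) becomes a one-line check. The sketch below is illustrative only: `is_valid_policy` is a hypothetical helper, and the outcome set given for deliver in parking1 is an assumption, since the slide text does not list it.

```python
# gamma as a table: (state, action) -> set of possible next states.
GAMMA = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
    # ASSUMPTION: the outcomes of deliver are not listed in the slide text.
    ("parking1", "deliver"): frozenset({"gate1", "gate2"}),
}

# pi_1 from the slide, as a partial function: Dom(pi_1) is the set of keys.
PI_1 = {"on_ship": "unload", "at_harbor": "park", "parking1": "deliver"}


def is_valid_policy(policy, gamma):
    """True iff pi(s) is applicable in s for every s in Dom(pi)."""
    return all(gamma.get((s, a)) for s, a in policy.items())
```

For example, `is_valid_policy(PI_1, GAMMA)` holds, while a policy that assigns park to on_ship is rejected because γ(on_ship, park) is empty.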

  9. Definitions Over Policies
     ● Transitive closure: γ̂(s,π) = {all states reachable from s using π}
       ▸ γ̂(s,π) = S_0 ∪ S_1 ∪ S_2 ∪ …
         • S_0 = {s}
         • S_(i+1) = ∪ {γ(s,π(s)) | s ∈ S_i}, i ≥ 0 (states outside Dom(π) contribute no successors)
     ● leaves(s,π) = γ̂(s,π) ∖ Dom(π)
       ▸ may be empty
     ● Reachability graph: Graph(s,π) = (V,E)
         • V = γ̂(s,π)
         • E = {(s′, s′′) | s′ ∈ V, s′′ ∈ γ(s′, π(s′))}
     ● π_1 = {(on_ship, unload), (at_harbor, park), (parking1, deliver)}
     [Figure: reachability graph of π_1 in the harbor domain]
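The three definitions above amount to a graph traversal that follows only the action the policy prescribes. A minimal sketch, assuming a dictionary encoding of γ and π; the function names are illustrative, and the outcomes of deliver are an assumption because the slide text does not list them.

```python
GAMMA = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
    # ASSUMPTION: the outcomes of deliver are not listed in the slide text.
    ("parking1", "deliver"): frozenset({"gate1", "gate2"}),
}
PI_1 = {"on_ship": "unload", "at_harbor": "park", "parking1": "deliver"}


def successors(gamma, policy, s):
    """gamma(s, pi(s)), or the empty set when s is outside Dom(pi)."""
    if s not in policy:
        return frozenset()
    return gamma.get((s, policy[s]), frozenset())


def reachable(gamma, policy, s):
    """Transitive closure: every state reachable from s by following pi."""
    seen, frontier = {s}, [s]
    while frontier:
        for nxt in successors(gamma, policy, frontier.pop()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def leaves(gamma, policy, s):
    """Reachable states where the policy prescribes no action."""
    return reachable(gamma, policy, s) - set(policy)


def graph(gamma, policy, s):
    """Reachability graph (V, E) of the policy starting from s."""
    vertices = reachable(gamma, policy, s)
    edges = {(u, v) for u in vertices for v in successors(gamma, policy, u)}
    return vertices, edges
```

Under the assumed deliver outcomes, `leaves(GAMMA, PI_1, "on_ship")` is {parking2, transit1, gate1, gate2}: the two park outcomes that π_1 does not cover, plus the two gates.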

  10. Definitions Over Policies
     ● leaves(s,π) = γ̂(s,π) ∖ Dom(π)
       ▸ may be empty
     ● π_1 = {(on_ship, unload), (at_harbor, park), (parking1, deliver)}
     ● leaves(on_ship, π_1) are shown in yellow
     [Figure: harbor domain state-transition graph with leaves(on_ship, π_1) highlighted in yellow]

  11. Performing a Policy
     ● PerformPolicy(π)
         s ← observe current state
         while s ∈ Dom(π) do
           perform action π(s)
           s ← observe current state
     ● π_1 = {(on_ship, unload), (at_harbor, park), (parking1, deliver)}
     [Figure: harbor domain state-transition graph]
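PerformPolicy can be exercised against a simulated environment that picks one outcome at random each time an action is performed. The sketch below is an illustration under stated assumptions: the `SimulatedHarbor` class, the `observe`/`perform` callbacks, and the outcomes of deliver are all hypothetical and not part of the slides.

```python
import random

GAMMA = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
    # ASSUMPTION: the outcomes of deliver are not listed in the slide text.
    ("parking1", "deliver"): frozenset({"gate1", "gate2"}),
}
PI_1 = {"on_ship": "unload", "at_harbor": "park", "parking1": "deliver"}


class SimulatedHarbor:
    """Hypothetical stand-in for the real world: resolves nondeterminism randomly."""

    def __init__(self, gamma, start, seed=None):
        self.gamma = gamma
        self.state = start
        self.rng = random.Random(seed)

    def observe(self):
        return self.state

    def perform(self, action):
        # sorted() makes the random choice reproducible for a given seed
        outcomes = sorted(self.gamma[(self.state, action)])
        self.state = self.rng.choice(outcomes)


def perform_policy(policy, observe, perform):
    """Follow the policy until reaching a state outside Dom(pi); return that state."""
    s = observe()
    while s in policy:
        perform(policy[s])
        s = observe()
    return s
```

A run such as `perform_policy(PI_1, env.observe, env.perform)` with `env = SimulatedHarbor(GAMMA, "on_ship")` stops in one of the leaves of π_1; which one depends on how the environment resolves park and deliver.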

  12. Planning Problems and Solutions
     ● Planning problem P = (Σ, s_0, S_g)
       ▸ planning domain Σ = (S,A,γ), initial state s_0 ∈ S, set of goal states S_g ⊆ S (shown in green)
     ● π is a solution if at least one execution ends at a goal
       ▸ leaves(s_0,π) ∩ S_g ≠ ∅
     ● A solution π is safe if ∀ s ∈ γ̂(s_0,π), leaves(s,π) ∩ S_g ≠ ∅
       ▸ all executions end at goals
       ▸ at every node s of Graph(s_0,π), the goal is reachable
     ● Otherwise, unsafe
         • Is π_1 safe or unsafe?
     ● π_1 = {(on_ship, unload), (at_harbor, park), (parking1, deliver)}
     [Figure: harbor domain state-transition graph with the goal S_g shown in green]
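Both conditions can be checked mechanically once leaves(s,π) is computable. The following sketch encodes them directly; the helper names, the outcomes of deliver, and the choice S_g = {gate2} are assumptions made for illustration, since the slide text does not spell them out.

```python
GAMMA = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
    # ASSUMPTION: the outcomes of deliver are not listed in the slide text.
    ("parking1", "deliver"): frozenset({"gate1", "gate2"}),
}
PI_1 = {"on_ship": "unload", "at_harbor": "park", "parking1": "deliver"}
GOALS = {"gate2"}  # ASSUMPTION: the goal set is only shown in the figure


def reachable(gamma, policy, s):
    """All states reachable from s by following the policy."""
    seen, frontier = {s}, [s]
    while frontier:
        cur = frontier.pop()
        if cur not in policy:
            continue  # leaf: the policy prescribes nothing here
        for nxt in gamma.get((cur, policy[cur]), frozenset()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def leaves(gamma, policy, s):
    return reachable(gamma, policy, s) - set(policy)


def is_solution(gamma, policy, s0, goals):
    """At least one execution from s0 ends at a goal."""
    return bool(leaves(gamma, policy, s0) & goals)


def is_safe(gamma, policy, s0, goals):
    """From every reachable state, some execution still ends at a goal."""
    return all(leaves(gamma, policy, s) & goals
               for s in reachable(gamma, policy, s0))
```

Under these assumptions π_1 is a solution but not a safe one: park can put the item in parking2 or transit1, where π_1 prescribes no action and no goal has been reached.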

  13. Safe Solutions
     ● Acyclic safe solution
       ▸ Graph(s_0,π) is acyclic, and leaves(s_0,π) ⊆ S_g
       ▸ Guaranteed to reach a goal
     ● π_2 = {(on_ship, unload), (at_harbor, park), (parking1, deliver), (parking2, deliver), (transit1, move), (transit2, move), (transit3, move)}
     [Figure: harbor domain state-transition graph]
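Whether Graph(s_0,π) is acyclic can be tested with a depth-first search that reports a cycle when it meets a state still on the current path. The sketch below uses a small made-up domain (states a, b, goal and actions go, retry are hypothetical, not from the harbor example) so that both outcomes can be shown.

```python
# Hypothetical three-state domain, used only to illustrate the check.
GAMMA = {
    ("a", "go"): frozenset({"b", "goal"}),
    ("b", "go"): frozenset({"goal"}),
    ("b", "retry"): frozenset({"a", "goal"}),
}


def successors(gamma, policy, s):
    """gamma(s, pi(s)), or the empty set when s is outside Dom(pi)."""
    if s not in policy:
        return frozenset()
    return gamma.get((s, policy[s]), frozenset())


def is_acyclic(gamma, policy, s0):
    """True iff the reachability graph of the policy from s0 has no cycle."""
    on_path, done = set(), set()

    def visit(s):
        on_path.add(s)
        for nxt in successors(gamma, policy, s):
            if nxt in on_path:
                return False  # back edge: some execution can loop
            if nxt not in done and not visit(nxt):
                return False
        on_path.discard(s)
        done.add(s)
        return True

    return visit(s0)
```

With `{"a": "go", "b": "go"}` the check succeeds; with `{"a": "go", "b": "retry"}` it fails because retry can lead back to a. In this toy domain both policies have leaves(a,π) = {goal}, so the first would be an acyclic safe solution and the second a cyclic one.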

  14. Safe Solutions
     ● Cyclic safe solution
       ▸ Graph(s_0,π) is cyclic, leaves(s_0,π) ⊆ S_g, ∀ s ∈ γ̂(s_0,π), leaves(s,π) ∩ S_g ≠ ∅
         • At every state, there is an execution path that ends at a goal
       ▸ Will never get caught in a dead end
     ● π_3 = {(on_ship, unload), (at_harbor, park), (parking1, deliver), (parking2, back), (transit1, move), (transit2, move), (gate1, back)}
     ● Poll: Are there situations where we can be sure a cyclic safe solution will reach a goal? Are there situations where we can’t?
         (1) Yes to both questions
         (2) Yes to 1st, no to 2nd
         (3) Yes to 2nd, no to 1st
         (4) No to both
     [Figure: harbor domain state-transition graph]

  15. Kinds of Solutions
     [Figure: classification of solutions. Solutions are either safe or unsafe; safe solutions are either acyclic or cyclic. Three small example graphs ending at a Goal node are labeled a (acyclic safe), b (unsafe), and c (cyclic safe)]

  16. Finding (Unsafe) Solutions
     For comparison:
       Forward-search(Σ, s_0, g)
         s ← s_0; π ← ⟨⟩
         loop
           if s satisfies g then return π
           A′ ← {a ∈ A | a is applicable in s}
           if A′ = ∅ then return failure
           nondeterministically choose a ∈ A′
           s ← γ(s,a); π ← π.a
     Callouts on the slide: “Decide which state to plan for” (*); “Cycle-checking”
     ● Poll: which should (*) be?
         1. nondeterministically choose
         2. arbitrarily choose

  17. Example
       s = on_ship
       π = {}
       Visited = {on_ship}
     [Figure: harbor domain state-transition graph with s = on_ship marked and the goal S_g shown]

  18. Example
       s = on_ship, a = unload
       γ(s,a) = {at_harbor}
       s′ = at_harbor
       π = {(on_ship, unload)}
       Visited = {on_ship, at_harbor}
     [Figure: harbor domain state-transition graph with s, a, and s′ marked]

  19. Example
       s = at_harbor, a = park
       γ(s,a) = {parking1, parking2, transit1}
       s′ = parking1
       π = {(on_ship, unload), (at_harbor, park)}
       Visited = {on_ship, at_harbor, parking1}
     [Figure: harbor domain state-transition graph with s, a, and s′ marked]
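The trace in slides 17 to 19 suggests a forward search that picks an action, follows one of its outcomes, records (s, a) in π, and uses Visited for cycle checking. The sketch below reconstructs that idea from the trace alone, so it is a guess at the procedure rather than the slides' algorithm. It replaces the nondeterministic choices with depth-first backtracking; the outcomes of deliver and the goal set {gate2} are assumptions.

```python
GAMMA = {
    ("on_ship", "unload"): frozenset({"at_harbor"}),
    ("at_harbor", "park"): frozenset({"parking1", "parking2", "transit1"}),
    # ASSUMPTION: the outcomes of deliver are not listed in the slide text.
    ("parking1", "deliver"): frozenset({"gate1", "gate2"}),
}
GOALS = {"gate2"}  # ASSUMPTION: the goal set is only shown in the figure


def find_solution(gamma, s0, goals):
    """Return a (possibly unsafe) policy with one execution reaching a goal, or None."""

    def search(s, policy, visited):
        if s in goals:
            return policy
        for (state, action), outcomes in gamma.items():
            if state != s:
                continue  # action not applicable in s
            for s_next in sorted(outcomes):  # choose which outcome to plan for
                if s_next in visited:
                    continue  # cycle checking
                result = search(s_next, {**policy, s: action}, visited | {s_next})
                if result is not None:
                    return result
        return None  # dead end: backtrack

    return search(s0, {}, frozenset({s0}))
```

On this fragment, `find_solution(GAMMA, "on_ship", GOALS)` passes through the same values of s, s′, π, and Visited as the three example slides before continuing from parking1. The returned policy plans for a single outcome per action, so nothing guarantees it is safe.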
