classical planning

Classical Planning CE417: Introduction to Artificial Intelligence

Classical Planning. CE417: Introduction to Artificial Intelligence, Sharif University of Technology, Spring 2017, Soleymani. AIMA, 3rd Edition, Chapter 10 & more about planning. What is planning? Planning problem: finding a sequence of


  1. Backward state-space search
     - An action $a$ is relevant for goal $g$ if $a$ can be the last step in a plan leading to $g$: $g \cap \mathrm{ADD}(a) \neq \emptyset$ and $g \cap \mathrm{DEL}(a) = \emptyset$.
     - Regression: to achieve goal $g$, we regress it through a relevant action $a$ (with $a$ as the final step of a plan reaching $g$): $g' = (g \setminus \mathrm{ADD}(a)) \cup \mathrm{PREC}(a)$
     - [Figure: blocks A, B, C with the candidate action Move(A, x, B)]

  2. Regression Example
     - [Figure: blocks-world illustration of Move(B, A, C)]
     - goal = {On(B, C), On(Table, A)}
     - Relevant action: $a$ = Move(B, A, C)
       PREC($a$): On(B, A), Clear(B), Clear(C)
       ADD($a$): On(B, C)
       DEL($a$): On(B, A)
     - $g \cap \mathrm{ADD}(a) = \{\mathit{On}(B, C)\} \neq \emptyset$ and $g \cap \mathrm{DEL}(a) = \emptyset$
     - Regression (add the preconditions of $a$, remove the predicates in the add list of $a$): the regressed goal is {On(Table, A), On(B, A), Clear(B), Clear(C)}. On(B, C) is dropped because it is in ADD($a$).
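The relevance test and the regression step above can be sketched in a few lines of Python. This is only an illustration: the `Action` tuple, the string encoding of literals, and the function names are assumptions made for this sketch, and the action lists are copied from the slide as given.

```python
from typing import FrozenSet, NamedTuple, Optional


class Action(NamedTuple):
    """Hypothetical STRIPS-style action: precondition, add and delete lists."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def is_relevant(goal: FrozenSet[str], a: Action) -> bool:
    # Relevant: achieves at least one goal literal and deletes none of them.
    return bool(goal & a.add) and not (goal & a.delete)


def regress(goal: FrozenSet[str], a: Action) -> Optional[FrozenSet[str]]:
    # g' = (g - ADD(a)) | PREC(a); None if the action is not relevant.
    if not is_relevant(goal, a):
        return None
    return (goal - a.add) | a.pre


# Data from the regression example (literals are plain strings).
move_b_a_c = Action(
    name="Move(B,A,C)",
    pre=frozenset({"On(B,A)", "Clear(B)", "Clear(C)"}),
    add=frozenset({"On(B,C)"}),
    delete=frozenset({"On(B,A)"}),
)
goal = frozenset({"On(B,C)", "On(Table,A)"})
new_goal = regress(goal, move_b_a_c)
```

Here `new_goal` contains the untouched goal literal plus the three preconditions, and no longer contains `On(B,C)`.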

  3. Backward Search
     Backward-Search(s, g)
       if s satisfies g then return []
       relevant = {a | a is relevant to g}
       if relevant = ∅ then return failure
       for each a ∈ relevant do
         g' = (g − ADD(a)) ∪ PREC(a)
         π' = Backward-Search(s, g')
         if π' ≠ failure then return [π' | a]
       return failure
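A minimal runnable sketch of the Backward-Search procedure follows. It departs from the pseudocode in two labeled ways: a depth limit and a set of goals already on the current path are added so the recursion always terminates. The `Action` tuple and the tiny sock/shoe domain used to exercise it are assumptions for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    """Hypothetical STRIPS-style action used only for this sketch."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def backward_search(state: FrozenSet[str], goal: FrozenSet[str],
                    actions: List[Action], depth_limit: int = 10,
                    on_path: FrozenSet[FrozenSet[str]] = frozenset()
                    ) -> Optional[List[str]]:
    if goal <= state:                       # s satisfies g
        return []
    if depth_limit == 0 or goal in on_path:  # added guard against loops
        return None
    for a in actions:
        relevant = bool(goal & a.add) and not (goal & a.delete)
        if not relevant:
            continue
        sub_goal = (goal - a.add) | a.pre   # regression through a
        plan = backward_search(state, sub_goal, actions,
                               depth_limit - 1, on_path | {goal})
        if plan is not None:
            return plan + [a.name]          # a is the last step
    return None                             # failure


empty = frozenset()
actions = [
    Action("RightSock", empty, frozenset({"RightSockOn"}), empty),
    Action("RightShoe", frozenset({"RightSockOn"}),
           frozenset({"RightShoeOn"}), empty),
]
plan = backward_search(empty, frozenset({"RightShoeOn"}), actions)
```

The search regresses `RightShoeOn` through `RightShoe` to `RightSockOn`, then through `RightSock` to the empty goal, which the initial state satisfies.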

  4. Backward Search
     - Instantiating schemas
       - The goal is a conjunction of literals that may contain variables.
       - To be more efficient, instantiate schema variables by unification rather than generating and testing different actions.
     - For most domains, it has a lower branching factor than forward search.
     - Heuristics are more difficult to use:
       - It is based on sets of states rather than individual states.

  5. Regression Example
     - [Figure: blocks-world illustration of Move(B, ?, C)]
     - goal = {On(B, C), On(Table, A)}
     - Relevant action: $a$ = Move(B, ?, C)
       PREC($a$): On(B, ?), Clear(B), Clear(C)
       ADD($a$): On(B, C)
       DEL($a$): On(B, ?)
     - $g \cap \mathrm{ADD}(a) = \{\mathit{On}(B, C)\} \neq \emptyset$ and $g \cap \mathrm{DEL}(a) = \emptyset$
     - Regression (add the preconditions of $a$, remove the predicates in the add list of $a$): the regressed goal is {On(Table, A), On(B, ?), Clear(B), Clear(C)}. The variable ? stays uninstantiated.

  6. State-space search problems
     - Both forward and backward algorithms may suffer from the repeated-states problem:
       - Visited states must be recorded.
     - What's wrong with search?
       - The branching factor is usually too high.
       - Combinatorial explosion if a state is given by a set of possible worlds / logical interpretations / variable assignments.

  7. Heuristic for planning
     - Solving problems by searching atomic states (Chapter 3):
       - Human intelligence is usually used to define domain-specific heuristics.
     - Assumption: "path cost = number of plan steps"
       - We want to estimate the number of steps needed to reach $g$ from $s$.
     - In planning problems, we use a factored representation of states:
       - This allows us to find domain-independent heuristics.

  8. Heuristic for planning
     - Heuristics:
       - Relaxed problems:
         - Ignore delete lists
         - Ignore preconditions
       - Problem decomposition:
         - Sub-goal independence assumption

  9. Heuristics: relaxed problems
     1) Ignore delete lists: delete the negative effects from actions, solve the relaxed problem, and use the length of its plan as the heuristic.
        - Admissible?
        - Can we solve this problem in polynomial time?
     2) Ignore preconditions: delete all preconditions from actions, solve the relaxed problem, and use the length of its plan as the heuristic.
        - Admissible?
        - Can we solve this problem in polynomial time?

  10. Heuristics: problem decomposition
     - $f(p, s)$: minimum number of steps needed to reach proposition $p$ from $s$
     - Sum of the costs of reaching each sub-goal from $s$: $h_{\mathit{sum}}(s) = \sum_{g \in G} f(g, s)$
       - Not necessarily admissible: the independence assumption can be pessimistic.
     - Max of the costs of reaching each sub-goal from $s$: $h_{\mathit{max}}(s) = \max_{g \in G} f(g, s)$

  11. Heuristics: problem decomposition (sum or max)
     - Max or sum?
       - Admissibility vs. accuracy
       - Sum works well in practice for problems that are largely decomposable.
     - How to compute $f(p, s)$?

  12. Ignore delete lists & problem decomposition
     - When both ignoring delete lists and decomposing the problem, we can compute $f(p, s)$ in polynomial time using the Planning Graph (we will see it in the next slides).
     - Examples of planners that use such heuristics:
       - HSP
       - Fast-Forward (FF)
         - Competed in the fully automated track of AIPS'2000
         - Granted "Group A distinguished performance Planning System"
         - Estimates the heuristic with the help of a planning graph
     - J. Hoffmann, B. Nebel, "The FF planning system: Fast plan generation through heuristic search", Journal of Artificial Intelligence Research 14 (2001), 253-302

  13. Planning graph
     - A way to find accurate heuristics:
       - (Under)estimates the number of steps required to reach $g$
       - Admissible
     - A layered graph that keeps track of literal pairs and action pairs that cannot be reached simultaneously (mutexes)

  14. Planning graph: structure
     - Directed, leveled graph
     - Two types of levels:
       - $P$: proposition levels
       - $A$: action levels
     - Proposition and action levels alternate.
     - Edges (between levels):
       - Precondition: each action at $A_i$ is connected to its preconditions at $P_i$.
       - Effect: each action at $A_i$ is connected to its effects at $P_{i+1}$.

  15. Planning graph: layers
     - $P_i$ contains all the literals that could hold at time $i$.
     - $A_i$ contains all actions whose preconditions are satisfied in $P_i$, plus no-op actions (to solve the frame problem).
     - [Figure: alternating levels $P_0$ (initial state), $A_0$, $P_1$, $A_1$, $P_2$, ...]

  16. Planning graph: layers
     - $P_0 = \{p \in \mathit{Init}\}$
     - $A_i = \{a \text{ is an action} \mid \mathrm{PRECONDS}(a) \subseteq P_i\}$
     - $P_{i+1} = \{p \in \mathrm{EFFECT}(a) \mid a \in A_i\}$
     - [Figure: alternating levels $P_0$ (initial state), $A_0$, $P_1$, $A_1$, ...]
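The layer definitions can be sketched as a small fixpoint computation. The sketch below ignores mutexes, models no-ops by copying every literal forward, and writes negative literals with a leading `~`. It uses the cake domain, and it assumes that `~Eaten` is part of the initial level (a closed-world reading of the initial state); the tuple layout and function names are illustrative choices.

```python
from typing import FrozenSet, List, Tuple

# (name, preconditions, effects); negative literals carry a "~" prefix.
ActionSpec = Tuple[str, FrozenSet[str], FrozenSet[str]]


def next_level(props: FrozenSet[str],
               actions: List[ActionSpec]) -> Tuple[List[str], FrozenSet[str]]:
    """Return (A_i, P_{i+1}) for proposition level P_i = props."""
    applicable = [name for name, pre, _ in actions if pre <= props]
    nxt = set(props)                 # no-op actions carry every literal over
    for _, pre, eff in actions:
        if pre <= props:
            nxt |= eff
    return applicable, frozenset(nxt)


def build_levels(init: FrozenSet[str], actions: List[ActionSpec],
                 max_levels: int = 20) -> List[FrozenSet[str]]:
    """Proposition levels P_0, P_1, ... until two consecutive levels match."""
    levels = [frozenset(init)]
    while len(levels) <= max_levels:
        _, nxt = next_level(levels[-1], actions)
        if nxt == levels[-1]:        # literal sets no longer grow
            break
        levels.append(nxt)
    return levels


cake_actions: List[ActionSpec] = [
    ("Eat", frozenset({"Have"}), frozenset({"~Have", "Eaten"})),
    ("Bake", frozenset({"~Have"}), frozenset({"Have"})),
]
levels = build_levels(frozenset({"Have", "~Eaten"}), cake_actions)
```

Because mutexes are ignored, this corresponds to the relaxed graph used for level costs, not to the full graph needed by GraphPlan.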

  17. Planning graph: Cake example
     - Init(Have(Cake))
     - Goal(Have(Cake) ∧ Eaten(Cake))
     - Action(Eat(Cake),
         PRECOND: Have(Cake)
         EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
     - Action(Bake(Cake),
         PRECOND: ¬Have(Cake)
         EFFECT: Have(Cake))
     - no-op action

  18. Planning graph: Spare tire example
     - Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))
     - Goal(At(Spare, Axle))
     - Action(Remove(obj, loc),
         PRECOND: At(obj, loc)
         EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))
     - Action(PutOn(t, Axle),
         PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle)
         EFFECT: ¬At(t, Ground) ∧ At(t, Axle))
     - Action(LeaveOvernight,
         PRECOND:
         EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))

  19. Planning graph: Spare tire example

  20. Planning graphs: properties
     - At level $P_i$, both a literal $p$ and its negation $\neg p$ may appear.
     - A literal may appear at level $P_i$ while actually it could not be true until a later level (if any).
     - A literal will never appear late in the planning graph.

  21. Planning graphs: cost of each goal literal
     - How difficult is it to achieve a goal literal $g_i$ from $s$?
     - Level cost of $g_i$, written $lc(g_i, s)$: the first level of the PG at which $g_i$ appears.
       - Relation to previously introduced heuristics?
       - Is it accurate?

  22. Planning graphs: heuristics
     - $h_{\mathit{max\_level}}(s) = \max_{g_i \in \mathit{goal}} lc(g_i, s)$
     - $h_{\mathit{level\_sum}}(s) = \sum_{g_i \in \mathit{goal}} lc(g_i, s)$
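Given the proposition levels of a planning graph, the level cost and the two heuristics reduce to a lookup plus a max or a sum. The sketch below takes the levels as a plain list of sets; the three-level graph over `p`, `q`, `r` is a made-up example, not one from the slides.

```python
import math
from typing import FrozenSet, Iterable, List


def level_cost(levels: List[FrozenSet[str]], literal: str) -> float:
    """Index of the first proposition level containing the literal."""
    for i, props in enumerate(levels):
        if literal in props:
            return i
    return math.inf                  # never appears: unreachable


def h_max_level(levels: List[FrozenSet[str]], goal: Iterable[str]) -> float:
    return max((level_cost(levels, g) for g in goal), default=0)


def h_level_sum(levels: List[FrozenSet[str]], goal: Iterable[str]) -> float:
    return sum(level_cost(levels, g) for g in goal)


# Hypothetical graph: p holds initially, q first appears at P_1, r at P_2.
levels = [frozenset({"p"}), frozenset({"p", "q"}), frozenset({"p", "q", "r"})]
goal = ["q", "r"]
```

For this goal the max-level value is 2 and the level-sum value is 3, which illustrates why the sum can exceed the max when several sub-goals are each some distance away.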

  23. Planning graphs: constraints
     - Mutual exclusion (mutex) links:
       - Two actions at a given action level are mutually exclusive if no valid plan could possibly contain both.
       - Two propositions at a given proposition level are mutually exclusive if no valid plan could possibly make both true.
     - This structure helps reduce the search for a sub-graph of the Planning Graph that might correspond to a valid plan.

  24. Planning graphs: constraints
     - Mutexes between actions:
       - Inconsistent effects: one action negates an effect of the other.
       - Interference: one of the effects of one action is the negation of a precondition of the other.
       - Competing needs: the actions have mutually exclusive preconditions.
     - Mutexes between literals:
       - One of the literals is the negation of the other.
       - Inconsistent support: each possible pair of actions that could achieve them (at this level) is mutually exclusive.
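The three action-mutex conditions translate almost directly into set tests. In the sketch below, literals are strings with `~` marking negation, no-ops are modeled as ordinary actions, and `literal_mutexes` is assumed to be the set of mutex literal pairs from the preceding proposition level. All of these representation choices are assumptions for illustration.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    """Hypothetical action with literal preconditions and effects."""
    name: str
    pre: FrozenSet[str]
    eff: FrozenSet[str]


def negate(lit: str) -> str:
    return lit[1:] if lit.startswith("~") else "~" + lit


def inconsistent_effects(a: Action, b: Action) -> bool:
    return any(negate(e) in b.eff for e in a.eff)


def interference(a: Action, b: Action) -> bool:
    return (any(negate(e) in b.pre for e in a.eff)
            or any(negate(e) in a.pre for e in b.eff))


def competing_needs(a: Action, b: Action,
                    literal_mutexes: FrozenSet[FrozenSet[str]]) -> bool:
    return any(negate(p) == q or frozenset({p, q}) in literal_mutexes
               for p in a.pre for q in b.pre)


def actions_mutex(a: Action, b: Action,
                  literal_mutexes: FrozenSet[FrozenSet[str]] = frozenset()
                  ) -> bool:
    return (inconsistent_effects(a, b) or interference(a, b)
            or competing_needs(a, b, literal_mutexes))


# Cake domain actions plus two no-ops.
eat = Action("Eat", frozenset({"Have"}), frozenset({"~Have", "Eaten"}))
bake = Action("Bake", frozenset({"~Have"}), frozenset({"Have"}))
noop_have = Action("NoOp(Have)", frozenset({"Have"}), frozenset({"Have"}))
noop_not_eaten = Action("NoOp(~Eaten)", frozenset({"~Eaten"}),
                        frozenset({"~Eaten"}))
```

With these definitions, `Eat` is mutex with the no-op for `Have` (inconsistent effects and interference), while the two no-ops are compatible.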

  25. Planning graphs: constraints
     - [Figure: types of mutexes: Inconsistent Effects, Interference (Prec-Effect), Competing Needs, Inconsistent Support]

  26. Planning graph: Spare tire example

  27. Planning graph: more accurate heuristic
     - We want to define a more accurate heuristic using the mutexes:
       - $h^2$ (set-level heuristic): the level at which all the goal literals appear without any pair of them being mutually exclusive.
     - $h^1$ (max-level heuristic) is extended to $h^2$ by considering mutexes between all pairs of propositions.
     - $h^2$ is more useful than $h^1$: $0 \le h^1 \le h^2 \le h^*$

  28. Planning graph: more accurate heuristics
     - $h^2$ can be extended to $h^3$ by defining and considering inconsistencies of triplets of propositions.
     - In general:
       - $h^k$ are admissible.
       - $h^{k+1} \ge h^k$
       - Computing $h^k$ is $O(n^k)$ with $n$ propositions.
       - $k = 2$ is commonly used.

  29. GraphPlan: basic idea
     - Construct a graph that encodes constraints on plans.
     - Use this graph to constrain the search for a valid plan:
       - If a valid plan exists, it is a sub-graph of the Planning Graph.
         - Actions at the same level don't interfere.
         - Each action's preconditions are made true by the plan.
         - Goals are satisfied.
     - The planning graph can be built for each problem in polynomial time.

  30. GraphPlan: level off
     - Definition: the Planning Graph levels off if two consecutive proposition levels are identical (both literals and mutexes).
     - We will show that the set of literals never decreases across proposition levels and that mutexes don't reappear.

  31. GraphPlan: level off (Observation 1)
     - Literals monotonically increase.
     - Propositions are always carried forward by no-ops.
     - [Figure: planning graph over literals p, ¬q, q, ¬r, r and actions A, B]

  32. GraphPlan: level off (Observation 2)
     - Actions monotonically increase (once an action appears at a level, it appears at all subsequent levels).
     - If the preconditions of an action appear at one level, they appear at all subsequent levels, and thus so does the action.
     - [Figure: planning graph over literals p, ¬q, q, ¬r, r and actions A, B]

  33. GraphPlan: level off (Observation 3)
     - Proposition mutex relationships monotonically decrease.
     - Available actions are monotonically increasing, thus mutex relations between literals are decreasing. (When mutexes between literals are due to mutex relations between actions, they may be removed at later levels.)
     - [Figure: proposition levels over p, q, r with action A]

  34. GraphPlan: level off (Observation 4)
     - Action mutex relationships monotonically decrease.
     - Mutex relations between actions due to competing needs (when preconditions are not negations of each other) must be decreasing.
     - [Figure: levels over propositions p, q, r, s and actions A, B, C]

  35. GraphPlan Algorithm
     1) Graph levels are constructed until all goals are reached and not mutex (a necessary, but usually insufficient, condition for plan existence).
        - If the PG levels off before reaching this level, GraphPlan returns failure.
     2) ExtractSolution phase: search the PG for a valid plan.
     3) If none is found, add a level to the PG and go to step 2.
     - GraphPlan builds the graph forward and extracts the plan backwards.

  36. GraphPlan: "Extract Solution" phase
     - Some ways:
       - As a backward search: looks for actions that produce the goals while pruning as many of them as possible via incompatibility information.
       - As a heuristic search: computes an admissible heuristic for each state and then uses it during search.
       - As a CSP (related to the SATPlan algorithm):
         - Variables: a variable for an action at each level
         - Domain = {0, 1}
         - Constraints: mutexes

  37. Extract Solution: backward search
     - Start from the last level with agenda = goals.
     - Termination: $k = 0$
     - Action selection: at each level $k$, select any conflict-free subset of actions in $A_{k-1}$ whose effects cover the current goals.
       - If no such subset is found, return failure.
     - Preconditions of the selected actions become the new goals for the recursive call at level $k - 1$.

  38. GraphPlan: Example
     - [Figure: planning graph for an air-cargo problem. Literals include At(P, A), At(P, B), At(C, A), At(C, B), In(C, P) and their negations; actions include Fly(P, A, B), Fly(P, B, A), Load(C, P, A), Unload(C, P, A). Goal: At(C, B). Legend: A: Airport, P: Plane, C: Cargo]

  39. GraphPlan: Example
     - [Figure: same planning graph, next animation step]

  40. GraphPlan: Example
     - [Figure: planning graph extended by one level; Load(C, P, B) and Unload(C, P, B) also appear. Goal: At(C, B)]

  41. GraphPlan: Example
     - [Figure: same extended planning graph, next animation step]

  42. GraphPlan: Example
     - [Figure: same extended planning graph, next animation step]

  43. GraphPlan: Example
     - [Figure: same extended planning graph, next animation step]

  44. GraphPlan: heuristics for backward search
     - Pick first the goal literal with the highest level cost.
     - To achieve a literal, prefer actions with easier preconditions:
       - The sum (or max) of the level costs of its preconditions is smallest.

  45. Planning as a satisfiability problem
     - Bounded planning problem $(P, k)$:
       - $P$ is a planning problem.
       - Find a solution for $P$ of length $k$.
     1) Translate $(P, k)$ into a SAT problem.
     2) Solve the SAT problem.
     3) Convert the solution to a plan.
     - [Figure: pipeline: Planning problem → Translate to PL → Satisfiability problem → Satisfiability solver → Logical model → Decode solution → Solution plan]

  46. Pictorial view of fluents for (P, k)
     - [Figure: state fluents $s_0$ (initial state, $t = 0$) through $s_k$ (at $t = k$), alternating with action fluents $a_0$ through $a_{k-1}$ (at $t = k - 1$)]
     - A truth assignment selects a subset of these nodes to be true.
     - Propositional formulas correspond to valid plans.

  47. Translating PDDL to propositional logic
     - Initial state: conjunction of all true literals at time 0 (and negations of the literals not mentioned)
     - Goal state: conjunction of all goal literals at time $k$
       - Instantiate literals containing variables (replace with $\lor$ over constants).
     - Actions:
       - Successor-state axioms at each time up to $t$: $F^{t+1} \Rightarrow \mathit{ActionCausesF}^t \lor (F^t \land \neg \mathit{ActionCausesNotF}^t)$
       - Precondition axioms: $A^t \Rightarrow \mathrm{PRECOND}(A)^t$
       - Action exclusion axioms: $\neg A_i^t \lor \neg A_j^t$

  48. Translating PDDL to propositional logic: Example
     - Initial state: conjunction of all true literals at time 0 (and negations of the literals not mentioned)
       - Init(On(A, B) ∧ On(B, Table))
       - $\mathit{On}(A, B)^0 \land \mathit{On}(B, \mathit{Table})^0 \land \neg \mathit{On}(B, A)^0 \land \neg \mathit{On}(A, \mathit{Table})^0$
     - Goal state: conjunction of all goal literals at time $k$
       - Instantiate literals containing variables (replace with $\lor$ over constants).
       - Goal(On(B, A))
       - $\mathit{On}(B, A)^1$ (for $k = 1$)
     - [Figure: initial state with A on B; goal state with B on A]

  49. Translating PDDL to propositional logic: Example
     - Add successor-state axioms at each time up to $t$: $F^{t+1} \Rightarrow \mathit{ActionCausesF}^t \lor (F^t \land \neg \mathit{ActionCausesNotF}^t)$
       - Example: $\mathit{On}(B, A)^{t+1} \Rightarrow \mathit{Move}(B, \mathit{Table}, A)^t \lor [\mathit{On}(B, A)^t \land \neg \mathit{Move}(B, A, \mathit{Table})^t]$
     - Add precondition axioms: $A^t \Rightarrow \mathrm{PRECOND}(A)^t$
       - Example: $\mathit{Move}(B, \mathit{Table}, A)^t \Rightarrow \mathit{On}(B, \mathit{Table})^t \land \mathit{Clear}(B)^t \land \mathit{Clear}(A)^t$
       - Is it necessary to include effects: $A^t \Rightarrow \mathrm{EFFECT}(A)^{t+1}$?
     - Add action exclusion axioms: $\neg A_i^t \lor \neg A_j^t$
       - Example: $\neg \mathit{Move}(B, \mathit{Table}, A)^0 \lor \neg \mathit{MoveToTable}(A, B)^0$

  50. Propositional logic solver and decoding
     - Apply a SAT solver to the whole sentence:
       - The sentence is the conjunction of the encodings of the initial state, goals, successor-state axioms, precondition axioms, and action exclusion axioms.
     - If an assignment of truth values that satisfies the sentence is found, extract the action sequence.
       - This means $P$ has a solution of length $k$.
     - Extract solution: for $i = 0, \ldots, k - 1$, there is exactly one action that has been assigned "True".
       - This is the $i$-th action of the plan.

  51. SATPlan
     function SATPLAN(init, transition, goal, T_max) returns solution or failure
       inputs: init, transition, goal, which constitute a description of the problem
               T_max, an upper limit for plan length
       for t = 0 to T_max do
         cnf ← TRANSLATE_TO_SAT(init, transition, goal, t)
         model ← SAT_SOLVER(cnf)
         if model ≠ {} then
           return EXTRACT_SOLUTION(model)
       return failure
     - It is guaranteed to find the shortest plan if one exists with length at most T_max.

  52. SATPlan example
     - Domain:
       - Robot $R$
       - Two locations $L_1$, $L_2$
       - One operator, "move" the robot
     - Initial state: At($R$, $L_1$)
     - Goal: At($R$, $L_2$)
     - Action schema:
       - Move($r$, $l$, $l'$)
         PRECOND: At($r$, $l$)
         EFFECT: At($r$, $l'$) ∧ ¬At($r$, $l$)
     - [Figure: two locations $L_1$ and $L_2$]

  53. SATPlan example (translation to SAT)
     - Encode $(P, 1)$
     - Initial state: $\mathit{At}(R, L_1, 0) \land \neg \mathit{At}(R, L_2, 0)$
     - Goal: $\mathit{At}(R, L_2, 1)$
     - Action preconditions:
       - $\mathit{Move}(R, L_1, L_2, 0) \Rightarrow \mathit{At}(R, L_1, 0)$
       - $\mathit{Move}(R, L_2, L_1, 0) \Rightarrow \mathit{At}(R, L_2, 0)$
     - Action exclusion axiom: $\neg \mathit{Move}(R, L_2, L_1, 0) \lor \neg \mathit{Move}(R, L_1, L_2, 0)$

  54. SATPlan example (translation to SAT)
     - Fluents (successor-state axioms):
       - $\neg \mathit{At}(R, L_1, 0) \land \mathit{At}(R, L_1, 1) \Rightarrow \mathit{Move}(R, L_2, L_1, 0)$
       - $\neg \mathit{At}(R, L_2, 0) \land \mathit{At}(R, L_2, 1) \Rightarrow \mathit{Move}(R, L_1, L_2, 0)$
       - $\mathit{At}(R, L_1, 0) \land \neg \mathit{At}(R, L_1, 1) \Rightarrow \mathit{Move}(R, L_1, L_2, 0)$
       - $\mathit{At}(R, L_2, 0) \land \neg \mathit{At}(R, L_2, 1) \Rightarrow \mathit{Move}(R, L_2, L_1, 0)$

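  55. SATPlan example (translation to SAT)
     - SAT formula for $(P, 1)$:
       $\mathit{At}(R, L_1, 0) \land \neg \mathit{At}(R, L_2, 0) \land \mathit{At}(R, L_2, 1)$
       $\land\ [\mathit{Move}(R, L_1, L_2, 0) \Rightarrow \mathit{At}(R, L_1, 0)]$
       $\land\ [\mathit{Move}(R, L_1, L_2, 0) \Rightarrow \mathit{At}(R, L_2, 1)]$
       $\land\ [\neg \mathit{Move}(R, L_2, L_1, 0) \lor \neg \mathit{Move}(R, L_1, L_2, 0)]$
       $\land\ [\neg \mathit{At}(R, L_1, 0) \land \mathit{At}(R, L_1, 1) \Rightarrow \mathit{Move}(R, L_2, L_1, 0)]$
       $\land\ [\neg \mathit{At}(R, L_2, 0) \land \mathit{At}(R, L_2, 1) \Rightarrow \mathit{Move}(R, L_1, L_2, 0)]$
       $\land\ [\mathit{At}(R, L_1, 0) \land \neg \mathit{At}(R, L_1, 1) \Rightarrow \mathit{Move}(R, L_1, L_2, 0)]$
       $\land\ [\mathit{At}(R, L_2, 0) \land \neg \mathit{At}(R, L_2, 1) \Rightarrow \mathit{Move}(R, L_2, L_1, 0)]$
     - The above formula is converted to CNF and solved by a SAT solver.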

  56. SATPlan example (Extracting a plan)
     - The formula can be satisfied with $\mathit{Move}(R, L_1, L_2, 0) = \mathit{true}$.
     - $\Rightarrow \mathit{Move}(R, L_1, L_2, 0)$ is a solution (and the only one) for the planning problem with a 1-step plan.
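The claim about the robot example can be checked by brute force, since the formula for (P, 1) has only six propositional variables. The sketch below enumerates all truth assignments instead of calling a real SAT solver; the variable names are shorthand invented for this sketch, and the conjuncts follow the full formula listed for (P, 1).

```python
from itertools import product

# Shorthand variable names (assumed for this sketch).
VARS = ["At_L1_0", "At_L2_0", "At_L1_1", "At_L2_1",
        "Move_L1_L2_0", "Move_L2_L1_0"]


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def formula(v: dict) -> bool:
    a1_0, a2_0, a1_1, a2_1, m12, m21 = (v[name] for name in VARS)
    return (a1_0 and not a2_0                    # initial state
            and a2_1                             # goal at time 1
            and implies(m12, a1_0)               # precondition of Move(L1,L2)
            and implies(m12, a2_1)               # effect of Move(L1,L2)
            and (not m21 or not m12)             # action exclusion
            and implies(not a1_0 and a1_1, m21)  # successor-state axioms
            and implies(not a2_0 and a2_1, m12)
            and implies(a1_0 and not a1_1, m12)
            and implies(a2_0 and not a2_1, m21))


models = []
for values in product([False, True], repeat=len(VARS)):
    assignment = dict(zip(VARS, values))
    if formula(assignment):
        models.append(assignment)

# Action assignments that occur in some model: (Move_L1_L2_0, Move_L2_L1_0).
plans = {(m["Move_L1_L2_0"], m["Move_L2_L1_0"]) for m in models}
```

Every satisfying assignment sets `Move_L1_L2_0` to true and `Move_L2_L1_0` to false, so the extracted one-step plan is unique. Exhaustive enumeration is only practical for toy formulas like this one; real encodings would go to a SAT solver.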

  57. Layered Plans in SATPlan
     - Complete exclusion axiom (only one action at a time):
       - For all pairs of actions at each time step $i$: $\neg a_i \lor \neg b_i$
     - Partial exclusion axiom (more than one action could be taken at a time step):
       - For any pair of incompatible actions (recall from GraphPlan): $\neg a_i \lor \neg b_i$
       - Fewer time steps may be required (i.e. shorter formulas).
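The difference between the two exclusion schemes is just which pairs of actions receive a clause. The sketch below generates the clauses as pairs of negated literals; the `~name@t` string encoding, the function names, and the three-action example are assumptions made for illustration.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Clause = Tuple[str, str]


def complete_exclusion(actions: List[str], t: int) -> List[Clause]:
    """One clause (~a v ~b) for every pair of actions at time step t."""
    return [(f"~{a}@{t}", f"~{b}@{t}") for a, b in combinations(actions, 2)]


def partial_exclusion(actions: List[str], t: int,
                      incompatible: FrozenSet[FrozenSet[str]]) -> List[Clause]:
    """Clauses only for pairs declared incompatible (e.g. mutex actions)."""
    return [(f"~{a}@{t}", f"~{b}@{t}")
            for a, b in combinations(actions, 2)
            if frozenset({a, b}) in incompatible]


# Hypothetical example: only the two tire actions conflict with each other.
actions = ["RemoveSpare", "PutOnSpare", "OpenTrunk"]
incompatible = frozenset({frozenset({"RemoveSpare", "PutOnSpare"})})
all_pairs = complete_exclusion(actions, 0)
some_pairs = partial_exclusion(actions, 0, incompatible)
```

With n actions the complete scheme emits n(n-1)/2 clauses per time step, while the partial scheme emits one per incompatible pair and so permits compatible actions to share a step.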

  58. Solving SAT problem
     - Systematic search:
       - DPLL (Davis, Putnam, Logemann, Loveland)
     - Local search:
       - WalkSAT

  59. Partial order planning
     - Sock-shoe example: PDDL
       - Init()
       - Goal(RightShoeOn ∧ LeftShoeOn)
       - Action(RightShoe,
           PRECOND: RightSockOn
           EFFECT: RightShoeOn)
       - Action(RightSock,
           EFFECT: RightSockOn)
       - Action(LeftShoe,
           PRECOND: LeftSockOn
           EFFECT: LeftShoeOn)
       - Action(LeftSock,
           EFFECT: LeftSockOn)

  60. Partial Order Plans vs. Total Order Plans
     - [Figure: the partial-order plan for the sock-shoe problem (Start, Left Sock → Left Shoe, Right Sock → Right Shoe, Finish) next to the total-order plans that linearize it]
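The relationship between a partial-order plan and its total-order plans can be made concrete by enumerating linearizations. The sketch below filters all permutations of the four sock-shoe actions by the two ordering constraints; Start and Finish are left out because their positions are fixed. The brute-force enumeration is an illustrative choice suited only to small plans.

```python
from itertools import permutations
from typing import List, Tuple


def linearizations(actions: List[str],
                   orderings: List[Tuple[str, str]]) -> List[List[str]]:
    """All total orders of `actions` consistent with every (before, after)."""
    result = []
    for perm in permutations(actions):
        position = {a: i for i, a in enumerate(perm)}
        if all(position[a] < position[b] for a, b in orderings):
            result.append(list(perm))
    return result


actions = ["LeftSock", "LeftShoe", "RightSock", "RightShoe"]
orderings = [("LeftSock", "LeftShoe"), ("RightSock", "RightShoe")]
total_orders = linearizations(actions, orderings)
```

Two independent constraints each halve the 24 permutations, leaving six total-order plans for this one partial-order plan.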

  61. Partial Order Planning
     - Two initial actions:
       - Start
         - No precondition
         - All of the "Initial State" as its effects
       - Finish
         - All of the "Goal State" as its precondition
         - No effect

  62. Partial plan definition
     - A partial plan is a tuple $\langle A, O, L \rangle$ where:
       - $A$: set of actions in the plan (plan steps)
         - Initially {Start, Finish}
       - $O$: set of orderings between actions
         - Initially {Start < Finish}
       - $L$: set of causal links
         - Initially {}

  63. Causal links and threats
     - Causal link: serves to record the purpose of steps in the plan.
       - In $A_i \xrightarrow{c} A_j$, the purpose of $A_i$ is to achieve the precondition $c$ of $A_j$.
     - Threat: causal links are used to detect when a newly introduced action interferes with past decisions.
       - $A_k$ threatens $A_i \xrightarrow{c} A_j$ when:
         - $A_k$ can come between $A_i$ and $A_j$ (that is, $O \cup \{A_i < A_k < A_j\}$ is consistent), and
         - $A_k$ has $\neg c$ as an effect.

  64. Resolving Threats
     - Resolve threat: ensure that the threatening step is ordered to come before or after the protected link (here $S_3$ threatens the link from $S_1$ to $S_2$).
       - Demotion (placed before): add $S_3 < S_1$ to $O$
       - Promotion (placed after): add $S_2 < S_3$ to $O$
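Threat detection and resolution both come down to asking whether a set of ordering constraints is still acyclic after adding a few pairs. The sketch below represents orderings as a set of (before, after) pairs and checks consistency with a depth-first cycle test. The step names, the `~c` encoding of a negated condition, and the function names are assumptions for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Ordering = Tuple[str, str]          # (before, after)
Link = Tuple[str, str, str]         # (producer, condition, consumer)


def is_consistent(orderings: FrozenSet[Ordering]) -> bool:
    """True if the ordering constraints contain no cycle."""
    graph: Dict[str, Set[str]] = {}
    for before, after in orderings:
        graph.setdefault(before, set()).add(after)
        graph.setdefault(after, set())
    state: Dict[str, int] = {}      # 1 = on current DFS path, 2 = finished

    def has_cycle(node: str) -> bool:
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1:
                return True
            if nxt not in state and has_cycle(nxt):
                return True
        state[node] = 2
        return False

    return not any(n not in state and has_cycle(n) for n in graph)


def threatens(step: str, link: Link, orderings: FrozenSet[Ordering],
              effects: Dict[str, Set[str]]) -> bool:
    producer, cond, consumer = link
    if step in (producer, consumer):
        return False
    between = orderings | {(producer, step), (step, consumer)}
    return ("~" + cond) in effects[step] and is_consistent(between)


def resolutions(step: str, link: Link,
                orderings: FrozenSet[Ordering]) -> List[str]:
    producer, _, consumer = link
    options = {"demotion": (step, producer), "promotion": (consumer, step)}
    return [name for name, pair in options.items()
            if is_consistent(orderings | {pair})]


# S3 deletes c, which S1 produces for S2; S3 is already ordered before S2.
orderings = frozenset({("S1", "S2"), ("S3", "S2")})
effects = {"S1": {"c"}, "S2": set(), "S3": {"~c"}}
link = ("S1", "c", "S2")
```

In this configuration `S3` threatens the link, and only demotion is available, because promotion would require `S2 < S3` and contradict the existing `S3 < S2`.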

  65. Spare tire example
     - Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))
     - Goal(At(Spare, Axle))
     - Action(Remove(obj, loc),
         PRECOND: At(obj, loc)
         EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))
     - Action(PutOn(t, Axle),
         PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle)
         EFFECT: ¬At(t, Ground) ∧ At(t, Axle))
     - Action(LeaveOvernight,
         PRECOND:
         EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))

  66. Spare tire example

  67. Spare tire example

  68. Spare tire example

  69. Spare tire example

  70. Spare tire example

  71. POP
     - Agenda: open preconditions (along with the actions requiring them)
       - Initially all preconditions of Finish
     function POP(<A, O, L>, agenda)
       if agenda = {} then return <A, O, L>
       (q, A_need) ← select a goal from agenda
       a ← choose an action that adds q
       if no such action then return failure
       update <A, O, L> and agenda
       add consistent ordering constraints for causal link protection
       if no constraint is consistent then return failure
       POP(<A, O, L>, agenda)

  72. POP algorithm (more details)
     POP(<A, O, L>, agenda)
     1. Termination: if agenda is empty, return <A, O, L>.
     2. Goal selection: let <Q, A_need> be a pair on the agenda.
     3. Action selection: let A_add = choose an action that adds Q; if no such action exists, then return failure. Let $L' = L \cup \{A_{add} \xrightarrow{Q} A_{need}\}$ and $O' = O \cup \{A_{add} < A_{need}\}$. If A_add is newly instantiated, then $A' = A \cup \{A_{add}\}$ and $O' = O \cup \{A_0 < A_{add} < A_\infty\}$ (otherwise, let $A' = A$).
     4. Updating of goal set: let agenda' = agenda − {<Q, A_need>}. If A_add is newly instantiated, then for each conjunct Q_i of its precondition, add <Q_i, A_add> to agenda'.
     5. Causal link protection: for every action A_t that might threaten a causal link $A_p \xrightarrow{p} A_c$, add a consistent ordering constraint, either
        (a) Demotion: add A_t < A_p to O'
        (b) Promotion: add A_c < A_t to O'
        If neither constraint is consistent, then return failure.
     6. Recursive invocation: POP(<A', O', L'>, agenda')

  73. Shopping example
     - Init(At(Home) ∧ Sells(HWS, Drill) ∧ Sells(SM, Milk) ∧ Sells(SM, Banana))
     - Goal(Have(Drill) ∧ Have(Milk) ∧ Have(Banana) ∧ At(Home))
     - Action(Go(there),
         PRECOND: At(here)
         EFFECT: At(there) ∧ ¬At(here))
     - Action(Buy(x),
         PRECOND: At(store) ∧ Sells(store, x)
         EFFECT: Have(x))

  74. Shopping example
     - Many possible ways to elaborate the initial plan:
       - Three Buy actions for the three Have preconditions of the Finish action
       - The Sells precondition of Buy
     - Bold arrows: causal links, protection of a precondition
     - Light arrows: ordering constraints

  75. Shopping example

  76. Shopping example

  77. Shopping example

  78. Shopping example
