ARTIFICIAL INTELLIGENCE: Classical planning (goal-directed)
Utrecht University, INFOB2KI 2019-2020, The Netherlands
Lecturer: Silja Renooij
These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
Outline
We consider single-agent, goal-directed planning and assume the environment to be static and deterministic.
Planning languages/architectures:
– STRIPS (1971)
– Goal Oriented Action Planning (GOAP)
– Hierarchical planning: NOAH/HTN
Applications
Mobile robots
– An initial motivator, and still being developed
Simulated environments
– Goal-directed agents for training or games
Web and grid environments
– Composing queries or services
– Workflows on a computational grid
Managing crisis situations
– E.g. oil spills, forest fires, urban evacuation, incidents in factories, …
And many more
– Factory automation, flying autonomous spacecraft, …
Shakey (1966-1972)
Shakey is a robot that plans moving from one location to another, turning light switches on and off, opening and closing doors, climbing up and down rigid objects, and pushing movable objects around, using…?
a) STRIPS
b) GOAP
c) HTN
d) SHOP
Goal Oriented Behavior
An agent has one or more (internal) goals.
Goals are used as specific targets to plan actions for.
Goals are explicit and can be updated, reasoned about, etc.
There can be a separate level of behavior to manage the goals
– e.g. preferences, importance, …
Generating plans
Given (similar to before):
– A way to describe the world
– An initial state of the world
– A goal description
– A set of possible actions to change the world
Find:
– A prescription for actions that is guaranteed to change the initial state into one that satisfies the goal(s)
Difference with before: we are not optimizing an evaluation function.
How to choose actions?
Actions
Actions contain domain knowledge; they are not simply a mapping state → state.
E.g.
Precondition: in(house, fire)
Action: extinguish_fire
Postcondition: ¬in(house, fire)
Actions & change
Actions change the world, but only partly.
When considering/comparing possible actions, we want to:
– know how an action will alter the world
– keep track of the history of world states (have we been here before?)
– answer questions about potential world states (what would happen if…?)
Planning sequences of actions: a search problem (?)
Use e.g. IDA* (Iterative Deepening A*) to create all possible sequences of actions for a single goal.
The heuristic should capture how far a sequence of actions is from achieving the goal.
Quite cumbersome: we do not use (available) information about the actions, their effects and their relations.
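A minimal sketch of such a goal-distance heuristic: count the goal facts not yet true in the current state (the fact strings below are illustrative, not part of the slides).

```python
# A possible heuristic for searching over action sequences:
# the number of goal facts that do not yet hold in the current state.
# (Illustrative only; facts are modeled as plain strings.)
def h(state, goals):
    return sum(1 for g in goals if g not in state)

# From at(home) with no beer: two goal facts, one still unsatisfied.
print(h({"at(home)"}, {"at(home)", "have(beer)"}))  # 1
```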
Classical Planning
The simplest possible planning problem. Determined by:
– a unique, known initial state
– durationless actions
– deterministic actions
– actions taken one at a time
– a single agent
World states are typically complex: what do we need to represent, and how?
Frame problem
How can we efficiently represent everything that has not changed? ("frame of reference")
Example: I go from home to the store, creating a new world state S'. In S':
– My friend is still at home
– The store still sells chips
– My age is still the same
– Los Angeles is still the largest city in California…
Why didn't this problem occur with path planning? (complete state info, rather than persistent domain knowledge)
Ramification problem
Do we want to represent every change to the world in an action definition, even indirect effects (= ramifications)?
Example: I go from home to the store, creating a new situation S'. In S':
– I am now in Marina del Rey
– The number of people in the store went up by 1
– The contents of my pockets are now in the store…
Linear planning
Linear planning
A linear planner is a classical planner that assumes:
– no distinction between the importance of goals
– all (sub)goals are independent → (sub)goals can be achieved in arbitrary order
As a result, plans that achieve subgoals are combined by placing all steps of one subplan before or after all steps of the others (= non-interleaved).
STRIPS (Fikes and Nilsson, 1971)
A non-hierarchical, linear planner.
Idea:
– A state (or world model) represents a large number of facts and relations
– Use formulas in first-order predicate logic
– Use theorem proving within states, for action preconditions and goal tests
– Use goal stack planning for going through the state space
More recent: PDDL (just the language; includes, among others, negations)
Alternatives: a backward-chaining algorithm for searching through the state space & planning graphs (not discussed)
(Generalised) STRIPS
Problem space:
– Initial world model: a set of well-formed formulas (wffs: conjunctions of literals)
– A set of actions, each represented with:
  - Preconditions (list of predicates that should hold)
  - Delete list (list of predicates that will become invalid)
  - Add list (list of predicates that will become valid)
– A goal condition, stated as a wff
Actions thus allow variables (we consider a proposition to be a special case of a predicate without variables).
Example problem
Initial state: at(home), ¬have(beer)
Goal: have(beer), at(home)
Actions:
Go(X, Y):
  Pre: at(X)
  Del: at(X)
  Add: at(Y)
Buy(X):
  Pre: at(store)
  Add: have(X)
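The example can be written down directly as grounded STRIPS operators. The sketch below (Python; names and encoding are assumptions) applies the obvious plan using the standard delete-then-add semantics:

```python
# Grounded STRIPS operators for the beer example:
# name -> (preconditions, delete list, add list), facts as strings.
actions = {
    "go(home,store)": ({"at(home)"},  {"at(home)"},  {"at(store)"}),
    "go(store,home)": ({"at(store)"}, {"at(store)"}, {"at(home)"}),
    "buy(beer)":      ({"at(store)"}, set(),         {"have(beer)"}),
}

def apply_action(state, name):
    pre, delete, add = actions[name]
    assert pre <= state, f"precondition of {name} not satisfied"
    return (state - delete) | add   # delete first, then add

state = {"at(home)"}                # initial world model (¬have(beer) is implicit)
for step in ["go(home,store)", "buy(beer)", "go(store,home)"]:
    state = apply_action(state, step)

print(state == {"at(home)", "have(beer)"})  # True: goal satisfied
```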
Planning with STRIPS: example
Start from the goal and reason backwards:
S (goal): have(beer), at(home)
S-1: have(beer), at(store) → go(store,home) → S
S-2: at(store) → buy(beer) → S-1
S-3 (initial): at(home) → go(home,store) → S-2
Goal stack planning with STRIPS
Search strategy idea:
– Identify differences between the present world model and the goal
– Identify actions that are relevant for reducing the differences
– Satisfy preconditions: turn preconditions of relevant actions into new subgoals; solve subproblems
– Use a stack to push (and pop) preconditions and actions
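For grounded actions this strategy can be sketched as follows: a minimal goal-stack loop without backtracking, reusing the beer example (the encoding is an assumption, not the original STRIPS implementation).

```python
# Goal-stack planning over grounded STRIPS actions (beer example).
# Stack items are ("goal", frozenset of facts) or ("act", action name).
ACTIONS = {  # name: (preconditions, delete list, add list)
    "go(home,store)": (frozenset({"at(home)"}),  frozenset({"at(home)"}),  frozenset({"at(store)"})),
    "go(store,home)": (frozenset({"at(store)"}), frozenset({"at(store)"}), frozenset({"at(home)"})),
    "buy(beer)":      (frozenset({"at(store)"}), frozenset(),              frozenset({"have(beer)"})),
}

def goal_stack_plan(state, goals):
    plan, stack = [], [("goal", frozenset(goals))]
    while stack:
        kind, item = stack.pop()
        if kind == "act":                 # all preconditions solved: execute
            pre, delete, add = ACTIONS[item]
            state = (state - delete) | add
            plan.append(item)
        elif item <= state:               # goal (conjunction) already holds
            continue
        elif len(item) > 1:               # split a conjunction; re-check it later
            stack.append(("goal", item))
            stack.extend(("goal", frozenset({g})) for g in item)
        else:                             # pick the first action achieving the goal
            (g,) = item
            name = next(a for a, (_, _, add) in ACTIONS.items() if g in add)
            stack.append(("act", name))
            stack.append(("goal", ACTIONS[name][0]))
    return plan, state

plan, final = goal_stack_plan({"at(home)"}, {"have(beer)", "at(home)"})
print(plan)  # ['go(home,store)', 'buy(beer)', 'go(store,home)']
```

Note how re-pushing the conjunctive goal before its parts makes the planner detect, and repair, a subgoal that a later subplan clobbered (at(home) is undone by go(home,store) and re-achieved at the end).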
STRIPS & frame problem
How can we efficiently represent everything that hasn't changed?
Example: I go from home to the store, creating a new situation S'. In S':
– The store still sells chips
– My age is still the same
– Los Angeles is still the largest city in California…
STRIPS solution for simple actions: every satisfied formula not explicitly deleted by an operator continues to hold after the operator is executed.
STRIPS & ramification problem
Do we want to represent every change to the world in an action definition?
Example: I go from home to the store, creating a new situation S'. In S':
– I am now in Marina del Rey
– The number of people in the store went up by 1
– The contents of my pockets are now in the store…
STRIPS solution: some facts are inferred within a world state
– e.g. the number of people in the store
– 'inferred' facts are not carried over and must be re-inferred
– Avoids making mistakes, but perhaps inefficient
More questions about STRIPS
What if the order of goals at(home), have(beer) was reversed?
– Would require re-planning a goal that already seemed fulfilled; is that guaranteed?
Is STRIPS complete (does it always find a plan if there is one)?
– No (sometimes fixable through a conjunction of goals, but computationally inefficient)
When STRIPS returns a plan, is it sound (always correct)? And is the plan returned efficient?
– It is sound, but 'detours' (unnecessary series of operations) are possible
Example: blocks world
Initial state I: (on-table A) (on C A) (on-table B) (clear B) (clear C)
(i.e. C sits on A; A and B are on the table)
Goal: (on A B) (on B C)
(i.e. A on B on C)
Action: (put-on X Y)
  Pre: (clear X) (clear Y)
  Add: (on X Y)
  Del: (clear Y)
Sussman anomaly: a problem with non-interleaved planning
Example: blocks world (Sussman anomaly)
Goal: (on A B) (on B C)
First pursue subgoal (on A B): move C aside, then put A on B.
This accomplishes the subgoal, but the agent cannot now pursue subgoal (on B C) without undoing (on A B).
Example: blocks world (Sussman anomaly)
Goal: (on A B) (on B C)
First pursue the second subgoal (on B C): put B on C (which still sits on A).
Again, the planner cannot pursue subgoal (on A B) without undoing subgoal (on B C).
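The interference can be checked mechanically with the slide's put-on operator. A Python sketch (the state encoding is an assumption): after achieving (on A B) first, the operator has deleted (clear B), so (on B C) is blocked.

```python
# Blocks-world facts as strings; put-on exactly as on the slide:
# Pre (clear X)(clear Y), Add (on X Y), Del (clear Y).
def applicable(state, pre):
    return pre <= state

def put_on(state, x, y):
    pre = {f"(clear {x})", f"(clear {y})"}
    assert applicable(state, pre)
    return (state - {f"(clear {y})"}) | {f"(on {x} {y})"}

# State after the subplan for (on A B): A on B, B and C on the table.
state = {"(on A B)", "(on-table B)", "(on-table C)", "(clear A)", "(clear C)"}

# (on B C) would need put-on B C, but B is no longer clear:
print(applicable(state, {"(clear B)", "(clear C)"}))  # False
```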
Planning in Games
Goal oriented action planning
GOAP: a simplified, STRIPS-like planning architecture
– specifically designed for real-time control of autonomous character behavior in games
– used in FPSs since ~2005
Goals (a.k.a. motives):
– have different levels of importance (insistence)
– higher insistence affects behavior more
– a character tries to fulfill goals by reducing their insistence
GOAP
Goals can be seen as motives for taking action. Example goals: Eat, Sleep, Kill.
Priority for fulfilling goals is given by insistence I.
Example: goal g = Eat, I(g) = 4 (I ∈ {0,…,5}, 5 highest)
Actions specify to which goal they contribute and how they affect insistence.
Example: action get-food: I(Eat) ← I(Eat) − 3
Different action-selection methods are possible.
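One simple selection method that fits this insistence model: simulate each action's effect and pick the one that leaves the lowest total insistence. A sketch; only Eat and get-food come from the slide, the other goals, actions and numbers are made up for illustration.

```python
# Insistence-driven action selection, a minimal GOAP-style sketch.
goals = {"Eat": 4, "Sleep": 3, "Kill": 1}   # insistence I in {0,...,5}
effects = {                                  # change to insistence per goal
    "get-food": {"Eat": -3},                 # from the slide
    "nap":      {"Sleep": -2},               # assumed example action
    "patrol":   {"Kill": -1, "Sleep": +1},   # assumed example action
}

def after(goals, eff):
    # Apply an action's insistence changes, clamped to the 0..5 scale.
    return {g: max(0, min(5, i + eff.get(g, 0))) for g, i in goals.items()}

def choose_action(goals, effects):
    # Pick the action minimising total remaining insistence.
    return min(effects, key=lambda a: sum(after(goals, effects[a]).values()))

print(choose_action(goals, effects))  # get-food
```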