Outline
• General Motivation and Goal
• Brief Introduction to AI Planning
• Planning Programs to Specify and Control Agents' Behaviour
• Building Planning Programs: Solutions to the Realization Problem
  • LTL Synthesis
  • Planning-based Approach
• Conclusions
Automated Planning: Model-based

[Diagram: a PLANNING SYSTEM (the solver) takes as input the initial state, the operators/actions (preconditions + effects), and the goal state, and outputs a plan that achieves the goal from the initial state using the operators.]

• Domain-independent: no need to specify how to solve a problem.
• Planning languages: specify operators (atomic actions), the initial state, and goals (e.g., PDDL).
• Planning algorithms: search heuristics are automatically extracted from the model!
• Tremendous improvements (in modelling and algorithms) over the last 15 years.
Environment Specified via a PDDL Domain Model

(define (domain academic-life)
  (:types location fuel_level)
  (:constants home parking dept pub - location
              empty low high - fuel_level)
  (:predicates (my-loc ?place - location)
               (car-loc ?place - location)
               (car-fuel ?level - fuel_level)
               (driven) ...)
  (:action go-by-car
    :parameters (?source ?place - location)
    :precondition (and (my-loc ?source) (car-loc ?source)
                       (not (car-fuel empty)))
    :effect (and (my-loc ?place) (car-loc ?place) (driven)
                 (not (my-loc ?source)) (not (car-loc ?source))
                 ;; driving consumes one fuel level
                 (when (car-fuel high)
                       (and (car-fuel low) (not (car-fuel high))))
                 (when (car-fuel low)
                       (and (car-fuel empty) (not (car-fuel low))))))
  (:action refuel-car
    :parameters (?place - location)
    :precondition (and (car-loc ?place) (my-loc ?place))
    :effect (and (car-fuel high)
                 (not (car-fuel low)) (not (car-fuel empty))))
  ...)
State Model for (Classical) AI Planning

Definition (Planning State Model)
• A finite and discrete state space S (a state = a set of ground predicates).
• A known initial state s₀ ∈ S.
• A set S_G ⊆ S of goal states.
• A set A of actions.
• Actions A(s) ⊆ A applicable in each s ∈ S.
• A deterministic state transition function s′ = f(a, s) for a ∈ A(s).
• Action costs c(a, s) > 0.

A solution is a sequence of applicable actions that maps s₀ into S_G. It is optimal if it minimizes the sum of action costs (e.g., the number of steps). The resulting controller is open-loop (no feedback).
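To make the definition concrete, here is a minimal sketch (in Python, not from the slides) of the state model as an interface, together with the solution check just described; all names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List

State = FrozenSet[str]   # a state is a set of ground predicates
Action = str

@dataclass
class ClassicalModel:
    init: State                                      # known initial state s0
    is_goal: Callable[[State], bool]                 # membership in S_G
    applicable: Callable[[State], Iterable[Action]]  # A(s)
    result: Callable[[Action, State], State]         # deterministic f(a, s)
    cost: Callable[[Action, State], float]           # c(a, s) > 0

def is_solution(model: ClassicalModel, plan: List[Action]) -> bool:
    """A plan solves the model iff each action is applicable in turn
    and the final state is a goal state (open-loop execution)."""
    s = model.init
    for a in plan:
        if a not in set(model.applicable(s)):
            return False
        s = model.result(a, s)
    return model.is_goal(s)
```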
Uncertainty but No Feedback: Conformant Planning

Differences with classical planning:
• a set of possible initial states S₀ ⊆ S;
• a non-deterministic transition function F(a, s) ⊆ S for a ∈ A(s).

Uncertainty but no sensing; hence the controller is still open-loop. A solution is (still) an action sequence that achieves the goal in spite of the uncertainty, i.e., for any possible initial state and any possible transition.
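A conformant plan can be checked by tracking the belief state, i.e., the set of states the world might currently be in. A minimal sketch, reusing the State and Action aliases from the previous block; the parameter names are illustrative.

```python
from typing import Callable, Iterable, List, Set

def is_conformant_solution(
    init_states: Set[State],                          # possible initial states S0
    plan: List[Action],
    applicable: Callable[[State], Iterable[Action]],  # A(s)
    outcomes: Callable[[Action, State], Set[State]],  # F(a, s), a subset of S
    is_goal: Callable[[State], bool],
) -> bool:
    """Check a conformant plan against every initial state and every
    non-deterministic outcome, with no sensing along the way."""
    belief = set(init_states)
    for a in plan:
        # the plan must be executable in every state we might be in
        if any(a not in set(applicable(s)) for s in belief):
            return False
        belief = {s2 for s in belief for s2 in outcomes(a, s)}
    # the goal must hold no matter which outcomes actually occurred
    return all(is_goal(s) for s in belief)
```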
Planning with Sensing

Like conformant planning plus:
• a sensor model O(a, s) mapping actions and states into observations.

Solutions can be expressed in many forms:
• policies mapping belief states (sets of states) into actions;
• contingent trees;
• finite-state controllers;
• programs, etc.

The probabilistic version is known as a POMDP: Partially Observable Markov Decision Process.
Models, Languages, Control, Scalability

• A planner is a solver over a class of models; it takes a model description and computes the corresponding control (a plan).
• Many dimensions/models: plan adaptation, soft goals, constraints, multi-agent, temporal, numerical, continuous change, uncertainty, feedback, etc.
• Models are described in compact form by means of planning languages (e.g., PDDL).
• Different types of control:
  • open-loop vs. closed-loop (feedback used);
  • off-line vs. on-line (full policies vs. lookahead).
• All of these models are computationally intractable; the key challenge is scalability:
  • how not to be defeated by problem size;
  • need to use heuristics and exploit the structure of problems.
Combinatorial Explosion: Example

[Figure: the state graph of a Blocks World instance, from an initial configuration (Init) to a goal configuration (Goal); each node is a block arrangement and each edge a block move.]

• Classical problem: move blocks to transform Init into Goal.
• The problem becomes path finding in a directed graph.
• The difficulty is that the graph size is exponential in the number of blocks.
• The problem is simple for a specialized Blocks World solver but difficult for a general solver.
Dealing with the Combinatorial Explosion: Heuristics

[Figure: the same Blocks World state graph, now annotated with heuristic values h(s), ranging from h=3 near Init down to h=0 at Goal.]

Plans can be found/constructed with heuristic search:
• Heuristic values h(s) estimate the "cost" from s to the goal; they provide a sense of direction.
• They are derived automatically from the problem representation!
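As an illustration, the sketch below runs greedy best-first search guided by a goal-count heuristic (the number of goal predicates still false). This is only the simplest heuristic extractable from the representation; real planners use much more informed ones (e.g., delete-relaxation heuristics). It builds on the ClassicalModel sketch above, with the goal given as a set of predicates (so S_G = {s : goal ⊆ s}).

```python
import heapq
from itertools import count

def goal_count_h(state: State, goal: State) -> int:
    """A simple heuristic extracted from the representation:
    the number of goal predicates not yet true in the state."""
    return len(goal - state)

def greedy_best_first(model: ClassicalModel, goal: State):
    """Expand states in order of increasing h(s); h provides the
    'sense of direction' toward the goal."""
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(goal_count_h(model.init, goal), next(tie), model.init, [])]
    seen = {model.init}
    while frontier:
        _, _, s, plan = heapq.heappop(frontier)
        if goal <= s:            # all goal predicates hold
            return plan
        for a in model.applicable(s):
            s2 = model.result(a, s)
            if s2 not in seen:
                seen.add(s2)
                heapq.heappush(
                    frontier,
                    (goal_count_h(s2, goal), next(tie), s2, plan + [a]))
    return None                  # the goal is unreachable
```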
Status of Model-based Planning

• Classical planners work reasonably well:
  • Large problems are solved very fast (non-optimally).
  • They exploit different techniques: heuristics, landmarks, helpful actions.
  • Other approaches such as SAT and local search (LPG) work well too.
• The model is simple but useful:
  • Operators need not be primitive; they can be policies themselves.
  • Fast closed-loop replanning is sometimes able to cope with uncertainty.
• Beyond classical planning: temporal, numerical, soft goals, constraints, multi-agent, incomplete information, uncertainty, ...
  • Top-down approach: several general native solvers.
  • Bottom-up approach: some transformations into classical planning.
• Other approaches: specialised algorithms/data structures.
  • E.g., path planning (many applications in robotics and games).
Outline
• General Motivation and Goal
• Brief Introduction to AI Planning
• Planning Programs to Specify and Control Agents' Behaviour
• Building Planning Programs: Solutions to the Realization Problem
  • LTL Synthesis
  • Planning-based Approach
• Conclusions
What we are after II

A new way of specifying & building controllers for agents' task-oriented behavior (a hybrid of AI planning and agent programming) such that it:

1. Builds on declarative goals φ₁, φ₂, ...
   • achievement & maintenance goal types, maybe others...
2. Supports automatic plan synthesis for goals.
3. Allows relating/linking goals.
   • "high-level" know-how;
   • e.g., after φ₁, it "makes sense" to achieve φ₂ or φ₃, but not φ₄.
4. Provides continuous control/behavior.
   • may never stop...
5. Allows for external decision making/input.
   • At run time it is decided whether the system should achieve φ₁ or φ₂ at a given point.
Agent Planning Programs: Academic Domain Example

[Figure: an example agent planning program for the academic domain, detailed on the following slides.]
Agent Planning Programs

• Finite-state program (including conditionals and loops).
• Non-controllable transitions: selected by an external entity.
• Has an initial state and is possibly non-terminating.
• Atomic instructions: requests to "achieve goal φ while maintaining goal ψ".
• Meant to run in a dynamic domain where agents act: the environment.

[Figure: a program over states t₀, t₁, t₂; at each state the agent chooses a goal to pursue:]
  G₁ (t₀ → t₁): achieve MyLoc(dept) while maintaining ¬Fuel(empty)
  G₂ (t₁ → t₀): achieve MyLoc(home) ∧ CarLoc(home) while maintaining ¬Fuel(empty)
  G₃ (t₁ → t₂): achieve MyLoc(pub) while maintaining ¬Fuel(empty)
  G₄ (t₀ → t₂): achieve MyLoc(pub)
  G₅ (t₂ → t₀): achieve MyLoc(home) while maintaining ¬Driven
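For concreteness, the program above can be encoded as a plain labeled transition system, as in the illustrative Python sketch below (the edge endpoints are read off the figure; the encoding itself is an assumption, not from the slides).

```python
from typing import Dict, List, NamedTuple

class Request(NamedTuple):
    name: str      # e.g. "G1"
    achieve: str   # achievement goal phi
    maintain: str  # maintenance goal psi ("true" when absent, as in G4)
    target: str    # program state reached once phi has been achieved

# The academic-life program: at each program state the external agent
# picks one of the outgoing requests (non-controllable transitions).
APP: Dict[str, List[Request]] = {
    "t0": [Request("G1", "MyLoc(dept)", "not Fuel(empty)", "t1"),
           Request("G4", "MyLoc(pub)", "true", "t2")],
    "t1": [Request("G2", "MyLoc(home) and CarLoc(home)", "not Fuel(empty)", "t0"),
           Request("G3", "MyLoc(pub)", "not Fuel(empty)", "t2")],
    "t2": [Request("G5", "MyLoc(home)", "not Driven", "t0")],
}
```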
Environment for Planning Programs (a simple example)

Planning programs are executed in an environment.

State Propositions
  CarLoc, MyLoc : {home, parking, dept, pub}
  Fuel : {empty, low, high}
  Driven : {true, false}

Initial State
  CarLoc = home ∧ MyLoc = home ∧ Fuel = high ∧ Driven = false

Operators
  goByCar(x) with x ∈ {home, parking, dept, pub}   (deterministic action)
    prec: MyLoc = CarLoc ∧ Fuel ≠ empty
    post: MyLoc = CarLoc = x; Driven = true;
          (when (Fuel high) (Fuel low));
          (when (Fuel low) (Fuel empty))

  A non-deterministic variant of goByCar(x), where driving may or may not consume a fuel level (non-deterministic action):
    post: MyLoc = CarLoc = x; Driven = true;
          (when (Fuel high) (oneof (Fuel high) (Fuel low)));
          (when (Fuel low) (oneof (Fuel low) (Fuel empty)))
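The non-deterministic goByCar is a transition function F(a, s) ⊆ S returning a set of possible successors. A small illustrative sketch (the dict-based state encoding is an assumption, not from the slides):

```python
from typing import Dict, List

EnvState = Dict[str, object]   # variable -> value, e.g. {"Fuel": "high", ...}

def go_by_car(state: EnvState, x: str) -> List[EnvState]:
    """F(goByCar(x), s): the 'oneof' effects mean driving may or may
    not consume a fuel level, so there can be several successors."""
    assert state["MyLoc"] == state["CarLoc"] and state["Fuel"] != "empty"
    fuel_outcomes = {"high": ["high", "low"], "low": ["low", "empty"]}
    return [{**state, "MyLoc": x, "CarLoc": x, "Driven": True, "Fuel": f}
            for f in fuel_outcomes[state["Fuel"]]]

# From the initial state, goByCar(parking) has two possible outcomes:
s0 = {"MyLoc": "home", "CarLoc": "home", "Fuel": "high", "Driven": False}
print(go_by_car(s0, "parking"))   # Fuel ends up either high or low
```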
Environment Specified via a PDDL Domain Model

(The same academic-life PDDL domain model shown earlier.)
Agent Planning Programs: Controller Synthesis

[Diagram: the SYNTHESIS SYSTEM (APP solver) takes as input the Agent Planning Program (the possible requests) and the Environment (the dynamic domain), and outputs a Controller that "realizes" the APP in the environment: at run time it receives requests and answers them with plans.]
Agent Planning Programs: Execution and Control

Execution cycle:
1. The environment is in a state s and the planning program in a state t.
2. The user requests a legal transition from state t to state t′ to achieve goal φ.
3. The controller deploys a plan π that achieves φ from environment state s.
4. The environment and planning-program states are updated.
5. The cycle restarts.

The synthesis task: find those plans!
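Putting the pieces together, a controller is essentially a lookup table from (environment state, program state, request) to plans, and the cycle above is a loop around it. A schematic sketch, reusing the Request tuple from the earlier block; choose_request and execute stand in for the external agent and the environment, and the string-typed states are illustrative:

```python
from typing import Callable, Dict, List, Tuple

# (environment state, program state, request name) -> plan,
# exactly the kind of table shown on the next slide.
Controller = Dict[Tuple[str, str, str], List[str]]

def execution_cycle(controller: Controller,
                    env_state: str,
                    prog_state: str,
                    choose_request: Callable[[str], Request],
                    execute: Callable[[List[str], str], str]) -> None:
    """Run the (possibly non-terminating) execution cycle."""
    while True:
        req = choose_request(prog_state)                       # step 2: agent's choice
        plan = controller[(env_state, prog_state, req.name)]   # step 3: look up a plan
        env_state = execute(plan, env_state)                   # deploy the plan
        prog_state = req.target                                # step 4: update both states
```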
A Realization of a Planning Program

MyLoc(home) ∧ Fuel(high), t₀, Req G₁ → goByCar(parking); walk(dept)
MyLoc(dept) ∧ Fuel(high), t₁, Req G₃ → walk(parking); goByCar(home); walk(pub)
MyLoc(dept) ∧ Fuel(high), t₁, Req G₂ → walk(parking); goByCar(home)
MyLoc(pub), t₂, Req G₅ → walk(home)
MyLoc(home), t₀, Req G₄ → walk(pub)
...

(Planning program as above.)

Suppose there is no bus (no TakeBus actions); then the controller shouldn't drive to the pub:
• From dept, it should not drive to the pub (even if that is the optimal plan!).
• Otherwise, the agent would need to leave the car at the pub when achieving G₅ from t₂.
• Later, the agent would not be able to achieve G₁ from t₀ (long walks are not possible).
A Realization of a Planning Program (cont.)

MyLoc(home) ∧ Fuel(high), t₀, Req G₁ → goByCar(parking); walk(dept)
MyLoc(dept) ∧ Fuel(high), t₁, Req G₃ → walk(parking); goByCar(home); walk(pub)
MyLoc(dept) ∧ Fuel(low), t₁, Req G₃ → walk(parking); refuel; goByCar(home); walk(pub)   (avoid an empty tank!)
MyLoc(dept) ∧ Fuel(high), t₁, Req G₂ → walk(parking); goByCar(home)
MyLoc(pub), t₂, Req G₅ → walk(home)
MyLoc(home), t₀, Req G₄ → walk(pub)
...

(Planning program as above.)
Planning Program Realization (formally)

Definition
A planning program T is realizable in a dynamic domain/environment D iff there is a plan-based simulation ≼_PLAN between the initial states of T and D.

Informally. (t, s) ∈ ≼_PLAN means we can satisfy (with plans) all the agent's potential requests from the current program state t when the dynamic domain is in state s.

Formally. A binary relation ≼_PLAN is a plan-based simulation relation iff (t, s) ∈ ≼_PLAN implies that, for all possible requests t → t′ labeled "achieve φ while maintaining ψ", there exists a plan a₁, a₂, ..., aₙ such that:
• s -a₁-> s₁ -a₂-> ··· -aₙ₋₁-> sₙ₋₁ -aₙ-> sₙ (the plan is executable);
• sᵢ ⊨ ψ for sᵢ = s, s₁, ..., sₙ₋₁ (the maintenance goal is satisfied);
• sₙ ⊨ φ (the achievement goal is satisfied);
• (t′, sₙ) ∈ ≼_PLAN (the simulation holds in the resulting state).
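The relation ≼_PLAN can be computed as a greatest fixpoint: start with all pairs (t, s) and repeatedly discard pairs for which some request admits no suitable plan. A schematic sketch, assuming a helper plan_final_states(s, φ, ψ) that returns the final states of plans from s achieving φ while maintaining ψ (computable with a planner); all names are illustrative:

```python
from typing import Callable, Iterable, Set, Tuple

def plan_simulation(prog_states: Set[str],
                    env_states: Set[str],
                    requests: Callable[[str], Iterable[Tuple[str, str, str]]],
                    plan_final_states: Callable[[str, str, str], Set[str]]
                    ) -> Set[Tuple[str, str]]:
    """Greatest-fixpoint computation of the plan-based simulation.
    requests(t) yields triples (phi, psi, t2) for the transitions of t."""
    rel = {(t, s) for t in prog_states for s in env_states}
    changed = True
    while changed:
        changed = False
        for (t, s) in list(rel):
            for (phi, psi, t2) in requests(t):
                # (t, s) survives only if some plan's final state s_n
                # keeps the pair (t2, s_n) inside the relation.
                if not any((t2, sn) in rel
                           for sn in plan_final_states(s, phi, psi)):
                    rel.discard((t, s))
                    changed = True
                    break
    return rel   # T is realizable iff (t0, s0) is in the result
```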
Outline
• General Motivation and Goal
• Brief Introduction to AI Planning
• Planning Programs to Specify and Control Agents' Behaviour
• Building Planning Programs: Solutions to the Realization Problem
  • LTL Synthesis
  • Planning-based Approach
• Conclusions
Planning Program Realization: How to Compute It?

Theorem (Complexity)
Checking whether an agent planning program is realizable in a planning domain from a given initial state is EXPTIME-complete.

[Figure: the problem in the complexity landscape (PTIME, NP, PSPACE, EXPTIME, EXPSPACE, 2EXPTIME, ..., ELEMENTARY), locating classical planning at PSPACE, non-deterministic planning and agent planning programs at EXPTIME, and conformant planning at EXPSPACE.]

Two proposed approaches:
1. Reduction to LTL synthesis [AAMAS'10]
   • Reduction to reactive synthesis for certain kinds of Linear-time Temporal Logic (LTL) specifications, based on model checking game structures.
   • Pros: solvers available (TLV, NuGaT); handle non-determinism easily; yield universal solutions.
   • Cons: computationally challenging; scalability.
2. Planning-based approach [ICAPS'11, AIJ'16]
   • A dedicated algorithm using automated planning to realize each transition.
   • Pros: can exploit fast planning technology and the program structure to solve the problem efficiently.
   • Cons: mostly algorithms for deterministic domains; yield single solutions.
Reduction to LTL Synthesis

LTL Synthesis [Pnueli & Rosner '89]
Given a model E of the environment and an LTL formula specification φ, find a controller C such that E × C ⊨ φ.

• E and C are generally automata.
• φ is a temporal formula: "always p", "eventually q", "always eventually p ∧ q".
• E × C means the evolution of E when constrained by C.
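In standard LTL notation, the example patterns above read as follows (□ for "always", ◇ for "eventually"; the conjunction in the third pattern follows the slide's example):

```latex
\Box\, p \qquad\quad \Diamond\, q \qquad\quad \Box \Diamond\, (p \wedge q)
```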