The dynamic logic of policies and contingent planning


  1. The dynamic logic of policies and contingent planning. Andreas Herzig, CNRS, IRIT. Joint work with Thomas Bolander, Thorsten Engesser, Robert Mattmüller, Bernhard Nebel. Journées MAFTEC, Rennes, Dec. 6, 2018.

  2. Motivation
   - understand policies
     - a standard concept in the planning literature; in the game theory literature: strategies
     - central in contingent planning
     - only semantically defined: is there a syntactic counterpart?
   - can we describe policies by PDL programs?
   - long-term aim: understand policies for epistemic planning ⇒ understand implicitly coordinated policies

  3. Action models
   - set of action names Act
   - a PDL model is a triple M_Act = (W, {R_a}_{a ∈ Act}, V) where
     - W: non-empty set ('states', 'possible worlds')
     - R_a ⊆ W × W ('action a's transition relation')
     - V: W → 2^Prp ('valuation')
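As a concrete companion to this definition, here is a minimal Python sketch (not from the slides) of such a model; the names PDLModel, rel, val and the method R are illustrative choices, and the later sketches reuse them.

```python
from dataclasses import dataclass

State = str
Action = str

@dataclass
class PDLModel:
    """A PDL model M_Act = (W, {R_a}_{a in Act}, V)."""
    worlds: set[State]                          # W: non-empty set of states
    rel: dict[Action, dict[State, set[State]]]  # a -> s -> R_a(s)
    val: dict[State, set[str]]                  # V: the atoms true at each state

    def R(self, a: Action, s: State) -> set[State]:
        """R_a(s): the states reachable from s by executing action a."""
        return self.rel.get(a, {}).get(s, set())
```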

  4. Outline
   1. Planning tasks and strong policies
   2. PDL: language and semantics
   3. Extending PDL to account for policies
   4. Programs for policies

  5. Planning tasks
   - a planning task is a triple (S_0, γ, M_Act) where
     - S_0 ⊆ W ('set of initial states')
     - γ = boolean formula ('goal')
     - M_Act = PDL model
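Continuing the sketch above, a planning task can be mirrored directly; as an assumption of the sketch, the goal γ is given as a Python predicate over a state's true atoms rather than as a boolean formula.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlanningTask:
    initial: set[State]                  # S_0: the set of initial states
    goal: Callable[[set[str]], bool]     # gamma, as a predicate over a state's true atoms
    model: PDLModel                      # M_Act
```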

  6. Policies
   - a policy is a relation Λ ⊆ W × (Act ∪ {stop}), cf. the 'state-action tables' of [Cimatti & Roveri 2003]
   - Λ is defined at a set of states S ⊆ W iff for every s ∈ S there is an x ∈ Act ∪ {stop} such that (s, x) ∈ Λ
   - Λ is strongly executable iff for every (s, a) ∈ Λ:
     1. R_a(s) ≠ ∅
     2. Λ is defined at R_a(s)
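Continuing the same illustrative encoding, the two definitions translate into direct checks over a policy represented as a set of (state, action-or-stop) pairs; all names here are assumptions of the sketch.

```python
STOP = "stop"
Policy = set[tuple[State, str]]          # pairs (state, action name or STOP)

def defined_at(policy: Policy, states: set[State]) -> bool:
    """Lambda is defined at S iff every s in S is assigned some action or stop."""
    return all(any(s == t for (t, _) in policy) for s in states)

def strongly_executable(policy: Policy, m: PDLModel) -> bool:
    """For every (s, a) in Lambda with a an action name:
    R_a(s) is non-empty and Lambda is defined at R_a(s)."""
    for (s, x) in policy:
        if x == STOP:
            continue
        succ = m.R(x, s)
        if not succ or not defined_at(policy, succ):
            return False
    return True
```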

  7. Strong solutions to planning tasks
   - Stop(Λ) = { t ∈ W : (t, stop) ∈ Λ }: the 'license to stop'; all 'stop states' must satisfy the goal (see below)
   - if a policy contains both (s, a) and (s, stop), then at s one may both stop and perform a, so both must lead to the goal; this is necessary if we want nondeterministic policies
   - Λ is a strong solution of the planning task (S_0, γ, M_Act) iff:
     1. Λ is acyclic, finite, and strongly executable
     2. Λ is defined at S_0
     3. M_Act, Stop(Λ) ⊨ γ
   - example: task = (S_0, HaveFerrari, M_Act); the sequential plan playLottery ; buyFerrari is a weak solution but not a strong solution
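A hedged sketch of the strong-solution test, reusing PDLModel, PlanningTask and the policy checks above; acyclicity is checked by a depth-first search over the transition graph induced by the policy, and finiteness is immediate for these finite encodings.

```python
def stop_states(policy: Policy) -> set[State]:
    """Stop(Lambda): the states where the policy licenses stopping."""
    return {s for (s, x) in policy if x == STOP}

def induced_edges(policy: Policy, m: PDLModel) -> dict[State, set[State]]:
    """Edges s -> t with (s, a) in Lambda for some action a and t in R_a(s)."""
    edges: dict[State, set[State]] = {}
    for (s, x) in policy:
        if x != STOP:
            edges.setdefault(s, set()).update(m.R(x, s))
    return edges

def acyclic(edges: dict[State, set[State]]) -> bool:
    """Depth-first search for a cycle in the graph induced by the policy."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour: dict[State, int] = {}

    def visit(s: State) -> bool:
        colour[s] = GREY
        for t in edges.get(s, set()):
            c = colour.get(t, WHITE)
            if c == GREY or (c == WHITE and not visit(t)):
                return False
        colour[s] = BLACK
        return True

    return all(visit(s) for s in edges if colour.get(s, WHITE) == WHITE)

def strong_solution(policy: Policy, task: PlanningTask) -> bool:
    """Lambda is a strong solution of (S_0, gamma, M_Act) iff it is acyclic,
    strongly executable, defined at S_0, and gamma holds at every stop state."""
    m = task.model
    return (acyclic(induced_edges(policy, m))
            and strongly_executable(policy, m)
            and defined_at(policy, task.initial)
            and all(task.goal(m.val.get(s, set())) for s in stop_states(policy)))
```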

  8. Properties
   - if Λ is a strong solution of (S, γ, M_Act), then Λ is a strong solution of (R_a(S), γ, M_Act)
   - if Λ_1 and Λ_2 are both strong solutions of (S_0, γ, M_Act) and Λ_1 ∪ Λ_2 is acyclic, then Λ_1 ∪ Λ_2 is a strong solution of (S_0, γ, M_Act)

  9. Outline
   1. Planning tasks and strong policies
   2. PDL: language and semantics
   3. Extending PDL to account for policies
   4. Programs for policies

  10. PDL: language
   - grammar of formulas φ and programs π:
     φ ::= p | ¬φ | φ ∧ φ | ⟨π⟩φ | [π]φ
     π ::= a | π ; π | π ∪ π | φ?
     where p ∈ Prp and a ∈ Act
   - [π]φ = "φ true after every possible execution of π"
   - ⟨π⟩φ = "φ true after some possible execution of π"
   - so: ⟨π⟩φ ↔ ¬[π]¬φ
   - N.B.: no Kleene star
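One possible abstract syntax for this star-free fragment, as illustrative Python dataclasses; the constructor names (Atomic, Seq, Choice, Test, Atom, Neg, And, Diamond, Box) are not from the slides.

```python
from dataclasses import dataclass
from typing import Union

# Programs: pi ::= a | pi ; pi | pi u pi | phi?   (no Kleene star)
@dataclass
class Atomic:
    name: str                 # atomic action a in Act

@dataclass
class Seq:
    first: "Program"          # pi1 ; pi2
    second: "Program"

@dataclass
class Choice:
    left: "Program"           # pi1 u pi2
    right: "Program"

@dataclass
class Test:
    formula: "Formula"        # phi?

# Formulas: phi ::= p | not phi | phi and phi | <pi>phi | [pi]phi
@dataclass
class Atom:
    p: str

@dataclass
class Neg:
    sub: "Formula"

@dataclass
class And:
    left: "Formula"
    right: "Formula"

@dataclass
class Diamond:
    prog: "Program"           # <pi>phi
    sub: "Formula"

@dataclass
class Box:
    prog: "Program"           # [pi]phi
    sub: "Formula"

Program = Union[Atomic, Seq, Choice, Test]
Formula = Union[Atom, Neg, And, Diamond, Box]
```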

  11. PDL: semantics
   - interpretation of programs:
     - R_{π1 ; π2} = R_{π1} ∘ R_{π2}
     - R_{π1 ∪ π2} = R_{π1} ∪ R_{π2}
     - R_{ψ?} = { (s, s) : M_Act, s ⊨ ψ }
   - interpretation of formulas:
     - M_Act, s ⊨ p iff p ∈ V(s)
     - M_Act, s ⊨ ¬φ iff ...
     - M_Act, s ⊨ φ ∧ ψ iff ...
     - M_Act, s ⊨ ⟨π⟩φ iff M_Act, t ⊨ φ for some t ∈ R_π(s)
     - M_Act, s ⊨ [π]φ iff M_Act, t ⊨ φ for every t ∈ R_π(s), i.e. iff M_Act, R_π(s) ⊨ φ
   - where: R_π(S) = ⋃_{s ∈ S} R_π(s), and M_Act, S ⊨ φ iff M_Act, s ⊨ φ for every s ∈ S
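These clauses translate almost directly into a recursive model checker over the sketched AST and PDLModel (Python 3.10+ for pattern matching); this is a sketch under the assumptions above, not the authors' implementation.

```python
def R(prog: Program, s: State, m: PDLModel) -> set[State]:
    """R_pi(s): the set of pi-successors of s, following the program clauses."""
    match prog:
        case Atomic(a):
            return m.R(a, s)
        case Seq(p1, p2):                      # relational composition
            return {u for t in R(p1, s, m) for u in R(p2, t, m)}
        case Choice(p1, p2):                   # union of the two relations
            return R(p1, s, m) | R(p2, s, m)
        case Test(f):                          # identity restricted to psi-states
            return {s} if sat(f, s, m) else set()

def sat(f: Formula, s: State, m: PDLModel) -> bool:
    """Truth of a formula at state s of M_Act."""
    match f:
        case Atom(p):
            return p in m.val.get(s, set())
        case Neg(g):
            return not sat(g, s, m)
        case And(g, h):
            return sat(g, s, m) and sat(h, s, m)
        case Diamond(p, g):                    # some pi-successor satisfies phi
            return any(sat(g, t, m) for t in R(p, s, m))
        case Box(p, g):                        # every pi-successor satisfies phi
            return all(sat(g, t, m) for t in R(p, s, m))
```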

  12. PDL: weak executability
   - inexecutability: [π]⊥ = "⊥ true after every possible execution of π" = "π is inexecutable"
   - executability: wx π  =def  ¬[π]⊥ = ⟨π⟩⊤ = "π is weakly executable"
   - 'weak': I can get by with a little help from nature...
   - wx (playLottery ; buyFerrari) holds, so weak executability cannot account for strong solutions of planning tasks
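With the evaluator above, weak executability is just non-emptiness of R_π(s). The tiny lottery model below is an illustrative reconstruction of the running example, not taken from the slides.

```python
def wx(prog: Program, s: State, m: PDLModel) -> bool:
    """Weak executability wx pi = <pi>T: pi has at least one execution from s."""
    return bool(R(prog, s, m))

# Toy model for the running example: playing the lottery may or may not make
# you rich; buying the Ferrari only succeeds from the rich state.
m = PDLModel(
    worlds={"s0", "rich", "poor", "ferrari"},
    rel={"playLottery": {"s0": {"rich", "poor"}},
         "buyFerrari": {"rich": {"ferrari"}}},
    val={"s0": set(), "rich": {"Rich"}, "poor": set(),
         "ferrari": {"Rich", "HaveFerrari"}},
)
plan = Seq(Atomic("playLottery"), Atomic("buyFerrari"))
print(wx(plan, "s0", m))    # True: some execution reaches a Ferrari state (weak)
```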

  13. Outline
   1. Planning tasks and strong policies
   2. PDL: language and semantics
   3. Extending PDL to account for policies
   4. Programs for policies

  14. PDL: a new modality
   - when a is deterministic: "a guarantees outcome φ" = ⟨a⟩φ
   - when actions can be nondeterministic: "a guarantees outcome φ" = ⟨a⟩⊤ ∧ [a]φ = wx a ∧ [a]φ  =def  ([a])φ

  15. PDL: a new modality (ctd.)
   - straightforward extension to sequential plans/programs:
     "a_1 ; ··· ; a_n guarantees φ" = "a_1 guarantees (a_2 ; ··· ; a_n guarantees φ)" = ([a_1]) ··· ([a_n])φ  =def  ([a_1 ; ··· ; a_n])φ
   - characterises strong sequential solutions to planning tasks:
     a_1 ; ··· ; a_n is a strong solution of (S_0, γ, M_Act) iff M_Act, S_0 ⊨ ([a_1 ; ··· ; a_n])γ
   - example:
     M_Act, S_0 ⊨ ⟨playLottery ; buyFerrari⟩ HaveFerrari (weak)
     M_Act, S_0 ⊭ ([playLottery ; buyFerrari]) HaveFerrari (not strong)

  16. PDL: strong executability
   - strong executability:
     sx (a_1 ; ··· ; a_n)  =def  ([a_1 ; ··· ; a_n])⊤ = ([a_1]) ··· ([a_n])⊤
       = ⟨a_1⟩⊤ ∧ [a_1](([a_2]) ··· ([a_n])⊤) = ...
       ↔ wx a_1 ∧ [a_1] wx a_2 ∧ ··· ∧ [a_1] ··· [a_{n-1}] wx a_n
       = "a_1 ; ··· ; a_n is strongly executable"
   - examples:
     sx playLottery = wx playLottery
     sx (playLottery ; buyFerrari) = wx playLottery ∧ [playLottery] sx buyFerrari

  17. PDL: strong executability (ctd.)
   - ([π])φ can be reduced to sx π:
     ([a_1 ; ··· ; a_n])φ = ([a_1]) ··· ([a_n])φ
       = ⟨a_1⟩⊤ ∧ [a_1](([a_2]) ··· ([a_n])φ) = ...
       ↔ wx a_1 ∧ [a_1] wx a_2 ∧ ··· ∧ [a_1] ··· [a_{n-1}] wx a_n ∧ [a_1][a_2] ··· [a_n]φ
       = sx (a_1 ; ··· ; a_n) ∧ [a_1 ; ··· ; a_n]φ
   - strong executability of a nondeterministic composition? sx (π_1 ∪ π_2) = ???

  18. Extended PDL: language
   - formulas φ and programs π:
     φ ::= p | ¬φ | φ ∧ φ | ⟨π⟩φ | [π]φ | ([π])φ
     π ::= a | π ; π | π ∪ π | φ?
     where p ∈ Prp and a ∈ Act
   - ([π])φ = "π is strongly executable and guarantees φ"
   - nondeterministic choice π_1 ∪ π_2: nature chooses between (possible executions of) π_1 and π_2, but can only choose a π_i that is weakly executable
     - π_1 ∪ π_2 is equivalent to (wx π_1? ; π_1) ∪ (wx π_2? ; π_2)
   - consequence: ([π_1 ∪ π_2])φ ↔ wx (π_1 ∪ π_2) ∧ (wx π_1 → ([π_1])φ) ∧ (wx π_2 → ([π_2])φ)
   - cf. [π_1 ∪ π_2]φ ↔ (wx π_1 ∨ wx π_2) ∧ (wx π_1 → [π_1]φ) ∧ (wx π_2 → [π_2]φ)

  19. Extended PDL: interpretation of formulas
   - M_Act, s ⊨ ([a])φ iff R_a(s) ≠ ∅ and M_Act, R_a(s) ⊨ φ
   - M_Act, s ⊨ ([π_1 ; π_2])φ iff M_Act, s ⊨ ([π_1])([π_2])φ
   - M_Act, s ⊨ ([π_1 ∪ π_2])φ iff
     M_Act, s ⊨ ⟨π_1⟩⊤ or M_Act, s ⊨ ⟨π_2⟩⊤, and
     if M_Act, s ⊨ ⟨π_1⟩⊤ then M_Act, s ⊨ ([π_1])φ, and
     if M_Act, s ⊨ ⟨π_2⟩⊤ then M_Act, s ⊨ ([π_2])φ
   - M_Act, s ⊨ ([ψ?])φ iff M_Act, s ⊨ ψ ∧ φ
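Continuing the evaluator sketch, these clauses can be implemented by passing φ as a predicate on states, which lets the sequencing clause nest the modality; that representation choice is an assumption of the sketch, not something from the slides.

```python
from typing import Callable

def strong(prog: Program, holds: Callable[[State], bool],
           s: State, m: PDLModel) -> bool:
    """([pi])phi at state s, with phi given as the predicate `holds` on states."""
    match prog:
        case Atomic(a):
            succ = m.R(a, s)                   # ([a])phi iff R_a(s) non-empty and phi at all successors
            return bool(succ) and all(holds(t) for t in succ)
        case Seq(p1, p2):                      # ([pi1 ; pi2])phi iff ([pi1])([pi2])phi
            return strong(p1, lambda t: strong(p2, holds, t, m), s, m)
        case Choice(p1, p2):                   # nature only picks weakly executable branches
            wx1, wx2 = bool(R(p1, s, m)), bool(R(p2, s, m))
            return ((wx1 or wx2)
                    and (not wx1 or strong(p1, holds, s, m))
                    and (not wx2 or strong(p2, holds, s, m)))
        case Test(f):                          # ([psi?])phi iff psi and phi
            return sat(f, s, m) and holds(s)

# With the lottery model above:
# strong(plan, lambda t: "HaveFerrari" in m.val[t], "s0", m) evaluates to False,
# matching slide 15: the sequential plan is only a weak solution.
```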

  20. Extended PDL: nondeterminism
   - two kinds of nondeterminism:
     1. atomic actions: nature chooses among possible executions
        M_Act, s ⊭ ([playLottery]) Rich
     2. π_1 ∪ π_2: nature chooses among (weakly) executable disjuncts
        M_Act, s ⊭ ([playLottery ∪ workHard]) Rich
        M_Act, s ⊨ ([(playLottery ; ⊥?) ∪ workHard]) Rich

  21. Axiomatics
   - axiomatics of dynamic logic
   - reduction axioms for program operators:
     ([ψ?])φ ↔ ψ ∧ φ
     ([π_1 ; π_2])φ ↔ ([π_1])([π_2])φ
     ([π_1 ∪ π_2])φ ↔ (⟨π_1⟩⊤ ∨ ⟨π_2⟩⊤) ∧ (⟨π_1⟩⊤ → ([π_1])φ) ∧ (⟨π_2⟩⊤ → ([π_2])φ)
   - reduction axiom for atomic programs:
     ([a])φ ↔ ⟨a⟩⊤ ∧ [a]φ

  22. Alternative axiomatisation
   - ([π])φ can be reduced to sx π:
     ([π])φ ↔ sx π ∧ [π]φ
   - axiomatics of strong executability:
     sx a ↔ wx a
     sx ψ? ↔ ψ
     sx (π_1 ; π_2) ↔ sx π_1 ∧ [π_1] sx π_2
     sx (π_1 ∪ π_2) ↔ (sx π_1 ∧ sx π_2) ∨ (sx π_1 ∧ ¬wx π_2) ∨ (¬wx π_1 ∧ sx π_2)
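The reduction axioms give a recursive procedure for sx; the sketch below reuses R, sat and wx from the earlier blocks and follows the four axioms case by case.

```python
def sx(prog: Program, s: State, m: PDLModel) -> bool:
    """Strong executability, computed via the reduction axioms of this slide."""
    match prog:
        case Atomic(_):
            return wx(prog, s, m)                               # sx a iff wx a
        case Test(f):
            return sat(f, s, m)                                 # sx psi? iff psi
        case Seq(p1, p2):                                       # sx pi1 and [pi1] sx pi2
            return sx(p1, s, m) and all(sx(p2, t, m) for t in R(p1, s, m))
        case Choice(p1, p2):
            s1, s2 = sx(p1, s, m), sx(p2, s, m)
            w1, w2 = wx(p1, s, m), wx(p2, s, m)
            return (s1 and s2) or (s1 and not w2) or (not w1 and s2)

# ([pi])phi then reduces to sx pi and [pi]phi; with the lottery model above,
# sat(Box(plan, Atom("HaveFerrari")), "s0", m) is True but sx(plan, "s0", m) is False.
```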
