Knowledge-Based Policies for Qualitative Decentralized POMDPs

Abdallah Saffidine, Bruno Zanuttini, François Schwarzentruber. 68NQRT, January 25th, 2018. Sections: Knowledge-based programs; Semantics; Mathematical properties; Conclusion.


  1. Knowledge-Based Policies for Qualitative Decentralized POMDPs. Abdallah Saffidine, Bruno Zanuttini, François Schwarzentruber. 68NQRT, January 25th, 2018.

  2. Automation of complex tasks: building surveillance; nuclear decommissioning; intelligent farming.

  3. Multiple robots: more robust/efficient than [comparison image not preserved].

  6. Multiple robots: more robust/efficient than [comparison image not preserved]. Settings: cooperative agents; common goal; imperfect information; decentralized execution.

  7. Methodology (diagram): Model and Goal feed Planning, which produces a's program, b's program and c's program.

  8. Need: understandable system. Motivation: legal issues in case of failure; interaction with humans.

  9. Our contribution: use of knowledge-based programs (KBPs).
     KBP for agent a: listenRadio; if a knows strike then toStation else toAirport.
     KBP for agent b: readNewsPaper; if b knows strike then toStation else toAirport.
     Operational semantics for knowledge-based programs; (un)decidability/complexity and succinctness.
     Extends: [Lang, Zanuttini, ECAI 2012, TARK 2013].

  10. Outline: 1. Knowledge-based programs (Epistemic formulas; Program constructions); 2. Semantics; 3. Mathematical properties; 4. Conclusion.

  12. Properties expressed in epistemic logic. Language constructions: atomic propositions (room 43 is safe, door 12 is locked, ...); not ...; (... or ...); (... and ...); (... → ...); (... knows ...); (... knowswhether ...). Examples: (a knows door 12 is locked) and not (c knows door 12 is locked); a knowswhether (c knows door 12 is locked).
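
The slide lists knowswhether without defining it. Under the usual reading in epistemic logic (an assumption here, since the slides do not spell it out), it abbreviates a disjunction of two knows formulas:

```latex
\[
  a \ \mathsf{knowswhether}\ \varphi
  \;\equiv\;
  (a \ \mathsf{knows}\ \varphi) \ \lor\ (a \ \mathsf{knows}\ \neg\varphi)
\]
```

Under that reading, the second example on the slide holds when a knows that c knows door 12 is locked, or a knows that c does not know it.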

  14. Program constructions. Language constructions: atomic actions (turn left, stay, broadcast temperature); ... ; ...; if ϕ then ... else ...; while ϕ do .... Example (knowledge-based program for agent a): if a knows (door 12 is locked and justobserved ( )) then turn left; broadcast temperature else stay.
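
The four program constructions can be pictured as a small syntax tree. The sketch below is only an illustration: the class names (`Action`, `Seq`, `If`, `While`) and the `show` helper are hypothetical, and guards are kept as plain strings rather than parsed epistemic formulas.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Action:
    name: str  # atomic action, e.g. "turn left"


@dataclass(frozen=True)
class Seq:
    first: "Program"
    second: "Program"  # "first ; second"


@dataclass(frozen=True)
class If:
    guard: str  # epistemic formula, kept as text in this sketch
    then: "Program"
    orelse: "Program"


@dataclass(frozen=True)
class While:
    guard: str
    body: "Program"


Program = Union[Action, Seq, If, While]


def show(p: Program) -> str:
    """Render a program back into the surface syntax of the slide."""
    if isinstance(p, Action):
        return p.name
    if isinstance(p, Seq):
        return f"{show(p.first)}; {show(p.second)}"
    if isinstance(p, If):
        return f"if {p.guard} then {show(p.then)} else {show(p.orelse)}"
    if isinstance(p, While):
        return f"while {p.guard} do {show(p.body)}"
    raise TypeError(f"not a program: {p!r}")


# The strike example used elsewhere in the talk, for agent a.
kbp_a = Seq(Action("listenRadio"),
            If("a knows strike", Action("toStation"), Action("toAirport")))
print(show(kbp_a))
```

`show` does not add brackets around nested sequences, so it is a pretty-printer for small examples rather than an unambiguous serializer.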

  15. Outline: 1. Knowledge-based programs; 2. Semantics (Models: QdecPOMDP; Interlude: semantics of epistemic formulas; Operational semantics of KBPs); 3. Mathematical properties; 4. Conclusion.

  17. QdecPOMDP: Qualitative decentralized Partially Observable Markov Decision Processes = concurrent game structures with observations. Transitions of the form state1 → state2, labelled with one action per agent (a: stay, b: turn left). A non-empty set of possible initial states; a set of goal states.
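
As a rough picture of the ingredients named on this slide, the following sketch stores a QdecPOMDP as a plain data structure. It is not the authors' formal definition: the field names are hypothetical, transitions are modelled as sets of possible successors (no probabilities), and the assumption that an agent's observation depends only on the state reached is made here purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

State = str
# A joint action assigns one action to each agent,
# e.g. (("a", "stay"), ("b", "turn left")).
JointAction = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class QdecPOMDP:
    agents: Tuple[str, ...]
    initial_states: FrozenSet[State]
    goal_states: FrozenSet[State]
    # Each (state, joint action) pair maps to a set of possible successors.
    transitions: Dict[Tuple[State, JointAction], FrozenSet[State]]
    # ASSUMPTION: observations depend only on (state reached, agent).
    observations: Dict[Tuple[State, str], str]

    def __post_init__(self) -> None:
        if not self.initial_states:
            raise ValueError("the set of initial states must be non-empty")

    def successors(self, state: State, joint_action: JointAction) -> FrozenSet[State]:
        return self.transitions.get((state, joint_action), frozenset())

    def observe(self, state: State) -> Dict[str, str]:
        """Vector of observations, one entry per agent."""
        return {agent: self.observations[(state, agent)] for agent in self.agents}


# The single transition shown on the slide; observation labels are made up.
move = (("a", "stay"), ("b", "turn left"))
model = QdecPOMDP(
    agents=("a", "b"),
    initial_states=frozenset({"state1"}),
    goal_states=frozenset({"state2"}),
    transitions={("state1", move): frozenset({"state2"})},
    observations={("state1", "a"): "o1", ("state1", "b"): "o1",
                  ("state2", "a"): "o2", ("state2", "b"): "o1"},
)
```

Here `model.successors("state1", move)` returns the set containing `state2`, and `model.observe("state2")` gives each agent its own observation of the same state.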

  18. States. Typically, a state describes: positions of agents; battery levels; etc.

  20. Prototype: http://people.irisa.fr/Francois.Schwarzentruber/hintikkasworld/

  21. Semantics of epistemic formulas. Given an epistemic structure \(S\) and a world \(w\): \(S, w \models a \text{ knows } \varphi\) iff for all \(u\), \(w \sim_a u\) implies \(S, u \models \varphi\).
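
The truth condition for knows translates almost literally into code. The sketch below is a minimal illustration, not the prototype linked earlier: worlds are plain strings, formulas are Python predicates on worlds, and the two-world strike model with its equivalence classes is a made-up example.

```python
from typing import Callable, Dict, Iterable, List, Set

World = str
Formula = Callable[[World], bool]


def knows(agent: str, phi: Formula, w: World,
          worlds: Iterable[World],
          indist: Callable[[str, World, World], bool]) -> bool:
    """S, w |= agent knows phi  iff  phi holds at every u with w ~_agent u."""
    return all(phi(u) for u in worlds if indist(agent, w, u))


# Hypothetical model: a has heard the radio, b has no information yet.
worlds: List[World] = ["strike", "no_strike"]
classes: Dict[str, List[Set[World]]] = {
    "a": [{"strike"}, {"no_strike"}],   # a tells the two worlds apart
    "b": [{"strike", "no_strike"}],     # b confuses them
}


def indist(agent: str, w: World, u: World) -> bool:
    return any(w in c and u in c for c in classes[agent])


def strike(w: World) -> bool:
    return w == "strike"


a_knows = knows("a", strike, "strike", worlds, indist)
b_knows = knows("b", strike, "strike", worlds, indist)
# Higher-order knowledge: a knows that b does not know strike.
a_knows_b_ignorant = knows(
    "a", lambda u: not knows("b", strike, u, worlds, indist),
    "strike", worlds, indist)
```

Because a formula is just a predicate on worlds, nesting `knows` inside another `knows` gives higher-order knowledge without extra machinery.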

  23. Operational semantics. Epistemic structure: higher-order knowledge about the current state of the QdecPOMDP and the current program counters in KBPs.

  24. Assumptions. Common knowledge of: the QdecPOMDP; the KBPs; synchronicity of the system; tests last 0 units of time; actions last 1 unit of time.
     KBP for agent a: listenRadio; if a knows strike then toStation else toAirport.
     KBP for agent b: readNewsPaper; if b knows strike then toStation else toAirport.

  25. Epistemic structures at time T: worlds. Worlds = consistent histories (consistency is defined a few slides later) of the form \(s_0\, \overrightarrow{pc}_0\; \overrightarrow{obs}_1\, s_1\, \overrightarrow{pc}_1\; \dots\; \overrightarrow{obs}_T\, s_T\, \overrightarrow{pc}_T\), where \(\overrightarrow{obs}_t\) is the vector of observations at time \(t\), \(s_t\) is the state at time \(t\), and \(\overrightarrow{pc}_t\) is the vector of program counters at time \(t\). Running example: listenRadio; if K_a strike then toStation else toAirport.

  26. Epistemic structures at time t: indistinguishability relations. Agent a confuses two histories iff she has received the same observations: \(s_0\, \overrightarrow{pc}_0\; \overrightarrow{obs}_1\, s_1\, \overrightarrow{pc}_1\; \dots\; \overrightarrow{obs}_T\, s_T\, \overrightarrow{pc}_T \;\sim_a\; s'_0\, \overrightarrow{pc}'_0\; \overrightarrow{obs}'_1\, s'_1\, \overrightarrow{pc}'_1\; \dots\; \overrightarrow{obs}'_T\, s'_T\, \overrightarrow{pc}'_T\) iff for all \(t \in \{1, \dots, T\}\), \(\overrightarrow{obs}_t^{\,a} = \overrightarrow{obs}'^{\,a}_t\).
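
The relation only looks at agent a's component of each observation vector, so a history can be reduced to its list of observation vectors for this check. The sketch below makes that simplification explicit; the representation (a list of per-agent dictionaries) and the observation labels are hypothetical.

```python
from typing import Dict, List

ObsVector = Dict[str, str]   # agent name -> observation received
History = List[ObsVector]    # obs_1, ..., obs_T (states and pcs omitted)


def indistinguishable(agent: str, h1: History, h2: History) -> bool:
    """h1 ~_agent h2 iff the agent received the same observation at every step."""
    return len(h1) == len(h2) and all(
        o1[agent] == o2[agent] for o1, o2 in zip(h1, h2))


# After one step: a listened to the radio, b read a newspaper with no news.
h_strike: History = [{"a": "radio: strike", "b": "paper: nothing"}]
h_quiet: History = [{"a": "radio: nothing", "b": "paper: nothing"}]

a_confuses = indistinguishable("a", h_strike, h_quiet)
b_confuses = indistinguishable("b", h_strike, h_quiet)
```

The length check reflects the synchronicity assumption: histories of different lengths are never confused.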

  27. Program counters. Definition (program counter): a triple (guard, action just executed, continuation). For the KBP listenRadio; if K_a strike then toStation else toAirport, the program counters are (⊤, start, _), (⊤, listenRadio, _), (K_a strike, toStation, _) and (¬K_a strike, toAirport, _), where _ marks a continuation that was shown graphically and is not preserved here.

  28. Control-flow graph for the KBP listenRadio; if K_a strike then toStation else toAirport. Nodes: (⊤, start, _), (⊤, listenRadio, _), (K_a strike, toStation, _), (¬K_a strike, toAirport, _), where _ marks a continuation not preserved here.
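
One way to see how program counters and the control-flow graph relate is to compute both from the program text. The sketch below is an illustration under stated assumptions, not the authors' construction: it covers only action sequences and if-then-else (no while), it takes the continuation to be the program that remains after the action, and the tuple encoding, the helper names and the ASCII guard `"true"` (standing for the trivial guard of the slide) are all hypothetical.

```python
from typing import List, Tuple, Union

# A statement is an action name or ("if", guard, then_branch, else_branch);
# a program (and a continuation) is a tuple of statements.
Statement = Union[str, tuple]
Program = Tuple[Statement, ...]
ProgramCounter = Tuple[str, str, Program]  # (guard, action just executed, continuation)

TRUE = "true"  # the trivial guard


def conj(guard: str, test: str) -> str:
    return test if guard == TRUE else f"({guard} and {test})"


def next_pcs(cont: Program, guard: str = TRUE) -> List[ProgramCounter]:
    """Program counters reachable from `cont` by executing exactly one action.

    Tests take no time, so they are folded into the guard until an action is hit.
    """
    if not cont:
        return []  # nothing left to run
    head, rest = cont[0], cont[1:]
    if isinstance(head, str):
        return [(guard, head, rest)]
    _, test, then_branch, else_branch = head
    return (next_pcs(then_branch + rest, conj(guard, test))
            + next_pcs(else_branch + rest, conj(guard, f"not {test}")))


def control_flow_graph(program: Program) -> List[Tuple[ProgramCounter, ProgramCounter]]:
    """Edges between program counters, starting from (true, start, program)."""
    start: ProgramCounter = (TRUE, "start", program)
    edges, todo, seen = [], [start], {start}
    while todo:
        pc = todo.pop()
        for succ in next_pcs(pc[2]):
            edges.append((pc, succ))
            if succ not in seen:
                seen.add(succ)
                todo.append(succ)
    return edges


kbp_a: Program = ("listenRadio",
                  ("if", "K_a strike", ("toStation",), ("toAirport",)))
edges = control_flow_graph(kbp_a)
for source, target in edges:
    print(source[:2], "->", target[:2])
```

For the strike program this yields the four program counters listed on the slides, connected by three edges: start to listenRadio, then listenRadio to each of toStation and toAirport.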
