Knowledge-Based Policies for Qualitative Decentralized POMDPs

Abdallah Saffidine, Bruno Zanuttini, François Schwarzentruber. May 14th, 2019. Sections: Knowledge-based programs; Semantics; Mathematical Properties; Conclusion.


  1. Knowledge-Based Policies for Qualitative Decentralized POMDPs. Abdallah Saffidine, Bruno Zanuttini, François Schwarzentruber. May 14th, 2019.

  2. Automation of complex tasks: building surveillance; nuclear decommissioning; intelligent farming.

  6. Multiple robots more robust/efficient than [image]. Settings: cooperative agents; common goal; imperfect information; decentralized execution.

  7. Methodology (diagram): Model and Goal feed Planning, which outputs a's program, b's program, c's program.

  8. Need: understandable system. Motivation: legal issues in case of failure; interaction with humans.

  9. Our contribution: use of knowledge-based programs (KBPs).

     KBP for agent a:
         listenRadio
         if a knows strike
             toStation
         else
             toAirport

     KBP for agent b:
         readNewsPaper
         if b knows strike
             toStation
         else
             toAirport

     Contributions: operational semantics for knowledge-based programs; succinctness; (un)decidability/complexity.
     Extends: [Lang, Zanuttini, ECAI 2012, TARK 2013].

  10. Outline: 1. Knowledge-based programs; 2. Semantics; 3. Mathematical Properties; 4. Conclusion.

  11. Program constructions. Language constructions: turn left, stay, broadcast temperature, ...; "... ; ..."; "if ϕ then ... else ..."; "while ϕ do ...".

      Example (knowledge-based program for agent a):
          if a knows (door 12 is locked and justobserved( )) then
              turn left
              broadcast temperature
          else
              stay
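The constructions listed on this slide can be read as a small abstract syntax. The following is a minimal sketch of that reading, not the authors' implementation: the class names `Act`, `Seq`, `If`, `While` and the helpers `is_while_free` and `actions` are illustrative assumptions, and guards are kept as plain text rather than parsed epistemic formulas. The sample program is agent a's strike program from the contribution slide.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Act:
    """An atomic action such as 'turn left' or 'stay'."""
    name: str


@dataclass(frozen=True)
class Seq:
    """Sequential composition: first ; second."""
    first: "Program"
    second: "Program"


@dataclass(frozen=True)
class If:
    """if guard then then_branch else else_branch (guard kept as text)."""
    guard: str
    then_branch: "Program"
    else_branch: "Program"


@dataclass(frozen=True)
class While:
    """while guard do body (guard kept as text)."""
    guard: str
    body: "Program"


Program = Union[Act, Seq, If, While]


def is_while_free(p: Program) -> bool:
    """True when the program contains no while loop."""
    if isinstance(p, Act):
        return True
    if isinstance(p, Seq):
        return is_while_free(p.first) and is_while_free(p.second)
    if isinstance(p, If):
        return is_while_free(p.then_branch) and is_while_free(p.else_branch)
    return False  # p is a While


def actions(p: Program) -> list:
    """Action names in textual order."""
    if isinstance(p, Act):
        return [p.name]
    if isinstance(p, Seq):
        return actions(p.first) + actions(p.second)
    if isinstance(p, If):
        return actions(p.then_branch) + actions(p.else_branch)
    return actions(p.body)


# Hypothetical encoding of agent a's program: listenRadio; if a knows strike ...
kbp_a = Seq(Act("listenRadio"),
            If("a knows strike", Act("toStation"), Act("toAirport")))
```

Keeping the nodes immutable (`frozen=True`) makes programs hashable, which is convenient if they are later stored inside program counters.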

  12. Outline: 1. Knowledge-based programs; 2. Semantics (Models: QdecPOMDP; Operational semantics of KBPs); 3. Mathematical Properties; 4. Conclusion.

  14. QdecPOMDP. Qualitative decentralized Partially Observable Markov Decision Processes = concurrent game structures with observations. Transitions of the form: state1 → state2, labelled with the joint action (a: stay, b: turn left). A non-empty set of possible initial states; a set of goal states.
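The ingredients named on this slide can be grouped into one record. The sketch below is an assumption-laden illustration: the class name `QdecPOMDP`, its field names, and the choice to attach the joint observation to each transition are not taken from the talk, and the observation labels `obs_a` and `obs_b` are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QdecPOMDP:
    agents: tuple
    initial_states: frozenset
    goal_states: frozenset
    # Assumed encoding: (state, joint_action) -> set of (next_state, joint_observation)
    transitions: dict = field(default_factory=dict)

    def __post_init__(self):
        # The slide requires a non-empty set of possible initial states.
        if not self.initial_states:
            raise ValueError("the set of initial states must be non-empty")

    def successors(self, state, joint_action):
        """Possible (next_state, joint_observation) pairs; empty if undefined."""
        return self.transitions.get((state, joint_action), frozenset())


# The transition drawn on the slide: a stays, b turns left, state1 -> state2.
model = QdecPOMDP(
    agents=("a", "b"),
    initial_states=frozenset({"state1"}),
    goal_states=frozenset({"state2"}),
    transitions={
        ("state1", ("stay", "turn left")): frozenset({("state2", ("obs_a", "obs_b"))}),
    },
)
```

Because the model is qualitative, a transition maps to a set of possible outcomes rather than to a probability distribution.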

  15. States. Typically, a state describes: positions of agents; battery levels; etc.

  17. Operational semantics: one step of computation of KBPs in the QdecPOMDP. Epistemic structure: higher-order knowledge about the current state of the QdecPOMDP and the current program counters in the KBPs.

  18. Assumptions. Common knowledge of: the QdecPOMDP; the KBPs; synchrony of the system; tests last 0 units of time; actions last 1 unit of time.

      KBP for agent a:
          listenRadio
          if a knows strike
              toStation
          else
              toAirport

      KBP for agent b:
          readNewsPaper
          if b knows strike
              toStation
          else
              toAirport

  19. Epistemic structures at time T: worlds. Worlds = consistent histories (consistency is defined a few slides later) of the form
      $(s_0, \vec{pc}_0) \xrightarrow{\vec{obs}_1} (s_1, \vec{pc}_1) \cdots \xrightarrow{\vec{obs}_T} (s_T, \vec{pc}_T)$
      where $\vec{obs}_t$ is the vector of observations at time t, $s_t$ is the state at time t, and $\vec{pc}_t$ is the vector of program counters at time t.
      Running example: listenRadio; if K_a strike then toStation else toAirport.

  20. Epistemic structures at time T: indistinguishability relations. Agent a confuses two histories iff she has received the same observations:
      $(s_0, \vec{pc}_0) \xrightarrow{\vec{obs}_1} \cdots \xrightarrow{\vec{obs}_T} (s_T, \vec{pc}_T) \;\sim_a\; (s'_0, \vec{pc}'_0) \xrightarrow{\vec{obs}'_1} \cdots \xrightarrow{\vec{obs}'_T} (s'_T, \vec{pc}'_T)$
      iff for all $t \in \{1, \dots, T\}$, $(\vec{obs}_t)_a = (\vec{obs}'_t)_a$.
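The indistinguishability test only compares one agent's component of each observation vector. Here is a minimal sketch, assuming histories are stored as tuples and agents are identified by their index in the observation vector; the `History` class, the observation labels, and the treatment of histories of different lengths are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class History:
    states: tuple        # s_0, ..., s_T
    observations: tuple  # obs_1, ..., obs_T; each entry holds one observation per agent
    pcs: tuple = ()      # pc_0, ..., pc_T (not needed for the comparison below)


def indistinguishable(agent: int, h1: History, h2: History) -> bool:
    """True iff `agent` received the same observation at every time step."""
    # Assumption: under synchrony, histories of different lengths are never confused.
    if len(h1.observations) != len(h2.observations):
        return False
    return all(o1[agent] == o2[agent]
               for o1, o2 in zip(h1.observations, h2.observations))


A, B = 0, 1  # positions of agents a and b in each observation vector

# Hypothetical one-step histories: a hears nothing on the radio in both,
# while b reads a strike headline only in the first.
h = History(states=("s0", "strike"), observations=(("quiet", "headline"),))
h_prime = History(states=("s0", "noStrike"), observations=(("quiet", "nothing"),))
```

In this example agent a confuses `h` and `h_prime`, so she does not know whether there is a strike, whereas agent b can tell the two histories apart.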

  21. Program counters. Definition (Program counter): a triple (guard, action just executed, continuation).
      Program: listenRadio; if K_a strike then toStation else toAirport.
      Program counters: (⊤, start, ); (⊤, listenRadio, ); (K_a strike, toStation, ); (¬K_a strike, toAirport, ).

  22. Control-flow graph.
      Program: listenRadio; if K_a strike then toStation else toAirport.
      Graph: (⊤, start, ) → (⊤, listenRadio, ); (⊤, listenRadio, ) → (K_a strike, toStation, ); (⊤, listenRadio, ) → (¬K_a strike, toAirport, ).
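The control-flow graph can be derived mechanically from a while-free program. The sketch below is one possible construction, not the authors' algorithm. It assumes that the continuation component is the remaining program text, that programs are nested tuples (an action is a string, a test is `("if", guard, then_branch, else_branch)`), and that guards of nested tests are conjoined because tests take no time; the function names are hypothetical.

```python
def conj(guard, test):
    """Conjunction of two guards, with ⊤ as the neutral element."""
    return test if guard == "⊤" else f"({guard} ∧ {test})"


def next_pcs(continuation, guard="⊤"):
    """Program counters reachable by executing exactly one action."""
    if not continuation:
        return []  # nothing left to execute
    head, rest = continuation[0], continuation[1:]
    if isinstance(head, str):  # an action: it becomes 'action just executed'
        return [(guard, head, rest)]
    _, test, then_branch, else_branch = head  # a test: lasts 0 units of time
    return (next_pcs(then_branch + rest, conj(guard, test))
            + next_pcs(else_branch + rest, conj(guard, "¬" + test)))


def control_flow_graph(program):
    """Map every reachable program counter to its list of successors."""
    start = ("⊤", "start", program)
    graph, todo = {}, [start]
    while todo:
        pc = todo.pop()
        if pc not in graph:
            graph[pc] = next_pcs(pc[2])
            todo.extend(graph[pc])
    return start, graph


# Agent a's program from the slide, in the assumed tuple encoding.
test_strike = ("if", "K_a strike", ("toStation",), ("toAirport",))
kbp_a = ("listenRadio", test_strike)
start, graph = control_flow_graph(kbp_a)
```

For this program the graph has four nodes, matching the four program counters listed on the slide, with the empty tuple as the continuation of the two final nodes.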

  23. Consistent histories (explained with one agent).
      Program: listenRadio; if K_a strike then toStation else toAirport.
      KBP control-flow graph: (⊤, start, ) → (⊤, listenRadio, ) → (K_a strike, toStation, ) or (¬K_a strike, toAirport, ).
      In the QdecPOMDP: $s_0 \xrightarrow{\text{listenRadio}} s_1 \xrightarrow{\text{toStation}} s_2$.
      The history (s_0, (⊤, start, )) (s_1, (⊤, listenRadio, )) (s_2, (K_a strike, toStation, )) is consistent when its prefix (s_0, (⊤, start, )) (s_1, (⊤, listenRadio, )) ⊨ K_a strike.
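On one reading of this slide, a history may only be extended with a program counter whose guard holds at the current world of the epistemic structure. The single-agent sketch below illustrates that filter under stated assumptions: worlds are reduced to a current state plus agent a's observation sequence, the observation labels are made up, and `knows` and `consistent_actions` are hypothetical helper names.

```python
from collections import namedtuple

# A world, reduced to what the single agent a needs: the current state and
# the sequence of observations a has received so far.
World = namedtuple("World", ["state", "obs"])


def strike(world):
    return world.state == "strike"


def knows(fact, world, worlds):
    """K_a fact: fact holds in every world that a cannot tell apart from `world`."""
    return all(fact(w) for w in worlds if w.obs == world.obs)


def consistent_actions(world, worlds, candidates):
    """Keep the actions whose guard holds at `world`."""
    return [action for guard, action in candidates if guard(world, worlds)]


# Worlds after listenRadio, assuming the radio may stay silent during a strike.
heard = World("strike", ("announcement",))
silent_strike = World("strike", ("silence",))
silent_calm = World("noStrike", ("silence",))
worlds = [heard, silent_strike, silent_calm]

# The two outgoing program counters of (⊤, listenRadio, ...), as guard/action pairs.
candidates = [
    (lambda w, ws: knows(strike, w, ws), "toStation"),
    (lambda w, ws: not knows(strike, w, ws), "toAirport"),
]
```

In `silent_strike` there is a strike but the agent does not know it, so only the toAirport extension is consistent there: the guard tests knowledge, not truth.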

  24. Outline: 1. Knowledge-based programs; 2. Semantics; 3. Mathematical Properties (Verification; Execution Problem; Succinctness); 4. Conclusion.

  26. Verification problem. Input: a QdecPOMDP model (given in STRIPS-like symbolic form); knowledge-based programs for each agent. Output: yes if all executions of the KBPs lead to a goal state.
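For intuition only, the verification question can be answered by brute force in a much simpler setting than the one in the talk. The sketch below assumes a single agent, an explicitly enumerated model instead of a STRIPS-like symbolic one, while-free programs, and that "lead to a goal state" means every execution ends in a goal state. With one agent, the worlds she cannot tell apart are summarised by a set of states (a belief), so no explicit epistemic structure over histories is built. The names `verify` and `step` and the toy strike model are hypothetical.

```python
def verify(program, belief, step, goals):
    """True iff every execution of `program` from `belief` ends in a goal state.

    program: tuple of statements; an action is a string, a test is
             ("if", fact, then_branch, else_branch) and is read as 'if K fact'.
    belief:  frozenset of states the agent considers possible.
    step:    step(state, action) -> set of (next_state, observation).
    """
    if not program:
        return belief <= goals
    head, rest = program[0], program[1:]
    if isinstance(head, str):
        # Execute the action, then split the belief by the observation received.
        by_obs = {}
        for state in belief:
            for next_state, obs in step(state, head):
                by_obs.setdefault(obs, set()).add(next_state)
        return all(verify(rest, frozenset(b), step, goals) for b in by_obs.values())
    _, fact, then_branch, else_branch = head
    # The agent knows `fact` iff it holds in every state she considers possible.
    branch = then_branch if all(fact(s) for s in belief) else else_branch
    return verify(branch + rest, belief, step, goals)


# Toy model: a state is (location, strike); the radio is assumed reliable.
def step(state, action):
    location, strike = state
    if action == "listenRadio":
        return {(state, "announcement" if strike else "silence")}
    if action == "toStation":
        return {(("station", strike), None)}
    if action == "toAirport":
        return {(("airport", strike), None)}
    raise ValueError(action)


def strike(state):
    return state[1]


initial = frozenset({("home", True), ("home", False)})
goals = {("station", True), ("airport", False)}
kbp = ("listenRadio", ("if", strike, ("toStation",), ("toAirport",)))
```

Here `verify(kbp, initial, step, goals)` holds, while the program consisting of `toAirport` alone fails because it ends at the airport during a strike. The multi-agent problem from the slide additionally needs higher-order knowledge, which this simplification does not capture.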

  27. Verification problem for while-free KBPs. Theorem: the verification problem for while-free KBPs is PSPACE-complete. Proof idea. Upper bound: on-the-fly model checking; lower bound: reduction from TQBF. (Figure: agents 1, 2, 3 paired with the values of p_1, p_2, p_3.)
