
Reasoning about Hypothetical Agent Behaviours and their Parameters

Stefano Albrecht and Peter Stone. Sections: Introduction, Approach, Experiments.


  1. Reasoning about Hypothetical Agent Behaviours and their Parameters. Stefano Albrecht and Peter Stone.

  2. Introduction

  3. Motivation: Ad Hoc Teamwork. Design an individual agent that can collaborate effectively with other agents without pre-coordination. Flexibility: the ability to collaborate with different teammates. Efficiency: finding an effective policy quickly. Reference: AAAI 2010 Challenge Paper (Stone et al.).

  4. Motivation: Ad Hoc Teamwork. Design an individual agent that can collaborate effectively with other agents without pre-coordination. Flexibility: the ability to collaborate with different teammates. Efficiency: finding an effective policy quickly. Related: Multiagent Interaction without Prior Coordination (MIPC); AAAI 2010 Challenge Paper (Stone et al.); JAAMAS Special Issue on MIPC; AAMAS'17 Workshop on MIPC; mipc.inf.ed.ac.uk.

  5. Type-Based Method. Hypothesise possible types of other agents: each type $\theta_j \in \Theta_j$ is a blackbox behaviour specification $P(a_j \mid H^t, \theta_j)$.

  6. Type-Based Method. Hypothesise possible types of other agents: each type $\theta_j \in \Theta_j$ is a blackbox behaviour specification $P(a_j \mid H^t, \theta_j)$. Compute a belief over types based on the interaction history $H^t$: $P(\theta_j \mid H^t) \propto P(H^t \mid \theta_j)\, P(\theta_j)$.

  7. Type-Based Method. Hypothesise possible types of other agents: each type $\theta_j \in \Theta_j$ is a blackbox behaviour specification $P(a_j \mid H^t, \theta_j)$. Compute a belief over types based on the interaction history $H^t$: $P(\theta_j \mid H^t) \propto P(H^t \mid \theta_j)\, P(\theta_j)$. Plan own action with respect to the belief over types.
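
The belief computation on this slide can be sketched in a few lines. The sketch below is illustrative only: the type models `mostly_left` and `uniform`, the action names, and the fallback used when no type explains an observation are all assumptions, not part of the slides. It applies the update one observation at a time; multiplying the per-step action likelihoods gives $P(H^t \mid \theta_j)$, so the result matches the batch formula above.

```python
from typing import Callable, Dict, List

# Hypothetical blackbox type: returns P(action | history) for the modelled agent.
TypeModel = Callable[[List[str], str], float]


def update_beliefs(
    beliefs: Dict[str, float],
    types: Dict[str, TypeModel],
    history: List[str],
    action: str,
) -> Dict[str, float]:
    """One Bayesian step: posterior is proportional to likelihood times prior."""
    weights = {name: types[name](history, action) * prior
               for name, prior in beliefs.items()}
    total = sum(weights.values())
    if total == 0.0:
        # Assumption: if no type explains the action, keep the old beliefs.
        return dict(beliefs)
    return {name: w / total for name, w in weights.items()}


# Two made-up types, used only for illustration.
def mostly_left(history: List[str], action: str) -> float:
    return 0.9 if action == "left" else 0.1


def uniform(history: List[str], action: str) -> float:
    return 0.5


types = {"mostly_left": mostly_left, "uniform": uniform}
beliefs = {"mostly_left": 0.5, "uniform": 0.5}
history: List[str] = []
for observed in ["left", "left", "right"]:
    beliefs = update_beliefs(beliefs, types, history, observed)
    history.append(observed)
print(beliefs)
```

Because each type is only ever called through `types[name](history, action)`, any model can be plugged in, which is the blackbox property the slide emphasises.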

  8. Type-Based Method

  9. Type-Based Method. HBA (Albrecht & Ramamoorthy, AIJ'16); PLASTIC (Barrett & Stone, AIJ'16).

  10. Type-Based Method and Parameters. The type-based method is useful for ad hoc teamwork. Flexible: can hypothesise any types. Efficient: can learn the true type with few observations. But...

  11. Type-Based Method and Parameters. The type-based method is useful for ad hoc teamwork. Flexible: can hypothesise any types. Efficient: can learn the true type with few observations. But... Limitation: the method does not recognise parameters in types! Complex behaviours often have parameters. To reason about n parameter settings, we have to store n copies of the same type with different parameter settings ⇒ inefficient, does not scale.

  12. Type-Based Method and Parameters. Goal in this work: devise a method that allows the agent to reason about both the relative likelihood of types and the values of bounded continuous parameters in types.

  13. Type-Based Method and Parameters. Goal in this work: devise a method that allows the agent to reason about both the relative likelihood of types and the values of bounded continuous parameters in types. Keep the blackbox nature of types (can be any model). Work with any continuous parameters in types.

  14. Approach

  15. Approach. For each $\theta_j \in \Theta_j$, maintain a parameter estimate $p \in [p_{\min}, p_{\max}]^n$. Update the estimates after new observations.

  16. Approach. For each $\theta_j \in \Theta_j$, maintain a parameter estimate $p \in [p_{\min}, p_{\max}]^n$. Update the estimates after new observations. Updating an estimate incurs two computational costs:

  17. Approach. For each $\theta_j \in \Theta_j$, maintain a parameter estimate $p \in [p_{\min}, p_{\max}]^n$. Update the estimates after new observations. Updating an estimate incurs two computational costs: (1) Computing the new parameter estimate. Types are blackboxes, so the effects of parameters must be sampled ⇒ need general, efficient estimation methods.

  18. Approach. For each $\theta_j \in \Theta_j$, maintain a parameter estimate $p \in [p_{\min}, p_{\max}]^n$. Update the estimates after new observations. Updating an estimate incurs two computational costs: (1) Computing the new parameter estimate. Types are blackboxes, so the effects of parameters must be sampled ⇒ need general, efficient estimation methods. (2) Adjusting the internal state of the type. The state may depend on the history of observations and parameter values ⇒ a new estimate may introduce model inconsistency.

  19. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$.

  20. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$. Select types $\Phi \subset \Theta_j$ for updating.

  21. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$. Select types $\Phi \subset \Theta_j$ for updating. For each $\theta_j \in \Phi$, update the estimate $p^{t-1} \to p^t$.

  22. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$. Select types $\Phi \subset \Theta_j$ for updating. For each $\theta_j \in \Phi$, update the estimate $p^{t-1} \to p^t$. Update beliefs: $P(\theta_j \mid H^t) \propto P(a_j^{t-1} \mid H^{t-1}, \theta_j, p^t)\, P(\theta_j \mid H^{t-1})$.

  23. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$. Select types $\Phi \subset \Theta_j$ for updating. For each $\theta_j \in \Phi$, update the estimate $p^{t-1} \to p^t$. Update beliefs: $P(\theta_j \mid H^t) \propto P(a_j^{t-1} \mid H^{t-1}, \theta_j, p^t)\, P(\theta_j \mid H^{t-1})$. Plan own action.

  24. Approach: Selective Parameter Updating. Observe action $a_j^{t-1}$ of agent $j$. Select types $\Phi \subset \Theta_j$ for updating. For each $\theta_j \in \Phi$, update the estimate $p^{t-1} \to p^t$. Update beliefs: $P(\theta_j \mid H^t) \propto P(a_j^{t-1} \mid H^{t-1}, \theta_j, p^t)\, P(\theta_j \mid H^{t-1})$. Plan own action.
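
The update cycle on these slides (observe, select types, update estimates, update beliefs) can be sketched as below. Several pieces are assumptions made for illustration and are not taken from the slides: the selection rule (update only the currently most probable type), the estimator (a grid search for the parameter value that maximises the likelihood of all observed actions, used here as a simple stand-in), the scalar parameter, and the two example types `biased` and `sticky`. The planning step is omitted.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical parameterised type: P(action | history, p) for a scalar parameter p.
ParamType = Callable[[List[str], str, float], float]


def select_types(beliefs: Dict[str, float]) -> List[str]:
    """Assumed selection rule: update only the most probable type."""
    return [max(beliefs, key=beliefs.get)]


def sequence_likelihood(model: ParamType, actions: List[str], p: float) -> float:
    """Likelihood of the whole action sequence under parameter value p."""
    result = 1.0
    for t, action in enumerate(actions):
        result *= model(actions[:t], action, p)
    return result


def update_estimate(model: ParamType, actions: List[str],
                    p_min: float, p_max: float, samples: int = 21) -> float:
    """Stand-in estimator: best value on an evenly spaced grid over [p_min, p_max]."""
    grid = [p_min + (p_max - p_min) * k / (samples - 1) for k in range(samples)]
    return max(grid, key=lambda p: sequence_likelihood(model, actions, p))


def step(types: Dict[str, ParamType], estimates: Dict[str, float],
         beliefs: Dict[str, float], history: List[str], action: str,
         p_min: float = 0.0, p_max: float = 1.0
         ) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Process one observed action; returns (new estimates, new beliefs)."""
    new_estimates = dict(estimates)
    for name in select_types(beliefs):
        new_estimates[name] = update_estimate(
            types[name], history + [action], p_min, p_max)
    # Belief update uses the likelihood of the latest action under the new estimates.
    weights = {n: types[n](history, action, new_estimates[n]) * beliefs[n]
               for n in types}
    total = sum(weights.values())
    new_beliefs = ({n: w / total for n, w in weights.items()}
                   if total > 0.0 else dict(beliefs))
    return new_estimates, new_beliefs


# Two made-up parameterised types, used only for illustration.
def biased(history: List[str], action: str, p: float) -> float:
    return p if action == "left" else 1.0 - p


def sticky(history: List[str], action: str, p: float) -> float:
    if not history:
        return 0.5
    return p if action == history[-1] else 1.0 - p


types = {"biased": biased, "sticky": sticky}
estimates = {"biased": 0.5, "sticky": 0.5}
beliefs = {"biased": 0.5, "sticky": 0.5}
history: List[str] = []
for observed in ["left", "left", "left"]:
    estimates, beliefs = step(types, estimates, beliefs, history, observed)
    history.append(observed)
print(estimates, beliefs)
```

Updating only the selected subset keeps the per-step cost bounded, at the price of leaving the estimates of unselected types stale until they are chosen.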

  25. Updating Parameter Estimates. Given type $\theta_j$, update the parameter estimate $p^{t-1} \to p^t$. The type defines action likelihoods $P(a_j^{t-1} \mid H^{t-1}, \theta_j, p)$. [Figure: likelihood surfaces $P(a_j^0 \mid H^0, \theta_j, p_1, p_2)$, $P(a_j^1 \mid H^1, \theta_j, p_1, p_2)$ and $P(a_j^2 \mid H^2, \theta_j, p_1, p_2)$ plotted over parameters $p_1$ and $p_2$, each axis from -5 to 5.]

  26. Approximate Bayesian Updating (ABU). Idea: construct the Bayesian update using polynomials. Maintain the prior $P(p \mid H^{t-1}, \theta_j)$, represented as a polynomial.

  27. Approximate Bayesian Updating (ABU). Idea: construct the Bayesian update using polynomials. Maintain the prior $P(p \mid H^{t-1}, \theta_j)$, represented as a polynomial. Approximate the likelihood $f(p) = P(a_j^{t-1} \mid H^{t-1}, \theta_j, p)$ as a polynomial by sampling over $p$.
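
A minimal one-parameter sketch of the polynomial idea follows. The slides shown here state only that the prior is kept as a polynomial and that the likelihood is approximated as a polynomial by sampling over $p$. The rest of the sketch is an assumption about how a Bayesian update could then proceed: the likelihood is interpolated at evenly spaced samples, multiplied with the prior, and normalised by its integral over $[p_{\min}, p_{\max}]$. All function names are hypothetical.

```python
from typing import Callable, List

# Polynomials are coefficient lists: c[0] + c[1]*p + c[2]*p**2 + ...


def fit_polynomial(f: Callable[[float], float], p_min: float, p_max: float,
                   degree: int) -> List[float]:
    """Interpolate f at degree+1 evenly spaced samples (degree must be >= 1)."""
    n = degree + 1
    xs = [p_min + (p_max - p_min) * k / degree for k in range(n)]
    # Augmented Vandermonde system, solved by Gauss-Jordan elimination.
    rows = [[x ** i for i in range(n)] + [f(x)] for x in xs]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def multiply(a: List[float], b: List[float]) -> List[float]:
    """Product of two polynomials (convolution of coefficients)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integrate(c: List[float], lo: float, hi: float) -> float:
    """Definite integral of the polynomial over [lo, hi]."""
    return sum(ci * (hi ** (i + 1) - lo ** (i + 1)) / (i + 1)
               for i, ci in enumerate(c))


def bayes_step(prior: List[float], likelihood: Callable[[float], float],
               p_min: float, p_max: float, degree: int) -> List[float]:
    """Assumed update: posterior = prior * fitted likelihood, then normalise."""
    post = multiply(prior, fit_polynomial(likelihood, p_min, p_max, degree))
    z = integrate(post, p_min, p_max)
    return [c / z for c in post]


# Uniform prior on [0, 1]; made-up likelihood f(p) = p for the observed action.
posterior = bayes_step([1.0], lambda p: p, 0.0, 1.0, 2)
print(posterior)
```

In this form the degree of the posterior grows with every product, so a practical version would need some way to keep it bounded. A fitted polynomial can also dip below zero, which a real implementation would have to handle.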
