Information Sharing for Distributed Planning
Prasanna Velagapudi
AAMAS 2010 - Doctoral Symposium

Large Heterogeneous Teams
• 100s to 1000s of robots, agents, people
• Complex, collaborative tasks
• Dynamic, uncertain environment
• Joint planning intractable

Scaling Team Planning
• Independent planners: can’t account for teammates
• Existing work: needs specific structure or doesn’t scale to these sizes
  – DPC, Prioritized Planning
  – JESP, Factored MDP, ND-POMDP

Iterated Distributed Planning
1. Factor the problem, enumerate interactions
2. Compute independent plans & potential interactions
3. Exchange messages about interactions
4. Use exchanged information, improve local model

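The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the planner from the talk: the `Agent` class, the route-choice domain, the `PENALTY` value, and the name-based precedence rule are all hypothetical and chosen only to make the loop concrete.

```python
from dataclasses import dataclass, field

PENALTY = 10.0  # hypothetical cost added to a route claimed by a teammate


@dataclass
class Agent:
    name: str
    route_costs: dict  # step 1: this agent's factored local model (route -> cost)
    penalties: dict = field(default_factory=dict)

    def compute_plan(self):
        """Step 2: plan independently against the current local model."""
        return min(
            self.route_costs,
            key=lambda r: (self.route_costs[r] + self.penalties.get(r, 0.0), r),
        )

    def update_model(self, messages):
        """Step 4: fold teammates' messages into the local model.

        Assumption: only agents with a smaller name take precedence, which
        keeps this toy example from oscillating.
        """
        new_penalties = {
            route: PENALTY for sender, route in messages if sender < self.name
        }
        changed = new_penalties != self.penalties
        self.penalties = new_penalties
        return changed


def iterate_planning(agents, max_rounds=20):
    """Repeat steps 2 to 4 until no agent's local model changes."""
    plans = {}
    for round_number in range(1, max_rounds + 1):
        plans = {a.name: a.compute_plan() for a in agents}    # step 2
        messages = list(plans.items())                        # step 3
        changed = [a.update_model(messages) for a in agents]  # step 4
        if not any(changed):
            return plans, round_number
    return plans, max_rounds


def make_team():
    # Hypothetical three-agent problem: sharing a route is the only interaction.
    return [
        Agent("a", {"r1": 1.0, "r2": 5.0}),
        Agent("b", {"r1": 1.0, "r2": 2.0}),
        Agent("c", {"r1": 2.0, "r2": 3.0, "r3": 4.0}),
    ]
```

Calling `iterate_planning(make_team())` settles on one route per agent after three rounds. Without the precedence rule, two agents could keep swapping routes, which is the kind of oscillation a real scheme has to handle.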
A Tale of Two Distributed Planners
• Distributed Prioritized Planning (DPP)
• L-TREMOR
[Figure: illustrative grid map shown under the DPP heading]

Distributed Prioritized Planning

Multiagent Path Planning
[Figure: small grid map with labeled Start and Goal positions]
[Figure: larger grid map, both axes running from 5 to 40]

Prioritized Planning [van den Berg, et al 2005]
• Assign priorities to agents based on path length
• Plan from highest priority to lowest priority
• Use previous agents as dynamic obstacles

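A minimal sketch of prioritized planning on a grid follows. It makes several assumptions that the slides do not state: a 4-connected grid with a wait action, breadth-first search in (cell, time) space as the single-agent planner, longest solo path planned first, agents that remain on their goal cell after arriving, and every agent being able to reach its goal when planning alone. The names `plan_path`, `prioritized_planning`, `GRID`, and `TASKS` are hypothetical.

```python
from collections import deque

MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait + 4-connected moves


def position(path, t):
    """Cell occupied at time t; an agent stays on its last cell forever."""
    return path[min(t, len(path) - 1)]


def plan_path(grid, start, goal, obstacles, horizon=64):
    """BFS over (cell, time), treating earlier agents' paths as dynamic obstacles."""
    rows, cols = len(grid), len(grid[0])
    last = max((len(p) for p in obstacles), default=0)
    parent = {(start, 0): None}
    queue = deque([(start, 0)])
    while queue:
        cell, t = queue.popleft()
        # Accept the goal only if no earlier agent passes through it later.
        if cell == goal and all(
            position(p, k) != goal for p in obstacles for k in range(t, last)
        ):
            path, node = [], (cell, t)
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if t >= horizon:
            continue
        for dr, dc in MOVES:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == "#" or (nxt, t + 1) in parent:
                continue
            # Reject vertex conflicts and head-on swaps with earlier agents.
            if any(
                position(p, t + 1) == nxt
                or (position(p, t) == nxt and position(p, t + 1) == cell)
                for p in obstacles
            ):
                continue
            parent[(nxt, t + 1)] = (cell, t)
            queue.append((nxt, t + 1))
    return None


def prioritized_planning(grid, tasks):
    """Plan agents one at a time, highest priority first."""
    solo = {n: plan_path(grid, s, g, []) for n, (s, g) in tasks.items()}
    order = sorted(tasks, key=lambda n: -len(solo[n]))  # assumed: longest first
    paths = {}
    for name in order:
        start, goal = tasks[name]
        paths[name] = plan_path(grid, start, goal, list(paths.values()))
    return order, paths


# Hypothetical example: two agents swap ends of a one-cell-wide corridor.
GRID = ["....",
        ".##.",
        "...."]
TASKS = {"a": ((0, 0), (0, 3)), "b": ((0, 3), (0, 0))}
```

In this example agent `a` is planned first and takes the corridor, so agent `b` has to route around through the bottom row.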
Distributed Prioritized Planning
• Parallelizable & Equivalent

Large-Scale Path Solutions

DPP Results
• Fewer Sequential Plans
  [Figure: number of sequential planning iterations (0 to 15) vs. number of robots (50 to 200)]
• Longer Planning Time
  [Figure: proportion of centralized planning time (1 to 5) vs. number of robots (50 to 200)]

Why does this happen?
• Longest-planning agents might replan multiple times
• Individual agent planning times varied by >2 orders of magnitude
• Solution 1: Prioritize by plan time?
• Solution 2: Incremental Planning
[Figure: two diagrams, one for Prioritized Planning and one for DPP, each showing agents A, B, C, D]

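The effect described above can be illustrated with a small back-of-the-envelope model. The numbers and the worst-case assumption are hypothetical: agent A is taken to plan 100 times slower than the others, and in parallel iteration k every agent of priority rank k or lower is assumed to replan, so each iteration costs as much as its slowest replanning agent.

```python
def sequential_time(plan_times, order):
    """Centralized prioritized planning: each agent plans once, one after another."""
    return sum(plan_times[agent] for agent in order)


def dpp_worst_case_time(plan_times, order):
    """Assumed worst case for parallel iterations.

    In iteration k (0-based), agents order[k:] replan in parallel, so the
    iteration takes as long as the slowest of them.
    """
    return sum(
        max(plan_times[agent] for agent in order[k:])
        for k in range(len(order))
    )


# Hypothetical planning times in seconds; A is two orders of magnitude slower.
PLAN_TIMES = {"A": 100.0, "B": 1.0, "C": 1.0, "D": 1.0}

SLOW_AGENT_LAST = ["B", "C", "D", "A"]   # A has the lowest priority
SLOW_AGENT_FIRST = ["A", "B", "C", "D"]  # "Solution 1": prioritize by plan time
```

With the slow agent at the lowest priority, the parallel scheme costs 400 s in this model against 103 s sequentially, because A replans in every iteration. Putting A first brings the parallel cost back to 103 s, which is the intuition behind prioritizing by plan time.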
Summary of DPP
• Observable, certain world
• Only one type of interaction: collision
• Far fewer sequential planning iterations
• Incremental planning may reduce execution time

L-TREMOR

A Simple Rescue Domain
[Figure: grid map with labels: Unsafe Cell, Rescue Agent, Clearable Debris, Narrow Corridor, Victim, Cleaner Agent]

A Simple (Large) Rescue Domain

Distributed POMDP with Coordination Locales (DPCL) [Varakantham, et al 2009]
• Often, interactions between agents are sparse
  [Figure callouts: "Only fits one agent", "Passable if cleaned"]
• Define coordination locales (CLs) where POMDP model functions are not independent
  – POMDP model: <S, A, Ω, P, R, O> = (states, actions, observations, transition, reward, observation function)
  – Outside CL (typical): shared S_global; agent 1 has S_1, A_1 with R_1, P_1, O_1; agent 2 has S_2, A_2 with R_2, P_2, O_2
  – Inside CL (interaction): shared S_global; agents keep S_1, A_1 and S_2, A_2, but the model functions are joint: R_12, P_12, O_12

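The distinction between independent and joint model functions can be sketched for the reward function alone. This is an illustrative simplification, not the DPCL formulation itself: the `CoordinationLocale` class, the reward values, and the rule that an interaction occurs when both agents occupy a locale cell at the same step are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinationLocale:
    cells: frozenset  # states in which the two agents' models are coupled
    penalty: float    # hypothetical joint reward adjustment inside the locale


def local_reward(cell, goal):
    """Independent reward R_i: step cost everywhere, bonus at the agent's goal."""
    return 10.0 if cell == goal else -1.0


def in_locale(s1, s2, locale):
    """Assumed interaction test: both agents are inside the locale together."""
    return s1 in locale.cells and s2 in locale.cells


def joint_reward(s1, s2, goals, locale):
    """R_1 + R_2 outside the locale; a coupled R_12 inside it."""
    reward = local_reward(s1, goals[0]) + local_reward(s2, goals[1])
    if in_locale(s1, s2, locale):
        reward += locale.penalty
    return reward


# Hypothetical narrow corridor that only fits one agent.
CORRIDOR = CoordinationLocale(cells=frozenset({(1, 1)}), penalty=-20.0)
GOALS = ((0, 2), (2, 0))
```

Outside the corridor the joint reward is just the sum of the two local rewards, so each agent can plan on its own model. Only when both agents are in the corridor does the extra term appear, which is why sparse interactions keep most of the problem factored.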
TREMOR [Varakantham, et al 2009]
• Role Allocation: Branch & Bound MDP
• Policy Solution: Independent EVA [3] solvers
• Interaction Detection: Joint policy evaluation
• Coordination: Reward shaping of independent models

L-TREMOR (Distributed & Parallelizable)
• Role Allocation: Decentralized Auction (TREMOR: Branch & Bound MDP)
• Policy Solution: Independent EVA [3] solvers (as in TREMOR)
• Interaction Detection: Sampling & message passing (TREMOR: Joint policy evaluation)
• Coordination: Reward shaping of independent models (as in TREMOR)

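The combination of sampling, message passing, and reward shaping can be sketched as follows. This is a simplified illustration under assumptions, not the actual L-TREMOR update: one agent is assumed to estimate from sampled trajectories how often it occupies a coordination locale, send that probability to a teammate, and the teammate is assumed to add the probability-weighted interaction value to its own reward at that locale. The function names and numbers are hypothetical.

```python
def estimate_cl_probability(trajectories, locale):
    """Fraction of sampled trajectories that visit the locale.

    Each trajectory is a list of (time, cell) pairs; the locale is one such pair.
    """
    hits = sum(1 for trajectory in trajectories if locale in trajectory)
    return hits / len(trajectories)


def shape_reward(reward, locale, probability, interaction_value):
    """Return a copy of the reward table with the locale's reward adjusted.

    Assumed shaping rule: expected interaction value = probability * value.
    """
    shaped = dict(reward)
    shaped[locale] = reward.get(locale, 0.0) + probability * interaction_value
    return shaped


# Hypothetical data: the sender enters the corridor cell (1, 1) at time 2
# in three of its four sampled trajectories.
CORRIDOR_AT_T2 = (2, (1, 1))
SAMPLES = [
    [(0, (0, 0)), (1, (0, 1)), (2, (1, 1))],
    [(0, (0, 0)), (1, (0, 1)), (2, (1, 1))],
    [(0, (0, 0)), (1, (1, 0)), (2, (1, 1))],
    [(0, (0, 0)), (1, (1, 0)), (2, (2, 0))],
]

message = estimate_cl_probability(SAMPLES, CORRIDOR_AT_T2)
receiver_reward = {CORRIDOR_AT_T2: -1.0}
shaped = shape_reward(receiver_reward, CORRIDOR_AT_T2, message, -20.0)
```

The receiver then replans against `shaped` with its independent solver, so no joint policy ever has to be evaluated centrally.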
Preliminary Results – Joint Utility
[Figure: three panels plotting empirical joint reward against iteration for L-TREMOR and Independent planning; panels labeled N = 6, N = 10, and N = 100 (structurally similar to N=10)]

Preliminary Results – Timing
[Figure: planning time per agent (s), 0 to 50, against iteration, 0 to 20, for three scenarios: n = 5, complex; n = 10, tall; n = 100, tall]

Preliminary Results – Model Accuracy
[Figure: improvement over independent policy against error between actual and expected value, for three scenarios: n = 5, complex; n = 10, tall; n = 100, tall; R = 0.804]

Current Issues
• Oscillations in solutions
• Discovery of relevant locales

Summary of L-TREMOR
• Partially-observable, uncertain world
• Multiple types of interactions
• Role-allocation of tasks
• Improvement over independent planning
• Handles large problems
• Next steps: improving convergence

Conclusions
• Two approaches to distributed planning
  – DPP: approaching centralized performance
  – L-TREMOR: exceeding joint tractability
• Analogous strategies for distributing planning
  – Both iterate independent planners
  – Both exchange messages about states, actions

Future Work
• Generalized framework for distributed planning through iterative message exchange
• Reduce necessary communication
• Better search over task allocations
• Scaling to larger team sizes