
Open World Planning for Robots via Hindsight Optimization



  1. Open World Planning for Robots via Hindsight Optimization
     Scott Kiesel (1), Ethan Burns (1), Wheeler Ruml (1), J. Benton (2), Frank Kreimendahl (1)
     We are grateful for funding from the DARPA CSSG program (grant HR0011-09-1-0021) and NSF (grant IIS-0812141).
     Scott Kiesel (UNH) Open World Planning for Robots – 1 / 19

  2. Open World Planning - Search and Rescue
     Outline: Introduction (Open World, Search & Rescue, Previous Approaches, Hindsight Opt); OH-wOW; Results; Conclusion

  3. Search and Rescue Domain
     ■ Robot agent
     ■ Unknown building/map layout
     ■ Unknown victim locations
     ■ Unknown number of victims
     ■ Search time limit

  4. Previous Approaches
     ■ Talamadupula et al. (ICAPS ’09, AAAI ’10, TIST ’10): ad-hoc assumption roomExists(x) → personExistsIn(x)
     ■ Joshi et al. (ICRA ’12): based on FODD approximations; hours of offline planning
     ■ Optimization in Hindsight with Open Worlds (OH-wOW): general, principled, easy to implement (and extend)

  5. Hindsight Optimization
     Select action that maximizes expected reward.
     reward = cumulative reward following optimal plan

  6. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward = reward of the plan, among all possible plans, with the best average reward over all configurations
     $$V^*(s_1) = \min_{A = \langle a_1, \ldots, a_{|A|} \rangle} \mathbb{E}_{\langle s_2, \ldots, s_{|A|} \rangle} \left[ \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$

  7. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward ≈ reward of the plan, among all possible plans, with the best average reward across sampled configurations
     $$\hat{V}(s_1) = \min_{A = \langle a_1, \ldots, a_{|A|} \rangle} \mathbb{E}_{\langle s_2, \ldots, s_{|A|} \rangle} \left[ \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$

  8. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward ≈ average reward of the best plan in each sampled configuration
     $$\hat{V}(s_1) = \mathbb{E}_{\langle s_2, s_3, \ldots \rangle} \left[ \min_{A = \langle a_1, \ldots, a_{|A|} \rangle} \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$
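     The difference between the last two formulations (minimizing over plans outside the expectation versus inside it) can be illustrated with a toy calculation. This is only a sketch: the plan names and numbers are made up, and values are treated as costs to minimize, matching the min in the formulas above.

```python
from statistics import mean

# Hypothetical toy data: cost[plan][w] is the total cost of following a
# fixed plan in sampled world configuration w.
cost = {
    "search_north": [2.0, 9.0, 4.0],
    "search_south": [8.0, 3.0, 5.0],
}

def best_fixed_plan_value(cost):
    """min over plans of the average cost across sampled worlds."""
    return min(mean(per_world) for per_world in cost.values())

def hindsight_value(cost):
    """Average over sampled worlds of the best plan's cost in that world."""
    n_worlds = len(next(iter(cost.values())))
    return mean(min(per_world[w] for per_world in cost.values())
                for w in range(n_worlds))

fixed = best_fixed_plan_value(cost)   # best single plan on average
hindsight = hindsight_value(cost)     # best plan chosen per world, then averaged
```

     Because the hindsight estimate picks the best plan separately in each sampled world, it is never worse (here, never higher in cost) than the value of the best single plan, which is why it is an optimistic approximation.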

  9. OH-wOW: Optimization in Hindsight with Open Worlds

  10. OH-wOW Implementation for Search and Rescue
      1. Sense
      2. Sample
      3. Plan
      4. Act
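      The four steps above can be read as an online control loop. The following is a minimal sketch, not the authors' implementation: every function name, the `None` termination signal, and the stub robot are assumptions made for illustration.

```python
def run_episode(sense, sample_worlds, choose_action, act, max_steps):
    """Repeat sense, sample, plan, act until sensing reports completion."""
    executed = []
    for _ in range(max_steps):
        belief = sense()                        # 1. Sense
        if belief is None:                      # hypothetical "episode over" signal
            break
        worlds = sample_worlds(belief)          # 2. Sample
        action = choose_action(belief, worlds)  # 3. Plan
        act(action)                             # 4. Act
        executed.append(action)
    return executed

# Stub usage: a robot that must move forward three cells.
state = {"pos": 0}

def sense():
    return None if state["pos"] >= 3 else dict(state)

def sample_worlds(belief):
    return [belief, belief]

def choose_action(belief, worlds):
    return "forward"

def act(action):
    state["pos"] += 1

trace = run_episode(sense, sample_worlds, choose_action, act, max_steps=10)
```

      Passing the four steps in as callables keeps the loop independent of any particular sensor stack or planner.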

  11. Sensing and Observations
      ■ SLAM (ROS gmapping): laser rangefinder
      ■ Topological Map: rough construction
      ■ Person Detector

  12. Sensing and Observations
      [Figure: Sensed Occupancy Grid with Topological Graph Overlaid]

  13. Sampling Possible Worlds
      ■ Current Knowledge: observed; known to be true
      ■ Expectation: prior domain knowledge; bias
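      One way to picture this step: observed facts are copied unchanged into every sample, and only the unknown parts are drawn from a prior. The sketch below assumes a much simpler world than the deck's domain (a fixed set of rooms, each with an independent victim probability `p_victim`); the function and room names are hypothetical.

```python
import random

def sample_world(observed, unknown_rooms, p_victim, rng):
    """Complete a partial world: keep observations, sample the rest from a prior."""
    world = dict(observed)                      # current knowledge stays fixed
    for room in unknown_rooms:
        world[room] = rng.random() < p_victim   # expectation: prior over victims
    return world

observed = {"room_a": False, "room_b": True}    # True means a victim was seen
samples = [
    sample_world(observed, ["room_c", "room_d"], 0.3, random.Random(seed))
    for seed in range(32)
]
```

      A domain-knowledge bias could be expressed in this sketch by giving different rooms different probabilities rather than a single `p_victim`.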

  14. Sampling Possible Worlds
      [Figure: Known Partial World State]

  15. Sampling Possible Worlds
      [Figure: Sampled “Complete” World State]

  16. Planning in Sampled Worlds
      ■ Fully Known
      ■ Deterministic
      ■ Classical Planners or
      ■ Domain Specific Planners

  17. Planning in Sampled Worlds
      [Figure: A Single Sample]

  18. Acting in Sampled Worlds
      ■ Execute Best Currently Available Action: maximize expected reward
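      Action selection of this kind can be sketched as: for each available action, average the value of the best plan that starts with that action over all sampled worlds, then execute the best-scoring action. In the sketch below values are costs to minimize (negating them gives reward maximization), and `plan_cost` stands in for a call to a deterministic planner; the one-dimensional corridor is a made-up example.

```python
from statistics import mean

def choose_action(actions, worlds, plan_cost):
    """Return the action with the lowest average cost-to-go over sampled worlds."""
    def average_cost(action):
        return mean(plan_cost(action, world) for world in worlds)
    return min(actions, key=average_cost)

# Hypothetical 1-D corridor: the robot starts at 0 and each sampled world is
# a victim position. Cost = one step for the action plus remaining distance.
def plan_cost(action, victim_pos):
    robot = -1 if action == "left" else 1
    return 1 + abs(victim_pos - robot)

worlds = [3, 4, -2]
best = choose_action(["left", "right"], worlds, plan_cost)
```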

  19. Acting in Sampled Worlds

  20. Results: Rescue; Rescue (sim); Omelette (sim)

  21. Search and Rescue
      UNH CS Offices, Pioneer 3-DX, SICK LMS500, ROS Fuerte

      deadline      victims found
                    0    1    2    3
      1 minute      4    6    0    0
      5 minutes     0    7    3    0
      10 minutes    0    3    4    3

      Joshi et al.: 4 hours precomputation, 3 victims; constant time table lookup
      OH-wOW: no precomputation
        0.18 sec avg max step time, 3 victims (256 samples)
        2.7 sec avg max step time, 10 victims (256 samples)

  22. Search and Rescue
      OH-wOW:
      ■ is online,
      ■ computes the next action quickly,
      ■ and handles the tradeoff between hard and soft goals.

  23. Search and Rescue in Simulation
      [Figure: cost over optimal (axis 0 to 10) for 32, 256, and ctlr, grouped by none, south, and southwest]
      OH-wOW:
      ■ leverages domain specific knowledge,
      ■ and can beat a handcoded controller.

  24. Omelette Domain in Simulation
      Levesque (AAAI ’96)

      planning time (seconds)
                                  3 eggs   step    4 eggs   step
      Bonet et al. (IJCAI ’01)    185      -       -        -
      Levesque (IJCAI ’05)        1.4      -       1,681    -
      OH-wOW                      12.9     0.52    76.7     1.57

      Levesque plans are longer than OH-wOW plans.
      OH-wOW:
      ■ is online,
      ■ computes the next action quickly,
      ■ and finds cheaper solutions.

  25. Conclusion: Limitations; Summary; Advertising

  26. Limitations
      ■ Scalability of the underlying planner: leverage large body of literature
      ■ Calls underlying planner repeatedly: embarrassingly parallel
      ■ Vulnerable to black swans during sampling: importance sampling
      ■ Regenerates world samples at every step: reuse samples until world “changes”
      (see Yoon et al. ICAPS ’10 for HO Optimizations)
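      The importance sampling remedy named above can be sketched on a two-outcome toy problem: draw the rare, expensive world far more often than its true probability, then reweight each draw by the ratio of true to proposal probability so the estimate stays unbiased. All parameter names and numbers here are assumptions for illustration only.

```python
import random

def importance_sampled_cost(p_rare, q_rare, cost_rare, cost_common, n, rng):
    """Estimate expected cost under p while drawing worlds from proposal q."""
    total = 0.0
    for _ in range(n):
        if rng.random() < q_rare:
            # Rare world, oversampled: weight by p / q.
            total += (p_rare / q_rare) * cost_rare
        else:
            # Common world, undersampled: weight by (1 - p) / (1 - q).
            total += ((1 - p_rare) / (1 - q_rare)) * cost_common
    return total / n

# True expectation: 0.01 * 1000 + 0.99 * 1 = 10.99
estimate = importance_sampled_cost(0.01, 0.5, 1000.0, 1.0, 20000, random.Random(0))
```

      With plain sampling at a 1% rate, a small batch of samples could easily contain no rare world at all; the proposal makes it appear in about half the draws.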

  27. Summary
      The OH-wOW framework is a:
      ■ Fast,
      ■ Simple,
      ■ General,
      ■ Online,
      ■ Approximate,
      ■ Way of Handling Open Worlds.
