Formalizing Connections Between Motion Planning and Machine Learning


  1. Formalizing Connections Between 
 Motion Planning and Machine Learning. Siddhartha Srinivasa, Boeing Endowed Professor, University of Washington

  2. Problems I Want You 
 to Solve So I Can Retire. Siddhartha Srinivasa, Retired Boeing Endowed Professor, University of Washington

  3. Motion Planning

  4. 6

  5. 7

  6. 8

  7. Motion Planning 
 is a technology

  8. 10-100X Improvement

  9. The Piano Movers’ Problem. On the Piano Movers’ Problem I-III, Schwartz and Sharir, Comm. on Pure and Applied Math., 1983

  10. Roadmaps: Build Roadmap, then Plan on Roadmap. Probabilistic roadmaps for path planning in high-dimensional configuration spaces, Kavraki et al., IEEE Transactions on Robotics and Automation, 1996.


  11. A* Search

  12. A* Search OPTIMAL!! Is it optimal over something we care about?

  13. A* Search: A Personal Journey. Search for Optimal Solutions: the Heart of Heuristic Search is Still Beating, Ariel Felner, ISE Department, Ben-Gurion University, Israel, felner@bgu.ac.il

  14. 16

  15. A* Search: A Personal Journey

  16. A* Search: Amoebas! Optimal Substructure: $f(a) < f(b) \implies f(a \circ x) < f(b \circ x) \;\; \forall x$. You will never catch up. Bellman Condition: $f^*(a) = \min_{x \in \mathrm{succ}(a)} \{ c(a, x) + f^*(x) \}$. Be best, locally. Bacteria Vectors by Vecteezy
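The Bellman condition on this slide can be illustrated with a minimal sketch. It assumes a small acyclic toy graph; `GRAPH`, `GOAL`, and `cost_to_go` are hypothetical names that do not appear in the talk, and the function reads f* as the optimal cost-to-go.

```python
from functools import lru_cache

# Hypothetical toy graph (not from the talk): GRAPH[a][x] is the edge cost c(a, x).
GRAPH = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"b": 1.0, "g": 5.0},
    "b": {"g": 1.0},
    "g": {},
}
GOAL = "g"


@lru_cache(maxsize=None)
def cost_to_go(a):
    """Optimal cost from a to GOAL via f*(a) = min over successors x of c(a, x) + f*(x).

    Assumes the graph is acyclic so the recursion terminates.
    """
    if a == GOAL:
        return 0.0
    successors = GRAPH[a]
    if not successors:
        return float("inf")  # dead end: the goal is unreachable from here
    return min(c + cost_to_go(x) for x, c in successors.items())
```

Each vertex only needs the optimal values of its successors, which is the "be best, locally" message of the slide.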

  17. A* Search: Favoritism. Optimism in the 
 Face of Uncertainty (OFU): $\min_{x \in \mathrm{open}} \; g(x) + h(x)$. Always be optimistic under uncertainty. 
 You’ll either be correct, 
 or learn something important if you’re wrong. R-MAX: A general polynomial time algorithm for near-optimal reinforcement learning, Brafman and Tennenholtz, JMLR, 2002. 
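The expansion rule above (always pop the open vertex with the smallest g(x) + h(x)) can be sketched as a textbook A* loop. This is a minimal illustration, not code from the talk. It assumes a directed graph stored as nested dicts and a consistent heuristic `h`; the names `a_star`, `graph`, and `h` are hypothetical.

```python
import heapq


def a_star(graph, start, goal, h):
    """Textbook A*: repeatedly expand the open vertex minimizing g(x) + h(x).

    graph[x][y] is the cost of edge (x, y); h is assumed consistent,
    so a vertex never needs to be re-expanded once closed.
    """
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(h(start), start)]
    closed = set()
    while open_heap:
        _, x = heapq.heappop(open_heap)
        if x in closed:
            continue  # stale heap entry
        if x == goal:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1], g[goal]
        closed.add(x)
        for y, cost in graph[x].items():
            new_g = g[x] + cost
            if new_g < g.get(y, float("inf")):
                g[y] = new_g
                parent[y] = x
                heapq.heappush(open_heap, (new_g + h(y), y))
    return None, float("inf")
```

With `h` returning 0 everywhere the same loop reduces to Dijkstra's algorithm; a tighter admissible heuristic only reduces the number of expansions.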


  18. A* Search is Optimal … Expands the Fewest Number of Vertices But is this what we 
 really want in Motion Planning?

  19. Edge Evaluation Dominates Planning Time. [Chart: planning time split into Edge Evaluations and Other] Amoebas are Cheap. Slime is Expensive. Lazy collision checking in asymptotically-optimal motion planning, Hauser, ICRA 2015. 


  20. Is there a Search Algorithm 
 that Minimizes 
 the Number of Edge Evaluations? I don’t care about amoebas. What algorithm minimizes slime? LazySP ICAPS 2018 [Best Conference Paper Award Winner] First Provably Edge-Optimal A*-like Search Algorithm. The Provable Virtue of Laziness in Motion Planning, Haghtalab et al., ICAPS 2018. 


  21. LazySP Greedy Best-first Search over Paths To find the shortest path, 
 eliminate all shorter paths!

  22. LazySP: OFU on Steroids! [Flowchart: Graph, start, goal, lazy estimates → Lazy search for shortest path → P → Evaluate Path → Collision Free? → P; otherwise Update the graph and search again]

  23. LazySP: OFU on Steroids! [Flowchart as on slide 22] Send out the Ghost Amoebas

  24. LazySP: OFU on Steroids! [Flowchart as on slide 22] Only Slime Known Shortest Paths

  35. LazySP: OFU on Steroids! [Flowchart as on slide 22] Optimal Slime!

  36. LazySP: OFU on Steroids! [Flowchart as on slide 22]
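The LazySP loop on these slides (lazy search, evaluate the candidate path, update the graph, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is directed and stored as nested dicts, the lazy estimate of an edge equals its true length whenever the edge is collision free, and the Forward selector is hard-coded. The names `dijkstra`, `lazy_sp`, and `is_collision_free` are hypothetical.

```python
import heapq


def dijkstra(graph, start, goal, blocked):
    """Shortest path under the lazy estimates, skipping edges known to be in collision."""
    dist = {start: 0.0}
    parent = {start: None}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, w in graph[u].items():
            if (u, v) in blocked:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return None


def lazy_sp(graph, start, goal, is_collision_free):
    """Return (path, number of edge evaluations); path is None if no feasible path exists."""
    evaluated, blocked = set(), set()
    while True:
        # Lazy search: optimistic shortest path given everything evaluated so far.
        path = dijkstra(graph, start, goal, blocked)
        if path is None:
            return None, len(evaluated)
        unevaluated = [e for e in zip(path, path[1:]) if e not in evaluated]
        if not unevaluated:
            return path, len(evaluated)  # every edge on the path is known to be free
        edge = unevaluated[0]  # Forward selector: first unevaluated edge
        evaluated.add(edge)
        if not is_collision_free(edge):  # the expensive call
            blocked.add(edge)  # update the graph
```

Only edges on candidate shortest paths are ever passed to `is_collision_free`, which is the sense in which the search is lazy.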

  37. Edge Selectors: Forward (first unevaluated edge); Reverse (last unevaluated edge); Alternate (alternate Forward and Reverse); Bisect (unevaluated edge furthest from an evaluated edge)
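The four selectors can be sketched as small functions over the edges of a candidate path. This is an illustrative reading, not code from the talk. Here Bisect is taken to mean "the unevaluated edge furthest from any evaluated edge", and treating the two path endpoints as already-known boundaries is an added assumption. All function names are hypothetical.

```python
def forward(edges, evaluated):
    """First unevaluated edge along the path."""
    return next(e for e in edges if e not in evaluated)


def reverse(edges, evaluated):
    """Last unevaluated edge along the path."""
    return next(e for e in reversed(edges) if e not in evaluated)


def make_alternate():
    """Build a selector that alternates between Forward and Reverse on successive calls."""
    state = {"use_forward": True}

    def alternate(edges, evaluated):
        pick = forward if state["use_forward"] else reverse
        state["use_forward"] = not state["use_forward"]
        return pick(edges, evaluated)

    return alternate


def bisect(edges, evaluated):
    """Unevaluated edge whose index is furthest from any evaluated edge.

    Assumption: positions just before and just after the path count as known.
    Ties go to the edge closest to the start.
    """
    known = [-1] + [i for i, e in enumerate(edges) if e in evaluated] + [len(edges)]
    candidates = [i for i, e in enumerate(edges) if e not in evaluated]
    best = max(candidates, key=lambda i: min(abs(i - k) for k in known))
    return edges[best]
```

Each selector assumes the path still has at least one unevaluated edge, which is the only situation in which the lazy search needs to call it.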

  38. The Realizability Assumption. Can we Learn to Imitate the Oracle? [Diagram: Hypothesis Class = All LazySP Selectors, containing Forward, Alternate, and the Oracle] The Oracle is a LazySP Selector! 
 Leveraging experience in lazy search, Bhardwaj et al., RSS 2019. The Provable Virtue of Laziness in Motion Planning, Haghtalab et al., ICAPS 2018. 


  39. Is there a Search Algorithm 
 that Minimizes 
 the Number of Edge Evaluations? LazySP ICAPS 2018 [Best Conference Paper Award Winner] First Provably Edge-Optimal A*-like Search Algorithm

  40. Anytime Motion Planning [Plot: Solution Cost vs. Computation Time, marking the Feasible Path and the Shortest Path]

  41. Anytime Motion Planning [Plot: Solution Cost vs. Computation Time]

  42. Will it converge to the shortest path? [Plot: Solution Cost vs. Computation Time]

  43. Beyond Asymptotic Optimality [Plot: Solution Cost vs. Computation Time]

  44. Beyond Asymptotic Optimality [Plot: Solution Cost vs. Computation Time, marking Time to Initial Path]

  45. Beyond Asymptotic Optimality [Plot: Solution Cost vs. Computation Time, marking Suboptimality Gap, Time to Initial Path, and Time Budget]

  46. We formalize anytime search as Bayesian Reinforcement Learning. Posterior Sampling for Anytime Motion Planning on 
 Graphs with Expensive-to-Evaluate Edges, Hou et al., ICRA 2020. 

  47. Bayesian Anytime Motion Planning • Evaluating edges uncovers shorter paths • Anytime Objective: cumulative path lengths • Given prior on collision statuses • Bayesian Anytime Objective: • Bayesian planning algorithm uses 
 edge evaluation history to 
 compute collision posterior
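One way to read "cumulative path lengths" is as the running sum of the best feasible path length held after each planning step. The sketch below is an illustration of that reading only; the exact objective is defined in Hou et al., ICRA 2020. The function name `anytime_cost` and the `no_path_cost` penalty charged while no feasible path is known are hypothetical.

```python
def anytime_cost(path_lengths, no_path_cost):
    """Sum, over planning steps, of the length of the best feasible path known so far.

    path_lengths[t] is the length of a feasible path found at step t, or None
    if step t found nothing new. no_path_cost is a hypothetical penalty charged
    for every step before the first feasible path is found.
    """
    best = no_path_cost
    total = 0.0
    for length in path_lengths:
        if length is not None and length < best:
            best = length
        total += best
    return total


# Finding a path of length 10 at step 2 and improving to 7 at step 4:
# 20 + 10 + 10 + 7 = 47
example_total = anytime_cost([None, 10.0, None, 7.0], no_path_cost=20.0)
```

Under this reading, a planner lowers the objective both by finding an initial path sooner and by improving it faster.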

  48. The Experienced Piano Movers’ Problem New Piano. New House. Same Mover.

  49. Bayesian Anytime Motion Planning as 
 Bayesian Reinforcement Learning • Equivalence to episodic Bayesian RL [Osband et al, 2013] • Infer unknown MDP through repeated episodes. Minimizing Bayesian regret is equivalent to 
 minimizing the Bayesian anytime planning objective! “No regret” is equivalent to asymptotic optimality. Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning, Abbasi-Yadkori et al., IROS 2010. 

  50. Experienced Lazy Path Search [Diagram labels: Proposer, Posterior, Path, Validator, Feasible Path, Evaluated edge statuses]

  53. The Posterior Sampling Proposer [Diagram labels: Proposer, Posterior, Validator] • Posterior Sampling for Motion Planning (PSMP): 
 propose paths according to probability they are optimal • Idea from multi-armed bandits (as Thompson sampling), 
 Posterior Sampling for RL [Osband et al, 2013] • First anytime motion planning algorithm with Bayesian regret bounds • Analysis adapts [Osband et al, 2013] for deterministic MDPs • Bound of matches known lower bounds • Solves one shortest path problem per proposal. (More) efficient reinforcement learning via posterior sampling, Osband et al., N*IPS 2013. 
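The proposer and validator loop can be sketched in a few lines. This is a simplified illustration, not the algorithm as published. It assumes an independent Bernoulli prior per edge (`p_free`), so the posterior simply fixes evaluated edges and leaves the rest at their prior, and it stops after a fixed number of proposals. The names `psmp`, `p_free`, `is_collision_free`, and `max_proposals` are hypothetical.

```python
import heapq
import random


def shortest_path(graph, start, goal, blocked):
    """Dijkstra over the edges that are free in the sampled world."""
    dist, parent, heap, done = {start: 0.0}, {start: None}, [(0.0, start)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, w in graph[u].items():
            if (u, v) not in blocked and d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return None


def psmp(graph, start, goal, p_free, is_collision_free, rng, max_proposals=100):
    """Posterior-sampling sketch: sample a world, propose its shortest path, validate it."""
    status = {}  # evaluated edge -> True (free) or False (in collision)
    best, best_len = None, float("inf")
    for _ in range(max_proposals):
        # Proposer: sample collision statuses from the posterior.
        blocked = set()
        for u in graph:
            for v in graph[u]:
                e = (u, v)
                free = status[e] if e in status else rng.random() < p_free[e]
                if not free:
                    blocked.add(e)
        path = shortest_path(graph, start, goal, blocked)  # one shortest path problem
        if path is None:
            continue
        # Validator: evaluate edges on the proposed path until one is in collision.
        feasible = True
        for e in zip(path, path[1:]):
            if e not in status:
                status[e] = is_collision_free(e)
            if not status[e]:
                feasible = False
                break
        length = sum(graph[u][v] for u, v in zip(path, path[1:]))
        if feasible and length < best_len:
            best, best_len = path, length
    return best
```

A path is proposed whenever it is the shortest path in the sampled world, so paths are proposed in proportion to the posterior probability that they are optimal, and each proposal costs exactly one shortest path computation.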
