Automated Planning RW Theory RW Search Application Plan Improvement Systems Conclusions Random Walk Planning: Theory, Practice, and Application Hootan Nakhost University of Alberta, Canada Google Canada since May 2013 May 9, 2012
Outline
  RW Planning
  Why does it work? RW Theory
  Design: empirical study of the design space
  Application: resource-constrained planning
  Inefficient plans: postprocessing
1. Automated Planning
2. RW Theory
3. RW Search
4. Application
5. Plan Improvement
6. Systems
7. Conclusions
Automated Planning
  Given a model of the world, generate a plan that achieves predefined goals.
  Applications: autonomous agents, general solvers.
Classical Representations (STRIPS)
  State: each state is a set of propositions, e.g. {On(B, A), OnTable(A), Clear(B)} for block B stacked on block A.
  Action: each action has preconditions, positive effects, and negative effects; e.g. picking up B leads to {OnTable(A), Holding(B)}.
  Plan: a sequence of actions that starts from the initial state and ends in a state s ⊇ G, where G is the set of goal propositions.
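The STRIPS definitions can be made concrete with a small Python sketch. It is an illustration only: the `Action` type, the helper names, and the `Unstack(B,A)` action are hypothetical, chosen so that the action maps the first example state to the second one.

```python
from typing import FrozenSet, NamedTuple, Sequence

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: State      # preconditions
    add: State      # positive effects
    delete: State   # negative effects


def apply_action(state: State, action: Action) -> State:
    """Return the successor state; raise if a precondition is missing."""
    if not action.pre <= state:
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add


def is_plan(init: State, goal: State, plan: Sequence[Action]) -> bool:
    """True if each action is applicable in turn and the final state contains the goal."""
    state = init
    for action in plan:
        if not action.pre <= state:
            return False
        state = apply_action(state, action)
    return goal <= state


init = frozenset({"On(B,A)", "OnTable(A)", "Clear(B)"})
# Hypothetical action (an assumption, not taken from the slides).
unstack_b = Action(
    name="Unstack(B,A)",
    pre=frozenset({"On(B,A)", "Clear(B)"}),
    add=frozenset({"Holding(B)"}),
    delete=frozenset({"On(B,A)", "Clear(B)"}),
)
result = apply_action(init, unstack_b)  # {OnTable(A), Holding(B)}
```

Representing states as frozen sets keeps applicability and goal tests as plain subset checks.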
Planning Methods
  Heuristic search: the common standard is systematic search algorithms such as Greedy Best-First Search (GBFS) and WA*.
  Contribution: a new search paradigm for satisficing planning, random walk (RW) search.
Why Random Walks?
  Random walk: a sequence of randomly selected actions.
  High-level, intuitive explanations: faster escape from plateaus, more exploration, no time wasted in dead ends.
  A theoretical model can explain which key features affect performance and how the algorithms can be improved.
A Motivating Example: Transportation Domain
Random Walks vs. Systematic Search
Theoretical Analysis of RW Planning
  Graph properties affecting RW performance: progress chance (PC), regress chance (RC), and regress factor (RF).
  Example: $PC = \frac{1}{4}$, $RC = \frac{1}{2}$, $RF = \frac{RC}{PC} = 2$.
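As a small worked example, the three quantities can be computed for one state from the goal distances of its successors. The sketch assumes that actions are chosen uniformly at random, so PC and RC are the fractions of successors that are closer to or farther from the goal; the function name and the sample distances are hypothetical.

```python
from fractions import Fraction


def progress_regress(d_state, successor_distances):
    """Return (PC, RC) for a state at goal distance d_state.

    Assumes one successor per applicable action and uniform action selection.
    """
    n = len(successor_distances)
    pc = Fraction(sum(d < d_state for d in successor_distances), n)
    rc = Fraction(sum(d > d_state for d in successor_distances), n)
    return pc, rc


# One successor moves closer, two move away, one keeps the same distance.
pc, rc = progress_regress(3, [2, 4, 4, 3])
rf = rc / pc  # PC = 1/4, RC = 1/2, RF = 2
```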
Definitions: Fairness and Hitting Time
  Fairness: a graph is fair if no single state transition changes the goal distance by more than one unit. Every undirected graph is fair.
  Hitting time: the expected number of steps a random walk takes from the initial state until it reaches the goal for the first time.
Fair Strongly Homogeneous Graph (FSHG)
  p = progress chance
  q = regress chance
  D = largest goal distance
Theorem: Hitting Time in FSHG
  For a state x at goal distance $d_x$:
  \[ h_x = \begin{cases} \Theta\left(\beta_0 \lambda^D + \beta_1 d_x\right) & \text{if } q \neq p \\ \Theta\left(\alpha_1 D d_x\right) & \text{if } q = p \end{cases} \]
  where $\lambda = \frac{q}{p}$, $\alpha_0 = \frac{1}{2p}$, $\alpha_1 = \frac{1}{p}$, $\beta_0 = \frac{q}{(p-q)^2}$, $\beta_1 = \frac{1}{p-q}$.
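The theorem can be checked numerically on a simplified model. The sketch below is an assumption-laden illustration, not the proof: it treats the graph as a chain of goal distances 0..D in which each step moves one unit closer with probability p, one unit farther with probability q (impossible at distance D), and otherwise stays put. Writing u_d = h_d - h_{d-1}, the expected-value equations give u_D = 1/p and u_d = 1/p + (q/p) u_{d+1}. The function names are hypothetical.

```python
from fractions import Fraction


def hitting_times(p, q, D):
    """Exact hitting times h[0..D] on the chain model described above."""
    lam = q / p
    u = [Fraction(0)] * (D + 2)        # u[d] = h[d] - h[d-1]
    u[D] = 1 / p
    for d in range(D - 1, 0, -1):
        u[d] = 1 / p + lam * u[d + 1]
    h = [Fraction(0)]
    for d in range(1, D + 1):
        h.append(h[-1] + u[d])
    return h


def slide_expression(p, q, D, d_x):
    """The expression inside Theta(...) in the theorem."""
    if p == q:
        return D * d_x / p             # alpha_1 * D * d_x
    lam = q / p
    beta0 = q / (p - q) ** 2
    beta1 = 1 / (p - q)
    return beta0 * lam ** D + beta1 * d_x


p, q, D = Fraction(1, 4), Fraction(1, 2), 3
h = hitting_times(p, q, D)             # [0, 28, 40, 44]
ratios = [h[d] / slide_expression(p, q, D, d) for d in range(1, D + 1)]
```

With q > p the exact values stay within a constant factor of the expression, and both are dominated by the exponential term in D.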
Bounds for More General Graphs
  $q_i$ = maximum regress chance at goal distance i
  $p_i$ = minimum progress chance at goal distance i
Analysis of the Transport Example
  $RC_{max} = PC_{min} = \frac{1}{2 \times |\text{trucks}|}$
  $h_x = \frac{D d_x}{p}$
Fair Homogeneous Graph (FHG)
  $p_i$ = progress chance at goal distance i
  $q_i$ = regress chance at goal distance i
  D = largest goal distance
Hitting Time in FHG
  \[ h_x = \sum_{d=1}^{d_x} \left( \beta_D \prod_{i=d}^{D-1} \lambda_i + \sum_{j=d}^{D-1} \beta_j \prod_{i=d}^{j-1} \lambda_i \right) \]
  where for all $1 \le d \le D$, $\lambda_d = \frac{q_d}{p_d}$ and $\beta_d = \frac{1}{p_d}$.
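The FHG hitting time is a nested sum of products, which is easy to misread, so a direct evaluation may help. The sketch assumes the progress and regress chances are given as dictionaries keyed by goal distance 1..D, with the outer sum running over d = 1..d_x; the function name and the sample values are hypothetical.

```python
from fractions import Fraction
from math import prod


def fhg_hitting_time(p, q, d_x):
    """Evaluate the FHG hitting-time formula for a state at goal distance d_x.

    p[d] and q[d] are the progress and regress chances at goal distance d.
    """
    D = max(p)
    lam = {d: q[d] / p[d] for d in p}   # lambda_d = q_d / p_d
    beta = {d: 1 / p[d] for d in p}     # beta_d = 1 / p_d
    total = Fraction(0)
    for d in range(1, d_x + 1):
        total += beta[D] * prod(lam[i] for i in range(d, D))
        total += sum(beta[j] * prod(lam[i] for i in range(d, j))
                     for j in range(d, D))
    return total


# Chances that differ by goal distance (illustrative values only).
p = {1: Fraction(1, 2), 2: Fraction(1, 4)}
q = {1: Fraction(1, 4), 2: Fraction(1, 2)}
h2 = fhg_hitting_time(p, q, d_x=2)      # 8
```

When every distance has the same p and q, the result reduces to the strongly homogeneous case.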
Theory for Random Walks with Restart
  Restarting random walks: at each step, with probability r, restart from the initial state.
  Hitting time:
  \[ h_x \in O\left(\beta \lambda^{d_x - 1}\right) \]
  where $\lambda = \frac{q}{p} + \frac{r}{p(1-r)} + 1$ and $\beta = \frac{q + r}{p r}$.
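To see what restarts buy, the expected hitting time can be computed exactly on a small model. This is a sketch under assumptions, not the analysis behind the bound: the graph is reduced to a chain of goal distances 0..D, the initial state sits at distance d_x, and at each step the walk jumps back to the initial state with probability r or otherwise takes an ordinary step. The resulting linear equations are solved with exact fractions; both function names are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over Fractions for a nonsingular square A."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        pivot = M[col][col]
        M[col] = [v / pivot for v in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * c for a, c in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]


def restart_hitting_time(p, q, r, D, d_x):
    """Expected steps to reach distance 0 from distance d_x with restart rate r."""
    A = [[Fraction(0)] * D for _ in range(D)]   # unknowns h_1..h_D, h_0 = 0
    b = [Fraction(1)] * D
    for d in range(1, D + 1):
        i = d - 1
        A[i][i] += 1
        A[i][d_x - 1] -= r                       # jump back to the initial state
        stay = 1 - p - q if d < D else 1 - p     # no regress move at distance D
        A[i][i] -= (1 - r) * stay
        if d > 1:
            A[i][i - 1] -= (1 - r) * p           # progress step
        if d < D:
            A[i][i + 1] -= (1 - r) * q           # regress step
    return solve(A, b)[d_x - 1]


p, q = Fraction(1, 4), Fraction(1, 2)
no_restart = restart_hitting_time(p, q, Fraction(0), D=20, d_x=2)
with_restart = restart_hitting_time(p, q, Fraction(1, 10), D=20, d_x=2)
```

In this configuration q > p and the initial state is close to the goal, so the walk without restarts drifts into the deep part of the chain while the restarting walk keeps returning to distance 2.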
Findings
  Determined the key features of the search space that affect RW performance: the regress factor RF, the largest goal distance D, and the initial goal distance d.
  These features give insight into designing RW planners: how to bias action selection and how to set the restarting frequency r.
RW Search: The General Framework
  Use forward-chaining local search.
  In each step, run random walks to find the next state.
  Use restarts to recover from unpromising search regions.
RWS Framework: an Illustration
  [Figure: five animation frames illustrating the framework on a small search graph with numeric node labels; the image is not recoverable.]
A Basic RW Planner
  Walk length: use a local restarting rate $r_l$; at each step, terminate the walk with probability $r_l$.
  Restarting: use a restarting threshold $t_g$; restart the search when the last $t_g$ walks have not reached a lower heuristic value.
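The two parameters can be put together in a minimal planner sketch. It is an illustration under assumptions rather than the planner from the talk: only the endpoint of each walk is evaluated, a walk is accepted when its endpoint has a lower heuristic value than the best seen so far, and the `successors`, `is_goal`, and `h` callables, the `max_walks` budget, and the toy counter domain are all hypothetical.

```python
import random


def random_walk(state, successors, r_l, rng):
    """Take random actions; after each step, stop with probability r_l."""
    path = []
    while True:
        options = successors(state)        # list of (action, next_state)
        if not options:
            return state, path             # dead end: the walk stops here
        action, state = rng.choice(options)
        path.append(action)
        if rng.random() < r_l:
            return state, path


def rw_planner(init, is_goal, successors, h,
               r_l=0.1, t_g=100, max_walks=100_000, seed=0):
    """Return a list of actions reaching a goal state, or None if the budget runs out."""
    rng = random.Random(seed)
    current, plan, h_min, stale = init, [], h(init), 0
    for _ in range(max_walks):
        if is_goal(current):
            return plan
        end, path = random_walk(current, successors, r_l, rng)
        if h(end) < h_min:                 # jump to the improved endpoint
            current, plan, h_min, stale = end, plan + path, h(end), 0
        else:
            stale += 1
            if stale >= t_g:               # last t_g walks brought no improvement
                current, plan, h_min, stale = init, [], h(init), 0
    return plan if is_goal(current) else None


# Toy domain (hypothetical): move an integer counter to zero.
def successors(n):
    return [("inc", n + 1), ("dec", n - 1)]


plan = rw_planner(5, lambda n: n == 0, successors, abs)
```

Plans found this way are usually longer than necessary, since every accepted walk is kept whole.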
Experimental Study of the Design Space
  Local exploration: length of walks, evaluation rate, action selection bias.
  Global exploration: jumping strategies, restarting strategies.
  Heuristic function: type of the heuristic function, accuracy of the heuristic function.
Two Practical Outcomes
  Learning systems that adapt parameters to the input problem.
  Effective biasing techniques.
The Effect of Restarting Threshold: Elevators 03
  [Figure: minimum heuristic value (y-axis, up to 400) against number of walks (x-axis, up to 50000) for fast restarting and slow restarting.]
The Effect of Restarting Threshold: Floortile 01
  [Figure: minimum heuristic value (y-axis, up to 70) against number of walks (x-axis, up to 50000) for fast restarting and slow restarting.]
Adaptive Global Restarting (AGR)
  Let $V_w$ be the average heuristic improvement per walk. AGR continually estimates $V_w$ and sets $t_g = \frac{h_0}{V_w}$.
  [Figure: coverage (0% to 100%) with $r_l = 0.01$ for $t_g = 100$, $t_g = 1000$, $t_g = 10000$, and AGR on the domains elevators, floortile, nomystery, parcprinter, parking, scanalyzer, sokoban, tidybot, visitall, and woodworking, plus the total.]
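The threshold rule can be sketched as a small running estimator. Several details here are assumptions rather than statements from the slides: h_0 is read as the heuristic value of the initial state, V_w is estimated as the total improvement divided by the number of walks so far, and a fixed fallback threshold is used until some improvement has been observed. The class and method names are hypothetical.

```python
class AdaptiveRestartThreshold:
    """Running estimate of V_w and the resulting threshold t_g = h_0 / V_w."""

    def __init__(self, h0, fallback_tg=1000):
        self.h0 = h0
        self.fallback_tg = fallback_tg    # used while V_w is still zero
        self.total_improvement = 0.0
        self.walks = 0

    def record_walk(self, h_before, h_after):
        """Record one walk; only reductions in heuristic value count as improvement."""
        self.walks += 1
        self.total_improvement += max(0, h_before - h_after)

    def threshold(self):
        if self.total_improvement == 0:
            return self.fallback_tg
        v_w = self.total_improvement / self.walks
        return self.h0 / v_w


agr = AdaptiveRestartThreshold(h0=100)
for _ in range(9):
    agr.record_walk(50, 50)               # nine walks without improvement
agr.record_walk(50, 30)                   # one walk improves h by 20
t_g = agr.threshold()                     # V_w = 2.0, so t_g = 50.0
```

A problem where walks rarely improve the heuristic gets a large threshold, and one where improvements are frequent gets a small one.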