Online Saturated Cost Partitioning for Classical Planning


  1. Online Saturated Cost Partitioning for Classical Planning
     Jendrik Seipp, University of Basel, October 21, 2020

  2–3. Setting
     • optimal classical planning
     • A* search + admissible heuristic
     • multiple abstraction heuristics
     • cost partitioning, here saturated cost partitioning

  4. Coverage over time
     [Plot: solved tasks (y-axis, ticks 200 to 1,200) against time in seconds (x-axis, log scale from $10^0$ to $10^3$). Curve: offline-1000s.]

  5–8. Background: maximize over estimates
     [Figure: two abstractions of the states $s_1, \dots, s_5$. One has the abstract states $\{s_1, s_2\}$, $\{s_3\}$, $\{s_4, s_5\}$, the other $\{s_1\}$, $\{s_2, s_3, s_4\}$, $\{s_5\}$; the edge labels in each are 1, 1, 4, 4.]
     • $h_1(s_2) = 5$
     • $h_2(s_2) = 4$
     • $h(s_2) = \max(h_1(s_2), h_2(s_2)) = 5$
     • only selects best heuristic
     • does not combine heuristics

  9–11. Background: cost partitioning
     • split action costs among heuristics
     • sum of costs ≤ original cost
     [Figure: the same two abstractions with the costs split between them; the edge labels are 0, 1, 3, 1 in one abstraction and 1, 0, 1, 2 in the other.]
     • $h(s_2) = 3 + 3 = 6$
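
A minimal Python sketch of the idea above, assuming abstractions are given as explicit transition lists. The toy abstractions, the operator names `o1` to `o3` and the helper names are illustrative assumptions and are not the example drawn on the slides. The sketch only shows that once the split costs of every operator sum to at most its original cost, the individual estimates can be added.

```python
import heapq

INF = float("inf")

def goal_distances(transitions, goals, cost):
    """Cheapest cost to reach a goal from each abstract state (backward Dijkstra)."""
    dist = {goal: 0 for goal in goals}
    queue = [(0, goal) for goal in goals]
    heapq.heapify(queue)
    while queue:
        d, state = heapq.heappop(queue)
        if d > dist.get(state, INF):
            continue  # stale queue entry
        for src, op, dst in transitions:
            if dst == state and d + cost[op] < dist.get(src, INF):
                dist[src] = d + cost[op]
                heapq.heappush(queue, (dist[src], src))
    return dist

def is_cost_partitioning(original, parts):
    """Check that the split costs of every operator sum to at most its original cost."""
    return all(sum(part[op] for part in parts) <= original[op] for op in original)

# Hypothetical toy task: three operators and two chain-shaped abstractions,
# each given as (transitions, goal states).
cost = {"o1": 1, "o2": 4, "o3": 4}
abstraction_a = ([("A0", "o1", "A1"), ("A1", "o2", "A2")], {"A2"})
abstraction_b = ([("B0", "o2", "B1"), ("B1", "o3", "B2")], {"B2"})

# Maximizing: each heuristic sees the full costs, only the best estimate counts.
h_max = max(goal_distances(*abstraction_a, cost)["A0"],
            goal_distances(*abstraction_b, cost)["B0"])

# Cost partitioning: o2 is used by both abstractions, so its cost is split 2 + 2.
cost_a = {"o1": 1, "o2": 2, "o3": 0}
cost_b = {"o1": 0, "o2": 2, "o3": 4}
assert is_cost_partitioning(cost, [cost_a, cost_b])
h_sum = (goal_distances(*abstraction_a, cost_a)["A0"]
         + goal_distances(*abstraction_b, cost_b)["B0"])

print(h_max, h_sum)  # 8 9
```

In this toy task the summed estimate (9) exceeds the maximum (8) because only `o2` is shared between the two abstractions.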

  12–17. Background: saturated cost partitioning
     • order heuristics, then for each heuristic $h$:
         • use minimum costs preserving all estimates of $h$
         • use remaining costs for subsequent heuristics
     [Figure: the same two abstractions, with the edge costs updated step by step across the builds.]
     • $h^{\mathrm{SCP}}(s_2) = 5 + 3 = 8$
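
The procedure above can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation behind the talk: abstractions are explicit transition systems, saturated costs are clamped at zero (a simplification that keeps Dijkstra's algorithm applicable), and the toy task, operator names and function names are hypothetical.

```python
import heapq

INF = float("inf")

def goal_distances(states, transitions, goals, cost):
    """Backward Dijkstra: cheapest cost to a goal from every abstract state."""
    dist = {state: INF for state in states}
    queue = []
    for goal in goals:
        dist[goal] = 0
        heapq.heappush(queue, (0, goal))
    while queue:
        d, state = heapq.heappop(queue)
        if d > dist[state]:
            continue  # stale queue entry
        for src, op, dst in transitions:
            if dst == state and d + cost[op] < dist[src]:
                dist[src] = d + cost[op]
                heapq.heappush(queue, (dist[src], src))
    return dist

def saturated_costs(transitions, h):
    """Smallest non-negative operator costs under which no estimate of h changes."""
    saturated = {}
    for src, op, dst in transitions:
        if h[src] < INF and h[dst] < INF:  # ignore transitions touching dead ends
            saturated[op] = max(saturated.get(op, 0), h[src] - h[dst])
    return saturated

def scp_estimate(abstractions, order, cost, abstract_state):
    """Sum the estimates obtained by giving each heuristic, in order, the remaining costs."""
    remaining = dict(cost)
    total = 0
    for name in order:
        states, transitions, goals = abstractions[name]
        h = goal_distances(states, transitions, goals, remaining)
        total += h[abstract_state[name]]
        for op, used in saturated_costs(transitions, h).items():
            remaining[op] -= used  # only the unused costs are passed on
    return total

# Hypothetical toy task: the evaluated state maps to A1 in "a" and to B0 in "b".
cost = {"o1": 4, "o2": 1}
abstractions = {
    "a": (["A0", "A1", "A2"], [("A0", "o1", "A1"), ("A1", "o2", "A2")], {"A2"}),
    "b": (["B0", "B1"], [("B0", "o1", "B1")], {"B1"}),
}
abstract_state = {"a": "A1", "b": "B0"}

print(scp_estimate(abstractions, ["a", "b"], cost, abstract_state))  # 1
print(scp_estimate(abstractions, ["b", "a"], cost, abstract_state))  # 5
```

In the toy task, abstraction `a` keeps the full cost of `o1` to preserve its estimate for `A0`, although the evaluated state only needs `o2`. Putting `a` first therefore leaves nothing for `b`, while the reverse order yields the higher estimate.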

  18–19. Background: order matters
     • $h^{\mathrm{SCP}}_{\langle h_1, h_2 \rangle}(s_2) = 8$
     • $h^{\mathrm{SCP}}_{\langle h_2, h_1 \rangle}(s_2) = 7$
     → use multiple orders and maximize over estimates: $\max(h^{\mathrm{SCP}}_{\langle h_1, h_2 \rangle}(s_2), h^{\mathrm{SCP}}_{\langle h_2, h_1 \rangle}(s_2))$
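
Written for an arbitrary set of orders, the maximization over the two orders reads as follows. The symbol $\Omega$ for the set of orders is an assumed name that does not appear on the slides; the second equation plugs in the two values given for $s_2$.

```latex
h^{\mathrm{SCP}}_{\Omega}(s) = \max_{\omega \in \Omega} h^{\mathrm{SCP}}_{\omega}(s),
\qquad
h^{\mathrm{SCP}}_{\{\langle h_1, h_2 \rangle,\, \langle h_2, h_1 \rangle\}}(s_2) = \max(8, 7) = 8
```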

  20. Background: offline diversification
     • sample 1000 states
     • start with empty set of orders
     • until time limit is reached:
         • compute order for new sample
         • store order if a sample profits from it
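
The loop above might look roughly like this in Python. The function name, the callables `draw_state`, `compute_order` and `evaluate`, and the tabulated estimates are hypothetical stand-ins; "profits" is read here as "the order raises the best known estimate of at least one sample", and a fake clock replaces real time so the run is repeatable.

```python
import itertools
import time

def diversify_offline(samples, draw_state, compute_order, evaluate,
                      time_limit, clock=time.monotonic):
    """Collect orders until the time limit; keep an order only if it raises
    the best known estimate of at least one sample."""
    orders = []
    best = [float("-inf")] * len(samples)  # assumption: the first order is always kept
    start = clock()
    while clock() - start < time_limit:
        order = compute_order(draw_state())
        estimates = [evaluate(order, sample) for sample in samples]
        if any(e > b for e, b in zip(estimates, best)):
            orders.append(order)
            best = [max(e, b) for e, b in zip(estimates, best)]
    return orders

# Hypothetical stand-ins: three samples, two possible orders, tabulated estimates.
ORDER_12, ORDER_21 = ("h1", "h2"), ("h2", "h1")
table = {ORDER_12: [8, 3, 3], ORDER_21: [7, 5, 3]}
draws = iter([0, 0, 1])    # states handed to compute_order, one per iteration
ticks = itertools.count()  # fake clock: one "second" per call

orders = diversify_offline(
    samples=[0, 1, 2],
    draw_state=lambda: next(draws),
    compute_order=lambda state: ORDER_12 if state == 0 else ORDER_21,
    evaluate=lambda order, sample: table[order][sample],
    time_limit=3.5,
    clock=lambda: next(ticks),
)
print(orders)  # [('h1', 'h2'), ('h2', 'h1')]
```

The second iteration recomputes the first order, which improves no sample and is therefore discarded.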

  21–23. Coverage over time
     [Plot: solved tasks (y-axis, ticks 200 to 1,200) against time in seconds (x-axis, log scale from $10^0$ to $10^3$). Curves, added one per build: offline-1000s, online-nodiv, online-1000s.]

  24. Online diversification
     ComputeHeuristic($s$):
     • if Select($s$) and not time limit reached:
         • compute order for $s$
         • store order if $s$ profits from it
     • return maximum over all stored orders for $s$
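
A possible Python rendering of ComputeHeuristic, given as a sketch under assumptions rather than the author's implementation. The class name, the callables `select`, `compute_order` and `evaluate`, and the tabulated estimates are hypothetical. The time limit is assumed to count only the time spent diversifying, and the estimate falls back to 0 while no order is stored.

```python
import time

class OnlineDiversifier:
    def __init__(self, select, compute_order, evaluate, time_limit,
                 clock=time.monotonic):
        self.select = select
        self.compute_order = compute_order
        self.evaluate = evaluate
        self.time_limit = time_limit
        self.clock = clock
        self.time_used = 0.0  # assumption: only time spent diversifying counts
        self.orders = []

    def best_estimate(self, state, default):
        """Maximum over all stored orders, or `default` if none is stored."""
        return max((self.evaluate(order, state) for order in self.orders),
                   default=default)

    def compute_heuristic(self, state):
        if self.select(state) and self.time_used < self.time_limit:
            started = self.clock()
            order = self.compute_order(state)
            # Store the order only if the evaluated state itself profits from it.
            if self.evaluate(order, state) > self.best_estimate(state, float("-inf")):
                self.orders.append(order)
            self.time_used += self.clock() - started
        # With no stored order yet, fall back to the trivial estimate 0.
        return self.best_estimate(state, 0)

# Hypothetical stand-ins: two states, two possible orders, tabulated estimates.
ORDER_12, ORDER_21 = ("h1", "h2"), ("h2", "h1")
table = {ORDER_12: {0: 8, 1: 3}, ORDER_21: {0: 7, 1: 5}}

diversifier = OnlineDiversifier(
    select=lambda state: True,
    compute_order=lambda state: ORDER_12 if state == 0 else ORDER_21,
    evaluate=lambda order, state: table[order][state],
    time_limit=10.0,
)
estimates = [diversifier.compute_heuristic(state) for state in (0, 1, 0)]
print(estimates)           # [8, 5, 8]
print(diversifier.orders)  # [('h1', 'h2'), ('h2', 'h1')]
```

Unlike the offline variant, the decision to store an order depends on a single state, so no set of samples has to be evaluated per order.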

  25. Offline vs. online diversification
     Offline
     • compute orders for samples for T seconds
     • store order if one of 1000 samples profits from it
     Online
     • compute orders for subset of evaluated states for at most T seconds
     • store order if single evaluated state profits from it

  26–27. Selection strategies
     • Interval
     • Novelty (Lipovetzky and Geffner 2012)
     • Bellman (Eifler and Fickert 2018): $h(s) \geq \min_{s \xrightarrow{a} t \in T} (h(t) + \mathit{cost}(a))$
     [Table: coverage per Select strategy, with rows Bellman, Novelty 1, Novelty 2 and Interval 1–100K. Coverage values as extracted: 1153–1159, 1157, 1153, 1145; their assignment to the rows did not survive extraction.]
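
Two of the strategies can be sketched as Select callables in Python. Both readings are assumptions, since the slide only names the strategies: "interval" is taken to mean the first evaluated state and every i-th one after it, and "novelty 1" is taken to mean that the state contains a fact not seen in any earlier evaluated state. The class names and the fact encoding are hypothetical. The Bellman strategy is left out because it needs successor information that this sketch does not model.

```python
class IntervalSelector:
    """Select the first evaluated state and then every `interval`-th one (assumed reading)."""

    def __init__(self, interval):
        self.interval = interval
        self.count = 0

    def __call__(self, state):
        self.count += 1
        return (self.count - 1) % self.interval == 0


class NoveltyOneSelector:
    """Select a state if it contains at least one fact never seen before (assumed reading)."""

    def __init__(self):
        self.seen = set()

    def __call__(self, state):
        new_facts = set(state) - self.seen
        self.seen |= new_facts
        return bool(new_facts)


# Hypothetical states, each a set of (variable, value) facts.
states = [
    {("x", 0), ("y", 0)},
    {("x", 0), ("y", 0)},
    {("x", 1), ("y", 0)},
]

select_interval = IntervalSelector(interval=3)
interval_choices = [select_interval(i) for i in range(7)]
print(interval_choices)  # [True, False, False, True, False, False, True]

select_novel = NoveltyOneSelector()
novelty_choices = [select_novel(state) for state in states]
print(novelty_choices)  # [True, False, True]
```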

  28. Coverage
     [Plot: solved tasks (y-axis, ticks 1,040 to 1,140) against diversification time in seconds (x-axis, log scale from $10^0$ to $10^3$). Curves: offline, online.]

  29. Time score
     [Plot: time score (y-axis, ticks 0 to 1,000) against diversification time in seconds (x-axis, log scale from $10^0$ to $10^3$). Curves: offline, online.]

  30. Stored orders
     [Scatter plot: number of stored orders, offline against online; both axes log scale from $10^0$ to $10^3$ with an extra "failed" tick.]

  31–35. Before the A* search can start
     • Choose which abstractions to build (e.g., patterns for PDBs) → future work
     • Build abstractions (e.g., Cartesian abstractions and symbolic PDBs) → online refinement (e.g., Eifler and Fickert 2018, Franco and Torralba 2019)
     • Compute orders and cost partitionings (e.g., saturated cost partitioning) → Bellman, novelty, interval

  36. Summary
     • Offline diversification: samples, long precomputation, fast evaluations, high coverage
     • Online computation: states, no precomputation, slow evaluations, low coverage
     • Online diversification: states, no precomputation, fast evaluations, high coverage

  37–38. Offline vs. online diversification
     [Table: coverage and time score of offline and online diversification for diversification time limits of 1s, 10s, 100s, 1000s, 1200s and 1500s. The assignment of values to rows and columns did not survive extraction. Coverage values as extracted: 1146, 1159, 1154, 1153, 1056, 1135, 1145, 1159, 1156, 1148, 1128, 1102. Time score values as extracted: 86.8, 791.2, 690.7, 420.3, 59.2, 25.9, 920.6, 929.7, 919.1, 908.7, 906.3, 906.6.]
