PASCAL: A Parallel Algorithmic SCALable Framework for N-body Problems

  1. PASCAL: A Parallel Algorithmic SCALable Framework for N-body Problems. Laleh Aghababaie Beni, Aparna Chandramowlishwaran. Euro-Par 2017. HPC Factory

  2. Outline
     • Introduction
     • PASCAL Framework
     • Space Partitioning Trees
     • Tree Traversal
     • Prune/Approximate Generators
     • Optimizations & Parallelization
     • Experiments & Results
     • Conclusions & Future Work

  4. N-body calculations: what do these have in common?
     • Force computation: $\forall q \in Q:\; F(q) = C \sum_{r \in Q \setminus \{q\}} \frac{r - q}{\lVert r - q \rVert^{3}}$
     • Nearest neighbors: $\forall q \in Q:\; \mathrm{AllNN}(q) = \arg\min_{r \in R} d(q, r)$
     • Kernel density estimation: $\forall q \in Q:\; \mathrm{KDE}(q) = \frac{1}{|R|} \sum_{r \in R} K(q, r)$
     • Range count: $\forall q \in Q:\; \mathrm{Range}(q) = \sum_{r \in R} I(\mathrm{dist}(q, r) \le h)$
     All of them consider pairs of points: naïvely $O(N^2)$.

  7. Commonality: Optimal approximation algorithms
     • Force computation: $\forall q \in Q:\; F(q) = C \sum_{r \in Q \setminus \{q\}} \frac{r - q}{\lVert r - q \rVert^{3}}$
     • Hierarchical tree-based approximation algorithms for force computations, e.g., Barnes-Hut or FMM
     • Evaluate interactions → tree traversals
     • Store aggregate data at nodes, e.g., bounding box, mass

  8. N-body problems in other domains. Each problem has a set of operators and a kernel function.

     | Problem | Operators | Kernel Function |
     |---|---|---|
     | All Nearest Neighbors | ∀, arg min | $\lVert x_q - x_r \rVert$ |
     | All Range Search | ∀, ∪ arg | $I(h_{\min} < \lVert x_q - x_r \rVert < h_{\max})$ |
     | All Range Count | ∀, Σ | $I(h_{\min} < \lVert x_q - x_r \rVert < h_{\max})$ |
     | Naive Bayes Classifier | ∀, arg max | $P(C_k)\,\frac{1}{\sqrt{2\pi \lvert \Sigma_k \rvert}}\, e^{-\frac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)}$ |
     | Mixture Model E-step | ∀, ∀ | $\frac{1}{\sqrt{2\pi \lvert \Sigma_k \rvert}}\, e^{-\frac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)}$ |
     | K-means E-step | ∀, arg min | $\lVert x_q - x_r \rVert$ |
     | Mixture Model Log-likelihood | Σ, log Σ | $\frac{1}{\sqrt{2\pi \lvert \Sigma_k \rvert}}\, e^{-\frac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)}$ |
     | Kernel Density Estimation | ∀, Σ | $\phi\!\left(\frac{\lVert x_q - x_r \rVert}{h}\right)$ |
     | Kernel Density Bayes Classifier | ∀, arg max Σ | $\phi\!\left(\frac{\lVert x_q - x_r \rVert}{h}\right) P(C_k)$ |
     | 2-point (cross-)correlation | Σ, Σ | $I(\lVert x_q - x_r \rVert < h)$ |
     | Nadaraya-Watson Regression | ∀, Σ | $y_r\, \phi\!\left(\frac{\lVert x_q - x_r \rVert}{h}\right)$ |
     | Thermodynamic Average | Σ, Σ | $\phi(\lVert x_q - x_r \rVert)$ |
     | Largest-span set | max, ..., max | $\Sigma(\lVert x_q - x_r \rVert)$ |
     | Closest Pair | arg min, arg min | $\lVert x_q - x_r \rVert$ |
     | Minimum Spanning Tree | ∀, arg min | $\lVert x_q - x_r \rVert$ |
     | Coulombic Interaction | ∀, Σ | $\frac{\alpha_q \alpha_r}{\lVert x_q - x_r \rVert}$ |
     | Average Density | Σ, Σ | $I(\lVert x_q - x_r \rVert < h)$ |
     | Wave function | ∀, Π | $\phi(\lVert x_q - x_r \rVert)$ |
     | Hausdorff Distance | max, min | $\lVert x_q - x_r \rVert$ |
     | Intrinsic (fractal) Dimension | Σ, Σ | $I(\lVert x_q - x_r \rVert < h)$ |
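
The "operators plus kernel function" view on this slide can be illustrated with a brute-force evaluator. This is only a sketch of the idea, not PASCAL's interface: the names `nbody`, `forall`, and `dist` are hypothetical, the indicator kernel uses a single threshold `h`, and every pair of points is visited, so the cost is the naïve O(N²) that the framework is designed to avoid.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nbody(outer, inner, kernel, Q, R):
    """Apply `inner` over r in R, then `outer` over q in Q, visiting every pair."""
    return outer(inner(kernel(q, r) for r in R) for q in Q)

forall = list  # the "for all" operator: keep one result per query point

Q = [(0.0, 0.0), (1.0, 0.0)]
R = [(0.0, 1.0), (3.0, 0.0)]
h = 1.5

def within_h(q, r):
    """Indicator kernel I(||x_q - x_r|| < h)."""
    return int(dist(q, r) < h)

# Hausdorff distance: operators (max, min), kernel ||x_q - x_r||
hausdorff = nbody(max, min, dist, Q, R)

# Range count per query point: operators (forall, sum), indicator kernel
range_count = nbody(forall, sum, within_h, Q, R)

# 2-point correlation: operators (sum, sum), same indicator kernel
two_point = nbody(sum, sum, within_h, Q, R)

print(hausdorff, range_count, two_point)
```

Swapping the operator pair or the kernel yields a different row of the table without touching the evaluation loop, which is the property a generic framework can exploit.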

  9. Why N-body methods?
     • One of the original seven dwarfs (motifs)
     • FMM is listed among the top 10 algorithms with the greatest influence in the 20th century
     • EM is one of the top 10 algorithms with the highest impact in data mining
     • Applications: machine learning, computer vision, computational geometry, scientific computing, …

  10. Key Ideas and Findings
      • An algorithmic framework for N-body problems
      • Automatically generates prune and approximation conditions
      • Results in O(N log N) and O(N) algorithms
      • Domain-specific optimizations and parallelization
      • Shows 10-230x speedup on 6 different algorithms compared to state-of-the-art libraries and software
      • Out-of-the-box new optimal algorithms:
        • O(N log N) EM algorithm for GMMs
        • O(N) algorithm for Hausdorff distance

  12. PASCAL Framework (block diagram)
      • Inputs: datasets; N-body specification (operators and kernel function)
      • Space-partitioning trees: Kd-tree
      • Tree traversal schemes: multi-tree traversals (BaseCase, Prune/Approximate, ComputeApproximate)
      • Prune/Approximate condition generator
      • Domain-specific optimizations and parallelization
      • Output: optimized code

  13. Tree Construction: recursively divide space until each box has at most q points. (Figure source: http://www.cs.cmu.edu/~dpelleg/kmeans.html)
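
A minimal kd-tree construction sketch for the rule above, where `leaf_size` plays the role of q. The slide only states the stopping rule; splitting at the median of the widest dimension is an assumption made here, and the names `Node`, `build`, and `leaves` are illustrative rather than taken from PASCAL.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    points: List[Point]             # points inside this box
    lo: Point                       # lower corner of the bounding box
    hi: Point                       # upper corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None

def build(points: List[Point], leaf_size: int) -> Node:
    """Recursively split until each box holds at most leaf_size (>= 1) points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    node = Node(points, lo, hi)
    if len(points) > leaf_size:
        # Assumed split rule: median along the widest dimension of the box.
        axis = max(dims, key=lambda d: hi[d] - lo[d])
        ordered = sorted(points, key=lambda p: p[axis])
        mid = len(ordered) // 2
        node.left = build(ordered[:mid], leaf_size)
        node.right = build(ordered[mid:], leaf_size)
    return node

def leaves(node: Node):
    """Yield the leaf boxes from left to right."""
    if node.is_leaf:
        yield node
    else:
        yield from leaves(node.left)
        yield from leaves(node.right)

pts = [(float(i), float((7 * i) % 10)) for i in range(10)]
tree = build(pts, leaf_size=3)
print([len(leaf.points) for leaf in leaves(tree)])
```

Each node keeps its bounding box because, as the earlier slide notes, aggregate data stored at the nodes is what later lets a traversal reason about a whole box at once.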

  16. Tree Traversal (animation over the query tree Q and the reference tree R)
      • At each pair of nodes, evaluate Prune/Approx().
      • If the answer is NO, continue the traversal with the child nodes.
      • At the leaves, BaseCase() evaluates the pair directly: $Q_L \otimes R_L \rightarrow O(q^2)$.
      • If Prune/Approx() is true, discard the entire subtree for pruning problems.
      • If Prune/Approx() is true, replace the subtree with the centroid for approximation problems, then call ApproxCompute().
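
The traversal steps above can be sketched for one pruning problem, counting all pairs (q, r) within distance h. This is an illustrative dual-tree traversal under stated assumptions, not the code PASCAL generates: the `Node` class, the median split, and the bounding-box lower bound `min_dist` used as the prune test are choices made for this example.

```python
import math

class Node:
    """Kd-tree node: a bounding box plus the points it contains."""
    def __init__(self, pts, leaf_size):
        self.pts = pts
        dims = range(len(pts[0]))
        self.lo = [min(p[d] for p in pts) for d in dims]
        self.hi = [max(p[d] for p in pts) for d in dims]
        self.kids = []
        if len(pts) > leaf_size:
            axis = max(dims, key=lambda d: self.hi[d] - self.lo[d])
            pts = sorted(pts, key=lambda p: p[axis])
            mid = len(pts) // 2
            self.kids = [Node(pts[:mid], leaf_size), Node(pts[mid:], leaf_size)]

def dist(p, r):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, r)))

def min_dist(a, b):
    """Lower bound on the distance between any point in box a and any in box b."""
    return math.sqrt(sum(max(a.lo[d] - b.hi[d], b.lo[d] - a.hi[d], 0.0) ** 2
                         for d in range(len(a.lo))))

def range_count(Q, R, h, stats):
    """Number of pairs (q, r) with dist(q, r) <= h."""
    if min_dist(Q, R) > h:           # Prune/Approx()? YES: discard the subtree
        stats["pruned"] += 1
        return 0
    if not Q.kids and not R.kids:    # BaseCase(): direct leaf-to-leaf evaluation
        stats["base"] += 1
        return sum(dist(q, r) <= h for q in Q.pts for r in R.pts)
    # Prune/Approx()? NO: descend into the children of every non-leaf node
    return sum(range_count(q, r, h, stats)
               for q in (Q.kids or [Q]) for r in (R.kids or [R]))

q_pts = [(0.1 * i, 0.1 * ((37 * i) % 50)) for i in range(50)]
r_pts = [(3.0 + 0.1 * i, 0.1 * ((11 * i) % 50)) for i in range(50)]
stats = {"pruned": 0, "base": 0}
total = range_count(Node(q_pts, 4), Node(r_pts, 4), 0.5, stats)
brute = sum(dist(q, r) <= 0.5 for q in q_pts for r in r_pts)
print(total, brute, stats)
```

The pruned result matches the brute-force count because a node pair is discarded only when its lower bound already exceeds h, so no qualifying pair can be lost.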

  35. Prune/Approximate Condition Generator
      • Prune, e.g., Hausdorff Distance

  36. Hausdorff Distance (figure: point sets Q and R)
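
For reference, the table on slide 8 lists Hausdorff distance with the operators (max, min) and the kernel $\lVert x_q - x_r \rVert$. Written out, that corresponds to the directed form below; the symbol $d_H$ is notation introduced here, not taken from the slides.

```latex
d_H(Q, R) = \max_{x_q \in Q} \; \min_{x_r \in R} \; \lVert x_q - x_r \rVert
```

The symmetric Hausdorff distance is commonly defined as the larger of the two directed values, $\max\{d_H(Q, R),\, d_H(R, Q)\}$.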

  44. Prune/Approximate Condition Generator
      • Prune, e.g., Hausdorff Distance
      • Approximation, e.g., Expectation Maximization (EM): E-step, M-step, log-likelihood

  45. Approximate Condition for EM (figure: nodes Q and R with their centers; $K_{\min}$ and $K_{\max}$ are marked between the nodes, $K_{\text{center}}$ between the two centers)
      • Condition: $K_{\max} - K_{\min} < \beta\, K_{\text{center}}$, where $\beta$ is the user-controlled accuracy
      • E-step, log-likelihood: $(r_{i,\max} - r_{i,\min}) < \beta\, r_{i,\text{mean}}$ for $i = 1, \dots, K$
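
A hedged sketch of how a test of the form "K_max − K_min < β · K_center" could be evaluated for two bounding boxes. The slides do not show how the three kernel values are obtained, so everything here is an assumption: an isotropic Gaussian kernel of distance with width `sigma` stands in for the full-covariance kernel of the table, K_max is taken at the smallest box-to-box distance, K_min at the largest, and K_center at the distance between box centers. The function names are hypothetical.

```python
import math

def gaussian(d, sigma):
    """Isotropic Gaussian kernel of a distance (assumed for this sketch)."""
    return math.exp(-0.5 * (d / sigma) ** 2)

def box_distances(q_lo, q_hi, r_lo, r_hi):
    """Smallest, largest, and center-to-center distance between two boxes."""
    near = far = mid = 0.0
    for ql, qh, rl, rh in zip(q_lo, q_hi, r_lo, r_hi):
        near += max(ql - rh, rl - qh, 0.0) ** 2
        far += max(qh - rl, rh - ql) ** 2
        mid += ((ql + qh) / 2 - (rl + rh) / 2) ** 2
    return math.sqrt(near), math.sqrt(far), math.sqrt(mid)

def can_approximate(q_lo, q_hi, r_lo, r_hi, beta, sigma=1.0):
    """True when the kernel varies little enough across the node pair."""
    d_min, d_max, d_center = box_distances(q_lo, q_hi, r_lo, r_hi)
    k_max = gaussian(d_min, sigma)      # closest pair gives the largest value
    k_min = gaussian(d_max, sigma)      # farthest pair gives the smallest value
    k_center = gaussian(d_center, sigma)
    return k_max - k_min < beta * k_center

# Two small, well-separated boxes: the kernel is nearly constant over the pair.
small = can_approximate((0.0, 0.0), (0.01, 0.01), (1.0, 1.0), (1.01, 1.01), beta=0.1)

# Two large, overlapping boxes: the kernel ranges from about 0 to 1.
large = can_approximate((0.0, 0.0), (5.0, 5.0), (0.0, 0.0), (5.0, 5.0), beta=0.1)

print(small, large)
```

A smaller `beta` makes the test harder to pass, so more node pairs fall through to the exact base case and accuracy rises at the cost of more kernel evaluations.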

  57. Prune Condition for Hausdorff distance (figure: nodes Q and R with their border points)
      $\forall x_q \in N_q^{\text{border}},\ \forall x_r \in N_r^{\text{border}}$ s.t. $\mathrm{op}_{\oplus_1}(\tau_1, K(x_q, x_r) \mid \mathrm{op}_{\oplus_2}(\tau_2, K(x_q, x_r)))$
