Parallel Depth First on GPU

M. Naumov, A. Vrielink and M. Garland



  1. Parallel Depth First on GPU M. Naumov, A. Vrielink and M. Garland, GTC 2017

  2. Agenda: Introduction; Directed Trees; Directed Acyclic Graphs (DAGs), path- and SSSP-based variants; Optimizations; Performance Experiments

  3. What is DFS? [Figure: example tree rooted at a, with nodes a, b, c, d, e, f, g, i, j.] Node: a,b,c,d,e,f,g,i,j. Parent, Discovery and Finish are still empty.

  4. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,-,-,-,-,-,-,- (a dash marks an entry not yet filled in). Discovery: a,b. Finish: (empty).

  5. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,-,-,b,-,-,-,-. Discovery: a,b,e. Finish: e.

  6. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,-,-,b,b,-,-,-. Discovery: a,b,e,f. Finish: e.

  7. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,-,-,b,b,-,f,-. Discovery: a,b,e,f,i. Finish: e,i.

  8. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,-,-,b,b,-,f,f. Discovery: a,b,e,f,i,j. Finish: e,i,j.

  9. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,a,a,b,b,d,f,f. Discovery: a,b,e,f,i,j,c,d,g. Finish: e,i,j,f,b,c,g,d,a.
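As a reference point for the slides above, here is a minimal sequential sketch of DFS that reproduces the Parent, Discovery and Finish rows of the example. The adjacency dictionary `tree` is read off the Parent row of slide 9; the function name `dfs` and the dictionary layout are illustrative assumptions, not part of the talk.

```python
def dfs(children, root):
    """Sequential DFS returning the parent map and the discovery/finish orders."""
    parent = {root: None}
    discovery, finish = [root], []
    # Each stack entry pairs a node with an iterator over its remaining children.
    stack = [(root, iter(children.get(root, [])))]
    while stack:
        node, remaining = stack[-1]
        child = next(remaining, None)
        if child is None:
            # All children explored: the node is finished.
            stack.pop()
            finish.append(node)
        elif child not in parent:
            parent[child] = node
            discovery.append(child)
            stack.append((child, iter(children.get(child, []))))
    return parent, discovery, finish


# Example tree from the slides (edges taken from the Parent row of slide 9).
tree = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}
parent, discovery, finish = dfs(tree, "a")
print("".join(discovery))  # abefijcdg
print("".join(finish))     # eijfbcgda
```

The explicit stack is inherently serial, which is what the parallel variants in the rest of the deck work around.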

  10. Previous Work on DFS. Lexicographic DFS; Planar Graphs; Directed Graphs with Cycles; Directed Acyclic Graphs (DAGs). Time: O(𝑜 log^11 n), O(log^2 n), O(log^2 n). Processors: O(n^3), O(n^ω / log n), O(n), where ω < 2.373 is the matrix multiplication exponent.

  11. Previous Work on DFS. Lexicographic DFS; Planar Graphs; Directed Graphs with Cycles; Directed Acyclic Graphs (DAGs). Time: O(𝑜 log^11 n), O(log^2 n), O(log^2 n). Processors: O(n^3), O(n^ω / log n), O(n), where ω < 2.373 is the matrix multiplication exponent. Topological sort, bi-connectivity and planarity testing.

  12. DIRECTED TREES

  13. Directed Tree. Phase 2: Bottom-Up Traversal. The leaves c, e, g, i and j each start with the list [0].

  14. Directed Tree. Phase 2: Bottom-Up Traversal. d gathers [0,1] and f gathers [0,1,1] from their children.

  15. Directed Tree. Phase 2: Bottom-Up Traversal. Prefix sum at f: [0,1,1] becomes [0,1,2].

  16. Directed Tree. Phase 2: Bottom-Up Traversal. b gathers [0,1,3].

  17. Directed Tree. Phase 2: Bottom-Up Traversal. Prefix sum at b: [0,1,3] becomes [0,1,4].

  18. Directed Tree. Phase 2: Bottom-Up Traversal. a gathers [0,5,1,2].

  19. Directed Tree. Phase 2: Bottom-Up Traversal. Prefix sum at a: [0,5,1,2] becomes [0,5,6,8].

  20. Directed Tree. Final lists: a [0,5,6,8], b [0,1,4], d [0,1], f [0,1,2], leaves [0]. This phase is done, next phase is about to start …

  21. Directed Tree. Phase 3: Top-down Traversal. e: offset 0.

  22. Directed Tree. Phase 3: Top-down Traversal. e: offset 0; i: offset 1.

  23. Directed Tree. Phase 3: Top-down Traversal. d: offset 6; e: offset 0; i: offset 1.

  24. Directed Tree. Phase 3: Top-down Traversal. discovery = offset + depth. d: discovery 6+1; e: discovery 0+2; i: discovery 1+3.

  25. Directed Tree. Phase 3: Top-down Traversal. finish = offset + sub-tree size. d: finish 6+1; e: finish 0+0; i: finish 1+0.
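The two traversals above can be sketched in a few lines. This is a sequential illustration under stated assumptions, not the GPU implementation from the talk: the function name `tree_dfs_times`, the dictionary-based tree and the breadth-first ordering used to emulate level-by-level processing are all hypothetical choices. "Sub-tree size" is taken to mean the number of descendants excluding the node itself, which is the reading that matches the numbers on slide 25.

```python
from itertools import accumulate


def tree_dfs_times(children, root):
    # Breadth-first order: every parent appears before its children.
    order, depth = [root], {root: 0}
    for node in order:  # the list grows while we walk it
        for child in children.get(node, []):
            depth[child] = depth[node] + 1
            order.append(child)

    # Bottom-up: gather [0, nodes in child 1's sub-tree, ...] and prefix-sum it.
    size, prefix = {}, {}
    for node in reversed(order):
        kids = children.get(node, [])
        prefix[node] = list(accumulate([0] + [size[k] + 1 for k in kids]))
        size[node] = prefix[node][-1]  # number of descendants of node

    # Top-down: the k-th child's offset is the parent's offset plus prefix entry k.
    offset = {root: 0}
    for node in order:
        for k, child in enumerate(children.get(node, [])):
            offset[child] = offset[node] + prefix[node][k]

    discovery = {v: offset[v] + depth[v] for v in order}
    finish = {v: offset[v] + size[v] for v in order}
    return prefix, discovery, finish


tree = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}
prefix, discovery, finish = tree_dfs_times(tree, "a")
print(prefix["a"])                     # [0, 5, 6, 8]
print(discovery["d"], finish["d"])     # 7 7
```

All nodes on one level are independent of each other in both loops, which is the property a parallel version would exploit.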

  26. DIRECTED ACYCLIC GRAPHS: PATH-BASED VARIANT

  27. Path-Based (for DAGs). Phase 1. Collision at f: the left path [a,b,f] and the right path [a,d,f] both reach f.

  28. Path-Based (for DAGs). Phase 1. Collision at f between the left path [a,b,f] and the right path [a,d,f]. Wait until all paths to a node are traversed; align path sequences; compare left-to-right and choose smallest. Resolution (lexicographically smallest): [a,b,f].

  29. Path-Based (for DAGs). This phase is done.
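A minimal sketch of the collision rule, assuming that children are visited in label order so that Python's built-in tuple comparison matches the left-to-right comparison on slide 28. The function name `path_based_parents` and the example DAG (the tree from the earlier slides plus an edge d→f, inferred from the path [a,d,f]) are assumptions made for illustration.

```python
def path_based_parents(children, root):
    # Count incoming edges so a node is expanded only after all paths reached it.
    indegree = {}
    for kids in children.values():
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1

    best = {root: (root,)}  # smallest path found so far for each node
    ready = [root]
    while ready:
        node = ready.pop()
        for kid in children.get(node, []):
            candidate = best[node] + (kid,)
            # Collision: keep the lexicographically smallest path.
            if kid not in best or candidate < best[kid]:
                best[kid] = candidate
            indegree[kid] -= 1
            if indegree[kid] == 0:
                ready.append(kid)

    # The DFS parent is the second-to-last node on the winning path.
    parent = {v: (p[-2] if len(p) > 1 else None) for v, p in best.items()}
    return best, parent


dag = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["f", "g"], "f": ["i", "j"]}
best, parent = path_based_parents(dag, "a")
print(best["f"])    # ('a', 'b', 'f')
print(parent["f"])  # b
```

Storing a full path per node is the costly part of this variant, which motivates the pruning discussed next.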

  30. OPTIMIZATIONS

  31. Path Pruning. [Figure: a branches to b and c; the paths [a,b,e,f] and [a,c,d,f] meet at f.]

  32. Path Pruning. When two paths reach the same node, there exists a parent “a” where the path split: [a,b,…] and [a,c,…]. Paths shown: [a,c,d,f] and [a,b,e,f].

  33. Path Pruning. When two paths reach the same node, there exists a parent “a” where the path split: [a,b,…] and [a,c,…]. It is the comparison between “b” and “c” that allows us to distinguish between paths. Paths shown: [a,c,d,f] and [a,b,e,f].

  34. Path Pruning. When two paths reach the same node, there exists a parent “a” where the path split: [a,b,…] and [a,c,…]. It is the comparison between “b” and “c” that allows us to distinguish between paths. A parent node with a single edge will never be a decision point. Paths shown: [a,c,d,f] and [a,b,e,f].

  35. Path Pruning. When two paths reach the same node, there exists a parent “a” where the path split: [a,b,…] and [a,c,…]. It is the comparison between “b” and “c” that allows us to distinguish between paths. A parent node with a single edge will never be a decision point. No need to store nodes with such parents. Paths shown: [a,c,f] and [a,b,f].
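The pruning rule can be illustrated with a small helper. The name `prune_path` and the `out_degree` dictionary are hypothetical. Keeping the final node of the path even when its parent has a single edge is an assumption made to match the pruned paths [a,c,f] and [a,b,f] shown on slide 35.

```python
def prune_path(path, out_degree):
    """Drop interior nodes whose predecessor on the path has a single outgoing edge."""
    kept = [path[0]]  # the root is always stored
    for prev, node in zip(path, path[1:-1]):
        if out_degree[prev] > 1:  # prev is a decision point, so node matters
            kept.append(node)
    if len(path) > 1:
        kept.append(path[-1])  # assumption: the node being reached is kept
    return kept


# Out-degrees for the graph on slides 31 to 35.
out_degree = {"a": 2, "b": 1, "c": 1, "d": 1, "e": 1, "f": 0}
left = prune_path(["a", "b", "e", "f"], out_degree)
right = prune_path(["a", "c", "d", "f"], out_degree)
print(left, right)  # ['a', 'b', 'f'] ['a', 'c', 'f']
```

In this example, comparing the pruned paths gives the same winner as comparing the full ones, because the only decision point is the split at a.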

  36. Path Pruning

  37. Phase Composition

  38. SSSP-BASED VARIANT

  39. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). The leaves c, e, g, i and j each start with [1].

  40. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). d gathers [1,1] and f gathers [1,1,1].

  41. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). Prefix sum: d becomes [1,2] and f becomes [1,2,3].

  42. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). b gathers [1,1,3].

  43. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). Prefix sum at b: [1,1,3] becomes [1,2,4].

  44. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). a gathers [1,5,1,2,1].

  45. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but propagate # of nodes to all the parents and start prefix sum with 1 (instead of 0). Prefix sum at a: [1,5,1,2,1] becomes [1,6,7,9,10].

  46. SSSP-based (for DAGs). Assign # of nodes as the edge weight. Edge weights in the figure: a→b 1, a→c 6, a→d 7, a→j 9; b→e 1, b→f 2; d→g 1; f→i 1, f→j 2. This phase is done, next phase is about to start …

  47. SSSP-based (for DAGs). Phase 2: Top-down traversal. Reaching j through b and f costs 1+2+2=5 < 9, the weight of the direct edge from a.

  48. SSSP-based (for DAGs). Phase 2: Top-down traversal. 1+2+2=5 < 9. Shortest Path is the DFS path.

  49. SSSP-based (for DAGs). Phase 2: This phase is done.
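The two phases above can be sketched sequentially as follows. The function names, the dictionary-based DAG and the Kahn-style topological ordering are illustrative assumptions rather than the talk's GPU implementation. The example DAG is the earlier tree plus an edge a→j, which is inferred from the five-entry list at a and the comparison 1+2+2=5 < 9. The sketch also assumes every node is reachable from the root.

```python
from itertools import accumulate


def topological_order(children, root):
    # Kahn-style ordering: a node is emitted once all its parents were emitted.
    indegree = {}
    for kids in children.values():
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1
    order, ready = [], [root]
    while ready:
        node = ready.pop()
        order.append(node)
        for kid in children.get(node, []):
            indegree[kid] -= 1
            if indegree[kid] == 0:
                ready.append(kid)
    return order


def sssp_dfs(children, root):
    order = topological_order(children, root)

    # Phase 1 (bottom-up): prefix sum over [1, count of child 1, count of child 2, ...].
    count, weight = {}, {}
    for node in reversed(order):
        kids = children.get(node, [])
        sums = list(accumulate([1] + [count[k] for k in kids]))
        count[node] = sums[-1]  # propagated to every parent of node
        for kid, w in zip(kids, sums):
            weight[(node, kid)] = w  # k-th edge gets the k-th prefix-sum entry

    # Phase 2 (top-down): single-source shortest paths on the weighted DAG.
    dist, parent = {root: 0}, {root: None}
    for node in order:
        for kid in children.get(node, []):
            d = dist[node] + weight[(node, kid)]
            if kid not in dist or d < dist[kid]:
                dist[kid], parent[kid] = d, node
    return weight, dist, parent


dag = {"a": ["b", "c", "d", "j"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}
weight, dist, parent = sssp_dfs(dag, "a")
print(weight[("a", "j")], dist["j"], parent["j"])  # 9 5 f
```

Because each edge weight exceeds the total weight reachable through the earlier siblings, a path through an earlier child is always shorter, so the shortest-path parent coincides with the DFS parent in this example.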

  50. OPTIMIZATIONS

  51. Discovery time. Phase 3a: Sorting. The length of shortest path defines an ordering of nodes. Shortest-path lengths in the figure: a 0, b 1, e 2, f 3, i 4, j 5, c 6, d 7, g 8.

  52. Discovery time. Phase 3a: Sorting. The length of shortest path defines an ordering of nodes. We can sort them to obtain discovery time. Shortest-path lengths in the figure: a 0, b 1, e 2, f 3, i 4, j 5, c 6, d 7, g 8.
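The sorting step can be illustrated directly on the shortest-path lengths from the figure. The helper name `discovery_from_distances` is hypothetical, and the sketch assumes the lengths are distinct so that the order is unambiguous.

```python
def discovery_from_distances(dist):
    """Rank nodes by shortest-path length; the rank is the discovery time."""
    ranked = sorted(dist, key=dist.get)
    return {node: rank for rank, node in enumerate(ranked)}


# Shortest-path lengths read off the slide.
dist = {"a": 0, "b": 1, "c": 6, "d": 7, "e": 2, "f": 3, "g": 8, "i": 4, "j": 5}
discovery = discovery_from_distances(dist)
print("".join(sorted(discovery, key=discovery.get)))  # abefijcdg
```

In this small example the lengths already equal the discovery times. The sort is what turns lengths into consecutive ranks when they do not.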
