
Scalable K-Core Decomposition for Static Graphs Using a Dynamic Graph Data Structure



  1. Scalable K-Core Decomposition for Static Graphs Using a Dynamic Graph Data Structure Alok Tripathy

  2. What I’ll Show
     • Maximal k-core algorithm
       – Up to 4× faster than previous research
       – Up to 58× faster than popular graph libraries
     • k-core edge decomposition algorithm
       – Up to 8× faster than previous research
       – Up to 129× faster than popular graph libraries
     Alok Tripathy, GTC 2019

  3. What I’ll Show
     • Maximal k-core algorithm
       – Up to 4× faster than previous research
       – Up to 58× faster than popular graph libraries
     • k-core edge decomposition algorithm
       – Up to 8× faster than previous research
       – Up to 129× faster than popular graph libraries
       – Uses dynamic graph operations

  4. Takeaways
     • Algorithms on static graphs can use dynamic graph operations efficiently with the GPU.
     • Dynamic graph operations can be computed on a GPU efficiently.
       – Check out the Hornet data structure!
       – https://github.com/hornet-gt/hornet

  5. Motivation
     • Two types of graphs
       – Static graphs that don’t change
       – Dynamic graphs that change frequently
         • Edge/vertex insertions/deletions
         • e.g. Facebook, road networks

  6. Motivation
     • Two types of graphs
       – Static graphs that don’t change
       – Dynamic graphs that change frequently
         • Edge/vertex insertions/deletions
         • e.g. Facebook, road networks
     • Algorithms on static graphs can benefit from dynamic graph operations

  7. Dynamic Operations on Static Graphs
     • k-truss problem

  8. Dynamic Operations on Static Graphs
     • k-truss problem
       – Subgraph where all edges belong to at least k − 2 triangles
       – Can be extended to maximal k-truss
     [Figure: example graph, k = 4]

  9. Dynamic Operations on Static Graphs
     • k-truss problem
       – Subgraph where all edges belong to at least k − 2 triangles
       – Can be extended to maximal k-truss
       – Applications: community detection, anomaly detection
     [Figure: example graph, k = 4]

  10. k-truss Algorithm
      - E = all edges in < k − 2 triangles
      - while |E| > 0
        - delete E from G
        - update triangles in G
        - E = all edges in < k − 2 triangles
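
The loop on the k-truss Algorithm slide can be sketched sequentially as below. This is only an illustration, assuming an undirected simple graph with integer vertex ids; `k_truss` and `support` are hypothetical names, and "update triangles" is done by recounting common neighbors rather than by any GPU or Hornet operation. Following the definition of a k-truss given earlier, each round removes the edges that lie in fewer than k − 2 triangles.

```python
def k_truss(edges, k):
    """Return the edges (u, v) with u < v that survive k-truss peeling."""
    # Adjacency sets stand in for a dynamic graph: edge deletion is cheap.
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def support(u, v):
        # Number of triangles containing edge (u, v) = common neighbors.
        return len(adj[u] & adj[v])

    while True:
        # Collect every edge that is in fewer than k - 2 triangles.
        weak = [(u, v) for u in adj for v in adj[u]
                if u < v and support(u, v) < k - 2]
        if not weak:
            break
        # Delete the whole batch; supports are recounted on the next pass.
        for u, v in weak:
            adj[u].discard(v)
            adj[v].discard(u)

    return {(u, v) for u in adj for v in adj[u] if u < v}


# A 4-clique on vertices 0..3 with one pendant edge (3, 4).
example = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
truss4 = k_truss(example, 4)
```

On this example the pendant edge is in no triangle, so it is removed for k = 4, while every clique edge is in exactly two triangles and stays.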

  11. Takeaways
      • Algorithms on static graphs can use dynamic graph operations efficiently with the GPU.
      • Dynamic graph operations can be computed on a GPU efficiently.
        – Check out the Hornet data structure!
        – https://github.com/hornet-gt/hornet

  12. Widely used graph data structures
      • Dense adjacency matrix
        – Pros: supports updates
        – Cons: poor locality; massive storage requirements
      • Linked lists
        – Pros: flexible
        – Cons: poor locality; limited parallelism; allocation time is costly
      • COO (edge list), unsorted
        – Pros: has some flexibility; updates are simple; lots of parallelism
        – Cons: poor locality; stores both the source and destination
      • CSR
        – Pros: uses exact amount of memory; good locality; lots of parallelism
        – Cons: inflexible
      These data structures don’t cut it
      Oded Green, Alok Tripathy, GTC 2019

  13. Compressed Sparse Row (CSR)
      Pros:
      • Uses exactly the storage it needs
      • Great locality
        – Good for GPUs
      • Handful of arrays
        – Simple to use and manage
      Cons:
      • Inflexible
      • Network growth unsupported
      • Topology changes unsupported
      • Property graphs not supported
      Example arrays:
        Src/Row:    0 1 2 3 4 5 6 7
        Offset:     0 2 4 7 9 11 13 14 14
        Dest./Col.: 1 2 0 5 0 3 4 2 6 2 5 1 4 3
        Value:      2 5 2 7 4 1 4 1 2 4 1 7 1 2
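
To make the CSR layout concrete, the sketch below loads the Offset, Dest./Col., and Value arrays from the CSR slide and reads one vertex's adjacency list. The helper names `neighbors` and `degree` are illustrative, not part of any library.

```python
# Arrays copied from the CSR slide (8 vertices, 14 stored edges).
offset = [0, 2, 4, 7, 9, 11, 13, 14, 14]
dest = [1, 2, 0, 5, 0, 3, 4, 2, 6, 2, 5, 1, 4, 3]
value = [2, 5, 2, 7, 4, 1, 4, 1, 2, 4, 1, 7, 1, 2]


def degree(v):
    """Number of stored neighbors of vertex v."""
    return offset[v + 1] - offset[v]


def neighbors(v):
    """(destination, value) pairs for v, read from one contiguous slice."""
    start, end = offset[v], offset[v + 1]
    return list(zip(dest[start:end], value[start:end]))


degrees = [degree(v) for v in range(len(offset) - 1)]
```

Each adjacency list is a contiguous slice, which is where the good locality comes from. It is also why the format is inflexible: adding one edge to vertex 0 would mean shifting every later entry of `dest` and `value` and rewriting every later entry of `offset`.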

  14. Hornet – A High Level View
      [Figure labels: User-Interface; Over-allocated space]
        Vertex Id: 0 1 2 3 4 5 6 7
        Used:      2 2 3 2 2 2 1 0
        Pointer:   (not recoverable from the extracted text)
        Dest:      3 1 2 0 5 2 6 2 5 1 4 0 3 4
        Value:     2 2 5 2 7 1 2 4 1 7 1 4 1 4

  15. Hornet in Detail
      [Figure: Hornet layout in two parts]
      • User-Interface
        – Vertex Id: 0 1 2 3 4 5 6 7
        – Used (#Neighbors/nnz): 2 2 3 2 2 2 1 0
        – Pointer
        – Over-allocated space for vertex insertions
      • Memory Manager
        – Block arrays with bsize = 2, bsize = 2, bsize = 4, bsize = 1
        – Per-block rows: Dest./Col., Weight
        – Vec-Tree with bit status
        – Over-allocated space for power-of-two rule

  16. Hornet Insertion

  17. Hornet Insertion Pseudocode
      parallel for (u, v) in batch
        - if u’s block is too full
          - allocate a new block
          - queue.add(u)
      parallel for v in queue
        - copy adjacency list to new block
      parallel for (u, v) in batch
        - add (u, v) to u’s block
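
The three phases of the insertion pseudocode can be mimicked on the CPU as in the sketch below. It is a sequential toy model and not Hornet's API: the class `BlockGraph`, its fields, and `next_pow2` are hypothetical, the parallel loops become plain loops, and the capacity check counts all of a vertex's incoming edges in the batch at once (an assumption made so that a single reallocation per vertex suffices). Block capacities follow a power-of-two rule, as on the Hornet in Detail slide.

```python
def next_pow2(n):
    """Smallest power of two that is >= max(n, 1)."""
    return 1 << max(n - 1, 0).bit_length()


class BlockGraph:
    """Toy per-vertex edge blocks with power-of-two capacities."""

    def __init__(self, num_vertices):
        self.used = [0] * num_vertices                       # neighbors stored
        self.block = [[None] for _ in range(num_vertices)]   # capacity 1 each

    def insert_batch(self, batch):
        # Phase 1: find vertices whose block cannot hold the new edges
        # and allocate a larger block for each of them.
        incoming = {}
        for u, _ in batch:
            incoming[u] = incoming.get(u, 0) + 1
        queue = {}
        for u, extra in incoming.items():
            needed = self.used[u] + extra
            if needed > len(self.block[u]):
                queue[u] = [None] * next_pow2(needed)

        # Phase 2: copy each queued vertex's adjacency list to its new block.
        for u, new_block in queue.items():
            new_block[:self.used[u]] = self.block[u][:self.used[u]]
            self.block[u] = new_block

        # Phase 3: append every edge of the batch to its source's block.
        for u, v in batch:
            self.block[u][self.used[u]] = v
            self.used[u] += 1

    def neighbors(self, u):
        return self.block[u][:self.used[u]]


g = BlockGraph(4)
g.insert_batch([(0, 1), (0, 2), (1, 2)])
g.insert_batch([(0, 3)])  # vertex 0 outgrows its 2-slot block and moves to 4 slots
```

Separating allocation, copying, and appending means each phase touches independent vertices or edges, which is what lets the slide's version run each phase as a parallel loop. Duplicate edges are not checked here.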

  18. Insertion Rates
      • Supports over 150M updates per second
      • Hornet
        – 4×–10× faster than cuSTINGER
        – Does not have a performance dip like cuSTINGER
      • Scalable growth in update rate
      [Figure: two panels, cuSTINGER and Hornet. y-axis: Update Rate (edges per second), log scale from 10^3 to 10^9. Datasets: in-2004, soc-LiveJournal1, cage15, kron_g500-logn21.]

  19. Takeaways
      • Algorithms on static graphs can use dynamic graph operations efficiently with the GPU.
      • Dynamic graph operations can be computed on a GPU efficiently.
        – Check out the Hornet data structure!
        – https://github.com/hornet-gt/hornet

  20. Motivation
      • Current idea:
        – Dynamic graph operations are only for dynamic graphs, not static graphs.
          • Very expensive
          • Why bother?

  21. Motivation
      • Current idea:
        – Dynamic graph operations are only for dynamic graphs, not static graphs.
          • Very expensive
          • Why bother?
      • New idea: Algorithms on static graphs can benefit from dynamic graph operations
        – If we can efficiently parallelize operations

  22. What I’ll Show
      • 3 static graph algorithms
        – All 3 leverage NVIDIA P100 GPUs.
          • 2 beat the state-of-the-art
          • 1 does not (does not have good GPU utilization)

  23. Algorithms
      • Old maximal k-core algorithm
      • New maximal k-core algorithm
      • k-core edge decomposition

  24. Algorithms
      • Old maximal k-core algorithm ☹
      • New maximal k-core algorithm
      • k-core edge decomposition

  25. Maximal k-core Definitions
      • k-core
        – Maximal subgraph where all vertices have degree at least k
      [Figure: example graph, k = 2]

  26. Maximal k-core Definitions
      • k-core
        – Maximal subgraph where all vertices have degree at least k
      • Maximal k-core
        – Largest k such that k-core exists in graph
      [Figure: example graph, k = 3]

  27. Maximal k-core Definitions
      • k-core
        – Maximal subgraph where all vertices have degree at least k
      • Maximal k-core
        – Largest k such that k-core exists in graph
      • Applications: visualization, community detection
      [Figure: example graph, k = 3]

  28. Maximal k-core High-Level
      peel = 0
      while vertices exist in G
        - delete all vertices with degree <= peel
        - if there aren’t any
          - increment peel
      [Figure: example graph with vertex degrees 1, 3, 4, 5, 1, 2, 5, 4, 1, 2; peel = 1]

  29. Maximal k-core High-Level
      peel = 0
      while vertices exist in G
        - delete all vertices with degree <= peel
        - if there aren’t any
          - increment peel
      [Figure: example graph with vertex degrees 3, 4, 2, 2, 5, 4, 2; peel = 2]

  30. Maximal k-core High-Level
      peel = 0
      while vertices exist in G
        - delete all vertices with degree <= peel
        - if there aren’t any
          - increment peel
      [Figure: example graph with vertex degrees 3, 3, 3, 3; peel = 3]
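
The high-level peeling loop translates almost line for line into the sequential sketch below. It assumes an undirected simple graph given as an edge list; the function name `max_k_core` is illustrative, and plain Python sets stand in for the dynamic graph whose vertices and edges are deleted. The value of `peel` when the graph becomes empty is the largest k for which a k-core exists.

```python
def max_k_core(edges):
    """Return the largest k such that the graph has a non-empty k-core."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    peel = 0
    while adj:
        # All vertices whose current degree is at most the peel value.
        low = [v for v, nbrs in adj.items() if len(nbrs) <= peel]
        if low:
            # Delete them together with their incident edges.
            for v in low:
                for w in adj.pop(v):
                    if w in adj:
                        adj[w].discard(v)
        else:
            # Nothing left to delete at this level: move to the next one.
            peel += 1
    return peel


# A 4-clique on vertices 0..3 with a short tail 3 - 4 - 5 attached.
example = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
k_max = max_k_core(example)
```

On this example the tail is peeled away at peel = 1, leaving four vertices of degree 3, which matches the final state shown on the slide (all remaining degrees equal to 3 at peel = 3).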

  31. Old Maximal k-core Algorithm
      peel = 0
      while vertices exist in G
        - reset colors
        - color all vertices with degree ≤ peel
        - if #coloredvertices > 0
          - delete colored vertices
          - delete incident edges ☺
          - insert vertices in G ☺
          - insert edges in G
        - else
          - increment peel
      [Figure: example graph with vertex degrees 1, 3, 4, 5, 1, 2, 5, 4, 1, 2; peel = 1]

  32. Old Maximal k-core Code
