Scalable K-Core Decomposition for Static Graphs Using a Dynamic Graph Data Structure Alok Tripathy
What I’ll Show
• Maximal k-core algorithm
  – Up to 4× faster than previous research
  – Up to 58× faster than popular graph libraries
• k-core edge decomposition algorithm
  – Up to 8× faster than previous research
  – Up to 129× faster than popular graph libraries
  – Uses dynamic graph operations
Alok Tripathy, GTC 2019
Takeaways
• Algorithms on static graphs can use dynamic graph operations efficiently with the GPU.
• Dynamic graph operations can be computed on a GPU efficiently.
  – Check out the Hornet data structure!
  – https://github.com/hornet-gt/hornet
Motivation
• Two types of graphs
  – Static graphs that don’t change
  – Dynamic graphs that change frequently
    • Edge/vertex insertions/deletions
    • e.g. Facebook, road networks
• Algorithms on static graphs can benefit from dynamic graph operations
Dynamic Operations on Static Graphs
• k-truss problem
  – Subgraph where all edges belong to at least k − 2 triangles
  – Can be extended to maximal k-truss
  – Applications: community detection, anomaly detection
[Figure: example k-truss with k = 4]
k-truss Algorithm
E_del = all edges in < k − 2 triangles
while |E_del| > 0
  delete E_del from G
  update triangles in G
  E_del = all edges in < k − 2 triangles
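The loop on this slide can be sketched sequentially in Python. This is only an illustration under stated assumptions, not the GPU implementation from the talk: the function name `k_truss`, the set-based adjacency, and integer vertex ids are all hypothetical choices, and triangle counts are simply recomputed each round instead of being updated incrementally.

```python
def k_truss(edges, k):
    """Return the edges of the k-truss of an undirected graph.

    Assumes integer vertex ids and an edge list of (u, v) pairs.
    """
    # Build a symmetric adjacency structure from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def triangles(u, v):
        # Number of triangles containing edge (u, v) = common neighbours.
        return len(adj[u] & adj[v])

    while True:
        # Edges that sit in fewer than k - 2 triangles must go.
        weak = [(u, v) for u in adj for v in adj[u]
                if u < v and triangles(u, v) < k - 2]
        if not weak:
            break
        # The "dynamic" step: delete the whole batch of weak edges.
        for u, v in weak:
            adj[u].discard(v)
            adj[v].discard(u)

    return {(u, v) for u in adj for v in adj[u] if u < v}


# A 4-clique on vertices 0..3 with one pendant edge (3, 4).
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
truss4 = k_truss(example_edges, 4)
truss5 = k_truss(example_edges, 5)
```

For `k = 4` the pendant edge is in no triangle and is deleted, while every clique edge stays because it is in two triangles. For `k = 5` no edge is in three triangles, so nothing survives.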
Widely used graph data structures
Name                        | Pros                                                             | Cons
----------------------------+------------------------------------------------------------------+--------------------------------------------------------------
Dense adjacency matrix      | Supports updates                                                 | Poor locality; massive storage requirements
Linked lists                | Flexible                                                         | Poor locality; limited parallelism; allocation time is costly
COO (edge list), unsorted   | Has some flexibility; updates are simple; lots of parallelism    | Poor locality; stores both the source and destination
CSR                         | Uses exact amount of memory; good locality; lots of parallelism  | Inflexible
These data structures don’t cut it
Oded Green, Alok Tripathy, GTC 2019
Compressed Sparse Row (CSR)
Pros:
• Uses precise storage requirements
• Great locality
  – Good for GPUs
• Handful of arrays
  – Simple to use and manage
Cons:
• Inflexible
  – Network growth unsupported
  – Topology changes unsupported
  – Property graphs not supported
Example arrays:
Src/Row:    0 1 2 3 4 5 6 7
Offset:     0 2 4 7 9 11 13 14 14
Dest./Col.: 1 2 0 5 0 3 4 2 6 2 5 1 4 3
Value:      2 5 2 7 4 1 4 1 2 4 1 7 1 2
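As a small illustration of how the CSR arrays on this slide are read, the following Python sketch looks up the neighbours of a vertex. The array contents are copied from the slide; the names `offsets`, `dests`, `values`, `neighbors`, and `degree` are hypothetical helpers chosen for this example.

```python
# CSR arrays as shown on the slide (8 vertices, 14 directed edges).
offsets = [0, 2, 4, 7, 9, 11, 13, 14, 14]
dests = [1, 2, 0, 5, 0, 3, 4, 2, 6, 2, 5, 1, 4, 3]
values = [2, 5, 2, 7, 4, 1, 4, 1, 2, 4, 1, 7, 1, 2]


def degree(v):
    # The adjacency list of v occupies dests[offsets[v]:offsets[v + 1]].
    return offsets[v + 1] - offsets[v]


def neighbors(v):
    # Return (destination, value) pairs for every edge leaving v.
    start, end = offsets[v], offsets[v + 1]
    return list(zip(dests[start:end], values[start:end]))
```

Because each adjacency list is packed directly against the next one, adding an edge to vertex 2 would mean shifting every later entry. That is the inflexibility listed under Cons.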
Hornet – A High Level View
[Figure: user interface with one entry per vertex, each pointing to its adjacency storage, some of which is over-allocated space]
Vertex Id: 0 1 2 3 4 5 6 7
Used:      2 2 3 2 2 2 1 0
Pointer:   (one pointer per vertex into the edge storage)
Dest:      3 1 2 0 5 2 6 2 5 1 4 0 3 4
Value:     2 2 5 2 7 1 2 4 1 7 1 4 1 4
Hornet in Detail
[Figure: the user interface holds, per vertex id (0–7), a Used count (#neighbors/nnz: 2 2 3 2 2 2 1 0) and a pointer, plus over-allocated space for vertex insertions. The memory manager holds four block arrays (bsize = 2, 2, 4, 1) that store Dest./Col. and Weight entries, with over-allocated space for the power-of-two rule, tracked by a Vec-Tree with a bit status per block.]
Hornet Insertion
Hornet Insertion Pseudocode
parallel for (u, v) in batch
  if u’s block is too full
    allocate a new block
    queue.add(u)
parallel for u in queue
  copy adjacency list to new block
parallel for (u, v) in batch
  add (u, v) to u’s block
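The three phases above can be mimicked on the CPU with a toy structure. This is a hedged sketch, not Hornet's API: `ToyHornet`, `insert_batch`, and `next_pow2` are hypothetical names, "too full" is assumed to mean that the used count plus the incoming edges exceeds the block capacity, new blocks are assumed to be sized by the power-of-two rule, the sequential loops stand in for the parallel ones, the first phase loops per source vertex instead of per edge, and duplicate edges are not filtered.

```python
def next_pow2(n):
    # Smallest power of two that is >= n (and at least 1).
    p = 1
    while p < n:
        p *= 2
    return p


class ToyHornet:
    def __init__(self, num_vertices):
        self.used = [0] * num_vertices                        # neighbours stored
        self.blocks = [[None] for _ in range(num_vertices)]   # capacity-1 blocks

    def insert_batch(self, batch):
        # Count how many new edges each source vertex receives.
        incoming = {}
        for u, _ in batch:
            incoming[u] = incoming.get(u, 0) + 1

        # Phase 1: find vertices whose block is too full, allocate bigger ones.
        queue, new_blocks = [], {}
        for u, extra in incoming.items():
            if self.used[u] + extra > len(self.blocks[u]):
                new_blocks[u] = [None] * next_pow2(self.used[u] + extra)
                queue.append(u)

        # Phase 2: copy each queued adjacency list into its new block.
        for u in queue:
            n = self.used[u]
            new_blocks[u][:n] = self.blocks[u][:n]
            self.blocks[u] = new_blocks[u]

        # Phase 3: append the new edges; every block now has room.
        for u, v in batch:
            self.blocks[u][self.used[u]] = v
            self.used[u] += 1

    def neighbors(self, u):
        return self.blocks[u][:self.used[u]]


h = ToyHornet(4)
h.insert_batch([(0, 1), (0, 2), (1, 0)])
h.insert_batch([(0, 3)])
```

After the second batch, vertex 0 stores three neighbours in a block of capacity four, so one slot is over-allocated.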
Insertion Rates
• Supports over 150M updates per second
• Hornet
  – 4× to 10× faster than cuSTINGER
  – Does not have a performance dip like cuSTINGER
• Scalable growth in update rate
[Figure: two panels, cuSTINGER and Hornet. Axis: update rate (edges per second), log scale from 10^3 to 10^9. Graphs: in-2004, soc-LiveJournal1, cage15, kron_g500-logn21.]
Motivation
• Current idea:
  – Dynamic graph operations are only for dynamic graphs, not static graphs.
    • Very expensive
    • Why bother?
• New idea: Algorithms on static graphs can benefit from dynamic graph operations
  – If we can efficiently parallelize operations
What I’ll Show
• 3 static graph algorithms
  – All 3 leverage NVIDIA P100 GPUs.
    • 2 beat the state-of-the-art
    • 1 does not (does not have good GPU utilization)
Algorithms
• Old maximal k-core algorithm
• New maximal k-core algorithm
• k-core edge decomposition
Maximal k-core Definitions
• k-core
  – Maximal subgraph where all vertices have degree at least k
• Maximal k-core
  – The k-core for the largest k such that a k-core exists in the graph
• Applications: visualization, community detection
[Figures: example k-cores with k = 2 and k = 3]
Maximal k-core High-Level
peel = 0
while vertices exist in G
  delete all vertices with degree <= peel
  if there aren’t any
    increment peel
[Figure, three animation frames with vertices labeled by degree. peel = 1: 1 3 4 5 1 2 5 4 1 2. peel = 2: 3 4 2 2 5 4 2. peel = 3: 3 3 3 3.]
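The high-level peeling loop can be written as a short sequential Python sketch. It is an illustration only, assuming an undirected graph given as a symmetric dictionary of neighbour sets; the function name `maximal_k_core` and the bookkeeping of the surviving vertex set are hypothetical additions, not the GPU code from the talk.

```python
def maximal_k_core(adj):
    """Return (k, vertices) for the maximal k-core of an undirected graph.

    adj maps each vertex to the set of its neighbours (symmetric).
    """
    # Work on a copy so the caller's graph is left untouched.
    g = {u: set(nbrs) for u, nbrs in adj.items()}
    peel = 0
    core = set(g)  # every graph is its own 0-core

    while g:
        low = [u for u in g if len(g[u]) <= peel]
        if low:
            # Delete all vertices with degree <= peel and their edges.
            for u in low:
                for v in g[u]:
                    g[v].discard(u)
                del g[u]
        else:
            # All remaining degrees exceed peel: g is the (peel + 1)-core.
            peel += 1
            core = set(g)

    return peel, core


# A 4-clique on vertices 0..3 with a path 3 - 4 - 5 hanging off it.
example = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
k, vertices = maximal_k_core(example)
```

On this example the path vertices are peeled at `peel = 1`, the clique survives until `peel = 3`, and the loop ends once the clique itself is deleted, so the maximal k-core is the clique with `k = 3`.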
Old Maximal k-core Algorithm
peel = 0
while vertices exist in G
  reset colors
  color all vertices with degree ≤ peel
  if #colored vertices > 0
    delete colored vertices
    delete incident edges
    insert vertices in G′
    insert edges in G′
  else
    increment peel
[Figure: example graph with vertices labeled by degree (1 3 4 5 1 2 5 4 1 2), shown at peel = 1]
Old Maximal k-core Code