Fast and Scalable Subgraph Isomorphism using Dynamic Graph Techniques
James Fox
Collaborators
• Oded Green, Research Scientist (GT)
• Euna Kim, PhD student (GT)
• Federico Busato, PhD student (Universita di Verona)
• Dr. Nicola Bombieri (Universita di Verona)
• Kartik Lakhotia, PhD student (USC)
• Shijie Zhou, PhD student (USC)
• Shreyas Singapura, PhD student (USC)
• Hanqing Zeng, PhD student (USC)
• Dr. Rajgopal Kannan (USC)
• Prof. Viktor Prasanna (USC)
• Prof. David Bader (GT)
Quickly Finding a Truss in Haystack
Outline
• K-Truss
  – Introduction
  – Sequential Approaches
• Our new algorithm
  – Dynamic Triangle Counting
  – Hornet: data structure for dynamic graphs
• Performance Analysis
K-Truss
• Definition: for a given k, the k-truss is a subgraph in which each edge closes at least k − 2 triangles, i.e. has a "support" of at least k − 2
• A well-connected subgraph
  – "Relaxation of k-clique, stricter than k-core" [Cohen; 2008]
  – Computationally efficient to find
• Maximal k-truss: focus of our work
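The definition can be checked directly on a small graph. The sketch below is only an illustration: the helper names `edge_support` and `is_k_truss` are made up for this example and are not from the talk. It counts, for every edge, the common neighbors of its endpoints and compares that support against k − 2.

```python
from itertools import combinations

def edge_support(edges):
    """Return {edge: number of triangles that edge closes} (hypothetical helper)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # A triangle through (u, v) corresponds to a common neighbor of u and v.
    return {(u, v): len(adj[u] & adj[v]) for u, v in edges}

def is_k_truss(edges, k):
    """True if every edge of the given (sub)graph has support >= k - 2."""
    return all(s >= k - 2 for s in edge_support(edges).values())

# A 4-clique: every edge lies in exactly two triangles.
clique4 = list(combinations(range(4), 2))
support = edge_support(clique4)
```

With this input, `is_k_truss(clique4, 4)` holds while `is_k_truss(clique4, 5)` does not, matching the view of the k-truss as a relaxation of the k-clique.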
Example
[Figure: example graph annotated with per-edge support values; panels labeled "K=3 Truss" and "K=4 Truss"]
Over 1000x faster
Graph Challenge Innovation Award (HPEC'17)
Three main factors
• Algorithmic optimization
  1. Uses a dynamic graph data structure
  2. Novel algorithm for dynamically updating triangle counts
• Parallelization
• Programming model
  – Vertex-centric is more efficient than linear algebra
Simple Vertex Centric
k ← 3
while E ≠ ∅
    repeat until no more changes
        for e = (u, v) ∈ E
            if |adj(u) ∩ adj(v)| < k − 2
                delete e from E
    k ← k + 1
k ← k − 1
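The peeling loop above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `max_truss_simple` is hypothetical, weak edges are removed in batches per pass rather than one at a time (the fixed point is the same), and the sketch records the last level k at which edges survive instead of adjusting k after the loop.

```python
def max_truss_simple(edges):
    """Peel edges level by level; return (k, edges) of the largest non-empty truss."""
    E = {tuple(sorted(e)) for e in edges}
    best_k, best_edges = 2, set(E)      # every graph is trivially a 2-truss
    k = 3
    while E:
        changed = True
        while changed:                  # "repeat until no more changes"
            adj = {}
            for u, v in E:
                adj.setdefault(u, set()).add(v)
                adj.setdefault(v, set()).add(u)
            # Supports are recomputed from scratch on every pass (the costly part).
            weak = {(u, v) for u, v in E if len(adj[u] & adj[v]) < k - 2}
            E -= weak
            changed = bool(weak)
        if E:
            best_k, best_edges = k, set(E)
        k += 1
    return best_k, best_edges

# A 4-clique on {0,1,2,3} with a triangle {3,4,5} hanging off vertex 3.
example = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]
k_max, truss = max_truss_simple(example)
```

Every pass recomputes all intersections, which is the cost the dynamic triangle-count update is meant to avoid.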
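Linear Algebra Formulation
• Given k
• Bold letters refer to vectors and matrices
$\mathbf{R} = \mathbf{E}\mathbf{A}$
$\mathbf{x} = \mathrm{find}\big((\mathbf{R} == 2)\cdot\mathbf{1} < k - 2\big)$
while $\mathbf{x}$
    $\mathbf{E}_{\mathbf{x}} = \mathbf{E}(\mathbf{x}, :)$
    $\mathbf{E} = \mathbf{E}(\mathbf{x}_c, :)$
    $\mathbf{R} = \mathbf{R}(\mathbf{x}_c, :)$
    $\mathbf{R} = \mathbf{R} - \mathbf{E}\big(\mathbf{E}_{\mathbf{x}}^{T}\mathbf{E}_{\mathbf{x}} - \mathrm{diag}(\mathbf{E}_{\mathbf{x}}^{T}\mathbf{E}_{\mathbf{x}})\big)$
    $\mathbf{x} = \mathrm{find}\big((\mathbf{R} == 2)\cdot\mathbf{1} < k - 2\big)$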
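To see why the entries equal to 2 matter, here is a small pure-Python sketch. It assumes, as in the Graph Challenge formulation, that E is the unoriented edge-by-vertex incidence matrix and A the adjacency matrix; the helper names are made up for this example. Row e of R = EA adds the adjacency rows of the two endpoints of e, so an entry equals 2 exactly at a vertex adjacent to both endpoints, and the number of 2s in a row is the support of that edge.

```python
def incidence_and_adjacency(n, edges):
    """Build the edge-by-vertex incidence matrix E and adjacency matrix A as lists."""
    E = [[0] * n for _ in edges]
    A = [[0] * n for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        E[i][u] = E[i][v] = 1
        A[u][v] = A[v][u] = 1
    return E, A

def matmul(X, Y):
    """Plain dense matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def support_from_R(R):
    """(R == 2) * 1: count the entries equal to 2 in each row."""
    return [sum(1 for entry in row if entry == 2) for row in R]

# A triangle {0,1,2} with a pendant edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
E, A = incidence_and_adjacency(4, edges)
R = matmul(E, A)
s = support_from_R(R)

k = 3
x = [i for i, si in enumerate(s) if si < k - 2]   # indices of edges to remove
```

Here only the pendant edge has support below k − 2 = 1, so `x` selects that single row for removal.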
New Algorithm for Finding the Maximal Truss
for e = (u, v) ∈ E                                (parallel)
    w(e) ← |adj(u) ∩ adj(v)|
k ← 3
while E ≠ ∅
    repeat until no more changes
        list ← ∅
        for e = (u, v) ∈ E                        (parallel)
            if |adj(u) ∩ adj(v)| < k − 2
                append(list, e)
        G_RST ← CreateGraph(list)                 (parallel)
        removeEdges(G, G_RST)                     (parallel)
        UpdateTriangleCount(G, G_RST)             (parallel)
    k ← k + 1
k ← k − 1
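A sequential Python sketch of this loop follows. It is an illustration under stated assumptions rather than the authors' GPU code: all function names are hypothetical, Python sets stand in for sorted adjacency arrays, the stored supports are used for the threshold test, and the function returns the last level k at which edges survive. The point it shows is that supports are computed once and afterwards only decremented using the graph of deleted edges.

```python
def key(u, v):
    """Canonical (small, large) form of an undirected edge."""
    return (u, v) if u < v else (v, u)

def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def update_triangle_counts(adj, del_adj, deleted, support):
    """Decrement supports of surviving edges; adj must already exclude deleted edges."""
    for u, v in deleted:
        # One edge removed: the third vertex is still adjacent to u and v in G.
        for w in adj[u] & adj[v]:
            support[key(u, w)] -= 1
            support[key(v, w)] -= 1
        # Two edges removed: intersect G with the deleted-edge graph.
        for a, b in ((u, v), (v, u)):
            for w in adj[a] & del_adj[b]:
                if a < w:               # such a triangle is found twice; count it once
                    support[key(a, w)] -= 1
        # Three edges removed: no surviving edge, nothing to update.

def max_truss_delta(edges):
    E = {key(u, v) for u, v in edges}
    adj = build_adj(E)
    support = {e: len(adj[e[0]] & adj[e[1]]) for e in E}   # computed once
    best_k, k = 2, 3
    while E:
        while True:
            deleted = [e for e in E if support[e] < k - 2]
            if not deleted:
                break
            del_adj = build_adj(deleted)                   # graph of deleted edges
            for u, v in deleted:                           # remove them from G
                E.discard((u, v))
                del support[(u, v)]
                adj[u].discard(v)
                adj[v].discard(u)
            update_triangle_counts(adj, del_adj, deleted, support)
        if E:
            best_k = k
        k += 1
    return best_k

example = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
           (3, 4), (4, 5), (3, 5), (5, 6)]
k_max = max_truss_delta(example)
```

In the parallel setting each of the marked steps would run over edges concurrently; this sketch only mirrors the control flow.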
G_RST ← CreateGraph(list)
• We will create a graph from all the deleted edges
• Adjacencies will be sorted
[Figure: the example graph G next to G_RST, the graph built from its deleted edges, with edge supports]
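As a rough sketch of this step, the function below builds sorted adjacency lists from a list of deleted edges. The name `create_graph` mirrors the slide, but the body and the example edges are assumptions made for illustration.

```python
def create_graph(edge_list):
    """Build sorted adjacency lists from a list of undirected edges."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    for neighbors in adj.values():
        neighbors.sort()                # sorted lists allow merge-style intersections
    return adj

# Made-up list of deleted edges, all touching vertex 3.
deleted = [(3, 7), (1, 3), (3, 6)]
g_deleted = create_graph(deleted)
```

Keeping the lists sorted lets later intersections between the two graphs run as a single linear merge.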
UpdateTriangleCount(G, G_RST)
• Must update counts of non-removed edges
• Don't want to re-compute globally
[Figure: the example graph shown twice, labeled "After deletion (incorrect triangle counts)" and "Updated triangle counts"]
Three "types" of triangles affected [Makkar; HiPC'17]
1. One edge removed
2. Two edges removed
3. All three edges removed
[Figure: a triangle on vertices u, v, w drawn for each case]
One edge removed
• (u, v) deleted
• By intersecting the list of u with the list of v we can find all common neighbors
  – Decrement the support of each surviving edge of the triangle by 1
• For all e = (u, v) ∈ G_RST
  – Intersect(u, G, v, G)
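A minimal sketch of this case, assuming sorted adjacency lists: `intersect_sorted` is a hypothetical two-pointer merge, and the small graph is made up for the example. After edge (0, 1) is deleted, each common neighbor w of 0 and 1 marks a destroyed triangle, so the surviving edges (0, w) and (1, w) each lose one unit of support.

```python
def intersect_sorted(a, b):
    """Two-pointer intersection of two sorted lists."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common

# G after deleting (0, 1) from two triangles {0,1,2} and {0,1,3} sharing that edge.
adj = {0: [2, 3], 1: [2, 3], 2: [0, 1], 3: [0, 1]}
support = {(0, 2): 1, (1, 2): 1, (0, 3): 1, (1, 3): 1}   # stale counts before the update

u, v = 0, 1
for w in intersect_sorted(adj[u], adj[v]):
    support[tuple(sorted((u, w)))] -= 1
    support[tuple(sorted((v, w)))] -= 1
```

All four surviving edges end with support 0, which matches the remaining graph (a 4-cycle with no triangles).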
Two edges removed
• (u, v) and (u, w) deleted
• Intersecting the adjacencies like before won't work.
• Instead we will intersect adjacencies from the two graphs: G and G_RST
• For all e = (u, v) ∈ G_RST
  – Intersect(u, G, v, G_RST)
• Can handle double-counting
Three edges removed
• (u, v), (u, w), (w, v) deleted
• No need to update supports!
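The three cases can be tallied by brute force on a small graph. The sketch below is illustrative only, with a hypothetical function name and a made-up input. It classifies every destroyed triangle by how many of its edges were deleted and derives the number of support decrements: two per one-edge case, one per two-edge case, none for the three-edge case.

```python
from itertools import combinations

def classify_destroyed_triangles(edges, deleted):
    """Count destroyed triangles by how many of their edges were deleted."""
    all_edges = {tuple(sorted(e)) for e in edges}
    gone = {tuple(sorted(e)) for e in deleted}
    vertices = sorted({v for e in all_edges for v in e})
    counts = {1: 0, 2: 0, 3: 0}
    for a, b, c in combinations(vertices, 3):
        tri = [(a, b), (a, c), (b, c)]
        if all(e in all_edges for e in tri):
            removed = sum(e in gone for e in tri)
            if removed:
                counts[removed] += 1
    return counts

# A 4-clique in which the two edges (0, 1) and (0, 2) are deleted.
clique4 = list(combinations(range(4), 2))
counts = classify_destroyed_triangles(clique4, [(0, 1), (0, 2)])

# Surviving edges per destroyed triangle: 2, 1, and 0 respectively.
total_decrements = 2 * counts[1] + 1 * counts[2] + 0 * counts[3]
```

In this example the surviving edges lose five units of support in total: edge (0, 3) loses two, and edges (1, 2), (1, 3), (2, 3) lose one each.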
So what else do we need?
• We need a dynamic graph data structure
• These data structures don't cut it:
  – Dense adjacency matrix
  – Linked lists
  – COO (edge list)
  – CSR/CSC
[Table: each structure rated ✓ or ❌ on "Good locality" and "Flexible updates"]
Hornet…
[Figure: Hornet layout behind the user interface: per-vertex Used, BSize, and Pointer fields, and per-edge Dest./Col. and Value arrays with over-allocated space]
    Vertex Id:  0 1 2 3 4 5 6 7
    Used:       2 2 3 2 2 2 1 0
    BSize:      2 2 4 2 2 2 1 0
    Dest./Col.: 3 1 2 0 5 2 6 2 5 1 4 0 3 4
    Value:      2 2 5 2 7 1 2 4 1 7 1 4 1 4
• Supports updates
  – Supports edge insertion\deletion
  – Supports vertex insertion\deletion
• Efficient memory manager
  – Memory reclamation and deletion
  – Hidden from user
• Framework
• Good locality
  – Edge list contiguous
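The Used and BSize rows suggest per-vertex edge blocks whose capacity is rounded up to a power of two. The toy class below sketches that idea only. It is not Hornet's API: the class name, method names, and growth policy are assumptions, and it ignores the memory manager, GPU layout, and sorted adjacencies.

```python
class BlockAdjacency:
    """Toy per-vertex edge blocks with power-of-two capacity (not Hornet's API)."""

    def __init__(self, num_vertices):
        self.blocks = [[] for _ in range(num_vertices)]  # contiguous block per vertex
        self.used = [0] * num_vertices                   # "Used": edges stored
        self.bsize = [0] * num_vertices                  # "BSize": block capacity

    def insert_edge(self, src, dst):
        if self.used[src] == self.bsize[src]:
            # Block is full: move to a block of twice the size (over-allocation).
            new_size = max(1, 2 * self.bsize[src])
            live = self.blocks[src][:self.used[src]]
            self.blocks[src] = live + [None] * (new_size - len(live))
            self.bsize[src] = new_size
        self.blocks[src][self.used[src]] = dst
        self.used[src] += 1

    def delete_edge(self, src, dst):
        block, n = self.blocks[src], self.used[src]
        i = block.index(dst, 0, n)      # raises ValueError if the edge is absent
        block[i] = block[n - 1]         # swap with last so live edges stay contiguous
        block[n - 1] = None
        self.used[src] = n - 1

    def neighbors(self, src):
        return self.blocks[src][:self.used[src]]

g = BlockAdjacency(8)
for dst in (5, 6, 1):
    g.insert_edge(2, dst)
```

After three insertions vertex 2 has Used = 3 and BSize = 4, the same pattern as vertex 2 in the figure; one slot is spare, so the next insertion needs no reallocation.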
Experimental Setup - CPU
Intel Dual Processor
• Intel Xeon E5-2695
• 16 cores per processor (32 in total)
  – 64 threads with Hyperthreading
• 45MB LLC
• 1TB of DDR4
Experimental Setup - GPU
Single Pascal P100
• 56 processors (SMs)
• 64 threads per processor (SPs)
• 3584 hardware threads
• 16GB of HBM2
  – 720 GB/s bandwidth
Input Graphs
• HPEC Graph Challenge
• SNAP – Stanford Network Analysis Project
The following is only a subset of these graphs:
    Name               Network Type    |V|      |E|*
    cit-HepPh          Citation        35k      421k
    amazon0601         Co-purchasing   400k     2.4M
    roadNet-PA         Road            1M       1.5M
    as-skitter         Trace route     1.69M    11.1M
    graph500-scale21   Random          2.1M     34M
* largest: |E| = 134M
Benchmarks
1. Graph Challenge
   1. Julia
   2. Python
   3. Matlab\Octave
2. Our algorithms
   1. Iterative - uses static triangle counting
   2. Delta - uses new algorithm
Finding the Maximal Truss
• Time out – 8 hours
• Usually 200X-500X faster
• Many times over 2000X faster
• Sometimes 10,000X faster
Execution time per iteration
Future Work
• We still think that we can improve by another 10X…
• New triangle counting kernel
  – Balanced and imbalanced intersections
  – Improved warp utilization
Summary
• New algorithm for finding the maximal K-Truss
• Given a static input we use techniques from dynamic graph algorithms
• Hundreds to thousands of times faster than the benchmarks
• We still think that we can improve by another 10X…
Thank you
• Email: jfox43@gatech.edu
Backup Slides
Wang & Chang; 2012
• Modified version of Cohen's algorithm
• Sorts the edges based on their support
  – In each iteration, edges with a support smaller than k − 2 are removed
• Inherently sequential (due to update process)
• Yet, significantly faster than Cohen's algorithm
• Uses hash maps for intersections
Hornet Data Layout
• A scalable and dynamic data structure for graph algorithms and linear-algebra-based problems
• Can support up to 90 million updates per second
• Low overhead in comparison with CSR
  – Initializing is also relatively inexpensive (20%-200%)
  – Equal performance
• Simple to use
• Implemented for CUDA, yet portable to other architectures
cuSTINGER paper: [Green & Bader; HPEC, 2016]: cuSTINGER: Supporting dynamic graph algorithms for GPUs
Hornet – Property Graph Support
[Figure: the Hornet layout (Vertex Id, Used, BSize, Pointer, Dest./Col.) extended with additional per-edge fields: Weight, Type, Time 1, User 1, User 2, …]
• These are optional fields