Exploring Scalable Implementations of Triangle Enumeration in Graphs of Diverse Densities: Apache-Spark vs. GPU
Travis Johnston, Stephen Herbein, and Michela Taufer
Global Computing Laboratory, University of Delaware
Introduction
Graphs are powerful tools for modeling.
- Model social interaction:
  - Friendship graphs
  - Social networks
  - Collaboration/co-authorship graphs
  - Phone call graphs
- Model computer networks:
  - WWW (pages linking to other pages)
  - WWW (hardware linking to other hardware)
- Model data moving through a network:
  - Moving data from servers to users (WWW-hardware network)
  - Infectious disease moving through a social network
Introduction
What information can the structure of a graph convey?
- Identify the most influential nodes:
  - Personalities with many Twitter followers, e.g. Katy Perry, Justin Bieber, Taylor Swift, and Barack Obama
  - Prolific authors/collaborators, e.g. Paul Erdős with ≥ 500 collaborators and ≥ 1525 papers
  - Important web pages, e.g. get.adobe.com/reader/, cnn.com, and google.com
- Identify communities:
  - Friends with similar interests
  - Websites with similar topics
  - Criminal networks
Introduction
Why triangle enumeration?
- Used to calculate local clustering coefficient
- Used to compute transitivity ratio
- Directly applicable in spam detection and web link recommendation
"Finding triangles in graphs is a classic theoretical problem with numerous practical applications. The recent explosion of work on social networks has led to a great interest in fast algorithms to find triangles in graphs. The social sciences and physics communities often study triangles in real networks and use them to reason about underlying social processes. ... Triangle enumeration is also a fundamental subroutine for other more complex algorithmic tasks." [1]
[1] http://www.cs.princeton.edu/~csesha/pubs/conf-triangle-enum.pdf
[2] http://people.seas.harvard.edu/~babis/int-math-triangles.pdf
[3] http://arxiv.org/abs/0904.3761
Goal and Contributions
Our goal: Study the efficiency of highly parallel algorithms for triangle enumeration on two parallel architectures.
Our contributions:
- Present two algorithmic implementations of Triangle Enumeration:
  - Triangle Enumeration via matrix multiplication on GPU
  - Triangle Enumeration via MapReduce using Apache-Spark
- Critically compare the performance on two graph models:
  - Erdős-Rényi (ER) random graph model
  - Preferential attachment model
What is a Graph?
Definition: A graph G = (V, E) contains a set of vertices V and a set of edges E. Each edge e ∈ E is a set of two (distinct) vertices, e = {i, j}.
[Figure: a graph with its vertices and edges labeled; an edge e joins vertices i and j.]
What is a Triangle?
Definition: Three vertices form a triangle if each pair of vertices shares an edge.
[Figure: example graph on vertices 1 through 6.] This graph contains 2 triangles (blue).
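The definition can be checked directly by testing every triple of vertices. The following is a minimal sketch, not the implementation studied in these slides; the edge list `EDGES` is an assumption, read off the adjacency matrix given for this six-vertex example graph later in the deck, and `enumerate_triangles` is a hypothetical helper name.

```python
from itertools import combinations

# Assumed edge list for the six-vertex example graph (read off its adjacency matrix).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def enumerate_triangles(edges):
    """Return every vertex triple in which each pair of vertices shares an edge."""
    edge_set = {frozenset(e) for e in edges}
    vertices = sorted({v for e in edges for v in e})
    return [
        (a, b, c)
        for a, b, c in combinations(vertices, 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    ]


triangles = enumerate_triangles(EDGES)
print(triangles)  # [(1, 2, 3), (4, 5, 6)]
```

This brute-force check examines all triples of vertices, so it is only useful as a reference for small graphs.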
Triangle Enumeration via Matrix Multiplication (GPU)
Definition: The Adjacency Matrix of a graph is a matrix $A_{n,n} = [a_{ij}]$ where $a_{ij} = 1$ if vertex $i$ is adjacent to vertex $j$, and $a_{ij} = 0$ otherwise.
[Figure: the example graph on vertices 1 through 6.]
$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 & 0 \end{pmatrix}$$
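As an illustration of the definition, the sketch below builds the adjacency matrix of the example graph from an edge list. The edge list and the helper name `adjacency_matrix` are assumptions introduced here for illustration; vertices are numbered from 1 as in the figure, so indices are shifted by one.

```python
# Assumed edge list for the six-vertex example graph.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        # Vertices are 1-based; an undirected edge sets both a_ij and a_ji.
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    return a


A = adjacency_matrix(6, EDGES)
for row in A:
    print(row)
```

Because each edge is an unordered pair, the resulting matrix is symmetric, and its diagonal is zero since a simple graph has no loops.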
Triangle Enumeration via Matrix Multiplication (GPU)
Theorem: If $A$ is the adjacency matrix of a simple graph $G$, then the $ij$th entry of $A^k$ is the number of walks on $k$ edges beginning at vertex $i$ and ending at vertex $j$.
Corollary: If $A$ is the adjacency matrix of a simple graph $G$ and $A^3 = [a_{ij}]$, then the number of triangles in $G$ is
$$\frac{1}{6} \sum_{i} a_{ii} = \frac{\operatorname{tr}(A^3)}{6}.$$
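The factor of 1/6 arises because each triangle contributes six closed walks on 3 edges: one for each of its 3 starting vertices in each of 2 directions. A minimal CPU sketch of the corollary in plain Python follows; it is not the GPU implementation, and the helper names `mat_mul` and `count_triangles` are hypothetical. The matrix `A` is the adjacency matrix of the six-vertex example graph.

```python
# Adjacency matrix of the six-vertex example graph.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def mat_mul(x, y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def count_triangles(a):
    """Count triangles as tr(A^3) / 6 for a simple undirected graph."""
    a3 = mat_mul(mat_mul(a, a), a)
    trace = sum(a3[i][i] for i in range(len(a)))
    return trace // 6  # the trace is always a multiple of 6 for a simple graph


print(count_triangles(A))  # 2
```

Note that the trace gives only the number of triangles; listing which vertices form each triangle requires further work beyond this corollary.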