
Parallel Triangle Counting in MPI Jason Li and David Wise - PowerPoint PPT Presentation



  1. Parallel Triangle Counting in MPI Jason Li and David Wise

  2. Background
  • A triangle in an undirected graph is a collection of 3 vertices such that all 3 pairs of vertices are connected by an edge.
  • “Triangle counting has emerged as an important building block in the study of social networks, identifying thematic structures of networks, spam and fraud detection, link classification and recommendation, and more” [1]

  3. Background
  • A triangle in an undirected graph is a collection of 3 vertices such that all 3 pairs of vertices are connected by an edge.
  [Figure: an example graph with 2 triangles.]
  • “Triangle counting has emerged as an important building block in the study of social networks, identifying thematic structures of networks, spam and fraud detection, link classification and recommendation, and more” [1]

  4. The Underlying Algorithm
  • Initialize the counter to 0.
  • Sort the vertices in order of increasing degree, breaking ties arbitrarily. Similarly, sort the adjacency lists according to the same ordering.
  • For each edge (v, w) with v < w:
    • Let u_v and u_w be the first vertices in the adjacency lists of v and w respectively.
    • While u_v exists and u_v < v, and u_w exists and u_w < v:
      • If u_v < u_w, then set u_v to the next neighbor of v.
      • Else if u_w < u_v, then set u_w to the next neighbor of w.
      • Else increment the counter and set u_v and u_w to their next neighbors.
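The steps above can be sketched in sequential Python (the function name and the edge-list input format are illustrative, not from the slides; the slides' MPI implementation is in C):

```python
from collections import defaultdict

def count_triangles(n, edges):
    """Count triangles in an undirected graph on vertices 0..n-1."""
    # Order vertices by increasing degree, breaking ties by vertex id.
    deg = [0] * n
    for v, w in edges:
        deg[v] += 1
        deg[w] += 1
    order = sorted(range(n), key=lambda v: (deg[v], v))
    rank = {v: i for i, v in enumerate(order)}

    # Sort each adjacency list by the same ordering.
    adj = defaultdict(list)
    for v, w in edges:
        adj[v].append(w)
        adj[w].append(v)
    for v in adj:
        adj[v].sort(key=lambda u: rank[u])

    count = 0
    for v, w in edges:
        # Orient the edge so that a precedes b in the ordering.
        a, b = (v, w) if rank[v] < rank[w] else (w, v)
        A, B = adj[a], adj[b]
        i = j = 0
        # Merge the two sorted lists, counting common neighbors
        # that precede a in the ordering; each triangle is found
        # exactly once, at the edge joining its two later vertices.
        while (i < len(A) and rank[A[i]] < rank[a]
               and j < len(B) and rank[B[j]] < rank[a]):
            if rank[A[i]] < rank[B[j]]:
                i += 1
            elif rank[B[j]] < rank[A[i]]:
                j += 1
            else:
                count += 1
                i += 1
                j += 1
    return count
```

For the complete graph on 4 vertices, `count_triangles(4, [(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)])` returns 4, one per 3-vertex subset.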

  5. Complexity of the Algorithm
  • The space complexity is just O(m), since we store the graph.
  • Because the vertices are sorted by degree and each edge is assigned to its smaller endpoint, it can be shown that the sequential time complexity is O(m^(3/2)).

  6. Parallelizing the Algorithm
  • The focus of our project was efficiently parallelizing this algorithm.
  • Naive idea: each edge is a task and can be arbitrarily assigned to a processor.
  • The catch is that to process an edge, the processor needs to know the neighbors of each vertex on the edge.
  • If the edges are arbitrarily assigned, each processor needs a copy of the whole graph.

  7. Reducing Communication
  • We want the edges assigned to each processor to hit as few vertices as possible.
  • We can approach the problem by grouping the vertices.
  • We partition the vertices into r = √P groups v_1, …, v_r and assign each processor a pair (v_i, v_j).
  • The processor assigned pair (v_i, v_j) is responsible for all edges going from a vertex in v_i to a vertex in v_j.
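A minimal sketch of this partition, assuming contiguous vertex groups of size ⌈n/r⌉ and unordered group pairs (i ≤ j); the function and variable names are hypothetical, and the slides do not specify how groups are formed or how pairs are mapped to MPI ranks:

```python
import math

def assign_edges(n, edges, P):
    """Bucket edges by the pair of vertex groups they connect."""
    r = math.isqrt(P)
    assert r * r == P, "assumes P is a perfect square, r = sqrt(P)"
    size = -(-n // r)                    # ceil(n / r) vertices per group
    group = lambda v: v // size          # contiguous grouping (a simplification)
    buckets = {(i, j): [] for i in range(r) for j in range(i, r)}
    for v, w in edges:
        # Edge (v, w) belongs to the processor owning group pair (i, j).
        i, j = sorted((group(v), group(w)))
        buckets[(i, j)].append((v, w))
    return buckets
```

With `n = 4`, `P = 4` and the complete graph K4, the four cross-group edges land in bucket `(0, 1)` while `(0, 0)` and `(1, 1)` each receive one intra-group edge.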

  8. Cost Analysis
  • In the average case, each processor gets 1/P of the edges: we expect near perfect speedup.
  • Each processor gets the adjacency lists of 2n/√P vertices, which on average has total size 2m/√P.
  [Figure: 4 processors P1–P4 and 2 groups of vertices; the thick arrows represent groups of edges.]
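The communication saving implied by the 2m/√P bound can be checked numerically (the edge count below is a hypothetical example, not from the slides):

```python
import math

def per_processor_comm(m, P):
    # Each processor receives the adjacency lists of 2n/sqrt(P) vertices,
    # whose expected total size is 2m/sqrt(P) edge entries.
    return 2 * m / math.sqrt(P)

m = 1_000_000
for P in (4, 16, 64):
    print(P, per_processor_comm(m, P))
```

At P = 64 each processor receives 2m/8 = m/4 edge entries, versus all m entries under the naive scheme that replicates the whole graph everywhere.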

  9. Actual Speedups Matched Expectations
  [Figure: running time (s) versus number of processors, for 1, 4, 9, 16, 25, 36, 49, and 64 processors.]

  10. More Results
  Graph:    gplus   k5000   live   skitter
  Speedup:  17.23   12.72   1.95   3.98

  11. Thank you! Questions?

  12. References
  1. A. Pavan, Kanat Tangwongsan, Srikanta Tirthapura, Kun-Lung Wu, “Counting and Sampling Triangles from a Graph Stream”, Proceedings of the VLDB Endowment, Volume 6, Issue 14, September 2013, pages 1870–1881.
  2. Shu-Hao Yu, YiCheng Qin, “15-418 Final Report”. http://www.cs.cmu.edu/afs/cs/user/shuhaoy/www/Final_Project.pdf
