Computing Graph Centrality
Erik Saule
In collaboration with: Ahmet Erdem Sarıyüce (Sandia), Kamer Kaya (Sabanci University), Ümit V. Çatalyürek (Georgia Tech)
University of North Carolina at Charlotte (CS)
SIAM CSE 2017
Erik Saule (UNCC) Computing Graph Centrality SIAM CSE 2017
Outline
1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
5. Conclusion
Centralities - Concept
Answer questions such as:
- Who controls the flow in a network?
- Who is more important?
- Who has more influence?
- Whose contribution is significant for connections?
Applications:
- Covert network (e.g., terrorist identification)
- Contingency analysis (e.g., weakness/robustness of networks)
- Viral marketing (e.g., who will spread the word best)
- Traffic analysis
- Store locations
Different kinds of graph:
- road networks
- social networks
- power grids
- mechanical mesh
Centralities - Definition
Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$.
Closeness centrality: $cc[v] = \frac{1}{far[v]}$, where the farness is defined as $far[v] = \sum_{u \in comp(v)} d(u, v)$ and $d(u, v)$ is the shortest path length between $u$ and $v$.
Betweenness centrality: $bc(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the number of them passing through $v$.
Both metrics depend on the structure of the shortest path graph. Brandes's algorithm computes the shortest path graph rooted at each vertex of the graph.
$O(|E|)$ per source. $O(|V||E|)$ in total. Believed to be asymptotically optimal [Kintali08].
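As a small worked illustration of the closeness definition, the sketch below computes farness with one BFS per vertex on an unweighted graph. The adjacency-dict representation, the function names, and the 3-vertex path graph are assumptions made for this example, not part of the talk.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adj):
    """cc[v] = 1 / far[v], with far[v] summed over the component of v."""
    cc = {}
    for v in adj:
        far = sum(bfs_distances(adj, v).values())
        # An isolated vertex has far = 0; report 0 instead of dividing by zero.
        cc[v] = 1.0 / far if far > 0 else 0.0
    return cc

# Hypothetical input: the path graph 0 - 1 - 2.
path = {0: [1], 1: [0, 2], 2: [1]}
cc_path = closeness(path)
```

On this path, the middle vertex has farness 1 + 1 = 2 and the end vertices have farness 1 + 2 = 3, so the middle vertex is the most central.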
Brandes's Algorithm for Betweenness [Brandes01]
The algorithm is executed for each source s.

Phase 1: Use BFS from s to compute the number of shortest paths to all vertices.
    Q.push(s), σ[s] ← 1, d[s] ← 0
    while Q is not empty do
        v ← Q.pop(), S.push(v)
        for all w ∈ Γ(v) do
            if d[w] < 0 then
                Q.push(w)
                d[w] ← d[v] + 1
            if d[w] = d[v] + 1 then
                σ[w] ← σ[w] + σ[v]
                P[w].push(v)

Phase 2: Back propagate to compute ratios of paths.
    δ[v] ← 1/σ[v], ∀v ∈ V
    while S is not empty do
        w ← S.pop()
        for v ∈ P[w] do
            δ[v] ← δ[v] + δ[w]
        if w ≠ s then
            bc[w] ← bc[w] + (δ[w] × σ[w] − 1)
    return bc

O(|E|) per source. O(|V||E|) in total. Believed to be asymptotically optimal [Kintali08].
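The two phases above can be sketched in Python as follows. This is a minimal illustration for unweighted graphs, assuming an adjacency-dict input; the function and variable names are chosen for this example. It keeps the slide's formulation, where δ starts at 1/σ and the contribution is δ[w] × σ[w] − 1.

```python
from collections import deque

def brandes_betweenness(adj):
    """Betweenness of every vertex of an unweighted graph.

    adj maps each vertex to a list of neighbours. Ordered pairs (s, t)
    are counted, so an undirected graph yields twice the unordered value.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma).
        dist = {v: -1 for v in adj}
        sigma = {v: 0 for v in adj}
        preds = {v: [] for v in adj}
        dist[s], sigma[s] = 0, 1
        queue, order = deque([s]), []
        while queue:
            v = queue.popleft()
            order.append(v)  # plays the role of the stack S
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: back-propagate in reverse BFS order.
        # Only reached vertices appear in order, so sigma > 0 here.
        delta = {v: 1.0 / sigma[v] for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += delta[w]
            if w != s:
                bc[w] += delta[w] * sigma[w] - 1.0
    return bc

# Hypothetical inputs: a 3-vertex path and a 4-cycle.
bc_path = brandes_betweenness({0: [1], 1: [0, 2], 2: [1]})
bc_cycle = brandes_betweenness({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]})
```

On the path, only the middle vertex lies between another pair, and it is counted once per direction. On the 4-cycle, each opposite pair has two shortest paths, so each intermediate vertex receives half a unit per ordered pair.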
Sampling
Idea
- Run the per-source computation only for a random subset of the sources, and derive statistical bounds on the actual centrality.
- Changing the probability of picking a source can help correct for sampling bias.
Refs
- D. Bader, S. Kintali, K. Madduri, and M. Mihail. Approximating Betweenness Centrality. In WAW 2007.
- R. Geisberger, P. Sanders, and D. Schultes. Better Approximation of Betweenness Centrality. In ALENEX 2008.
- U. Brandes and C. Pich. Centrality Estimation in Large Networks. Int. J. Bifurcation and Chaos 17, 7 (2007).
- D. Eppstein and J. Wang. Fast approximation of centrality. SODA 2001, 228–229.
- K. Okamoto, W. Chen, and X.-Y. Li. Ranking of closeness centrality for large-scale social networks. In Proc. of FAW, 2008.
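A minimal sketch of source sampling, here applied to farness (the quantity behind closeness): run BFS from k uniformly sampled sources and scale the accumulated distances by n/k. It assumes a connected, undirected, unweighted graph and samples without replacement, which is a simplification for illustration and not the exact scheme or bound of any of the cited papers. The names `estimate_farness`, `k`, and `seed` are hypothetical.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def estimate_farness(adj, k, seed=0):
    """Estimate far[v] for all v from k sampled BFS sources."""
    rng = random.Random(seed)
    vertices = list(adj)
    n = len(vertices)
    sources = rng.sample(vertices, k)  # uniform, without replacement
    total = {v: 0 for v in vertices}
    for s in sources:
        for v, d in bfs_distances(adj, s).items():
            total[v] += d
    # Each source is sampled with probability k/n, so scale by n/k.
    return {v: total[v] * n / k for v in vertices}

# Hypothetical input: a ring of 6 vertices.
ring6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
far_full = estimate_farness(ring6, k=6)
far_half = estimate_farness(ring6, k=3)
```

With k equal to the number of vertices the estimate coincides with the exact farness; smaller k trades accuracy for fewer traversals.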
Graph Decomposition
Techniques: articulation points, identical vertices, side vertices.
[Figure: graph decomposition at an articulation point and by identical vertices; labels: A, B, +|A|, +|B|, x2]
Refs
- M. Baglioni, F. Geraci, M. Pellegrini, and E. Lastres. Fast Exact Computation of Betweenness Centrality in Social Networks. In IEEE/ACM ASONAM 2012.
- R. Puzis, P. Zilberman, Y. Elovici, S. Dolev, and U. Brandes. Heuristics for Speeding up Betweenness Centrality Computation. In SocialCom 2012.
- M. Lee, J. Lee, J. Park, R. Choi, C. Chung. QUBE: a quick algorithm for updating betweenness centrality. WWW 2012.
- A. Sariyuce, E. Saule, K. Kaya, and U. V. Catalyurek. Shattering and compressing networks for betweenness centrality. SDM, 2013.
- L. Wang, F. Yang, L. Zhuang, H. Cui, F. Lv, X. Feng. Articulation Point Guided Redundancy Elimination for Betweenness Centrality. PPoPP 2016.
- A. Sariyuce, K. Kaya, E. Saule, and U. V. Catalyurek. Graph manipulations for fast centrality computation. ACM TKDD 2017 (to appear).
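To make the "identical vertices" idea concrete, the sketch below groups vertices that have the same neighbourhood, so that a decomposition scheme could keep one representative per group. The grouping criterion used here (equal open neighbourhoods for non-adjacent vertices, equal closed neighbourhoods for adjacent ones) and the function name are assumptions for illustration; how centrality scores are corrected after merging is not shown.

```python
from collections import defaultdict

def identical_vertex_groups(adj):
    """Group vertices of a simple undirected graph that share a neighbourhood.

    Two vertices land in the same group when their open neighbourhoods
    match (non-adjacent twins) or their closed neighbourhoods match
    (adjacent twins). Vertices must be sortable for the final ordering.
    """
    groups = defaultdict(list)
    for v, nbrs in adj.items():
        groups[("open", frozenset(nbrs))].append(v)
        groups[("closed", frozenset(nbrs) | {v})].append(v)
    # Keep only groups that actually contain more than one vertex.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical inputs: a star (leaves are twins) and a triangle.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
star_groups = identical_vertex_groups(star)
triangle_groups = identical_vertex_groups(triangle)
```

In the star, the three leaves see exactly the same neighbour and form one group; in the triangle, all three vertices share the same closed neighbourhood.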
Outline
1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
   - BFS techniques
   - Multi-source techniques
   - Distributed memory
4. Incremental
5. Conclusion
Standard BFS techniques
Optimizing to hardware platform
- "If I change the storage format this way, I can squeeze x% more edges per cycle"
- Too many to even bother listing.
Direction Optimization
- Notice that on a small-diameter graph, the first iterations are faster top-down and the last ones are faster bottom-up. So switch in the middle.
Refs
- S. Beamer, K. Asanović, D. Patterson. Direction-optimizing breadth-first search. SC 2012.
- H. Liu, H. Huang. Enterprise: Breadth-First Graph Traversal on GPUs. SC 2015.
Good as a reference, but does not always directly port to centrality.
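The direction switch can be sketched as below. This is a simplified illustration for undirected graphs: the switching rule (go bottom-up when the frontier holds more than a fixed fraction of the vertices) and the `switch_fraction` parameter are assumptions made here, not the edge-count heuristic of the cited paper.

```python
def direction_optimizing_bfs(adj, source, switch_fraction=0.05):
    """BFS distances on an undirected graph, switching traversal direction.

    switch_fraction is a hypothetical tuning knob: a level is processed
    bottom-up when the frontier exceeds that fraction of all vertices.
    """
    n = len(adj)
    dist = {v: -1 for v in adj}
    dist[source] = 0
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        next_frontier = set()
        if len(frontier) > switch_fraction * n:
            # Bottom-up: every unvisited vertex looks for a parent in the
            # frontier and stops at the first one it finds.
            for v in adj:
                if dist[v] < 0 and any(u in frontier for u in adj[v]):
                    dist[v] = level
                    next_frontier.add(v)
        else:
            # Top-down: every frontier vertex pushes to unvisited neighbours.
            for u in frontier:
                for v in adj[u]:
                    if dist[v] < 0:
                        dist[v] = level
                        next_frontier.add(v)
        frontier = next_frontier
    return dist

# Hypothetical input: a ring of 8 vertices.
ring8 = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
dist_mixed = direction_optimizing_bfs(ring8, 0, switch_fraction=0.2)
dist_top_down = direction_optimizing_bfs(ring8, 0, switch_fraction=1.0)
dist_bottom_up = direction_optimizing_bfs(ring8, 0, switch_fraction=0.0)
```

Both directions compute the same distances; only the amount of work per level differs, which is why the choice can be made level by level.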
Multiple sources
- Do BFS SpMV style (bottom-up)
- Each source maps to a SIMD lane
- Reduce the number of graph reads
- Works better on low diameter graphs
Refs
- A. Buluc and J. Gilbert. The Combinatorial BLAS: Design, implementation, and applications. IJHPCA 2011.
- A. E. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Hardware/software vectorization for closeness centrality on multi-/many-core architectures. MTAAP, 2014.
- M. Then, M. Kaufmann, F. Chirigati, T. Hoang-Vu, K. Pham, A. Kemper, T. Neumann, H. Vo. The More the Merrier: Efficient Multi-Source Graph Traversal. VLDB 2014.
- A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Regularizing graph centrality computations. JPDC, 76:106–119, February 2015.
- H. Liu, H. Huang, Y. Hu. iBFS: Concurrent Breadth-First Search on GPUs. SIGMOD 2016.
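A rough sketch of the multi-source idea: each source owns one bit of an integer mask per vertex, so a single pass over the adjacency lists advances all BFS traversals at once. Python integers stand in for SIMD lanes here, and the pull-style (bottom-up) sweep assumes an undirected graph. The function name and data layout are illustrative assumptions, not the implementation of any cited paper.

```python
def multi_source_bfs(adj, sources):
    """Run one BFS per source concurrently; dist[i] holds distances from sources[i]."""
    seen = {v: 0 for v in adj}      # bit i set: BFS i has already reached v
    frontier = {v: 0 for v in adj}  # bit i set: v is on the frontier of BFS i
    dist = [{} for _ in sources]
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        frontier[s] |= 1 << i
        dist[i][s] = 0
    level = 0
    while any(frontier.values()):
        level += 1
        next_frontier = {}
        for v in adj:
            # Pull: one OR per edge gathers the frontiers of all BFSs at once.
            gathered = 0
            for u in adj[v]:
                gathered |= frontier[u]
            new = gathered & ~seen[v]  # BFSs reaching v for the first time
            next_frontier[v] = new
            seen[v] |= new
            i = 0
            while new:
                if new & 1:
                    dist[i][v] = level
                new >>= 1
                i += 1
        frontier = next_frontier
    return dist

# Hypothetical input: a ring of 8 vertices, traversed from vertices 0 and 3.
ring8_ms = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
ms_dist = multi_source_bfs(ring8_ms, [0, 3])
```

The graph is read once per level regardless of how many sources are active, which is where the reduction in graph reads comes from.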
Distributed Memory
Graph Partitioning
- Partition the graph on multiple nodes (1D, 2D, fine grain decomposition). Run BFS on it.
- Usually does not scale well. (Scalability has a COST)
- F. McSherry, M. Isard, D. Murray. Scalability! But at what COST? HotOS 2015.
- A. Buluc and J. Gilbert. The Combinatorial BLAS: Design, implementation, and applications. IJHPCA 2011.
- M. Bernaschi, G. Carbone, F. Vella. Betweenness centrality on Multi-GPU systems. MTAAP 2015.
- and a ton more
Graph Replication
- If I can fit the graph entirely on a node, then I can distribute work coarsely by distributing sources.
- Scales linearly.
- R. Lichtenwalter and N. Chawla. DisNet: A Framework for Distributed Graph Computation. ASONAM 2011.
- and a lot more.
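The replication strategy can be sketched as follows: every worker holds the whole graph, takes a share of the sources, and the partial results are summed at the end. In this illustration threads stand in for cluster nodes and farness stands in for the centrality being computed; both choices, along with the function names and the `workers` parameter, are assumptions for the example. An undirected graph is assumed, so summing distances over sources gives each vertex's farness.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def partial_farness(adj, sources):
    """Farness contributions computed from one worker's share of the sources."""
    far = {v: 0 for v in adj}
    for s in sources:
        for v, d in bfs_distances(adj, s).items():
            far[v] += d
    return far

def replicated_farness(adj, workers=4):
    """Split the sources across workers, then reduce the partial results."""
    vertices = list(adj)
    chunks = [vertices[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: partial_farness(adj, c), chunks))
    return {v: sum(p[v] for p in partials) for v in adj}

# Hypothetical input: a ring of 8 vertices.
ring8_rep = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
far_rep = replicated_farness(ring8_rep, workers=3)
```

The only communication is the final reduction, which is why this approach scales with the number of workers as long as the graph fits on each of them.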
Outline
1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
   - Algorithm
   - HPC implementation
5. Conclusion
Incremental Principle
Common Subgraphs
- Case 1: No consequential difference
- Case 2: One more edge in BFS graph
- Case 3: Potentially completely different graph
Refs
- O. Green, R. McColl, and D. Bader. A fast algorithm for streaming betweenness centrality. In Proc. of SocialCom, 2012.
- A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Streamer: a distributed framework for incremental closeness centrality computation. In IEEE Cluster 2013.
- A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Incremental closeness centrality in distributed memory. Parallel Computing, 47:3–18, August 2015.
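One way to tell the three cases apart for a given source, sketched under the assumption that the graph is undirected and unweighted and that both endpoints of the inserted edge are reachable from the source: compare the BFS levels of the two endpoints before the insertion. Mapping the cases to level differences of 0, 1, and more than 1 is an interpretation made for this example, and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def classify_insertion(dist_s, u, v):
    """Classify inserting edge (u, v), given distances from one source.

    Returns 1 when u and v are on the same level (the shortest path graph
    from this source is unaffected), 2 when they are on adjacent levels
    (the new edge joins the shortest path graph, distances are unchanged),
    and 3 otherwise (distances shrink and the structure may change a lot).
    """
    gap = abs(dist_s[u] - dist_s[v])
    if gap == 0:
        return 1
    if gap == 1:
        return 2
    return 3

# Hypothetical input: a ring of 8 vertices, seen from source 0.
ring8_inc = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
levels = bfs_distances(ring8_inc, 0)
cases = [classify_insertion(levels, 1, 7),
         classify_insertion(levels, 1, 6),
         classify_insertion(levels, 1, 4)]
```

Only sources that fall in the third case need a substantial recomputation, which is what makes incremental updates cheaper than starting over.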
Leveraging Pipelining
Refs
- A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Streamer: a distributed framework for incremental closeness centrality computation. In IEEE Cluster 2013.
- A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Incremental closeness centrality in distributed memory. Parallel Computing, 47:3–18, August 2015.
Conclusion
Lots of work out there.
What is unclear (to me) or not done (that I know of)?
- Is sampling really a closed problem?
- Reconcile graph decomposition and sampling.
- Reconcile graph decomposition and regularization.
- Better SpMM-based algorithms.
- A good implementation that uses all these techniques.
- Why do we keep on reinventing the same techniques?
- Does distributed memory (besides replication) ever make sense for centrality?