Partitioning & Clustering Big Graphs - George Karypis

University of Minnesota, Department of Computer Science & Engineering
Partitioning & Clustering Big Graphs
George Karypis
Department of Computer Science & Engineering, University of Minnesota, Twin Cities
Thursday, May 23, 2013

Overview
• Overview of graph partitioning
• The multilevel paradigm
• METIS family of partitioning tools
• Multi-threaded algorithms for partitioning & clustering
• Closing remarks

Standard graph partitioning problem
Given a graph G = (V, E), we want to partition it into k parts such that:
• each part has roughly the same number of vertices, and
• the number of edges that straddle partitions (the edge-cut) is minimized.
Applications:
• Parallel & distributed computing
• Scientific computing
• VLSI physical design
• Data mining
• Storage and placement
• ...
It is NP-hard. Heuristic algorithms are used!
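The two criteria in the problem statement can be made concrete with a small sketch. The graph, the partition vector, and the function names `edge_cut` and `imbalance` below are illustrative assumptions and do not come from the slides; the imbalance measure (largest part size over the ideal size |V|/k) is one common way to quantify "roughly the same number of vertices".

```python
from collections import Counter

def edge_cut(edges, part):
    """Count the edges whose two endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])

def imbalance(part, k):
    """Largest part size divided by the ideal size |V| / k (1.0 = perfect)."""
    sizes = Counter(part.values())
    ideal = len(part) / k
    return max(sizes.get(p, 0) for p in range(k)) / ideal

# Hypothetical 6-vertex graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

cut = edge_cut(edges, part)
balance = imbalance(part, k=2)
print(cut, balance)
```

Here only the bridging edge (2, 3) is cut and both parts hold three vertices, so this 2-way partition is both balanced and minimal for this toy graph.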

Overview of the multilevel graph partitioning paradigm

Coarsening Phase
Successive coarse graphs are constructed by computing a matching of the edges and collapsing together the vertices incident on these edges.
[Figure: Heavy-Edge Matching example on a weighted graph. Panel labels: Total Edge-Weight: 37; Total Edge-Weight: 21.]
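One coarsening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the METIS implementation: the graph is a symmetric dictionary `adj[u][v] = weight`, vertices are visited in random order, vertex weights are ignored, and the names `heavy_edge_matching`, `contract`, and `total_edge_weight` are hypothetical.

```python
import random

def heavy_edge_matching(adj, seed=0):
    """Match each unmatched vertex with the unmatched neighbor on its heaviest edge."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    match = {}
    for u in order:
        if u in match:
            continue
        candidates = [(w, v) for v, w in adj[u].items() if v not in match and v != u]
        if candidates:
            _, v = max(candidates)      # heaviest incident edge to an unmatched vertex
            match[u], match[v] = v, u
        else:
            match[u] = u                # no free neighbor: the vertex stays single
    return match

def contract(adj, match):
    """Collapse matched pairs into coarse vertices, summing parallel edge weights."""
    cmap, next_id = {}, 0
    for u in adj:
        if u not in cmap:
            cmap[u] = cmap[match[u]] = next_id
            next_id += 1
    coarse = {c: {} for c in range(next_id)}
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            cu, cv = cmap[u], cmap[v]
            if cu != cv:                # edges inside a collapsed pair disappear
                coarse[cu][cv] = coarse[cu].get(cv, 0) + w
    return coarse, cmap

def total_edge_weight(adj):
    """Sum of edge weights, counting each undirected edge once."""
    return sum(w for nbrs in adj.values() for w in nbrs.values()) // 2

# Hypothetical path graph 0 -5- 1 -1- 2 -4- 3.
adj = {0: {1: 5}, 1: {0: 5, 2: 1}, 2: {1: 1, 3: 4}, 3: {2: 4}}
match = heavy_edge_matching(adj)
coarse, cmap = contract(adj, match)
print(total_edge_weight(adj), total_edge_weight(coarse))
```

On this path graph every visiting order matches (0, 1) and (2, 3), so the heavy edges are hidden inside coarse vertices and only the weight-1 edge remains exposed, which mirrors the drop in total edge-weight shown in the slide's figure.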

Refinement Phase
The refinement is performed using move-based approaches, based on the algorithm by Fiduccia-Mattheyses (FM).
[Figure: vertex v on the boundary between Partition i and Partition j, shown in two panels. Labels: ID[v] = 4, ED[v] = 8, Gain[v] = ED[v] - ID[v] = 4; Edgecut = 8 and Edgecut = 4.]
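The gain computation behind a single FM-style move can be sketched as below for the 2-way case. The neighbor names and the layout of the graph are assumptions chosen only to reproduce the numbers on the slide (internal degree 4, external degree 8); `move_gain` and `edge_cut` are hypothetical helper names.

```python
def move_gain(adj, part, v):
    """Return (ID, ED, gain) for moving v to the other side of a 2-way partition."""
    internal = sum(w for u, w in adj[v].items() if part[u] == part[v])
    external = sum(w for u, w in adj[v].items() if part[u] != part[v])
    return internal, external, external - internal

def edge_cut(adj, part):
    """Total weight of edges crossing the partition (each edge counted once)."""
    crossing = sum(w for u in adj for v, w in adj[u].items() if part[u] != part[v])
    return crossing // 2

# Hypothetical neighborhood of v: two edges of weight 2 inside partition i,
# and edges of weight 3, 3, 1, 1 going to partition j.
nbrs = {"a": 2, "b": 2, "c": 3, "d": 3, "e": 1, "f": 1}
adj = {"v": dict(nbrs)}
for u, w in nbrs.items():
    adj[u] = {"v": w}
part = {"v": "i", "a": "i", "b": "i", "c": "j", "d": "j", "e": "j", "f": "j"}

internal, external, gain = move_gain(adj, part, "v")
cut_before = edge_cut(adj, part)
part["v"] = "j"                      # apply the move
cut_after = edge_cut(adj, part)
print(internal, external, gain, cut_before, cut_after)
```

In this sketch the edge-cut drops by exactly the gain when v is moved. A full FM pass would also keep vertices in a gain-ordered structure, lock moved vertices, and update neighbor gains, none of which is shown here.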

Why does the multilevel partitioning paradigm work?
• The coarsening phase, by hiding a large fraction of the edges, makes the partitioning problem easier.
• Performing refinement at successively finer graphs enhances the effectiveness of refinement algorithms.
  • Multi-scale refinement
[Figure: weighted example graph (edge weights 10, 3, and 1) and the corresponding graph "After Coarsening".]

When does the multilevel paradigm have difficulties?
• The value of the objective function in the original graph cannot be (tightly) upper bounded while operating on a coarser graph.
  • We cannot ensure improvements at a coarse graph lead to improvements in the original graph.
• Coarsening fails to make the optimization problem easier in coarser graphs.
  • It is not "in tune" with the objective.
• Coarsening fails to reduce the size of the problem (|V|+|E|).
  • Can increase the runtime/memory requirements.
• The objective function is based on global properties of the graph.
  • Can substantially increase the refinement time.

METIS, ParMETIS, & hMETIS
• Software packages for partitioning unstructured graphs and hypergraphs and computing fill-reducing orderings.
• METIS was released in 1995 (current version 5.x).
• ParMETIS was released in 1997 (current version 4.x).
• hMETIS was released in 1998 (current version 2.x).
• They are freely distributed and widely used.

Beyond the traditional partitioning problem (1)
• Vertex separators
  • Partition the graph by removing a minimum set of vertices.
  • Broad applications to:
    • matrix reordering for direct solvers
    • concurrency extraction by decoupling computations at each partition
    • overlapping clustering solutions

Beyond the traditional partitioning problem (2)
• Constraints
  • Multiple balancing constraints
    • balance load & memory requirements,
    • balance the different types of modules that are assigned to each chip in a multi-chip FPGA design,
    • balance incoming & outgoing messages,
    • balance iterative & direct solvers, etc.
  • Connectivity constraints
    • ensure that the graph induced by the vertices of each partition is connected.
  • Placement constraints
    • ensure that certain vertices are placed in different and/or the same partitions.
  • No constraints
    • Objective driven partitioning.
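To illustrate what "multiple balancing constraints" means in practice, the sketch below checks a partition against two per-vertex weights at once (for example load and memory). The weight vectors, the partitions, and the function name `constraint_imbalances` are assumptions made for illustration; the slides do not prescribe a particular imbalance formula.

```python
def constraint_imbalances(vwgt, part, k):
    """For each constraint, return max part weight divided by the ideal weight total / k."""
    ncon = len(next(iter(vwgt.values())))
    totals = [[0.0] * ncon for _ in range(k)]
    for v, weights in vwgt.items():
        for c in range(ncon):
            totals[part[v]][c] += weights[c]
    result = []
    for c in range(ncon):
        total = sum(totals[p][c] for p in range(k))
        result.append(max(totals[p][c] for p in range(k)) / (total / k))
    return result

# Hypothetical vertex weights: (load, memory) per vertex.
vwgt = {0: (1, 4), 1: (1, 1), 2: (1, 1), 3: (1, 4)}

balanced = {0: 0, 1: 0, 2: 1, 3: 1}   # each part gets load 2 and memory 5
skewed = {0: 0, 3: 0, 1: 1, 2: 1}     # load is even, memory is 8 versus 2

print(constraint_imbalances(vwgt, balanced, k=2))
print(constraint_imbalances(vwgt, skewed, k=2))
```

The second partition is perfectly balanced on load yet badly unbalanced on memory, which is the situation a single-constraint partitioner cannot detect.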

Beyond the traditional partitioning problem (3)
• Objectives
  • Communication volume
  • Subdomain connectivity
  • Redistribution overhead
  • Multiple edge-defined cost functions
  • Path-based objectives
    • timing considerations in VLSI circuits
  • Clustering objectives
    • normalized cut, ratio cut, min-max, modularity, … [CLUTO]
  • Various combinations of the above
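As an example of an objective that differs from the edge-cut, the sketch below computes total communication volume. It assumes the commonly used definition in which each vertex sends its data once to every other part that contains at least one of its neighbors, with unit data size per vertex; the slide does not spell out the definition, and the graph and function names are hypothetical.

```python
def communication_volume(adj, part):
    """Sum over vertices of the number of distinct remote parts among their neighbors."""
    total = 0
    for v, nbrs in adj.items():
        remote_parts = {part[u] for u in nbrs if part[u] != part[v]}
        total += len(remote_parts)
    return total

def edge_cut(adj, part):
    """Number of edges crossing the partition (each undirected edge counted once)."""
    return sum(1 for v in adj for u in adj[v] if part[u] != part[v]) // 2

# Hypothetical graph: vertex 0 is adjacent to 1, 2, 3, and 1-2-3 form a path.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
part = {0: 0, 1: 1, 2: 1, 3: 1}

print(edge_cut(adj, part), communication_volume(adj, part))
```

Vertex 0 has three cut edges but needs to send its data to only one remote part, so the two objectives count its contribution differently: three cut edges versus one unit of outgoing volume.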

Some performance numbers
Intel(R) Xeon(R) CPU E5-2670 @ 2.60GHz, 128GB

IMPROVING SINGLE NODE PERFORMANCE

Multi-threaded graph partitioning/clustering
• Opportunities:
  • Multi-core processors have become ubiquitous.
  • Their cache-coherent shared-memory architecture makes it easier to develop parallel programs.
• Challenges:
  • Non-uniform access to shared memory.
  • Many applications are bound by memory bandwidth.
  • Limited memory per core.
