Multilevel refinement based on neighborhood similarity
Alan Valejo, Jorge Valverde-Rebaza, Brett Drury and Alneu de Andrade Lopes
Department of Computer Science, ICMC, University of São Paulo
C. P. 668, CEP 13560-970, São Carlos, SP, Brazil
{alan, jvalverr, bdrury, alneu}@icmc.usp.br
July, 2014
Outline
1. Introduction
2. RSim
3. Experiments
4. Conclusion
Introduction
Graph partitioning techniques aim to divide the set of vertices of a graph into k disjoint partitions. Examples of networks:
• Social network
• Biological network
• Information network
• Technology network

• Vertices belonging to the same partition share common properties and have similar roles
• Graph partitioning is useful for understanding the topological structure and dynamic processes of networks
The graph partitioning problem is NP-complete
• Identifying an optimal solution is computationally expensive
• Infeasible for large-scale networks

Big Data
• Facebook, Web networks, Biological, Biomedical, ...
Multilevel graph partitioning
This strategy allows algorithms with high computational cost to be applied to large networks without significant impact on solution quality [Karypis and Kumar, 1998]
Refinement
Goal: improve the multilevel solution.
Refinement methods tend to use the general structural properties of complex networks:
• Cut minimization and balancing
• Modularity maximization
Social networks
• High clustering coefficient
• Significant assortative mixing
• Numerous common relationships among their members

These properties are quantified using neighborhood and similarity measures [Valverde-Rebaza and Lopes, 2012]
RSim
Refinement algorithm based on neighborhood similarity
Similarity measures quantify common characteristics between two vertices

Global information
• Higher accuracy, very time-consuming
Local information
• Information about pair of vertices
• Generally faster
Hybrid similarity measures [Valverde-Rebaza and Lopes, 2012]
• Other network information
• Using community information

Example for the pair (1, 2):
• Common neighbors: |{3, 4}|
• Within-community common neighbors: |{3}|
W measures. Reformulation of the local similarity measures that considers only the common neighbors within the community.

Local measure (left) and the corresponding W measure (right):

\[ S^{CN}_{v,u} = |\Lambda_{v,u}| \qquad\qquad S^{CN-W}_{v,u} = |\Lambda^{W}_{v,u}| \]

\[ S^{Jac}_{v,u} = \frac{|\Lambda_{v,u}|}{|\Gamma(v) \cup \Gamma(u)|} \qquad\qquad S^{Jac-W}_{v,u} = \frac{|\Lambda^{W}_{v,u}|}{|\Gamma(v) \cup \Gamma(u)|} \]

\[ S^{AA}_{v,u} = \sum_{z \in \Lambda_{v,u}} \frac{1}{\log k(z)} \qquad\qquad S^{AA-W}_{v,u} = \sum_{z \in \Lambda^{W}_{v,u}} \frac{1}{\log k(z)} \]

...

WIC measure. The WIC measure uses information about the inter-community and intra-community common neighbors of the evaluated pair (v, u):

\[ S^{WIC}_{v,u} = \begin{cases} |\Lambda^{W}_{v,u}| & \text{if } \Lambda^{W}_{v,u} = \Lambda_{v,u} \\ |\Lambda^{W}_{v,u}| \,/\, |\Lambda^{I}_{v,u}| & \text{otherwise} \end{cases} \]
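The local measures, their W counterparts, and WIC can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that Λ is the set of common neighbors, Γ the neighbor set, k the degree, that Λ^W keeps the common neighbors sharing the community label of both v and u, and that Λ^I is the remainder Λ \ Λ^W. The toy graph and labels are hypothetical, chosen only so that the pair (1, 2) has common neighbors {3, 4} and within-community common neighbors {3}.

```python
import math

# Hypothetical toy graph: adjacency sets and community labels (assumptions).
ADJ = {1: {3, 4}, 2: {3, 4}, 3: {1, 2}, 4: {1, 2}}
LABEL = {1: "A", 2: "A", 3: "A", 4: "B"}

def common_neighbors(adj, v, u):
    """Lambda_{v,u}: vertices adjacent to both v and u."""
    return adj[v] & adj[u]

def within_common_neighbors(adj, label, v, u):
    """Lambda^W_{v,u}: common neighbors sharing the label of v and u (assumed definition)."""
    return {z for z in adj[v] & adj[u] if label[z] == label[v] == label[u]}

def s_cn(lam):
    """CN (pass Lambda) or CN-W (pass Lambda^W)."""
    return len(lam)

def s_jac(lam, adj, v, u):
    """Jac (pass Lambda) or Jac-W (pass Lambda^W); the denominator is the same."""
    union = adj[v] | adj[u]
    return len(lam) / len(union) if union else 0.0

def s_aa(lam, adj):
    """AA (pass Lambda) or AA-W (pass Lambda^W); k(z) >= 2 for any common neighbor."""
    return sum(1.0 / math.log(len(adj[z])) for z in lam)

def s_wic(lam, lam_w):
    """WIC: |Lambda^W| if there are no inter-community common neighbors, else a ratio."""
    lam_i = lam - lam_w  # assumed: Lambda^I is the rest of Lambda
    if not lam_i:
        return float(len(lam_w))
    return len(lam_w) / len(lam_i)

lam = common_neighbors(ADJ, 1, 2)
lam_w = within_common_neighbors(ADJ, LABEL, 1, 2)
print(s_cn(lam), s_cn(lam_w), s_jac(lam_w, ADJ, 1, 2), s_wic(lam, lam_w))
```

Passing Λ or Λ^W to the same function yields the local or the W variant, which mirrors the two-column layout of the formulas.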
• Refinement process for the boundary vertex 5 using RSim-CN
• Given C = {C_A, C_B}, C_A = {1, 2, 3, 4, 5}, C_B = {6, 7, 8, 9}

\[ ws(C_A) = \frac{1}{k_{C_A}(v)} \sum_{(5,u) \mid u \in C_A} S^{CN-W}_{5,u} = \frac{|\Lambda^{C_A}_{5,2}| + |\Lambda^{C_A}_{5,4}|}{k_{C_A}(v)} = \frac{|\{4\}| + |\{2\}|}{2} = 1 \]

\[ ws(C_B) = \frac{1}{k_{C_B}(v)} \sum_{(5,u) \mid u \in C_B} S^{CN-W}_{5,u} = \frac{|\Lambda^{C_B}_{5,6}| + |\Lambda^{C_B}_{5,7}| + |\Lambda^{C_B}_{5,8}|}{k_{C_B}(v)} = \frac{|\{7\}| + |\{6, 8\}| + |\{7\}|}{3} = 1.33 \]

[Figure panels: Initial Partitioning, Uncoarsening, Refinement]
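The arithmetic for vertex 5 can be reproduced with a short Python sketch. The edge list below is an assumption: it contains only the edges that can be inferred from the common-neighbor sets in the worked example (the original figure is not available), so it is valid for vertex 5 only and omits any edges of vertices 1, 3 and 9. The function name `weighted_similarity` is hypothetical.

```python
# Edges inferred from the worked example for vertex 5 (assumption; figure unavailable).
EDGES = [(2, 4), (2, 5), (4, 5), (5, 6), (5, 7), (5, 8), (6, 7), (7, 8)]
COMMUNITY = {1: "A", 2: "A", 3: "A", 4: "A", 5: "A",
             6: "B", 7: "B", 8: "B", 9: "B"}

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def weighted_similarity(adj, community, v, target):
    """ws(target) for vertex v with CN-W: the common-neighbor counts restricted to
    `target`, summed over the neighbors of v in `target`, divided by k_target(v)."""
    neighbors = [u for u in adj[v] if community[u] == target]
    if not neighbors:
        return 0.0
    total = sum(
        len({z for z in adj[v] & adj[u] if community[z] == target})
        for u in neighbors
    )
    return total / len(neighbors)

ADJ = build_adjacency(EDGES)
ws_a = weighted_similarity(ADJ, COMMUNITY, 5, "A")
ws_b = weighted_similarity(ADJ, COMMUNITY, 5, "B")
print(f"ws(C_A) = {ws_a:.2f}, ws(C_B) = {ws_b:.2f}")  # ws(C_A) = 1.00, ws(C_B) = 1.33
```

Under these assumptions vertex 5 is more similar to C_B than to its current community C_A, which matches the values 1 and 1.33 in the example.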
• RSim has numerous variants based on the set of common neighbors.
• It is possible that all of them lead to the same decisions.

Refining vertex 5:
Variant    w(C_a)  w(C_b)
RSim-CN    1.00    1.33
RSim-HP    0.25    0.38
RSim-HD    0.20    0.26

Refining vertices 2 and 4:
           refining 2        refining 4
Variant    w(C_a)  w(C_b)    w(C_a)  w(C_b)
RSim-CN    2.00    0.00      2.00    0.00
RSim-HP    0.66    0.00      0.66    0.00
RSim-HD    0.55    0.00      0.55    0.00

[Figure panels: Initial Partitioning, Refinement]
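A small Python check makes the "same decisions" observation concrete for the values in the tables. The decision rule used here (assign the vertex to the community with the larger score, keeping C_a on a tie) is an assumption made for illustration; the slides do not spell out the rule or its tie-breaking.

```python
# Scores (w(C_a), w(C_b)) copied from the tables above.
SCORES = {
    5: {"RSim-CN": (1.00, 1.33), "RSim-HP": (0.25, 0.38), "RSim-HD": (0.20, 0.26)},
    2: {"RSim-CN": (2.00, 0.00), "RSim-HP": (0.66, 0.00), "RSim-HD": (0.55, 0.00)},
    4: {"RSim-CN": (2.00, 0.00), "RSim-HP": (0.66, 0.00), "RSim-HD": (0.55, 0.00)},
}

def decide(w_a, w_b):
    """Assumed rule: pick the community with the larger score; a tie keeps C_a."""
    return "C_a" if w_a >= w_b else "C_b"

def decisions(scores):
    """Map each vertex to the community chosen by every variant."""
    return {
        vertex: {variant: decide(*w) for variant, w in per_variant.items()}
        for vertex, per_variant in scores.items()
    }

def unanimous(per_variant):
    """True when all variants choose the same community."""
    return len(set(per_variant.values())) == 1

RESULT = decisions(SCORES)
for vertex, per_variant in RESULT.items():
    print(vertex, per_variant, "unanimous" if unanimous(per_variant) else "split")
```

With the tabulated values, every variant sends vertex 5 to C_b and keeps vertices 2 and 4 in C_a.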
Experiments
We evaluated ten RSim variants
• Each variant uses a different similarity measure (WIC or one of the W measures)

RSim was compared with:
• KK [Karypis and Kumar, 1998]
• RFG [Rotta and Noack, 2011]
• baseline (also called no-refinement) [Almeida and Lopes, 2009]

[Figure: Flowchart]