Link Prediction in Online Social Networks Using Group Information
Jorge Valverde-Rebaza and Alneu de Andrade Lopes
Laboratory of Computational Intelligence (LABIC), University of São Paulo (USP), Brazil
July 2014
Outline
1. Introduction
2. Proposal
3. Experimental Evaluation
4. Conclusion
1. Introduction: Social Networks, Groups Detection, Link Prediction
Social Networks
- A social network is a structure made up of a set of actors (individuals or organizations) and the social relations between them.
- Social network analysis is an active research field in graph and complex network theory, data mining, machine learning, and other areas.
- Online social networks are on the rise.
Groups Detection
- Real networks are characterized by a high concentration of links within special groups of vertices and a low concentration of links between these groups.
- Online social networks offer a wide variety of possible groups: families, working and friendship circles, artistic or academic preferences, towns, nations, etc.
Link Prediction Process
[Figure: link prediction process]
Link Prediction Measures

Based on local information
- Lower accuracy than measures based on global information
- Faster computation
- E.g.: Common Neighbors (CN), Adamic-Adar (AA), Jaccard (Jac), Resource Allocation (RA), Preferential Attachment (PA), etc. [Lü and Zhou, 2011]

Based on global information
- Higher accuracy
- Very time-consuming computation
- Usually infeasible for large-scale networks
- E.g.: Katz index, Hitting time index, SimRank, etc. [Lü and Zhou, 2011]

Hybrid strategy based on community information
- As the community structure grows, the accuracy of these measures drastically improves
- A node belongs to just one group
- Performs better than most measures based on local information
- E.g.: PFF [Zheleva et al., 2010], CN1, RA1 [Soundarajan and Hopcroft, 2012], WIC, W-measures [Valverde-Rebaza and Lopes, 2012], etc.
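To make the local measures listed above concrete, the following is a minimal Python sketch using their standard definitions as surveyed in [Lü and Zhou, 2011]. The function name `local_scores`, the adjacency-dictionary representation, and the toy graph are illustrative assumptions, not part of the slides.

```python
import math

def local_scores(adj, x, y):
    """Local similarity scores for the node pair (x, y).

    `adj` maps each node to the set of its neighbors. The graph is
    assumed to be undirected (an assumption of this sketch), so every
    common neighbor has degree >= 2 and log(degree) is never zero.
    """
    nx, ny = adj[x], adj[y]
    common = nx & ny   # shared neighbors of x and y
    union = nx | ny    # all neighbors of x or y
    return {
        "CN": len(common),
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common),
        "Jac": len(common) / len(union) if union else 0.0,
        "RA": sum(1.0 / len(adj[z]) for z in common),
        "PA": len(nx) * len(ny),
    }

# Hypothetical toy graph: a 4-node graph where b and d are not linked.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
scores = local_scores(adj, "b", "d")
print(scores)
```

Every score here needs only the neighborhoods of x, y, and their common neighbors, which is why these measures are cheaper to compute than global ones.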
2. Proposal: Preliminary, WOCG, CNG, TPOG
Preliminary
- We consider that each node participates in multiple groups.
- In the network $G(V, E)$ there exist $M > 1$ groups identified by different group labels $g_1, g_2, \dots, g_M$.
- Each node $x$ belongs to a set of node groups $\mathcal{G} = \{g_a, g_b, \dots, g_p\}$ with size $P > 0$ and $P \le M$.
- The set of neighbors of a vertex $x$ is $\Gamma(x) = \{ y \mid (x, y) \in E \}$.
- The set of all common neighbors (CN) of a vertex pair $(x, y)$ is $\Lambda_{x,y} = \Gamma(x) \cap \Gamma(y)$.
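The definitions of $\Gamma(x)$ and $\Lambda_{x,y}$ above can be sketched in Python as follows. The helper names `build_neighbors` and `common_neighbors`, the toy edge list, and the choice to treat every edge as undirected are assumptions made for illustration only.

```python
from collections import defaultdict

def build_neighbors(edges):
    """Build Gamma: a mapping from each node to its set of neighbors.

    Edges are treated as undirected here (an assumption of this sketch).
    """
    gamma = defaultdict(set)
    for u, v in edges:
        gamma[u].add(v)
        gamma[v].add(u)
    return gamma

def common_neighbors(gamma, x, y):
    """Lambda_{x,y}: the intersection of the neighbor sets of x and y."""
    return gamma[x] & gamma[y]

# Hypothetical toy edge list.
edges = [("x", "z1"), ("y", "z1"), ("x", "z2"), ("y", "z2"), ("x", "w")]
gamma = build_neighbors(edges)
cn = common_neighbors(gamma, "x", "y")
print(sorted(cn))
```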
CN Within and Outside of Common Groups (WOCG)
- Let $\mathcal{G}_\alpha$ and $\mathcal{G}_\beta$ be the group sets of $x$ and $y$, and consider $\mathcal{G}_{\alpha,\beta} = \mathcal{G}_\alpha \cap \mathcal{G}_\beta$.
- We redefine the set of CN as $\Lambda_{x,y} = \Lambda^{WCG}_{x,y} \cup \Lambda^{OCG}_{x,y}$, where:
  - $\Lambda^{WCG}_{x,y} = \{ z^{\mathcal{G}_\gamma} \in \Lambda_{x,y} \mid \mathcal{G}_{\alpha,\beta} \cap \mathcal{G}_\gamma \neq \emptyset \}$ is the set of common neighbors within common groups (WCG);
  - $\Lambda^{OCG}_{x,y} = \Lambda_{x,y} - \Lambda^{WCG}_{x,y}$ is the set of common neighbors outside of the common groups (OCG).
- Our final score, called the common neighbors within and outside of common groups (WOCG) measure, is defined as:

$$s^{WOCG}_{x,y} = \frac{|\Lambda^{WCG}_{x,y}|}{|\Lambda^{OCG}_{x,y}|} \qquad (1)$$
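A minimal Python sketch of the WOCG score in Eq. (1) follows. The function name `wocg_score`, the dictionaries `gamma` (neighbor sets) and `groups` (group sets per node), and the toy data are illustrative assumptions. The slides do not say how an empty $\Lambda^{OCG}_{x,y}$ is handled; returning infinity (or 0 when both sets are empty) is a choice made only for this sketch.

```python
def wocg_score(gamma, groups, x, y):
    """WOCG score: |CN within common groups| / |CN outside common groups|."""
    cn = gamma[x] & gamma[y]              # Lambda_{x,y}
    shared = groups[x] & groups[y]        # groups common to x and y
    wcg = {z for z in cn if groups[z] & shared}  # CN inside a common group
    ocg = cn - wcg                        # remaining common neighbors
    if not ocg:
        # Assumption: the source does not define this case.
        return float("inf") if wcg else 0.0
    return len(wcg) / len(ocg)

# Hypothetical toy data: x and y share the group "g2".
gamma = {
    "x": {"z1", "z2", "z3"},
    "y": {"z1", "z2", "z3"},
    "z1": {"x", "y"},
    "z2": {"x", "y"},
    "z3": {"x", "y"},
}
groups = {
    "x": {"g1", "g2"},
    "y": {"g2", "g3"},
    "z1": {"g2"},   # within the common group g2
    "z2": {"g1"},   # shares a group with x only
    "z3": {"g4"},   # shares no group with x or y
}
score = wocg_score(gamma, groups, "x", "y")
print(score)
```

In this toy example only z1 lies in a group shared by both x and y, so one common neighbor is within the common groups and two are outside.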
Common Neighbors of Groups (CNG)
- We define the set of common neighbors of groups as
  $\Lambda^{G}_{x,y} = \{ z^{\mathcal{G}_\gamma} \in \Lambda_{x,y} \mid \mathcal{G}_\alpha \cap \mathcal{G}_\gamma \neq \emptyset \vee \mathcal{G}_\beta \cap \mathcal{G}_\gamma \neq \emptyset \}$
- Our final score, called common neighbors of groups (CNG), is defined as:

$$s^{CNG}_{x,y} = |\Lambda^{G}_{x,y}| \qquad (2)$$
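Eq. (2) counts the common neighbors that share at least one group with x or with y. A minimal Python sketch, assuming hypothetical `gamma` and `groups` dictionaries and an illustrative function name `cng_score`:

```python
def cng_score(gamma, groups, x, y):
    """CNG score: number of common neighbors sharing a group with x or y."""
    cn = gamma[x] & gamma[y]               # Lambda_{x,y}
    either = groups[x] | groups[y]         # any group of x or of y
    # z qualifies if its groups overlap those of x OR those of y.
    return len({z for z in cn if groups[z] & either})

# Hypothetical toy data.
gamma = {
    "x": {"z1", "z2", "z3"},
    "y": {"z1", "z2", "z3"},
}
groups = {
    "x": {"g1", "g2"},
    "y": {"g2", "g3"},
    "z1": {"g2"},   # overlaps both x and y
    "z2": {"g1"},   # overlaps x only
    "z3": {"g4"},   # overlaps neither, so it is not counted
}
score = cng_score(gamma, groups, "x", "y")
print(score)
```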
CN with Total and Partial Overlapping of Groups (TPOG)
- We redefine the set of CNG as $\Lambda^{G}_{x,y} = \Lambda^{TOG}_{x,y} \cup \Lambda^{POG}_{x,y}$, where:
  - $\Lambda^{TOG}_{x,y} = \{ z^{\mathcal{G}_\gamma} \in \Lambda^{G}_{x,y} \mid \mathcal{G}_\alpha \cap \mathcal{G}_\gamma \neq \emptyset \wedge \mathcal{G}_\beta \cap \mathcal{G}_\gamma \neq \emptyset \}$ is the set of CN with total overlapping of groups (TOG);
  - $\Lambda^{POG}_{x,y} = \Lambda^{G}_{x,y} - \Lambda^{TOG}_{x,y}$ is the set of CN with partial overlapping of groups (POG).
- Our final score, called the common neighbors with total and partial overlapping of groups (TPOG) measure, is defined as:

$$s^{TPOG}_{x,y} = \frac{|\Lambda^{TOG}_{x,y}|}{|\Lambda^{POG}_{x,y}|} \qquad (3)$$
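The TPOG score in Eq. (3) can be sketched in Python as below. The function name `tpog_score`, the `gamma` and `groups` dictionaries, and the toy data are assumptions for illustration. As with any ratio, the case of an empty $\Lambda^{POG}_{x,y}$ is not specified in the slides; the fallback used here is a choice made only for this sketch.

```python
def tpog_score(gamma, groups, x, y):
    """TPOG score: |CN with total overlap| / |CN with partial overlap|."""
    cn = gamma[x] & gamma[y]  # Lambda_{x,y}
    # Lambda^G: common neighbors sharing a group with x or with y.
    cng = {z for z in cn if groups[z] & (groups[x] | groups[y])}
    # TOG: shares a group with x AND shares a group with y.
    tog = {z for z in cng if groups[z] & groups[x] and groups[z] & groups[y]}
    pog = cng - tog           # shares a group with exactly one of x, y
    if not pog:
        # Assumption: the source does not define this case.
        return float("inf") if tog else 0.0
    return len(tog) / len(pog)

# Hypothetical toy data.
gamma = {
    "x": {"z1", "z2", "z3", "z4"},
    "y": {"z1", "z2", "z3", "z4"},
}
groups = {
    "x": {"g1", "g2"},
    "y": {"g2", "g3"},
    "z1": {"g2"},   # overlaps x and y: total overlap
    "z2": {"g1"},   # overlaps x only: partial overlap
    "z3": {"g4"},   # overlaps neither: excluded from Lambda^G
    "z4": {"g3"},   # overlaps y only: partial overlap
}
score = tpog_score(gamma, groups, "x", "y")
print(score)
```

Note that the total-overlap condition requires a shared group with x and a shared group with y, which need not be the same group.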
3. Experimental Evaluation: Datasets, Experimental setup, Results
Datasets

Table: High-level topological features of our four social networks [Mislove et al., 2007]

Feature                                              Flickr  LiveJournal        Orkut      YouTube
Number of nodes                                   1,846,198    5,284,457    3,072,441    1,157,827
Number of links                                  22,613,981   77,402,652  223,534,301    4,945,382
Average degree per node                               12.24        16.97        106.1         4.29
Fraction of links symmetric                           62.0%        73.5%       100.0%        79.1%
Average path length                                    5.67         5.88         4.25         5.10
Diameter                                                 27           20            9           21
Average clustering coefficient                        0.313        0.330        0.171        0.136
Average assortativity coefficient                     0.202        0.179        0.072       -0.033
Number of node groups                               103,648    7,489,073    8,730,859       30,087
Average number of group memberships per node           4.62        21.25       106.44         0.25
Average group size                                       82           15           37           10
Average group clustering coefficient                   0.47         0.81         0.52         0.34