Supervised Rank Aggregation Approach for Link Prediction in Complex Networks Manisha Pujari & Rushed Kanawati A3 firstname.lastname@lipn.univ-paris13.fr 17/10/2012
Outline
1. Link Prediction
2. New Approach: Supervised Rank Aggregation
3. Experiment
4. Conclusion
M. Pujari & R. Kanawati, Link Prediction by Supervised Rank Aggregation
Link Prediction Problem
Link Prediction: predicting missing, hidden, or new links between nodes of a graph.
Applications:
- Recommender systems
- Academic/professional collaborations
- Identification of structures of criminal networks
- Biological networks
Link Prediction Approaches
- Dyadic: computation of a link score for unlinked vertices
- Structural: mining rules for the evolution of sub-graphs
- Topology based: attributes computed for the graph
- Node-feature based: attributes computed for nodes
- Hybrid: combination of the two
- Temporal: consider the dynamics of the network
- Static: do not consider the dynamics of the network
Link Prediction Approaches (highlighted on this slide)
- Dyadic: computation of a link score for unlinked vertices
- Topology based: attributes computed for the graph
- Temporal: consider the dynamics of the network
Dyadic Topological Approaches
Work of [Liben-Nowell & al., 2007]
Combining the effect of different topological measures: application of supervised machine learning algorithms
Work of [Hasan & al., 2006]
Examples: (Node x, Node y) → [a_0, a_1, ..., a_n]
[3, 1, Positive]
[1, 0.33, Negative]
...
Combining the effect of different topological measures: application of supervised machine learning algorithms
Work of [Benchettara & al., 2010]
Can we apply rank aggregation methods?
Rank Aggregation (Social choice theory)
Combining various lists of ranked candidates to find a single list with minimum possible disagreement.
Expert 1 ⇒ L_1 = [A, B, C, D]
Expert 2 ⇒ L_2 = [B, D, A, C]
Expert 3 ⇒ L_3 = [C, D, A, B]
...
Expert n ⇒ L_n = [D, C, A, B]
L_aggregate = [?, ?, ?, ?]
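The slide does not say how "disagreement" between rankings is measured. A common choice, and the one minimized by Kemeny optimal aggregation, is the Kendall tau distance: the number of element pairs that two lists order differently. The sketch below assumes that measure and that both lists rank the same elements; the function name is illustrative.

```python
from itertools import combinations

def kendall_tau_distance(list_a, list_b):
    """Count the pairs of elements that the two rankings order differently.

    Assumes both lists contain exactly the same elements, without repeats.
    """
    pos_a = {item: i for i, item in enumerate(list_a)}
    pos_b = {item: i for i, item in enumerate(list_b)}
    return sum(
        1
        for x, y in combinations(list_a, 2)
        # A negative product means the pair is ordered oppositely in the two lists.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )

# The first two expert lists from the slide.
L1 = ["A", "B", "C", "D"]
L2 = ["B", "D", "A", "C"]
print(kendall_tau_distance(L1, L2))
```

An aggregate list with "minimum possible disagreement" would then be one that minimizes the sum of these distances to all expert lists.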
Supervised Rank Aggregation
Combining different rankings into an aggregate ranking, giving different weights to the experts:
w_1 ← Expert 1 ⇒ L_1 → [k elements]
w_2 ← Expert 2 ⇒ L_2 → [k elements]
w_3 ← Expert 3 ⇒ L_3 → [k elements]
...
w_n ← Expert n ⇒ L_n → [k elements]
We propose link prediction based on:
1. Supervised Borda
2. Supervised Kemeny
Supervised Borda Aggregation
Borda score:
$B(x) = \sum_{i=1}^{n} B_{L_i}(x)$, where $B_{L_i}(x) = |\{ y \in L_i : L_i(y) > L_i(x) \}|$ (1)
Supervised Borda score:
$B(x) = \sum_{i=1}^{n} w_i \, B_{L_i}(x)$ (2)
NOTE: $L_i(x)$ represents the rank (or index) of element $x$ in input list $L_i$. The lower the rank value, the higher the preference. $U$ is the set of all elements in the lists.
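As a minimal sketch of equations (1) and (2), the code below gives each element, per list, as many points as there are elements ranked below it, scaled by the list's weight. The function names, the alphabetical tie-break, and the example weights are assumptions made for illustration; an element missing from a list simply receives no points from that list.

```python
def borda_scores(ranked_lists, weights=None):
    """Weighted Borda score of every element; unit weights give plain Borda."""
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    scores = {}
    for ranked, w in zip(ranked_lists, weights):
        m = len(ranked)
        for rank, x in enumerate(ranked):
            # (m - 1 - rank) is the number of elements ranked below x in this list.
            scores[x] = scores.get(x, 0.0) + w * (m - 1 - rank)
    return scores

def borda_aggregate(ranked_lists, weights=None):
    """Elements sorted by decreasing score; ties broken by name (an assumption)."""
    scores = borda_scores(ranked_lists, weights)
    return sorted(scores, key=lambda x: (-scores[x], x))

lists = [["A", "B", "C", "D"], ["B", "D", "A", "C"], ["C", "D", "A", "B"]]
print(borda_aggregate(lists))             # unweighted Borda
print(borda_aggregate(lists, [1, 2, 1]))  # second expert counts double
```

With unit weights A and B tie at 5 points; doubling the weight of the second expert moves B to the top, which is the effect the supervised variant is after.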
Kemeny Optimal Aggregation [Dwork & al., 2001]
- Based on the relative ranking of elements
- NP-hard
- Approximate Kemeny
Supervised Kemeny Aggregation
Inputs: ranked lists $[L_1, L_2, \ldots, L_n]$, weights $[w_1, w_2, \ldots, w_n]$, each list with $m$ elements ($U$)
Steps:
1. Initial aggregation $R$
2. $\forall (x, y) \in R$, compute $score(x, y) = \sum_{i=1}^{n} w_i \, Pref_i(x, y)$, where
$Pref_i(x, y) = \begin{cases} 0 & \text{if } y \succ x, \text{ i.e. } L_i(x) > L_i(y) \\ 1 & \text{if } x \succ y, \text{ i.e. } L_i(x) < L_i(y) \end{cases}$
3. If $score(x, y) > \frac{w_T}{2}$, where $w_T = \sum_{i=1}^{n} w_i$, then $x \succ_w y$
4. Apply a sorting algorithm on $R$: Swap$(x, y)$ only if $y \succ_w x$
5. $R$ is the final aggregation.
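The steps above can be sketched as follows. Two points are assumptions, not taken from the slide: the weighted-majority threshold is read as half the total weight, $w_T / 2$, and the sorting algorithm is a bubble sort over adjacent pairs. The initial aggregation is passed in as an argument because the slide does not say how it is built, and every list is assumed to rank all elements.

```python
def weighted_prefers(x, y, ranks, weights):
    """True if x is preferred to y by a weighted majority of the lists."""
    score = sum(w for r, w in zip(ranks, weights) if r[x] < r[y])
    return score > sum(weights) / 2  # assumed threshold: half the total weight

def supervised_kemeny(ranked_lists, weights, initial):
    """Approximate weighted Kemeny aggregation by bubble-sorting `initial`."""
    ranks = [{x: i for i, x in enumerate(lst)} for lst in ranked_lists]
    result = list(initial)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            x, y = result[i], result[i + 1]
            # Swap adjacent elements only if the lower one is preferred.
            if weighted_prefers(y, x, ranks, weights):
                result[i], result[i + 1] = y, x
                swapped = True
        if not swapped:
            break
    return result

# Illustrative lists (not from the slides).
lists = [["A", "B", "C", "D"], ["A", "C", "B", "D"], ["B", "A", "C", "D"]]
start = ["D", "C", "B", "A"]
print(supervised_kemeny(lists, [1, 1, 1], start))
print(supervised_kemeny(lists, [1, 1, 3], start))
```

Because weighted-majority preferences can form cycles, the result may depend on the initial aggregation and on the sorting algorithm used; this sketch only guarantees that no adjacent pair in the output violates the weighted preference.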
Link Prediction Based on Supervised Rank Aggregation
Examples: (Node x, Node y) → [a_0, a_1, a_2, ..., a_n]
Steps:
Learning
1. Rank learning examples by attribute values
2. Consider only the top t examples and compute the attribute weight w_{a_i}
Validation
1. Rank test examples by attribute to get n ranked lists
2. Apply supervised rank aggregation
3. Consider only the top k examples of the aggregate list and compute performance
Link Prediction Based on Supervised Rank Aggregation
Computation of attribute weights:
Maximization of the identification of positive examples:
$W_{a_i} = n \times Precision_{a_i}$ (3)
where $n$ is the total number of attributes and $Precision_{a_i}$ is the precision of attribute $a_i$.
Precision = fraction of retrieved examples that are really positive.
Minimization of the identification of negative examples:
$W_{a_i} = n \times (1 - FPR_{a_i})$ (4)
where $n$ is the total number of attributes and $FPR_{a_i}$ is the false positive rate of attribute $a_i$.
False positive rate = fraction of negative examples retrieved as positive.
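A minimal sketch of equations (3) and (4) for a single attribute follows. It assumes that the top t learning examples ranked by the attribute are treated as "retrieved as positive", and that larger attribute values suggest a link (for a distance-like attribute one would rank in ascending order instead). The function name, parameters, and toy data are illustrative.

```python
def attribute_weights(values, labels, n_attributes, t, descending=True):
    """Return (precision-based weight, FPR-based weight) for one attribute.

    values: attribute value of each learning example
    labels: True for positive examples, False for negative ones
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    top = order[:t]  # examples retrieved as positive
    true_pos = sum(1 for i in top if labels[i])
    false_pos = len(top) - true_pos
    negatives = sum(1 for label in labels if not label)
    precision = true_pos / len(top) if top else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return n_attributes * precision, n_attributes * (1 - fpr)

# Toy data: five examples, two of them positive.
values = [3, 1, 2, 0, 5]
labels = [True, False, False, False, True]
w_precision, w_fpr = attribute_weights(values, labels, n_attributes=3, t=4)
print(w_precision, w_fpr)
```

In this toy case the top four examples contain both positives and two of the three negatives, so the precision is 0.5 and the false positive rate is 2/3.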
DBLP data

Author-Document bipartite graphs
Datasets | Training Time | Authors | Publications | Edges
Dataset1 | [1970,1973] | 2661 | 1487 | 6634
Dataset2 | [1972,1975] | 4536 | 2542 | 10855

Projected graphs
Datasets | Author Graph Nodes | Author Graph Edges | Publication Graph Nodes | Publication Graph Edges
Dataset1 | 2661 | 2575 | 1487 | 1520
Dataset2 | 4536 | 4510 | 2542 | 2813

Examples
Datasets | Training Time | Labeling Time | Testing Time | Training Pos | Training Neg | Test Pos | Test Neg
Dataset1 | [1970,1973] | [1974,1975] | [1971,1974] | 30 | 1663 | 41 | 3430
Dataset2 | [1972,1975] | [1976,1977] | [1973,1976] | 87 | 19245 | 82 | 18675
Topological Attributes
Neighborhood-based attributes:
- Common neighbors: $VC(x, y) = |\Gamma(x) \cap \Gamma(y)|$
- Jaccard's coefficient: $JC(x, y) = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}$
- Adamic Adar: $AD(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log |\Gamma(z)|}$ [Adamic & al., 2003]
- Preferential attachment: $AP(x, y) = |\Gamma(x)| \times |\Gamma(y)|$ [Huang & al., 2005]
Distance-based attributes:
- Shortest path distance (Dis)
- Katz: $Katz(x, y) = \sum_{\ell=1}^{\infty} \beta^{\ell} \times |path^{(\ell)}_{x,y}|$, where $|path^{(\ell)}_{x,y}|$ is the number of paths of length $\ell$ between $x$ and $y$, and $\beta$ is a positive parameter which favours shortest paths [Katz, 1953]
- Maximum forest algorithm (MFA) [Fouss & al., 2007]
Centrality-based attributes:
- Product of PageRank (PPR) [Brin & al., 1998]
- Product of degree centrality (PCD)
- Product of clustering coefficient (PCF)
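The four neighborhood-based attributes can be computed directly from neighbor sets. The sketch below assumes an undirected graph stored as a dictionary mapping each node to the set of its neighbors, Γ; the function name and the toy graph are illustrative and not taken from the DBLP data.

```python
import math

def neighborhood_attributes(adj, x, y):
    """Neighborhood-based attributes of the node pair (x, y).

    adj maps every node to the set of its neighbors (undirected graph).
    """
    gx, gy = adj[x], adj[y]
    common = gx & gy
    union = gx | gy
    return {
        "VC": len(common),                                   # common neighbors
        "JC": len(common) / len(union) if union else 0.0,    # Jaccard's coefficient
        # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
        "AD": sum(1.0 / math.log(len(adj[z])) for z in common),  # Adamic Adar
        "AP": len(gx) * len(gy),                             # preferential attachment
    }

# Toy graph with edges a-c, a-d, b-c, b-d, b-e, d-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
attrs = neighborhood_attributes(adj, "a", "b")
print(attrs)
```

For the unlinked pair (a, b) this gives two common neighbors, a Jaccard coefficient of 2/3, and a preferential attachment score of 6; each such value becomes one entry a_i of the example vector for that pair.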