A Naïve Bayes model based on overlapping groups for link prediction in online social networks
Jorge Valverde-Rebaza and Alneu de Andrade Lopes
Laboratory of Computational Intelligence (LABIC), University of São Paulo (USP), Brazil
April 2015
Outline
1. Introduction
2. Proposal
3. Experiments
4. Conclusions
Jorge Valverde-Rebaza A NB model on overlapping groups for link prediction in OSN 2 / 27
Social Networks
- Structure made up of a set of actors (individuals or organizations) and social relations between them.
- SNA is an interesting research field in graph and complex network theory, data mining, machine learning and other areas.
- Rise of online social networks (OSN).
Groups detection
- Real networks are characterized by a high concentration of links within special groups of vertices and a low concentration of links between these groups.
- Online social networks (OSNs) offer a wide variety of possible (overlapping) groups: families, working and friendship circles, artistic or academic preferences, towns, nations, etc.
Link Prediction (LP) process
[Figure: the link prediction process]
Presence of groups
[Figure: example network with nodes 1 to 17]
Presence of overlapping groups
[Figure: the same 17-node example network with overlapping groups labeled a, b, c and d]
Link Prediction in the presence of overlapping groups
[Figure: the example network with groups a, b, c and d, and the candidate link score $s_{14,15}$ between nodes 14 and 15]
LP measures

Traditional [Lü and Zhou, 2011]
- Common Neighbors (CN)
- Adamic Adar (AA)
- Jaccard (Jac)
- Resource Allocation (RA)
- Preferential Attachment (PA)
- Others

Based on the Naïve Bayes Model [Liu et al., 2011]
- Local Naïve Bayes (LNB)
- CN with Local Naïve Bayes (LNB-CN)
- AA with Local Naïve Bayes (LNB-AA)
- RA with Local Naïve Bayes (LNB-RA)

Based on Overlapping Groups Information [Valverde-Rebaza and Lopes, 2014]
- CN Within and Outside of Common Groups (WOCG)
- CN of Groups (CNG)
- CN with Total and Partial Overlapping of Groups (TPOG)

Our proposals
- Group Naïve Bayes (GNB)
- CN with Group Naïve Bayes (GNB-CN)
- AA with Group Naïve Bayes (GNB-AA)
- RA with Group Naïve Bayes (GNB-RA)
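For orientation, the traditional measures named on this slide can be sketched with their commonly used textbook definitions. The slide itself does not spell out the formulas, so the definitions below are an assumption about what is meant, and the function name `traditional_scores`, the adjacency dictionary `adj` and the toy graph are hypothetical.

```python
import math


def traditional_scores(adj, x, y):
    """Return common local similarity scores for the node pair (x, y).

    adj maps each node to the set of its neighbors (undirected graph).
    The formulas are the standard textbook ones, assumed here because
    the slide only lists the measure names.
    """
    gx, gy = adj[x], adj[y]
    common = gx & gy
    union = gx | gy
    return {
        # CN: number of shared neighbors
        "CN": len(common),
        # AA: shared neighbors weighted by 1 / log(degree)
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1),
        # Jac: shared neighbors over all neighbors of either node
        "Jac": len(common) / len(union) if union else 0.0,
        # RA: shared neighbors weighted by 1 / degree
        "RA": sum(1.0 / len(adj[z]) for z in common),
        # PA: product of the two degrees
        "PA": len(gx) * len(gy),
    }


# Hypothetical toy graph: nodes 1 and 4 are not linked but share neighbors 2 and 3.
adj = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3},
}
scores = traditional_scores(adj, 1, 4)
```

In this toy graph the candidate pair (1, 4) has two common neighbors, each of degree 3, which is what drives the CN, AA and RA values.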
Definitions

Given the network $G(V, E)$ with $M > 1$ groups identified by different group labels $g_1, g_2, \ldots, g_M$:
- Each node $x \in V$ belongs to a set of node groups $G_\alpha = \{g_a, g_b, \ldots, g_p\}$ with size $P > 0$ and $P \le M$.
- When a node $x$ belongs to a set of node groups $G_\alpha$, this node is represented as $x^{G_\alpha}$.
- The overlapping groups neighborhood of a node: $\Gamma^{G}(x) = \{\, y^{G_\beta} \mid ((x^{G_\alpha}, y^{G_\beta}) \in E \lor (y^{G_\beta}, x^{G_\alpha}) \in E) \land G_\alpha \cap G_\beta \neq \emptyset \,\}$.
- The overlapping groups degree of a node: $k^{G}(x) = |\Gamma^{G}(x)|$.
- The set of common neighbors of groups is defined as: $\Lambda^{G}_{x,y} = \Gamma^{G}(x) \cap \Gamma^{G}(y)$.
- We define the overlapping groups clustering coefficient of a node: $C^{G}_{x} = \dfrac{\Delta^{G}_{x}}{\Delta^{G}_{x} + \Lambda^{G}_{x}}$, where $\Delta^{G}_{x}$ and $\Lambda^{G}_{x}$ are respectively the number of connected and disconnected pairs of nodes whose common neighbors of groups include $x$.
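The definitions above can be made concrete with a minimal sketch. It assumes an undirected graph stored as a dictionary of neighbor sets (`adj`) and a dictionary mapping each node to its set of group labels (`groups`); these data structures, the function names and the toy network are all hypothetical and not taken from the slides.

```python
from itertools import combinations


def group_neighborhood(adj, groups, x):
    """Overlapping groups neighborhood: neighbors of x sharing at least one group with x."""
    return {y for y in adj[x] if groups[x] & groups[y]}


def group_degree(adj, groups, x):
    """Overlapping groups degree: size of the overlapping groups neighborhood."""
    return len(group_neighborhood(adj, groups, x))


def common_group_neighbors(adj, groups, x, y):
    """Common neighbors of groups of the pair (x, y)."""
    return group_neighborhood(adj, groups, x) & group_neighborhood(adj, groups, y)


def pair_counts(adj, groups, z):
    """Count connected and disconnected pairs whose common neighbors of groups include z.

    In an undirected graph these are exactly the pairs drawn from the
    overlapping groups neighborhood of z.
    """
    nbrs = sorted(group_neighborhood(adj, groups, z))
    delta = lam = 0
    for u, v in combinations(nbrs, 2):
        if v in adj[u]:
            delta += 1
        else:
            lam += 1
    return delta, lam


def group_clustering(adj, groups, z):
    """Overlapping groups clustering coefficient; 0.0 when z has no such pairs (an assumption)."""
    delta, lam = pair_counts(adj, groups, z)
    return delta / (delta + lam) if delta + lam else 0.0


# Hypothetical toy network: node 5 is a neighbor of node 1 but shares no group with it.
adj = {1: {2, 3, 4, 5}, 2: {1, 3}, 3: {1, 2}, 4: {1}, 5: {1}}
groups = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"b"}, 5: {"c"}}

neighborhood_1 = group_neighborhood(adj, groups, 1)
degree_1 = group_degree(adj, groups, 1)
common_2_4 = common_group_neighbors(adj, groups, 2, 4)
counts_1 = pair_counts(adj, groups, 1)
clustering_1 = group_clustering(adj, groups, 1)
```

Node 5 is excluded from the neighborhood of node 1 because the two share no group label, which is the only difference from the ordinary neighborhood in this example.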
Group Naïve Bayes

We denote by $L_{x,y}$ and $\overline{L}_{x,y}$ the class variables of link existence and nonexistence, respectively. Thus, the posterior probabilities of connection and disconnection of the pair $(x, y)$ given its set of common neighbors of groups are:

$$P(L_{x,y} \mid \Lambda^{G}_{x,y}) = \frac{P(L_{x,y})\, P(\Lambda^{G}_{x,y} \mid L_{x,y})}{P(\Lambda^{G}_{x,y})}, \qquad P(\overline{L}_{x,y} \mid \Lambda^{G}_{x,y}) = \frac{P(\overline{L}_{x,y})\, P(\Lambda^{G}_{x,y} \mid \overline{L}_{x,y})}{P(\Lambda^{G}_{x,y})}$$

The ratio between these two equations defines the likelihood score $s_{x,y}$. Decomposing $P(\Lambda^{G}_{x,y} \mid L_{x,y}) = \prod_{z \in \Lambda^{G}_{x,y}} P(z \mid L_{x,y})$ and $P(\Lambda^{G}_{x,y} \mid \overline{L}_{x,y}) = \prod_{z \in \Lambda^{G}_{x,y}} P(z \mid \overline{L}_{x,y})$, we have:

$$s_{x,y} = \frac{P(L_{x,y})}{P(\overline{L}_{x,y})} \prod_{z \in \Lambda^{G}_{x,y}} \frac{P(\overline{L}_{x,y})\, P(L_{x,y} \mid z)}{P(L_{x,y})\, P(\overline{L}_{x,y} \mid z)}$$

Considering that $P(L_{x,y} \mid z) = C^{G}_{z}$ and $P(\overline{L}_{x,y} \mid z) = 1 - C^{G}_{z}$, we define the group naïve Bayes (GNB) measure as:

$$s^{GNB}_{x,y} = \prod_{z \in \Lambda^{G}_{x,y}} \Omega^{-1} N^{G}_{z}$$

where $N^{G}_{z} = \dfrac{\Delta^{G}_{z} + 1}{\Lambda^{G}_{z} + 1}$ and $\Omega = \dfrac{P(L_{x,y})}{P(\overline{L}_{x,y})}$.
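A minimal sketch of the GNB product may help to see how the pieces fit together. The slide does not say how $\Omega$ is estimated, so the code takes it as a plain parameter `omega`, and the value 0.5 used in the example is purely illustrative. The dictionaries `adj` and `groups`, the function names and the toy network are hypothetical.

```python
from itertools import combinations


def group_neighborhood(adj, groups, x):
    """Neighbors of x that share at least one group label with x."""
    return {y for y in adj[x] if groups[x] & groups[y]}


def n_ratio(adj, groups, z):
    """N_z = (connected pairs + 1) / (disconnected pairs + 1) around node z."""
    nbrs = sorted(group_neighborhood(adj, groups, z))
    delta = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    total_pairs = len(nbrs) * (len(nbrs) - 1) // 2
    lam = total_pairs - delta
    return (delta + 1) / (lam + 1)


def gnb_score(adj, groups, x, y, omega):
    """Product over common neighbors of groups z of N_z / omega.

    omega stands for the ratio P(L) / P(not L); how to estimate it is
    left to the caller because the slide does not specify it.
    With no common neighbors of groups the empty product 1.0 is returned,
    which is an assumption of this sketch.
    """
    common = group_neighborhood(adj, groups, x) & group_neighborhood(adj, groups, y)
    score = 1.0
    for z in common:
        score *= n_ratio(adj, groups, z) / omega
    return score


# Hypothetical toy network; nodes 2 and 4 are not linked and share the group neighbor 1.
adj = {1: {2, 3, 4, 5}, 2: {1, 3}, 3: {1, 2}, 4: {1}, 5: {1}}
groups = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"b"}, 5: {"c"}}

ratio_1 = n_ratio(adj, groups, 1)
score_2_4 = gnb_score(adj, groups, 2, 4, omega=0.5)
```

Around node 1 there is one connected pair and two disconnected pairs, so $N^{G}_{1} = 2/3$, and the score of the pair (2, 4) is that ratio divided by the illustrative `omega`.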
Group Naïve Bayes Forms

From the GNB equation, we add an exponent $f(k^{G}(z))$ to $\Omega^{-1} N^{G}_{z}$, where $f$ is a function of the overlapping groups degree. Taking the logarithm of both sides, we obtain the following linear equation:

$$s^{GNB'}_{x,y} = \sum_{z \in \Lambda^{G}_{x,y}} f(k^{G}(z)) \log(\Omega^{-1} N^{G}_{z})$$

Here we consider three forms of the function $f$: $f(k^{G}(z)) = 1$, $f(k^{G}(z)) = \dfrac{1}{\log(k^{G}(z))}$ and $f(k^{G}(z)) = \dfrac{1}{k^{G}(z)}$, which correspond to the group naïve Bayes forms of CN, AA and RA, respectively:

$$s^{GNB\text{-}CN}_{x,y} = |\Lambda^{G}_{x,y}| \log(\Omega^{-1}) + \sum_{z \in \Lambda^{G}_{x,y}} \log(N^{G}_{z})$$

$$s^{GNB\text{-}AA}_{x,y} = \sum_{z \in \Lambda^{G}_{x,y}} \frac{1}{\log(k^{G}(z))} \left( \log(N^{G}_{z}) + \log(\Omega^{-1}) \right)$$

$$s^{GNB\text{-}RA}_{x,y} = \sum_{z \in \Lambda^{G}_{x,y}} \frac{1}{k^{G}(z)} \left( \log(N^{G}_{z}) + \log(\Omega^{-1}) \right)$$
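The three forms differ only in the weight applied to each common neighbor of groups, which the following sketch tries to make visible. As before, `omega` is passed in as a parameter because the slides do not state how it is estimated, the value 0.5 is illustrative only, and `adj`, `groups`, the function names and the toy network are hypothetical.

```python
import math
from itertools import combinations


def group_neighborhood(adj, groups, x):
    """Neighbors of x that share at least one group label with x."""
    return {y for y in adj[x] if groups[x] & groups[y]}


def n_ratio(adj, groups, z):
    """N_z = (connected pairs + 1) / (disconnected pairs + 1) around node z."""
    nbrs = sorted(group_neighborhood(adj, groups, z))
    delta = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    lam = len(nbrs) * (len(nbrs) - 1) // 2 - delta
    return (delta + 1) / (lam + 1)


def gnb_forms(adj, groups, x, y, omega):
    """Return the GNB-CN, GNB-AA and GNB-RA scores of the pair (x, y)."""
    common = group_neighborhood(adj, groups, x) & group_neighborhood(adj, groups, y)
    log_inv_omega = -math.log(omega)  # log(omega^-1)
    cn = aa = ra = 0.0
    for z in common:
        term = math.log(n_ratio(adj, groups, z)) + log_inv_omega
        k = len(group_neighborhood(adj, groups, z))  # overlapping groups degree of z
        cn += term              # f = 1
        if k > 1:               # guard against log(1) = 0; k >= 2 for any common neighbor
            aa += term / math.log(k)  # f = 1 / log(k)
        ra += term / k          # f = 1 / k
    return {"GNB-CN": cn, "GNB-AA": aa, "GNB-RA": ra}


# Hypothetical toy network; nodes 2 and 4 share the single group neighbor 1.
adj = {1: {2, 3, 4, 5}, 2: {1, 3}, 3: {1, 2}, 4: {1}, 5: {1}}
groups = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"b"}, 5: {"c"}}

forms = gnb_forms(adj, groups, 2, 4, omega=0.5)
```

For the pair (2, 4) the only common neighbor of groups is node 1, with overlapping groups degree 3, so the three scores are the same log term weighted by 1, by 1/log 3 and by 1/3.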