Overview Complex Network Analysis Outlook
MODULARITY OPTIMIZATION LIMITATIONS
Hypotheses
- The best partition of a graph is the one that maximizes the modularity.
- If a network has a community structure, then it is possible to find a precise partition with maximal modularity.
- If a network has a community structure, then partitions inducing high modularity values are structurally similar.
None of these three hypotheses holds [GdMC10, LF11].
PROPAGATION-BASED APPROACHES
Algorithm 1 Label propagation
Require: G = <V, E> a connected graph
Initialize each node v with a unique label l_v
while labels are not stable do
  for v ∈ V do
    l_v = argmax_l |Γ_l(v)|   /* random tie-breaking */
  end for
end while
return communities from labels
Γ_l(v): set of neighbors of v having label l
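A minimal Python sketch of the label propagation loop above. The adjacency-dictionary input, the `max_iter` cap, the random visiting order, and the rule that a node keeps its current label when that label is already among the most frequent ones are assumptions added for illustration; they are not stated on the slide.

```python
import random
from collections import Counter

def label_propagation(adj, max_iter=100, seed=0):
    """adj: dict mapping each node to the set of its neighbours (undirected graph)."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}            # each node starts with a unique label
    nodes = list(adj)
    for _ in range(max_iter):               # cap on sweeps: convergence is not guaranteed
        rng.shuffle(nodes)                   # assumption: random visiting order
        changed = False
        for v in nodes:
            counts = Counter(labels[u] for u in adj[v])
            if not counts:
                continue                     # isolated node keeps its label
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[v] in candidates:
                continue                     # assumption: keep a label that is already maximal
            labels[v] = rng.choice(candidates)   # random tie-breaking
            changed = True
        if not changed:                      # labels are stable
            break
    communities = {}
    for v, l in labels.items():
        communities.setdefault(l, set()).add(v)
    return list(communities.values())
```

Because of the random order and tie-breaking, two runs with different seeds may return different partitions.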
LABEL PROPAGATION
Advantages
- Complexity: O(m)
- Highly parallel
Disadvantages
- No convergence guarantee, oscillation phenomena
- Low robustness: different runs yield very different community structures due to randomness
SEED-CENTRIC ALGORITHMS (KANAWATI, SCSM'2014)
Algorithm 2 General seed-centric community detection algorithm
Require: G = <V, E> a connected graph
C ← ∅
S ← compute_seeds(G)
for s ∈ S do
  C_s ← compute_local_com(s, G)
  C ← C + C_s
end for
return compute_community(C)
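The general scheme can be sketched in Python with the three steps passed in as functions. The plug-ins below (`degree_leaders`, `ego_network`, `first_come_partition`) are hypothetical placeholders chosen only to make the sketch runnable; the slide does not prescribe any particular choice.

```python
def seed_centric(adj, compute_seeds, compute_local_com, compute_community):
    """Generic seed-centric scheme; adj maps each node to its neighbour set."""
    local_coms = []                                    # C <- empty
    for s in compute_seeds(adj):                       # S <- compute_seeds(G)
        local_coms.append(compute_local_com(s, adj))   # C <- C + C_s
    return compute_community(local_coms, adj)

# Hypothetical plug-ins, for illustration only.
def degree_leaders(adj):
    """Seeds: nodes whose degree is strictly higher than that of all their neighbours."""
    return [v for v in adj
            if adj[v] and all(len(adj[v]) > len(adj[u]) for u in adj[v])]

def ego_network(s, adj):
    """Local community of a seed: the seed and its direct neighbours."""
    return {s} | adj[s]

def first_come_partition(local_coms, adj):
    """Turn possibly overlapping local communities into a partition."""
    assigned, partition = set(), []
    for com in local_coms:
        fresh = com - assigned            # nodes not claimed by an earlier seed
        if fresh:
            partition.append(fresh)
            assigned |= fresh
    partition.extend({v} for v in adj if v not in assigned)  # uncovered nodes
    return partition
```

Swapping any of the three plug-ins changes the resulting algorithm without touching the main loop.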
LINK PREDICTION
Link prediction
- Structural: find hidden/missing links in a network (e.g., missing links in Wikipedia)
- Temporal: predict the new links that will appear at time t_p based on the network state at instants t < t_p
Readings
- M. Pujari et al., Link prediction in complex networks, chapter 3 in Advanced Methods for Complex Network Analysis, N. Meghanathan (Editor), IGI publishing, 2016.
OUTLOOK: ALTERNATIVE NETWORK MODELS
Network science is mature enough to move towards more complex, expressive models:
- K-partite networks
- Dynamic networks
- Heterogeneous networks
- Multiplex networks
- Attributed networks
Next: a powerful model, the multiplex network
Mining Attributed Networks
Part 2 – Analysis of Multiplex Networks
Rushed Kanawati, Martin Atzmueller
A3, Université Sorbonne Paris Cité, France
CSAI, Tilburg University, Netherlands
DSAA'17, Tokyo – 20 October 2017
OUTLINE
1 Multiplex networks: Overview & Definitions
2 Analysis of multiplex networks
  1 Network Measures
  2 Analysis tasks
  3 Evaluation
3 Conclusions
MULTIPLEX NETWORK: DEFINITIONS
$G = \langle V, E, C \rangle$
- $V$: set of nodes
- $E = \{E_1, \dots, E_\alpha\}$ with $\forall k \in [1, \alpha]$: $E_k \subseteq V \times V$
- $C$: layer coupling links
From [Mucha et al., 2010]
Coupling
- Ordinal coupling: diagonal inter-layer links among consecutive layers.
- Categorical coupling: diagonal inter-layer links between all pairs of layers.
- Generalized coupling? E.g., decay functions
NOTATION
- $A^{[k]}$: adjacency matrix of slice $k$: $a^{[k]}_{ij} \neq 0$ iff $(v_i, v_j) \in E_k$, 0 otherwise.
- $m^{[k]} = |E_k|$. Often, we have $m \sim n$.
- Neighbors of $v$ in slice $k$: $\Gamma(v)^{[k]} = \{x \in V : (x, v) \in E_k\}$.
- All neighbors of $v$: $\Gamma(v)^{tot} = \bigcup_{s \in \{1, \dots, \alpha\}} \Gamma(v)^{[s]}$
- Node degree in slice $k$: $d_v^{[k]} = \|\Gamma(v)^{[k]}\|$
- Total degree of node $v$: $d_v^{tot} = \|\Gamma(v)^{tot}\|$
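As a small illustration of this notation, the sketch below stores a multiplex as a Python list of undirected edge sets, one per slice. This representation, the 0-based slice index, and the function names are assumptions made for the example.

```python
def neighbors(layers, v, k):
    """Gamma(v)^[k]: neighbours of v in slice k (0-based index)."""
    return {x for a, b in layers[k] for x, y in ((a, b), (b, a)) if y == v}

def total_neighbors(layers, v):
    """Gamma(v)^tot: union of the neighbourhoods of v over all slices."""
    return set().union(*(neighbors(layers, v, k) for k in range(len(layers))))

def degree(layers, v, k):
    """Degree of v in slice k."""
    return len(neighbors(layers, v, k))

def total_degree(layers, v):
    """d_v^tot, defined as the size of the total neighbourhood."""
    return len(total_neighbors(layers, v))

# Two slices over the nodes {1, 2, 3}.
layers = [{(1, 2), (2, 3)}, {(1, 2), (1, 3)}]
print(neighbors(layers, 1, 0), total_neighbors(layers, 1), total_degree(layers, 1))
```

Note that with this definition the total degree counts each neighbour once, even when the link is repeated in several slices.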
MULTIPLEX NETWORKS: RELATED TERMS
Recommended reading: Mikko Kivelä et al., Multilayer Networks, arXiv:1309.7233, March 2014
POWER OF THE MULTIPLEX MODEL
- Multi-relational networks: European airports network
- Dynamic networks: academic collaborations per year
- Heterogeneous networks: DBLP author-centred multiplex network
MULTIPLEX NETWORKS: MEASURES
- The usual measures need to be generalized:
  - Degree
  - Neighborhood
  - Centralities
  - Paths and distances
  - Clustering coefficient
  - ...
- New layer-oriented questions to answer:
  - Which layers determine the centrality of a user?
  - Which layers are relevant to measure the similarity of two nodes?
  - How does one layer influence the evolution of another?
  - ...
APPROACHES
1 Transformation into a monoplex-centred problem
  - Layer aggregation approaches
  - Hypergraph transformation-based approaches
  - Ensemble approaches
2 Generalization of monoplex-oriented algorithms to multiplex networks
LAYER AGGREGATION
Aggregation functions
- $A_{ij} = \begin{cases} 1 & \exists\, 1 \leq l \leq \alpha : A^{[l]}_{ij} \neq 0 \\ 0 & \text{otherwise} \end{cases}$
- $A_{ij} = \frac{1}{\alpha} \sum_{k=1}^{\alpha} w_k A^{[k]}_{ij}$
- $A_{ij} = \| \{ d : A^{[d]}_{ij} \neq 0 \} \|$
- $A_{ij} = sim(v_i, v_j)$
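The first three aggregation functions can be sketched in plain Python, assuming the multiplex is given as a list `A` of square adjacency matrices (nested lists), one per layer. The function names and this representation are illustrative assumptions; the similarity-based variant is omitted because the slide does not fix a similarity function.

```python
def aggregate_binary(A):
    """A_ij = 1 if i and j are linked in at least one layer, 0 otherwise."""
    n = len(A[0])
    return [[1 if any(layer[i][j] != 0 for layer in A) else 0
             for j in range(n)] for i in range(n)]

def aggregate_weighted(A, w):
    """A_ij = (1/alpha) * sum_k w_k * A[k]_ij, with one weight w_k per layer."""
    alpha, n = len(A), len(A[0])
    return [[sum(w[k] * A[k][i][j] for k in range(alpha)) / alpha
             for j in range(n)] for i in range(n)]

def aggregate_count(A):
    """A_ij = number of layers in which i and j are linked."""
    n = len(A[0])
    return [[sum(1 for layer in A if layer[i][j] != 0)
             for j in range(n)] for i in range(n)]

# Three layers over two nodes; the third layer is empty.
A = [[[0, 1], [1, 0]], [[0, 1], [1, 0]], [[0, 0], [0, 0]]]
print(aggregate_binary(A), aggregate_count(A))
```

Any monoplex algorithm can then be applied to the aggregated matrix, at the cost of losing the per-layer information.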
K-UNIFORM HYPERGRAPH TRANSFORMATION
Principle
- A k-uniform hypergraph is a hypergraph in which the cardinality of each hyperedge is exactly k.
- Map a multiplex to a 3-uniform hypergraph $H = (\mathcal{V}, \mathcal{E})$ such that:
  - $\mathcal{V} = V \cup \{1, \dots, \alpha\}$
  - $(u, v, i) \in \mathcal{E}$ if $A^{[i]}_{uv} \neq 0$, with $u, v \in V$, $i \in \{1, \dots, \alpha\}$
- Apply hypergraph analysis approaches (e.g., tensor-based approaches)
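A short sketch of this mapping, under the assumption that a hyperedge (u, v, i) records that u and v are linked in layer i. The edge-set representation and the `("layer", i)` tagging, used to keep layer identifiers distinct from ordinary nodes, are choices made for this example only.

```python
def multiplex_to_hypergraph(layers):
    """layers: list of edge sets, one per layer (layers are numbered from 1).

    Returns (hyper_nodes, hyper_edges) of a 3-uniform hypergraph.
    """
    nodes = {x for edges in layers for e in edges for x in e}
    layer_ids = {("layer", i) for i in range(1, len(layers) + 1)}
    hyper_edges = {(u, v, ("layer", i))
                   for i, edges in enumerate(layers, start=1)
                   for u, v in edges}
    return nodes | layer_ids, hyper_edges

nodes, hyper_edges = multiplex_to_hypergraph([{(1, 2)}, {(1, 2), (2, 3)}])
print(sorted(hyper_edges))
```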
MULTIPLEX: NODE NEIGHBORHOOD
Some options
- $\Gamma^{mux}(v) = \bigcup_{k=1}^{\alpha} \Gamma(v)^{[k]}$
- $\Gamma^{mux}(v) = \bigcap_{k=1}^{\alpha} \Gamma(v)^{[k]}$
- $\Gamma^{mux}(v) = \{x \in \Gamma(v)^{tot} : sim(x, v) \geq \delta\}$, $\delta \in [0, 1]$
- $\Gamma^{mux}(v) = \{x \in \Gamma(v)^{tot} : \frac{|\Gamma(v)^{tot} \cap \Gamma(x)^{tot}|}{|\Gamma(v)^{tot} \cup \Gamma(x)^{tot}|} \geq \delta\}$
- ...
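These options can be compared on a toy example. The sketch assumes each layer is a dictionary mapping a node to its neighbour set; the function names are illustrative, and the generic sim-based option is left out because no similarity function is specified.

```python
def gamma_tot(layers, v):
    """Union of the neighbourhoods of v over all layers."""
    return set().union(*(layer.get(v, set()) for layer in layers))

def gamma_union(layers, v):
    """Option 1: neighbours of v in at least one layer."""
    return gamma_tot(layers, v)

def gamma_intersection(layers, v):
    """Option 2: neighbours of v in every layer."""
    sets = [layer.get(v, set()) for layer in layers]
    return set.intersection(*sets) if sets else set()

def gamma_jaccard(layers, v, delta):
    """Option 4: neighbours whose total neighbourhood overlaps that of v
    with a Jaccard ratio of at least delta."""
    nv = gamma_tot(layers, v)
    result = set()
    for x in nv:
        nx = gamma_tot(layers, x)
        if len(nv & nx) / len(nv | nx) >= delta:
            result.add(x)
    return result

layer0 = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
layer1 = {1: {2, 4}, 2: {1}, 4: {1}}
print(gamma_union([layer0, layer1], 1), gamma_intersection([layer0, layer1], 1))
```

The union is the most permissive definition and the intersection the strictest; the threshold delta lets the Jaccard variant sit between the two.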
PATHS, SHORTEST DISTANCE
Some options
- Path in an aggregated network
- $d_{average}(u, v) = \frac{\sum_{k=1}^{\alpha} d(u, v)^{[k]}}{\alpha}$, $\forall u, v \in V$ and $(u, v) \notin E_i$
- $path\text{-}length(u, v) = \langle r_1, r_2, \dots, r_\alpha \rangle$ where $r_i$ is the number of links in layer $i$
- $path_x(u, v)$ dominates $path_y(u, v)$ iff $\exists j : r^x_j < r^y_j$ and $\forall k \neq j : r^x_k \leq r^y_k$
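The dominance relation between path-length vectors can be sketched as follows, reading it as the usual Pareto dominance (no worse in every layer, strictly better in at least one). That reading, and the helper names, are assumptions made for this example.

```python
def dominates(rx, ry):
    """True if path x dominates path y: rx is never longer than ry in any
    layer and strictly shorter in at least one layer."""
    return (all(a <= b for a, b in zip(rx, ry))
            and any(a < b for a, b in zip(rx, ry)))

def pareto_front(paths):
    """Keep the path-length vectors that no other vector dominates."""
    return [p for p in paths if not any(dominates(q, p) for q in paths)]

# Four candidate paths between the same pair of nodes in a 2-layer multiplex;
# each tuple gives the number of links used in layer 1 and in layer 2.
candidates = [(1, 2), (2, 1), (2, 2), (1, 3)]
print(pareto_front(candidates))
```

Several incomparable paths can survive, so the multiplex shortest path is in general a set of paths rather than a single one.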
WHAT ABOUT COMMUNITIES?
What is a dense subgraph in a multiplex network? [BCG11]
COMMUNITY DETECTION IN MULTIPLEX NETWORKS
Approaches
1 Transformation into a monoplex community detection problem
  - Layer aggregation approaches
  - Multi-objective optimization approach
  - Ensemble clustering approaches
2 Generalization of monoplex-oriented algorithms to multiplex networks
  - Generalized-modularity optimization
  - Generalized Infomap
  - Generalized Walktrap
  - Seed-centric approaches
MULTI-OBJECTIVE OPTIMIZATION APPROACH [AP14]
Rank the set of α layers according to some importance criteria
C_1 ← community(G^[1])
for i ∈ [2, α] do:
  C_i ← optimize(community(G^[i]), similarity(C_{i-1}))
return C_α
ENSEMBLE CLUSTERING APPROACHES
Ensemble clustering [SG03]
- CSPA: Cluster-based Similarity Partitioning Algorithm
- HGPA: HyperGraph-Partitioning Algorithm
- MCLA: Meta-Clustering Algorithm
- ...
ENSEMBLE CLUSTERING: APPROACHES
CSPA: Cluster-based Similarity Partitioning Algorithm
- Let $K$ be the number of basic models and $C_i(x)$ the cluster in model $i$ to which $x$ belongs.
- Define a similarity graph on objects: $sim(v, u) = \frac{\sum_{i=1}^{K} \delta(C_i(v), C_i(u))}{K}$
- Cluster the obtained graph:
  - Isolate connected components after pruning edges
  - Apply a community detection approach
- Complexity: $O(n^2 k r)$, with $n$ the number of objects, $k$ the number of clusters, $r$ the number of clustering solutions
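A minimal sketch of CSPA using the first clustering option (connected components after pruning). Each base clustering is assumed to be a dictionary mapping an object to its cluster identifier; the `threshold` pruning parameter and the function names are hypothetical.

```python
from itertools import combinations

def cspa_similarity(clusterings):
    """sim(v, u): fraction of base clusterings that put v and u together."""
    K = len(clusterings)
    nodes = sorted(clusterings[0])
    return {(u, v): sum(c[u] == c[v] for c in clusterings) / K
            for u, v in combinations(nodes, 2)}

def consensus_components(clusterings, threshold):
    """Prune pairs with similarity below threshold, then return the
    connected components of the remaining similarity graph."""
    nodes = sorted(clusterings[0])
    parent = {v: v for v in nodes}          # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for (u, v), s in cspa_similarity(clusterings).items():
        if s >= threshold:
            parent[find(u)] = find(v)
    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)
    return sorted(groups.values(), key=min)

base = [{1: "a", 2: "a", 3: "b", 4: "b"},
        {1: "x", 2: "x", 3: "x", 4: "y"},
        {1: 0, 2: 0, 3: 1, 4: 1}]
print(consensus_components(base, threshold=0.5))
```

In a multiplex setting, each base clustering would typically come from running a monoplex community detection algorithm on one layer.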
CSPA: EXAMPLE
From Seifi, M., Cœurs stables de communautés dans les graphes de terrain, Thèse de l'université Paris 6, 2012
ENSEMBLE CLUSTERING: ILLUSTRATION I-VII
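MULTIPLEX MODULARITY [MRM+10]
Generalized modularity
- $Q_{multiplex}(P) = \frac{1}{2\mu} \sum_{c \in P} \sum_{\substack{i,j \in c \\ k,l:1 \to \alpha}} \left[ \left( A^{[k]}_{ij} - \lambda_k \frac{d^{[k]}_i d^{[k]}_j}{2 m^{[k]}} \right) \delta_{kl} + \delta_{ij} C^{kl}_{ij} \right]$
- $\mu = \sum_{\substack{j \in V \\ k,l:1 \to \alpha}} \left( m^{[k]} + C^{kl}_{jj} \right)$
- $C^{kl}_{ij}$: inter-slice coupling, $= 0 \ \forall i \neq j$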
SEED-CENTRIC ALGORITHMS [Kan14]
Algorithm 1 General seed-centric community detection algorithm
Require: G = <V, E> a connected graph
C ← ∅
S ← compute_seeds(G)
for s ∈ S do
  C_s ← compute_local_com(s, G)
  C ← C + C_s
end for
return compute_community(C)
THE LICOD ALGORITHM [YK14]
1 Compute a set of seeds that are likely to be leaders in their communities
  - Heuristic: nodes having a higher degree centrality than their neighbors
2 Each node in the graph ranks the seeds according to its own preference
  - By increasing shortest-path length
3 Iterate until convergence:
  - Each node modifies its preference vector according to its neighbors' preferences
  - Apply rank aggregation methods
MUXLICOD
- Multiplex degree centrality [BNL13]: $d_i^{multiplex} = -\sum_{k=1}^{\alpha} \frac{d_i^{[k]}}{d_i^{[tot]}} \log\left(\frac{d_i^{[k]}}{d_i^{[tot]}}\right)$
- Multiplex shortest path: $SP(u, v)^{multiplex} = \frac{\sum_{k=1}^{\alpha} SP(u, v)^{[k]}}{\alpha}$
- Multiplex neighborhood: $\Gamma^{mux}(v) = \{x \in \Gamma(v)^{tot} : \frac{|\Gamma(v)^{tot} \cap \Gamma(x)^{tot}|}{|\Gamma(v)^{tot} \cup \Gamma(x)^{tot}|} \geq \delta\}$
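The entropy-based multiplex degree can be sketched as below. One assumption deserves emphasis: the total degree is taken here as the sum of the per-layer degrees so that the ratios form a probability distribution, which may differ from a total degree defined as the size of the union of neighbourhoods. The natural logarithm and the convention 0 log 0 = 0 are also assumptions.

```python
import math

def entropy_degree(layer_degrees):
    """Entropy of the distribution of a node's degree over the layers.

    layer_degrees: list with the degree of the node in each layer.
    Assumption: the total degree is the sum of the per-layer degrees.
    """
    total = sum(layer_degrees)
    if total == 0:
        return 0.0                      # isolated node
    return -sum((d / total) * math.log(d / total)
                for d in layer_degrees if d > 0)   # 0 * log(0) treated as 0

print(entropy_degree([2, 2]), entropy_degree([4, 0]))
```

The value is highest when the links of a node are spread evenly over the layers and drops to zero when they are all concentrated in a single layer.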
RANK AGGREGATION [PK12, DKNS01]
OTHER ALGORITHMS
1 Random walk based approach (generalization of Walktrap) [KM15]
2 Generalized Infomap [DLAR15]
EVALUATION CRITERIA I
1 Multiplex modularity
2 Redundancy [BCG11]: $\rho(c) = \sum_{(u,v) \in \bar{\bar{P}}_c} \frac{\| \{k : \exists A^{[k]}_{uv} \neq 0\} \|}{\alpha \times \| P_c \|}$
$\bar{\bar{P}}$: the set of couple (u
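A sketch of the redundancy computation. Since the definition of the pair sets is cut off on the slide, the code assumes the usual reading of [BCG11]: P_c is the set of node pairs of c linked in at least one layer, and the double-barred set contains the pairs linked in at least two layers. The edge-set representation and the function name are also assumptions.

```python
def redundancy(community, layers):
    """community: set of nodes; layers: list of edge sets, one per layer."""
    alpha = len(layers)
    layer_count = {}                       # pair -> number of layers linking it
    for edges in layers:
        inside = {frozenset(e) for e in edges
                  if set(e) <= community and len(set(e)) == 2}
        for pair in inside:
            layer_count[pair] = layer_count.get(pair, 0) + 1
    if not layer_count:
        return 0.0                         # no internal link in any layer
    # Assumption: only pairs linked in at least two layers contribute.
    redundant = sum(n for n in layer_count.values() if n >= 2)
    return redundant / (alpha * len(layer_count))

layers = [{(1, 2), (2, 3)}, {(1, 2)}, {(2, 1), (3, 4)}]
print(redundancy({1, 2, 3}, layers))
```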
Overview Analysis Conclusion
EVALUATION CRITERIA II
- Variety $V_c$: the proportion of occurrence of the community $c$ across layers of the multiplex.
  $V_c = \sum_{s=1}^{\alpha} \frac{\| \exists (i,j) \in c \,/\, A^{[s]}_{ij} \neq 0 \|}{\alpha - 1}$   (1)
- Exclusivity $\varepsilon_c$: number of pairs of nodes, in community $c$, that are connected exclusively in one layer.
  $\varepsilon_c = \sum_{s=1}^{\alpha} \frac{\| P_{c,s} \|}{\| P_c \|}$   (2)
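Exclusivity can be sketched as follows, assuming that P_{c,s} denotes the pairs of c linked in layer s and in no other layer, and that P_c denotes the pairs linked in at least one layer. Both readings, like the edge-set representation, are assumptions drawn from the textual definition above.

```python
def exclusivity(community, layers):
    """Fraction of linked pairs of the community that are linked in
    exactly one layer. layers: list of edge sets, one per layer."""
    layer_count = {}                       # pair -> number of layers linking it
    for edges in layers:
        inside = {frozenset(e) for e in edges
                  if set(e) <= community and len(set(e)) == 2}
        for pair in inside:
            layer_count[pair] = layer_count.get(pair, 0) + 1
    if not layer_count:
        return 0.0                         # no internal link in any layer
    exclusive = sum(1 for n in layer_count.values() if n == 1)
    return exclusive / len(layer_count)

layers = [{(1, 2), (2, 3)}, {(1, 2)}, {(2, 1), (3, 4)}]
print(exclusivity({1, 2, 3}, layers))
```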
EVALUATION CRITERIA III
- Homogeneity $H_c$: how uniform the distribution of the number of edges in the community $c$ is across layers.
  $H_c = \begin{cases} 1 & \text{if } \sigma_c = 0 \\ 1 - \frac{\sigma_c}{\sigma_c^{max}} & \text{otherwise} \end{cases}$   (3)
  with
  $avg_c = \sum_{s=1}^{\alpha} \frac{\| P_{c,s} \|}{\alpha}$
  $\sigma_c = \sqrt{\sum_{s=1}^{\alpha} \frac{(\| P_{c,s} \| - avg_c)^2}{\alpha}}$
  $\sigma_c^{max} = \sqrt{\frac{(\max(\| P_{c,d} \|) - \min(\| P_{c,d} \|))^2}{2}}$
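The homogeneity formula translates directly into Python once the number of internal edges of the community in each layer is known. The input list `edge_counts` and the function name are assumptions for this sketch.

```python
import math

def homogeneity(edge_counts):
    """edge_counts: number of edges of the community in each layer."""
    alpha = len(edge_counts)
    avg = sum(edge_counts) / alpha
    sigma = math.sqrt(sum((n - avg) ** 2 for n in edge_counts) / alpha)
    if sigma == 0:
        return 1.0                         # perfectly uniform distribution
    sigma_max = math.sqrt((max(edge_counts) - min(edge_counts)) ** 2 / 2)
    return 1 - sigma / sigma_max

print(homogeneity([2, 2, 2]), homogeneity([4, 0]))
```

A community whose edges are evenly spread over the layers scores 1, and the score decreases as the edges concentrate in a few layers.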
DATASETS
Benchmark networks
- Lazega lawyer network: #nodes 71, #layers 3
- Physicians collaboration network: #nodes 246, #layers 3
RESULTS: REDUNDANCY
RESULTS: COMPLEMENTARITY
RESULTS: MULTIPLEX MODULARITY
PARETO FRONT
LAZEGA DATASET: COMPARATIVE STUDY
Figure: NMI (lower triangular part), adjusted Rand (upper triangular part).
CONCLUSIONS
- Multiplex networks provide a rich representation of real-world interaction systems
- A lot of work to reformulate basic network concepts for multiplex settings, e.g., roles, random walks, PageRank, etc.
- Community evaluation: still an open problem
- Uncovered topics: layer selection and compression, co-evolution models, dynamics on multiplex networks
- Ideas under exploration:
  - Multiplex of multiplexes
  - Interactive multiplex network visualisation
  - Benchmarking of available tools
MAN Tutorial
Part III: Analysis of Attributed Networks
Rushed Kanawati, Martin Atzmueller
Université Sorbonne Paris Cité, France
Tilburg University, Netherlands
DSAA 2017, Tokyo, 2017-10-20
Agenda
- Overview/Recap: Attributed Networks
- Compositional Subgroup Analysis
- Community Detection
- Link Prediction
- Summary
Terminology (Recap)
- Network → Graphs
  - Set of atomic entities (actors) → nodes, vertices
  - Set of links/edges between nodes ("ties")
  - Edges model pairwise relationships
  - Edges: directed or undirected
- Social network [Wasserman & Faust 1994]
  - Social structure capturing actor relations
  - Actors, links given by dyadic ties between actors (friendship, kinship, organizational position, …) → set of nodes and edges
  - Abstract object, independent of representation
Variables [Wasserman & Faust 1994]
- Structural
  - Measure ties between actors (→ links)
  - Specific relation
  - Make up connections in graph/network
- Compositional
  - Measure actor attributes
    - Age
    - Gender
    - Ethnicity
    - Affiliation
    - …
  - Describe actors
Attributed Graphs
- Graph: edge attributes and/or node attributes
- Structure: ties/links (of respective relations)
- Attributes: additional information
  - Actor attributes (node labels)
  - Link attributes (information about connections)
  - Attribute vectors for actors and/or links
  - … can be mapped from/to each other
- Integration of heterogeneous data (networks + vectors)
- Enables simultaneous analysis of relational + attribute data
Attributed Network/Graph
- Examples
  - Citation attributes
    - (Co-)Authors
    - Affiliation
    - Country
    - Gender
    - …
  - WWW
    - Links
    - Content (BoW)
    - …
(Newman 2003)
Subgroups & Cohesive subgroups [Wasserman & Faust 1994]
- Subgroup
  - Subset of actors (and all their ties)
- Define subgroups using specific criteria (homogeneity among members)
  - Compositional: actor attributes
  - Structural: using tie structures
- Detection of cohesive subgroups & communities → structural aspects
- Subgroup discovery → actor attributes
- … attributed graph → can combine both
