Community structure in networks Argimiro Arratia & Ramon - PowerPoint PPT Presentation

Clustering algorithms (General outlook) Quantifying the quality of community structure [Yang and Leskovec, 2012] Back to methods for detection of community structure [Fortunato, 2010]
Community structure in networks
Argimiro Arratia & Ramon Ferrer-i-Cancho
Universitat Politècnica de Catalunya
Version 0.6
Complex and Social Networks (2020-2021)
Master in Innovation and Research in Informatics (MIRI)

Instructors
◮ Ramon Ferrer-i-Cancho, rferrericancho@cs.upc.edu, http://www.cs.upc.edu/~rferrericancho/
◮ Argimiro Arratia, argimiro@cs.upc.edu, http://www.cs.upc.edu/~argimiro/
Please go to http://www.cs.upc.edu/~csn for all course material, schedule, lab work, etc.

What is community structure?

Why is community structure important?

... but don't trust visual perception: it is best to use objective algorithms.

Contents
◮ Clustering algorithms (General outlook)
  ◮ Hierarchical clustering algorithms
◮ Quantifying the quality of community structure [Yang and Leskovec, 2012]
◮ Back to methods for detection of community structure [Fortunato, 2010]
  ◮ Girvan-Newman algorithm
  ◮ Modularity optimization algorithms
  ◮ Graph partitioning algorithms
  ◮ Clique percolation method

Clustering algorithms (General outlook)
Clustering algorithms are either:
Hierarchical
◮ Agglomerative: begin with singleton groups and join them successively by similarity. E.g. the Louvain algorithm.
◮ Divisive: begin with one group containing all points and divide it successively. E.g. Girvan-Newman.
Partitional: separate the points into an arbitrary number of groups and exchange elements according to similarity. E.g. k-means, graph partitioning.
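The agglomerative scheme described above (start from singletons, join successively) can be sketched in a few lines. This is only an illustrative sketch, not the Louvain algorithm: it assumes a precomputed dissimilarity matrix D (the values below are hypothetical) and merges, at each step, the two clusters whose closest members are nearest. The function name `agglomerative` and the `linkage` parameter are choices made for this example.

```python
def agglomerative(D, linkage=min):
    """Bottom-up clustering on a dissimilarity matrix D.

    Returns the sequence of merges as (cluster_a, cluster_b, dissimilarity),
    which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(D))]  # begin with singleton groups
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage value.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(D[x][y] for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical symmetric dissimilarity matrix for 4 points.
D = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
merges = agglomerative(D)
for left, right, d in merges:
    print(left, "+", right, "at dissimilarity", d)
```

Passing `linkage=max` instead of the default `min` changes how the distance between two groups is measured without changing the merge loop.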

Clustering algorithms (General outlook)
Similarity
It is desirable that the similarity measure has the properties of a distance metric (except possibly for the triangle inequality, which may not hold if the graph is not complete):
◮ $d(x, y) \geq 0$ and $d(x, x) = 0$
◮ $d(x, y) = d(y, x)$
◮ $d(x, y) \leq d(x, z) + d(z, y)$ (triangle inequality)
This is to guarantee convergence of clustering algorithms, which are usually based on greedy selection.
If a distance $d(x, y)$ is considered, then we talk about dissimilarity: high values of $d(x, y)$ mean low similarity.

Clustering algorithms (General outlook)
If we want to interpret a high value as high similarity, and we are working with a distance metric $d(x, y)$, then consider its inverse: $s(x, y) = 1/d(x, y)$ or $1/d(x, y) + 0.5$.
NB: We are concerned here with clustering elements that already have a defined rule of association (i.e. networks); hence similarity will reflect some structural property of the network. Another form of clustering (in statistical analysis) works on elements described by features, from which one defines a similarity network (a complete graph).

Similarity measures $w_{ij}$ for nodes I
These apply when the network cannot be embedded in Euclidean space and similarity must be inferred from the adjacency relation between vertices (implicit similarity).
Let $A$ be the adjacency matrix of the network, i.e. $A_{ij} = 1$ if $(i, j) \in E$ and $0$ otherwise.
◮ Jaccard index:
$w_{ij} = \frac{|\Gamma(i) \cap \Gamma(j)|}{|\Gamma(i) \cup \Gamma(j)|} = \frac{\sum_k A_{ik} A_{kj}}{\sum_k (A_{ik} + A_{jk} - A_{ik} A_{kj})}$
where $\Gamma(i)$ is the set of neighbors of node $i$. The product term in the denominator keeps common neighbors from being counted twice in the union.
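As a small illustration of the Jaccard index, the sketch below computes it in two ways, from neighbor sets and directly from the adjacency matrix, on a hypothetical 5-node undirected graph. The helper names are invented for this example. In the matrix form the union size is taken as $\sum_k (A_{ik} + A_{jk} - A_{ik} A_{jk})$, so that common neighbors are counted once.

```python
def neighbors(A, i):
    """Neighbor set Gamma(i) of node i, read off row i of the adjacency matrix."""
    return {k for k, a in enumerate(A[i]) if a == 1}


def jaccard_sets(A, i, j):
    """|Gamma(i) & Gamma(j)| / |Gamma(i) | Gamma(j)|, or 0.0 if both are isolated."""
    gi, gj = neighbors(A, i), neighbors(A, j)
    union = gi | gj
    return len(gi & gj) / len(union) if union else 0.0


def jaccard_matrix(A, i, j):
    """Same quantity written with sums over the adjacency matrix (A symmetric)."""
    common = sum(a * b for a, b in zip(A[i], A[j]))
    union = sum(a + b - a * b for a, b in zip(A[i], A[j]))
    return common / union if union else 0.0


# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 2-3, 3-4.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(jaccard_sets(A, 0, 1))    # nodes 0 and 1 share neighbor 2
print(jaccard_matrix(A, 0, 4))  # nodes 0 and 4 share no neighbor
```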

Similarity measures $w_{ij}$ for nodes II
◮ Cosine similarity (from the equation $x \cdot y = |x||y| \cos\theta$):
$w_{ij} = \frac{\sum_k A_{ik} A_{kj}}{\sqrt{\sum_k A_{ik}^2} \sqrt{\sum_k A_{jk}^2}} = \frac{n_{ij}}{\sqrt{k_i k_j}}$ (recall $A_{ij} = 1$ or $0$)
where:
  ◮ $n_{ij} = |\Gamma(i) \cap \Gamma(j)| = \sum_k A_{ik} A_{kj}$, and
  ◮ $k_i = \sum_k A_{ik}$ is the degree of node $i$
◮ Another normalization for $n_{ij}$: the idea is to normalize by the expected number of common neighbors if neighbors were chosen uniformly at random. This is approximately $k_i k_j / n$, where $n$ is the number of nodes, and so
$w_{ij} = \frac{n_{ij}}{k_i k_j / n} = n \frac{\sum_k A_{ik} A_{kj}}{\sum_k A_{ik} \sum_k A_{jk}}$
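The two normalizations of $n_{ij}$ can be compared on a toy example. The sketch below assumes a symmetric 0/1 adjacency matrix (the graph is hypothetical) and returns 0.0 when either node has degree zero, which is a convention chosen here rather than something the slides specify.

```python
import math


def common_neighbors(A, i, j):
    """n_ij = sum_k A_ik * A_kj (rows are used for both nodes since A is symmetric)."""
    return sum(a * b for a, b in zip(A[i], A[j]))


def degree(A, i):
    """k_i = sum_k A_ik."""
    return sum(A[i])


def cosine_similarity(A, i, j):
    """n_ij / sqrt(k_i * k_j); 0.0 if either node is isolated (assumed convention)."""
    ki, kj = degree(A, i), degree(A, j)
    return common_neighbors(A, i, j) / math.sqrt(ki * kj) if ki and kj else 0.0


def expected_ratio(A, i, j):
    """n_ij divided by its approximate expectation k_i * k_j / n under random wiring."""
    n = len(A)
    ki, kj = degree(A, i), degree(A, j)
    return n * common_neighbors(A, i, j) / (ki * kj) if ki and kj else 0.0


# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 2-3, 3-4.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(cosine_similarity(A, 0, 1))
print(expected_ratio(A, 0, 1))
```

A value of `expected_ratio` above 1 means the two nodes share more neighbors than the random baseline would suggest.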

Similarity measures $w_{ij}$ for nodes III
◮ Euclidean distance, or rather Hamming distance since $A$ is binary (a dissimilarity):
$d_{ij} = \sum_k (A_{ik} - A_{jk})^2$
◮ Normalized Euclidean distance [1] (also a dissimilarity):
$d_{ij} = \frac{\sum_k (A_{ik} - A_{jk})^2}{k_i + k_j} = 1 - 2 \frac{n_{ij}}{k_i + k_j}$
◮ Pearson correlation coefficient:
$r_{ij} = \frac{\mathrm{cov}(A_i, A_j)}{\sigma_i \sigma_j} = \frac{\sum_k (A_{ik} - \mu_i)(A_{jk} - \mu_j)}{n \sigma_i \sigma_j}$
where $\mu_i = \frac{1}{n} \sum_k A_{ik}$ and $\sigma_i = \sqrt{\frac{1}{n} \sum_k (A_{ik} - \mu_i)^2}$
[1] Uses the idea that $d_{ij}$ is maximal when there are no common neighbors: the unnormalized distance then equals $k_i + k_j$, so the normalized distance is $d_{ij} = 1$.
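A short sketch of these three measures on a hypothetical graph follows. It assumes a symmetric 0/1 adjacency matrix, nodes with at least one neighbor for the normalized distance, and non-constant rows for the Pearson coefficient (otherwise $\sigma_i = 0$ and the ratio is undefined). Function names are illustrative.

```python
import math


def hamming(A, i, j):
    """d_ij = sum_k (A_ik - A_jk)^2, i.e. the number of positions where rows differ."""
    return sum((a - b) ** 2 for a, b in zip(A[i], A[j]))


def normalized_distance(A, i, j):
    """Hamming distance divided by k_i + k_j (assumes k_i + k_j > 0)."""
    return hamming(A, i, j) / (sum(A[i]) + sum(A[j]))


def pearson(A, i, j):
    """r_ij = cov(A_i, A_j) / (sigma_i * sigma_j), using 1/n normalization."""
    n = len(A)
    mu_i, mu_j = sum(A[i]) / n, sum(A[j]) / n
    cov = sum((a - mu_i) * (b - mu_j) for a, b in zip(A[i], A[j])) / n
    sigma_i = math.sqrt(sum((a - mu_i) ** 2 for a in A[i]) / n)
    sigma_j = math.sqrt(sum((b - mu_j) ** 2 for b in A[j]) / n)
    return cov / (sigma_i * sigma_j)


# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 2-3, 3-4.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
n_01 = sum(a * b for a, b in zip(A[0], A[1]))  # common neighbors of 0 and 1
print(hamming(A, 0, 1))
print(normalized_distance(A, 0, 1), 1 - 2 * n_01 / (sum(A[0]) + sum(A[1])))
print(pearson(A, 0, 1))
```

The second printed line shows both sides of the identity $d_{ij} = 1 - 2 n_{ij} / (k_i + k_j)$ for one pair of nodes.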

Similarity measures for sets of nodes
◮ Single linkage: $s_{XY} = \min_{x \in X, y \in Y} s_{xy}$
◮ Complete linkage: $s_{XY} = \max_{x \in X, y \in Y} s_{xy}$
◮ Average linkage: $s_{XY} = \frac{\sum_{x \in X, y \in Y} s_{xy}}{|X| \times |Y|}$
◮ Ward (or minimum variance): $s_{XY} = \frac{|X| \times |Y|}{|X| + |Y|} \|c_X - c_Y\|^2$, where $c_X$ is the centroid of $X$, i.e. the point minimizing the sum of squared distances to the elements of $X$: $\sum_{u \in X} \|u - c_X\|^2 \leq \sum_{u \in X} \|u - v\|^2$ for all $v$
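The first three linkage rules translate directly into code. The sketch below assumes the pairwise values $s_{xy}$ are stored in a matrix S and that the two groups are given as lists of node indices; the matrix values are hypothetical and the function names are chosen for this example.

```python
def single_linkage(S, X, Y):
    """Minimum of s_xy over all pairs x in X, y in Y."""
    return min(S[x][y] for x in X for y in Y)


def complete_linkage(S, X, Y):
    """Maximum of s_xy over all pairs x in X, y in Y."""
    return max(S[x][y] for x in X for y in Y)


def average_linkage(S, X, Y):
    """Sum of s_xy over all pairs, divided by |X| * |Y|."""
    return sum(S[x][y] for x in X for y in Y) / (len(X) * len(Y))


# Hypothetical symmetric matrix of pairwise values for 4 nodes.
S = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
X, Y = [0, 1], [2, 3]
print(single_linkage(S, X, Y))
print(complete_linkage(S, X, Y))
print(average_linkage(S, X, Y))
```

Ward's rule is not included here because it needs point coordinates (to compute centroids) rather than a matrix of pairwise values.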

Notes on similarity measures for sets of nodes
Ward's method says: "the distance between two clusters X and Y is how much the sum of squares will increase when we merge them". In math:
$\Delta(X, Y) = \sum_{i \in X \cup Y} \|x_i - c_{X \cup Y}\|^2 - \sum_{i \in X} \|x_i - c_X\|^2 - \sum_{i \in Y} \|x_i - c_Y\|^2$
◮ Single linkage: tends to make clusters that are too small (in size)
◮ Complete linkage: tends to make clusters that are too big, and fewer of them
◮ Average linkage: more or less regular
◮ Ward's: tends to minimise the total within-cluster variance
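The increase in the sum of squares, $\Delta(X, Y)$, coincides with the closed form $\frac{|X| |Y|}{|X| + |Y|} \|c_X - c_Y\|^2$, a standard identity for centroids. The sketch below checks this numerically on hypothetical 2-D points; it assumes the elements are given as coordinate tuples, and the function names are illustrative.

```python
def centroid(P):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(P)
    return tuple(sum(p[d] for p in P) / n for d in range(len(P[0])))


def sq_dist(u, v):
    """Squared Euclidean distance ||u - v||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def sum_of_squares(P):
    """Sum of squared distances from each point to the centroid of P."""
    c = centroid(P)
    return sum(sq_dist(p, c) for p in P)


def ward_delta(X, Y):
    """Increase in the sum of squares caused by merging X and Y."""
    return sum_of_squares(X + Y) - sum_of_squares(X) - sum_of_squares(Y)


def ward_closed_form(X, Y):
    """|X||Y| / (|X| + |Y|) * ||c_X - c_Y||^2."""
    return len(X) * len(Y) / (len(X) + len(Y)) * sq_dist(centroid(X), centroid(Y))


# Hypothetical clusters of 2-D points.
X = [(0.0, 0.0), (2.0, 0.0)]
Y = [(4.0, 3.0)]
print(ward_delta(X, Y), ward_closed_form(X, Y))
```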

Hierarchical clustering
From hairball to dendrogram
