Identifying and Characterizing Nodes Important to Community Using - PowerPoint PPT Presentation

Identifying and Characterizing Nodes Important to Community Structure Using the Spectrum of the Graph

Citation
• Published in PLoS ONE, volume 6, November 2011
• Authors: Yang Wang, Zengru Di and Ying Fan, all from the Department of Systems Science, Beijing Normal University, China

Overview
• Networks represent the interaction structure among the components of a wide range of real complex systems
• Exploring network communities
  • reveals the structure of the network
  • provides a new perspective on dynamic processes
  • uncovers the relationships among the nodes
• This paper devises a new approach to identify the important nodes without knowing the exact partition of the network

Construction
• Based on the observation that the spectrum of the adjacency matrix indicates the community structure of a network
• Distinguishes two kinds of critical nodes
  • community core: eigenvalues
  • bridge: graph Laplacian
• Experiments on synthetic and real networks

Definitions
• Eigenvector: a non-zero column vector v is an eigenvector of a square matrix A iff there exists a number λ such that Av = λv.
• Eigenvalue: the number λ is called the eigenvalue corresponding to the eigenvector v.
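The definition can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 2 × 2 matrix and the helper name `is_eigenpair` are illustrative choices, not part of the paper.

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is intended for symmetric matrices: it returns the eigenvalues in
# ascending order and orthonormal eigenvectors as the columns of V.
eigenvalues, V = np.linalg.eigh(A)

def is_eigenpair(A, lam, v, tol=1e-10):
    """Return True if v is non-zero and A v equals lam * v within tol."""
    return bool(np.linalg.norm(v) > tol and np.allclose(A @ v, lam * v, atol=tol))

# Every (eigenvalue, eigenvector) pair returned by eigh should pass the check.
checks = [is_eigenpair(A, lam, V[:, i]) for i, lam in enumerate(eigenvalues)]
```

`is_eigenpair(A, 2.0, np.array([1.0, 0.0]))` returns `False`, because A maps that vector to (2, 1), which is not a multiple of it.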

Identifying important nodes
• Proposed method: a centrality metric based on the spectrum of the adjacency matrix
• Definitions: binary network G = (V, E)
  • |V| = n, |E| = m
  • Eigenvectors are orthogonal and normalized
• Objective function:
  • Maximize eigenvalues (λ) using perturbation theory
  • $P_k$ is the relative change in the c largest eigenvalues when node k is removed

Centrality Metric
• $v_{ik}$ is the k-th element of $v_i$, and $P_k$ lies in the interval [0, 1]. If node k is important to the community structure, $P_k$ will be large
• In a network with n nodes and c communities, the index is scaled to sum to 1: $I_k = P_k / c$
• If the index $I_k$ is larger than 1/n, node k is an important node
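The metric can be sketched in a few lines of NumPy. This sketch assumes that $P_k$ is the sum of the squared k-th components of the c leading eigenvectors of the adjacency matrix, a form chosen because it satisfies the properties stated above (it lies in [0, 1] and $I_k$ sums to 1); the toy graph and the function names are hypothetical.

```python
import numpy as np

def toy_adjacency():
    """Two 4-node cliques (nodes 0-3 and 5-8) joined through node 4 (hypothetical graph)."""
    n = 9
    A = np.zeros((n, n))
    for clique in (range(0, 4), range(5, 9)):
        for i in clique:
            for j in clique:
                if i != j:
                    A[i, j] = 1.0
    for i, j in [(3, 4), (4, 5)]:
        A[i, j] = A[j, i] = 1.0
    return A

def community_centrality(A, c):
    """I_k = P_k / c, ASSUMING P_k = sum of v_ik^2 over the c leading eigenvectors."""
    _, V = np.linalg.eigh(A)          # eigenvalues ascending, eigenvectors orthonormal
    leading = V[:, -c:]               # eigenvectors of the c largest eigenvalues
    P = np.sum(leading ** 2, axis=1)  # one value per node, each in [0, 1]
    return P / c

A = toy_adjacency()
I = community_centrality(A, c=2)
# Nodes whose index exceeds the 1/n threshold from the slide.
important = np.flatnonzero(I > 1.0 / len(I))
```

Because the eigenvectors are orthonormal, the entries of `I` sum to 1 regardless of the graph, which is what makes 1/n a natural reference value.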

Distinguish two kinds of important nodes
• RatioCut technique: $|C_i|$ is the size of community $C_i$. The RatioCut problem reduces to the Mincut problem when the sizes of the communities are almost the same.
• Case 1: c = 2
  • Index vector s with n elements (one per node)

Continued
• The RatioCut function is rewritten in terms of s and L, where L is the graph Laplacian defined as $L_{ij} = -A_{ij}$ for $i \neq j$ and $L_{ii} = k_i$, with $k_i$ the degree of node i
• There are also two constraints on s
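The link between the Laplacian and a cut can be illustrated with a small sketch. It assumes a plain +1/-1 labelling for s (an assumption; the deck's own s may be scaled by community size), and the path graph and variable names are hypothetical.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian: L_ii = k_i (degree of node i), L_ij = -A_ij for i != j."""
    degrees = A.sum(axis=1)
    return np.diag(degrees) - A

# Hypothetical path graph 0-1-2-3, used only as an illustration.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

L = laplacian(A)

# Simplified +1/-1 index vector for the split {0, 1} | {2, 3} (assumed form).
s = np.array([1.0, 1.0, -1.0, -1.0])

quadratic_form = s @ L @ s
# The same quantity written edge by edge: sum over edges of (s_i - s_j)^2.
edge_sum = 0.5 * np.sum(A * (s[:, None] - s[None, :]) ** 2)
# Each cut edge contributes (1 - (-1))^2 = 4, so dividing by 4 counts cut edges.
cut_size = quadratic_form / 4.0
```

With this labelling the quadratic form counts the edges crossing the split, which is why minimizing it over admissible s amounts to looking for a small cut.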

Continued
• The partition problem can be posed as a minimization problem
• The solution to this problem is found to be the eigenvector corresponding to the second-smallest eigenvalue of L, denoted by $u_2$
• Community core nodes: $|(u_2)_i|$, the magnitude of the i-th element of $u_2$, is relatively large
• Bridge nodes: $|(u_2)_i|$ is near zero
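The two-community rule can be tried on a toy graph. This is a minimal sketch, assuming NumPy; the graph (two cliques joined through a single node) and the function names are hypothetical, and the 1e-8 cutoff for "near zero" is an arbitrary illustrative choice.

```python
import numpy as np

def toy_adjacency():
    """Two 4-node cliques (nodes 0-3 and 5-8) joined through node 4 (hypothetical graph)."""
    n = 9
    A = np.zeros((n, n))
    for clique in (range(0, 4), range(5, 9)):
        for i in clique:
            for j in clique:
                if i != j:
                    A[i, j] = 1.0
    for i, j in [(3, 4), (4, 5)]:
        A[i, j] = A[j, i] = 1.0
    return A

def second_smallest_eigenvector(A):
    """Eigenvector u_2 of the Laplacian L = D - A for its second-smallest eigenvalue."""
    L = np.diag(A.sum(axis=1)) - A
    _, U = np.linalg.eigh(L)   # eigenvalues in ascending order
    return U[:, 1]

u2 = second_smallest_eigenvector(toy_adjacency())
# Nodes whose entry is numerically zero: bridge candidates under the slide's rule.
near_zero = np.flatnonzero(np.abs(u2) < 1e-8)
```

In this symmetric toy graph the entries of `u2` have opposite signs on the two cliques, and the only entry that vanishes belongs to the connecting node 4.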

Continued
• Case 2: c > 2
  • A new n × c index matrix S is defined as $s_{ij} = 1/\sqrt{|C_j|}$ if vertex $i \in C_j$, else 0
  • RatioCut = $\mathrm{Tr}(S^T L S)$
  • L is a symmetric matrix, which can be written as $L = U D U^T$, where the columns of U are the eigenvectors of L and D is the diagonal matrix of eigenvalues, $D_{ii} = \beta_i$
  • RatioCut can then be expressed in terms of U, D and S
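The substitution can be carried out step by step. The following is a derivation sketch that uses only the definitions of S, U, D and $\beta_i$ given on this slide; it is plain linear algebra and is not claimed to match the paper's own notation line for line.

```latex
\begin{aligned}
\mathrm{RatioCut} &= \mathrm{Tr}\left(S^{T} L S\right)
  = \mathrm{Tr}\left(S^{T} U D U^{T} S\right)
  = \mathrm{Tr}\left[(U^{T} S)^{T} D\, (U^{T} S)\right] \\
 &= \sum_{j=1}^{n} \beta_j \sum_{k=1}^{c} \left[(U^{T} S)_{jk}\right]^{2},
\qquad
(U^{T} S)_{jk} = \sum_{i=1}^{n} U_{ij}\, s_{ik}
  = \frac{1}{\sqrt{|C_k|}} \sum_{i \in C_k} U_{ij}.
\end{aligned}
```

Each eigenvalue $\beta_j$ therefore weights the squared, size-normalized sum of the j-th eigenvector's entries over each community, so a small RatioCut requires the communities to align with the eigenvectors of the smallest eigenvalues.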

Continued
• Define the vertex vector of vertex i as $r_i$, with $[r_i]_j = U_{ij}$
• The expression can then be rewritten in terms of the vertex vectors, given that the network has communities of almost equal size [$G_k$: set of vertices in community k]
• Minimizing the RatioCut is equivalent to a maximization problem with a parameter p
• For a clear community structure, p = c can be chosen

Continued
• If the community structure is quite clear, the magnitude $|r_i|$ of the vertex vector over its first p terms identifies the bridge nodes; this index is denoted by b
• If the index b of a given vertex is near zero, the presence of that node results in a large RatioCut, and hence it is a bridge node

Continued
• To scale the index to 1, a new term $w_k$ is defined as $w_k = b_k / c$
• Taking an ER (Erdős–Rényi) random network with n nodes as a null model, the index of each node would be 1/n
• If the w-score of a node is smaller than 1/n, the vertex has nearly equal membership in more than one community, and hence it is a bridge node
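The w-score can be sketched as follows. This sketch assumes that $b_k$ is the squared magnitude of the vertex vector $r_k$ over its first p = c terms, i.e. the sum of $U_{kj}^2$ over the eigenvectors of the c smallest Laplacian eigenvalues; that reading is an assumption chosen so that the scores sum to 1. The toy graph and function names are hypothetical.

```python
import numpy as np

def toy_adjacency():
    """Two 4-node cliques (nodes 0-3 and 5-8) joined through node 4 (hypothetical graph)."""
    n = 9
    A = np.zeros((n, n))
    for clique in (range(0, 4), range(5, 9)):
        for i in clique:
            for j in clique:
                if i != j:
                    A[i, j] = 1.0
    for i, j in [(3, 4), (4, 5)]:
        A[i, j] = A[j, i] = 1.0
    return A

def w_score(A, c):
    """w_k = b_k / c, ASSUMING b_k = sum of U_kj^2 over the c smallest Laplacian eigenvectors."""
    L = np.diag(A.sum(axis=1)) - A
    _, U = np.linalg.eigh(L)            # eigenvalues in ascending order
    b = np.sum(U[:, :c] ** 2, axis=1)   # squared magnitude of r_k over its first c terms
    return b / c

A = toy_adjacency()
n = A.shape[0]
w = w_score(A, c=2)
# Nodes below the 1/n null-model value are bridge candidates.
bridge_candidates = np.flatnonzero(w < 1.0 / n)
```

In this toy graph the connecting node 4 receives the smallest score. Its immediate neighbours may also fall under 1/n, so the threshold is best read as a screen rather than a sharp classifier.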

Pros of this approach
• Lower computational cost: O(mn)

Experimental Results
• Synthetic network
  • The centrality metric I predicts nodes 1, 8 and 15 as important nodes. The w-score identifies node 15 as the bridge node
  • The ΔH index also gives the correct prediction, but at significant computational cost
  • M can identify cores only

Experimental Results (contd.)
• Real-world network: Zachary’s karate club (social network) with c = 2
  • The centrality metric I identifies the community cores: node 1 (the instructor) and node 34 (the administrator)
  • The w-score identifies node 3 as the overlapping node, i.e. the bridge between these two communities

Zachary’s karate club visualization
• The diameter of each vertex is proportional to I
• A large diameter indicates an important vertex
• The color of each vertex is related to its w-score
• Red vertices behave like “overlapping” nodes or bridges
• Yellow vertices lie inside their own communities

Word Association Network
• Four communities: Intelligence, Astronomy, Light, Colors
• The word Bright is related to all of them, and likewise Sun
• Community critical nodes: Bright, Sun, Moon, Smart
• Community cores: Moon and Smart
• Bridges: Bright and Sun

Scientist Collaboration Network
• The network represents scientists whose research centers on the properties of networks of one kind or another
• Edges are placed between scientists who have published at least one paper together
• The centrality metric I identifies the group leaders: Newman, Boccaletti, Barabasi
• Their w-scores are not large, as they collaborate with scientists outside their own communities

C. elegans neural network
• The network is divided into 3 communities (sensory neurons, interneurons, motor neurons)
• Each node represents a neuron and each edge represents a synaptic connection between neurons
• High centrality metric I: important interneurons (AVA, AVB, …)
• Their w-scores are very small: most of the important nodes act as bridges, since the connections between communities are essential

Applications in weighted networks
• Artificial network
  • The adjacency matrix of an undirected network is real and symmetric
  • Works well in a small artificial network
  • 10 nodes with two communities
  • A higher weight means a closer relationship between vertices
  • 4 and 9 are the cores of the communities
  • 11 is the bridge between the communities
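Because a weighted, undirected adjacency matrix is still real and symmetric, the same eigen-decomposition applies unchanged. The sketch below assumes the same form of the index as for the binary case (sum of squared components of the c leading eigenvectors, divided by c); the six-node network and its weights are hypothetical and are not the artificial network from the slides.

```python
import numpy as np

# Hypothetical weighted, undirected network: two triangles joined by one weak edge.
# Each tuple is (node_i, node_j, weight); a higher weight means a closer relationship.
edges = [
    (0, 1, 3.0), (0, 2, 3.0), (1, 2, 2.0),   # first triangle
    (3, 4, 3.0), (3, 5, 3.0), (4, 5, 2.0),   # second triangle
    (2, 3, 0.5),                             # weak link between the two groups
]
n = 6
W = np.zeros((n, n))
for i, j, weight in edges:
    W[i, j] = W[j, i] = weight   # symmetric because the network is undirected

def community_centrality(W, c):
    """I_k = P_k / c, ASSUMING P_k = sum of v_ik^2 over the c leading eigenvectors of W."""
    _, V = np.linalg.eigh(W)     # valid because W is real and symmetric
    return np.sum(V[:, -c:] ** 2, axis=1) / c

I = community_centrality(W, c=2)
```

Only the matrix entries change relative to the binary case, so any difference in ranking between the weighted and unweighted versions of a network comes from the weights themselves.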

Applications in weighted networks (contd.)
• Real network: SFI (Santa Fe Institute) collaboration network
  • Vertices 2, 12 and 24 are group leaders (community cores)
  • Vertices 1, 9 and 11 are bridges
  • The result differs from that of the corresponding unweighted network
  • Edge weights might affect the results

Limitations
• When cluster sizes are highly heterogeneous, community identification fails
• This limitation is a result of a property of the adjacency matrix
• When $N_{small}^2 < N_{large}$, small communities cannot be detected
• δ = $N_{large} / N_{small}$
• I cannot identify the important nodes in the small communities when the communities are of very different sizes

Conclusion/Observation
• The proposed method works well in many cases without knowing the exact community structure
• However, the number of communities must be known
• The paper does not address the effect of removing or adding a node
• Changes in the underlying community structure are not taken into consideration
• The directed case is not considered and is left to future research
• Identifying such key nodes is important and could potentially be used
  • to identify the organizer of a community in social networks,
  • to develop an immunization strategy in an epidemic process,
  • to identify key nodes in biological networks
