A common-neighbors-based random graph model for community structure
Emily Fischer
Cornell University
May 12, 2017

A Exam Emily Fischer
Outline
1. Introduction
   • Preferential Attachment (PA)
2. Common Neighbors Model (CN)
   • Degree distribution
   • Community structure

Preferential Attachment
• Users prefer to connect to nodes of high degree
• Results in heavy-tailed degree distribution

Issues with Preferential Attachment
The LinkedIn graph
1. does NOT have a power law degree distribution
2. has “community structure”

Log-log plots of degree distribution

What is “community structure”?
• Strong community structure
• More edges within community than between communities

What is “community structure”?
• Preferential attachment
• One central hub around high-degree node

Common Neighbors Model
Users prefer to connect to nodes with whom they share many mutual friends

Common Neighbors Model
Sequence of graphs $(G_t)_{t \ge 0}$.
• Given graph $G_t$ with $n(t)$ nodes and $m(t)$ edges
• At time $t + 1$, a new node $v$ arrives with probability $\alpha$
• If no new arrival, select $v$ uniformly among existing nodes
• Select receiving node $w$ with probability proportional to number of common neighbors between $v$ and $w$
  • $\Gamma_v(t)$ is the neighborhood of $v$ at time $t$
  • $K_{vw}(t) = |\Gamma_v(t) \cap \Gamma_w(t)|$
  $$P(\text{select } w \mid \text{sender} = v) = \frac{K_{vw}(t) + \delta}{\sum_u K_{vu}(t) + \delta\, n(t)}$$
• Form directed edge $(v, w)$.
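
The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the author's code. It makes several assumptions that the slide does not settle: neighborhoods ignore edge direction, self-loops are excluded (so for an existing sender the normalizing constant is taken over the other nodes only), repeated edges collapse into one, and the seed graph is a single edge between nodes 0 and 1. The function names are hypothetical, and the defaults $\alpha = .2$, $\delta = .5$ are the values used in the modularity experiments.

```python
import random


def selection_weights(nbrs, v, candidates, delta):
    """Unnormalised weights K_vw(t) + delta for each candidate receiver w."""
    return [len(nbrs[v] & nbrs[w]) + delta for w in candidates]


def cn_step(nbrs, alpha, delta, rng):
    """One time step: choose sender v, choose receiver w, add edge (v, w)."""
    existing = list(nbrs)                      # nodes of G_t
    if rng.random() < alpha:
        v = len(existing)                      # new node; assumes labels 0..n-1
        nbrs[v] = set()
    else:
        v = rng.choice(existing)               # uniform over existing nodes
    # Assumption: the sender cannot select itself.
    candidates = [w for w in existing if w != v]
    weights = selection_weights(nbrs, v, candidates, delta)  # needs delta > 0
    w = rng.choices(candidates, weights=weights, k=1)[0]
    # Assumption: neighbourhoods are undirected even though the edge is directed.
    nbrs[v].add(w)
    nbrs[w].add(v)
    return v, w


def simulate_cn(steps, alpha=0.2, delta=0.5, seed=0):
    """Run the sketch from an assumed seed graph with one edge (0, 1)."""
    rng = random.Random(seed)
    nbrs = {0: {1}, 1: {0}}
    edges = [(0, 1)]
    for _ in range(steps):
        edges.append(cn_step(nbrs, alpha, delta, rng))
    return nbrs, edges
```

Calling `simulate_cn(1000)` returns the neighborhood sets and the list of directed edges in arrival order. The weights are recomputed from scratch at every step, which is simple but costs time linear in the number of nodes per step.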

Common Neighbors Model
What does $K_{vw}(t)$ look like?
Hard to analyze - feedback

Common Neighbor Process
• Want to model evolution of $K_{ij}(t)$ on its own.
• Start at $\tilde{K}_{ij}(0) = 0$ for all pairs $i, j$.
• Given $(\tilde{K}_{ij}(t))_{i,j \ge 0}$, at $t + 1$,
  • Select $i$ uniformly from existing nodes
  • Choose $\eta = c\,(n(t))^{\theta}$ nodes, $j_1, j_2, \ldots, j_\eta$, preferentially with $\tilde{K}_{ij_\ell}(t)$, and increase $\tilde{K}_{ij_\ell}(t + 1) = \tilde{K}_{ij_\ell}(t) + 1$.
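
A rough Python sketch of one step of this process follows. It is an illustration under stated assumptions and not the author's implementation. The node set is held fixed at $n$ nodes, although $n(t)$ grows in the model. $\eta$ is rounded down to an integer and is at least 1. "Preferentially" is read as weight $\tilde{K}_{ij}(t) + \delta$ with a hypothetical smoothing constant $\delta > 0$, since all counts start at zero. The $\eta$ nodes are drawn with replacement, and only row $i$ of the table is updated. The function names are hypothetical.

```python
import math
import random


def cn_process_step(K, c, theta, delta, rng):
    """One step of the surrogate process on an n x n list-of-lists K."""
    n = len(K)
    i = rng.randrange(n)                         # sender chosen uniformly
    # Assumption: eta = c * n^theta rounded down, and at least 1.
    eta = max(1, math.floor(c * n ** theta))
    others = [j for j in range(n) if j != i]
    # Assumption: preferential weights are K_ij(t) + delta, fixed at time t.
    weights = [K[i][j] + delta for j in others]
    # Assumption: the eta nodes are drawn with replacement.
    for j in rng.choices(others, weights=weights, k=eta):
        K[i][j] += 1
    return i, eta


def total_common_neighbors(K, i):
    """N_i(t): the sum over j of K_ij(t)."""
    return sum(K[i])
```

Because all $\eta$ draws use the weights at time $t$, a pair that is picked twice in the same step is incremented twice. A variant without replacement would need a different sampler.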

Common Neighbor Process
Let $N_i(t) = \sum_j \tilde{K}_{ij}(t)$.
What is the distribution of $N_i(t)$?

Common Neighbor Process
Theorem
Let $N_i(t) = \sum_j \tilde{K}_{ij}(t)$. Then there exists a random variable $Z_i$ such that
$$\frac{N_i(t)}{t^{\theta}} \to Z_i$$
in probability, where $Z_i$ has characteristic function
$$\phi_Z(z) = \exp\left( \frac{1-\alpha}{\alpha\theta} \int_0^{\alpha^{\theta}} \frac{1}{t}\left(e^{itz} - 1\right) dt \right).$$

Common Neighbor Process
Result
• The “total common neighbors” $N_i(t)$ converges when scaled by $t^{\theta}$.
In progress/Future
• Limiting distribution for $\tilde{K}_{ij}(t)$.
• Use these distributions to analyze degree distribution of the graph

Community Structure
• How to quantify “strong community structure”
• Compare community structure of CN and PA.

Community Structure
CN vs. PA

Modularity
Definition
Given a graph partitioned into $c$ communities, the modularity is
$$Q = \sum_{i=1}^{c} \left( e_{ii} - a_i^2 \right)$$
where $e_{ii}$ is the fraction of edges with both end vertices in community $i$, and $a_i$ is the fraction of ends of edges with vertices in community $i$.
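
As a worked illustration of the definition, the sketch below computes $Q$ directly from an edge list and a node-to-community map. It assumes an undirected graph in which each edge is listed once. The function name and argument layout are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Q = sum_i (e_ii - a_i^2) for an undirected edge list and a node -> community map."""
    m = len(edges)
    e = defaultdict(float)   # e[i]: fraction of edges with both ends in community i
    a = defaultdict(float)   # a[i]: fraction of edge ends attached to community i
    for u, v in edges:
        cu, cv = community[u], community[v]
        if cu == cv:
            e[cu] += 1 / m
        a[cu] += 1 / (2 * m)
        a[cv] += 1 / (2 * m)
    return sum(e[i] - a[i] ** 2 for i in a)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}
```

For the two-triangle example each community has $e_{ii} = 3/7$ and $a_i = 1/2$, so `modularity(edges, two_groups)` gives $2(3/7 - 1/4) = 5/14 \approx 0.357$. Putting every node in one community gives $Q = 0$.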

Community Detection
• Community detection algorithms aim to assign nodes to communities in a way that is reasonable
• Some algorithms maximize modularity: Fast-greedy (FG), Largest-eigenvector (LE)
• But there are other methods as well: Edge-betweenness (EB), Walktrap (WC).

Modularity
Averages of modularity over 100 trials ($\alpha = .2$, $\delta = .5$)

Graph     EB     FG     LE     WC
CN 500    .450   .472   .423   .401
PA 500    .276   .379   .333   .251
CN 1000   .310   .402   .350   .301
PA 1000   .103   .328   .279   .190

For CN 5000 and PA 5000 only three values each are listed: CN 5000 .145 .320 .176; PA 5000 .039 .277 .120

Conclusion
1. PA model lacks characteristics of LinkedIn network:
   • PA has a power-law degree distribution, LinkedIn does not
   • PA lacks community structure, LinkedIn has it
2. Common Neighbors Model
   • Limiting distribution of $N_i(t)$ in the common neighbors process
   • Better community structure than PA

Model 1: Edge Acceptance/Rejection
• Node $v$ sends an invitation to a node $w$.
• $w$ accepts the invitation with probability $p_{vw}(t)$.

Edge Acceptance/Rejection
How can acceptance probability achieve goals of (1) non-power law degree distribution and (2) community structure?
• Rich may choose not to get richer: $p_{vw}(t) \downarrow 0$
• Probability of acceptance based on communities:
$$p_{vw}(t) = \begin{cases} p & C_v = C_w \\ q & C_v \neq C_w \end{cases}$$
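
The community-based acceptance rule can be written as a small helper. This is only a sketch: the function names are hypothetical, and the community labels $C_v$ are assumed to be given in advance as a mapping from node to label.

```python
import random


def acceptance_probability(v, w, community, p, q):
    """p_vw = p if v and w share a community, q otherwise."""
    return p if community[v] == community[w] else q


def invitation_accepted(v, w, community, p, q, rng):
    """Simulate whether w accepts the invitation sent by v."""
    return rng.random() < acceptance_probability(v, w, community, p, q)
```

Setting `p == q` recovers a constant acceptance probability, and choosing `p > q` makes edges inside a community more likely to form than edges between communities.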

Edge Acceptance/Rejection
For now, constant acceptance probability $p_{vw}(t) = p$ for all $v, w$ and $t \ge 0$.
