Degrees, Power Laws and Popularity Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ February 13, 2020 Network Science Analytics Degrees, Power Laws and Popularity 1
Degree distributions
Degree distributions
Power-law degree distributions
Visualizing and fitting power laws
Popularity and preferential attachment
Descriptive analysis of network characteristics
◮ Given a network graph representation of a complex system
⇒ Structural properties of G are key to system-level understanding
Example
◮ Q1: What underpins various types of basic social dynamics?
A: Study vertex triplets (triads) and patterns of ties among them
◮ Q2: How can we formalize the notion of ‘importance’ in a network?
A: Define measures of individual vertex (or group) centrality
◮ Q3: Can we identify communities and cohesive subgroups?
A: Formulate as a graph partitioning (clustering) problem
◮ Characterization of individual vertices/edges and network cohesion
◮ Draws on social network analysis, math, computer science, statistical physics
Degree
◮ Def: The degree $d_v$ of vertex $v$ is its number of incident edges
⇒ Degree sequence arranges degrees in non-decreasing order
[Figure: example graph on six vertices, with vertex degrees shown in red]
◮ In figure ⇒ Vertex degrees shown in red, e.g., $d_1 = 2$ and $d_5 = 3$
⇒ Graph’s degree sequence is 2,2,2,3,3,4
◮ In general, the degree sequence does not uniquely specify the graph
◮ High-degree vertices are likely to be influential, central, prominent
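A minimal sketch of the definition above, computing vertex degrees and the degree sequence from an edge list. The edge list `EDGES` is hypothetical: the edges of the figure's graph are not recoverable here, so the list was merely chosen to reproduce the stated values $d_1 = 2$, $d_5 = 3$ and the degree sequence 2,2,2,3,3,4. The function names are illustrative as well.

```python
from collections import Counter

# Hypothetical edge list (NOT the figure's actual edges); chosen only so that
# d_1 = 2, d_5 = 3 and the degree sequence is 2,2,2,3,3,4 as on the slide.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 5), (2, 6), (3, 5), (4, 5), (4, 6)]


def degrees(edges):
    """Return {vertex: degree} for an undirected simple graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1  # each edge contributes one to both endpoints
        deg[v] += 1
    return dict(deg)


def degree_sequence(edges):
    """Degrees arranged in non-decreasing order."""
    return sorted(degrees(edges).values())


deg = degrees(EDGES)
seq = degree_sequence(EDGES)
print(deg[1], deg[5], seq)
```

Other edge lists on six vertices can share this degree sequence, which is one way to see that the sequence does not pin down the graph.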
Degree distribution
◮ Let $N(d)$ denote the number of vertices with degree $d$
⇒ Fraction of vertices with degree $d$ is $P(d) := N(d)/N_v$
◮ Def: The collection $\{P(d)\}_{d \geq 0}$ is the degree distribution of $G$
◮ Histogram formed from the degree sequence (bins of size one)
[Figure: histogram of $P(d)$ versus $d$]
◮ $P(d)$ = probability that randomly chosen node has degree $d$
⇒ Summarizes the local connectivity in the network graph
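As an illustration of $P(d) = N(d)/N_v$, the sketch below builds the degree distribution from a degree sequence. The input is the six-vertex sequence 2,2,2,3,3,4 from the earlier Degree slide; the function name `degree_distribution` and the use of exact fractions are choices made here, not part of the slides.

```python
from collections import Counter
from fractions import Fraction


def degree_distribution(degree_seq):
    """Map each observed degree d to P(d) = N(d) / N_v."""
    n_v = len(degree_seq)          # number of vertices N_v
    counts = Counter(degree_seq)   # N(d): number of vertices with degree d
    return {d: Fraction(c, n_v) for d, c in sorted(counts.items())}


P = degree_distribution([2, 2, 2, 3, 3, 4])
for d, prob in P.items():
    print(d, prob)
```

Degrees that never occur are simply absent from the dictionary, i.e., $P(d) = 0$ for them.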
Joint degree distribution
◮ Q: What about patterns of association among nodes of given degrees?
◮ A: Define the two-dimensional analogue of a degree distribution
[Figure: joint degree distributions in two panels, router-level Internet and protein interaction; both axes are $\log_2(\text{Degree})$]
◮ Prob. of random edge having incident vertices with degrees $(d_1, d_2)$
A simple random graph model
◮ Def: The Erdős-Rényi random graph model $G_{n,p}$
◮ Undirected graph with $n$ vertices, i.e., of order $N_v = n$
◮ Edge $(u,v)$ present with probability $p$, independent of other edges
◮ Simulation is easy: draw $\binom{n}{2}$ i.i.d. Bernoulli($p$) RVs
Example
◮ Three realizations of $G_{10,1/6}$. The size $N_e$ is a random variable
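The simulation recipe above can be sketched in a few lines: one Bernoulli($p$) draw per unordered vertex pair. The function name `sample_gnp`, the vertex labels $0, \ldots, n-1$ and the fixed seed are assumptions made for this illustration.

```python
import itertools
import random


def sample_gnp(n, p, rng):
    """Draw one realization of G_{n,p} as a list of edges (u, v) with u < v."""
    edges = []
    for u, v in itertools.combinations(range(n), 2):  # all C(n, 2) vertex pairs
        if rng.random() < p:  # Bernoulli(p) draw, independent across pairs
            edges.append((u, v))
    return edges


rng = random.Random(0)  # seeded only to make the sketch reproducible
realizations = [sample_gnp(10, 1 / 6, rng) for _ in range(3)]
sizes = [len(edges) for edges in realizations]  # N_e varies across draws
print(sizes)
```

The printed sizes generally differ from one realization to the next, in line with $N_e$ being a random variable.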
Degree distribution of $G_{n,p}$
◮ Q: Degree distribution $P(d)$ of the Erdős-Rényi graph $G_{n,p}$?
◮ Define $I\{(v,u)\} = 1$ if $(v,u) \in E$, and $I\{(v,u)\} = 0$ otherwise.
⇒ Fix $v$. For all $u \neq v$, the indicator RVs are i.i.d. Bernoulli($p$)
◮ Let $D_v$ be the (random) degree of vertex $v$. Hence,
$$D_v = \sum_{u \neq v} I\{(v,u)\}$$
⇒ $D_v$ is binomial with parameters $(n-1, p)$ and
$$P(d) = P(D_v = d) = \binom{n-1}{d} p^d (1-p)^{(n-1)-d}$$
◮ In words, the probability of having exactly $d$ edges incident to $v$
⇒ Same for all $v \in V$, by independence of the $G_{n,p}$ model
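A short numerical check of the binomial formula above, evaluated for $G_{10,1/6}$ (the parameters of the earlier example). The function name `gnp_degree_pmf` is illustrative; the check confirms that the PMF sums to one and that its mean is $(n-1)p$.

```python
from math import comb


def gnp_degree_pmf(n, p):
    """P(d) = C(n-1, d) p^d (1-p)^(n-1-d) for d = 0, ..., n-1."""
    return [comb(n - 1, d) * p**d * (1 - p) ** (n - 1 - d) for d in range(n)]


pmf = gnp_degree_pmf(10, 1 / 6)
total = sum(pmf)                                # should be 1 up to rounding
mean = sum(d * q for d, q in enumerate(pmf))    # should be (n - 1) p
print(total, mean)
```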
Behavior for large $n$
◮ Q: What does the degree distribution look like for a large network?
◮ Recall $D_v$ is a sum of $n-1$ i.i.d. Bernoulli($p$) RVs
⇒ Central Limit Theorem: approximately $D_v \sim \mathcal{N}(np, np(1-p))$ for large $n$
[Figure: $P(d)$ versus $d$ in two panels. One shows $p = 0.5$ with $n = 20, 40, 60$; the other compares Binomial(20,1/2), Binomial(60,1/6) and Poisson(10)]
◮ Makes most sense to increase $n$ with fixed $E[D_v] = (n-1)p = \mu$
⇒ Law of rare events: $D_v \sim$ Poisson($\mu$) for large $n$
Law of rare events
◮ Substituting $p = \mu/n$ in the binomial PMF yields
$$P_n(d) = \frac{n!}{(n-d)!\,d!}\left(\frac{\mu}{n}\right)^d\left(1-\frac{\mu}{n}\right)^{n-d} = \frac{n(n-1)\cdots(n-d+1)}{n^d}\,\frac{\mu^d}{d!}\,\frac{(1-\mu/n)^n}{(1-\mu/n)^d}$$
◮ In the limit, the term $(1-\mu/n)^n$ (red on the slide) is $\lim_{n\to\infty}(1-\mu/n)^n = e^{-\mu}$
◮ The terms $n(n-1)\cdots(n-d+1)/n^d$ and $(1-\mu/n)^d$ (black and blue on the slide) converge to 1. Limit is the Poisson PMF
$$\lim_{n\to\infty} P_n(d) = 1\cdot\frac{\mu^d e^{-\mu}}{d!}\cdot\frac{1}{1} = e^{-\mu}\frac{\mu^d}{d!}$$
◮ Approximation usually called “law of rare events”
◮ Individual edges happen with small probability $p = \mu/n$
◮ The aggregate (degree, number of edges), though, need not be rare
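The limit above can be checked numerically. The sketch below compares the Binomial($n$, $\mu/n$) PMF with the Poisson($\mu$) PMF for growing $n$ at $\mu = 10$; the values $n = 20$ and $n = 60$ match the Binomial(20,1/2) and Binomial(60,1/6) curves of the earlier figure, while the larger values of $n$, the truncation `d_max = 30` and all function names are assumptions of this illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(n, p, d):
    """Binomial(n, p) PMF evaluated at d."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)


def poisson_pmf(mu, d):
    """Poisson(mu) PMF evaluated at d."""
    return exp(-mu) * mu**d / factorial(d)


def max_gap(n, mu, d_max=30):
    """Largest pointwise gap between the two PMFs over d = 0, ..., min(n, d_max)."""
    p = mu / n
    return max(
        abs(binomial_pmf(n, p, d) - poisson_pmf(mu, d))
        for d in range(min(n, d_max) + 1)
    )


gaps = [max_gap(n, 10.0) for n in (20, 60, 1000, 100000)]
print(gaps)
```

The gap shrinks as $n$ grows with $\mu$ held fixed, which is the sense in which the Poisson PMF approximates the degree distribution of a large, sparse $G_{n,p}$.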
The $G_{n,p}$ model and real-world networks
◮ For large graphs, $G_{n,p}$ suggests $P(d)$ with an exponential tail
⇒ Unlikely to see degrees spanning several orders of magnitude
[Figure: $P(d)$ versus $d$ in two panels, linear scale and logarithmic scale]
◮ Concentrated distribution around the mean $E[D_v] = (n-1)p$
◮ Q: Is this in agreement with real-world networks?
World Wide Web
◮ Degree distributions of the WWW analyzed in [Broder et al ’00]
⇒ Web is a digraph, so study both in- and out-degree distributions
◮ Majority of vertices naturally have small degrees
⇒ Nontrivial number of vertices with degrees orders of magnitude higher
Internet autonomous systems
◮ The topology of the AS-level Internet studied in [Faloutsos³ ’99]
◮ Right-skewed degree distributions also found for router-level Internet
Seems to be a structural pattern
◮ More heavy-tailed degree distributions found in [Barabasi-Albert ’99]
[Figure: $P(d)$ versus $d$ for three networks: author collaboration, Web graph, power grid]
◮ These heterogeneous, diffuse degree distributions are not exponential
Power laws
Degree distributions
Power-law degree distributions
Visualizing and fitting power laws
Popularity and preferential attachment
Power-law degree distributions
[Figure: two log-log panels of $\log_2(\text{Frequency})$ versus $\log_2(\text{Degree})$]
◮ Log-log plots show roughly a linear decay, suggesting the power law
$$P(d) \propto d^{-\alpha} \;\Rightarrow\; \log P(d) = C - \alpha \log d$$
◮ Power-law exponent (negative slope) is typically $\alpha \in [2,3]$
◮ Normalization constant $C$ is mostly uninteresting
◮ Power laws often best followed in the tail, i.e., for $d \geq d_{\min}$
Power law and exponential degree distributions
[Figure: a power law $P(d) = d^{-2.1}$ and a Poisson distribution plotted against $d$, on linear axes (a) and on log-log axes (b)]
◮ Erdős-Rényi’s Poisson degree distribution exhibits a sharp cutoff
⇒ Power laws upper bound exponential tails for large enough $d$
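To make the tail comparison concrete, the sketch below evaluates the power law $d^{-2.1}$ from the figure legend against a Poisson PMF. The Poisson mean `MU = 10` is an assumption (the value used in the figure is not recoverable here), and the power law is left unnormalized as in the legend.

```python
from math import exp, lgamma, log

MU = 10.0    # assumed Poisson mean; the figure's value is not recoverable
ALPHA = 2.1  # exponent shown in the figure legend


def poisson_pmf(d, mu=MU):
    """Poisson PMF, computed in the log domain so large d does not overflow."""
    return exp(-mu + d * log(mu) - lgamma(d + 1))


def power_law(d, alpha=ALPHA):
    """Unnormalized power law d^(-alpha)."""
    return d ** (-alpha)


# Largest degree in 1..1000 at which the Poisson PMF still matches or exceeds
# the power law; beyond it the power law stays above.
last_poisson_above = max(
    d for d in range(1, 1001) if poisson_pmf(d) >= power_law(d)
)
ratio_at_100 = power_law(100) / poisson_pmf(100)
print(last_poisson_above, ratio_at_100)
```

Under these assumptions the crossover sits only moderately above the Poisson mean, and by $d = 100$ the power law exceeds the Poisson PMF by many orders of magnitude, which is the sharp cutoff seen on the log-log panel.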
Scale-free networks
◮ Scale-free network: degree distribution with power-law tail
◮ Name motivated by the scale-invariance property of power laws
◮ Def: A scale-free function $f(x)$ satisfies $f(ax) = b f(x)$ for all $x$, with $a, b \in \mathbb{R}$ and $b$ depending only on $a$
Example
◮ Power-law functions $f(x) = x^{-\alpha}$ are scale-free since
$$f(ax) = (ax)^{-\alpha} = a^{-\alpha} f(x) = b f(x), \quad \text{where } b := a^{-\alpha}$$
◮ Exponential functions $f(x) = c^x$ are not scale-free because
$$f(ax) = c^{ax} = (c^x)^a = f^a(x) \neq b f(x), \quad \text{except when } a = b = 1$$
◮ No ‘characteristic scale’ for the degrees. More soon
⇒ Functional form of the distribution is invariant to scale
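A small numerical illustration of the definition above: for a scale-free $f$, the ratio $f(ax)/f(x)$ is the same constant $b$ at every $x$. The parameter values ($\alpha = 2.5$, $c = 0.5$, $a = 3$), the test points and the function names are arbitrary choices made for this sketch.

```python
from math import isclose


def power_law(x, alpha=2.5):
    """f(x) = x^(-alpha)."""
    return x ** (-alpha)


def exponential(x, c=0.5):
    """f(x) = c^x."""
    return c**x


def rescaling_ratios(f, a, xs):
    """Ratios f(a x) / f(x); constant in x exactly when f is scale-free."""
    return [f(a * x) / f(x) for x in xs]


xs = [1.0, 2.0, 5.0, 10.0]
power_ratios = rescaling_ratios(power_law, 3.0, xs)   # all equal to 3^(-2.5)
exp_ratios = rescaling_ratios(exponential, 3.0, xs)   # equal c^(2x), varies with x
print(power_ratios)
print(exp_ratios)
```

For the power law every ratio equals $b = a^{-\alpha}$, whereas for the exponential the ratio $c^{(a-1)x}$ changes with $x$, so no single $b$ works.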
Power-law distributions are ubiquitous
◮ Power-law distributions widespread beyond networks [Clauset et al ’07]
Normalization
◮ The power-law degree distribution $P(d) = C d^{-\alpha}$ is a PMF, hence (summing over $d \geq 1$, since $d^{-\alpha}$ diverges at $d = 0$)
$$1 = \sum_{d=1}^{\infty} P(d) = \sum_{d=1}^{\infty} C d^{-\alpha} \;\Rightarrow\; C = \frac{1}{\sum_{d=1}^{\infty} d^{-\alpha}}$$
◮ Often a power law is only valid for the tail $d \geq d_{\min}$, hence
$$C = \frac{1}{\sum_{d=d_{\min}}^{\infty} d^{-\alpha}} \approx \frac{1}{\int_{d_{\min}}^{\infty} x^{-\alpha}\,dx} = (\alpha - 1)\, d_{\min}^{\alpha - 1}$$
⇒ Sound approximation since $P(d)$ varies slowly for large $d$
◮ The normalized power-law degree distribution is
$$P(d) = \frac{\alpha - 1}{d_{\min}} \left(\frac{d}{d_{\min}}\right)^{-\alpha}, \quad d \geq d_{\min}$$
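The quality of the integral approximation for $C$ can be gauged numerically. The sketch below compares $(\alpha - 1) d_{\min}^{\alpha - 1}$ with a constant obtained from the sum itself, which is evaluated as a long truncated sum plus an integral estimate of the neglected remainder. The choice $\alpha = 2.5$, the values of $d_{\min}$, the truncation length and the function names are assumptions of this illustration.

```python
def summed_constant(alpha, d_min, n_terms=200_000):
    """C = 1 / sum_{d >= d_min} d^(-alpha), via a truncated sum plus tail estimate."""
    cutoff = d_min + n_terms
    head = sum(d ** (-alpha) for d in range(d_min, cutoff))
    tail = cutoff ** (1 - alpha) / (alpha - 1)  # integral estimate of the remainder
    return 1.0 / (head + tail)


def approx_constant(alpha, d_min):
    """Integral approximation C ~ (alpha - 1) * d_min^(alpha - 1)."""
    return (alpha - 1) * d_min ** (alpha - 1)


ALPHA = 2.5
ratios = [
    summed_constant(ALPHA, d_min) / approx_constant(ALPHA, d_min)
    for d_min in (1, 10, 100)
]
print(ratios)
```

With these settings the ratio moves toward 1 as $d_{\min}$ grows, consistent with the approximation being sound in the tail and noticeably loose when $d_{\min}$ is small.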