NETWORK SCIENCE Scale-free Networks Prof. Marcello Pelillo Ca’ Foscari University of Venice a.y. 2016/17
The power law distribution: Discrete vs. Continuous formalism

Discrete Formalism
As node degrees are always positive integers, the discrete formalism captures the probability that a node has exactly $k$ links:
$p_k = C k^{-\gamma}$
$\sum_{k=1}^{\infty} p_k = 1$
$C \sum_{k=1}^{\infty} k^{-\gamma} = 1 \;\Rightarrow\; C = \frac{1}{\sum_{k=1}^{\infty} k^{-\gamma}} = \frac{1}{\zeta(\gamma)}$, where $\zeta(\gamma)$ is the Riemann zeta function
$p_k = \frac{k^{-\gamma}}{\zeta(\gamma)}$

Continuous Formalism
In analytical calculations it is often convenient to assume that the degrees can take any positive real value:
$p(k) = C k^{-\gamma}$
$\int_{k_{min}}^{\infty} p(k)\,dk = 1$
$C = \frac{1}{\int_{k_{min}}^{\infty} k^{-\gamma}\,dk} = (\gamma - 1)\,k_{min}^{\gamma - 1}$
$p(k) = (\gamma - 1)\,k_{min}^{\gamma - 1}\,k^{-\gamma}$
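The two normalization constants can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names, the exponent $\gamma = 2.1$, the choice $k_{min} = 1$ and the truncation point `K` are illustrative assumptions, and $\zeta(\gamma)$ is approximated by a partial sum plus the leading Euler-Maclaurin tail terms instead of a library call.

```python
import math

def zeta(s, K=1000):
    """Approximate the Riemann zeta function for s > 1.

    Sums the first K-1 terms exactly, then adds the leading
    Euler-Maclaurin tail terms (integral from K to infinity
    plus half of the K-th term).
    """
    head = sum(k ** -s for k in range(1, K))
    tail = K ** (1 - s) / (s - 1) + 0.5 * K ** -s
    return head + tail

def discrete_pk(k, gamma):
    """Discrete formalism: p_k = k^(-gamma) / zeta(gamma), k = 1, 2, ..."""
    return k ** -gamma / zeta(gamma)

def continuous_pk(k, gamma, k_min=1.0):
    """Continuous formalism: p(k) = (gamma-1) * k_min^(gamma-1) * k^(-gamma)."""
    return (gamma - 1) * k_min ** (gamma - 1) * k ** -gamma

def continuous_tail(k, gamma, k_min=1.0):
    """Closed form of the integral of p(k) from k to infinity."""
    return (k_min / k) ** (gamma - 1)

if __name__ == "__main__":
    gamma = 2.1  # illustrative exponent, not prescribed by the slide
    print("zeta(gamma)               =", zeta(gamma))
    print("discrete p_1              =", discrete_pk(1, gamma))
    print("continuous p(k_min)       =", continuous_pk(1.0, gamma))
    print("integral from k_min to oo =", continuous_tail(1.0, gamma))
```

The discrete constant depends only on $\gamma$, while the continuous one also depends on $k_{min}$, which is why the two formalisms give slightly different values at small $k$.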
Power law
a) Numbers of occurrences of words in the novel Moby Dick by Herman Melville.
b) Numbers of citations to scientific papers published in 1981, from time of publication until June 1997.
c) Numbers of hits on web sites by 60000 users of the America Online Internet service for the day of 1 December 1997.
d) Numbers of copies of bestselling books sold in the US between 1895 and 1965.
e) Number of calls received by AT&T telephone customers in the US for a single day.
f) Magnitude of earthquakes in California between January 1910 and May 1992. Magnitude is proportional to the logarithm of the maximum amplitude of the earthquake, and hence the distribution obeys a power law even though the horizontal axis is linear.
g) Diameter of craters on the moon. Vertical axis is measured per square kilometre.
h) Peak gamma-ray intensity of solar flares in counts per second, measured from Earth orbit between February 1980 and November 1989.
i) Intensity of wars from 1816 to 1980, measured as battle deaths per 10 000 of the population of the participating countries.
j) Aggregate net worth in dollars of the richest individuals in the US in October 2003.
k) Frequency of occurrence of family names in the US in the year 1990.
l) Populations of US cities in the year 2000.
From: Newman 2006
The 80/20 rule
“80% of the wealth is in the hands of the richest 20% of people.”
Other examples
• 80% of problems can be attributed to 20% of causes
• 80% of a company's profits come from 20% of its customers
• 80% of a company's complaints come from 20% of its customers
• 80% of a company's profits come from 20% of the time its staff spends
• 80% of a company's revenue comes from 20% of its products
• 80% of a company's sales are made by 20% of its sales staff
• …
Vilfredo Pareto (1848–1923), Italian economist, political scientist and philosopher, who made important contributions to our understanding of income distribution and to the analysis of individuals' choices. A number of fundamental principles are named after him, such as Pareto efficiency, the Pareto distribution (another name for a power-law distribution) and the Pareto principle (or 80/20 law).
WORLD WIDE WEB
Snapshots of the World Wide Web sample mapped out by Hawoong Jeong in 1998 [1]. The sequence of images shows an increasingly magnified local region of the network. The first panel displays all 325,729 nodes, offering a global view of the full dataset. Nodes with more than 50 links are shown in red and nodes with more than 500 links in purple. The closeups reveal the presence of a few highly connected nodes, called hubs, which accompany scale-free networks.
WORLD WIDE WEB
Nodes: WWW documents
Links: URL links
Over 3 billion documents
ROBOT: collects all URLs found in a document and follows them recursively
R. Albert, H. Jeong, A.-L. Barabási, Nature 401, 130 (1999).
Network Science: Scale-Free Property
Section 3 Hubs
The difference between a power law and an exponential distribution
[Figure: (a) linear and (b) log-log plots of $p_k$ versus $k$, comparing a Poisson distribution with the power law $p_k \sim k^{-2.1}$; panels (c), (d).]
The difference between a power law and an exponential distribution
Let us use the WWW to illustrate the properties of the high-$k$ regime. The probability to have a node with $k \sim 100$ is
• About $p_{100} \simeq 10^{-30}$ if $p_k$ follows a Poisson distribution
• About $p_{100} \simeq 10^{-4}$ if $p_k$ follows a power law.
Consequently, if the WWW were a random network, the Poisson prediction would give an expected $10^{-18}$ nodes with degree $k > 100$, that is, effectively none. For a power law degree distribution, we expect about $N_{k>100} = 10^{9}$ nodes with degree $k > 100$.
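The gap between the two tails can be explored with a short calculation. This is a sketch under stated assumptions: the slide does not give the average degree or the exponent behind its estimates, so `mean_degree` is a hypothetical parameter and $\gamma = 2.1$ is borrowed from the figure on the previous slide. The Poisson value depends very strongly on the assumed average degree, so the output is not expected to reproduce the rounded figures quoted above.

```python
import math

def log10_poisson_pmf(k, mean_degree):
    """log10 of the Poisson probability exp(-<k>) * <k>^k / k!.

    Computed in log space so that extremely small values do not underflow.
    """
    ln_p = -mean_degree + k * math.log(mean_degree) - math.lgamma(k + 1)
    return ln_p / math.log(10)

def log10_power_law_pmf(k, gamma, K=1000):
    """log10 of the discrete power law p_k = k^(-gamma) / zeta(gamma).

    zeta(gamma) is approximated by a partial sum plus an
    Euler-Maclaurin tail correction.
    """
    zeta = (sum(j ** -gamma for j in range(1, K))
            + K ** (1 - gamma) / (gamma - 1)
            + 0.5 * K ** -gamma)
    return (-gamma * math.log(k) - math.log(zeta)) / math.log(10)

if __name__ == "__main__":
    k = 100
    gamma = 2.1  # assumption: exponent taken from the figure slide
    print("power law: log10 p_100 =", log10_power_law_pmf(k, gamma))
    for mean_degree in (5.0, 10.0, 20.0):  # hypothetical average degrees
        print("Poisson, <k> =", mean_degree,
              ": log10 p_100 =", log10_poisson_pmf(k, mean_degree))
```

Multiplying either probability by the number of nodes $N$ gives the expected number of nodes of that degree, which is how the slide arrives at its node counts.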
The difference between a power law and an exponential distribution
[Figure, panels (a)-(d). Plot axes: number of links ($k$) versus number of nodes with $k$ links. Map labels: Chicago, Boston, Los Angeles.]
POISSON: Most nodes have the same number of links. No highly connected nodes.
POWER LAW: Many nodes with only a few links. A few hubs with a large number of links.
The size of the largest hub
All real networks are finite → let us explore the consequences.
→ We have an expected maximum degree, $k_{max}$.
Estimating $k_{max}$:
$\int_{k_{max}}^{\infty} p(k)\,dk \approx \frac{1}{N}$
Why: we expect at most one node with degree $> k_{max}$ (natural upper cutoff).
$\int_{k_{max}}^{\infty} p(k)\,dk = (\gamma - 1)\,k_{min}^{\gamma - 1} \int_{k_{max}}^{\infty} k^{-\gamma}\,dk = \frac{(\gamma - 1)\,k_{min}^{\gamma - 1}}{(-\gamma + 1)} \left[ k^{-\gamma + 1} \right]_{k_{max}}^{\infty} = \frac{k_{min}^{\gamma - 1}}{k_{max}^{\gamma - 1}} \approx \frac{1}{N}$
$k_{max} = k_{min}\,N^{\frac{1}{\gamma - 1}}$
The size of the largest hub
$k_{max} = k_{min}\,N^{\frac{1}{\gamma - 1}}$
To illustrate the difference in the maximum degree of an exponential and a scale-free network, let us return to the WWW sample, consisting of $N \approx 3 \times 10^5$ nodes. As $k_{min} = 1$, if the degree distribution were to follow an exponential, (4.17) predicts that the maximum degree should be $k_{max} \approx 14$ for $\lambda = 1$. In a scale-free network of similar size and $\gamma = 2.1$, (4.18) predicts $k_{max} \approx 95{,}000$, a remarkable difference. Note that the largest in-degree of the WWW map of Image 4.1 is 10,721, which is comparable to the $k_{max}$ predicted for a scale-free network. This reinforces our conclusion that in a random network hubs are effectively forbidden, while in scale-free networks they are naturally present.
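The two estimates quoted for the WWW sample can be reproduced with a few lines. This is a minimal sketch: the function names are illustrative, and the exponential cutoff $k_{max} = k_{min} + \ln N / \lambda$ is not written out on these slides; it is assumed here to be the formula that reference (4.17) points to, obtained by applying the same natural-cutoff argument to $p(k) \sim e^{-\lambda k}$.

```python
import math

def kmax_exponential(N, k_min=1.0, lam=1.0):
    """Natural cutoff for an exponential degree distribution.

    Assumed form: k_max = k_min + ln(N) / lam.
    """
    return k_min + math.log(N) / lam

def kmax_scale_free(N, gamma, k_min=1.0):
    """Natural cutoff for a power law: k_max = k_min * N^(1 / (gamma - 1))."""
    return k_min * N ** (1.0 / (gamma - 1.0))

if __name__ == "__main__":
    N = 3e5  # size of the WWW sample quoted on the slide
    print("exponential, lambda = 1  :", kmax_exponential(N))
    print("scale-free, gamma = 2.1  :", kmax_scale_free(N, 2.1))
```

With $N = 3 \times 10^5$ and $k_{min} = 1$ these evaluate to roughly 14 and roughly 95,000, matching the values in the text.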
The size of the largest hub
Expected maximum degree, $k_{max}$:
$k_{max} = k_{min}\,N^{\frac{1}{\gamma - 1}}$
• $k_{max}$ increases with the size of the network: the larger a system is, the larger its biggest hub.
• $\gamma > 2$: $k_{max}$ increases slower than $N$. The largest hub will contain a decreasing fraction of links as $N$ increases.
• $\gamma = 2$: $k_{max} \sim N$. The size of the biggest hub is $O(N)$.
• $\gamma < 2$: $k_{max}$ increases faster than $N$: condensation phenomena. The largest hub will grab an increasing fraction of links. Anomaly!
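The three regimes can be seen by tracking the ratio $k_{max}/N$ as the network grows. This is an illustrative sketch only: `hub_share` is a hypothetical helper, $k_{max}/N$ is used as a crude proxy for the hub's share of the network rather than an exact fraction of links, and the values of $\gamma$ and $N$ are arbitrary choices.

```python
def hub_share(N, gamma, k_min=1.0):
    """Return k_max / N, with k_max = k_min * N^(1 / (gamma - 1)).

    A ratio above 1 means the formula predicts a hub degree larger than
    the number of nodes, which is the anomaly noted for gamma < 2.
    """
    k_max = k_min * N ** (1.0 / (gamma - 1.0))
    return k_max / N

if __name__ == "__main__":
    sizes = (1e3, 1e6, 1e9)
    for gamma in (2.5, 2.0, 1.5):
        shares = [hub_share(N, gamma) for N in sizes]
        print("gamma =", gamma, "-> k_max / N for N = 1e3, 1e6, 1e9:", shares)
```

For $\gamma = 2.5$ the ratio shrinks with $N$, for $\gamma = 2$ it stays constant, and for $\gamma = 1.5$ it grows without bound.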
The size of the largest hub
[Figure: $k_{max}$ versus $N$ on log-log axes ($N$ from $10^2$ to $10^{12}$), with three curves: SCALE-FREE, $k_{max} \sim N^{\frac{1}{\gamma - 1}}$; RANDOM NETWORK, $k_{max} \sim \ln N$; and the line $(N - 1)$.]
The estimated degree of the largest node in scale-free and random networks with the same average degree $\langle k \rangle = 3$. For the scale-free network we chose $\gamma = 2.5$. For comparison, we also show the linear behavior, $k_{max} \sim N - 1$, expected for a complete network. Overall, hubs in a scale-free network are several orders of magnitude larger than the biggest node in a random network with the same $N$ and $\langle k \rangle$.
$k_{max} = k_{min}\,N^{\frac{1}{\gamma - 1}}$
The meaning of scale-free
Random Network. Randomly chosen node: $k = \langle k \rangle \pm \langle k \rangle^{1/2}$. Scale: $\langle k \rangle$.
Scale-Free Network. Randomly chosen node: $k = \langle k \rangle \pm \infty$. Scale: none.
The meaning of scale-free
$k = \langle k \rangle \pm \sigma_k$
For a random network the standard deviation follows $\sigma = \langle k \rangle^{1/2}$, shown as a green dashed line on the figure. The symbols show $\sigma$ for nine of the ten reference networks, calculated using the values shown in Table 4.1. The actor network has a very large $\langle k \rangle$ and $\sigma$, hence it is omitted for clarity. For each network $\sigma$ is larger than the value expected for a random network with the same $\langle k \rangle$. The only exception is the power grid, which is not scale-free. While the phone call network is scale-free, it has a large $\gamma$, hence it is well approximated by a random network.
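The comparison between $\sigma$ and $\langle k \rangle^{1/2}$ can be tried on synthetic data. This is a rough sketch, not the procedure used for the reference networks: the helper names are hypothetical, the degrees are drawn from the continuous power law by inverse transform sampling (so they are real-valued rather than integers), and $\gamma = 2.5$, $k_{min} = 1$ and the seed are arbitrary choices.

```python
import math
import random

def degree_stats(degrees):
    """Return (mean, standard deviation) of a list of degrees."""
    mean = sum(degrees) / len(degrees)
    var = sum((k - mean) ** 2 for k in degrees) / len(degrees)
    return mean, math.sqrt(var)

def sample_power_law(n, gamma, k_min=1.0, rng=None):
    """Draw n values from p(k) = (gamma-1) k_min^(gamma-1) k^(-gamma), k >= k_min.

    Inverse transform sampling: the tail function (k_min / k)^(gamma - 1)
    is set equal to a uniform number in (0, 1] and solved for k.
    """
    rng = rng or random.Random(0)
    return [k_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
            for _ in range(n)]

if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 5):
        mean, sigma = degree_stats(sample_power_law(n, 2.5))
        print("N =", n, " <k> =", mean, " sigma =", sigma,
              " random-network value <k>^(1/2) =", math.sqrt(mean))
```

For $\gamma < 3$ the sample standard deviation tends to keep growing with the sample size instead of settling near $\langle k \rangle^{1/2}$, which is the sense in which the distribution lacks a scale.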
Section 5 Universality
INTERNET BACKBONE
Nodes: computers, routers
Links: physical lines
(Faloutsos, Faloutsos and Faloutsos, 1999)
SCIENCE CITATION INDEX
Nodes: papers
Links: citations
25 H.E. Stanley,... 1736 PRL papers (1988) 578...
$P(k) \sim k^{-\gamma}$ ($\gamma = 3$)
(S. Redner, 1998)
SCIENCE COAUTHORSHIP
Nodes: scientists (authors)
Links: joint publications
M: math, NS: neuroscience
(Newman, 2000; Barabási et al., 2001)