Power Law Size Distributions

Principles of Complex Systems
Course 300, Fall, 2008

Prof. Peter Dodds
Department of Mathematics & Statistics
University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline

◮ Overview: Introduction, Examples, Zipf's law, Wild vs. Mild, CCDFs
◮ References

The Don

Extreme deviations in test cricket

[Figure: plot with a horizontal axis running from 0 to 100.]

Don Bradman's batting average = 166% next best.

Size distributions

The sizes of many systems' elements appear to obey an inverse power-law size distribution:

$$P(\text{size} = x) \sim c\, x^{-\gamma}$$

where $x_{\min} < x < x_{\max}$ and $\gamma > 1$.

◮ Typically, $2 < \gamma < 3$.
◮ $x_{\min}$ = lower cutoff
◮ $x_{\max}$ = upper cutoff
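To get a feel for what sizes drawn from $P(x) \sim c\, x^{-\gamma}$ with cutoffs look like, the following is a minimal sketch that draws samples by inverting the cumulative distribution. The function name `sample_power_law`, the parameter values, and the choice of inverse-transform sampling are illustrative assumptions, not something taken from the slides.

```python
import random


def sample_power_law(gamma, x_min, x_max, u):
    """Map a uniform number u in [0, 1] to a size x on [x_min, x_max]
    distributed as P(x) ~ c * x**(-gamma).

    Assumes gamma != 1 and 0 < x_min < x_max (hypothetical helper).
    The cumulative distribution is
        F(x) = (x**(1-gamma) - a) / (b - a),
    with a = x_min**(1-gamma) and b = x_max**(1-gamma); solving F(x) = u
    for x gives the expression returned below.
    """
    a = x_min ** (1.0 - gamma)
    b = x_max ** (1.0 - gamma)
    return (a + u * (b - a)) ** (1.0 / (1.0 - gamma))


# Illustrative values only: gamma = 2.5 with cutoffs 1 and 10^6.
rng = random.Random(0)
sizes = [sample_power_law(2.5, 1.0, 1e6, rng.random()) for _ in range(10_000)]
sizes.sort(reverse=True)

# A few of the largest samples next to the median sample.
print("largest:", sizes[:3])
print("median: ", sizes[len(sizes) // 2])
```

Comparing the largest few samples with the median gives a quick sense of how far the tail reaches beyond a typical sample.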
Size distributions

◮ Usually, only the tail of the distribution obeys a power law:

$$P(x) \sim c\, x^{-\gamma} \quad \text{as } x \to \infty.$$

◮ Still use term 'power law distribution'

Size distributions

Many systems have discrete sizes $k$:

◮ Word frequency
◮ Node degree (as we have seen): # hyperlinks, etc.
◮ Number of citations for articles, court decisions, etc.

$$P(k) \sim c\, k^{-\gamma}$$

where $k_{\min} \le k \le k_{\max}$

Size distributions

Power law size distributions are sometimes called Pareto distributions after Italian scholar Vilfredo Pareto.

◮ Pareto noted wealth in Italy was distributed unevenly (80–20 rule).
◮ Term used especially by economists

Size distributions

◮ Negative linear relationship in log-log space:

$$\log P(x) = \log c - \gamma \log x$$
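The log-log relationship $\log P(x) = \log c - \gamma \log x$ can be checked numerically. The sketch below builds an exact, noise-free power law and recovers $\gamma$ and $c$ from a straight-line fit to the logarithms. The values $\gamma = 2.5$ and $c = 3$ and the helper `fit_line` are illustrative assumptions; for real, noisy data a straight-line fit of this kind can be unreliable, so treat this only as a demonstration of the identity.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


gamma_true, c_true = 2.5, 3.0                       # illustrative values only
sizes = [10 ** (k / 10) for k in range(0, 41)]      # x from 1 to 10^4
probs = [c_true * x ** (-gamma_true) for x in sizes]

# In log-log space the power law is a line with slope -gamma and intercept log c.
log_x = [math.log10(x) for x in sizes]
log_p = [math.log10(p) for p in probs]
slope, intercept = fit_line(log_x, log_p)

gamma_est = -slope
c_est = 10 ** intercept
print("estimated gamma:", gamma_est)
print("estimated c:    ", c_est)
```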
Size distributions

Examples:

◮ Earthquake magnitude (Gutenberg-Richter law): $P(M) \propto M^{-3}$
◮ Number of war deaths: $P(d) \propto d^{-1.8}$
◮ Sizes of forest fires
◮ Sizes of cities: $P(n) \propto n^{-2.1}$
◮ Number of links to and from websites

Size distributions

Examples:

◮ Number of citations to papers: $P(k) \propto k^{-3}$.
◮ Individual wealth (maybe): $P(W) \propto W^{-2}$.
◮ Distributions of tree trunk diameters: $P(d) \propto d^{-2}$.
◮ The gravitational force at a random point in the universe: $P(F) \propto F^{-5/2}$.
◮ Diameter of moon craters: $P(d) \propto d^{-3}$.
◮ Word frequency: e.g., $P(k) \propto k^{-2.2}$ (variable)

(Note: Exponents range in error; see M.E.J. Newman arxiv.org/cond-mat/0412004v3)

Size distributions

Power-law distributions are..

◮ often called 'heavy-tailed'
◮ or said to have 'fat tails'

Important!:

◮ Inverse power laws aren't the only ones:
◮ lognormals, stretched exponentials, ...

Zipfian rank-frequency plots

George Kingsley Zipf:

◮ noted various rank distributions followed power laws, often with exponent -1 (word frequency, city sizes...)
  "Human Behaviour and the Principle of Least-Effort" [2], Addison-Wesley, Cambridge MA, 1949.
◮ We'll study Zipf's law in depth...
Zipfian rank-frequency plots

Zipf's way:

◮ $s_i$ = the size of the $i$th ranked object.
◮ $i = 1$ corresponds to the largest size.
◮ $s_1$ could be the frequency of occurrence of the most common word in a text.
◮ Zipf's observation:

$$s_i \propto i^{-\alpha}$$

Power law distributions

Gaussians versus power-law distributions:

◮ Example: Height versus wealth.
◮ Mild versus Wild (Mandelbrot)
◮ Mediocristan versus Extremistan
  (See "The Black Swan" by Nassim Taleb [1])

Turkeys...

[Figure from "The Black Swan" [1].]

Taleb's table [1]

Mediocristan / Extremistan

◮ Most typical member is mediocre / Most typical is either giant or tiny
◮ Winners get a small segment / Winner-take-almost-all effects
◮ When you observe for a while, you know what's going on / It takes a very long time to figure out what's going on
◮ Prediction is easy / Prediction is hard
◮ History crawls / History makes jumps
◮ Tyranny of the collective / Tyranny of the accidental
Complementary Cumulative Distribution Function:

CCDF:

◮ $P_{\ge}(x) = P(x' \ge x) = 1 - P(x' < x)$
◮ $= \int_{x'=x}^{\infty} P(x')\, dx'$
◮ $\propto \int_{x'=x}^{\infty} (x')^{-\gamma}\, dx'$
◮ $= \left. \frac{1}{-\gamma+1} (x')^{-\gamma+1} \right|_{x'=x}^{\infty}$
◮ $\propto x^{-\gamma+1}$

Complementary Cumulative Distribution Function:

CCDF:

◮ $P_{\ge}(x) \propto x^{-\gamma+1}$
◮ Use when tail of $P$ follows a power law.
◮ Increases exponent by one.
◮ Useful in cleaning up data.

Complementary Cumulative Distribution Function:

◮ Discrete variables:

$$P_{\ge}(k) = P(k' \ge k) = \sum_{k'=k}^{\infty} P(k') \propto k^{-\gamma+1}$$

◮ Use integrals to approximate sums.

Size distributions

Brown Corpus (1,015,945 words):

[Figure: two panels. CCDF: $N_{>n}$ versus $n$. Zipf: $n_i$ versus rank $i$.]

◮ The, of, and, to, a, ... = 'objects'
◮ 'Size' = word frequency
◮ Beep: CCDF and Zipf plots are related...
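Both views of the same data, the CCDF and the Zipf ranking, can be computed directly from a list of sizes. The sketch below uses word frequencies in a toy sentence as the sizes; the helper names `empirical_ccdf` and `rank_sizes` and the example text are illustrative assumptions, not part of the original analysis of the Brown Corpus.

```python
from collections import Counter


def empirical_ccdf(sizes):
    """Return (value, fraction of objects with size >= value) pairs,
    in ascending order of value (hypothetical helper)."""
    n = len(sizes)
    counts = Counter(sizes)
    remaining = n          # number of objects with size >= current value
    pairs = []
    for value in sorted(counts):
        pairs.append((value, remaining / n))
        remaining -= counts[value]
    return pairs


def rank_sizes(sizes):
    """Zipf ordering: (rank i, size s_i) pairs, with rank 1 the largest."""
    return list(enumerate(sorted(sizes, reverse=True), start=1))


# Toy example: the 'objects' are distinct words, the 'size' is word frequency.
text = "the cat and the dog and the bird"
word_sizes = list(Counter(text.split()).values())

ccdf = empirical_ccdf(word_sizes)
ranks = rank_sizes(word_sizes)
print("CCDF:", ccdf)
print("Zipf:", ranks)
```

For a size that is not tied with any other, multiplying its CCDF value by the number of objects gives its rank, which is the sense in which the two plots carry the same information.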
Size distributions

Observe:

◮ $N P_{\ge}(x)$ = the number of objects with size at least $x$, where $N$ = total number of objects.
◮ If an object has size $x_i$, then $N P_{\ge}(x_i)$ is its rank $i$.
◮ So

$$x_i \propto i^{-\alpha} = \left( N P_{\ge}(x_i) \right)^{-\alpha} \propto x_i^{(-\gamma+1)(-\alpha)}$$

since $P_{\ge}(x) \sim x^{-\gamma+1}$. Matching exponents on both sides requires $(-\gamma+1)(-\alpha) = 1$, so

$$\alpha = \frac{1}{\gamma - 1}$$

A rank distribution exponent of $\alpha = 1$ corresponds to a size distribution exponent $\gamma = 2$.

Details on the lack of scale:

Let's find the mean:

$$\langle x \rangle = \int_{x = x_{\min}}^{x_{\max}} x P(x)\, dx = c \int_{x = x_{\min}}^{x_{\max}} x\, x^{-\gamma}\, dx = \frac{c}{2-\gamma} \left( x_{\max}^{2-\gamma} - x_{\min}^{2-\gamma} \right).$$

The mean:

$$\langle x \rangle \sim \frac{c}{2-\gamma} \left( x_{\max}^{2-\gamma} - x_{\min}^{2-\gamma} \right).$$

◮ Mean blows up with upper cutoff if $\gamma < 2$.
◮ Mean depends on lower cutoff if $\gamma > 2$.
◮ $\gamma < 2$: Typical sample is large.
◮ $\gamma > 2$: Typical sample is small.

And in general...

Moments:

◮ All moments depend only on cutoffs.
◮ No internal scale dominates (even matters).
◮ Compare to a Gaussian, exponential, etc.
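The two results on these slides, the mean's dependence on the cutoffs and the relation $\alpha = 1/(\gamma - 1)$, can be evaluated numerically. The sketch below fixes the constant $c$ by normalizing $P(x)$ on $[x_{\min}, x_{\max}]$, which is an assumption added here because the slides leave $c$ unspecified; the function names and parameter values are likewise illustrative.

```python
def power_law_mean(gamma, x_min, x_max):
    """Mean of P(x) = c * x**(-gamma) on [x_min, x_max].

    Assumption: c is chosen so that P integrates to 1 over the interval.
    The cases gamma = 1 and gamma = 2 need logarithms and are not handled.
    """
    if gamma in (1.0, 2.0):
        raise ValueError("gamma = 1 or 2 requires the logarithmic special case")
    # Normalization: c * (x_max**(1-gamma) - x_min**(1-gamma)) / (1-gamma) = 1
    c = (1.0 - gamma) / (x_max ** (1.0 - gamma) - x_min ** (1.0 - gamma))
    # Mean from the slide: c / (2-gamma) * (x_max**(2-gamma) - x_min**(2-gamma))
    return c / (2.0 - gamma) * (x_max ** (2.0 - gamma) - x_min ** (2.0 - gamma))


def zipf_exponent(gamma):
    """Rank exponent alpha = 1 / (gamma - 1) for a size exponent gamma > 1."""
    return 1.0 / (gamma - 1.0)


# Raise the upper cutoff and watch the mean for gamma below and above 2.
for x_max in (1e2, 1e4, 1e6):
    print(x_max,
          power_law_mean(1.5, 1.0, x_max),   # gamma < 2: grows with x_max
          power_law_mean(2.5, 1.0, x_max))   # gamma > 2: settles near x_min scale

print("alpha for gamma = 2:", zipf_exponent(2.0))
```

With $\gamma = 1.5$ the mean keeps growing as the upper cutoff is raised, while with $\gamma = 2.5$ it stays within a small multiple of the lower cutoff.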