More Mechanisms for Generating Power-Law Distributions
Principles of Complex Systems
Course CSYS/MATH 300, Fall, 2009
Prof. Peter Dodds
Dept. of Mathematics & Statistics
Center for Complex Systems :: Vermont Advanced Computing Center
University of Vermont

Outline
Optimization
  Minimal Cost
  Mandelbrot vs. Simon
  Assumptions
  Model
  Analysis
  Extra
Robustness
  HOT theory
  Self-Organized Criticality
  COLD theory
  Network robustness
References

Another approach
Benoit Mandelbrot
◮ Mandelbrot = father of fractals
◮ Mandelbrot = almond bread
◮ Derived Zipf’s law through optimization [11]
◮ Idea: Language is efficient
◮ Communicate as much information as possible for as little cost
◮ Need measures of information ($H$) and cost ($C$)...
◮ Minimize $C/H$ by varying word frequency
◮ Recurring theme: what role does optimization play in complex systems?

Not everyone is happy...
Mandelbrot vs. Simon:
◮ Mandelbrot (1953): “An Informational Theory of the Statistical Structure of Languages” [11]
◮ Simon (1955): “On a class of skew distribution functions” [14]
◮ Mandelbrot (1959): “A note on a class of skew distribution functions: analysis and critique of a paper by H.A. Simon”
◮ Simon (1960): “Some further notes on a class of skew distribution functions”

Not everyone is happy... (cont.)
Mandelbrot vs. Simon:
◮ Mandelbrot (1961): “Final note on a class of skew distribution functions: analysis and critique of a model due to H.A. Simon”
◮ Simon (1961): “Reply to ‘final note’ by Benoit Mandelbrot”
◮ Mandelbrot (1961): “Post scriptum to ‘final note’”
◮ Simon (1961): “Reply to Dr. Mandelbrot’s post scriptum”

Not everyone is happy... (cont.)
Mandelbrot:
“We shall restate in detail our 1959 objections to Simon’s 1955 model for the Pareto-Yule-Zipf distribution. Our objections are valid quite irrespectively of the sign of p-1, so that most of Simon’s (1960) reply was irrelevant.”
Simon:
“Dr. Mandelbrot has proposed a new set of objections to my 1955 models of the Yule distribution. Like his earlier objections, these are invalid.”
Plankton:
“You can’t do this to me, I WENT TO COLLEGE!” “You weak minded fool!” “That’s it Mister! You just lost your brain privileges,” etc.

Zipfarama via Optimization
Mandelbrot’s Assumptions
◮ Language contains $n$ words: $w_1, w_2, \ldots, w_n$.
◮ $i$th word appears with probability $p_i$
◮ Words appear randomly according to this distribution (obviously not true...)
◮ Words = composition of letters is important
◮ Alphabet contains $m$ letters
◮ Words are ordered by length (shortest first)

Zipfarama via Optimization
Word Cost
◮ Length of word (plus a space)
◮ Word length was irrelevant for Simon’s method
Objection
◮ Real words don’t use all letter sequences
Objections to Objection
◮ Maybe real words roughly follow this pattern (?)
◮ Words can be encoded this way
◮ Na na na-na naaaaa...

Zipfarama via Optimization
Binary alphabet plus a space symbol

  i               1    2    3     4    5     6     7     8
  word            1    10   11    100  101   110   111   1000
  length          1    2    2     3    3     3     3     4
  $1 + \log_2 i$  1    2    2.58  3    3.32  3.58  3.81  4

◮ Word length of the $2^k$th word: $k + 1 = 1 + \log_2 2^k$
◮ Word length of the $i$th word $\simeq 1 + \log_2 i$
◮ For an alphabet with $m$ letters, word length of the $i$th word $\simeq 1 + \log_m i$.

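The table can be reproduced with a short Python sketch. It assumes, as the table suggests, that the $i$th word is simply $i$ written in binary; the helper name `binary_word` is made up for illustration.

```python
import math


def binary_word(i: int) -> str:
    """Return the i-th word (i >= 1), assumed to be i written in binary."""
    return bin(i)[2:]


# Compare the exact word length with the smooth approximation 1 + log2(i).
for i in range(1, 9):
    word = binary_word(i)
    approx = 1 + math.log2(i)
    print(f"i={i}  word={word:>4}  length={len(word)}  1+log2(i)={approx:.2f}")
```

The exact length is a step function of $i$; $1 + \log_2 i$ matches it at powers of two and overshoots by less than one letter in between.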
Zipfarama via Optimization
Total Cost $C$
◮ Cost of the $i$th word: $C_i \simeq 1 + \log_m i$
◮ Cost of the $i$th word plus space: $C_i \simeq 1 + \log_m (i + 1)$
◮ Subtract fixed cost: $C'_i = C_i - 1 \simeq \log_m (i + 1)$
◮ Simplify base of logarithm:
$$C'_i \simeq \log_m (i + 1) = \frac{\log_e (i + 1)}{\log_e m} \propto \ln (i + 1)$$
◮ Total Cost:
$$C \sim \sum_{i=1}^{n} p_i C'_i \propto \sum_{i=1}^{n} p_i \ln (i + 1)$$

Zipfarama via Optimization
Information Measure
◮ Use Shannon’s Entropy (or Uncertainty):
$$H = -\sum_{i=1}^{n} p_i \log_2 p_i$$
◮ (allegedly) von Neumann suggested ‘entropy’...
◮ Proportional to average number of bits needed to encode each ‘word’ based on frequency of occurrence
◮ $-\log_2 p_i = \log_2 1/p_i$ = minimum number of bits needed to distinguish event $i$ from all others
◮ If $p_i = 1/2$, need only 1 bit ($\log_2 1/p_i = 1$)
◮ If $p_i = 1/64$, need 6 bits ($\log_2 1/p_i = 6$)

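A minimal Python sketch of the entropy measure and the two bit-count examples above. The function name `shannon_entropy_bits` and the 64-word uniform distribution are illustrative assumptions, not part of the lecture.

```python
import math


def shannon_entropy_bits(p):
    """Shannon entropy H = -sum_i p_i log2 p_i, skipping zero-probability events."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


# Bits needed to single out one event of probability p_i: log2(1 / p_i).
print(math.log2(1 / (1 / 2)))    # the p_i = 1/2 case
print(math.log2(1 / (1 / 64)))   # the p_i = 1/64 case

# Assumed example: 64 equally likely words, so every word costs the same bits.
uniform = [1 / 64] * 64
print(shannon_entropy_bits(uniform))
```

For a uniform distribution the average equals the per-word bit count; for a skewed distribution, frequent words pull the average down.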
Zipfarama via Optimization
Information Measure
◮ Use a slightly simpler form:
$$H = -\sum_{i=1}^{n} p_i \log_e p_i / \log_e 2 = -g \sum_{i=1}^{n} p_i \ln p_i$$
where $g = 1/\ln 2$

Zipfarama via Optimization
◮ Minimize
$$F(p_1, p_2, \ldots, p_n) = C/H$$
subject to constraint
$$\sum_{i=1}^{n} p_i = 1$$
◮ Tension:
(1) Shorter words are cheaper
(2) Longer words are more informative (rarer)
◮ (Good) question: how much does choice of $C/H$ as function to minimize affect things?

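To get a feel for the objective, the sketch below evaluates $C/H$ for two candidate distributions, using the cost $\sum_i p_i \ln(i+1)$ and the entropy $-g \sum_i p_i \ln p_i$ with $g = 1/\ln 2$. The function names, the vocabulary size `n = 1000`, and the trial exponent 1.73 are illustrative assumptions; this only evaluates the ratio and does not perform the minimization.

```python
import math

G = 1 / math.log(2)  # g = 1/ln 2, converts nats to bits


def cost(p):
    """C = sum_i p_i ln(i + 1), with words indexed from i = 1."""
    return sum(pi * math.log(i + 1) for i, pi in enumerate(p, start=1))


def information(p):
    """H = -g sum_i p_i ln p_i (Shannon entropy in bits)."""
    return -G * sum(pi * math.log(pi) for pi in p if pi > 0)


def cost_per_information(p):
    """The objective F = C / H."""
    return cost(p) / information(p)


n = 1000  # assumed vocabulary size
uniform = [1 / n] * n

# Truncated power law p_i proportional to (i + 1)^(-1.73), renormalized to sum to 1.
weights = [(i + 1) ** -1.73 for i in range(1, n + 1)]
total = sum(weights)
power_law = [w / total for w in weights]

print("uniform   C/H =", cost_per_information(uniform))
print("power law C/H =", cost_per_information(power_law))
```

With these settings the power-law candidate should come out with the smaller ratio: it gives up some entropy but saves more on cost by concentrating probability on short words.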
Zipfarama via Optimization
Time for Lagrange Multipliers:
◮ Minimize
$$\Psi(p_1, p_2, \ldots, p_n) = F(p_1, p_2, \ldots, p_n) + \lambda G(p_1, p_2, \ldots, p_n)$$
where
$$F(p_1, p_2, \ldots, p_n) = \frac{C}{H} = \frac{\sum_{i=1}^{n} p_i \ln (i + 1)}{-g \sum_{i=1}^{n} p_i \ln p_i}$$
and the constraint function is
$$G(p_1, p_2, \ldots, p_n) = \sum_{i=1}^{n} p_i - 1 = 0$$
Insert question 4, assignment 2 (⊞)

Zipfarama via Optimization
Some mild suffering leads to:
◮ $p_j = e^{-1 - \lambda H^2/gC} (j + 1)^{-H/gC} \propto (j + 1)^{-H/gC}$
◮ A power law appears [applause]: $\alpha = H/gC$
◮ Next: sneakily deduce $\lambda$ in terms of $g$, $C$, and $H$.
◮ Find $p_j = (j + 1)^{-H/gC}$

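One way to fill in the suffering is sketched below; it is not necessarily the route intended in the assignment. It takes $\Psi = C/H + \lambda \left( \sum_i p_i - 1 \right)$ with $C = \sum_i p_i \ln(i+1)$ and $H = -g \sum_i p_i \ln p_i$, treats both $C$ and $H$ as functions of the $p_i$ when differentiating, and then fixes $\lambda$ by substituting the result back into the definition of $H$.

```latex
% Stationarity: set the derivative of Psi with respect to p_j to zero.
\begin{align*}
\frac{\partial \Psi}{\partial p_j}
  &= \frac{1}{H}\frac{\partial C}{\partial p_j}
   - \frac{C}{H^2}\frac{\partial H}{\partial p_j} + \lambda
   = \frac{\ln(j+1)}{H} + \frac{gC}{H^2}\left(\ln p_j + 1\right) + \lambda = 0 \\
\Rightarrow\quad \ln p_j
  &= -1 - \frac{\lambda H^2}{gC} - \frac{H}{gC}\ln(j+1) \\
\Rightarrow\quad p_j
  &= e^{-1-\lambda H^2/gC}\,(j+1)^{-H/gC}.
\end{align*}

% Self-consistency: substitute ln p_j back into H and use sum_j p_j = 1.
\begin{align*}
H = -g\sum_{j=1}^{n} p_j \ln p_j
  &= g\left(1 + \frac{\lambda H^2}{gC}\right)\sum_{j=1}^{n} p_j
   + \frac{H}{C}\sum_{j=1}^{n} p_j \ln(j+1) \\
  &= g\left(1 + \frac{\lambda H^2}{gC}\right) + H \\
\Rightarrow\quad \lambda &= -\frac{gC}{H^2},
\qquad p_j = (j+1)^{-H/gC}.
\end{align*}
```

The prefactor $e^{-1-\lambda H^2/gC}$ therefore equals 1, which is why normalization can be used afterwards to pin down the exponent.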
Zipfarama via Optimization
Finding the exponent
◮ Now use the normalization constraint:
$$1 = \sum_{j=1}^{n} p_j = \sum_{j=1}^{n} (j + 1)^{-H/gC} = \sum_{j=1}^{n} (j + 1)^{-\alpha}$$
◮ As $n \to \infty$, the sum becomes $\zeta(\alpha) - 1$, so we end up with $\zeta(H/gC) = 2$ where $\zeta$ is the Riemann Zeta Function
◮ Gives $\alpha \simeq 1.73$ ($> 1$, too high)
◮ If cost function changes ($j + 1 \to j + a$) then exponent is tunable
◮ Increase $a$, decrease $\alpha$

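The exponent can be checked numerically. The sketch below solves $\sum_{j \geq 1} (j + a)^{-\alpha} = 1$ for $\alpha$ by bisection, using only the standard library; $a = 1$ is the $\zeta(\alpha) = 2$ case. The function names, the Euler-Maclaurin tail estimate, and the bisection bracket are choices made for this illustration, not part of the lecture.

```python
def shifted_power_sum(alpha, a=1.0, n_direct=1000):
    """Approximate sum_{j>=1} (j + a)^(-alpha) for alpha > 1.

    The first n_direct - 1 terms are summed directly; the remainder is
    estimated with the leading Euler-Maclaurin terms.
    """
    head = sum((j + a) ** -alpha for j in range(1, n_direct))
    x = n_direct + a
    tail = (x ** (1 - alpha) / (alpha - 1)
            + 0.5 * x ** -alpha
            + alpha * x ** (-alpha - 1) / 12)
    return head + tail


def solve_exponent(a=1.0, lo=1.05, hi=10.0, tol=1e-10):
    """Bisect for alpha such that shifted_power_sum(alpha, a) equals 1.

    Assumes the root lies inside [lo, hi], which holds for moderate a.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shifted_power_sum(mid, a) > 1.0:
            lo = mid  # sum too large: a bigger exponent is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


print("a = 1:", solve_exponent(a=1.0))  # the zeta(alpha) = 2 case
print("a = 2:", solve_exponent(a=2.0))  # shifted cost function
```

Raising `a` shrinks every term of the sum, so a smaller exponent is needed to restore normalization, which matches the last bullet above.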