
What explains power laws in language typology and language change?

Gerhard Jäger (gerhard.jaeger@uni-tuebingen.de), U Tübingen. Freiburg, January 19, 2011.


  1. Projecting observed data on a lower-dimensional manifold

  2. Projecting observed data on a lower-dimensional manifold

  3. Projecting observed data on a lower-dimensional manifold

  4. Projecting observed data on a lower-dimensional manifold

  5. Projecting observed data on a lower-dimensional manifold

  6. Projecting observed data on a lower-dimensional manifold

  7. Smoothing the partitions: from smoothed extensions we can recover smoothed partitions; each pixel is assigned to the category in which it has the highest degree of membership.
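The assignment rule on this slide can be sketched as follows. This is a minimal illustration, not the procedure used in the talk: the function name `smoothed_partition`, the category names, and the membership degrees are all made up for the example.

```python
def smoothed_partition(membership):
    """Assign each pixel to the category with the highest membership degree.

    membership: dict mapping category name -> list of membership degrees,
    one entry per pixel (hypothetical data layout).
    """
    categories = sorted(membership)
    n_pixels = len(membership[categories[0]])
    labels = []
    for p in range(n_pixels):
        # Ties go to the alphabetically first category.
        best = max(categories, key=lambda c: membership[c][p])
        labels.append(best)
    return labels


# Made-up membership degrees for four pixels and three categories.
membership = {
    "red":    [0.7, 0.4, 0.1, 0.0],
    "yellow": [0.2, 0.5, 0.3, 0.1],
    "green":  [0.1, 0.1, 0.6, 0.9],
}
labels = smoothed_partition(membership)
print(labels)
```

Applied to every chip of the Munsell chart, the resulting label list is the smoothed partition for one speaker.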

  8. Smoothed partitions of the color space

  9. Smoothed partitions of the color space

  10. Smoothed partitions of the color space

  11. Smoothed partitions of the color space

  12. Smoothed partitions of the color space

  13. Smoothed partitions of the color space

  14. Smoothed partitions of the color space

  15. Smoothed partitions of the color space

  16. Smoothed partitions of the color space

  17. Smoothed partitions of the color space

  18. Convexity. Note: so far, we have only used information from the WCS; the location of the 330 Munsell chips in L*a*b* space has played no role. Still, partition cells apparently always form continuous clusters in L*a*b* space. Hypothesis (Gärdenfors): extensions of color terms always form convex regions of L*a*b* space.

  19. Support Vector Machines: supervised learning technique; smart algorithm to classify data in a high-dimensional space by a (for instance) linear boundary; minimizes the number of misclassifications if the training data are not linearly separable. [Figure: SVM classification plot of two classes, red and green, with both axes (x and y) running from −3 to 3.]

  20. Convex partitions: a binary linear classifier divides an n-dimensional space into two convex half-spaces; the intersection of two convex sets is itself convex; hence the intersection of k binary classifications leads to convex sets. Procedure: if a language partitions the Munsell space into m categories, train m(m − 1)/2 binary SVMs, one for each pair of categories in L*a*b* space. This leads to m convex sets (which need not split the L*a*b* space exhaustively).
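A toy sketch of why pairwise linear classifiers yield convex cells that need not cover the whole space. No SVMs are trained here: the three boundaries below are hand-picked, hypothetical hyperplanes in a 2-D space standing in for trained classifiers, and the names `boundaries` and `convex_cell` are illustrative only.

```python
def n_pairwise_classifiers(m):
    """Number of binary classifiers needed for m categories: m(m - 1)/2."""
    return m * (m - 1) // 2


# Hypothetical linear boundaries (w, bias), one per pair of categories.
# For the pair (a, b): score = w . x + bias; score > 0 means a wins, else b.
boundaries = {
    ("A", "B"): ((1.0, 0.0), 0.0),
    ("B", "C"): ((0.0, 1.0), 0.0),
    ("C", "A"): ((-1.0, -1.0), 1.0),
}
categories = ["A", "B", "C"]


def convex_cell(point, categories, boundaries):
    """Return the category that wins all its pairwise contests, or None.

    Each cell is an intersection of half-spaces and therefore convex.
    """
    wins = {c: 0 for c in categories}
    for (a, b), (w, bias) in boundaries.items():
        score = sum(wi * xi for wi, xi in zip(w, point)) + bias
        wins[a if score > 0 else b] += 1
    for c in categories:
        if wins[c] == len(categories) - 1:
            return c
    return None  # the point lies in none of the convex cells


for point in [(2.0, 0.0), (-1.0, 1.0), (0.0, -1.0), (0.2, 0.2)]:
    print(point, convex_cell(point, categories, boundaries))
```

The last point is beaten cyclically (A beats B, B beats C, C beats A), so it falls outside all three cells. This is the sense in which the m convex sets need not split the space exhaustively.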

  21. Convex approximation

  22. Convex approximation

  23. Convex approximation

  24. Convex approximation

  25. Convex approximation

  26. Convex approximation

  27. Convex approximation

  28. Convex approximation

  29. Convex approximation

  30. Convex approximation

  31. Convex approximation: on average, 93.7% of all Munsell chips are correctly classified by the convex approximation. [Figure: proportion of correctly classified Munsell chips, axis from 0.80 to 0.95.]

  32. Convex approximation: compare to the outcome of the same procedure without PCA, and with PCA but using a random permutation of the Munsell chips. [Figure: degree of convexity (%), axis from 20 to 100, for the three conditions 1, 2, 3.]

  33. Convex approximation: the choice of m = 10 is somewhat arbitrary; the outcome does not depend very much on this choice, though. [Figure: mean degree of convexity (%), axis from 50 to 100, against the no. of principal components used, 0 to 50.]

  34. Implicative universals: the first six features correspond nicely to the six primary colors white, black, red, green, blue, yellow. According to Kay et al. (1997) (and many other authors), there is a simple system of implicative universals regarding possible partitions of the primary colors.

  35. Implicative universals (source: Kay et al. 1997). Possible partitions of the primary colors by stage:
      Stage I: [white/red/yellow, black/green/blue]
      Stage II: [white, red/yellow, black/green/blue]
      Stage III: [white, red/yellow, green/blue, black]; [white, red, yellow, black/green/blue]; [white, red, yellow/green/blue, black]; [white, red, yellow/green, black/blue]
      Stage IV: [white, red, yellow, green/blue, black]; [white, red, yellow, green, black/blue]; [white, red, yellow/green, blue, black]
      Stage V: [white, red, yellow, green, blue, black]

  36. Partition of the primary colors: each speaker/term pair can be projected to a 15-dimensional vector; the primary colors correspond to the first 6 entries; each primary color is assigned to the term for which it has the highest value; this defines, for each speaker, a partition over the primary colors.
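The extraction step can be sketched like this. It is only an illustration under stated assumptions: the order of the six primary colors within the vector is assumed (the slides do not give it), the vectors are cut to their first six entries, and the term names and values are invented.

```python
# Assumed order of the first six vector entries; not stated in the slides.
PRIMARIES = ["white", "black", "red", "green", "blue", "yellow"]


def primary_partition(term_vectors):
    """Group primary colors by the term with the highest value for each.

    term_vectors: dict mapping a speaker's term -> vector whose first six
    entries belong to the primary colors (hypothetical data layout).
    """
    cells = {}
    for i, color in enumerate(PRIMARIES):
        best_term = max(sorted(term_vectors), key=lambda t: term_vectors[t][i])
        cells.setdefault(best_term, set()).add(color)
    return {frozenset(cell) for cell in cells.values()}


# Invented values for one speaker with four color terms.
speaker = {
    "term1": [0.9, 0.0, 0.1, 0.0, 0.0, 0.6],
    "term2": [0.0, 0.1, 0.8, 0.1, 0.0, 0.3],
    "term3": [0.0, 0.2, 0.0, 0.7, 0.6, 0.1],
    "term4": [0.1, 0.9, 0.1, 0.2, 0.4, 0.0],
}
partition = primary_partition(speaker)
print(sorted(sorted(cell) for cell in partition))
```

Representing a partition as a set of frozensets makes two speakers' partitions directly comparable, which is convenient for counting partition types across the database.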

  37. Partition of the primary colors, for instance: sample speaker from Pirahã (see above). [Figure: chart for the sample speaker, rows A to J, columns 1 to 40.] Extracted partition: [white/yellow, red, green/blue, black]. Supposedly impossible, but occurs 61 times in the database.

  38. Partition of primary colors, most frequent partition types:
      1. {white}, {red}, {yellow}, {green, blue}, {black} (41.9%)
      2. {white}, {red}, {yellow}, {green}, {blue}, {black} (25.2%)
      3. {white}, {red, yellow}, {green, blue, black} (6.3%)
      4. {white}, {red}, {yellow}, {green}, {black, blue} (4.2%)
      5. {white, yellow}, {red}, {green, blue}, {black} (3.4%)
      6. {white}, {red}, {yellow}, {green, blue, black} (3.2%)
      7. {white}, {red, yellow}, {green, blue}, {black} (2.6%)
      8. {white, yellow}, {red}, {green, blue, black} (2.0%)
      9. {white}, {red}, {yellow}, {green, blue, black} (1.6%)
      10. {white}, {red}, {green, yellow}, {blue, black} (1.2%)

  39. Partition of primary colors: 87.1% of all speaker partitions obey Kay et al.'s universals; the ten partitions that conform to the universals occupy ranks 1, 2, 3, 4, 6, 7, 9, 10, 16, 18; on the basis of these counts, the decision what counts as an exception seems somewhat arbitrary.

  40. Partition of primary colors, more fundamental problem: partition frequencies are distributed according to a power law, frequency ∼ rank^−1.99; there is no natural cutoff point to distinguish regular from exceptional partitions. [Figure: log-log plot of frequency against rank.]
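One common way to obtain an exponent like the one on this slide is a least-squares line through the rank-frequency points on log-log axes. The slides do not say how the exponent was estimated, so the sketch below is an assumption about method, and the data are synthetic.

```python
import math


def fit_power_law(frequencies):
    """Least-squares fit of log(frequency) = log(scale) + slope * log(rank).

    Returns (slope, scale). Frequencies must be positive.
    """
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


# Synthetic frequencies that follow rank^-2 exactly (not data from the talk).
synthetic = [500.0 * rank ** -2.0 for rank in range(1, 51)]
slope, scale = fit_power_law(synthetic)
print(round(slope, 3), round(scale, 1))
```

A straight line on log-log axes has no privileged break, which is the point the slide makes about the missing cutoff between regular and exceptional partitions.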

  41. Partition of seven most important colors: frequency ∼ rank^−1.64. [Figure: log-log plot of frequency against rank.]

  42. Partition of eight most important colors: frequency ∼ rank^−1.46. [Figure: log-log plot of frequency against rank.]

  43. Power laws

  44. Power laws

  45. Power laws (from Newman 2006)

  46. Power laws are not everywhere

  47. Other linguistic power law distributions: number of vowel systems and their frequency of occurrence (from Schwartz et al. 1997)
      vowels | frequencies of the attested systems
      3      | 14
      4      | 14, 5, 4, 2
      5      | 97, 3
      6      | 26, 12, 12
      7      | 23, 6, 5, 4, 3
      8      | 6, 3, 3, 2
      9      | 7, 7, 3

  48. Other linguistic power law distributions: frequency ∼ rank^−1.06. [Figure: log-log plot of frequency against rank.]

  49. Other linguistic power law distributions: size of language families (source: Ethnologue); frequency ∼ rank^−1.32. [Figure: log-log plot of frequency against rank.]

  50. Other linguistic power law distributions: number of speakers per language, in million (source: Ethnologue); frequency ∼ rank^−1.01. [Figure: log-log plot of frequency (in million) against rank.]

  51. The World Atlas of Language Structures: large-scale typological database, conducted mainly by the MPI EVA, Leipzig; 2,650 languages in total are used; 142 features, with between 120 and 1,370 languages per feature; available online.

  52. The World Atlas of Language Structures: Maslova 2008, “Meta-typological distributions”. Hypothesis: pick a random value for each feature; estimate the probability that a random language has this value; the likelihood that an arbitrarily chosen feature value has a probability x is proportional to a power of x. This only holds for the most frequent 30% of all types; for the entire range of type frequencies, the hypothesis can be rejected. [Figure: log-log plot relating x and P(p(type) ≤ x).]

  53. The World Atlas of Language Structures: however, Maslova is perhaps right in the assumption that languages are power-law distributed across WALS types; it is worth testing this within features rather than across features. Problem: the number of feature values is usually too small for statistical evaluation. Solution: cross-classification of two (randomly chosen) features; only those feature pairs are considered that lead to at least 30 non-empty feature value combinations. Pilot study with 10 such feature pairs.
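The cross-classification step can be sketched as follows, assuming each feature is available as a mapping from language to feature value. The language names and values below are placeholders, not WALS data, and the helper names are invented for this example.

```python
from collections import Counter


def cross_classify(feature_a, feature_b):
    """Count languages per combination of values of two features.

    Only languages coded for both features are counted.
    """
    shared = feature_a.keys() & feature_b.keys()
    return Counter((feature_a[lang], feature_b[lang]) for lang in shared)


def is_usable(counts, min_types=30):
    """Apply the threshold from the slide: at least 30 non-empty combinations."""
    return len(counts) >= min_types


# Placeholder codings for two features.
feature_a = {"lang1": "a1", "lang2": "a2", "lang3": "a1", "lang4": "a3"}
feature_b = {"lang1": "b1", "lang2": "b2", "lang3": "b1", "lang5": "b1"}

counts = cross_classify(feature_a, feature_b)
print(counts.most_common())
print(is_usable(counts))
```

The sorted counts of such a cross-classification are the type frequencies whose distribution is then tested for power-law behavior.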

  54. The World Atlas of Language Structures: Feature 1: Consonant-Vowel Ratio; Feature 2: Subtypes of Asymmetric Standard Negation; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  55. The World Atlas of Language Structures: Feature 1: Weight Factors in Weight-Sensitive Stress Systems; Feature 2: Ordinal Numerals; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  56. The World Atlas of Language Structures: Feature 1: Third Person Zero of Verbal Person Marking; Feature 2: Subtypes of Asymmetric Standard Negation; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  57. The World Atlas of Language Structures: Feature 1: Relationship between the Order of Object and Verb and the Order of Adjective and Noun; Feature 2: Expression of Pronominal Subjects; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  58. The World Atlas of Language Structures: Feature 1: Plurality in Independent Personal Pronouns; Feature 2: Asymmetrical Case-Marking; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  59. The World Atlas of Language Structures: Feature 1: Locus of Marking: Whole-language Typology; Feature 2: Number of Cases; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  60. The World Atlas of Language Structures: Feature 1: Prefixing vs. Suffixing in Inflectional Morphology; Feature 2: Coding of Nominal Plurality; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  61. The World Atlas of Language Structures: Feature 1: Prefixing vs. Suffixing in Inflectional Morphology; Feature 2: Ordinal Numerals; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  62. The World Atlas of Language Structures: Feature 1: Coding of Nominal Plurality; Feature 2: Asymmetrical Case-Marking; Kolmogorov-Smirnov test: positive. [Figure: log-log plot of Pr(X ≥ x) against x.]

  63. The World Atlas of Language Structures: Feature 1: Position of Case Affixes; Feature 2: Ordinal Numerals; Kolmogorov-Smirnov test: negative. [Figure: log-log plot of Pr(X ≥ x) against x.]

  64. Why power laws? Critical states. Power laws are characteristic of critical states: only small ice crystals in water above the freezing point; one big ice crystal in water below the freezing point; during the transition from liquid to solid state, ice crystals of many sizes, power-law distributed. There is a similar effect for all kinds of phase transitions in physics; power laws are considered the fingerprint of criticality.

  65. Why power laws? Self-organized criticality. Some systems tend to return to a critical state due to their internal dynamics (see Bak et al. 1987); this is a well-studied effect in computer simulations of cellular automata. Candidates for real-life examples are earthquakes, forest fires, breakdowns of electricity networks, landscape formation, avalanches, ...

  66. Why power laws? The sandpile model: cellular automaton, loosely inspired by real sand piles. Each cell has a certain value, its slope; single grains are added at random, increasing the slope. If the slope of a cell exceeds a critical value, its slope is reduced by r and the slope of the four neighboring cells is increased by 1. This may turn neighboring cells into the critical state, leading to further shifts (see the computer simulation).
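A minimal simulation of the rules on this slide might look as follows. It is not the simulation shown in the talk. Several details are assumptions taken from the standard Bak-Tang-Wiesenfeld formulation: the critical value is 3, a toppling cell loses 4 (one unit for each neighbor), and grains pushed over the edge of the grid are lost. Grid size, number of grains, and seed are arbitrary.

```python
import random


def topple(grid, critical=3):
    """Relax the grid in place and return the number of topplings.

    Assumed rule: a cell whose slope exceeds `critical` loses 4 and each of
    its four neighbors gains 1; units leaving the grid are discarded.
    """
    n = len(grid)
    size = 0
    unstable = [(i, j) for i in range(n) for j in range(n)
                if grid[i][j] > critical]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] <= critical:
            continue  # already relaxed by an earlier step
        grid[i][j] -= 4
        size += 1
        if grid[i][j] > critical:
            unstable.append((i, j))
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n:
                grid[ni][nj] += 1
                if grid[ni][nj] > critical:
                    unstable.append((ni, nj))
    return size


def simulate(n=10, grains=2000, seed=0):
    """Drop grains at random cells; record the avalanche size after each."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    sizes = []
    for _ in range(grains):
        i, j = rng.randrange(n), rng.randrange(n)
        grid[i][j] += 1
        sizes.append(topple(grid))
    return grid, sizes


grid, sizes = simulate()
print("largest avalanche:", max(sizes))
```

Tabulating how often each avalanche size occurs in `sizes` and plotting the counts on log-log axes is the usual way to inspect such a run for power-law behavior.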
