Further evidence for punctuated language evolution Gerhard Jäger Tübingen, March 29, 2017 Gerhard Jäger Punctuated evolution Tübingen, March 29, 2017 1 / 34

Punctuated equilibrium
Gould and Eldredge (1977): surprising lack of intermediate stages in fossil record
possible explanation:
  evolutionary change occurs primarily during speciation phases
  when a species is in equilibrium, it neither changes nor speciates

Punctuated equilibrium
both the existence of the phenomenon and the causal explanation are still contentious in biology
Possible causal mechanism
  large population sizes stabilize species
  mutations, even beneficial ones, rarely reach fixation
  speciation leads to small populations (bottlenecks) → accelerated evolution

Punctuated language evolution
Dixon (1997): same logic applies to language change as well
  rapid tree-like diversification (as in the history of the IE languages) is the exception in human history
  situations like Australia prior to European conquest, with an equilibrium between diversification and contact, and hence no tree-like structure, are the rule
rejected by most historical linguists, especially by experts on Australian languages

Quantitative approaches 1
Pagel et al. (2006)
  amount of evolutionary change is reflected in path lengths of phylogenetic tree (without molecular clock)
  if punctuational hypothesis is true, there should be positive correlation between length of a path (tip to root) and number of nodes on that path
  correlation is tested via phylogenetic regression (PGLS)
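
The two quantities compared in this test can be read directly off a tree. The following is a minimal sketch, assuming a made-up four-tip tree encoded as nested tuples (the encoding, the branch lengths and the names `TREE`, `root_to_tip` and `pearson` are illustrative, not taken from the talk). It uses a plain Pearson correlation for illustration only. PGLS, the method named on the slide, additionally corrects for the non-independence of tips that share ancestry, which this sketch does not do.

```python
from math import sqrt

# Hypothetical toy tree: (branch_length, children) for internal nodes,
# (branch_length, "name") for tips. All values are invented for illustration.
TREE = (0.0, [
    (0.1, "A"),
    (0.3, [
        (0.2, "B"),
        (0.4, [
            (0.3, "C"),
            (0.2, "D"),
        ]),
    ]),
])


def root_to_tip(node, length=0.0, nodes=0, out=None):
    """Collect (tip, path length, number of internal nodes on the path) per tip."""
    if out is None:
        out = []
    branch, payload = node
    length += branch
    if isinstance(payload, str):
        # Reached a tip: record the accumulated length and node count.
        out.append((payload, length, nodes))
    else:
        # Internal node: it counts as one more node on every path below it.
        for child in payload:
            root_to_tip(child, length, nodes + 1, out)
    return out


def pearson(xs, ys):
    """Plain Pearson correlation (NOT phylogenetically corrected)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


paths = root_to_tip(TREE)
node_counts = [n for _, _, n in paths]
path_lengths = [l for _, l, _ in paths]
r = pearson(node_counts, path_lengths)
print(paths)
print(round(r, 3))
```

In this toy tree the tips behind more splits also sit at the end of longer paths, which is the pattern the punctuational hypothesis predicts.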

Quantitative approaches 1
[Figure: typical shape of a tree instantiating punctuated evolution]

Quantitative approaches 1
Atkinson et al. (2008): apply this logic to Bayesian trees, based on manual cognacy data, from Austronesian, Bantu, Indo-European and Polynesian

Desiderata
Results only for small number of large and well-studied language families
Based on manual cognate judgments → possible source of implicit bias
Addressed in Holman and Wichmann (2016) using a different technical approach
Next part of this talk:
  Use Atkinson et al.’s method
  Applied to phylogenies from 6,000+ ASJP doculects, using automatically obtained characters for phylogenetic inference

The Automated Similarity Judgment Program
Collaborative data collection project around Cecil Brown, Eric Holman, Søren Wichmann and others
covers about 7,000 languages and dialects
basic vocabulary of 40 words for each language, in uniform phonetic transcription
freely available
used concepts: I, you, we, one, two, person, fish, dog, louse, tree, leaf, skin, blood, bone, horn, ear, eye, nose, tooth, tongue, knee, hand, breast, liver, drink, see, hear, die, come, sun, star, water, stone, fire, path, mountain, night, full, new, name

Automated Similarity Judgment Program

concept   Latin           English
I         ego             Ei
you       tu              yu
we        nos             wi
one       unus            w3n
two       duo             tu
person    persona, homo   pers3n
fish      piskis          fiS
dog       kanis           dag
louse     pedikulus       laus
tree      arbor           tri
leaf      foly~u*         lif
skin      kutis           skin
blood     saNgw~is        bl3d
bone      os              bon
horn      kornu           horn
ear       auris           ir
eye       okulus          Ei
nose      nasus           nos
tooth     dens            tu8
tongue    liNgw~E         t3N
knee      genu            ni
hand      manus           hEnd
breast    pektus, mama    brest
liver     yekur           liv3r
drink     bibere          drink
see       widere          si
hear      audire          hir
die       mori            dEi
come      wenire          k3m
sun       sol             s3n
star      stela           star
water     akw~a           wat3r
stone     lapis           ston
fire      iNnis           fEir

PMI string similarity
Pointwise Mutual Information (PMI) between two sound classes a and b:
\[ \mathrm{PMI}(a, b) \doteq \log \frac{P(a, b \text{ are homologous})}{P(a)\,P(b)} \]
automatically trained from ASJP data (Jäger, 2013)
PMI similarity between two strings: aggregate PMI score for optimal pairwise alignment of those strings
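
The two definitions on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: the probabilities passed to `pmi` are invented, the scoring function `toy_score` is a hypothetical stand-in for a PMI matrix trained on ASJP data, and the linear gap penalty of -2.5 is a simplification chosen for brevity (the talk does not specify the gap model). The alignment itself is the standard Needleman-Wunsch global alignment.

```python
from math import log


def pmi(p_homologous, p_a, p_b):
    """PMI of two sound classes from (assumed) probabilities."""
    return log(p_homologous / (p_a * p_b))


def alignment_score(x, y, score, gap=-2.5):
    """Best global alignment score (Needleman-Wunsch, linear gap penalty).

    `score(a, b)` returns the segment-pair score; `gap` is a hypothetical
    penalty per skipped segment.
    """
    # prev[j] holds the best score for aligning the processed prefix of x
    # with the first j symbols of y.
    prev = [j * gap for j in range(len(y) + 1)]
    for i, a in enumerate(x, 1):
        cur = [i * gap]
        for j, b in enumerate(y, 1):
            cur.append(max(
                prev[j - 1] + score(a, b),  # align a with b
                prev[j] + gap,              # a against a gap
                cur[j - 1] + gap,           # b against a gap
            ))
        prev = cur
    return prev[-1]


def toy_score(a, b):
    """Hypothetical stand-in for trained PMI scores: match 2.0, mismatch -1.0."""
    return 2.0 if a == b else -1.0


print(pmi(0.02, 0.1, 0.1))
print(alignment_score("fiS", "fisk", toy_score))
```

With trained PMI scores in place of `toy_score`, similar but non-identical sounds (such as S and s) would receive a positive score instead of a flat mismatch penalty.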

Calibrated PMI similarity
[Table: PMI similarities between English words (Ei, yu, wi, w3n, tu, fiS, …) and Swedish words (yog, du, vi, et, tvo, fisk, …)]
Cell values in extraction order (row/column layout not recoverable): −7.77, 0.75, −7.68, −7.90, −8.57, −10.50, −7.62, 0.33, −5.71, −7.41, 2.66, −8.57, −2.72, −2.83, −1.34, −6.45, 0.70, 4.04, −5.47, −7.87, −5.47, −6.43, −1.83, −4.70, −7.91, −4.27, −3.64, −4.57, −6.98, 0.39, −7.45, −11.2, −3.07, −9.97, −8.66, 7.58
values along diagonal give similarity between candidates for cognacy (possibility of meaning change is disregarded)
values off diagonal provide sample of similarity distribution between non-cognates

Calibrated PMI similarity
let s be the PMI similarity between the English and Swedish word for concept c
calibrated string similarity: −log(probability that random word pairs are more similar than s)
language similarity: average word similarity for all concepts
[Figure: English vs. Swedish; PMI similarity (axis from −25 to 15) for word pairs with different meaning and with same meaning]
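
The calibration step can be sketched as follows. This is an illustration under assumptions, not the implementation used in the talk: the 3×3 matrix is invented, the add-one smoothing (which keeps the logarithm finite when no background pair exceeds s) is an assumption, and "word similarity" in the language-level average is taken to mean the calibrated value.

```python
from math import log


def calibrated_similarity(s, background):
    """-log of the smoothed share of background pairs more similar than s.

    The +1 terms are an assumed smoothing so the result stays finite.
    """
    exceed = sum(1 for b in background if b > s)
    return -log((exceed + 1) / (len(background) + 1))


def language_similarity(matrix):
    """Average calibrated similarity of the diagonal (same-meaning) pairs.

    matrix[i][j] is the PMI similarity between the word for concept i in
    one language and the word for concept j in the other.
    """
    n = len(matrix)
    # Off-diagonal cells: word pairs with different meanings (background sample).
    background = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    total = sum(calibrated_similarity(matrix[i][i], background) for i in range(n))
    return total / n


# Invented PMI similarities for three concepts in two hypothetical languages.
toy = [
    [5.0, -3.0, -4.0],
    [-2.0, 4.0, -6.0],
    [-5.0, -1.0, -7.0],
]
print(language_similarity(toy))
```

A same-meaning pair scores high only if it is more similar than almost all different-meaning pairs, so the measure adapts to how similar unrelated words in the two languages tend to be by chance.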

Cognate clustering
clustering of ASJP strings into automatically inferred cognate classes (take “cognate” with a grain of salt)
supervised learning, based on expert cognacy judgments as goldstandard (Jäger and Sofroniev, 2016; Jäger et al., 2017)
goldstandard sources (only the 40 ASJP concepts were used):

Dataset     Source                       Words   Concepts  Languages  Families          Cognate classes
ABVD        Greenhill et al. (2008)      2,306   34        100        Austronesian      409
Afrasian    Militarev (2000)             770     39        21         Afro-Asiatic      351
Chinese     Běijīng Dàxué (1964)         422     20        18         Sino-Tibetan      126
Huon        McElhanon (1967)             441     32        14         Trans-New Guinea  183
IELex       Dunn (2012)                  2,089   40        52         Indo-European     318
Japanese    Hattori (1973)               387     39        10         Japonic           74
Kadai       Peiros (1998)                399     39        12         Tai-Kadai         102
Kamasau     Sanders and Sanders (1980)   270     36        8          Torricelli        59
Mayan       Brown et al. (2008)          1,113   40        30         Mayan             241
Miao-Yao    Peiros (1998)                206     36        6          Hmong-Mien        69
Mixe-Zoque  Cysouw et al. (2006)         355     40        10         Mixe-Zoque        79
Mon-Khmer   Peiros (1998)                579     40        16         Austroasiatic     232
ObUgrian    Zhivlov (2011)               769     39        21         Uralic            68
total                                    10,106  40        318        13                2,311

Cognate clustering
calibrated word similarity and language similarity were used as predictors to train a Support Vector Machine → probability of being cognate for each pair of synonymous ASJP entries
Label Propagation (Raghavan et al., 2007) for clustering
0.84 B-cubed F-score with cross-validation on goldstandard data
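
For readers unfamiliar with the evaluation measure, here is a minimal sketch of the B-cubed F-score for a single set of words. The item names and cluster labels are invented, and the function name `b_cubed_f` is hypothetical. How the talk aggregates scores across concepts and cross-validation folds is not stated on the slide and is not modelled here.

```python
def b_cubed_f(gold, predicted):
    """B-cubed F-score of a predicted clustering against a gold clustering.

    Both arguments map each item to a cluster label and must cover the
    same items.
    """
    items = list(gold)
    precision = recall = 0.0
    for i in items:
        same_pred = {j for j in items if predicted[j] == predicted[i]}
        same_gold = {j for j in items if gold[j] == gold[i]}
        overlap = len(same_pred & same_gold)
        # Per-item precision: share of i's predicted cluster that is truly
        # cognate with i. Per-item recall: share of i's true class recovered.
        precision += overlap / len(same_pred)
        recall += overlap / len(same_gold)
    precision /= len(items)
    recall /= len(items)
    return 2 * precision * recall / (precision + recall)


# Invented example: four words, gold classes vs. an automatic clustering.
gold = {"w1": 1, "w2": 1, "w3": 1, "w4": 2}
predicted = {"w1": "x", "w2": "x", "w3": "y", "w4": "y"}
print(round(b_cubed_f(gold, predicted), 4))
```

Because precision and recall are averaged per item rather than per cluster, the score does not depend on how the cluster labels are named, only on which items are grouped together.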

Cognate clustering
[Table: 41 ASJP entries for the concept "eye"; columns: doculect, glot_fam, transcription, concept. The pairing of doculects with transcriptions is not recoverable from the source.]
Doculects (glot_fam Indo-European unless noted): LOWER_SORBIAN, KASHUBIAN, LOWER_SORBIAN_2, MACEDONIAN, CZECH, VLACH, BELARUSIAN, BOSNIAN, BULGARIAN, CROATIAN, OLD_CHURCH_SLAVONIC, BAINOUK_GUNYAAMOLO (Atlantic-Congo), UPPER_SORBIAN, UPPER_SORBIAN, USINO (Nuclear_Trans_New_Guinea), POLISH, SERBOCROATIAN, UKRAINIAN, SLOVAK, SLOVENIAN, TURIA_AROMANIAN, CHAKMA_UnnamedInSource, DALMATIAN, ASSAMESE, FRIULIAN, ITALIAN, ITALIAN_GROSSETO_TUSCAN, NORTHERN_LOW_SAXON, DORASQUE (Chibchan), NORTH_FRISIAN_AMRUM, STELLINGWERFS, SARDINIAN_CAMPIDANESE, SARDINIAN_LOGUDARESE, SICILIAN_UnnamedInSource, SPANISH, SARDINIAN, JUDEO_ESPAGNOL, LATIN, NEAPOLITAN_CALABRESE, ROMANIAN_2, ROMANIAN_MEGLENO
Transcriptions in extraction order: oko, wokwo, voko, woko, oko, okklu, voka, oko, oko, oko, voCko, voko, oko, g3li, ogo, oko, oko, oko, oko, oko, okLu, soku, sog, vaklo, voli, okkyo, oko, ok, ok, uk, ogu, oxu, okru, okiu, oho, wokLu, okyo, oxo, okulus, woky3, oky