
Investigating the potential of ancestral state reconstruction algorithms in historical linguistics



  1. Investigating the potential of ancestral state reconstruction algorithms in historical linguistics Gerhard Jäger & Johann-Mattis List Tübingen University & CRLAO / Team AIRE, Paris Capturing Phylogenetic Algorithms for Linguistics, Leiden October 28, 2015 Jäger & List (Tübingen/Paris) Ancestral state reconstruction Leiden 1 / 42

  2. Introduction: What is Ancestral State Reconstruction? While tree-building methods seek branching diagrams that explain how a language family has evolved, ASR methods use these branching diagrams to explain what, concretely, has evolved. Ancestral state reconstruction is very common in evolutionary biology but only sporadically practiced in computational historical linguistics (Bouchard-Côté et al. 2013). In classical historical linguistics, on the other hand, the reconstruction of proto-forms and proto-meanings is very common and one of the main goals of the classical comparative method (Fox 1995).

  3. Introduction: ASR of Lexical Replacement Patterns. If we look for words corresponding to one meaning in a wordlist and know which of the words are cognate or not, we may ask which of the word forms was the most likely candidate to be used in the proto-language of all descendant languages. This question resembles the task of "semantic reconstruction", but in contrast to classical semantic reconstruction, we are only operating within one concept slot here, disregarding all words with a different meaning which may also be cognate with the words in our sample. As a result of this restriction, it is quite likely that we cannot recover a good candidate word form (cognate set) for the proto-language. It is, however, very interesting to see to which degree we can propose the original form from our data.

  4. Introduction: ASR of Lexical Replacement Patterns. [Figure: six word forms, each glossed "head": Kopf, head, kop, tête, testa, cap.]


  6. Introduction: ASR of Lexical Replacement Patterns. [Figure: the word forms Kopf, head, kop, tête, testa, cap, each glossed "head", with the ancestral forms marked as unknown ("?").]

  7. Introduction: ASR of Lexical Replacement Patterns. [Figure: the word forms Kopf, head, kop, tête, testa, cap, each glossed "head", with the ancestral forms *kop "head" and testa "head" filled in.]

  8. Introduction: ASR of Lexical Replacement Patterns. [Figure: the word forms Kopf, head, kop, tête, testa, cap, each glossed "head", with the ancestral forms *kaput- "head", *haubud- "head", caput "head", *kop "head" and testa "head" filled in.]
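The "one concept slot" setting can be made concrete with a small data sketch. The word forms below come from the slide; the cognate class labels are an assumption read off the figure (head and cap are grouped because both are shown as descending from *kaput-), and the binary presence/absence coding is one common way to prepare such data for ASR, not necessarily the coding used in the talk.

```python
# Word forms for the concept "head", taken from the slide.
# ASSUMPTION: the cognate class labels are read off the figure and are
# illustrative only; they are not data from the talk.
cognate_class = {
    "Kopf": "kop",
    "head": "kaput",
    "kop": "kop",
    "tête": "testa",
    "testa": "testa",
    "cap": "kaput",
}


def binary_characters(assignment):
    """Turn one multistate character (the cognate class of each word form)
    into one presence/absence vector per cognate class."""
    taxa = list(assignment)  # keeps the insertion order of the dict
    classes = sorted(set(assignment.values()))
    return {
        cls: [1 if assignment[taxon] == cls else 0 for taxon in taxa]
        for cls in classes
    }


characters = binary_characters(cognate_class)
for cls, vector in characters.items():
    print(cls, vector)
```

Each word form belongs to exactly one class within the concept slot, so the vectors sum to 1 in every column.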

  9. Introduction: This talk. Reconstruction of the cognate class at the root. [Figure: tree with leaf cognate classes A, C, A, B, C, B and an unknown root ("?").]

  10. Introduction: This talk. Reconstruction of the cognate class at the root. [Figure: tree with leaf cognate classes A, C, A, B, C, B and the root reconstructed as B.]
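To illustrate what reconstructing the cognate class at the root involves, here is a minimal sketch of Fitch parsimony, one of the simplest ASR procedures. It is not claimed to be the method used in the talk. The leaf states A, C, A, B, C, B are taken from the slide, but the tree topology and the leaf names `l1` to `l6` are hypothetical, since the slide text does not preserve the branching structure.

```python
def fitch(tree, leaf_state):
    """Bottom-up pass of Fitch parsimony on a binary tree.

    A tree is either a leaf name (str) or a pair (left, right).
    Returns (set of most parsimonious states at this node,
             minimal number of state changes in the subtree).
    """
    if isinstance(tree, str):
        return frozenset([leaf_state[tree]]), 0
    left, right = tree
    left_states, left_cost = fitch(left, leaf_state)
    right_states, right_cost = fitch(right, leaf_state)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # The children disagree: keep all candidates and count one change.
    return left_states | right_states, left_cost + right_cost + 1


# ASSUMPTION: hypothetical topology; only the leaf states come from the slide.
tree = ((("l1", "l2"), "l3"), ("l4", ("l5", "l6")))
leaf_state = {"l1": "A", "l2": "C", "l3": "A", "l4": "B", "l5": "C", "l6": "B"}

root_states, changes = fitch(tree, leaf_state)
print(sorted(root_states), changes)
```

With this particular topology, parsimony alone cannot decide between A and B at the root. A different topology over the same leaves can yield a unique answer, which is why the result depends on the tree that is supplied.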

  11. Materials and Methods: Materials. Data

  12. Materials and Methods: Materials. Data. IELex: 153 Indo-European doculects; 207 concepts; entries for Proto-Indo-European for 135 concepts, used as gold standard; arbitrarily split into training set and test set: training set with 67 concepts and 1127 cognate classes (83 occur in PIE), test set with 68 concepts and 957 cognate classes (79 occur in PIE). ABVD: 100 doculects selected at random from 743 Austronesian doculects; 210 concepts; entries for Proto-Austronesian for 154 of them; split into training set and test set: training set with 81 concepts and 1695 cognate classes (88 occur in PAn), test set with 74 concepts and 1584 cognate classes (79 occur in PAn).
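The counts on this slide show how rare a "correct" cognate class is: only a small share of the classes in each split is attested in the proto-language. The short calculation below uses the numbers from the slide; the dictionary layout and the function name `gold_share` are illustrative choices.

```python
# (data set, split) -> (cognate classes, classes that occur in the proto-language)
# The numbers are those given on the slide.
counts = {
    ("IELex", "training"): (1127, 83),
    ("IELex", "test"): (957, 79),
    ("ABVD", "training"): (1695, 88),
    ("ABVD", "test"): (1584, 79),
}


def gold_share(total_classes, classes_in_proto):
    """Fraction of cognate classes that occur in the proto-language."""
    return classes_in_proto / total_classes


shares = {key: gold_share(*value) for key, value in counts.items()}
for (dataset, split), share in shares.items():
    print(f"{dataset} {split}: {share:.1%} of cognate classes occur in the proto-language")
```

The shares lie between roughly 5% and 8%, so a reconstruction method has to pick out a small minority of classes in each data set.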

  13. Materials and Methods: Methods. Prerequisites: Trees. Trees were inferred with the full data set (training + test data) via Bayesian inference. IELex outgroup: Anatolian. ABVD outgroup: [not recoverable from the slide text; the label "Malayo-Polynesian" appears nearby]. Trees used: random samples of 1000 trees from the posterior distributions, and maximum clade credibility trees. [Figure: two trees with doculect names as leaf labels, one for the Austronesian sample and one for Indo-European.]
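As a reminder of what a maximum clade credibility tree is, the sketch below picks, from a sample of trees, the one whose clades are best supported across the sample. This follows the standard definition (product of clade frequencies) and is not the authors' tooling; the four-leaf toy sample and the leaf names `a` to `d` are hypothetical, and branch lengths are ignored.

```python
import math
from collections import Counter


def clades(tree):
    """Return (leaf set of this subtree, clades of all internal nodes in it).

    A tree is either a leaf name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def mcc_tree(sample):
    """Return the sampled tree with the highest product of clade frequencies."""
    per_tree = [clades(tree)[1] for tree in sample]
    counts = Counter()
    for tree_clades in per_tree:
        counts.update(set(tree_clades))
    n = len(sample)

    def log_credibility(tree_clades):
        # Sum of logs instead of a product, to avoid numerical underflow.
        return sum(math.log(counts[c] / n) for c in tree_clades)

    best = max(range(n), key=lambda i: log_credibility(per_tree[i]))
    return sample[best]


# ASSUMPTION: a toy "posterior sample" of three trees over hypothetical leaves.
sample = [
    (("a", "b"), ("c", "d")),
    (("a", "b"), ("c", "d")),
    (("a", "c"), ("b", "d")),
]
print(mcc_tree(sample))
```

In this toy sample the clades {a, b} and {c, d} occur in two of three trees, so the first topology is selected.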
