The phylogenetics of basic word order
Gerhard Jäger
Tübingen University
University of Tübingen, March 24, 2018
Major word orders
Major word orders: Statistics of major word order distribution
1,045 languages, 211 lineages, 32 families with at least 5 languages
data: WALS intersected with ASJP

Pattern   Raw numbers        Weighted by lineages
SOV       491   (47.0%)      139.1   (66.3%)
SVO       442   (42.3%)       49.3   (23.4%)
VSO        79    (7.6%)       11.8    (5.6%)
VOS        19    (1.8%)        4.7    (2.2%)
OVS        11    (1.1%)        4.5    (2.1%)
OSV         3    (0.3%)        0.8    (0.4%)

[Figure: bar charts of pattern frequency, by language and by family]
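The fractional lineage-weighted counts suggest that each lineage contributes a total weight of 1, split evenly among its languages. The slide does not state this, so the following is only a sketch under that assumption; the function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def lineage_weighted_counts(languages):
    """Count word orders so that every lineage contributes a total weight of 1.

    languages: list of (lineage, word_order) pairs, one pair per language.
    Assumption: a language in a lineage of size n gets weight 1/n.
    """
    sizes = Counter(lineage for lineage, _ in languages)
    totals = defaultdict(float)
    for lineage, order in languages:
        totals[order] += 1.0 / sizes[lineage]
    return dict(totals)

# Invented toy sample: one family of four languages, one of one, and an isolate.
TOY_SAMPLE = [
    ("family_1", "SOV"), ("family_1", "SOV"), ("family_1", "SOV"), ("family_1", "SVO"),
    ("family_2", "SVO"),
    ("isolate_1", "VSO"),
]
TOY_WEIGHTED = lineage_weighted_counts(TOY_SAMPLE)
```

Under this weighting the counts sum to the number of lineages rather than the number of languages, so one large family cannot dominate the distribution.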
Major word orders: Previous approaches
Gell-Mann and Ruhlen (2011):
  Proto-world was SOV
  general pathway: SOV → SVO ↔ VSO/VOS
  minor pathway: SOV → OVS/OSV
  exceptions due to diffusion
Ferrer-i-Cancho (2015):
  permutation circle
  transition probability inversely related to path length
[Figure: permutation circle over SOV, SVO, VSO, VOS, OVS, OSV]
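The idea of path length on a permutation circle can be sketched as follows. The ring order used here is an assumption: it places the six orders so that neighbours differ by one swap of adjacent constituents. The 1/distance weighting is likewise only one hypothetical way to make probability "inversely related to path length"; the slide gives no functional form.

```python
# Assumed ring order: each entry differs from the next by one adjacent swap,
# and the last entry (OSV) is again one swap away from the first (SOV).
RING = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]

def ring_distance(a, b):
    """Shortest number of steps between two word orders along the circle."""
    d = abs(RING.index(a) - RING.index(b))
    return min(d, len(RING) - d)

def transition_weights(source):
    """Hypothetical weights proportional to 1/distance, normalised to sum to 1."""
    raw = {t: 1.0 / ring_distance(source, t) for t in RING if t != source}
    total = sum(raw.values())
    return {t: w / total for t, w in raw.items()}

WEIGHTS_FROM_SOV = transition_weights("SOV")
```

With this toy weighting, SOV → SVO and SOV → OSV (one step) are the most likely changes, and SOV → VOS (three steps) is the least likely.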
Major word orders: Previous approaches
Maurits and Griffiths (2014): Bayesian rate estimation, based on five families and NJ-trees
Major word orders: Phylogenetic non-independence
languages are phylogenetically structured
if two closely related languages display the same pattern, they are not two independent data points
⇒ we need to control for phylogenetic dependencies
Major word orders: Phylogenetic non-independence
Maslova (2000):
“If the A-distribution for a given typology cannot be assumed to be stationary, a distributional universal cannot be discovered on the basis of purely synchronic statistical data.”
“In this case, the only way to discover a distributional universal is to estimate transition probabilities and as it were to ‘predict’ the stationary distribution on the basis of the equations in (1).”
The phylogenetic comparative method
The phylogenetic comparative method: Modeling language change
[Figure labels: Markov process; Phylogeny; Branching process]
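A continuous-time Markov process over word orders is usually described by a rate matrix Q, from which the probability of moving between states over a time span t is P(t) = exp(Qt). The sketch below illustrates this with the standard library only; the three-state matrix and its rates are invented for illustration and are not estimates from the talk.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) with a scaled Taylor series and repeated squaring."""
    n = len(q)
    scale = t / 2 ** squarings
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds (Q t / 2^squarings)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

# Invented rates over three of the six word orders; each row sums to zero.
STATES = ["SOV", "SVO", "VSO"]
Q = [[-0.30, 0.25, 0.05],
     [0.10, -0.30, 0.20],
     [0.05, 0.15, -0.20]]
P = transition_matrix(Q, t=2.0)
```

Row i of P gives the distribution over word orders after time t for a language that starts in state i, so every row sums to 1.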
The phylogenetic comparative method: Estimating rates of change
if the phylogeny and the states of extant languages are known, transition rates and ancestral states can be estimated based on a Markov model
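One standard way to do such estimation is Felsenstein's pruning algorithm, which computes the likelihood of the tip states given a tree and rates. The sketch below is a minimal two-state version with a crude grid search; the tree, branch lengths, state coding and the use of the stationary distribution as root prior are all assumptions made for illustration, not the setup of the talk.

```python
import math

def p_two_state(a, b, t):
    """Closed-form transition matrix for two states with rates a (0 to 1) and b (1 to 0)."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def partial(node, tips, a, b):
    """Likelihood of the tips below node, for each possible state at node.

    A node is either a tip name (str) or a tuple of (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if tips[node] == s else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, length in node:
        p = p_two_state(a, b, length)
        below = partial(child, tips, a, b)
        for s in (0, 1):
            like[s] *= sum(p[s][x] * below[x] for x in (0, 1))
    return like

def root_posterior(tree, tips, a, b):
    """Posterior over the root state, with the stationary distribution as prior."""
    prior = [b / (a + b), a / (a + b)]
    joint = [prior[s] * l for s, l in zip((0, 1), partial(tree, tips, a, b))]
    total = sum(joint)
    return [j / total for j in joint]

def likelihood(tree, tips, a, b):
    """Probability of the observed tip states under the model."""
    prior = [b / (a + b), a / (a + b)]
    return sum(pr * l for pr, l in zip(prior, partial(tree, tips, a, b)))

def best_rate(tree, tips, grid):
    """Crude maximum-likelihood search over a grid, assuming a == b."""
    return max(grid, key=lambda r: likelihood(tree, tips, r, r))

# Hypothetical tree ((A:0.1, B:0.1):0.2, C:0.3) with made-up branch lengths.
TREE = (((("A", 0.1), ("B", 0.1)), 0.2), ("C", 0.3))
TIPS = {"A": 0, "B": 0, "C": 1}   # e.g. 0 = SOV, 1 = SVO
```

For example, `likelihood(TREE, TIPS, 0.5, 1.5)` scores one pair of rates and `root_posterior(TREE, TIPS, 0.5, 1.5)` gives the corresponding ancestral state estimate at the root. A Bayesian analysis would place priors on the rates and sample them instead of maximising over a grid.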
Inferring a world tree of languages
Inferring a world tree of languages: From words to trees
Swadesh lists
  → training pair-Hidden Markov Model → sound similarities
  → applying pair-Hidden Markov Model → word alignments
  → classification/clustering → cognate classes
  → feature extraction → character matrix
  → Bayesian phylogenetic inference → phylogenetic tree
[Figure: world tree of languages with families labelled (e.g. Indo-European, Niger-Congo, Afro-Asiatic, Austronesian, Trans-New Guinea, Otomanguean, Mayan, Nakh-Daghestanian) and grouped into the macro-areas Subsaharan Africa, NW Eurasia, Australia/Papua, SE Asia, America]
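The feature-extraction step from cognate classes to a character matrix can be illustrated with a binary presence/absence coding, which is a common choice in lexical phylogenetics. Whether the talk uses exactly this coding is an assumption, and the languages, concepts and class labels below are invented.

```python
def character_matrix(cognates):
    """Turn cognate class assignments into binary characters.

    cognates: {language: {concept: class label, or None if the word is missing}}
    Returns the list of (concept, class) characters and one string per language,
    with "1" = class present, "0" = absent, "?" = concept missing.
    """
    characters = sorted({(concept, label)
                         for entry in cognates.values()
                         for concept, label in entry.items()
                         if label is not None})
    matrix = {}
    for language, entry in cognates.items():
        row = []
        for concept, label in characters:
            if entry.get(concept) is None:
                row.append("?")
            else:
                row.append("1" if entry[concept] == label else "0")
        matrix[language] = "".join(row)
    return characters, matrix

# Invented toy data: two concepts, three languages, one missing entry.
TOY_COGNATES = {
    "lang_a": {"hand": "h1", "water": "w1"},
    "lang_b": {"hand": "h1", "water": "w2"},
    "lang_c": {"hand": "h2", "water": None},
}
CHARACTERS, MATRIX = character_matrix(TOY_COGNATES)
```

Each column of the matrix is one cognate class, so two languages share a "1" in a column exactly when their words for that concept were placed in the same class.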
Estimating word-order transition patterns
Estimating word-order transition patterns: Workflow
estimate posterior tree distributions with MrBayes for each family, using Glottolog as constraint tree
(data from all 32 families with ≥ 5 languages in the database; 778 languages in total)
test whether a universal or a lineage-specific model gives a better fit
estimate transition rates with the best model
estimate the stationary distribution of major word order categories
apply stochastic character mapping (SIMMAP; Bollback 2006)
estimate the expected number of mutations for each transition type
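Once transition rates are available, the stationary distribution is the probability vector π that satisfies πQ = 0. The sketch below finds it by power iteration on a uniformised version of Q; this is one simple numerical route, not necessarily the one used in the talk, and the rate matrix is invented. It assumes that every state can be reached from every other state and that at least one rate is positive.

```python
def stationary_distribution(q, iterations=5000):
    """Approximate the stationary distribution of a rate matrix by power iteration."""
    n = len(q)
    lam = 2.0 * max(-q[i][i] for i in range(n))   # uniformisation constant
    # P = I + Q / lam is a stochastic matrix with the same stationary distribution.
    p = [[(1.0 if i == j else 0.0) + q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi

# Invented rates over three word orders; each row sums to zero.
TOY_Q = [[-0.30, 0.25, 0.05],
         [0.10, -0.30, 0.20],
         [0.05, 0.15, -0.20]]
TOY_PI = stationary_distribution(TOY_Q)
```

The result is the long-run share of time a lineage spends in each word order under the model, which can then be compared with the observed frequencies.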
Estimating word-order transition patterns: Estimating posterior tree distributions
using characters extracted from ASJP data (Jäger 2018)
Glottolog as constraint tree
Γ-distributed rates
ascertainment bias correction
relaxed molecular clock (IGR)
uniform tree prior
stop rule: 0.01, samplefreq=1000
if convergence takes longer than 1,000,000 steps, sample 1,000 trees from the posterior
Estimating word-order transition patterns: Phylogenetic tree sample
[Figure: phylogenetic tree sample]
Estimating word-order transition patterns: Estimating transition rates
totally unrestricted model, all 30 transition rates are estimated independently
implementation using RevBayes (Höhna et al., 2016)
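The count of 30 follows from six word orders with one free rate for every ordered pair of distinct states (6 × 5). The sketch below shows how such rates fill a rate matrix; it is plain Python rather than RevBayes code, and the placeholder rate values are invented.

```python
from itertools import permutations

STATES = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]
PAIRS = list(permutations(STATES, 2))   # all ordered pairs of distinct states

def rate_matrix(rates):
    """Build Q from a dict {(source, target): rate} with one entry per ordered pair."""
    index = {state: i for i, state in enumerate(STATES)}
    q = [[0.0] * len(STATES) for _ in STATES]
    for (source, target), rate in rates.items():
        q[index[source]][index[target]] = rate
    for i in range(len(STATES)):
        # The diagonal is minus the total rate of leaving state i.
        q[i][i] = -sum(q[i][j] for j in range(len(STATES)) if j != i)
    return q

# Placeholder values only: a distinct made-up rate for each of the 30 transitions.
TOY_RATES = {pair: 0.01 * (k + 1) for k, pair in enumerate(PAIRS)}
TOY_RATE_MATRIX = rate_matrix(TOY_RATES)
```

Because no two rates are tied together, the model can express asymmetric change, for example a higher rate for SOV → SVO than for SVO → SOV.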