Concatenation: A complicated story
Concatenated ML • Assumes all sequences evolve down 1 tree • As mutation rate -> 0 • Likelihood dominated by single mutations • When a character changes, it changes once • Maximum Likelihood -> Maximum Parsimony • This is Claim 1
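A minimal sketch of the parsimony score that Claim 1 reduces the likelihood to, computed with Fitch's algorithm on a fixed rooted binary tree. The nested-tuple tree encoding, the function names, and the toy alignment are illustrative assumptions, not taken from the talk. The alignment was hand-built so that the two scores differ; it says nothing about real data.

```python
def fitch(tree, site):
    """Return (candidate_states, changes) for one site on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding).
    site: dict mapping leaf name -> character state at this site.
    """
    if isinstance(tree, str):
        return {site[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, site)
    right_states, right_changes = fitch(right, site)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over every site of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch(tree, site)[1]
    return total


# Hypothetical six-leaf topologies: a caterpillar and a balanced tree.
CATERPILLAR = ((((("A", "B"), "C"), "D"), "E"), "F")
BALANCED = ((("A", "B"), "C"), (("D", "E"), "F"))

# Toy alignment, constructed by hand for illustration only.
TOY = {
    "A": "AACG", "B": "AACG", "C": "AACG",
    "D": "CTCA", "E": "CTGA", "F": "CAGG",
}

if __name__ == "__main__":
    print("caterpillar:", parsimony_score(CATERPILLAR, TOY))
    print("balanced:   ", parsimony_score(BALANCED, TOY))
```

On this toy input the balanced tree needs fewer changes than the caterpillar, so a parsimony criterion would prefer it regardless of which tree generated the data.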
Proposition 1 The six-leaf balanced tree has a lower parsimony score than the unbalanced tree
• As branches shorten, deep coalescence occurs ‘almost always’
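A small sketch of why short branches make deep coalescence near-certain, assuming the standard multispecies coalescent with branch lengths measured in coalescent units. The function names are hypothetical. Two lineages entering a branch of length t fail to coalesce within it with probability exp(-t), and for a rooted three-taxon species tree the gene tree matches the species tree with probability 1 - (2/3)exp(-t).

```python
import math


def p_deep_coalescence(t):
    """Probability that two lineages do NOT coalesce in a branch of length t
    (t in coalescent units, standard coalescent assumed)."""
    return math.exp(-t)


def p_gene_tree_matches(t):
    """Probability that a three-taxon gene tree matches the species tree,
    where t is the length of the single internal branch."""
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    for t in (2.0, 1.0, 0.5, 0.1, 0.01):
        print(f"t={t:<5} deep coalescence={p_deep_coalescence(t):.3f} "
              f"gene tree matches={p_gene_tree_matches(t):.3f}")
```

As t approaches 0 the first probability approaches 1 and the second approaches 1/3, which is the value expected if all three topologies were equally likely.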
Conditions • Concatenated ML is inconsistent for certain caterpillar trees • Our intuition is that this extends to all species trees with n>6 • This proof covers caterpillar species trees with n=6 • However, the mutation rate must be very low • r-state symmetric and infinite-alleles models are equivalent here • Gets worse as you add genes/sites
Partitioned ML • What makes it partitioned? • Only assume constant topology • Branch lengths + substitution matrices can vary • Statistically consistent under ILS? • Not affected by this proof, so maybe?
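One way to write down the distinction, as a sketch rather than the talk's own notation: the symbols G (number of genes), L_g (likelihood of gene g's alignment), and theta (branch lengths plus substitution parameters) are assumptions introduced here.

```latex
% Concatenated (unpartitioned) ML: one parameter set \theta shared by all genes
\hat{T}_{\mathrm{concat}} = \arg\max_{T} \; \max_{\theta} \; \sum_{g=1}^{G} \log L_g(T, \theta)

% Partitioned ML: same topology T, but each gene g gets its own \theta_g
\hat{T}_{\mathrm{part}} = \arg\max_{T} \; \sum_{g=1}^{G} \; \max_{\theta_g} \log L_g(T, \theta_g)
```

The only thing shared across genes in the second form is the topology T, which matches the "only assume constant topology" bullet.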
PANIC • Unpartitioned ML positively misleading • All results for statistical consistency of summary methods assume infinite-length genes • No gene tree error • In the presence of gene tree error: ?!?! • In the presence of recombination: ?!?! • No method proven consistent on fixed-length genes
Keep Calm, Concatenate • Short edges -> concatenated alignments fail • Extinction -> few short edges deep in tree • Failures probably restricted to twigs • Real data -> few deep short edges • But how would you know?
Now what? • Good news: • The species tree is identifiable • A statistically consistent method is possible • Bad news: • We either don’t have one or can’t prove that we do • Good news again: • Concatenation works ok, I guess
Questions?