Character # Taxon 1 2 3 4 5 6 7 8 9 10 A 0 0 0 0 0 - PowerPoint PPT Presentation

Character # Taxon 1 2 3 4 5 6 7 8 9 10 A 0 0 0 0 0 0 0 0 0 0 B 1 0 0 0 0 1 1 1 1 1 C 0 1 1 1 0 1 1 1 1 1 D 0 0 0 0 1 1 1 1 1 0

✬ ✩ ✬ ✩ B 1 B C ✫ ✪ D ✫ ✪ 10 A C 6 7 8 9 2 3 4 D 5 A

Interestingly, without polarization Hennig’s method can infer unrooted trees. We can get the tree topology, but be unable to tell paraphyletic from monophyletic groups. The outgroup method amounts to inferring an unrooted tree and then rooting the tree on the branch that leads to an outgroup.

D B 5 1 B C A D 10 C A 6 7 8 9 2 3 4

Inadequacy of logic Unfortunately, though Hennigian logic is valid we quickly find that we do not have a reliable method of generating accurate homology statements. The logic is valid, but we don’t know that the premises are true. In fact, we almost always find that it is impossible for all of our premises to be true.

Character conflict Homo sapiens AGTTCAAGT Rana catesbiana AATTCAAGT Drosophila melanogaster AGTTCAAGC C. elegans AATTCAAGC The red character implies that either ( Homo + Drosophila ) is a group (if G is derived) and/or ( Rana + C. elegans ) is a group. The green character implies that either ( Homo + Rana ) is a group (if T is derived) and/or ( Drosophila + C. elegans ) is a group. The green and red character cannot both be correct.

Character # Taxon 1 2 3 4 5 6 7 8 9 10 11 12 A 0 0 0 0 0 0 0 0 0 0 0 0 B 1 0 0 0 0 1 1 1 1 1 1 1 C 0 1 1 1 0 1 1 1 1 1 1 0 D 0 0 0 0 1 1 1 1 1 0 0 1

✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ ✬ ✩ A B C ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪ ✬ ✩ D ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪ ✫ ✪

Character conflict Two characters are compatible if they can both be mapped on the same tree so that all of the character states displayed could be homologous. Incompatible characters are evidence of homoplasy in the data Homoplasy literally means the “same change” has occurred more than once in the evolutionary history of the group. The presence of homoplasy undermines Hennigian analyses.

white = space of all possible matrices

blue = space of matrices with the pattern: A B C D - * * -

red = space of matrices with the pattern: A B C D - * - *

yellow = space of matrices with the pattern: A B C D - - * *

all eight categories of matrices

blue = space of matrices compatible with tree: (A,(B,C),D)

blue = space of matrices compatible with tree: (A,C,(B,D))

blue = space of matrices compatible with tree: (A,B,(C,D))

Hennigian: grey = any tree blue = B+C red = B+D yellow = C+D white = no tree (conflicting characters)

A 0000000000 B 1111111111 C 1111111111 D 1111111111 A 0000000000 A B 1111111110 C 1111111111 D 1111111111 A 0000000000 B B 1111111111 C 1111111110 D 1111111111 C A 0000000000 B 1111111110 C 1111111110 D 1111111111 D A 0000000000 B 1111111111 C 1111111111 A D 1111111110 A 0000000000 B 1111111110 D C 1111111111 D 1111111110 C A 0000000000 B 1111111111 C 1111111110 D 1111111110 B A 0000000000 B 1111111101 C 1111111111 D 1111111111 A A 0000000000 B 1111111100 C C 1111111111 D 1111111111 B A 0000000000 B 1111111101 C 1111111110 D 1111111111 D

What can we do if our data end up in the white (character conflict) or grey (uninformative characters only) zone? • can we detect character conflict? • is there a logic-based solution to the problem of character conflict?

Detecting character conflict in binary characters Consider the four possible combinations of states in a two-character matrix. The characters are incompatible iff (when you look across all taxa) you see all four state combinations. Char 1 0 1 0 × × Char 2 1 × ×

What can we do if our data end up in the white (character conflict) or grey (uninformative characters only) zone? • Can we detect character conflict? Yes • Is there a logic-based solution to the problem of character conflict? – recoding characters? – “reciprocal illumination”?

What can we do if our data end up in the white (character conflict) or grey (uninformative characters only) zone? • Can we detect character conflict? Yes • Is there a logic-based solution to the problem of character conflict? No, nothing purely based on logic (and the suggestions for culling data to make matrices suitable for logical inference can lead to unsatisfyingly subjecive analyses). • What can we do? We must have an “error model”

Statistical inference There are many ways to derive estimators, we are going to talk about maximum likelihood estimation: θ ∈ Θ ∈ X X x ∼ Pr( X = x | θ ) L ( θ ) = Pr( X = x | θ ) ˆ θ = arg max L ( θ )

A B C D Θ A D B C A C B D

A B C D θ ∈ Θ

A 0000000000 B 1111111111 C 1111111111 D 1111111111 A 0000000000 B 1111111110 C 1111111111 D 1111111111 A 0000000000 B 1111111111 C 1111111110 D 1111111111 A 0000000000 B 1111111110 C 1111111110 D 1111111111 A 0000000000 B 1111111111 C 1111111111 X D 1111111110 A 0000000000 B 1111111110 C 1111111111 D 1111111110 A 0000000000 B 1111111111 C 1111111110 D 1111111110 A 0000000000 B 1111111101 C 1111111111 D 1111111111 A 0000000000 B 1111111100 C 1111111111 D 1111111111 A 0000000000 B 1111111101 C 1111111110 D 1111111111 ...

A 0000000000 B 1111111111 0.00024 C 1111111111 D 1111111111 A 0000000000 A B 1111111110 C 1111111111 D 1111111111 0.00024 A 0000000000 B B 1111111111 C 1111111110 D 1111111111 0.00024 C A 0000000000 B 1111111110 C 1111111110 D 1111111111 D 0.00024 A 0000000000 B 1111111111 C 1111111111 Pr( X = x | θ ) D 1111111110 A 0000000000 B 1111111110 C 1111111111 D 1111111110 0.00024 A 0000000000 B 1111111111 C 1111111110 D 1111111110 0.00024 A 0000000000 B 1111111101 C 1111111111 D 1111111111 0.00024 A 0000000000 B 1111111100 C 1111111111 D 1111111111 A 0000000000 B 1111111101 C 1111111110 D 1111111111

A 0000000000 B 1111111111 C 1111111111 D 1111111111 A 0000000000 A B 1111111110 C 1111111111 D 1111111111 A 0000000000 B B 1111111111 C 1111111110 D 1111111111 C A 0000000000 B 1111111110 C 1111111110 D 1111111111 D A 0000000000 B 1111111111 C 1111111111 x ∼ Pr( X = x | θ ) D 1111111110 A 0000000000 B 1111111110 C 1111111111 D 1111111110 A 0000000000 B 1111111111 C 1111111110 D 1111111110 A 0000000000 B 1111111101 C 1111111111 D 1111111111 A 0000000000 B 1111111100 C 1111111111 D 1111111111 A 0000000000 B 1111111101 C 1111111110 D 1111111111

A 0000000000 B 1111111110 x represents C 1111111110 D 1111111111

A θ 1 B C D ? A A 0000000000 θ 2 D ? B 1111111110 B C 1111111110 D 1111111111 C ? A θ 3 C B D

A Pr( x | θ 1 ) = 0 . 00024 θ 1 B C D 0 . 00024 A A 0000000000 θ 2 D B 1111111110 B C 1111111110 D 1111111111 C A θ 3 C B D

A Pr( x | θ 2 ) = 0 . 0002 θ 1 B C D 0 . 00024 A A 0000000000 θ 2 D B 1111111110 B C 1111111110 0 . 0002 D 1111111111 C A θ 3 C B D

A Pr( x | θ 3 ) = 0 . 00022 θ 1 B C D 0 . 00024 A A 0000000000 θ 2 D B 1111111110 B C 1111111110 0 . 0002 D 1111111111 C 0 . 00022 A θ 3 C B D

A B ˆ θ = arg max L ( θ ) C D A A 0000000000 D B 1111111110 B C 1111111110 D 1111111111 C A C B D

ML Estimation • Flexible form of inference • Requires a model: Pr( X = x | θ ) Under mild conditions, ML estimation is asymptotically: • not very biased, • efficient How can we come up with a model?

Character # Taxon 1 2 3 4 5 6 7 8 9 10 A 0 0 0 0 0 - PowerPoint PPT Presentation

Character # Taxon 1 2 3 4 5 6 7 8 9 10 A 0 0 0 0 0 0 0 0 0 0 B 1 0 0 0 0 1 1 1 1 1 C 0 1 1 1 0 1 1 1 1 1 D 0 0 0 0 1 1 1 1 1 0 B 1 B C D 10 A C 6

Distinguishing Convergence on Two-Taxon and Three-Taxon Networks Jonathan Mitchell Supervisors:

Design Elements Issue Task Force March 12, 2014 1 Historic Character 2 Historic Character 3

Automated Design and Scoring of Degenerate Primers From Multiple Taxon-Specific Primers Den

Curriculum on Character Development L1/A: Character in Leadership Character Development Agenda

Curriculum on Character Development Character in Leadership Character Development Agenda

Character Education at Character Education at Northampton Academy An Academy of Character and

CANTERBURY TALES: POWERPOINT CHARACTER PRESENTATION CHARACTER PRESENTER PHYSICAL CHARACTER

- Character set - Character escape conventions - Canonical form - Line editing conventions

Strings II Review Strings are stored character by character. Can access each character

Marshall Ranch Character Management Area Character Statement The boundaries of the Marshall Ranch

Character Eyes: Seeing Language through Character-Level Taggers Yuval Pinter Marc Marone Jacob

Strings II Review Strings are stored character by character.

Chapter 6B Character Depth The visual appearance of a character is not enough to convey

Character Vectors and Factors STAT 133 Gaston Sanchez Department of Statistics, UCBerkeley

On character varieties of 3-manifold groups Misha Kapovich June 22-23, 2015 A character-buildier

Unicode Character Code A character is the smallest possible component of a tex t (e.g., A,

With Clostridium Difficile Infections: Observational Data Of The French Survey CLOdi Arnaud

Performance Measurement Work Group 10/21/2016 Meeting RY 2019 Maryland Hospital Acquired

Combining SAXS and crystallography Rob Meijers EMBL Hamburg Outstation, Germany Stories of

Progress and impact in Australia and abroad The impact of (our) . health services research

Bioimage Informatics for Systems Pharmacology Authors : Fuhai Li Zheng Yin Guangxu Jin Hong

Introduction to single cell RNA sequencing CRUK Bioinformatics Summer School 2018 Mike

Elissa Hallem, Ph.D. Department of Microbiology, Immunology, and Molecular Genetics University

4/21/14 Connecting Neural Connections The B rain R esearch through A

Sambuz

Useful Links

Newsletter

Mail Us