Combinatorics of spaces of trees: an application of topology to phylogenetics Curran N. McConnell Dalhousie University Categorical Approaches to Topology and Geometry, CMS Summer Meeting 2019
How phylogenetics works Discover when species branched apart by comparing their genomes. Determine pairwise "evolutionary time" distance between gene sequences. Build the evolutionary tree that best reflects these pairwise distances. This uses the theory of maximum-likelihood estimation.
How phylogenetics breaks down Different subsequences can suggest different evolutionary histories. Anomalies occur because of: statistical artefacts; model inadequacy; cross-species transfer of genetic material.
How phylogenetics breaks down Detecting non-tree phenomena is hard! Biologists analyze gene sequences in terms of trees. How to detect non-tree phenomena, like when distantly-related plankton pass each other DNA directly?
How phylogenetics breaks down Idea: use topological data analysis (TDA) Topology can complement statistics to better distinguish between kinds of anomalies.
Where my research begins Use persistent homology to analyze evolutionary-tree datasets. Understand combinatorial and topological properties of the spaces these datasets live in.
n-trees Definition: A rootless binary tree is an acyclic connected graph in which every vertex has degree 1 or degree 3. Definition: A leaf in a rootless binary tree is a vertex that has exactly one neighbour. Definition: An n-tree is a rootless binary tree with n labelled leaves. I will later mention rooted n-trees as well.
Properties of n-trees n-trees have a dual interpretation as triangulations of convex polygons with labelled sides. There are (2n − 5)!! = (2n − 5)(2n − 7) · ... · 5 · 3 · 1 n-trees for each n ≥ 3.
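The count above can be checked with a short sketch. The function name `count_n_trees` is an illustrative choice, not something from the talk; it just multiplies out the double factorial (2n − 5)!!.

```python
def count_n_trees(n):
    """Number of n-trees (rootless binary trees with n labelled leaves),
    computed as the double factorial (2n - 5)!!. Hypothetical helper name."""
    if n < 3:
        raise ValueError("the formula is stated for n >= 3")
    count = 1
    # Multiply the odd numbers 2n-5, 2n-7, ..., 3, 1.
    for k in range(2 * n - 5, 0, -2):
        count *= k
    return count


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, count_n_trees(n))
```

The count grows very quickly with n, which is one reason the combinatorics of these spaces matter for computation.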
Dual interpretation of n-trees
The collection of ∞-trees
Tree metrics A plethora of metrics are used. Reliable and fast-ish: quartet distance.
Quartet distance Definition: A pair of pairs of vertices {{a, b}, {c, d}} is a quartet in a tree T if there exists an edge e in T such that deleting e from T causes {a, b} and {c, d} to lie in separate components. Definition: Symmetric difference of sets △ is given by A △ B = (A ∪ B) \ (A ∩ B). Definition: Quartet distance between two trees S and T is defined by d(S, T) = |Q(S) △ Q(T)|, where Q gives the set of quartets in a tree.
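A brute-force sketch of d(S, T) = |Q(S) △ Q(T)| follows. It assumes a tree is given as a list of edges between string-named vertices and that a, b, c, d range over leaves; the helper names (`leaf_splits`, `quartets`, `quartet_distance`) are illustrative. It enumerates every quartet directly, so it is meant to mirror the definition, not to be fast.

```python
from itertools import combinations


def leaf_splits(edges):
    """For each edge, yield the two sets of leaves separated by deleting it."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    leaves = {x for x, nbrs in adj.items() if len(nbrs) == 1}
    for u, v in edges:
        # Vertices reachable from u without crossing the deleted edge (u, v).
        seen, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if (x, y) in ((u, v), (v, u)):
                    continue
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        side_a = leaves & seen
        yield side_a, leaves - side_a


def quartets(edges):
    """Q(T): all {{a, b}, {c, d}} separated by deleting some edge of T."""
    result = set()
    for side_a, side_b in leaf_splits(edges):
        for p in combinations(sorted(side_a), 2):
            for q in combinations(sorted(side_b), 2):
                result.add(frozenset({frozenset(p), frozenset(q)}))
    return result


def quartet_distance(s_edges, t_edges):
    """d(S, T) = |Q(S) symmetric-difference Q(T)|."""
    return len(quartets(s_edges) ^ quartets(t_edges))


# Two 4-trees on leaves a, b, c, d; "u" and "v" are internal vertices.
S = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
T = [("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")]
```

Here S contains only the quartet {{a, b}, {c, d}} and T only {{a, c}, {b, d}}, so the symmetric difference has two elements.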
Tree spaces Let T_n be the set of n-trees, for every n ∈ N. Let T_∞ be the set of binary trees with infinitely many leaves. Let Q_n be T_n with quartet distance.
Dual interpretation of tree metrics Quartet distance ↦ counting certain label-preserving homotopies. Contract exterior edges down to a point, one at a time. If you can finish at a pair of triangles glued to one another, one with sides a and b and the other with sides c and d, then {{a, b}, {c, d}} is a quartet in your tree.
Homology of a simplicial complex Construct C_n as free module with n-simplices of the complex as its basis. Software frequently uses Z/2Z as the module ring for computational reasons. Construct Z_n = ker ∂_n, the module of n-cycles. Construct B_n = im ∂_{n+1}, the module of n-boundaries. Construct H_n = Z_n / B_n, the homology module.
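The construction can be sketched over Z/2Z in a few lines. This is a minimal illustration, not how production software is organised: it assumes the complex is supplied as lists of sorted vertex tuples per dimension, and the function names are hypothetical. It uses dim H_n = dim ker ∂_n − rank ∂_{n+1}, with dim ker ∂_n = (number of n-simplices) − rank ∂_n.

```python
from itertools import combinations


def rank_mod2(rows):
    """Rank over Z/2Z of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti_numbers_mod2(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices as sorted vertex tuples.
    Returns [dim H_0, dim H_1, ...] with Z/2Z coefficients."""
    top = len(simplices_by_dim)
    ranks = [0]  # the boundary map on 0-simplices is zero
    for k in range(1, top):
        index = {s: i for i, s in enumerate(simplices_by_dim[k - 1])}
        rows = []
        for s in simplices_by_dim[k]:
            mask = 0
            # The boundary of a k-simplex mod 2 is the sum of its (k-1)-faces.
            for face in combinations(s, k):
                mask |= 1 << index[face]
            rows.append(mask)
        ranks.append(rank_mod2(rows))
    ranks.append(0)  # nothing above the top dimension
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1]
            for k in range(top)]


hollow_triangle = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)]]
filled_triangle = hollow_triangle + [[(0, 1, 2)]]
```

The hollow triangle has one connected component and one 1-cycle that is not a boundary; filling in the 2-simplex turns that cycle into a boundary.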
Homology of a simplicial complex H_n is occupied by equivalence classes of n-cycles that surround each (n + 1)-dimensional hole in the complex. For H_0, a better intuition is that elements represent connected components of the complex.
Vietoris-Rips complex Definition: Given a subset S of a metric space X, the Vietoris-Rips complex R_ε contains every simplex σ constructed from points in S that satisfies the following condition: For every a, b ∈ σ, B_ε(a) ∩ B_ε(b) ≠ ∅. The homology of a filtered Vietoris-Rips complex approximates the homology of a filtered Čech complex. Under certain conditions, a Čech complex will have homology isomorphic to the singular homology of X.
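A small sketch of building R_ε from a finite distance matrix is given below. One assumption is made loudly: the ball-intersection condition is replaced by the pairwise test d(a, b) ≤ 2ε, which is the usual computational convention but only a proxy in a general metric space. The function name and the `max_dim` cutoff are illustrative.

```python
from itertools import combinations


def vietoris_rips(dist, eps, max_dim=2):
    """Simplices of a Vietoris-Rips complex up to dimension max_dim.

    dist is a symmetric matrix of pairwise distances between sample points.
    ASSUMPTION: two eps-balls are treated as intersecting when the distance
    between their centres is at most 2 * eps.
    Returns a list whose k-th entry lists the k-simplices as sorted tuples.
    """
    n = len(dist)

    def close(a, b):
        return dist[a][b] <= 2 * eps

    by_dim = [[(v,) for v in range(n)]]
    for k in range(1, max_dim + 1):
        by_dim.append([
            s for s in combinations(range(n), k + 1)
            # A simplex is included when all of its vertex pairs are close.
            if all(close(a, b) for a, b in combinations(s, 2))
        ])
    return by_dim


# Four points with the path metric of a 4-cycle.
square = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

At ε = 0.5 only the four sides of the square appear, leaving a loop; at ε = 1 every pair is close and the complex fills in completely.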
Persistent homology Begin with point cloud data. Inflate an ε-ball at each point. Draw an edge between points when their ε-balls intersect. Draw an n-simplex wherever possible. Compute the homology of this complex as ε changes. Track when generators appear/disappear.
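For H_0 alone, the steps above reduce to tracking when connected components merge, which can be sketched with a union-find structure. This is a deliberately restricted illustration: it handles only dimension 0, it reports scales as pairwise distances (the same d(a, b) threshold convention, an assumption), and the function name is hypothetical. Higher-dimensional persistence needs a full boundary-matrix reduction.

```python
def h0_persistence(dist):
    """Birth/death pairs for H_0 of a Vietoris-Rips filtration.

    dist is a symmetric matrix of pairwise distances. Every point is born
    as its own component at scale 0; a component dies at the length of the
    edge that first merges it into another one. One component never dies.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Find the component representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    bars = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj        # two components merge at scale d
            bars.append((0, d))    # so one generator disappears here
    bars.append((0, float("inf")))  # the surviving component
    return bars


# Three points on a line at positions 0, 1 and 3.
line = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]
```

The same function accepts any distance matrix, so a matrix of quartet distances between sampled trees could be passed in unchanged.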
Persistent homology in quartet space