Multidimensional scaling and flat split systems
Monika Balvočiūtė, joint work with David Bryant
University of Otago, 6th Nov 2014
Splits and split systems
A split S = A | B is a bipartition of a set of taxa X into two non-empty subsets A and B, so that X = A ∪ B and A ∩ B = ∅. A split system $\mathcal{S}$ is a set of splits {S} over some set of taxa X.
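The definition above can be made concrete with a small sketch. The function names `make_split` and `all_splits` are hypothetical helpers introduced here for illustration; they are not part of FlatNJ or any other tool mentioned in the talk. A split is stored as an unordered pair of frozensets, so that A | B and B | A compare equal.

```python
from itertools import combinations


def make_split(part_a, taxa):
    """Return the split A | B of `taxa`, where B is the complement of A.

    The split is an unordered pair, so it is stored as a frozenset
    containing the two parts.
    """
    a = frozenset(part_a)
    x = frozenset(taxa)
    b = x - a
    # Both parts must be non-empty and A must be a subset of X.
    if not a or not b or not a <= x:
        raise ValueError("both parts must be non-empty subsets of the taxa")
    return frozenset({a, b})


def all_splits(taxa):
    """Enumerate every split of `taxa` (there are 2**(n-1) - 1 of them)."""
    x = sorted(taxa)
    splits = set()
    for k in range(1, len(x)):
        for part in combinations(x, k):
            splits.add(make_split(part, x))
    return splits


taxa = {"a", "b", "c", "d"}
s1 = make_split({"a", "b"}, taxa)
s2 = make_split({"c", "d"}, taxa)      # same split, written from the other side
split_system = {s1, make_split({"a"}, taxa)}  # a split system is just a set of splits
```

Storing each split as a frozenset of frozensets makes a split system an ordinary Python set, at the cost of losing any orientation of the two sides.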
Equivalent representations of flat split systems
[Figure: three panels on taxa a, b, c, d, labelled "Flat split system", "Oriented matroid" and "Planar split network"; one panel includes the line ℓ∞.]
FlatNJ – computing planar split networks
1. Compute building blocks
2. Identify neighbors
3. Agglomerate
4. Reverse agglomeration
5. Weight and filter
M. Balvočiūtė, A. Spillner and V. Moulton, FlatNJ:..., Syst. Biol. 2014, 63(3): 383–96
Neighbors
e and f are neighbors
[Figure: split diagrams on taxa a, b, c, d, e, f in which e and f are neighbors.]
Not neighbors
b and f are not neighbors
[Figure: split diagrams on taxa a, b, c, d, e, f in which b and f are not neighbors.]
Agglomeration
[Figure: the neighbors e and f are merged into a single taxon e,f; in the next step a and e,f are merged into a,e,f, leaving the taxa b, a,e,f, d, c.]
Reversing agglomeration
[Figure: starting from the taxa b, a,e,f, d, c, the taxon a,e,f is expanded back into a and e,f, and then e,f into e and f.]
Q: When does it fail? A: When there are no neighbours.
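Affine splits
Split – a line $\ell_S$ in $\mathbb{R}^2 - X$;
Split system – an arrangement of lines $\mathcal{A}$ in $\mathbb{R}^2 - X$.
[Figure: two panels, "Split" (a line separating the points of X into sides A and B) and "Split system" (several lines through the point set X).]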
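A minimal sketch of how a line that avoids the points of X induces a split: each point is assigned to one side according to the sign of a linear function. The function name `split_from_line` and the line representation (a normal vector and an offset, i.e. the line normal · p = offset) are assumptions made for this illustration, not notation from the slides.

```python
def split_from_line(points, normal, offset):
    """Return the split of the labelled points induced by a line.

    points : dict mapping a taxon name to its (x, y) coordinates
    normal, offset : the line is the set of p with normal . p == offset
    Returns a frozenset holding the two sides, or None if every point
    lies on the same side (the line then induces no split).
    """
    nx, ny = normal
    side_a, side_b = set(), set()
    for name, (x, y) in points.items():
        value = nx * x + ny * y - offset
        if value == 0:
            # The line must live in R^2 - X, so it may not hit a point.
            raise ValueError(f"line passes through point {name!r}")
        (side_a if value > 0 else side_b).add(name)
    if not side_a or not side_b:
        return None
    return frozenset({frozenset(side_a), frozenset(side_b)})


points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.0)}
vertical = split_from_line(points, normal=(1.0, 0.0), offset=0.5)  # the line x = 0.5
far_away = split_from_line(points, normal=(1.0, 0.0), offset=5.0)  # the line x = 5
```

Applying the same function to every line of an arrangement gives the corresponding split system as a set of splits.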
Neighbours in affine split systems
[Figure: two panels on the points a, b, d, e, g, labelled "Neighbours" and "Not neighbours".]
For example
[Figure: "Input" ⇒ "Output", both drawn on the taxa a, b, c, d, e, f.]
Multidimensional scaling (MDS)
Plot points in low (e.g. two) dimensional space based on their pairwise distances.
\[
\begin{array}{c|ccccc}
 & 1 & 2 & 3 & \cdots & n \\ \hline
1 & 0 & d_{12} & d_{13} & \cdots & d_{1n} \\
2 & d_{12} & 0 & d_{23} & \cdots & d_{2n} \\
3 & d_{13} & d_{23} & 0 & \cdots & d_{3n} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
n & d_{1n} & d_{2n} & d_{3n} & \cdots & 0
\end{array}
\quad\Rightarrow\quad
\text{points } 1, 2, 3, \dots, n \text{ plotted in the plane}
\]
Minimize the difference between input and output distances.
MDS
Objective functions to minimize (min):
\[
\sum_{i} \sum_{j \neq i} (d_{ij} - \delta_{ij})^2,
\qquad
\sum_{i} \sum_{j \neq i} (d_{ij}^2 - \delta_{ij}^2)^2
\]
Stress:
\[
\sqrt{\frac{\sum_{i} \sum_{j \neq i} (d_{ij} - \delta_{ij})^2}{\sum_{i} \sum_{j \neq i} d_{ij}^2}},
\qquad \dots, \qquad
\sqrt{\frac{\sum_{i} \sum_{j \neq i} w_{ij} (d_{ij} - \delta_{ij})^2}{\sum_{i} \sum_{j \neq i} w_{ij} d_{ij}^2}}
\]
$d_{ij}$ – actual distance; $\delta_{ij}$ – plotted distance
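The objective functions on this slide can be evaluated directly from two distance matrices. The sketch below uses only the Python standard library; the function names (`raw_stress`, `s_stress`, `normalized_stress`, `plotted_distances`) are labels chosen for this illustration, and the slide itself only names the normalized variants "Stress". Here `d` holds the actual distances and `delta` the plotted distances.

```python
import math


def _pairs(n):
    """All ordered index pairs (i, j) with j != i."""
    return ((i, j) for i in range(n) for j in range(n) if j != i)


def raw_stress(d, delta):
    """Sum over i and j != i of (d_ij - delta_ij)^2."""
    return sum((d[i][j] - delta[i][j]) ** 2 for i, j in _pairs(len(d)))


def s_stress(d, delta):
    """Sum over i and j != i of (d_ij^2 - delta_ij^2)^2."""
    return sum((d[i][j] ** 2 - delta[i][j] ** 2) ** 2 for i, j in _pairs(len(d)))


def normalized_stress(d, delta, w=None):
    """Square root of the weighted squared error over the weighted sum of d_ij^2.

    With w=None all weights are 1, which gives the unweighted formula.
    """
    n = len(d)
    if w is None:
        w = [[1.0] * n for _ in range(n)]
    num = sum(w[i][j] * (d[i][j] - delta[i][j]) ** 2 for i, j in _pairs(n))
    den = sum(w[i][j] * d[i][j] ** 2 for i, j in _pairs(n))
    return math.sqrt(num / den)


def plotted_distances(coords):
    """Euclidean distance matrix of a list of plotted points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


# A 3-4-5 right triangle can be plotted exactly, so every objective is zero.
d = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
exact = plotted_distances([(0, 0), (3, 0), (0, 4)])
# Plotting all three points on top of each other loses everything.
collapsed = plotted_distances([(0, 0), (0, 0), (0, 0)])
```

The normalized form is scale-free: plotting every point at the same location yields a value of 1 regardless of the units of the input distances.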
MDS
S. L. France & J. D. Carroll, Two-Way Multidimensional Scaling: A Review, IEEE Trans. Syst., Man, Cybern., Syst. 2011, 41(5): 644–61
Agglomerative approach to MDS
1. Take pairwise distance matrix
2. Identify neighbours
3. Agglomerate
4. Reverse
Agglomeration
[Figure: points g, a, d, e, b; a new point c is introduced between a and b, and the distances $d_1$, $d_2$, $d_3$ and $d_m$ are marked.]
\[
d_m = \sqrt{\frac{2 d_1^2 + 2 d_2^2 - d_3^2}{4}}
\]
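The formula for $d_m$ is the classical expression for the length of a triangle median. The short derivation below assumes, consistently with the matrix update on the next slide, that $d_1$ and $d_2$ are the distances from some point $x$ to $a$ and $b$, that $d_3$ is the distance between $a$ and $b$, and that $d_m$ is the distance from $x$ to the midpoint $c = (a + b)/2$; the slide itself does not spell out this labelling.

```latex
\begin{align*}
d_m^2 &= \Bigl\| x - \tfrac{a+b}{2} \Bigr\|^2
       = \tfrac14 \bigl\| (x-a) + (x-b) \bigr\|^2
       = \tfrac14 \bigl( d_1^2 + d_2^2 + 2\,(x-a)\cdot(x-b) \bigr), \\
d_3^2 &= \| a - b \|^2
       = \bigl\| (x-b) - (x-a) \bigr\|^2
       = d_1^2 + d_2^2 - 2\,(x-a)\cdot(x-b), \\
\text{so}\quad
d_m^2 &= \tfrac14 \bigl( d_1^2 + d_2^2 + (d_1^2 + d_2^2 - d_3^2) \bigr)
       = \frac{2 d_1^2 + 2 d_2^2 - d_3^2}{4}.
\end{align*}
% Worked check: d_1 = 3, d_2 = 4, d_3 = 5 gives
% d_m^2 = (18 + 32 - 25)/4 = 25/4, so d_m = 2.5,
% half the hypotenuse of a 3-4-5 right triangle, as expected.
```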
Agglomeration
\[
\begin{array}{c|cccccc}
 & 1 & 2 & \cdots & m & a & b \\ \hline
1 & 0 & d_{12} & \cdots & d_{1m} & d_{a1} & d_{b1} \\
2 & d_{12} & 0 & \cdots & d_{2m} & d_{a2} & d_{b2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
m & d_{1m} & d_{2m} & \cdots & 0 & d_{am} & d_{bm} \\
a & d_{a1} & d_{a2} & \cdots & d_{am} & 0 & d_{ab} \\
b & d_{b1} & d_{b2} & \cdots & d_{bm} & d_{ab} & 0
\end{array}
\]
⇓
\[
\begin{array}{c|ccccc}
 & 1 & 2 & \cdots & m & c \\ \hline
1 & 0 & d_{12} & \cdots & d_{1m} & d_{c1} \\
2 & d_{12} & 0 & \cdots & d_{2m} & d_{c2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
m & d_{1m} & d_{2m} & \cdots & 0 & d_{cm} \\
c & d_{c1} & d_{c2} & \cdots & d_{cm} & 0
\end{array}
\qquad
d_{ci} = \sqrt{\frac{2 d_{ai}^2 + 2 d_{bi}^2 - d_{ab}^2}{4}}, \quad i = 1, \dots, m
\]
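The matrix update above can be sketched in a few lines of Python. The function name `agglomerate` and the dict-of-dicts matrix layout are choices made for this illustration. Clamping a negative value under the square root to zero is also an assumption added here, since non-Euclidean input distances can make the expression negative; the slides do not say how that case is handled.

```python
import math


def agglomerate(dist, a, b, new_label):
    """Replace rows/columns a and b of a symmetric distance matrix by one point.

    dist : dict of dicts with dist[i][j] == dist[j][i] and dist[i][i] == 0
    The new point plays the role of the midpoint of a and b:
        d_ci = sqrt((2*d_ai^2 + 2*d_bi^2 - d_ab^2) / 4)
    """
    others = [k for k in dist if k not in (a, b)]
    d_ab = dist[a][b]
    # Distances among the untouched points are copied unchanged.
    merged = {i: {j: dist[i][j] for j in others} for i in others}
    merged[new_label] = {new_label: 0.0}
    for i in others:
        squared = (2 * dist[a][i] ** 2 + 2 * dist[b][i] ** 2 - d_ab ** 2) / 4
        # Assumption: clamp to zero if the input is not Euclidean.
        d_ci = math.sqrt(max(squared, 0.0))
        merged[i][new_label] = d_ci
        merged[new_label][i] = d_ci
    return merged


# Build a Euclidean example so the result can be checked by hand:
# the midpoint of a=(0,0) and b=(2,0) is (1,0).
coords = {"g": (1.0, 3.0), "e": (1.0, -1.0), "a": (0.0, 0.0), "b": (2.0, 0.0)}
dist = {p: {q: math.dist(coords[p], coords[q]) for q in coords} for p in coords}
merged = agglomerate(dist, "a", "b", "c")
```

For Euclidean input the new column equals the true distances to the midpoint, here 3 for g and 1 for e.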
Expansion
[Figure: starting from the points g, d, c, e, the point c is expanded back into a and b; a second pair of positions a′ and b′ is also shown.]
Expansion
We know: c = {a, b}; $d_{ag}, d_{bg}$; $d_{ad}, d_{bd}$; $d_{ae}, d_{be}$
We don’t know: actual dimension
[Figure: points g, a, d, c, e, b with the distances $d_{cg}$, $d_{cd}$ and $d_{ce}$ from c marked.]
Expansion
[Figure: c is replaced by a and b with a = −b; the plotted distances $\delta_{ab}$, $\delta_{ag}$, $\delta_{bg}$, $\delta_{ad}$, $\delta_{bd}$, $\delta_{ae}$, $\delta_{be}$ are marked.]
$\delta_{ab} \sim d_{ab}$
$\delta_{ag} \sim d_{ag}$, $\delta_{ad} \sim d_{ad}$, $\delta_{ae} \sim d_{ae}$
$\delta_{bg} \sim d_{bg}$, $\delta_{bd} \sim d_{bd}$, $\delta_{be} \sim d_{be}$
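One way to read this slide is that a and b are placed symmetrically about c (a = −b when c is taken as the origin) so that the plotted distances δ approximate the actual distances d. The slides do not say how the placement is computed, so the sketch below swaps in a deliberately simple technique: it fixes the offset length to $d_{ab}/2$, which makes $\delta_{ab} = d_{ab}$ exactly, and grid-searches the direction that minimizes the squared error of the remaining distances. The function name `expand` and all of its parameters are hypothetical.

```python
import math


def expand(c, others, d_a, d_b, d_ab, steps=3600):
    """Place a = c + v and b = c - v with |v| = d_ab / 2.

    c      : (x, y) position of the merged point
    others : dict mapping each remaining point to its plotted (x, y)
    d_a    : dict mapping each remaining point to its actual distance from a
    d_b    : the same for b
    Returns (a, b, error), where error is the sum of squared differences
    between plotted and actual distances for the best direction found.
    """
    r = d_ab / 2
    best = None
    for k in range(steps):
        t = 2 * math.pi * k / steps
        v = (r * math.cos(t), r * math.sin(t))
        a = (c[0] + v[0], c[1] + v[1])
        b = (c[0] - v[0], c[1] - v[1])
        err = sum(
            (math.dist(a, p) - d_a[name]) ** 2 + (math.dist(b, p) - d_b[name]) ** 2
            for name, p in others.items()
        )
        if best is None or err < best[0]:
            best = (err, a, b)
    return best[1], best[2], best[0]


# Check on an exactly planar example: the true positions are a=(0,0), b=(2,0).
true_a, true_b = (0.0, 0.0), (2.0, 0.0)
others = {"g": (1.0, 3.0), "e": (4.0, -1.0)}
d_a = {n: math.dist(true_a, p) for n, p in others.items()}
d_b = {n: math.dist(true_b, p) for n, p in others.items()}
a, b, err = expand((1.0, 0.0), others, d_a, d_b, d_ab=2.0)
```

A grid search keeps the sketch short and dependency-free; a real implementation would more likely use a continuous optimizer and might also let the offset length vary.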