Highways and byways in group-theoretic genome space Attila Egri-Nagy, joint work with Andrew Francis and Volker Gebhardt Centre for Mathematics Research, School of Computing, Engineering and Mathematics University of Western Sydney Phylomania 2013 AF,e-n@,VG (UWS CRM) Highways and Byways Phylomania 2013 1 / 32
Questions
Is the distance a good enough measure?
Can we use the number of shortest evolutionary paths?
Maybe the ‘shape’ of how these paths are put together...
Biology → Math
Genome → permutations
6 5 7 3 8 9 1 4 2
Genomic distance → Length of geodesic words
Groups, generator sets
Let $G$ be a group with generators $S = \{ s_1, \dots, s_n \}$.
$S^*$ is the set of all finite sequences (words) of the elements of $S$.
The group element realized by the word $w$ is denoted by $\overline{w}$, thus $w \in S^*$ and $\overline{w} \in G$.
Example: $S = \{ s_1 = (1,2),\ s_2 = (2,3) \}$
$s_1 s_2 s_1 s_2 = (1,2)(2,3)(1,2)(2,3) = (1,2,3)$
So $\overline{s_1 s_2 s_1 s_2} = (1,2,3)$.
sequences of generators ⇐⇒ sequences of events
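The example above can be checked with a minimal sketch. It assumes permutations are multiplied left to right (the first factor acts first), which is the convention under which the slide's product equals (1,2,3). The helper names `from_cycles`, `mul` and `evaluate` are illustrative and not from the talk.

```python
def from_cycles(n, *cycles):
    """Permutation of {1..n} as a 0-based tuple of images, from disjoint 1-based cycles."""
    img = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            img[a - 1] = b - 1
    return tuple(img)

def mul(p, q):
    """Left-to-right product: apply p first, then q (assumed convention)."""
    return tuple(q[i] for i in p)

def evaluate(word, gens, n):
    """Group element realized by a word, given as a list of generator names."""
    g = tuple(range(n))
    for name in word:
        g = mul(g, gens[name])
    return g

gens = {"s1": from_cycles(3, (1, 2)), "s2": from_cycles(3, (2, 3))}
word = ["s1", "s2", "s1", "s2"]
print(evaluate(word, gens, 3) == from_cycles(3, (1, 2, 3)))  # True
```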
Cayley graph
The Cayley graph $\Gamma(G, S)$ of $G$ with respect to the generating set $S$ is the directed graph with group elements as nodes and the labeled edges $s$ encoding the action of $G$ on itself. Thus $g \xrightarrow{s} gs$ is an edge.
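As a rough illustration of this definition, the sketch below builds the labeled edges of a Cayley graph by closing the identity under right multiplication. Permutations are 0-based tuples of images multiplied left to right; the function name `cayley_graph` and this representation are assumptions made for the example.

```python
def mul(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)

def cayley_graph(gens, n):
    """Return (nodes, edges); each edge is a triple (g, label, g*s)."""
    identity = tuple(range(n))
    nodes, frontier, edges = {identity}, [identity], []
    while frontier:
        g = frontier.pop()
        for label, s in gens.items():
            h = mul(g, s)
            edges.append((g, label, h))   # the edge g --s--> gs
            if h not in nodes:
                nodes.add(h)
                frontier.append(h)
    return nodes, edges

# S_3 generated by s1 = (1,2) and s2 = (2,3), written as image tuples.
nodes, edges = cayley_graph({"s1": (1, 0, 2), "s2": (0, 2, 1)}, 3)
print(len(nodes), len(edges))  # 6 12
```

Every node has exactly one outgoing edge per generator, so the edge count is always the group order times the number of generators.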
Cayley graph of $S_3$
Example: $S = \{ s_1 = (1,2),\ s_2 = (2,3) \}$
[Figure: Cayley graph on the nodes (), (1,2), (2,3), (1,3), (1,2,3), (1,3,2), with edges labeled 1 and 2.]
Cayley graph of $S_3$ – different generators
Example: $S = \{ s_1 = (1,2),\ s_2 = (2,3),\ s_3 = (3,1) \}$
[Figure: Cayley graph on the nodes (), (1,2), (2,3), (1,3), (1,2,3), (1,3,2), with edges labeled 1, 2 and 3.]
Geodesic distance, shortest path
The geodesic distance is defined by $d_S(g_1, g_2) = |u|$, where $u$ is a minimal length word in $S^*$ with the property that $g_1 \overline{u} = g_2$, also denoted by $g_1 \xrightarrow{u} g_2$, and $u$ is called a geodesic word.
$\mathrm{Geo}_S(g_1, g_2)$ is the set of all geodesic words from $g_1$ to $g_2$.
What is $\mathrm{Geo}_S(g_1, g_2)$?
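For small groups the distance and the geodesic words can be computed by brute force. The sketch below is one possible approach, not the authors' method: a breadth-first search from the identity gives word lengths, and walking backwards along edges enumerates the geodesic words. Since $g_1 \overline{u} = g_2$ exactly when $\overline{u} = g_1^{-1} g_2$, it is enough to start from the identity. The names `lengths`, `geodesics` and `inverse` are hypothetical.

```python
from collections import deque

def mul(p, q):
    """Left-to-right product of permutations given as image tuples."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def lengths(gens, n):
    """Breadth-first search from the identity: word length of every element."""
    identity = tuple(range(n))
    dist, queue = {identity: 0}, deque([identity])
    while queue:
        g = queue.popleft()
        for s in gens.values():
            h = mul(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

def geodesics(target, gens, dist):
    """All geodesic words (lists of generator names) from the identity to target."""
    if dist[target] == 0:
        return [[]]
    words = []
    for name, s in gens.items():
        prev = mul(target, inverse(s))      # prev * s == target
        if dist[prev] == dist[target] - 1:  # s is the last letter of some geodesic
            words += [w + [name] for w in geodesics(prev, gens, dist)]
    return words

gens = {"s1": (1, 0, 2), "s2": (0, 2, 1)}   # (1,2) and (2,3) in S_3
dist = lengths(gens, 3)
geo = geodesics((2, 1, 0), gens, dist)      # target is (1,3)
print(dist[(2, 1, 0)], sorted(geo))
# 3 [['s1', 's2', 's1'], ['s2', 's1', 's2']]
```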
A partial order defined by the geodesics
Due to a translation principle we can simply write $\ell(g)$ instead of $d(1, g)$. Similarly, we use $\mathrm{Geo}(g)$ instead of $\mathrm{Geo}(1, g)$.
Definition: For group elements $g_1, g_2 \in G = \langle S \rangle$ we write $g_1 \leq g_2$ if $\exists\, w = uv \in S^*$ such that $\overline{w} = g_2$, $\overline{u} = g_1$, $w \in \mathrm{Geo}(g_2)$, i.e. there is a geodesic from the identity to $g_2$ and $g_1$ is on it.
Also called the prefix order, or weak order for Coxeter groups.
Intervals
With the partial order, closed intervals are defined in the obvious way:
$[1, h] := \{ g \in G \mid 1 \leq g \leq h \}$
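A minimal sketch of computing such an interval for a small permutation group follows. It relies on the observation that $g$ lies on a geodesic from the identity to $h$ exactly when $\ell(g) + \ell(g^{-1}h) = \ell(h)$, and it groups the interval by length. The function names and the tuple representation of permutations are assumptions for illustration only.

```python
from collections import deque

def mul(p, q):
    """Left-to-right product of permutations given as image tuples."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def lengths(gens, n):
    """Word length of every group element, by breadth-first search."""
    identity = tuple(range(n))
    dist, queue = {identity: 0}, deque([identity])
    while queue:
        g = queue.popleft()
        for s in gens:
            h = mul(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

def interval(h, dist):
    """Rank-sets of [1, h]: g is included iff l(g) + l(g^-1 h) == l(h)."""
    ranks = [set() for _ in range(dist[h] + 1)]
    for g in dist:
        if dist[g] + dist[mul(inverse(g), h)] == dist[h]:
            ranks[dist[g]].add(g)
    return ranks

s1, s2 = (1, 0, 2), (0, 2, 1)               # (1,2) and (2,3) in S_3
dist = lengths([s1, s2], 3)
ranks = interval((2, 1, 0), dist)           # h = (1,3)
print([len(r) for r in ranks])              # [1, 2, 2, 1]
```

In this example the interval below (1,3) is the whole group, since every element of $S_3$ is a prefix of one of its two geodesic words.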
Ranked poset
[Figure: grid between (0,0) and (4,3) with the rank-sets $R_0, \dots, R_7$ marked.]
The rank-sets of the interval $[(0,0), (3,4)]$ in $\mathbb{Z} \times \mathbb{Z}$.
Ranked poset
[Figure: the rank-sets $R_0, \dots, R_7$ between (0,0) and (4,3).]
Length and size
In general there is no connection.
[Figure: the intervals from (0,0) to (0,4) and from (0,0) to (2,2) in the x-y grid.]
In $\mathbb{Z}^2$ two group elements with the same length can have intervals of different size: $|[(0,0),(0,4)]| = 5$, $|[(0,0),(2,2)]| = 9$.
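The two interval sizes can be reproduced by brute force. The sketch below assumes the standard generators $(\pm 1, 0)$, $(0, \pm 1)$ of $\mathbb{Z}^2$, so that word length is the L1 norm; the slide does not state the generating set explicitly, and the function names are illustrative.

```python
def l1(p):
    """Word length in Z^2 under the assumed standard generators."""
    return abs(p[0]) + abs(p[1])

def interval(h):
    """Points p on some geodesic from (0,0) to h: l(p) + d(p, h) == l(h)."""
    n = l1(h)
    return [(x, y)
            for x in range(-n, n + 1)
            for y in range(-n, n + 1)
            if l1((x, y)) + l1((h[0] - x, h[1] - y)) == n]

print(len(interval((0, 4))), len(interval((2, 2))))  # 5 9
```

Both targets have length 4, yet the first interval is a single chain while the second is a 3-by-3 grid.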
Interval lattices in $S_3 = \langle (1,2,3), (1,2) \rangle$
$S_3 = \langle (1,2), (2,3) \rangle$
$S_4 = \langle (1,2), (2,3), (3,4) \rangle$
$S_4 = \langle (1,2), (2,3), (3,4), (1,4) \rangle$
Is it a lattice?
An obvious mathematical question, though biologically not so relevant.
A minimal counterexample would be:
Trying with involutions
[Figure: candidate counterexample with edges labeled a, b and c.]
$ab = bc = ca$, $ac = ba = cb$.
But since they are involutions, $ba = cb \implies c = bab$.
Trying it with 2 generators
Minimal counterexamples
[Figure: edges labeled a and b.]
$a^2 = b^2$, $ab = ba$
For instance, $a = (3,4,5)$, $b = (1,2)(3,4,5)$.
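A quick check that the example permutations satisfy the stated relations is sketched below. The helpers `from_cycles` and `mul` are illustrative, and left-to-right multiplication is assumed, although for these two commuting elements the convention makes no difference.

```python
def from_cycles(n, *cycles):
    """Permutation of {1..n} as a 0-based tuple of images, from disjoint 1-based cycles."""
    img = list(range(n))
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            img[x - 1] = y - 1
    return tuple(img)

def mul(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)

a = from_cycles(5, (3, 4, 5))
b = from_cycles(5, (1, 2), (3, 4, 5))

print(mul(a, a) == mul(b, b))   # True: a^2 = b^2
print(mul(a, b) == mul(b, a))   # True: ab = ba
print(mul(a, a) != mul(a, b))   # True: the two products are different elements
```

Both $a$ and $b$ are therefore prefixes of geodesic words for two distinct elements of length 2, namely $a^2$ and $ab$.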
$C_4 \times C_2 = \langle (3,4,5,6),\ (1,2)(3,4,5,6) \rangle$
$[(), (1,2)]$
Sperner property?
Sperner property: no antichain is larger than the largest rank-set.
Do these intervals have the Sperner property? NO.
$s_4 s_3 s_1 = s_4 s_1 s_3 = s_3 s_1 s_2 = s_1 s_3 s_2$
Anti-chains
Do anti-chains give the number of paths? NO.
Possible equivalence relations
The ultimate goal is to find equivalence classes of group elements.
1. Same length: $\ell(g_1) = \ell(g_2)$.
2. Same ‘width’: $|\mathrm{Geo}(g_1)| = |\mathrm{Geo}(g_2)|$. Probably the most decisive property for the biological application.
3. Same profile.
4. Same interval.
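The first two relations can be explored computationally for small groups. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it computes the length and the width (number of geodesic words) of every element in one breadth-first pass, then counts the resulting classes. The function name `length_and_width` is hypothetical, and the generators are assumed to be pairwise distinct.

```python
from collections import Counter, deque

def mul(p, q):
    """Left-to-right product of permutations given as image tuples."""
    return tuple(q[i] for i in p)

def length_and_width(gens, n):
    """For every element: word length and number of geodesic words."""
    identity = tuple(range(n))
    dist, width = {identity: 0}, {identity: 1}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for s in gens:
            h = mul(g, s)
            if h not in dist:
                dist[h], width[h] = dist[g] + 1, 0
                queue.append(h)
            if dist[h] == dist[g] + 1:
                width[h] += width[g]    # each geodesic to g extends by s
    return dist, width

gens = [(1, 0, 2), (0, 2, 1)]           # (1,2) and (2,3) in S_3
dist, width = length_and_width(gens, 3)
length_classes = Counter(dist.values())
lw_classes = Counter((dist[g], width[g]) for g in dist)
print(len(length_classes), len(lw_classes))  # 4 4
print(sorted(lw_classes.items()))
# [((0, 1), 1), ((1, 1), 2), ((2, 1), 2), ((3, 2), 1)]
```

For this small example the two relations give the same classes; larger generating sets can be passed in the same way to see where they start to differ.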
$S_4 = \langle (1,2), (2,3), (3,4), (1,4) \rangle$
[Figure: interval diagrams for (1,2,4,3), (1,3,2,4), (1,4,2,3) and (1,3,4,2), with edges labeled 1 to 4; each panel is annotated with [0, 3], [3, 4], [4, 3], [3, 0].]
$S_4 = \langle (1,2), (2,3), (3,4), (1,4) \rangle$
[Figure: interval diagrams for (1,2,3,4), (1,4,3,2) and (1,3), with edges labeled 1 to 4; each panel is annotated with [0, 4], [4, 4], [4, 4], [4, 0].]
n = 5             all inversions   circular   linear
length                  4             7         11
[length,width]          7            14         30
Number of paths
Assuming that we have an efficient algorithm for calculating the distance, we can also calculate the interval.
For biological applications it is probably enough to estimate the interval by partially calculating it.
Algorithm 1: Constructing the graded interval [g, h].
input: g, h ∈ G, S generator set, d distance function
output: [g, h] interval, R_i rank-sets
GradedInterval(g, h, S, d):
    n ← d(g, h);
    R_0 ← {g};
    foreach i ∈ {1, ..., n} do
        R_i ← ∅;
        foreach g′ ∈ R_{i−1} do
            foreach s ∈ S do
                if d(g′s, h) = n − i then
                    R_i ← R_i ∪ {g′s};
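Algorithm 1 translates almost line by line into runnable code. The sketch below is one possible rendering: the group operation is passed in as an extra parameter `mul`, which is not part of the slide's signature, and the demonstration uses $\mathbb{Z}^2$ with the assumed standard generators and the L1 distance, because an efficient distance function is readily available there.

```python
def graded_interval(g, h, gens, d, mul):
    """Rank-sets R_0, ..., R_n of the interval [g, h], following Algorithm 1."""
    n = d(g, h)
    ranks = [{g}]
    for i in range(1, n + 1):
        # Keep only neighbours that are one step closer to h.
        ranks.append({mul(x, s)
                      for x in ranks[-1]
                      for s in gens
                      if d(mul(x, s), h) == n - i})
    return ranks

# Demonstration in Z^2 (assumed generators and distance, for illustration).
def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def l1_distance(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

unit_steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
ranks = graded_interval((0, 0), (2, 2), unit_steps, l1_distance, add)
print([len(r) for r in ranks])   # [1, 2, 3, 2, 1]
```

The interval itself is the union of the rank-sets, here 9 points in total. The loop calls the distance function once per element and generator at each rank, so its cost is governed by the size of the interval and the cost of `d`.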
TODO list
Study individual generating sets (since no grand theory is available).
Find the right interpretation in order to modify the distance function.
Thank You!