Group theoretic formalization of double-cut-and-join model of chromosomal rearrangement. Sangeeta Bhatia. PhD Supervisor: Prof. Andrew Francis. Centre for Research in Mathematics, University of Western Sydney. 7th November 2013

Rare is better – large scale mutations ◮ Large-scale genome rearrangements, such as insertions or deletions of genes, gene duplications, and inversions of genes, make good phylogenetic markers precisely because they are rare. ◮ Our focus: determining a measure of difference between species based on such large-scale genome rearrangements. ◮ Our tool: algebra/group theory.

An example – Double cut and join ◮ Genome representation: a graph. ◮ Rearrangement events: inversion of a section, translocation of a section, fission/fusion of strands.

Double-cut-and-join: genome representation ◮ A “gene” or region has two extremities: a head and a tail. ◮ Store “adjacencies”, i.e. which gene extremities are adjacent on the genome. ◮ Example (one linear chromosome with telomeres $1_t$ and $2_h$, and one circular chromosome on genes 4 and 5): $\{1_t, \{1_h, 3_t\}, \{3_h, 2_t\}, 2_h, \{5_h, 4_t\}, \{5_t, 4_h\}\}$
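The adjacency representation above can be sketched in Python. This is a minimal illustration, assuming each chromosome is given as a list of signed genes (a positive gene is read tail first, then head) together with a circular/linear flag; the names `extremities` and `adjacencies` and the input shape are hypothetical choices, not something the slides specify. Telomeres are stored as one-element sets.

```python
def extremities(gene):
    """Return the (left, right) extremities of a signed gene read left to right."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(chromosomes):
    """Build the adjacency set of a genome.

    chromosomes: list of (genes, is_circular) pairs (an assumed input format).
    Adjacencies are two-element frozensets, telomeres one-element frozensets.
    """
    genome = set()
    for genes, is_circular in chromosomes:
        ends = [extremities(g) for g in genes]
        # Right end of each gene touches the left end of the next gene.
        for (_, right), (left, _) in zip(ends, ends[1:]):
            genome.add(frozenset([right, left]))
        first_left, last_right = ends[0][0], ends[-1][1]
        if is_circular:
            genome.add(frozenset([last_right, first_left]))
        else:
            genome.add(frozenset([first_left]))
            genome.add(frozenset([last_right]))
    return genome


# The slide's example: linear chromosome 1, 3, 2 and circular chromosome 4, 5.
example = adjacencies([([1, 3, 2], False), ([4, 5], True)])
```

Here `example` contains the two telomeres $1_t$ and $2_h$ and the four adjacencies listed on the slide.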

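Double cut and join – the cut [Figure: the linear genome $\{1_t, \{1_h, 2_t\}, \{2_h, 3_t\}, \{3_h, 4_t\}, 4_h\}$ is cut at the adjacencies $\{1_h, 2_t\}$ and $\{3_h, 4_t\}$, leaving the free ends $1_h$, $2_t$, $3_h$, $4_t$.]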

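Double cut and join operation – inversion [Figure: the free ends $1_h$, $2_t$, $3_h$, $4_t$ are rejoined as $\{1_h, 3_h\}$ and $\{2_t, 4_t\}$, giving $\{1_t, \{1_h, 3_h\}, \{3_t, 2_h\}, \{2_t, 4_t\}, 4_h\}$; the segment containing genes 2 and 3 is inverted.]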

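Double cut and join operation – excision [Figure: the free ends $1_h$, $2_t$, $3_h$, $4_t$ are rejoined as $\{1_h, 4_t\}$ and $\{2_t, 3_h\}$, giving the linear chromosome $\{1_t, \{1_h, 4_t\}, 4_h\}$ and the excised circular chromosome $\{\{2_t, 3_h\}, \{2_h, 3_t\}\}$.]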

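Circularization/Linearization [Figure: the linear chromosome $\{1_t, \{1_h, 2_t\}, \{2_h, 3_t\}, \{3_h, 4_t\}, 4_h\}$ and the circular chromosome $\{\{4_h, 1_t\}, \{1_h, 2_t\}, \{2_h, 3_t\}, \{3_h, 4_t\}\}$; joining the telomeres $1_t$ and $4_h$ circularizes, cutting $\{4_h, 1_t\}$ linearizes.]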

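Fusion/Fission [Figure: the linear chromosome $\{1_t, \{1_h, 2_t\}, \{2_h, 3_t\}, \{3_h, 4_t\}, 4_h\}$ and the two linear chromosomes $\{1_t, \{1_h, 2_t\}, 2_h\}$ and $\{3_t, \{3_h, 4_t\}, 4_h\}$; cutting $\{2_h, 3_t\}$ is a fission, joining $2_h$ and $3_t$ is a fusion.]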

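Distance under the DCJ model – Adjacency graph [Figure: adjacency graph $AG(A, B)$. Genome $A$ vertices: $1_h$, $\{2_t, 3_t\}$, $\{2_h, 4_t\}$, $3_h$, $\{4_h, 5_t\}$, $\{5_h, 1_t\}$. Genome $B$ vertices: $\{1_h, 2_t\}$, $\{4_t, 3_t\}$, $\{2_h, 3_h\}$, $\{1_t, 4_h\}$, $\{5_h, 5_t\}$.]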

DCJ operator – Our re-formulation ◮ We assign a numeric label to each gene extremity. Let $i$ be a gene. Then $i_t \mapsto 2i - 1$ and $i_h \mapsto 2i$. ◮ Thus if there are $n$ genes, we get $2n$ labels. Let us call this set $X$. ◮ A genome on $n$ genes is a permutation $\pi$ on the set $X$ such that $\pi(i) = j \iff \pi(j) = i$.

DCJ operator – Our re-formulation ◮ For example, for the genome $\{1_t, (1_h, 2_h), 2_t\}$ the labels are $1_t \mapsto 1$, $1_h \mapsto 2$, $2_t \mapsto 3$, $2_h \mapsto 4$, and it is encoded as $\pi = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}$.
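The labelling and encoding can be sketched as follows. This is an illustration only: the function names `label`, `encode` and `is_genome`, and the choice of a Python dict to hold the permutation, are assumptions made here. Telomeres are simply left out of the adjacency list and become fixed points.

```python
def label(gene, end):
    """Map an extremity to its numeric label: tail -> 2i - 1, head -> 2i."""
    return 2 * gene - 1 if end == "t" else 2 * gene


def encode(n, adjacency_pairs):
    """Encode a genome on n genes as a permutation of {1, ..., 2n}.

    adjacency_pairs: iterable of pairs of (gene, end) extremities.
    Extremities not mentioned (telomeres) stay fixed.
    """
    pi = {x: x for x in range(1, 2 * n + 1)}
    for a, b in adjacency_pairs:
        i, j = label(*a), label(*b)
        pi[i], pi[j] = j, i
    return pi


def is_genome(pi):
    """Check the defining property pi(i) = j <=> pi(j) = i."""
    return all(pi[pi[x]] == x for x in pi)


# The slide's example genome {1_t, (1_h, 2_h), 2_t}.
pi = encode(2, [((1, "h"), (2, "h"))])
```

For the example genome this yields the mapping 1→1, 2→4, 3→3, 4→2, matching the two-line notation on the slide.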

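DCJ operator – Our re-formulation For $i, j \in X$,
$$D_{ij}(\pi) = \begin{cases} (i\,j)\,\pi\,(i\,j) & \text{if } \pi = \dots (k\,i)(l\,j) \text{ and } k \neq i \text{ or } j \neq l, \\ (i\,j)\,\pi & \text{if } i \text{ and } j \text{ are fixed in } \pi \text{ or } \pi = \dots (i\,j). \end{cases}$$
◮ Clearly, $D_{ij} = D_{ji}$. ◮ Also, $D_{ij}^2$ is the identity.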
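A minimal sketch of the operator, assuming genomes are stored as dicts mapping each label to its partner (fixed points map to themselves). The names `from_pairs` and `dcj` are hypothetical. The case where $\pi$ contains $(i\,j)$ is tested first, so the conjugation branch is only reached when $i$ and $j$ lie in different cycles and at least one of them is not fixed.

```python
from itertools import combinations


def from_pairs(pairs, size):
    """Build an involution on {1, ..., size} from a list of 2-cycles."""
    pi = {x: x for x in range(1, size + 1)}
    for a, b in pairs:
        pi[a], pi[b] = b, a
    return pi


def dcj(pi, i, j):
    """Apply D_ij to the genomic permutation pi and return a new dict."""
    if i == j:
        raise ValueError("i and j must be distinct labels")

    def swap(x):
        # The transposition (i j) acting on a single label.
        return j if x == i else i if x == j else x

    if pi[i] == j or (pi[i] == i and pi[j] == j):
        # pi contains (i j), or both are fixed: left-multiply by (i j).
        return {x: swap(pi[x]) for x in pi}
    # Otherwise conjugate: (i j) pi (i j).
    return {x: swap(pi[swap(x)]) for x in pi}


pi_a = from_pairs([(1, 8), (2, 3), (4, 5), (6, 7)], 8)
step = dcj(pi_a, 2, 8)  # exchanges the partners of 2 and 8

# Spot-check the two properties stated on the slide for this genome.
symmetric = all(dcj(pi_a, i, j) == dcj(pi_a, j, i)
                for i, j in combinations(range(1, 9), 2))
involutive = all(dcj(dcj(pi_a, i, j), i, j) == pi_a
                 for i, j in combinations(range(1, 9), 2))
```

On this genome `step` is $(1,2)(8,3)(4,5)(6,7)$; the two boolean checks only confirm $D_{ij} = D_{ji}$ and $D_{ij}^2 = \mathrm{id}$ for one starting genome, not in general.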

KEY RESULTS

Key result # 1 – Structure of the group of $D_{ij}$s ◮ Let $\Gamma_n$ be the set of genomic permutations on $n$ regions. $D_{ij}$ is a bijection on $\Gamma_n$. ◮ Let $D$ be the subgroup of $S_{\Gamma_n}$ generated by the $D_{ij}$ operators. Let the cardinality of $\Gamma_n$ be $\gamma$. If $\gamma/2$ is even then $D$ is the alternating group of degree $\gamma$. Otherwise it is the symmetric group of degree $\gamma$. ◮ Conjecture: $\gamma/2$ is even $\forall n > 2$.
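The parity condition can be explored numerically. The sketch below assumes that $\Gamma_n$ is the set of all permutations of the $2n$ labels satisfying $\pi(i) = j \iff \pi(j) = i$ (identity included), so that $\gamma$ is the number of such permutations on $2n$ points, and it uses the standard recurrence $I(m) = I(m-1) + (m-1)\,I(m-2)$ for that count. The function names are illustrative. It checks small cases only and is not a proof of the conjecture.

```python
def involution_count(m):
    """Number of permutations p of m points with p(p(x)) = x for all x."""
    prev, curr = 1, 1  # I(0), I(1)
    for k in range(2, m + 1):
        # Point k is either fixed, or paired with one of the other k - 1 points.
        prev, curr = curr, curr + (k - 1) * prev
    return curr


def gamma(n):
    """Assumed cardinality of Gamma_n: involutions on the 2n extremity labels."""
    return involution_count(2 * n)


def half_gamma_is_even(n):
    return (gamma(n) // 2) % 2 == 0


small_values = [gamma(n) for n in range(1, 5)]
checked = {n: half_gamma_is_even(n) for n in range(1, 13)}
```

Under this assumption `small_values` is `[2, 10, 76, 764]`, and `checked` reports odd $\gamma/2$ for $n = 1, 2$ and even $\gamma/2$ for every $n$ from 3 to 12.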

Key result # 2 – Characterization of cycles and paths of $AG(A, B)$ Theorem. Let $A$ and $B$ be genomes and let $\alpha$ be a $k$-cycle in the product $\pi_A \pi_B$. If $\alpha$ contains a point that is fixed in $\pi_A$ or $\pi_B$, then the extremities in $\alpha$ form a path of length $k$ in $AG(A, B)$. If $\alpha$ does not contain any point that is fixed in $\pi_A$ or $\pi_B$, then let $\beta$ be the cycle in $\pi_A \pi_B$ that contains $\pi_B(i)$ for any $i \in \alpha$. Then $\alpha\beta$ is a cycle in $AG(A, B)$.

Characterization of cycles and paths of $AG(A, B)$ – example $\pi_A = (1,10)(2)(3,5)(4,7)(6)(8,9)$, $\pi_B = (1,8)(2,3)(4,6)(5,7)(9,10)$. [Figure: $AG(A, B)$ with genome $A$ vertices $1_h$ (2), $\{2_t, 3_t\}$ (3,5), $\{2_h, 4_t\}$ (4,7), $3_h$ (6), $\{4_h, 5_t\}$ (8,9), $\{5_h, 1_t\}$ (1,10) and genome $B$ vertices $\{1_h, 2_t\}$ (2,3), $\{4_t, 3_t\}$ (5,7), $\{2_h, 3_h\}$ (6,4), $\{1_t, 4_h\}$ (1,8), $\{5_h, 5_t\}$ (10,9).] $\pi_A \pi_B = (1,9)(8,10)(2,5,4,6,7,3)$
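The product in this example can be recomputed with a short sketch. It assumes the product $\pi_A \pi_B$ is read right to left (apply $\pi_B$ first), which is the convention that reproduces the cycles on the slide; the helper names are hypothetical.

```python
def from_cycles(cycles, size):
    """Build a permutation of {1, ..., size} (as a dict) from disjoint cycles."""
    perm = {x: x for x in range(1, size + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm


def compose(p, q):
    """Return the product p q, applying q first and then p."""
    return {x: p[q[x]] for x in q}


def nontrivial_cycles(perm):
    """List the cycles of length > 1, each starting at its smallest element."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen or perm[start] == start:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


pi_A = from_cycles([(1, 10), (3, 5), (4, 7), (8, 9)], 10)
pi_B = from_cycles([(1, 8), (2, 3), (4, 6), (5, 7), (9, 10)], 10)

product_cycles = nontrivial_cycles(compose(pi_A, pi_B))

# Cycles touching a fixed point of pi_A or pi_B correspond to paths in AG(A, B).
fixed = {x for x in pi_A if pi_A[x] == x or pi_B[x] == x}
path_cycles = [c for c in product_cycles if fixed & set(c)]
```

Here the 6-cycle $(2,5,4,6,7,3)$ contains the fixed points 2 and 6 of $\pi_A$, so it is the one cycle reported in `path_cycles`; the remaining two 2-cycles pair up into a cycle of the adjacency graph.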

Key result # 3 – DCJ Distance $d_{DCJ}(\pi_A, \pi_B) = \frac{l(\pi_A \pi_B)}{2} + \frac{E}{2}$, where $l(\pi_A \pi_B)$ is the length of $\pi_A \pi_B$ and $E$ is the number of cycles in $\pi_A \pi_B$ that move two fixed points of $\pi_A$ or of $\pi_B$.
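The distance formula can be evaluated with a small sketch. Two readings are assumed here and are not spelled out on the slide: the length $l(\sigma)$ is taken to be the minimal number of transpositions whose product is $\sigma$ (the sum of cycle length minus one over all cycles), and a cycle counts towards $E$ when it contains two fixed points of $\pi_A$ or two fixed points of $\pi_B$. The product is read right to left, and all function names are hypothetical.

```python
def from_cycles(cycles, size):
    """Build a permutation of {1, ..., size} (as a dict) from disjoint cycles."""
    perm = {x: x for x in range(1, size + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm


def nontrivial_cycles(perm):
    """Return the cycles of length > 1 as sets of points."""
    seen, result = set(), []
    for start in perm:
        if start in seen or perm[start] == start:
            continue
        cycle, x = set(), start
        while x not in cycle:
            cycle.add(x)
            x = perm[x]
        seen |= cycle
        result.append(cycle)
    return result


def dcj_distance(pi_a, pi_b):
    """Evaluate (l + E) / 2 under the assumptions stated in the lead-in."""
    product = {x: pi_a[pi_b[x]] for x in pi_b}  # apply pi_b first
    cycles = nontrivial_cycles(product)
    length = sum(len(c) - 1 for c in cycles)
    fixed_a = {x for x in pi_a if pi_a[x] == x}
    fixed_b = {x for x in pi_b if pi_b[x] == x}
    e = sum(1 for c in cycles
            if len(c & fixed_a) >= 2 or len(c & fixed_b) >= 2)
    return (length + e) // 2


# Four-gene pair used in the worked example at the end of the slides.
pi_a = from_cycles([(1, 8), (2, 3), (4, 5), (6, 7)], 8)
pi_b = from_cycles([(1, 2), (3, 4), (5, 6), (7, 8)], 8)
distance = dcj_distance(pi_a, pi_b)
```

For this pair the product has two 4-cycles and no fixed points are involved, so `distance` is 3, in line with the three operators applied in the worked example.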

Key result # 4 – Number of sorting scenarios Let $\pi_A$ and $\pi_B$ be genomic permutations on $n$ regions such that $\pi_B \pi_A$ encodes a cycle in the adjacency graph $AG(A, B)$. Then the number of optimal sorting scenarios between $\pi_A$ and $\pi_B$ is $n^{n-2}$.
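For small $n$ the count can be checked by brute force. The sketch below assumes that a sorting scenario is a sequence of distinct intermediate genomes, each obtained from the previous one by some $D_{ij}$, and that the number of steps is supplied by the caller (here $n - 1$ for a single adjacency-graph cycle on $n$ regions). The tuple representation and the names `from_pairs`, `dcj`, `neighbours` and `count_scenarios` are illustrative choices.

```python
from itertools import combinations


def from_pairs(pairs, size):
    """Involution on {1, ..., size} as a tuple; index 0 is an unused placeholder."""
    pi = list(range(size + 1))
    for a, b in pairs:
        pi[a], pi[b] = b, a
    return tuple(pi)


def dcj(pi, i, j):
    """Apply D_ij to a genome stored as a tuple (pi[x] is the partner of x)."""
    def swap(x):
        return j if x == i else i if x == j else x

    if pi[i] == j or (pi[i] == i and pi[j] == j):
        return tuple(swap(y) for y in pi)                     # (i j) pi
    return tuple(swap(pi[swap(x)]) for x in range(len(pi)))   # (i j) pi (i j)


def neighbours(pi):
    """All distinct genomes reachable from pi with one D_ij."""
    return {dcj(pi, i, j) for i, j in combinations(range(1, len(pi)), 2)}


def count_scenarios(src, dst, steps):
    """Count sequences of exactly `steps` operations leading from src to dst."""
    if steps == 0:
        return int(src == dst)
    return sum(count_scenarios(nxt, dst, steps - 1) for nxt in neighbours(src))


# n = 4 regions, the genomes forming one cycle in the adjacency graph.
pi_a = from_pairs([(1, 8), (2, 3), (4, 5), (6, 7)], 8)
pi_b = from_pairs([(1, 2), (3, 4), (5, 6), (7, 8)], 8)
scenarios = count_scenarios(pi_a, pi_b, 3)
```

For this pair the search finds 16 scenarios of length 3, which agrees with $n^{n-2} = 4^2$. The enumeration grows quickly with $n$, so it is only practical as a sanity check on small cases.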

An example Let $\pi_a = (1,8)(2,3)(4,5)(6,7)$ and $\pi_b = (1,2)(3,4)(5,6)(7,8)$. Then $D_{28}(\pi_a) = (1,2)(8,3)(4,5)(6,7)$, $D_{48} D_{28}(\pi_a) = (1,2)(4,3)(8,5)(6,7)$, and $D_{68} D_{48} D_{28}(\pi_a) = (1,2)(3,4)(5,6)(7,8) = \pi_b$.
