Sequences, structures, shapes, and conformations
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria, and The Santa Fe Institute, Santa Fe, New Mexico, USA
RNA 2006, Benasque, 17.–27.07.2006
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
tRNA^phe: sequence and molecular structure
tRNA^phe: secondary structure is a shape
Number of sequences: $N = 4^n$. Number of secondary structures: $N_S < 3^n$.
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ $\in$ {AU, CG, GC, GU, UA, UG}
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
Sequence space
[Figure: two copies of the sequence CGTCGTTACAATTTA GTTATGTGCGAATTC CAAATT AAAA ACAAGAG....., whose four marked positions carry G, A, G, T in one copy and A, C, A, C in the other]
Hamming distance: $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
The Hamming distance between sequences induces a metric in sequence space.
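The three metric properties can be checked directly on small examples. The following Python sketch is illustrative only: the function name `hamming_distance` and the three sequences are hypothetical and are not the ones shown on the slide.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical example sequences of equal length (assumption, not from the slide).
I1 = "CGTCGTTACAATTTAG"
I2 = "CGTGGTTAAAATTCAC"
I3 = "CGTGGTTACAATTTAG"

d12 = hamming_distance(I1, I2)
d13 = hamming_distance(I1, I3)
d32 = hamming_distance(I3, I2)

print("d(I1,I2) =", d12)
print("(i)   identity: ", hamming_distance(I1, I1) == 0)
print("(ii)  symmetry: ", d12 == hamming_distance(I2, I1))
print("(iii) triangle: ", d12 <= d13 + d32)
```

The same function applies unchanged to structures written in parentheses notation, since it only compares strings position by position.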
Every point in sequence space is equivalent. Sequence space of binary sequences with chain length n = 5.
Sequence space and structure space
Hamming distance: $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space.
Two measures of distance in shape space: Hamming distance between structures, $d_H(S_i, S_j)$, and base pair distance, $d_P(S_i, S_j)$
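The two distances generally give different values for the same pair of structures. The sketch below assumes the usual definition of base pair distance, the number of base pairs present in exactly one of the two structures; the function names and the example structures are hypothetical.

```python
def pair_set(structure: str) -> set:
    """Set of base pairs (i, j), i < j, encoded by a parentheses (dot-bracket) string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r}")
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def base_pair_distance(s1: str, s2: str) -> int:
    """Number of base pairs contained in exactly one of the two structures."""
    return len(pair_set(s1) ^ pair_set(s2))


def hamming_distance(s1: str, s2: str) -> int:
    """Number of differing symbols in the two parentheses strings."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(s1, s2))


# Hypothetical example: the same three-pair stem shifted by one position.
S1 = "(((...))).."
S2 = ".(((...)))."
print("d_H =", hamming_distance(S1, S2), " d_P =", base_pair_distance(S1, S2))
```

In this example no base pair is shared, so the base pair distance counts all six pairs, while the strings differ in only four symbols.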
Structures are not equivalent in structure space. Sketch of structure space.
Compatible structures, suboptimal conformations
Reference for the definition of the intersection and the proof of the intersection theorem
Structure $S_k$, neutral network $G_k$, compatible set $C_k$, with $G_k \subset C_k$.
The compatible set $C_k$ of a structure $S_k$ consists of all sequences which form $S_k$ as their minimum free energy structure (the neutral network $G_k$) or as one of their suboptimal structures.
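Membership in a compatible set can be tested without any energy calculation: a sequence is compatible with a structure if every base pair of the structure is one of the six allowed pairs. The Python sketch below assumes this pairing criterion only; the function name `is_compatible` and the example sequences are hypothetical.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def pair_set(structure: str) -> set:
    """Set of base pairs (i, j), i < j, of a parentheses (dot-bracket) string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def is_compatible(sequence: str, structure: str) -> bool:
    """True if every base pair of the structure is an allowed pair in the sequence."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in pair_set(structure))


# Hypothetical examples.
print(is_compatible("GGGAAAUCC", "(((...)))"))  # all three pairs are allowed
print(is_compatible("GGGAAAUCC", "((.....))"))  # a suboptimal candidate, also compatible
print(is_compatible("GGGAAAACC", "(((...)))"))  # the innermost pair would be G-A
```

Compatibility is a necessary condition only: whether a compatible structure is the minimum free energy structure or a suboptimal one has to be decided by a folding algorithm.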
Structure $S_0$, structure $S_1$.
The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$
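One way to make the statement plausible is to construct a sequence in the intersection explicitly. The sketch below is an illustration and is not taken from the referenced proof. It assumes that each position pairs with at most one partner per structure, so the union of the two pairing patterns consists of paths and even cycles and can be coloured alternately with G and U; both GU and UG are allowed pairs. The function names are hypothetical.

```python
from collections import deque

ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def pair_set(structure: str) -> set:
    """Set of base pairs (i, j), i < j, of a parentheses (dot-bracket) string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def is_compatible(sequence: str, structure: str) -> bool:
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in pair_set(structure))


def sequence_in_intersection(s0: str, s1: str) -> str:
    """Build one sequence that is compatible with both structures."""
    if len(s0) != len(s1):
        raise ValueError("structures must have equal length")
    n = len(s0)
    partners = [[] for _ in range(n)]          # pairing partners in either structure
    for structure in (s0, s1):
        for i, j in pair_set(structure):
            partners[i].append(j)
            partners[j].append(i)
    seq = [None] * n
    for start in range(n):
        if seq[start] is not None:
            continue
        if not partners[start]:
            seq[start] = "A"                   # unpaired in both structures: any base works
            continue
        seq[start] = "G"
        queue = deque([start])
        while queue:                           # breadth-first two-colouring with G and U
            i = queue.popleft()
            for j in partners[i]:
                if seq[j] is None:
                    seq[j] = "U" if seq[i] == "G" else "G"
                    queue.append(j)
    return "".join(seq)


# Hypothetical pair of structures that share some pairing positions.
S0 = "(((....)))......"
S1 = "......(((....)))"
X = sequence_in_intersection(S0, S1)
print(X, is_compatible(X, S0), is_compatible(X, S1))
```

The constructed sequence uses only G-U pairs, so it shows non-emptiness of the intersection but says nothing about which of the two structures has the lower free energy.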
Kinetic folding of RNA as a Markov process
Kinetic folding of RNA secondary structures
Christoph Flamm, Walter Fontana, Ivo L. Hofacker, Peter Schuster. RNA folding kinetics at elementary step resolution. RNA 6:325-338, 2000
Christoph Flamm, Ivo L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. Design of multistable RNA molecules. RNA 7:254-265, 2001
Michael T. Wolfinger, W. Andreas Svrcek-Seiler, Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler. Efficient computation of RNA folding dynamics. J.Phys.A: Math.Gen. 37:4731-4741, 2004
Base pair formation and base pair cleavage moves for nucleation and elongation of stacks. This move set corresponds to the base pair distance $d_P(S_1, S_2)$.
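The formation and cleavage move set can be enumerated for a given sequence and structure. The following Python sketch rests on several assumptions that are not stated on the slide: only the six standard pairs are allowed, hairpin loops contain at least three unpaired bases (`MIN_LOOP`), and new pairs must not cross existing ones. All names are hypothetical.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def pair_set(structure: str) -> set:
    """Set of base pairs (i, j), i < j, of a parentheses (dot-bracket) string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def to_dot_bracket(n: int, pairs) -> str:
    chars = ["."] * n
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


def neighbours(sequence: str, structure: str) -> list:
    """All structures reachable by cleaving or forming exactly one base pair."""
    n = len(sequence)
    pairs = pair_set(structure)
    result = [to_dot_bracket(n, pairs - {p}) for p in sorted(pairs)]  # cleavage moves
    paired = {k for p in pairs for k in p}
    for i in range(n):
        if i in paired:
            continue
        for j in range(i + MIN_LOOP + 1, n):
            if j in paired or sequence[i] + sequence[j] not in ALLOWED_PAIRS:
                continue
            # (i, j) crosses (a, b) if exactly one of i, j lies strictly inside (a, b)
            if any((a < i < b) != (a < j < b) for a, b in pairs):
                continue
            result.append(to_dot_bracket(n, pairs | {(i, j)}))        # formation move
    return result


# Hypothetical example sequence.
SEQ = "GGGAAAUCC"
for s in neighbours(SEQ, "((.....))"):
    print(s)
```

Every neighbour produced this way differs from the starting structure by exactly one base pair, which is why this move set matches the base pair distance.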
Base pair shift move of class 1: shift inside internal loops or bulges. The move set of base pair closure, opening, and shift corresponds to the Hamming distance $d_H(S_1, S_2)$.
Base pair shift move of class 2: shift involves free ends. The move set of base pair closure, opening, and shift corresponds to the Hamming distance $d_H(S_1, S_2)$.
The kinetic folding algorithm
Formulation of kinetic RNA folding as a stochastic process
A sequence $X$ specifies an energy-ordered set of compatible structures $S(X)$:
$S(X) = \{S_0, S_1, \ldots, S_{m-1}, O\}$
A trajectory $T_k(X)$ is a time-ordered series of structures in $S(X)$. A folding trajectory is defined by starting with the open chain $O$ and ending with the global minimum free energy structure, $S_0$, or a metastable structure, $S_k$, which represents a local energy minimum:
$T_0(X) = \{O, S(1), \ldots, S(t-1), S(t), S(t+1), \ldots, S_0\}$
$T_k(X) = \{O, S(1), \ldots, S(t-1), S(t), S(t+1), \ldots, S_k\}$
A description of the folding process is obtained through sampling a large number of trajectories.
When no stopping structure, $S_0$ or $S_k$, is defined, the long-time distribution of conformations is the Boltzmann ensemble.
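Sampling a single trajectory can be sketched as a continuous-time Monte Carlo walk. The code below is a minimal illustration under stated assumptions, not the algorithm of the cited papers: it uses a Metropolis rule for the rates, a value of RT of 0.6 kcal/mol, and a tiny hand-made landscape (`ENERGY`, `NEIGHBOURS`) in place of real RNA structures. All names and numbers are hypothetical.

```python
import math
import random


def fold_trajectory(start, stop, energy, neighbours, rng, rt=0.6, max_steps=10_000):
    """Sample one trajectory as a list of (time, state) pairs.

    Rates follow the Metropolis rule min(1, exp(-dE/RT)), an assumption of this
    sketch. Waiting times are exponential with the total exit rate.
    """
    state, t = start, 0.0
    trajectory = [(t, state)]
    for _ in range(max_steps):
        if state in stop:
            break
        moves = neighbours[state]
        rates = [min(1.0, math.exp(-(energy[s] - energy[state]) / rt)) for s in moves]
        total = sum(rates)
        t += rng.expovariate(total)                  # time until the next move
        state = rng.choices(moves, weights=rates)[0]  # move chosen proportional to rate
        trajectory.append((t, state))
    return trajectory


# Hypothetical toy landscape: open chain O, intermediates A and B,
# a metastable structure S1 and the minimum free energy structure S0.
ENERGY = {"O": 0.0, "A": -1.0, "B": -2.0, "S1": -3.0, "S0": -5.0}
NEIGHBOURS = {"O": ["A", "B"], "A": ["O", "S1"], "B": ["O", "S0"],
              "S1": ["A"], "S0": ["B"]}

rng = random.Random(2006)
trajectories = [fold_trajectory("O", {"S0"}, ENERGY, NEIGHBOURS, rng) for _ in range(200)]
mean_time = sum(tr[-1][0] for tr in trajectories) / len(trajectories)
print("mean first passage time to S0:", round(mean_time, 2))
```

Trajectories that first fall into S1 take longer to reach S0, so the distribution of folding times over many samples reflects the metastable structure.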
Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC
Stochastic variables: $N_j(t)$, the number of molecules with conformation $S_j$ at time $t$, with $\sum_{j=0}^{m} N_j(t) = N$
Probabilities: $P_n^{(j)}(t) = \mathrm{Prob}\{N_j(t) = n\}$, $n = 0, 1, \ldots, N$
Expectation values: $\langle N_j(t) \rangle = \sum_{n=0}^{N} n\, P_n^{(j)}(t) = N\, p_j(t)$
Master equation for single steps $n \to n \pm 1$:
$$\frac{dP_n^{(j)}(t)}{dt} = \sum_{l=0,\, l \neq j}^{m} \Big\{ P_{jl}(t)\, P_{n-1}^{(j)}(t) + k_{lj}\,(n+1)\, P_{n+1}^{(j)}(t) - k_{lj}\, n\, P_n^{(j)}(t) - P_{jl}(t)\, P_n^{(j)}(t) \Big\}, \quad j = 0, 1, \ldots, m$$
Transition probabilities: $P_{jl}(t)\, dt = \mathrm{Prob}\{S_l \to S_j \mid t \le \tau \le t + dt\}$, with
$$P_{jl}(t) = \sum_{i=0}^{N} k_{jl}\, i\, P_i^{(l)}(t) = k_{jl}\, \langle N_l(t) \rangle = k_{jl}\, N\, p_l(t)$$
Rate constants:
$$k_{jl} = k_0\, \frac{\exp(-\Delta G_{jl}/RT)}{\sum_{k=0,\, k \neq l}^{m} \exp(-\Delta G_{kl}/RT)}, \qquad \Delta G_{jl} = \Delta G_j^0 - \Delta G_l^0$$
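$$\frac{d\langle N_j \rangle}{dt} = N\, \frac{dp_j}{dt} = \sum_{n=0}^{N} n\, \frac{dP_n^{(j)}}{dt} = \sum_{l=0,\, l \neq j}^{m} \sum_{n=0}^{N} n\, \Big\{ k_{jl}\, N p_l\, P_{n-1}^{(j)} + k_{lj}\,(n+1)\, P_{n+1}^{(j)} - k_{lj}\, n\, P_n^{(j)} - k_{jl}\, N p_l\, P_n^{(j)} \Big\}$$
$$\frac{dp_j}{dt} = \sum_{l=0,\, l \neq j}^{m} k_{jl}\, p_l - p_j \sum_{l=0,\, l \neq j}^{m} k_{lj}, \qquad j = 0, 1, 2, \ldots, m$$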
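The deterministic equation for the probabilities, dp_j/dt = sum over l of k_jl p_l minus p_j times the sum over l of k_lj, can be integrated numerically for a small set of conformations. The sketch below uses explicit Euler steps and, as an assumption of this example only, Metropolis rates between neighbouring states of a three-state chain; with such rates the stationary solution is the Boltzmann distribution. The energies, the adjacency, and RT = 0.6 kcal/mol are hypothetical.

```python
import math


def metropolis_rates(g, adjacency, rt=0.6, k0=1.0):
    """Rate matrix k[j][l] for the transition l -> j (Metropolis rule, an assumption)."""
    m = len(g)
    k = [[0.0] * m for _ in range(m)]
    for l in range(m):
        for j in adjacency[l]:
            k[j][l] = k0 * min(1.0, math.exp(-(g[j] - g[l]) / rt))
    return k


def integrate(k, p0, dt, steps):
    """Explicit Euler integration of dp_j/dt = sum_l (k_jl p_l - k_lj p_j)."""
    p = list(p0)
    m = len(p)
    for _ in range(steps):
        dp = [sum(k[j][l] * p[l] - k[l][j] * p[j] for l in range(m) if l != j)
              for j in range(m)]
        p = [p[j] + dt * dp[j] for j in range(m)]
    return p


def boltzmann(g, rt=0.6):
    weights = [math.exp(-x / rt) for x in g]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical three-state chain: open chain (0), intermediate (1), ground state (2).
G = [0.0, -1.0, -2.0]                 # free energies in kcal/mol
ADJACENCY = {0: [1], 1: [0, 2], 2: [1]}

K = metropolis_rates(G, ADJACENCY)
P_END = integrate(K, [1.0, 0.0, 0.0], dt=0.01, steps=5000)
print("p(t = 50):", [round(x, 4) for x in P_END])
print("Boltzmann:", [round(x, 4) for x in boltzmann(G)])
```

Explicit Euler is adequate here because the time step is small compared with the inverse of the largest rate; a stiff landscape with widely separated rates would call for an implicit or matrix-exponential method.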
Definition of a 'barrier tree'
[Figure: free energy $\Delta G$ plotted along a "reaction coordinate", with local minima $S_k$ and saddle points $T_k$, next to the corresponding "barrier tree"]
An RNA switch: JN1LH
[Figure: the sequence GGGUGGAAC CACGAG GUUC CACGAG GAAC CACGAG GUUCCUCCC, with positions 3, 13, 23, 33, and 44 marked and segments labelled R, 1D, and 2D, drawn in alternative hairpin conformations; free energies of -28.6, -28.2, -28.6, and -31.8 kcal·mol^-1 are shown next to the structures]
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation. Nucleic Acids Res. 34:3568-3576, 2006.
J1LH barrier tree
[Figure: barrier tree with local minima numbered 1 to 50; vertical axis: free energy in kcal/mol, from -26.0 to -50.0; barrier heights are annotated on the branches]
A ribozyme switch E.A.Schultes, D.B.Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule which is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute