
From Sequences to Structures and Back: The Vienna RNA Package

From Sequences to Structures and Back The Vienna RNA Package Peter Schuster Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA Siemens PSE Life Science Symposium Brno,


  1. From Sequences to Structures and Back The Vienna RNA Package Peter Schuster Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA Siemens PSE Life Science Symposium Brno, 14.03.2006

  2. Web-Page for further information: http://www.tbi.univie.ac.at/~pks

  3. Definition of RNA structure. [Figure: chemical formula of the sugar-phosphate backbone (Na⁺ counterions) running from the 5'-end to the 3'-end, carrying the nucleobases N_1, N_2, N_3, N_4, with N_k = A, U, G, C.] Example sequence, 5'-end to 3'-end: GCGGAU UUA GCUC AGUUGGGA GAGC CCAGA G CUGAAGA UCUGG AGGUC CUGUG UUCGAUC CACAG A AUUCGC ACCA

  4. A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs

  5. A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs. Number of sequences: $N = 4^n$; number of structures: $N_S < 3^n$. Criterion: minimum free energy (mfe). Rules: _ ( _ ) _ ∈ { AU, CG, GC, GU, UA, UG }
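
The pairing rules on this slide can be checked mechanically. The following is a minimal sketch, assuming the symbolic notation is the usual dot-bracket string, with "." for an unpaired base and matching parentheses for a base pair; the function names `parse_dot_bracket` and `is_compatible` are illustrative and are not part of the Vienna RNA Package.

```python
# Allowed base pairs, taken from the rules on the slide.
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based with i < j."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """True if every base pair of the structure is allowed by the rules."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in parse_dot_bracket(structure))
```

For example, `parse_dot_bracket("((..))")` returns `[(0, 5), (1, 4)]`, and `is_compatible("GCAAGC", "((..))")` is true while `is_compatible("AAAAAA", "((..))")` is false.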

  6. Conventional definition of RNA secondary structures

  7. Restrictions on physically acceptable mfe-structures: $\lambda \geq 3$ and $\sigma \geq 2$

  8. Vienna RNA Package RNAfold RNAdistance RNAinverse RNAduplex RNAsubopt RNAeval RNAheat RNAcofold RNApdist RNAalifold RNAplot http://www.tbi.univie.ac.at/RNA/

  9. Sequence, structure, and design. RNA folding: RNA sequence → RNA structure of minimal free energy. Inputs: biophysical chemistry (thermodynamics and kinetics), empirical parameters. Context: structural biology, spectroscopy of biomolecules, understanding molecular function.

  10. The minimum free energy structures on a discrete space of conformations. [Figure: free energy diagram for one sequence, drawn from its 5'-end to its 3'-end, showing the minimum of free energy $S_0^{(h)}$ and the suboptimal conformations $S_1^{(h)}, \dots, S_9^{(h)}$.]

  11. Elements of RNA secondary structures as used in free energy calculations: stacks, hairpin loops, bulges, internal loops, multiloops, joints, and free ends.
     $\Delta G_0^{300} = \sum_{\text{stacks of base pairs}} g_{ij,kl} + \sum_{\text{hairpin loops}} h(n_l) + \sum_{\text{bulges}} b(n_b) + \sum_{\text{internal loops}} i(n_i) + \cdots$

  12. Sequence, structure, and design. RNA folding: RNA sequence → RNA structure of minimal free energy (structural biology, spectroscopy of biomolecules, understanding molecular function). Inverse folding of RNA: iterative determination of a sequence for the given secondary structure by the inverse folding algorithm (biotechnology, design of biomolecules with predefined structures and functions).

  13. Inverse folding algorithm
     $I_0 \to I_1 \to I_2 \to I_3 \to I_4 \to \dots \to I_k \to I_{k+1} \to \dots \to I_t$
     $S_0 \to S_1 \to S_2 \to S_3 \to S_4 \to \dots \to S_k \to S_{k+1} \to \dots \to S_t$
     $I_{k+1} = M_k(I_k)$ and $\Delta d_S(S_k, S_{k+1}) = d_S(S_{k+1}, S_t) - d_S(S_k, S_t) < 0$
     M ... base or base pair mutation operator
     $d_S(S_i, S_j)$ ... distance between the two structures $S_i$ and $S_j$
     "Unsuccessful trial" ... termination after n steps
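
The acceptance rule above (keep a mutation only if it brings the folded structure strictly closer to the target) can be sketched as an adaptive walk. This is an illustration under loudly stated assumptions, not the RNAinverse implementation: `toy_fold` is a stand-in that maximises the number of base pairs (a Nussinov-style recursion) instead of minimising free energy, `structure_distance` uses the Hamming distance between dot-bracket strings as one possible choice for $d_S$, only single-base mutations are tried, and the walk starts from a random sequence.

```python
import random

ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def toy_fold(seq):
    """Stand-in folding map (assumption): maximise the number of base pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                      # case: j unpaired
            for k in range(i, j - MIN_LOOP):            # case: k pairs with j
                if seq[k] + seq[j] in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    structure = ["."] * n
    todo = [(0, n - 1)]
    while todo:                                         # traceback
        i, j = todo.pop()
        if j - i <= MIN_LOOP:
            continue
        if best[i][j] == best[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            left = best[i][k - 1] if k > i else 0
            if (seq[k] + seq[j] in ALLOWED_PAIRS
                    and best[i][j] == left + 1 + best[k + 1][j - 1]):
                structure[k], structure[j] = "(", ")"
                todo.append((i, k - 1))
                todo.append((k + 1, j - 1))
                break
    return "".join(structure)


def structure_distance(s1, s2):
    """Assumed choice for d_S: Hamming distance of dot-bracket strings."""
    return sum(a != b for a, b in zip(s1, s2))


def inverse_fold(target, max_steps=2000, seed=0):
    """Adaptive walk; returns (sequence or None, accepted distances)."""
    rng = random.Random(seed)
    seq = [rng.choice("ACGU") for _ in target]
    dist = structure_distance(toy_fold("".join(seq)), target)
    trace = [dist]
    for _ in range(max_steps):
        if dist == 0:
            break
        pos = rng.randrange(len(seq))
        trial = seq.copy()
        trial[pos] = rng.choice("ACGU".replace(seq[pos], ""))  # mutation M
        trial_dist = structure_distance(toy_fold("".join(trial)), target)
        if trial_dist < dist:                           # accept only if closer
            seq, dist = trial, trial_dist
            trace.append(dist)
    # An unsuccessful trial is reported as None after max_steps mutations.
    return ("".join(seq) if dist == 0 else None), trace
```

Calling `inverse_fold("((((....))))")` returns either a sequence that `toy_fold` maps onto the target, or `None` for an unsuccessful trial; the second return value lists the strictly decreasing distances of the accepted steps. Replacing `toy_fold` with a real minimum free energy folding routine would bring the sketch closer to the algorithm described on the slide.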

  14. Approach to the target structure $S_k$ in the inverse folding algorithm. [Figure labels: initial trial sequences, intermediate compatible sequences, stop sequence of an unsuccessful trial, target sequence, target structure $S_k$.]

  15. Inverse folding of RNA secondary structures. The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion. [Figure: 1st to 5th trial.]

  16. Base pair probability derived from the partition function Q(T)
     Base pair probability: $p_{ij}(T) = \sum_k \gamma_k(T)\, a_{ij}(S_k) \,/\, Q(T)$, with $\gamma_k(T) = g_k\, e^{-(\varepsilon_k - \varepsilon_0)/kT}$
     Partition function: $Q(T) = \sum_k \gamma_k(T)$
     Base pairing entropy: $s_i = -\sum_j p_{ij} \ln p_{ij}$, with $p_{ii} = 1 - \sum_{j \neq i} p_{ij}$
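
The formulas on this slide can be evaluated directly for a small, explicitly enumerated set of structures. The sketch below makes several assumptions that are not stated on the slide: structures are given as dot-bracket strings, $a_{ij}(S_k)$ is read as 1 if structure $S_k$ contains the pair (i, j) and 0 otherwise, energies are in kcal/mole so that kT is taken per mole, and the temperature defaults to 37 °C. The function names are illustrative.

```python
import math

# Boltzmann constant per mole (gas constant) in kcal / (mol K).
K_B_KCAL = 0.0019872


def base_pairs(structure):
    """Set of base pairs (i, j), 0-based, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def pair_probabilities(ensemble, temperature=310.15):
    """ensemble: list of (dot_bracket, energy in kcal/mole, degeneracy g_k).

    Returns the symmetric matrix p with p[i][j] the probability of pair
    (i, j) and p[i][i] the probability that base i is unpaired.
    """
    e0 = min(energy for _, energy, _ in ensemble)
    kt = K_B_KCAL * temperature
    gammas = [g * math.exp(-(energy - e0) / kt) for _, energy, g in ensemble]
    q = sum(gammas)                                   # partition function Q(T)
    n = len(ensemble[0][0])
    p = [[0.0] * n for _ in range(n)]
    for (structure, _, _), gamma in zip(ensemble, gammas):
        for i, j in base_pairs(structure):            # a_ij(S_k) = 1
            p[i][j] += gamma / q
            p[j][i] += gamma / q
    for i in range(n):
        p[i][i] = 1.0 - sum(p[i][j] for j in range(n) if j != i)
    return p


def pairing_entropy(p):
    """s_i = -sum_j p_ij ln p_ij, with the sum including j = i."""
    return [-sum(x * math.log(x) for x in row if x > 0.0) for row in p]
```

With a hypothetical two-structure ensemble of equal energy, `[("((...))", -1.0, 1), (".......", -1.0, 1)]`, each base pair of the first structure gets probability 0.5, and the paired positions have entropy ln 2 while the loop positions have entropy 0.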

  17. Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function: UUGGAGUACACAACCUGUACACUCUUUC (n = 28)

  18. "Dot plot" of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n = 28) with low energy suboptimal configurations. Sequence: UUGGAGUACACAACCUGUACACUCUUUC. Minimum free energy configuration: $\Delta G_0 = -5.39$ kcal/mole. First suboptimal configuration: $\Delta E_{0 \to 1} = 0.50$ kcal/mole. Second suboptimal configuration: $\Delta E_{0 \to 2} = 0.55$ kcal/mole.

  19. Phenylalanyl-tRNA as an example for the computation of the partition function

  20. tRNA^phe without modified bases. First suboptimal configuration: $\Delta E_{0 \to 1} = 0.43$ kcal/mole.

  21. tRNA^phe with modified bases. Sequence, 5'-end to 3'-end: GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA. First suboptimal configuration: $\Delta E_{0 \to 1} = 0.94$ kcal/mole.

  22. Reliability measures for structure prediction
     Base pair probability: $p_{ij}(T) = \sum_k \gamma_k(T)\, a_{ij}(S_k) \,/\, Q(T)$, with $\gamma_k(T) = g_k\, e^{-(\varepsilon_k - \varepsilon_0)/kT}$
     Partition function: $Q(T) = \sum_k \gamma_k(T)$
     Base pairing entropy: $s_i = -\sum_j p_{ij} \ln p_{ij}$, with $p_{ii} = 1 - \sum_{j \neq i} p_{ij}$

  23. Base pairing entropy and base pair probability in a model RNA molecule

  24. Reliability of structure prediction in tRNA^phe. [Figure: base pairing entropy and base pair probability along the nucleotides, without modification and with modification.]

  25. Reliability of structure prediction in 5S ribosomal RNA. [Figure panels: base pairing entropy, base pair probability, native structure.]

  26. Formulation of kinetic RNA folding as a stochastic process
     The folding algorithm: A sequence I specifies an energy ordered set of compatible structures S(I): $S(I) = \{S_0, S_1, \dots, S_m, O\}$. A trajectory $T_k(I)$ is a time ordered series of structures in S(I). A folding trajectory is defined by starting with the open chain O and ending with the global minimum free energy structure $S_0$ or a metastable structure $S_k$ which represents a local energy minimum:
     $T_0(I) = \{O, S(1), \dots, S(t-1), S(t), S(t+1), \dots, S_0\}$
     $T_k(I) = \{O, S(1), \dots, S(t-1), S(t), S(t+1), \dots, S_k\}$
     Master equation: $\frac{dP_k}{dt} = \sum_{i=0}^{m+1} \left( P_{ik}(t) - P_{ki}(t) \right) = \sum_{i=0}^{m+1} k_{ik} P_i - P_k \sum_{i=0}^{m+1} k_{ki}$, $k = 0, 1, \dots, m+1$
     Transition probabilities $P_{ij}(t) = \mathrm{Prob}\{S_i \to S_j\}$ are defined by
     $P_{ij}(t) = P_i(t)\, k_{ij} = P_i(t) \exp(-\Delta G_{ij}/2RT) \,/\, \Sigma_i$
     $P_{ji}(t) = P_j(t)\, k_{ji} = P_j(t) \exp(-\Delta G_{ji}/2RT) \,/\, \Sigma_j$
     $\Sigma_k = \sum_{i=1,\, i \neq k}^{m+2} \exp(-\Delta G_{ki}/2RT)$
     The symmetric rule for transition rate parameters is due to Kawasaki (K. Kawasaki, Diffusion constants near the critical point for time dependent Ising models. Phys. Rev. 145:224-230, 1966).
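
To make the master equation concrete, the sketch below builds the rate parameters $k_{ij}$ from the symmetric (Kawasaki) rule with the normalisation $\Sigma_i$ shown on the slide and integrates $dP_k/dt$ with explicit Euler steps. Assumptions not taken from the slide: $\Delta G_{ij}$ is read as $G_j - G_i$, the temperature is 37 °C, the allowed moves are supplied as a neighbour list, and the three-state landscape in the usage note (open chain, one metastable structure, the mfe structure) uses made-up energies.

```python
import math

RT = 0.0019872 * 310.15  # kcal/mole at an assumed temperature of 37 °C


def kawasaki_rates(energies, neighbours):
    """k[i][j] = exp(-(G_j - G_i) / 2RT) / Sigma_i for each allowed move i -> j."""
    n = len(energies)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = {j: math.exp(-(energies[j] - energies[i]) / (2.0 * RT))
                   for j in neighbours[i]}
        sigma_i = sum(weights.values())      # normalisation Sigma_i
        for j, weight in weights.items():
            k[i][j] = weight / sigma_i
    return k


def integrate(k, p0, dt=0.01, steps=10000):
    """Explicit Euler steps for dP_s/dt = sum_i (k_is P_i - k_si P_s)."""
    n = len(p0)
    p = list(p0)
    for _ in range(steps):
        flow = [sum(k[i][s] * p[i] - k[s][i] * p[s] for i in range(n))
                for s in range(n)]
        p = [p[s] + dt * flow[s] for s in range(n)]
    return p
```

A possible call is `integrate(kawasaki_rates([0.0, -2.0, -5.0], {0: [1], 1: [0, 2], 2: [1]}), [1.0, 0.0, 0.0])`, which starts all probability in the open chain (state 0) and lets it flow through the intermediate state 1 towards state 2. The total probability stays at 1 because every term leaving one state enters another. A stochastic simulation of single trajectories would be the closer analogue of a folding trajectory $T_k(I)$; the deterministic integration here only illustrates the rate equations.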

  27. Base pair formation and base pair cleavage moves for nucleation and elongation of stacks. This move set corresponds to the base pair distance $d_P(S_1, S_2)$.

  28. Base pair closure, opening and shift moves correspond to the Hamming distance $d_H(S_1, S_2)$. Base pair shift move of class 1: shift inside internal loops or bulges.
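
The two distances named on slides 27 and 28 can be computed from dot-bracket strings. This is a minimal sketch under two assumptions: the base pair distance $d_P$ counts the base pairs present in exactly one of the two structures, and the Hamming distance $d_H$ is taken between the dot-bracket strings themselves. The function names are illustrative.

```python
def base_pairs(structure):
    """Set of base pairs (i, j), 0-based, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def base_pair_distance(s1, s2):
    """d_P: number of base pairs found in exactly one of the two structures."""
    return len(base_pairs(s1) ^ base_pairs(s2))


def hamming_distance(s1, s2):
    """d_H: number of positions where the dot-bracket strings differ."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(s1, s2))
```

Opening one base pair, as in `"((...))"` versus `"(.....)"`, gives `base_pair_distance` 1 and `hamming_distance` 2. A shift of one pairing partner, as in `"(....)."` versus `"(.....)"`, gives 2 for both: in base pair terms it is one cleavage plus one formation.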

  29. Search for local minima in conformation space. [Figure: free energy $\Delta G_0$ of the suboptimal conformations $S_1^{(h)}, S_2^{(h)}, S_5^{(h)}, S_6^{(h)}, S_7^{(h)}, S_9^{(h)}$; $S_h$ marks a local minimum.]

  30. Definition of a "barrier tree". [Figure: free energy $\Delta G_0$ along a "reaction coordinate" and the corresponding "barrier tree", with structures $S_k$ and a saddle point.]
