outline

CSE 527, Lecture 17, 11/24/04: RNA Secondary Structure Prediction - PowerPoint PPT Presentation



  1. CSE 527, Lecture 17, 11/24/04: RNA Secondary Structure Prediction
     Outline
     • What is it
     • How is it represented
     • Why is it important
     • Examples
     • Approaches
     RNA Structure
     • Primary Structure: Sequence
     • Secondary Structure: Pairing
     • Tertiary Structure: 3D shape
     RNA Pairing
     • Watson-Crick Pairing
       • C - G ~ 3 kcal/mole
       • A - U ~ 2 kcal/mole
     • “Wobble Pair” G - U ~ 1 kcal/mole
     • Non-canonical Pairs (esp. if modified)

  2. A tRNA 3D Structure
     [Figure: tRNA 3D structure, with the 5’ end, 3’ end, and anticodon loop labeled]
     tRNA - Alt. Representations
     [Figures: alternative drawings of the same tRNA, each with the 5’ end, 3’ end, and anticodon loop labeled]
     Why?
     • RNAs fold, and function
     • Nature uses what works

  3. Importance
     • Ribozymes (RNA Enzymes)
     • Retroviruses
     • Effects on transcription, translation, splicing...
     • Functional RNAs: rRNA, tRNA, snRNA, snoRNA, micro RNA, RNAi, riboswitches, regulatory elements in 3’ & 5’ UTRs, ...
     [Figure: nucleotide-level drawing of an example folded RNA]
     RNA Pairing
     • Watson-Crick Pairing
       • C - G ~ 3 kcal/mole
       • A - U ~ 2 kcal/mole
     • “Wobble Pair” G - U ~ 1 kcal/mole
     • Non-canonical Pairs (esp. if modified)
     Definitions
     • Sequence 5’ r_1 r_2 r_3 ... r_n 3’ in {A, C, G, U}
     • A Secondary Structure is a set of pairs i•j s.t.
       1. i < j-4
       2. if i•j & i’•j’ are two pairs with i ≤ i’, then
          A. i = i’ & j = j’, or
          B. j < i’, or
          C. i < i’ < j’ < j
          That is, the first pair precedes the 2nd, or the 2nd is nested within the first. No “pseudoknots.”
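The definition above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `is_secondary_structure` and the representation of a structure as a list of 1-based `(i, j)` tuples are assumptions made for illustration. It tests condition 1 on every pair and condition 2 on every two distinct pairs.

```python
from itertools import combinations


def is_secondary_structure(pairs):
    """Check a set of 1-based (i, j) pairs against the slide's definition.

    Hypothetical helper for illustration; not from the lecture itself.
    """
    # Deduplicate and sort, so any two distinct pairs taken in order
    # satisfy i <= i2 (case A, identical pairs, is handled by dedup).
    ordered = sorted(set(pairs))

    # Condition 1: i < j - 4 for every pair.
    if any(not (i < j - 4) for i, j in ordered):
        return False

    # Condition 2: the first pair precedes the second (B),
    # or the second is nested inside the first (C).
    for (i, j), (i2, j2) in combinations(ordered, 2):
        precedes = j < i2
        nested = i < i2 < j2 < j
        if not (precedes or nested):
            return False  # crossing pairs (pseudoknot) or a shared base
    return True
```

For example, `[(1, 20), (2, 19)]` (nested) and `[(1, 10), (11, 20)]` (one pair precedes the other) pass, while the crossing pairs `[(1, 10), (5, 15)]` form a pseudoknot and are rejected. Note that condition 2 also rules out a base taking part in two different pairs.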

  4. [Figure: three panels illustrating pair relationships, labeled Nested, Precedes, and Pseudoknot]
     A Pseudoknot
     [Figure: drawing of a pseudoknot in a short sequence with labeled 3’ and 5’ ends]
     Approaches to Structure Prediction
     • Maximum Pairing
       + works on single sequences
       + simple
       - too inaccurate
     • Minimum Energy
       + works on single sequences
       - ignores pseudoknots
       - only finds “optimal” fold
     • Partition Function
       + finds all folds
       - ignores pseudoknots
     Approaches, II
     • Comparative sequence analysis
       + handles all pairings (incl. pseudoknots)
       - requires several (many?) aligned, appropriately diverged sequences
     • Stochastic Context-free Grammars
       Roughly combines min energy & comparative, but no pseudoknots
     • Physical experiments (x-ray crystallography, NMR)

  5. Nussinov: Max Pairing
     • B(i,j) = # pairs in optimal pairing of r_i ... r_j
     • B(i,j) = 0 for all i, j with i ≥ j-4; otherwise
     • B(i,j) = max of:
       1. B(i+1,j)
       2. B(i,j-1)
       3. B(i+1,j-1) + (if r_i pairs with r_j then 1 else 0)
       4. max { B(i,k)+B(k+1,j) | i < k < j }
     Time: O(n^3)
     “Optimal pairing of r_i ... r_j”
     Several (overlapping, but exhaustive) possibilities:
     1. r_i is unpaired; look at best way to pair r_{i+1} ... r_j
     2. r_j is unpaired; look at best way to pair r_i ... r_{j-1}
     3. They pair with each other, so 1 + best r_{i+1} ... r_{j-1}
     4. They pair, but not to each other; i pairs with k for some i < k < j, so look at best r_i ... r_k + best r_{k+1} ... r_j (don’t need to look at other k; why?)
     [Figures: one diagram per case, showing positions i, i+1, k, k+1, j-1, j]
     Pair-based Energy Minimization
     • E(i,j) = energy of pairs in optimal pairing of r_i ... r_j
     • E(i,j) = 0 for all i, j with i ≥ j-4; otherwise
     • E(i,j) = min of:
       • E(i+1,j)
       • E(i,j-1)
       • E(i+1,j-1) + e(r_i, r_j), where e(r_i, r_j) is the energy of one pair
       • min { E(i,k)+E(k+1,j) | i < k < j }
     Time: O(n^3)
     Loop-based Energy Minimization
     • Detailed experiments show it’s more accurate to model based on loops, rather than just pairs
     • Loop types
       1. Hairpin loop
       2. Stack
       3. Bulge
       4. Interior loop
       5. Multiloop
     [Figure: example structure with the five loop types numbered 1-5]
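The max-pairing recurrence can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the lecture: the names `can_pair`, `nussinov`, and `traceback` are hypothetical, indices are 0-based (the slides use 1-based), and treating the G-U wobble pair as an allowed pair alongside C-G and A-U is an assumption.

```python
def can_pair(a, b):
    """Assumed pairing rule: Watson-Crick (C-G, A-U) plus G-U wobble."""
    return {a, b} in ({"C", "G"}, {"A", "U"}, {"G", "U"})


def nussinov(seq, min_gap=4):
    """Fill B, where B[i][j] = max number of pairs within seq[i..j]."""
    n = len(seq)
    B = [[0] * n for _ in range(n)]           # B(i,j) = 0 when i >= j-4
    for span in range(min_gap + 1, n):        # span = j - i, shortest first
        for i in range(n - span):
            j = i + span
            best = max(B[i + 1][j], B[i][j - 1])           # cases 1 and 2
            bonus = 1 if can_pair(seq[i], seq[j]) else 0
            best = max(best, B[i + 1][j - 1] + bonus)      # case 3
            for k in range(i + 1, j):                      # case 4
                best = max(best, B[i][k] + B[k + 1][j])
            B[i][j] = best
    return B


def traceback(B, seq, min_gap=4):
    """Recover one optimal set of (i, j) pairs from the filled table."""
    pairs, stack = [], [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_gap:
            continue
        if B[i][j] == B[i + 1][j]:
            stack.append((i + 1, j))
        elif B[i][j] == B[i][j - 1]:
            stack.append((i, j - 1))
        elif can_pair(seq[i], seq[j]) and B[i][j] == B[i + 1][j - 1] + 1:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if B[i][j] == B[i][k] + B[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return sorted(pairs)


B = nussinov("GGGAAAACCC")
print(B[0][9])                       # 3
print(traceback(B, "GGGAAAACCC"))    # [(0, 9), (1, 8), (2, 7)]
```

The three nested loops (over span, i, and k) give the O(n^3) running time stated on the slide. The traceback uses an explicit stack instead of recursion only to avoid Python's recursion limit on long sequences.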

  6. Base Pairs and Stacking
     [Figure: the bases cytosine, uracil, thymine, guanine, and adenine]
     Loop Examples
     [Figure: examples of the loop types]
     Zuker: Loop-based Energy, I
     • W(i,j) = energy of optimal pairing of r_i ... r_j
     • V(i,j) = as above, but forcing pair i•j
     • W(i,j) = V(i,j) = ∞ for all i, j with i ≥ j-4
     • W(i,j) = min(W(i+1,j), W(i,j-1), V(i,j), min { W(i,k)+W(k+1,j) | i < k < j })

  7. Zuker: Loop-based Energy, II
     • V(i,j) = min(eh(i,j), es(i,j)+V(i+1,j-1), VBI(i,j), VM(i,j))
       The four terms are the hairpin, stack, bulge/interior, and multi-loop cases, respectively.
     • VM(i,j) = min { W(i,k)+W(k+1,j) | i < k < j }
     • VBI(i,j) = min { ebi(i,j,i’,j’) + V(i’,j’) | i < i’ < j’ < j & i’-i+j-j’ > 2 }
     Time: O(n^4); O(n^3) possible if ebi(.) is “nice”
     Suboptimal Energy
     • There are always alternate folds with near-optimal energies. Thermodynamics predicts that populations of identical molecules will exist in different folds; individual molecules even flicker among different folds.
     • Zuker’s algorithm can be modified to find suboptimal folds.
     • McCaskill gives a more elaborate dynamic programming algorithm calculating the “partition function,” which defines the probability distribution over all these states.
     Example of suboptimal folding
     [Figure: dot plot. Black dots: pairs in opt fold. Colored dots: pairs in folds 2-5% worse than optimal fold.]
     [Figure: Two competing secondary structures for the Leptomonas collosoma spliced leader mRNA.]
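To make the link between fold energies and a probability distribution concrete, here is a small Python sketch of Boltzmann weighting over an explicitly listed handful of folds. It is not McCaskill's dynamic programming algorithm, which sums over all folds without enumerating them. The function name `boltzmann_probabilities`, the example energies, and the default temperature of 310.15 K (37 °C) are assumptions made for illustration; energies are taken to be in kcal/mole.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_probabilities(energies, temperature_k=310.15):
    """Return P(fold) proportional to exp(-E / RT) for each listed energy.

    Hypothetical helper: assumes the listed folds are the only states.
    """
    rt = R_KCAL * temperature_k
    e_min = min(energies)
    # Shifting by e_min avoids overflow and leaves the ratios unchanged.
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function, up to the factor exp(e_min / rt)
    return [w / z for w in weights]


# Hypothetical energies for an optimal fold and two suboptimal folds.
probs = boltzmann_probabilities([-10.0, -9.5, -8.0])
print(probs)
```

Under these assumptions the optimal fold is the most probable, but folds within a fraction of a kcal/mole of it still carry a substantial share of the probability, which is the point the slide makes about near-optimal alternatives.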

  8. A “Mountain” diagram
