
Computational aspects of ncRNA research, Mihaela Zavolan



  1. Computational aspects of ncRNA research. Mihaela Zavolan, Biozentrum, Basel; Swiss Institute of Bioinformatics

  2. Computational aspects of ncRNA research. Bacterial ncRNAs: • Gene discovery • Target discovery • Discovery of transcription regulatory elements for ncRNAs

  3. Computational aspects of ncRNA research. miRNAs: • Gene discovery: automated annotation, gene prediction • Expression profiling: sample comparisons, visualization • Target discovery: modeling miRNA-mRNA target interaction • Characterization of regulatory networks involving RNAs: miRNA target prediction, prediction of transcription regulatory elements

  4. Computational aspects of ncRNA research. siRNAs (design): • Optimization of silencing efficacy • Minimization of off-target effects

  5. ncRNA gene prediction. Main feature: RNA secondary structure is important. Look for evidence of selection on the secondary structure (example sequences: GGACaag GUCC GUGCucauGUAC; GGACag GUUC GUAUuuu GUAC). • Identification of pairs of sites with high mutual information: mutations that are fixed in evolution preserve RNA structure (covariance models behind tRNAscan-SE (S. Eddy), RNAalifold (I. Hofacker)). • Structure stabilization: proportion of miRNA sequences with a P-value less than specified threshold (Bonnet et al. (2004) Bioinformatics 20:2911).
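A minimal sketch of the mutual-information idea on this slide, assuming a gap-free toy alignment. The function name `mutual_information` and the sequences in `toy_alignment` are illustrative and not taken from the slides. Two columns that covary so that base pairing is preserved score high; invariant columns score zero.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: column 1 and column 9 change together so that
# they always form a Watson-Crick pair (G-C, C-G, A-U, U-A).
toy_alignment = [
    "GGACAAGGUCC",
    "GCACAAGGUGC",
    "GAACAAGGUUC",
    "GUACAAGGUAC",
]

covarying = mutual_information(toy_alignment, 1, 9)   # compensatory changes
invariant = mutual_information(toy_alignment, 0, 10)  # no variation at all
```

Perfectly conserved pairs carry no mutual information, which is one reason tools such as RNAalifold combine covariation with thermodynamic folding rather than relying on mutual information alone.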

  6. ncRNA gene prediction. Main feature: RNA secondary structure is important. Look for evidence of selection on the secondary structure. mir-100 is expected to preserve its hairpin secondary structure through the various steps of miRNA biogenesis (figure panels: 300-50, 200-200, 50-50).

  7. Prediction of bacterial ncRNAs

  8. Promoter regions recognized by the σ70 subunit of E. coli (figure labels: σ factor binding site, TATA box) http://cwx.prenhall.com/horton/medialib/media_portfolio/

  9. RNA hairpins regulate transcription termination http://cwx.prenhall.com/horton/medialib/media_portfolio/

  10. Conserved secondary structures of Vibrio ncRNAs Lenz et al. - The small RNA chaperone Hfq and multiple small RNAs control quorum sensing in Vibrio harveyi and Vibrio cholerae. Cell 118:69-82 (2004).

  11. miRNA gene discovery. Studies driven by experiment: 1. large-scale cloning 2. functional annotation 3. miRNA gene prediction 4. validation (Houbaviy et al., 2003 - mouse; Dostie et al., 2003 - rat; Aravin et al., 2003 - fly; Suh et al., 2004 - man; Pfeffer et al., 2004 - man, viruses). Laborious, exhaustive. Studies driven by computation: 1. genome-wide computational prediction 2. validation (Lai et al., 2003 - fly; Lim et al., 2003 - worm; Lim et al., 2003 - vertebrates; Berezikov et al., 2005 - vertebrates; Pfeffer et al., 2005 - viruses). Fast, incomplete.

  12. Functional annotation of small RNAs. Small (16-30 nc) cloned RNAs → ALIGNMENT against sequences with known function (mRNA, rRNA, tRNA, miRNA, etc.) and against the genome sequence.

  13. Functional annotation of small RNAs. Small (16-30 nc) cloned RNAs.

  14. Functional annotation of small RNAs. Small (16-30 nc) cloned RNAs → match known sequences → rRNA, tRNA, miRNA, mRNA.

  15. Functional annotation of small RNAs. Small (16-30 nc) cloned RNAs → match known sequences → rRNA, tRNA, miRNA, mRNA; match genome → novel miRNAs (multiple copies, hairpin, conservation).

  16. Functional annotation of small RNAs. Small (16-30 nc) cloned RNAs → match known sequences → rRNA, tRNA, miRNA, mRNA; match genome → novel miRNAs (multiple copies, hairpin, conservation); multiple genome matches → rasiRNA.
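The annotation cascade built up over slides 12-16 could be sketched roughly as follows. This is a simplification under stated assumptions: exact substring matching stands in for the alignment step, only one strand is searched, and the hairpin and conservation checks for novel miRNAs are not shown. The names `annotate` and `count_occurrences`, the category labels, and all toy sequences are hypothetical.

```python
def count_occurrences(query, text):
    """Count exact, possibly overlapping occurrences of query in text."""
    count = 0
    start = text.find(query)
    while start != -1:
        count += 1
        start = text.find(query, start + 1)
    return count


def annotate(read, known, genome):
    """Assign a small cloned RNA to a category, in the order of the flowchart."""
    # Step 1: match against sequences with known function.
    for category, sequences in known.items():
        if any(read in sequence for sequence in sequences):
            return category
    # Step 2: match against the genome sequence.
    hits = count_occurrences(read, genome)
    if hits == 0:
        return "unmatched"
    if hits > 1:
        return "rasiRNA candidate"  # multiple genome matches
    # Single genome match; hairpin and conservation checks would follow here.
    return "novel miRNA candidate"


# Hypothetical toy data, written in the RNA alphabet for simplicity.
known = {
    "tRNA": ["GCGGAUUUAGCUCAGUUGGG"],
    "miRNA": ["UGAGGUAGUAGGUUGUAUAGUU"],
}
genome = "AAAACCCCGGGGUUUU" + "ACACACGUGUGU" + "AAAACCCCGGGGUUUU"

labels = [annotate(read, known, genome)
          for read in ("UGAGGUAGUAGG", "CCCCGGGG", "ACACACGUGUGU", "GGGGGGGGGG")]
```

A real pipeline would tolerate mismatches and search both strands; the point here is only the order of the decisions.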

  17. miRNA gene prediction. Issues: • find the locations in the genome that can give rise to miRNAs • predict the sequence of the mature miRNA. Main clue: miRNA precursors form stem loop structures. He & Hannon (Nat. Rev. Genet. 2004)

  18. ... so do many other genomic regions (figure panels: fragment of protein-coding gene, let-7a, mir-147).

  19. miRNA gene prediction using SVM Build a model from positive and negative examples. Detect candidate stem loops in (large) genomic sequences. Classify candidate stem loops using the model.

  20. miRNA gene prediction using SVM. Features, illustrated on hsa-let-7c (L = 84, dG = -33.5 kcal/mole) and on a negative stem (L = 68, dG = -22.6 kcal/mole): longest symmetrical region; longest slightly asymmetrical region; average distance between loops. Nucleotide composition: A - 20%, C - 19%, G - 29%, U - 32%. Paired nucleotides: A-U - 31%, G-U - 14%, G-C - 29%. Proportion of nucleotides in: symmetrical loops - 17%, asymmetrical loops - 4%. Pfeffer et al. 2005
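Some of the simpler features listed on this slide (nucleotide composition, proportion of nucleotides in each type of base pair) can be computed directly from a sequence and its dot-bracket structure. The sketch below assumes the structure has already been predicted by a folding program. The names `pair_table` and `stem_loop_features` and the toy hairpin are illustrative; free energy and the longest symmetrical regions are not computed here.

```python
from collections import Counter

# Names of the pair types reported on the slide.
PAIR_NAMES = {
    frozenset("AU"): "A-U",
    frozenset("GU"): "G-U",
    frozenset("GC"): "G-C",
}


def pair_table(structure):
    """Map every paired position to its partner, from dot-bracket notation."""
    stack, pairs = [], {}
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            partner = stack.pop()
            pairs[pos] = partner
            pairs[partner] = pos
    return pairs


def stem_loop_features(sequence, structure):
    """Composition and pairing features of a candidate stem loop."""
    n = len(sequence)
    composition = {nt: sequence.count(nt) / n for nt in "ACGU"}
    pairs = pair_table(structure)
    pair_counts = Counter()
    for i, j in pairs.items():
        if i < j:  # count each base pair once
            name = PAIR_NAMES.get(frozenset((sequence[i], sequence[j])))
            if name is not None:
                pair_counts[name] += 1
    # Proportion of all nucleotides that sit in each kind of pair.
    paired = {name: 2 * pair_counts[name] / n for name in PAIR_NAMES.values()}
    return {
        "length": n,
        "n_pairs": len(pairs) // 2,
        "composition": composition,
        "paired": paired,
    }


# Hypothetical toy hairpin: 5 base pairs closing a 4-nucleotide loop.
features = stem_loop_features("GGAUCGAAAGAUUC", "(((((....)))))")
```

A vector built from such values is the kind of input a classifier such as an SVM expects for each candidate stem loop.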

  21. miRNA gene prediction using SVM. Positives: human genomic regions containing known miRNAs. Negatives: mRNAs, rRNAs, tRNAs, viral stem loops. Features with largest negative weights: free energy; nr. nc. in symmetrical loops in LSAR; nr. nc. in asymmetrical loops in LSAR; avg. size of asymmetrical loops. Features with largest positive weights: stem length; length of longest symmetrical region; nr. A-U pairs in LSAR; nr. G-C pairs in LSAR. 29% false negatives, 3% false positives. Used SVMlight http://svmlight.joachims.org/

  22. Detecting candidate stem loops. Search for stems whose secondary structure remains the same irrespective of their flanking sequences. Example: hsa-mir-100 (figure panels: 300-50, 200-200, 50-50). 86% of the known human microRNAs belong to such robust stems. Density of robust stems in human genome: approximately 1 every 10 kb.
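The robustness test could be sketched as below. It assumes that the labels 300-50, 200-200 and 50-50 are upstream and downstream flank lengths in nucleotides, which the slide does not state explicitly. The `fold` argument is a hypothetical callable that returns a dot-bracket string for a sequence, for example a wrapper around an external folding program. `toy_fold` is a stand-in so that the example runs; it is not a real folding algorithm.

```python
def is_robust_stem(genome, start, end, fold,
                   flanks=((300, 50), (200, 200), (50, 50))):
    """True if genome[start:end] gets the same structure in every flank context."""
    structures = set()
    for upstream, downstream in flanks:
        left = max(0, start - upstream)
        right = min(len(genome), end + downstream)
        structure = fold(genome[left:right])
        # Keep only the part of the predicted structure covering the stem.
        structures.add(structure[start - left:end - left])
    return len(structures) == 1


def toy_fold(sequence, motif="GGGGAAAACCCC", shape="((((....))))"):
    """Stand-in folder: marks one fixed hairpin motif, leaves the rest unpaired."""
    structure = ["."] * len(sequence)
    at = sequence.find(motif)
    if at != -1:
        structure[at:at + len(motif)] = shape
    return "".join(structure)


# Hypothetical genome with one hairpin motif at positions 400-411.
genome = "A" * 400 + "GGGGAAAACCCC" + "A" * 400
robust = is_robust_stem(genome, 400, 412, toy_fold)
```

With a real folding program the flanking sequence can change the predicted structure of the stem, and candidates for which that happens would be discarded.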

  23. Classification of candidate stem loops. Candidate stem: L = 78, dG = 31.6 kcal/mole (figure labels: LSR, LSAR), SVM score: 0.8. miRNA precursor? Yes: miR-UL1 of CMV (cloning frequency: 101).

  24. Application: miRNA gene prediction in viruses Identification of microRNAs of the herpesvirus family. Nature Methods (2005).

  25. Sensitivity-specificity plots for evaluating the performance of prediction programs. $Sn = \frac{TP}{TP + FN}$, $Sp = \frac{TP}{TP + FP}$
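A short sketch of how the points of such a plot can be obtained, using the definitions on this slide: Sn = TP / (TP + FN) and Sp = TP / (TP + FP). Note that Sp as defined here is what is often called precision or positive predictive value. The scores, labels and thresholds below are made up for illustration.

```python
def sensitivity(tp, fn):
    """Fraction of true miRNAs that are predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """Fraction of predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def sn_sp_curve(scores, labels, thresholds):
    """One (threshold, Sn, Sp) point per score threshold."""
    curve = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        if tp + fp == 0 or tp + fn == 0:
            continue  # Sn or Sp is undefined at this threshold
        curve.append((t, sensitivity(tp, fn), specificity(tp, fp)))
    return curve


# Hypothetical classifier scores and true labels (True = real miRNA).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
curve = sn_sp_curve(scores, labels, thresholds=[0.5, 0.2])
```

Lowering the threshold recovers more true miRNAs (higher Sn) at the cost of more false predictions (lower Sp), which is the trade-off the plots display.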


  29. Variations on miRNA gene prediction. [equation] Lim, L. P. et al. (2003) Genes & Dev. 17:991

  30. Variations on miRNA gene prediction. Berezikov, E. et al. (2005) Cell 120:21. Proportion of miRNA sequences with a P-value less than specified threshold (Bonnet et al. (2004) Bioinformatics 20:2911)

  31. Variations on miRNA gene prediction. Xie, X. et al. (2005) Nature 434:338

  32. miRNA gene prediction servers http://genes.mit.edu/mirscan/ http://www.mirz.unibas.ch

  33. Prediction of ncRNAs using comparative genomics. RNAz (www.tbi.univie.ac.at/~wash/RNAz) • Start with an alignment of homologous sequences • Compute the following features: - mean free energy of aligned sequences - structure conservation index, $\mathrm{SCI} = E_A / \bar{E}$ - mean pairwise identity - number of sequences in the alignment • Use an SVM to classify candidates. $E_A$ is the free energy of the alignment (takes into account mutations that preserve the structure), and $\bar{E}$ is the mean free energy of the aligned sequences.
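Two of the listed features are easy to illustrate once the free energies are available. The sketch below takes the energies as given numbers; in practice they would come from folding programs, and the values used here are invented. The function names are illustrative, and the identity definition (columns where both sequences have a gap are ignored) is one common convention, assumed here rather than taken from RNAz.

```python
from itertools import combinations


def mean_pairwise_identity(alignment):
    """Average fraction of identical columns over all pairs of aligned sequences."""
    identities = []
    for a, b in combinations(alignment, 2):
        # Ignore columns in which both sequences have a gap.
        columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
        matches = sum(1 for x, y in columns if x == y)
        identities.append(matches / len(columns))
    return sum(identities) / len(identities)


def structure_conservation_index(alignment_energy, single_energies):
    """SCI: free energy of the alignment divided by the mean single-sequence energy."""
    mean_energy = sum(single_energies) / len(single_energies)
    return alignment_energy / mean_energy


# Hypothetical alignment and free energies (kcal/mol).
alignment = ["GGACAAGGUCC", "GGACAAGGUCC", "GGACAAGGUUC"]
mpi = mean_pairwise_identity(alignment)
sci = structure_conservation_index(-9.0, [-10.0, -9.0, -11.0])
```

An SCI close to 1 means the sequences fold into the shared structure almost as well as into their individual optimal structures; values near 0 indicate no common fold.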

  34. Modeling miRNA-mRNA interaction for target prediction. Known miRNA-mRNA interactions in C. elegans (hybrids generated using RNAhybrid, http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/):
      target: C.e_hbl-1, miRNA: cel-let-7
      target 5' U           GUU C      A 3'
                 AUUAUACAACC   C ACCUCA
                 UGAUAUGUUGG   G UGGAGU
      miRNA  3' U           AU  A        5'
      target: C.e_LIN-41A, miRNA: cel-let-7
      target 5'  U          AUU       U 3'
                  UUAUACAACC   CUGCCUC
                  GAUAUGUUGG   GAUGGAG
      miRNA  3' UU          AU        U 5'
      target: C.e_COG-1A, miRNA: cel-lsy-6
      target 5'      C  CA           A 3'
                      GU  CUUAUACAAAA
                      CG  GAGUAUGUUUU
      miRNA  3' GCUUUA  CA             5'

  35. Modeling miRNA-mRNA interaction Use evolutionary conservation to determine what defines an miRNA target site. • Define an interaction model (e.g. the first 8 nucleotides of the miRNA have to be perfectly paired with their mRNA target site).

  36. Modeling miRNA-mRNA interaction Use evolutionary conservation to determine what defines an miRNA target site. • Define an interaction model (e.g. the first 8 nucleotides of the miRNA have to be perfectly paired with their mRNA target site). • Determine the locations of all candidate sites in a reference species (e.g. human).

  37. Modeling miRNA-mRNA interaction Use evolutionary conservation to determine what defines an miRNA target site. • Define an interaction model (e.g. the first 8 nucleotides of the miRNA have to be perfectly paired with their mRNA target site). • Determine the locations of all candidate sites in a reference species (e.g. human). • Determine the number of these candidate sites that are conserved in a set of species that have the miRNA.
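The three steps above can be sketched as follows, under simplifying assumptions: the interaction model is perfect Watson-Crick pairing of the first 8 miRNA nucleotides (no G-U pairs), and the 3' UTRs of the different species are given as gap-free aligned sequences of equal length, so that the same index refers to the same site. The function names and the UTR sequences are hypothetical; the miRNA is let-7.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, seed_length=8):
    """Target sequence that pairs perfectly with the first seed_length nucleotides."""
    seed = mirna[:seed_length]
    return "".join(COMPLEMENT[nt] for nt in reversed(seed))


def find_sites(utr, site):
    """Start positions of all exact occurrences of site in utr."""
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


def count_conserved_sites(mirna, aligned_utrs, reference="human"):
    """Return (candidate sites in the reference, sites conserved in all species)."""
    site = seed_site(mirna)
    ref_positions = find_sites(aligned_utrs[reference], site)
    others = [seq for name, seq in aligned_utrs.items() if name != reference]
    conserved = [p for p in ref_positions
                 if all(seq[p:p + len(site)] == site for seq in others)]
    return len(ref_positions), len(conserved)


let7 = "UGAGGUAGUAGGUUGUAUAGUU"
# Hypothetical aligned 3' UTRs; the second site is mutated in mouse.
aligned_utrs = {
    "human": "AAACUACCUCAGGGAAACUACCUCAUU",
    "mouse": "AAACUACCUCAGGGAAACUACGUCAUU",
    "rat":   "AAACUACCUCAGGCAAACUACCUCAUU",
}
n_candidates, n_conserved = count_conserved_sites(let7, aligned_utrs)
```

Comparing the conserved fraction across alternative interaction models is what allows evolutionary conservation to indicate which model best describes functional target sites.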
