
Current Trends: Non-coding RNAs



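  1. Current Trends: Non-coding RNAs
     “Central Dogma” of molecular biology
     [Diagram: DNA (replication) → RNA → Protein → Cellular functions. Labels: reverse transcriptase (RNA virus; in vitro) on the RNA-to-DNA step; mRNA on the route through protein; ncRNA on the route from RNA to cellular functions.]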

  2. Non-coding RNAs
     • Found in prokaryotes (small RNAs) and eukaryotes (non-coding RNAs)
     • Well-characterized examples: tRNA, rRNA
     Non-coding RNAs
     • Enzymatic activity: self-splicing introns, peptidyl transfer, viral replication
     • Regulation of other genes
       - eukaryotes: 21-25 nt; microRNAs
       - prokaryotes: 50-550 nt; small RNAs

  3. Eukaryotic vs. Prokaryotic ncRNAs
     RNAi, RNAa… [Figure; source: Gottesman, Trends in Genetics 21:399-404]
     ncRNAs can regulate gene expression at many steps
     [Figure: red = bacterial, blue = eukaryotes; source: Storz et al., Ann. Rev. Biochem. 74:199-217]

  4. Targets of RNA Gene Regulation
     messenger RNA: UAGCAUGUACGUAGCUAGCUACGAUUGUUAUUACUGUCGUGCUUUCACUUCUCGCAGGAGUCCUCGUAUGGUA
     [Figure: the RNA gene drawn as a folded stem-loop structure]
     Targets of RNA Gene Regulation
     [Figure: the RNA gene base-paired to the messenger RNA, with mismatches and unpaired bulges along the duplex]
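The pairing between an RNA gene and its messenger RNA target on this slide can be explored with a small script. The following is a minimal sketch, not a method from the slides: it assumes an ungapped, antiparallel duplex (the slide's duplex has bulges, which this ignores), and the function name `best_duplex` and the toy sequences are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair found in RNA duplexes.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def best_duplex(srna, mrna):
    """Return (offset, n_paired) for the best ungapped antiparallel match.

    Both sequences are written 5'->3'. The sRNA is reversed so that
    position i of the reversed sRNA faces mrna[offset + i].
    """
    srna = srna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    rev = srna[::-1]
    best = (0, 0)
    for offset in range(len(mrna) - len(rev) + 1):
        window = mrna[offset:offset + len(rev)]
        paired = sum((a + b) in PAIRS for a, b in zip(rev, window))
        if paired > best[1]:
            best = (offset, paired)
    return best

# Toy example (assumed sequences): GAGCC is the reverse complement of GGCUC.
print(best_duplex("GAGCC", "AAAGGCUCAAA"))  # (3, 5)
```

A realistic target search would allow gaps and score pairing energy rather than counting pairs; the count is used here only to keep the idea visible.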

  5. Non-coding RNAs are elusive
     • Not annotated in genomes: lack of defined sequence features
     • Small, often missed in genetic studies
     • Missed in assays for protein function
     • None of the 70-100 E. coli ncRNAs were found by mutation
     [Figure: folded ncRNA stem-loop structure]
     Drosophila bantam gene discovery
     • Overexpression of an intergenic region causes cell and tissue overgrowth
     • Deletion of the intergenic region surrounding the EP element results in slow growth
     [Figure: locus map; labels: EP, 41 kb]

  6. bantam encodes a miRNA that regulates a conserved growth pathway
     [Pathway diagram: hippo, salvador, warts (growth suppressors); yorkie (growth promoter); bantam (growth promoter); outcomes: growth, cell death]
     Why study ncRNAs in bacteria?
     • We live in a bacterial world
     • Bacteria serve as useful model organisms
     • Bacteria are diverse
     • Understanding bacteria is useful in many important applications
     [Image labels, bacterial diseases: Food Poisoning, Strep Throat, Tuberculosis, Diphtheria, Dysentery, The Black Plague, Yaws, Whooping Cough, Typhoid Fever, Lyme Disease, Pneumonia, Botulism, Meningitis, Scarlet Fever, Syphilis, Gastroenteritis, Gonorrhea, Dental Cavities, Peptic Ulcers, Cholera, Anthrax, Rheumatic Fever, Rocky Mountain Spotted Fever, Tetanus, Leprosy]

  7. Shewanella oneidensis
     • Gram-negative γ-proteobacterium
     • Found primarily in deep-water anaerobic habitats
     • Can use a wide variety of compounds as terminal electron acceptors
     • Bioremediation potential: reduces soluble chromium and uranium to insoluble forms
     S. oneidensis genome overview
     • 45.9% G-C content; 85.5% of genome is coding
     • 5131416 bp total: chromosome is 4969803 bp; pMR-1 is 161613 bp
     Total genes: 5066
       tRNA and rRNA genes: 128 (2.5% of total genes)
       Protein-coding genes: 4938 (97.5% of total genes)
         Genes assigned function: 2915 (59% of protein-coding genes)
         Conserved hypothetical genes: 864 (17.5%)
         Hypothetical genes: 1159 (23.5%)

  8. Computational ncRNA prediction
     • Most ncRNAs are intergenic, function in trans
     • Bacterial ncRNAs have promoters, terminators
     • Genes have distinct nucleotide composition
     • Conserved secondary structures (stem-loop)
     • Tiling microarray data
     • Data can be integrated into a set of predictions
     ncRNA Genes
     [Figure: block of genomic DNA sequence in which ncRNA genes are to be located]
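Because most ncRNAs are intergenic, a prediction pipeline of this kind usually starts by listing the gaps between annotated genes. The following is a minimal sketch under stated assumptions: gene annotations are given as 1-based inclusive `(start, end)` pairs, the replicon is treated as linear (a circular chromosome's wrap-around gap is ignored), and the function name and the `min_len=50` default (borrowed from the 50 nt lower bound quoted for prokaryotic small RNAs) are illustrative choices, not part of the slides.

```python
def intergenic_regions(genes, genome_length, min_len=50):
    """Return (start, end) gaps between annotated genes, 1-based inclusive."""
    regions = []
    prev_end = 0  # end coordinate of the furthest-reaching gene seen so far
    for start, end in sorted(genes):
        # Gap between the previous gene end and this gene start.
        if start - prev_end - 1 >= min_len:
            regions.append((prev_end + 1, start - 1))
        # max() handles overlapping or nested genes.
        prev_end = max(prev_end, end)
    # Gap between the last gene and the end of the sequence.
    if genome_length - prev_end >= min_len:
        regions.append((prev_end + 1, genome_length))
    return regions

# Hypothetical annotation: two overlapping genes and one separate gene.
genes = [(100, 400), (350, 900), (1200, 1500)]
print(intergenic_regions(genes, 2000))
# [(1, 99), (901, 1199), (1501, 2000)]
```

Each returned region could then be scored with the composition, conservation, and expression evidence described in the following slides.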

  9. 1) Nucleotide composition
     [Figure: block of genomic DNA sequence]
     Frequency of nucleotides / Frequency of dinucleotides
     [Tables; two nucleotide frequency profiles are recoverable:]
       A 0.28, C 0.22, G 0.22, T 0.28
       A 0.23, C 0.27, G 0.27, T 0.23
     2) Comparative Genomics: Mutation Patterns
     [Figure: the same block of sequence, Genome (1), shown with a block from Genome (2); one similar segment is set apart in each:]
       Genome (1): GCGATCAGGAAGACCCTCGCGGAGAACCTGAAAGCACGACATTGCTCACATTGCTTCCAGTATTACTTAGCCAGCCGGGTGCTGGCTTTTT
       Genome (2): GCGATCAGCAAGACCCACGAGGAGAACCTGAAGCACGACATTGCTCAATTGCTTCCAGATTACGTAGCCAGGGCCGGGTGCTGGTTTTT
     Species listed: Shewanella amazonensis, Shewanella baltica, Shewanella denitrificans, Shewanella frigidimarina, Shewanella loihica, Vibrio cholerae, Yersinia pestis, Photorhabdus luminescens, Photobacterium profundum
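The nucleotide and dinucleotide frequencies named under "1) Nucleotide composition" can be computed with a few lines of code. This is a minimal sketch, assuming overlapping k-mers counted on a single strand; the function name `kmer_frequencies` and the toy sequence are illustrative and not taken from the slides.

```python
from collections import Counter

def kmer_frequencies(seq, k=1):
    """Return the relative frequency of every overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

toy = "AACCGGTT"  # assumed toy sequence
print(kmer_frequencies(toy, k=1))  # each nucleotide at 0.25
print(kmer_frequencies(toy, k=2))  # seven dinucleotides, each at 1/7
```

Comparing the profile of a candidate region against the profile of the whole genome is one simple way to ask whether the region has a distinct composition.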

  10. 2) Mutation Patterns that Conserve RNA Structure
      Derive score based on:
      • # of compensatory mutations
      • Length of sequence
      • Sequence structure
      [Figure: aligned stem-loop sequences in which paired bases change together; source: Rivas and Eddy, BMC Bioinformatics 2:8]
      2) Score of Conserved RNA Structure
      [Histogram: x-axis, score (-15 to 24); y-axis, frequency (probability) of score (0.00 to 0.10); series: documented RNAs, intergenic regions, EVD for documented RNAs, EVD for intergenic regions]
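A compensatory mutation is a change on both sides of a base pair that keeps the pair intact (for example A-U in one genome, G-C in the other). The sketch below counts such events for two aligned sequences and a known structure. It is only an illustration of the counting idea, not the scoring model of Rivas and Eddy; the dot-bracket input, the function names, and the toy sequences are assumptions.

```python
# Allowed RNA pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def paired_positions(dot_bracket):
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs

def count_compensatory(seq1, seq2, dot_bracket):
    """Count base pairs where both bases differ between the two aligned
    sequences and both sequences can still form the pair."""
    if not (len(seq1) == len(seq2) == len(dot_bracket)):
        raise ValueError("sequences and structure must have equal length")
    s1 = seq1.upper().replace("T", "U")
    s2 = seq2.upper().replace("T", "U")
    n = 0
    for i, j in paired_positions(dot_bracket):
        both_changed = s1[i] != s2[i] and s1[j] != s2[j]
        both_pair = (s1[i], s1[j]) in PAIRS and (s2[i], s2[j]) in PAIRS
        if both_changed and both_pair:
            n += 1
    return n

# Toy hairpin (assumed): the A-U pair in seq1 is G-C in seq2.
print(count_compensatory("GACGAAAACGUC", "GGCGAAAACGCC", "((((....))))"))  # 1
```

A usable score would also normalize by sequence length and weigh the structure itself, as the slide's list indicates.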

  11. 3) DNA Microarray Data
      Examine the correlation of expression for probes near one another in the genome, comparing:
      1) intergenic (IG) regions likely to produce RNA
      2) IG regions much less likely to produce RNA
      3) Correlation of Transcript Expression
      [Histogram: x-axis, correlation coefficient (-1 to 0.9); y-axis, # of IG probes (0 to 90); series: correlation in non-operon IG regions, correlation in operon IG regions]
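The correlation of expression between neighboring probes can be sketched as a Pearson correlation over their expression profiles across experiments. The slides do not state which correlation measure was used, so Pearson is an assumption here, as are the function names and the toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def neighbor_correlations(profiles):
    """Correlate each probe with the next one along the genome.

    profiles: list of expression vectors ordered by genomic position.
    """
    return [pearson(profiles[i], profiles[i + 1])
            for i in range(len(profiles) - 1)]

# Toy data (assumed): three probes measured in four experiments.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
print(neighbor_correlations(profiles))  # close to [1.0, -1.0]
```

An intergenic probe that correlates strongly with its neighbors is likely co-transcribed with them, while a weakly correlated but expressed probe is a better candidate for an independent transcript.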
