
Introduction to RNA-Seq. David Wood, Winter School in Mathematics and Computational Biology, July 1, 2013


  1. Introduction to RNA-Seq. David Wood, Winter School in Mathematics and Computational Biology, July 1, 2013

  2. RNA is... Diverse, Dynamic, Central. [Figure: DNA, epigenetics, RNA (rRNA, tRNA, mRNA), and protein; plot of abundance over time]

  3. RNA is... Diverse, Dynamic, Central. Studies are Qualitative, Integrative, Quantitative. Aims: understand the molecular basis of gene function; classify and transform cellular states. [Figure: as on slide 2]

  4. RNA studies involve... Biological System, Questions, Project, Available Technology, Resources (DB, ~/bin)

  5. RNA studies involve... Biological System, Questions, Project, Available Technology, Resources (DB, ~/bin). This talk: focusing on reference-based mammalian RNA-seq analysis

  6. Transcriptional Complexity. [Figure: gene locus with multiple transcription start sites, translation start sites, and polyadenylation sites. Legend: genomic DNA, spliced intron, protein coding regions, non-coding regions, microRNAs; TSS = transcription start site; ATG = translation start site; pA = polyadenylation signal; AAA = polyadenylation]

  7. Transcriptional Complexity. Same figure and legend, with PASR, miRNA, and tiRNA added.

  8. Transcriptional Complexity. Same figure and legend, with PASR, miRNA, tiRNA, and Alu added.

  9. Transcriptional Complexity. Same figure and legend, with PASR, miRNA, tiRNA, and Alu, plus Mutations, Allelic Expression, and RNA Editing.

  10. RNA-seq. [Figure: reads aligned to the gene locus from the Transcriptional Complexity slides, showing PASR, miRNA, tiRNA, and Alu] Read features: non-spliced reads, junction reads, mutations, strand specific. Cloonan et al. Nat Methods 2008; 5:613-619

  11. Advantages of RNA-seq. Discovery: genes, exons, junctions, UTRs, fusions (Present and Future). [Figure: number of transcripts (0 to 250,000) by Ensembl version (58 to 71), for protein coding, degradation, ncRNA, and pseudogene categories]

  12. Advantages of RNA-seq. Discovery: genes, exons, junctions, UTRs, fusions (Present and Future). Dynamic Range (Mortazavi et al. Nat. Methods 2008; 5:621–628). [Figure: as on slide 11]

  13. Advantages of RNA-seq. Discovery: genes, exons, junctions, UTRs, fusions (Present and Future). Dynamic Range (Mortazavi et al. Nat. Methods 2008; 5:621–628). Nucleotide Specific. [Figure: as on slide 11]

  14. Typical experiment workflow. Settings: Field / Clinic, Wet Lab, Dry Lab. Steps: Design Experiment; Run Experiment; Sample Acquisition (Field / Clinic / Lab); Obtain RNA; Make Library; Sequencing; Base Calling (1°); Mapping (2°); Library QC (2°); Analysis (3°); Interpretation (3°); Verification; Sample Acquisition; Validation; Publish

  18. Library Construction. Cellular RNA profile: rRNA (80%, ribosomes), tRNA (15%), target RNA (5%). Target selection: deplete rRNA, enrich polyA, or capture (tiling arrays). Then: fragment; ds-cDNA synthesis; ligate adaptors + amplify; sequencing

  20. RNA-seq Mapping. Challenge #1: Introns. [Figure: spliced transcript from ATG to AAA]

  21. RNA-seq Mapping. Challenge #1: Introns. Approaches: align to a database of junctions or transcriptome, or use split read alignments. Trapnell et al. Bioinformatics 2009; 25:1105-11. Wood et al. Bioinformatics 2011; 27:580–581

  22. RNA-seq Mapping. Challenge #1: Introns (approaches and citations as on slide 21). Challenge #2: Correctness. Sufficient Overlap; Sufficient Evidence

  23. RNA-seq Mapping. Challenge #1: Introns (approaches and citations as on slide 21). Challenge #2: Correctness. Sufficient Overlap; Sufficient Evidence. Challenge #3: Multi-mappers. Sequence Similarity; Align to the transcriptome

  24. RNA-seq Mapping. Pipeline: Data QC; align to Filter Set; align to ‘genome’; align to ‘junctions’; split read alignment (clipping); choose alignments, disambiguate. Also labelled: Flag and Exclude; Exclude. Tophat: Trapnell et al. Bioinformatics 2009; 25:1105-11

  25. RNA-seq Mapping. Pipeline and citation as on slide 24. Outputs: BAM (x3); Alignment Filtering, Analysis, Library QC

  26. RNA-seq Mapping. Pipeline, outputs, and citation as on slide 25. Open questions: rRNA, tRNA reference? gene model? Algorithm? diploid? ESTs?

  28. Library Quality Control (QC). Cellular RNA profile: rRNA (80%, ribosomes), tRNA (15%), target RNA (5%). Target selection: deplete rRNA, enrich polyA, or capture (tiling arrays). Then: fragment; ds-cDNA synthesis; ligate adaptors + amplify; sequencing

  29. Library Quality Control (QC). Library steps as on slide 28. Target selection affects RNA content (expression quantification)

  30. Library Quality Control (QC). Library steps as on slide 28. Target selection affects RNA content (expression quantification). Fragmentation affects insert size (transcript identification)

  31. Library Quality Control (QC). Library steps as on slide 28. Target selection affects RNA content (expression quantification). Fragmentation affects insert size (transcript identification). ds-cDNA synthesis affects strand specificity
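
The three library properties named on the Library QC slides (RNA content, insert size, strand specificity) can each be summarised with a simple statistic over aligned read pairs. The following is a minimal sketch, not the speaker's pipeline: the `AlignedPair` record, its fields, and the `library_qc` function are hypothetical names introduced here for illustration, and a real analysis would read these values from BAM files and a gene annotation.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class AlignedPair:
    """Hypothetical, simplified record for one aligned read pair."""
    feature_type: str           # annotation label, e.g. "rRNA", "tRNA", "mRNA"
    insert_size: int            # bases spanned by the pair
    read_strand: str            # "+" or "-": strand the first read aligned to
    gene_strand: Optional[str]  # strand of the overlapping gene, None if unannotated


def library_qc(pairs: Iterable[AlignedPair]) -> Dict[str, Optional[float]]:
    """Summarise RNA content, insert size, and strand specificity."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no aligned pairs supplied")

    # RNA content: share of pairs that still come from rRNA after
    # depletion or polyA enrichment.
    rrna_fraction = sum(p.feature_type == "rRNA" for p in pairs) / len(pairs)

    # Insert size: the median is robust to a few very long or short fragments.
    median_insert = float(median(p.insert_size for p in pairs))

    # Strand specificity: share of annotated pairs whose first read lies on
    # the same strand as the gene. Only pairs with a known gene strand count.
    annotated = [p for p in pairs if p.gene_strand is not None]
    sense_fraction = (
        sum(p.read_strand == p.gene_strand for p in annotated) / len(annotated)
        if annotated
        else None
    )

    return {
        "rrna_fraction": rrna_fraction,
        "median_insert_size": median_insert,
        "sense_fraction": sense_fraction,
    }


# Toy input, invented purely to exercise the function.
example_pairs = [
    AlignedPair("rRNA", 180, "+", "+"),
    AlignedPair("mRNA", 200, "+", "+"),
    AlignedPair("mRNA", 220, "-", "-"),
    AlignedPair("mRNA", 240, "+", "-"),
    AlignedPair("tRNA", 90, "-", None),
]
report = library_qc(example_pairs)
print(report)
```

A sense fraction near 0.5 suggests an unstranded library, while a value near 0 or 1 suggests a strand-specific one; which end indicates the sense strand depends on the library protocol.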
