Outline: Introduction to RNA-seq


  1. Outline: Introduction to RNA-seq
     - Sample preparation
     - Quality control
     - Transcript assembly
     - Read alignment
     - Differential gene expression
     - Data visualization and plotting

     Todos Santos 2018

  2. Regulation of gene expression
     - Regulation of transcription:
       - Transcription factors
       - Histone modifications
       - DNA methylation
     - Regulation of RNA processing:
       - Polyadenylation
       - Splicing
       - Capping
       - RNA export
     - Regulation of translation:
       - mRNA decay
       - Translational repression
       - Sequestration
     - Posttranslational regulation:
       - Chemical modifications (e.g. phosphorylation)
       - Protein turnover (proteolysis)

     RNA-seq measures steady state mRNA levels and RNA sequence composition.

     Fu et al. (2014)

  3. RNA-seq is the most common HTS application

     Reuter et al. (2015)

  4. Sample preparation
     - Use high-quality RNA as starting material.
     - Minor differences between samples can have a substantial impact on gene expression.
     - Three biological replicates is the default but not ideal for every situation.
     - Some recommended kits for standard RNA-seq:
       - NEBNext Ultra II Directional RNA Library Prep Kit
       - Illumina kits

  5. Sample preparation
     - Starting RNA
       - Typically 1-5 µg of high-quality total RNA is ideal.
     - Sequencing depth
       - Typically you want about 20 million high quality reads/library.
     - Considerations
       - Strand specific (default is yes)
       - Single-end or paired-end (single is sufficient for well annotated transcriptomes)
       - Long reads vs short reads (short Illumina reads, 50-150 nt, are usually sufficient)
       - rRNA depletion or oligo-dT
       - Low quantity/single cell
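The planning numbers on this slide lend themselves to quick arithmetic. The following is a minimal sketch, assuming a hypothetical lane output of 300 million reads (this figure is not from the slides; substitute the value for your own instrument and run type). The function names are illustrative only.

```python
# Target depth from the slide: about 20 million high quality reads per library.
READS_PER_LIBRARY = 20_000_000

# ASSUMPTION: hypothetical lane output, used only for illustration.
HYPOTHETICAL_LANE_READS = 300_000_000


def libraries_per_lane(lane_reads: int, reads_per_library: int = READS_PER_LIBRARY) -> int:
    """Number of libraries that can share one lane at the target depth."""
    return lane_reads // reads_per_library


def total_bases(n_reads: int, read_length: int) -> int:
    """Total sequenced bases for a library of n_reads reads of read_length nt."""
    return n_reads * read_length


if __name__ == "__main__":
    print(libraries_per_lane(HYPOTHETICAL_LANE_READS))
    print(total_bases(READS_PER_LIBRARY, 50))
```

Under these assumptions, the integer division tells you how many indexed libraries could be pooled on one lane while still reaching the target depth.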

  6. RNA-seq library preparation
     - Oligo-dT selection
     - rRNA depletion

     illumina.com; Zhernakova et al. (2009)

  7. Library composition
     - Dual Index Library shown
     - HiSeq 2500

     Slide content courtesy of Illumina. Metzker, M.L. (2010) NRG

  8. FASTQ format

     Each read occupies four lines. The index sequence is the last field of line 1 (here ATCACG).

     Read 1
       @D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG
       CACCGCCCGTCGCTATCCGGGACTGGAATTCTCGGGTGCCAAGGAACTCCA
       +
       CCCFFFFFHHHHHJIJGHJJJJIJJJJJGGGFFFFEABDHHHFHFF@@DD>

     Read 2
       @D64TDFP1:248:C50DMACXX:5:1101:1371:2154 1:N:0:ATCACG
       TCAATATTTGCATAGGGTATCTGGAATTCTCGGGTGCCAAGGAACTCCAGT
       +
       CCCFFFFFHHHHHJJJJGFHIJJJJJJJJJJJJJFHHIIJJHGHJFGHJJI

     Read 3
       @D64TDFP1:248:C50DMACXX:5:1101:1461:2205 1:N:0:ATCACG
       GAAAGACGTCTTCCTAGATTATGGAATTCTCGGGTGCCAAGGAACTCCAGT
       +
       CCCFFFFFHHHHHJJJJJJJJJJJIJJJJJJJJJHIJJJJJGIIJFGIJJJ

     - Line 1: sequence ID, description, and index; begins with @
     - Line 2: sequence; contains only A, C, T, G, and N
     - Line 3: optional sequence ID; begins with +
     - Line 4: quality of each base call, encoded as one ASCII character per base (Phred+33 or Phred+64)

     Metzker, M.L. (2010) NRG
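The four-line record layout described on this slide can be read with a few lines of Python. This is a minimal sketch, assuming unwrapped records (exactly four lines per read) and the header layout shown on the slide, where the index is the last colon-separated field. `FastqRecord`, `parse_fastq`, and `index_of` are names invented for this illustration, not part of any library.

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str      # text after '@' up to the first space
    description: str  # remainder of line 1, e.g. "1:N:0:ATCACG"
    sequence: str     # line 2
    quality: str      # line 4, one ASCII character per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text, assuming four lines per record."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four lines")
    for i in range(0, len(lines), 4):
        header, sequence, plus, quality = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        read_id, _, description = header[1:].partition(" ")
        yield FastqRecord(read_id, description, sequence, quality)


def index_of(record: FastqRecord) -> str:
    """Return the index sequence: the last colon-separated field of line 1."""
    return record.description.rsplit(":", 1)[-1]


# Read 1 from the slide.
EXAMPLE = """@D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG
CACCGCCCGTCGCTATCCGGGACTGGAATTCTCGGGTGCCAAGGAACTCCA
+
CCCFFFFFHHHHHJIJGHJJJJIJJJJJGGGFFFFEABDHHHFHFF@@DD>
"""

if __name__ == "__main__":
    for rec in parse_fastq(EXAMPLE):
        print(rec.read_id, index_of(rec), len(rec.sequence))
```

Grouping lines in fours, rather than scanning for `@`, avoids misreading a quality line that happens to begin with `@`.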

  9. Data analysis workflow
     - fastq files downloaded from server
     - Demultiplexing and quality assessment
     - Quality control: filter low quality data, trim adapters
     - Map sequences to reference or de novo assemble reference
     - Custom or standard data analysis
     - Data visualization and presentation
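The demultiplexing step in this workflow amounts to sorting reads into per-sample bins by their index sequence. In practice this is typically done by dedicated software, so the following is only a sketch of the idea. It assumes the index is the last colon-separated field of the FASTQ header, as on the FASTQ format slide. The sample sheet, the sample names, and the fourth header are hypothetical and invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# ASSUMPTION: hypothetical sample sheet mapping index sequences to sample names.
SAMPLE_SHEET = {"ATCACG": "sample_A", "CGATGT": "sample_B"}


def demultiplex(headers: Iterable[str], sample_sheet: Dict[str, str]) -> Dict[str, List[str]]:
    """Group FASTQ header lines by sample, using the index in the header."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for header in headers:
        index = header.rsplit(":", 1)[-1].strip()
        # Reads whose index is not in the sample sheet go to a catch-all bin.
        bins[sample_sheet.get(index, "undetermined")].append(header)
    return dict(bins)


HEADERS = [
    "@D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG",
    "@D64TDFP1:248:C50DMACXX:5:1101:1371:2154 1:N:0:ATCACG",
    "@D64TDFP1:248:C50DMACXX:5:1101:1461:2205 1:N:0:ATCACG",
    # Invented header with an index absent from the sample sheet.
    "@D64TDFP1:248:C50DMACXX:5:1101:1500:2300 1:N:0:TTAGGC",
]

if __name__ == "__main__":
    for sample, reads in demultiplex(HEADERS, SAMPLE_SHEET).items():
        print(sample, len(reads))
```

This sketch requires exact index matches. Whether to tolerate mismatches in the index is a design choice it leaves out.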

  10. Quality control: Assessing Read Quality
      - Phred quality score: a measure of the quality of base calling:
        Q = -10 log10(P), where P is the error probability
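The definition on this slide can be inverted to recover the error probability from a quality score. A short derivation, assuming the logarithm is base 10 (the usual Phred convention):

$$
Q = -10 \log_{10} P
\quad\Longleftrightarrow\quad
\log_{10} P = -\frac{Q}{10}
\quad\Longleftrightarrow\quad
P = 10^{-Q/10}
$$

Each increase of 10 in $Q$ therefore corresponds to a tenfold drop in the error probability $P$.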

  11. Quality control
      - 10 reads, 100 bases: P = 0.01, Q = -10 log10(P) = 20 (Q20)
      - 10 reads, 100 bases: P = ?, Q = ?
      - Q30 is a common quality threshold or quality criterion.
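The numbers on this slide can be checked with a small calculation. This sketch assumes base-10 logarithms and the Phred+33 character encoding. The function names are chosen for illustration. It converts between P and Q, estimates the expected number of miscalled bases in 10 reads of 100 bases, and decodes a quality string into scores.

```python
import math
from typing import List


def phred_from_error_prob(p: float) -> float:
    """Q = -10 * log10(P), for an error probability 0 < P <= 1."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def error_prob_from_phred(q: float) -> float:
    """P = 10 ** (-Q / 10), the inverse of the Phred definition."""
    return 10 ** (-q / 10)


def expected_errors(n_reads: int, read_length: int, p: float) -> float:
    """Expected number of miscalled bases if every base has error probability p."""
    return n_reads * read_length * p


def decode_quality(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


if __name__ == "__main__":
    print(phred_from_error_prob(0.01))                        # the slide's Q20 case
    print(error_prob_from_phred(30))                          # error probability at Q30
    print(expected_errors(10, 100, 0.01))                     # 10 reads x 100 bases at Q20
    print(expected_errors(10, 100, error_prob_from_phred(30)))
    print(decode_quality("CCCFFFFFHH"))
```

At Q20, about 10 of the 1000 bases would be expected to be wrong. At the Q30 threshold the expectation falls to about 1.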

  12. Quality control
      - FastQC: a GUI tool for assessing the quality of high-throughput sequencing data.
      - Trimmomatic: software for trimming adapter sequences and low-quality bases from sequencing reads.
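To make the two trimming operations concrete, here is a deliberately simplified sketch. It is not Trimmomatic's algorithm, which is more sophisticated. It only removes an exact adapter match and low-quality bases at the 3' end. The adapter string is an assumption: it is the subsequence shared by all three reads on the FASTQ format slide. The quality threshold of 20 and the Phred+33 offset are also assumptions.

```python
from typing import Tuple

# ASSUMPTION: subsequence common to the three example reads, treated as adapter.
ASSUMED_ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(sequence: str, quality: str, adapter: str) -> Tuple[str, str]:
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = sequence.find(adapter)
    if pos == -1:
        return sequence, quality
    return sequence[:pos], quality[:pos]


def trim_low_quality_tail(sequence: str, quality: str,
                          min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


# Read 1 from the FASTQ format slide.
READ1_SEQ = "CACCGCCCGTCGCTATCCGGGACTGGAATTCTCGGGTGCCAAGGAACTCCA"
READ1_QUAL = "CCCFFFFFHHHHHJIJGHJJJJIJJJJJGGGFFFFEABDHHHFHFF@@DD>"

if __name__ == "__main__":
    seq, qual = trim_adapter(READ1_SEQ, READ1_QUAL, ASSUMED_ADAPTER)
    seq, qual = trim_low_quality_tail(seq, qual)
    print(seq, len(seq))
```

Trimming sequence and quality together keeps the two strings the same length, which downstream tools expect.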

  13. Sequence mapping/alignment

      Trapnell and Salzberg (2009)

  14. Aligning reads to mRNAs

      Trapnell et al. (2009)

  15. Differential gene expression

      Trapnell et al. (2010)

  16. RNA-seq pipelines
      - No reference genome? Use Trinity to assemble transcripts
      - Other mRNA aligners: STAR, GSNAP, TopHat2
      - Other abundance estimators: RSEM, htseq-count
      - Other common DE software: DESeq2, edgeR, cuffdiff
      - Various GUIs and R-based tools for drawing plots

      Pertea et al. (2016)
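To illustrate what abundance estimation and differential expression produce, here is a toy calculation. It is not what DESeq2, edgeR, or cuffdiff do, since those tools model count data statistically across replicates. It only shows a naive counts-per-million normalization followed by a log2 fold change. The gene names and counts are hypothetical, and the pseudocount of 1 is an arbitrary choice that avoids division by zero.

```python
import math
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def log2_fold_change(cpm_a: Dict[str, float], cpm_b: Dict[str, float],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """log2(B / A) per gene, with a pseudocount added to both values."""
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
    }


# ASSUMPTION: hypothetical read counts for three genes in two conditions.
CONTROL = {"geneA": 100, "geneB": 300, "geneC": 600}
TREATED = {"geneA": 400, "geneB": 300, "geneC": 1300}

if __name__ == "__main__":
    lfc = log2_fold_change(counts_per_million(CONTROL), counts_per_million(TREATED))
    for gene, value in lfc.items():
        print(gene, round(value, 2))
```

Note that geneB has the same raw count in both libraries yet a negative fold change after normalization. This is why library size must be accounted for before comparing samples.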

  17. Genome browsers: Integrative Genomics Viewer (IGV)

  18. Genome browsers: UCSC Genome Browser

  19. Trinity workflow
      - S. pombe: 1. Diauxic shift, 2. Heat shock, 3. Log phase, 4. Plateau phase
      - Bowtie2
      - RSEM
      - edgeR

      Haas et al. (2013)
