Short read quality assessment
Martin Morgan (mtmorgan@fhcrc.org)
June 20-23, 2011

Why sequence? e.g., RNA-seq
◮ Expression in novel (un-annotated) regions
◮ Exon junction / RNA editing insights
◮ Allele-specific / transcript isoform quantification
◮ Non-model organisms
◮ Greater dynamic range and sensitivity?

Lessons from microarrays
◮ Initially: variability between manufacturers, technologies, labs
◮ MAQC: quality control standards and analysis protocols

Example work flow – [4]
Sample
◮ Purify poly(A)+ RNA with oligo(dT) magnetic beads
◮ cDNA synthesis primed with random hexamers
Microarray
◮ Dye-swap, hybridization, fluorescence, analysis
RNA-seq
◮ Fragment and size-select
◮ Illumina adapter ligation

Key issues
◮ Experimental design [1]
◮ Replication
◮ Randomization and blocking, e.g., batch effects
◮ Depth of coverage
◮ Statistical power
◮ Library complexity
◮ Coverage heterogeneity
◮ Estimation biases
◮ Legitimate comparison
◮ Sequencing uncertainty [2]
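The replication and blocking points can be made concrete with a small sketch. This is one possible layout, not a design taken from the slides: it assumes each block (for example, a flow cell lane or a processing batch) receives every treatment exactly once, in a random order. The function name `blocked_design` and the treatment labels are hypothetical.

```python
import random

def blocked_design(treatments, n_blocks, seed=0):
    """Assign every treatment to every block, in random order within each block.

    Returns a list of (block_number, ordered_treatments) pairs. The number of
    blocks is the number of replicates per treatment.
    """
    rng = random.Random(seed)          # seeded so the layout is reproducible
    design = []
    for block in range(1, n_blocks + 1):
        order = list(treatments)       # copy so the caller's list is untouched
        rng.shuffle(order)             # randomize run order within the block
        design.append((block, order))
    return design

# Hypothetical two-treatment experiment with three blocks (replicates).
for block, order in blocked_design(["control", "RNAi"], n_blocks=3, seed=1):
    print(block, order)
```

Because every block contains every treatment, a batch effect shared by a block affects all treatments equally and is not confounded with the comparison of interest.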

Key issues (continued)
[Figure: ROC simulation. Replication (red vs. blue); randomization and blocking (solid vs. dot).]

Key issues (continued)
[Figure: Cumulative proportion of reads occurring 0, 1, ... times. x-axis: number of occurrences of each read (log10); y-axis: cumulative proportion of reads.]
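A curve like the one described in this figure can be computed from raw read sequences. The following is a minimal sketch, assuming reads are available as a plain list of strings; the function name `cumulative_read_proportion` is hypothetical and this is not the code used to draw the original figure.

```python
from collections import Counter

def cumulative_read_proportion(reads):
    """Return (occurrences, cumulative proportion of reads) pairs.

    For each observed copy number k, the proportion is the fraction of all
    reads whose sequence occurs k times or fewer.
    """
    copies_per_sequence = Counter(reads)                         # sequence -> times seen
    sequences_per_copy = Counter(copies_per_sequence.values())   # k -> distinct sequences seen k times
    total_reads = len(reads)
    result = []
    running = 0
    for k in sorted(sequences_per_copy):
        running += k * sequences_per_copy[k]   # reads contributed by sequences seen k times
        result.append((k, running / total_reads))
    return result

reads = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCA", "GGCA"]
for k, prop in cumulative_read_proportion(reads):
    print(k, round(prop, 3))
# 1 0.167
# 2 0.5
# 3 1.0
```

A library dominated by a few highly duplicated sequences shows a curve that rises late, at large k; a complex library reaches a high proportion at small k.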

Key issues (continued)
[Figure: Actual versus uniform φX174 coverage. x-axis: copies per read (log10); y-axis: cumulative proportion.]

Key issues (continued)
[Figure: Read count increases with gene length.]
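The length effect can be illustrated with a toy calculation. The sketch below divides raw counts by gene length in kilobases, a common normalization that the slides themselves do not prescribe; the function name `reads_per_kilobase`, the gene names, and the numbers are all made up for illustration.

```python
def reads_per_kilobase(counts, lengths_bp):
    """Divide each gene's read count by its length in kilobases."""
    return {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}

# Two hypothetical genes: geneB is five times longer and collects five times
# as many reads, so the raw counts differ although the per-kilobase
# density is identical.
counts = {"geneA": 200, "geneB": 1000}
lengths_bp = {"geneA": 1000, "geneB": 5000}
print(reads_per_kilobase(counts, lengths_bp))
# {'geneA': 200.0, 'geneB': 200.0}
```

Comparing raw counts between genes of different length is therefore not a like-for-like comparison; comparing the same gene across samples is unaffected by this particular scaling.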

Key issues (continued)
[Figure: Reads, stratified by cycle, supporting a spurious SNP call in φX174.]
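Stratifying supporting reads by cycle can be sketched as follows. This is a simplified illustration under strong assumptions that are not from the slides: alignments are ungapped, on the forward strand, and given as (0-based start, sequence) pairs, so the sequencing cycle of a base is its offset in the read plus one. The function name `alt_support_by_cycle` is hypothetical.

```python
from collections import Counter

def alt_support_by_cycle(alignments, site, alt_base):
    """Count reads carrying alt_base at a reference site, keyed by cycle.

    alignments: iterable of (start, sequence) pairs, with a 0-based start
    coordinate and an ungapped forward-strand alignment (an assumption).
    """
    by_cycle = Counter()
    for start, seq in alignments:
        offset = site - start
        # Only reads that cover the site and show the alternate base count.
        if 0 <= offset < len(seq) and seq[offset] == alt_base:
            by_cycle[offset + 1] += 1      # cycles are numbered from 1
    return dict(sorted(by_cycle.items()))

alignments = [(0, "ACGTT"), (1, "CGTTA"), (4, "TACGA")]
print(alt_support_by_cycle(alignments, site=4, alt_base="T"))
# {1: 1, 4: 1, 5: 1}
```

If nearly all support for a variant comes from the last few cycles, where base-call quality is typically lowest, the call deserves suspicion.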

Case study
Subset of Brooks et al. [3]
◮ RNAi and mRNA-seq to identify pasilla-regulated alternative splicing
◮ Purified polyA, random hexamer primed
◮ Single- and paired-end sequences
◮ Alignment to reference genome and curated splice junctions

[1] P. L. Auer and R. W. Doerge. Statistical design and analysis of RNA sequencing data. Genetics, 185:405–416, Jun 2010.
[2] H. C. Bravo and R. A. Irizarry. Model-based quality assessment and base-calling for second-generation sequencing data. Biometrics, 66:665–674, Sep 2010.
[3] A. N. Brooks, L. Yang, M. O. Duff, K. D. Hansen, J. W. Park, S. Dudoit, S. E. Brenner, and B. R. Graveley. Conservation of an RNA regulatory map between Drosophila and mammals. Genome Res., 21:193–202, Feb 2011.
[4] J. H. Malone and B. Oliver. Microarrays, deep sequencing and the true measure of the transcriptome. BMC Biol., 9:34, 2011.
