Strategies for Bulk RNA-seq Analysis
Three strategies, each starting from reads:

Genome mapping
  Reads -> STAR, HISAT2 (splice-aware genome mapping)
    -> htseq-count, featureCounts (gene counting)
    -> StringTie (transcript discovery & counting)

Transcriptome mapping
  Reads -> RSEM, Kallisto, Sailfish, Salmon (transcript mapping and quantification)

Assembly
  Reads -> Trinity, Scripture, StringTie (assembly into transcripts)
    -> Trinotate (novel transcript annotation)
    -> BLAST2GO (homology-based novel transcript annotation)

✓ Genome   ✓ GTF (annotation)

Sequence reads (FASTQ)
  -> Quality control -> FASTQ
  -> Alignment to genome: HISAT2, STAR (+reference genome index; +known GTF, optional) -> multiple BAMs
  -> Count reads associated with genes: htseq-count, featureCounts (+known GTF) -> Count matrix
  -> DGE with R: DESeq2, edgeR, limma:voom
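
The count-matrix step in this workflow can be sketched in plain Python. This is only a minimal illustration, assuming each sample's per-gene counts have already been parsed into a dictionary. The function name `build_count_matrix` and the sample and gene names are hypothetical, and real output from htseq-count or featureCounts would need its own parsing first.

```python
def build_count_matrix(per_sample_counts):
    """Combine per-sample gene counts into one genes-by-samples matrix.

    per_sample_counts: dict mapping sample name -> {gene_id: read count}.
    Returns (genes, samples, rows), where rows[i][j] is the count of
    genes[i] in samples[j]. A gene missing from a sample is counted as 0.
    """
    samples = sorted(per_sample_counts)
    genes = sorted({g for counts in per_sample_counts.values() for g in counts})
    rows = [[per_sample_counts[s].get(g, 0) for s in samples] for g in genes]
    return genes, samples, rows


# Hypothetical toy input: two samples, three genes in total.
toy = {
    "ctrl_1": {"geneA": 10, "geneB": 0},
    "treat_1": {"geneA": 25, "geneC": 3},
}
genes, samples, rows = build_count_matrix(toy)

# Print a tab-separated table: header row, then one row per gene.
print("gene", *samples, sep="\t")
for gene, row in zip(genes, rows):
    print(gene, *row, sep="\t")
```

A matrix in this shape (genes as rows, samples as columns, raw integer counts) is what the DGE tools named on the slide expect as input.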

✓ Genome   ✓ GTF (annotation)?

Sequence reads (FASTQ)
  -> Quality control -> FASTQ
  -> Alignment to genome: HISAT2, STAR (+reference genome index; +known GTF, optional) -> multiple BAMs
  then either:
  (a) Count reads associated with genes: htseq-count, featureCounts (+known GTF) -> Count matrix
      -> DGE with R: DESeq2, edgeR, limma:voom
  (b) Reference-based transcriptome assembly and quantitation with StringTie
      -> DGE with CuffDiff, Ballgown

✓ Transcriptome (FASTA)

Sequence reads (FASTQ)
  -> Quality control -> FASTQ
  then one of:
  (a) Alignment to genome: HISAT2, STAR (+reference genome index; +known GTF, optional) -> multiple BAMs
      -> Count reads associated with genes: htseq-count, featureCounts (+known GTF) -> Count matrix
      -> DGE with R: DESeq2, edgeR, limma:voom
  (b) Alignment to genome as in (a) -> BAM
      -> Reference-based transcriptome assembly and quantitation with StringTie
      -> DGE with CuffDiff, Ballgown
  (c) Pseudocounts with Kallisto, Sailfish, Salmon (+reference transcriptome index)
      -> Count matrix generated using tximport -> DGE with R: DESeq2, edgeR, limma:voom
      or -> DGE with Sleuth
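
The idea behind building a gene-level count matrix from transcript-level pseudocounts can be sketched as follows. This is a simplified illustration and not tximport itself (tximport is an R package and also handles transcript lengths and abundances). The function name `summarize_to_gene`, the transcript-to-gene mapping, and all identifiers are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(transcript_counts, tx2gene):
    """Sum estimated transcript counts into gene-level counts.

    transcript_counts: {transcript_id: estimated count} for one sample.
    tx2gene: {transcript_id: gene_id} mapping.
    Transcripts absent from tx2gene are skipped (an assumption of this sketch).
    """
    gene_counts = defaultdict(float)
    for tx, count in transcript_counts.items():
        gene = tx2gene.get(tx)
        if gene is None:
            continue  # no gene assignment for this transcript
        gene_counts[gene] += count
    return dict(gene_counts)


# Hypothetical toy data: estimated counts are fractional, as pseudo-aligners report them.
tx_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 2.0, "tx4": 7.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_level = summarize_to_gene(tx_counts, tx2gene)
print(gene_level)
```

Running this per sample and combining the results column by column gives a genes-by-samples matrix comparable to the one produced by the read-counting route.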

De novo assembly vs. reference-based assembly
Requirement markers shown on the slide: ✓ Genome, ✓ Genome?, ✓ GTF (annotation)?, ✓ GTF (annotation)?
Figure source: Martin J.A. and Wang Z., Nat. Rev. Genet. (2011) 12:671–682

Assembly
  Reads -> Trinity, Scripture (assembly into transcripts)
    -> Trinotate (novel transcript annotation)

Quantitation from assembled reads
  Reads -> RSEM, Kallisto, Salmon, eXpress (transcript mapping & quantification)
  Reads -> Alignment to new transcriptome: Bowtie2, BWA -> SAM/BAM
    -> Count reads associated with genes -> Count matrix
    -> DGE with R: DESeq2, edgeR, limma:voom
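
The requirements marked on the workflow slides (genome, GTF annotation, transcriptome FASTA) suggest a simple decision rule for picking a strategy. The sketch below is one possible reading of the slides, not a rule stated in them: the function name `choose_strategy`, the returned labels, and the priority order among the options are all assumptions.

```python
def choose_strategy(has_genome, has_gtf, has_transcriptome):
    """Suggest a bulk RNA-seq strategy from the available references.

    The priority order (genome first, then transcriptome, then de novo)
    is an assumption of this sketch.
    """
    if has_genome and has_gtf:
        # Align with a splice-aware mapper, then count reads per gene.
        return "genome mapping + gene counting"
    if has_genome:
        # No annotation: assemble transcripts against the genome.
        return "genome mapping + reference-based assembly"
    if has_transcriptome:
        # Only a transcriptome FASTA: pseudo-align and quantify transcripts.
        return "transcriptome mapping"
    # No reference at all: assemble transcripts from the reads themselves.
    return "de novo assembly"


print(choose_strategy(has_genome=True, has_gtf=True, has_transcriptome=False))
print(choose_strategy(has_genome=False, has_gtf=False, has_transcriptome=False))
```

In practice more than one route is often viable. For example, a well-annotated organism usually has both a genome and a transcriptome available, so the choice also depends on whether novel transcripts are of interest.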
These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.