Practical Bioinformatics Mark Voorhies 4/16/2018 Mark Voorhies Practical Bioinformatics
JavaTreeView link-out for ENSEMBL Mouse http://www.ensembl.org/Mus_musculus/Gene/Summary?g=HEADER
Science!
Example Pipeline: Overview
[Figure: pipeline flowchart. Stages: Generate Samples; Transfer & Archival; Pre-process; Genome Coverage; Transcriptome Profile; Differential Expression; Annotation/Analysis; Follow-up Experiments; Paper (publish). Time annotations on the figure: ~1 day and ~2.5-4 years.]
Example Pipeline: Details
GSE88801 Pipelines
EM: Expectation Maximization
[Figure: schematic of the online EM algorithm. Input: capture target sequences, fragment and sequence, align to target references, producing an augmented alignment file. Loop: get next read pair, calculate assignment probabilities, update masses, update parameters (error probabilities, bias), constrain estimated counts. Output: relative abundances, estimated counts, effective counts.] Roberts and Pachter, Nature Methods 10:71
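The figure shows an online EM. As a much smaller illustration of the same expectation/maximization idea, here is a batch EM sketch. It assumes each read comes with a known list of compatible targets and ignores the error, bias, and fragment-length terms from the figure. The function name `em_abundances` and the toy data are hypothetical, not taken from the cited paper.

```python
def em_abundances(compat, lengths, n_iter=500):
    """Batch EM for read assignment (simplified; not the online algorithm).

    compat[r] lists the target indices that read r aligns to.
    lengths[t] is the (effective) length of target t.
    Returns (estimated_counts, read_fractions).
    """
    n_targets = len(lengths)
    n_reads = len(compat)
    # Start from a uniform guess for the fraction of reads per target.
    alpha = [1.0 / n_targets] * n_targets
    counts = [0.0] * n_targets
    for _ in range(n_iter):
        counts = [0.0] * n_targets
        # E-step: split each read among its compatible targets,
        # weighting by current read fraction per unit length.
        for targets in compat:
            weights = [alpha[t] / lengths[t] for t in targets]
            total = sum(weights)
            for t, w in zip(targets, weights):
                counts[t] += w / total
        # M-step: re-estimate read fractions from the fractional counts.
        alpha = [c / n_reads for c in counts]
    return counts, alpha


# Toy example: 6 reads unique to target 0, 2 unique to target 1,
# and 4 reads that align equally well to both.
toy_reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
toy_counts, toy_alpha = em_abundances(toy_reads, lengths=[1000, 1000])
```

With equal target lengths, the ambiguous reads end up split in proportion to the unambiguous evidence, so the toy estimate settles near 9 and 3 reads.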
Abundance estimation with kallisto

    transcriptome="GRCm38_all_mRNA"
    export transcriptome
    # One sample name per line in sample_names.txt
    while read i; do
        export jobname="${i}.${transcriptome}.fr"
        kallisto quant -i "${transcriptome}.idx" \
            -t 4 --single --fr-stranded -l 250 -s 50 \
            -o "${jobname}" "${i}_1.fastq.gz" \
            > "${jobname}.log" \
            2> "${jobname}.err"
    done < sample_names.txt
Linear Least Squares
$b_i = \frac{y_i}{\sigma_i}$
$A_{ij} = \frac{f_j(x_i)}{\sigma_i}$
$\chi^2 = |A \cdot a - b|^2$
$a = \sum_{i=1}^{M} \left( \frac{U_i \cdot b}{s_i} \right) V_i$
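The formulas on these slides can be sketched with NumPy, reading U_i and V_i as the columns of U and V from the singular value decomposition of A and s_i as the singular values. The function name `svd_least_squares`, the `rcond` cutoff for tiny singular values, and the straight-line example are assumptions made for illustration.

```python
import numpy as np


def svd_least_squares(x, y, sigma, basis, rcond=1e-12):
    """Fit y ~ sum_j a_j * f_j(x) by minimizing chi^2 = |A.a - b|^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # b_i = y_i / sigma_i
    b = y / sigma
    # A_ij = f_j(x_i) / sigma_i
    A = np.column_stack([f(x) for f in basis]) / sigma[:, None]
    # A = U . diag(s) . V^T
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # a = sum_i (U_i . b / s_i) V_i, skipping near-zero singular values
    proj = U.T @ b
    scaled = np.zeros_like(s)
    keep = s > rcond * s.max()
    scaled[keep] = proj[keep] / s[keep]
    a = Vt.T @ scaled
    chi2 = float(np.sum((A @ a - b) ** 2))
    return a, chi2


# Hypothetical example: noiseless points on the line y = 1 + 2x.
line_basis = [lambda x: np.ones_like(x), lambda x: x]
fit_a, fit_chi2 = svd_least_squares(
    x=[0, 1, 2, 3], y=[1, 3, 5, 7], sigma=[1, 1, 1, 1], basis=line_basis
)
```

Dropping the smallest singular values instead of dividing by them is one common way to keep the fit stable when the basis functions are nearly degenerate.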
Multiple Hypothesis Testing http://xkcd.com/882/
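The arithmetic behind the problem is short enough to sketch. The snippet assumes m independent tests, each run at significance level alpha, and uses m = 20 and alpha = 0.05 as an illustrative setting. The Bonferroni threshold is shown as one standard correction. The function names are hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / m


uncorrected = familywise_error_rate(0.05, 20)
per_test = bonferroni_threshold(0.05, 20)
corrected = familywise_error_rate(per_test, 20)
print(round(uncorrected, 3))  # 0.642
print(per_test)               # 0.0025
```

With twenty uncorrected tests, a spurious "significant" result is more likely than not.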
Final Homework
Implement Needleman-Wunsch global alignment with zero gap opening penalties. Try attacking the problem in this order:
1. Initialize and fill in a dynamic programming matrix by hand (e.g., try reproducing the example from my slides on paper).
2. Write a function to create the dynamic programming matrix and initialize the first row and column.
3. Write a function to fill in the rest of the matrix.
4. Rewrite the initialize and fill steps to store pointers to the best sub-solution for each cell.
5. Write a backtrace function to read the optimal alignment from the filled-in matrix.
If that isn't enough to keep you occupied, try implementing Smith-Waterman local alignment and/or non-zero gap opening penalties.
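One possible shape for steps 2 through 5 is sketched below. It is a starting point, not a reference solution. It assumes a simple scoring scheme (match +1, mismatch -1, and a flat -1 per gap position, meaning no extra gap opening cost); these values and the function name `needleman_wunsch` are illustrative choices, not taken from the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a per-position gap penalty (no opening cost)."""
    n, m = len(a), len(b)
    # Steps 2 and 4: create the matrices and initialize first row and column.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
        ptr[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        ptr[0][j] = "left"
    # Steps 3 and 4: fill each cell from its best neighbor, storing a pointer.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            options = [
                (score[i - 1][j - 1] + sub, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            score[i][j], ptr[i][j] = max(options, key=lambda o: o[0])
    # Step 5: follow pointers back from the bottom-right corner.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "diag":
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


nw_score, nw_top, nw_bottom = needleman_wunsch("GATTACA", "GATACA")
```

Ties between equally good moves are broken in favor of the diagonal here; other tie-breaking rules give different but equally optimal alignments.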