
Practical Bioinformatics, Mark Voorhies, 4/28/2017



  1. Practical Bioinformatics. Mark Voorhies, 4/28/2017

  2. Pearson distances. Pearson similarity:

     $$s(x, y) = \frac{\sum_{i}^{N} (x_i - x_{\text{offset}})(y_i - y_{\text{offset}})}{\sqrt{\sum_{i}^{N} (x_i - x_{\text{offset}})^2}\,\sqrt{\sum_{i}^{N} (y_i - y_{\text{offset}})^2}}$$

  3. Pearson distances. Pearson similarity as on slide 2. Pearson distance:

     $$d(x, y) = 1 - s(x, y)$$

  4. Pearson distances. Pearson similarity and Pearson distance as on slide 3. Euclidean distance:

     $$\frac{\sum_{i}^{N} (x_i - y_i)^2}{N}$$
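The three measures on slides 2 to 4 can be sketched in plain Python. This is a minimal illustration, not the course's own code. The slides do not define the offset; the sketch assumes the usual convention that the centered form uses each vector's mean and the uncentered form uses 0. The function names and the `centered` parameter are made up for this example.

```python
import math


def pearson_similarity(x, y, centered=True):
    """Pearson similarity s(x, y) with a configurable offset.

    Assumption: centered=True uses each vector's mean as its offset;
    centered=False uses an offset of 0 (uncentered Pearson).
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and the same length")
    x_off = sum(x) / len(x) if centered else 0.0
    y_off = sum(y) / len(y) if centered else 0.0
    numerator = sum((a - x_off) * (b - y_off) for a, b in zip(x, y))
    denominator = (math.sqrt(sum((a - x_off) ** 2 for a in x))
                   * math.sqrt(sum((b - y_off) ** 2 for b in y)))
    if denominator == 0:
        raise ValueError("similarity is undefined for a constant vector")
    return numerator / denominator


def pearson_distance(x, y, centered=True):
    """Pearson distance d(x, y) = 1 - s(x, y)."""
    return 1.0 - pearson_similarity(x, y, centered)


def euclidean_distance(x, y):
    """Sum of squared differences divided by N, as written on the slide.

    Note there is no square root in this form.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
```

With this definition, two perfectly anti-correlated profiles have a centered Pearson distance of 2, while their uncentered distance is smaller because the shared positive offset still counts as similarity.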

  5. Comparing all measurements for two genes. [Figure: scatter plot, "Comparing two expression profiles (r = 0.97)"; x-axis: TLC1 log2 relative expression; y-axis: YFG1 log2 relative expression]

  6. Comparing all genes for two measurements. [Figure: scatter plot; x-axis: Array 1, log2 relative expression; y-axis: Array 2, log2 relative expression]

  7. Comparing all genes for two measurements: Euclidean Distance. [Figure: scatter plot; x-axis: Array 1, log2 relative expression; y-axis: Array 2, log2 relative expression]

  8. Comparing all genes for two measurements: Uncentered Pearson. [Figure: scatter plot; x-axis: Array 1, log2 relative expression; y-axis: Array 2, log2 relative expression]

  9. PLoS Pathogens 12:e1005910, Naomi Phillip et al.
     1) in vivo: Lower inflammatory response for Ripk3-/-//Casp8-/- transplanted macrophages in mice infected with Yersinia pestis
     2) ex vivo: Lower inflammatory response at the mRNA level in Ripk3-/-//Casp8-/- macrophages stimulated with lipopolysaccharide (LPS)

  10. Measure all pairwise distances under distance metric
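As a rough sketch of this step, the following builds a symmetric matrix of distances between every pair of expression profiles. The metric is passed in as a function, so any distance measure can be used. The helper names and the toy `profiles` data are illustrative assumptions, and the example metric is the sum of squared differences divided by N.

```python
from itertools import combinations


def mean_squared_distance(x, y):
    """Example metric: sum of squared differences divided by N."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def pairwise_distances(profiles, metric):
    """Return a symmetric n x n matrix of distances between all rows."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    # Each unordered pair is computed once and mirrored across the diagonal.
    for i, j in combinations(range(n), 2):
        d = metric(profiles[i], profiles[j])
        dist[i][j] = dist[j][i] = d
    return dist


# Toy data: three genes measured on three arrays (made-up values).
profiles = [[0.0, 1.0, 2.0], [0.0, 1.0, 4.0], [3.0, 3.0, 3.0]]
matrix = pairwise_distances(profiles, mean_squared_distance)
```

For n genes this takes n(n-1)/2 metric evaluations, which is manageable for about a thousand genes but grows quadratically.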

  11-15. Hierarchical Clustering
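The slides show hierarchical clustering only as figures. One common variant, agglomerative clustering with average linkage, can be sketched as follows. The choice of average linkage is an assumption (the slides do not name a linkage rule), and the function name and toy distance matrix are illustrative.

```python
def average_linkage_clustering(dist):
    """Agglomerative clustering on a symmetric distance matrix.

    Returns the merges in order, each as (members_a, members_b, distance),
    where members are indices into the original matrix.
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    next_id = len(dist)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Average linkage: mean distance over all cross-cluster pairs.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        # Replace the two closest clusters with their union.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Toy distance matrix for four items (made-up values).
toy = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
merges = average_linkage_clustering(toy)
```

On the toy matrix, items 0 and 1 merge first, then 2 and 3, and the two pairs join last. This naive version recomputes cluster distances at every step, so it is only suitable for small inputs.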

  16. "It’s hard work at times, but you have to be realistic. If you have a large database with many variables and your goal is to get a good understanding of the interrelationships, then, unless you get lucky, this complex structure is bound to require some hard work to understand." Bill Cleveland and Rick Becker, http://stat.bell-labs.com/project/trellis/interview.html

  17. Using JavaTreeView

  18-19. Adjust pixel settings for global view

  20-21. Select annotation columns

  22. Select URL for gene annotations

  23. http://www.ensembl.org/Mus_musculus/Gene/Summary?g=HEADER

  24. Select URL for gene annotations
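The URL on slide 23 reads as a template, assuming `HEADER` is a placeholder that gets replaced by the selected gene's identifier. The sketch below mimics that substitution in Python. The species is written `Mus_musculus`, as Ensembl URLs spell it. The function name is made up, and the gene identifier is only an illustrative value in Ensembl's mouse ID format.

```python
from urllib.parse import quote

# Template from the slide; HEADER is assumed to be the placeholder token.
URL_TEMPLATE = "http://www.ensembl.org/Mus_musculus/Gene/Summary?g=HEADER"


def annotation_url(gene_id, template=URL_TEMPLATE):
    """Fill the HEADER placeholder with a URL-escaped gene identifier."""
    return template.replace("HEADER", quote(gene_id))


# Illustrative identifier, not taken from the slides.
example_url = annotation_url("ENSMUSG00000000001")
```

The identifiers in the annotation column need to be of a type the target site understands, here Ensembl gene IDs.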

  25-27. Activate and detach annotation window

  28. Homework
      1. For a small expression profiling matrix (1000 genes):
         - Cluster the genes
         - Calculate the correlation matrix
         - Write a CDT file of the clustered gene matrix with the correlation matrix appended
         - Visualize the CDT+GTR files in JavaTreeView. How well did the clustering work?
      2. Repeat the previous exercise, exploring different clustering methods and/or distance methods
      3. Read the supplemental RnaSeq methods for PLoS Pathogens 12:e1005910 (Text S2, exported from RStudio). To what extent is this a reproducible method? Is there additional data that would make it more reproducible?
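For the "write a CDT file" step, a minimal sketch of a writer is given below. The column layout (`GID`, a unique-ID column, `NAME`, `GWEIGHT`, then one column per array, followed by an `EWEIGHT` row) is an assumption based on the tab-delimited convention used by Cluster and JavaTreeView, and should be checked against their documentation. The function name, the `GENE{i}X` numbering scheme, and the toy data are illustrative. The matching GTR file is not covered here.

```python
import io


def write_cdt(handle, gene_ids, gene_names, column_names, rows):
    """Write rows (already in clustered order) as tab-delimited CDT text.

    Assumed layout: a header row, an EWEIGHT row, then one row per gene.
    The GENE{i}X labels must match the labels used in the GTR file.
    """
    header = ["GID", "UNIQID", "NAME", "GWEIGHT"] + list(column_names)
    handle.write("\t".join(header) + "\n")
    eweight = ["EWEIGHT", "", "", ""] + ["1"] * len(column_names)
    handle.write("\t".join(eweight) + "\n")
    for i, (gene_id, name, row) in enumerate(zip(gene_ids, gene_names, rows)):
        fields = [f"GENE{i}X", gene_id, name, "1"] + [f"{v:.4f}" for v in row]
        handle.write("\t".join(fields) + "\n")


# Toy example written to an in-memory buffer (made-up values).
buffer = io.StringIO()
write_cdt(
    buffer,
    gene_ids=["g1", "g2"],
    gene_names=["gene one", "gene two"],
    column_names=["array1", "array2"],
    rows=[[0.5, -1.25], [2.0, 0.0]],
)
cdt_text = buffer.getvalue()
```

Appending the correlation matrix, as the homework asks, would amount to adding one extra column per gene to `column_names` and to each row.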
