
Proteomics Informatics (BMSC-GA 4437), Course Director David Fenyö



  1. Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information David@FenyoLab.org http://fenyolab.org/presentations/Proteomics_Informatics_2014/

  2. http://fenyolab.org/presentations/Proteomics_Informatics_2014/

  3. Proteomics Informatics – Learning Objectives: Be able to analyze proteomics data sets and understand the limitations of the results.

  4. Proteomics Informatics – Syllabus
     Week 1: Overview of proteomics (1/28/2014 at 4 pm in TRB 718)
     Week 2: Overview of mass spectrometry (2/4/2014 at 4 pm in TRB 718)
     Week 3: Analysis of mass spectra: signal processing, peak finding, and isotope clusters (2/11/2014 at 4 pm in TRB 119)
     Week 4: Protein identification I: searching protein sequence collections and significance testing (2/18/2014 at 4 pm in TRB 718)
     Week 5: Protein identification II: de novo sequencing (2/25/2014 at 4 pm in TRB 718)
     Week 6: Databases, data repositories and standardization (3/4/2014 at 4 pm in TRB 718)
     Week 7: Proteogenomics (3/11/2014 at 4 pm in TRB 718)
     Week 8: Protein quantitation I: Overview (3/18/2014 at 4 pm in TRB 718)
     Week 9: Protein quantitation II: Targeted (3/25/2014 at 4 pm in TRB 718)
     Week 10: Protein characterization I: post-translational modifications (4/1/2014 at 4 pm in TRB 718)
     Week 11: Protein characterization II: Protein interactions (4/10/2014 at 4 pm in TRB 718)
     Week 12: Molecular Signatures (4/17/2014 at 4 pm in TRB 718)
     Week 13: Presentations of projects (4/22/2014 at 4 pm in TRB 718)

  5. Proteomics Informatics – Overview of Proteomics (Week 1)
     • Why proteomics?
     • Bioinformatics
     • Overview of the course

  6. Motivating Example: Protein Regulation Geiger et al., “Proteomic changes resulting from gene copy number variations in cancer cells”, PLoS Genet. 2010 Sep 2;6(9). pii: e1001090.

  7. Motivating Example: Protein Complexes Alber et al., Nature 2007

  8. Motivating Example: Signaling Choudhary & Mann, Nature Reviews Molecular Cell Biology 2010

  9. Bioinformatics (diagram labels): Biological System; Experimental Design; Samples; Measurements; Raw Data; Data Analysis; Information

  10. Mass Spectrometry Based Proteomics (workflow diagram labels): Lysis; Fractionation; Digestion; Mass spectrometry (MS); Peak Finding; Charge determination; De-isotoping; Integrating Peaks; Searching; Identified and Quantified Proteins

  11. Proteomics Informatics – Overview of Mass spectrometry (Week 2)
      Instrument diagram: Ion Source; Mass Analyzer; Detector. Output: spectrum of intensity vs. mass/charge.

  12. Proteomics Informatics – Overview of Mass spectrometry (Week 2)
      Instrument diagram: Ion Source; Mass Analyzer 1; Fragmentation; Mass Analyzer 2; Detector. Fragment ion labels: b, y.

  13. Proteomics Informatics – Overview of Mass spectrometry (Week 2)
      Instrument diagram: LC; Ion Source; Mass Analyzer 1; Fragmentation; Mass Analyzer 2; Detector. Output: a series of spectra (intensity vs. mass/charge) acquired over time.

  14. Proteomics Informatics – Analysis of mass spectra: signal processing, peak finding, and isotope clusters (Week 3)
      Figure: mass spectrum (intensity vs. m/z).
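
The peak finding and isotope cluster steps named on this slide can be illustrated with a minimal Python sketch. It is not the course's method: the function names, the local-maximum rule, and the synthetic data are all hypothetical. It relies only on the fact that isotope peaks of an ion with charge z are spaced about 1.00335/z apart on the m/z axis, and it assumes the first picked peak is the monoisotopic one.

```python
# Illustrative sketch only: names, thresholds, and data are hypothetical.
ISOTOPE_SPACING = 1.00335  # approx. mass difference between 13C and 12C, in Da
PROTON = 1.00728           # proton mass, in Da


def pick_peaks(mz, intensity, min_intensity=0.0):
    """Return (m/z, intensity) pairs that are local maxima above a threshold."""
    peaks = []
    for i in range(1, len(mz) - 1):
        if (intensity[i] >= min_intensity
                and intensity[i] > intensity[i - 1]
                and intensity[i] >= intensity[i + 1]):
            peaks.append((mz[i], intensity[i]))
    return peaks


def charge_from_spacing(peak_mz):
    """Estimate the charge from the mean spacing of adjacent isotope peaks."""
    spacings = [b - a for a, b in zip(peak_mz, peak_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(ISOTOPE_SPACING / mean_spacing)


def neutral_mass(mono_mz, charge):
    """Convert the monoisotopic m/z of a protonated ion to a neutral mass."""
    return (mono_mz - PROTON) * charge


# Synthetic profile data containing one isotope cluster.
mz = [499.9, 500.0, 500.1, 500.4, 500.5, 500.6, 500.9, 501.0, 501.1]
intensity = [1.0, 10.0, 1.0, 1.0, 6.0, 1.0, 0.5, 2.0, 0.5]

peaks = pick_peaks(mz, intensity, min_intensity=1.5)
cluster_mz = [m for m, _ in peaks]
z = charge_from_spacing(cluster_mz)
mass = neutral_mass(cluster_mz[0], z)  # assumes the first peak is monoisotopic
```

Real data would normally need smoothing and baseline handling before peak picking; those steps are left out here to keep the sketch short.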

  15. Proteomics Informatics – Protein identification I: searching protein sequence collections and significance testing (Week 4)
      Experiment: Lysis; Fractionation; Digestion; LC-MS; MS/MS.
      Search: from the sequence DB, pick a protein (repeat for all proteins); pick a peptide (repeat for all peptides); compute all fragment masses; compare with the MS/MS spectrum, score, and test significance.
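
The search loop on this slide (pick a protein, pick a peptide, compute all fragment masses, compare and score) can be sketched in Python as follows. Several things here are assumptions rather than content from the slide: tryptic cleavage (after K or R, not before P), singly charged b and y ions, a simple shared-peak count as the score, and all function names. Significance testing is not implemented. The residue masses are the monoisotopic values from the amino acid table in the Week 5 slide.

```python
import re

# Monoisotopic residue masses (Da), as listed in the Week 5 slide.
MONO = {
    "G": 57.0215, "A": 71.0371, "S": 87.032, "P": 97.0528, "V": 99.0684,
    "T": 101.048, "C": 103.009, "I": 113.084, "L": 113.084, "N": 114.043,
    "D": 115.027, "Q": 128.059, "K": 128.095, "E": 129.043, "M": 131.04,
    "H": 137.059, "F": 147.068, "R": 156.101, "Y": 163.063, "W": 186.079,
}
WATER = 18.0106   # Da
PROTON = 1.00728  # Da


def tryptic_peptides(protein):
    """Assumed digestion rule: cleave after K or R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def fragment_mz(peptide):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b + y


def shared_peak_count(theoretical, observed, tol=0.02):
    """Score: number of theoretical fragments with an observed peak within tol."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def best_match(proteins, observed, tol=0.02):
    """Loop over all proteins and all peptides; return (peptide, score)."""
    best = (None, -1)
    for sequence in proteins.values():
        for peptide in tryptic_peptides(sequence):
            score = shared_peak_count(fragment_mz(peptide), observed, tol)
            if score > best[1]:
                best = (peptide, score)
    return best


# Hypothetical one-protein database and a noise-free "observed" spectrum.
proteins = {"P1": "AAKGGRPLLKEE"}
observed = fragment_mz("GGRPLLK")
peptide, score = best_match(proteins, observed)
```

A real search engine would also filter candidates by precursor mass and convert the score into a significance estimate, which this sketch leaves out.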

  16. Proteomics Informatics – Protein identification I: searching protein sequence collections and significance testing (Week 4)

  17. Proteomics Informatics – Protein identification II: de novo sequencing (Week 5)
      Amino acid masses:
      1-letter  3-letter  Monoisotopic  Average   Chemical formula
      A         Ala       71.0371       71.0788   C3H5ON
      R         Arg       156.101       156.188   C6H12ON4
      N         Asn       114.043       114.104   C4H6O2N2
      D         Asp       115.027       115.089   C4H5O3N
      C         Cys       103.009       103.139   C3H5ONS
      E         Glu       129.043       129.116   C5H7O3N
      Q         Gln       128.059       128.131   C5H8O2N2
      G         Gly       57.0215       57.0519   C2H3ON
      H         His       137.059       137.141   C6H7ON3
      I         Ile       113.084       113.159   C6H11ON
      L         Leu       113.084       113.159   C6H11ON
      K         Lys       128.095       128.174   C6H12ON2
      M         Met       131.04        131.193   C5H9ONS
      F         Phe       147.068       147.177   C9H9ON
      P         Pro       97.0528       97.1167   C5H7ON
      S         Ser       87.032        87.0782   C3H5O2N
      T         Thr       101.048       101.105   C4H7O2N
      W         Trp       186.079       186.213   C11H10ON2
      Y         Tyr       163.063       163.176   C9H9O2N
      V         Val       99.0684       99.1326   C5H9ON
      Figure: MS/MS spectrum (% relative abundance vs. m/z) with an [M+2H]2+ label and peaks labeled 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080.
      Flow: Mass Differences, then Sequences consistent with spectrum.
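
The step from mass differences to candidate sequences can be sketched in Python using the monoisotopic masses from the table on this slide. The function names and the 0.2 Da tolerance are hypothetical, and the ladder below is one increasing subset of the integer peak labels printed on the slide (260, 389, 504, 633, 762, 875, 1022). Treating those peaks as one ion series is an assumption made for illustration.

```python
# Monoisotopic residue masses (Da) copied from the slide's table.
MONO = {
    "G": 57.0215, "A": 71.0371, "S": 87.032, "P": 97.0528, "V": 99.0684,
    "T": 101.048, "C": 103.009, "I": 113.084, "L": 113.084, "N": 114.043,
    "D": 115.027, "Q": 128.059, "K": 128.095, "E": 129.043, "M": 131.04,
    "H": 137.059, "F": 147.068, "R": 156.101, "Y": 163.063, "W": 186.079,
}


def residues_for_difference(delta, tol=0.2):
    """Residues whose monoisotopic mass lies within tol of a peak-to-peak gap."""
    return sorted(aa for aa, mass in MONO.items() if abs(mass - delta) <= tol)


def read_ladder(peaks, tol=0.2):
    """Candidate residues for each consecutive mass difference in a peak ladder."""
    ordered = sorted(peaks)
    return [residues_for_difference(b - a, tol) for a, b in zip(ordered, ordered[1:])]


# Integer peak labels taken from the slide; assumed to form one ion series.
ladder = [260, 389, 504, 633, 762, 875, 1022]
calls = read_ladder(ladder)
for (low, high), residues in zip(zip(ladder, ladder[1:]), calls):
    print(f"{low} -> {high}: {high - low} Da, candidates {residues}")
```

Isoleucine and leucine have identical masses, so a 113 Da gap always yields both candidates. That is one reason the slide speaks of sequences consistent with the spectrum rather than a single sequence.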

  18. Proteomics Informatics – Databases, data repositories and standardization (Week 6)

  19. Proteomics Informatics – Databases, data repositories and standardization (Week 6) Most proteins show very reproducible peptide patterns

  20. Proteomics Informatics – Databases, data repositories and standardization (Week 6)
      Figure panels: query spectrum; best match in GPMDB; second-best match in GPMDB.
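
Ranking stored spectra against a query spectrum can be done in several ways. The Python sketch below uses a binned cosine similarity, chosen only for illustration. The slide does not say how GPMDB compares spectra, so nothing here should be read as its method, and the function names, bin width, and spectra are hypothetical.

```python
import math
from collections import defaultdict


def binned(spectrum, bin_width=1.0):
    """Sum intensities of (m/z, intensity) pairs into fixed-width m/z bins."""
    bins = defaultdict(float)
    for mz, intensity in spectrum:
        bins[round(mz / bin_width)] += intensity
    return bins


def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0 = disjoint, 1 = identical shape)."""
    a, b = binned(spec_a, bin_width), binned(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_library(query, library, bin_width=1.0):
    """Return (similarity, entry name) pairs, best match first."""
    scored = [(cosine_similarity(query, spec, bin_width), name)
              for name, spec in library.items()]
    return sorted(scored, reverse=True)


# Hypothetical query and library spectra as (m/z, intensity) pairs.
query = [(260.2, 30.0), (389.2, 100.0), (504.3, 55.0)]
library = {
    "entry_1": [(260.2, 28.0), (389.2, 100.0), (504.3, 60.0)],
    "entry_2": [(260.2, 100.0), (633.3, 40.0)],
}
ranked = rank_library(query, library)
```

The first element of `ranked` plays the role of the best match and the second that of the second-best match.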

  21. Proteomics Informatics – Proteogenomics (Week 7)
      Non-tumor sample: genome sequencing; identify germline variants.
      Tumor sample: genome sequencing and RNA-Seq; identify alternative splicing, somatic variants and novel expression.
      Inputs to the tumor-specific protein DB: reference human database (Ensembl); variants; alternative splicing (exon diagrams: Exon 1, Exon 2, Exon 3, Exon X); novel expression; fusion genes (Gene X, Gene Y).
      Kelly Ruggles

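  22. Proteomics Informatics – Protein quantitation I: Overview (Week 8)
      Symbols on the slide: sample $i$, protein $j$ ($C_{ij}$), peptide $k$ ($I_{ik}$). Steps: Lysis, Fractionation, Digestion, LC-MS, MS. Factors: $\alpha_k$, $p^{L}_{ij}$, $p^{Pr}_{ij}$, $p^{D}_{ijk}$, $p^{Pep}_{ik}$, $p^{LC}_{ik}$, $p^{MS}_{ik}$.
      $$I_{ik} = \alpha_k \sum_j C_{ij} \, p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}$$
      When peptide $k$ derives from a single protein $j$, the sum has one term and rearranges to
      $$C_{ij} = \frac{I_{ik}}{\alpha_k \, p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}}$$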

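  23. Proteomics Informatics – Protein quantitation I: Overview (Week 8)
      Sample $i$, protein $j$, peptide $k$. Steps: Lysis, Fractionation, Digestion, LC-MS, MS.
      Assumption: $\alpha_k \, p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}$ is constant for all samples. Then, for two samples $i_n$ and $i_m$:
      $$\frac{C_{i_n j}}{C_{i_m j}} = \frac{I_{i_n k}}{I_{i_m k}}$$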
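
Under the constant-factor assumption on this slide, a relative protein amount between two samples reduces to a ratio of peptide signals. The Python sketch below reads I as a measured peptide intensity and C as a protein amount, which the slide does not spell out. Summarizing several peptides of one protein by the median ratio is an assumption made here, and the function names and numbers are hypothetical.

```python
from statistics import median


def peptide_ratios(intensities_n, intensities_m):
    """Per-peptide ratios I(i_n, k) / I(i_m, k) for peptides seen in both samples."""
    shared = sorted(set(intensities_n) & set(intensities_m))
    return {k: intensities_n[k] / intensities_m[k]
            for k in shared if intensities_m[k] > 0}


def protein_ratio(intensities_n, intensities_m):
    """Median peptide ratio as an estimate of C(i_n, j) / C(i_m, j); None if no shared peptide."""
    ratios = peptide_ratios(intensities_n, intensities_m)
    return median(ratios.values()) if ratios else None


# Hypothetical intensities for peptides of one protein in two samples.
sample_n = {"AAK": 2.0e6, "GGRPLLK": 4.4e5, "LLEEK": 9.0e4}
sample_m = {"AAK": 1.0e6, "GGRPLLK": 2.0e5, "VVR": 5.0e4}

ratios = peptide_ratios(sample_n, sample_m)
estimate = protein_ratio(sample_n, sample_m)
```

Peptides observed in only one sample are ignored. If the per-step factors differ between samples, the ratio no longer cancels them and the estimate is biased.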

  24. Proteomics Informatics – Protein quantitation II: Targeted (Week 9)
      Common workflow: Lysis; Fractionation; Digestion; LC-MS.
      Shotgun proteomics, Data Dependent Acquisition (DDA):
        1. Records M/Z (MS)
        2. Selects peptides based on abundance and fragments (MS/MS)
        3. Protein database search for peptide identification
      Targeted MS (uses predefined set of peptides):
        1. Select precursor ion (MS)
        2. Precursor fragmentation (MS/MS)
        3. Use Precursor-Fragment pairs for identification
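
The difference between the two precursor selection strategies can be shown with a small Python sketch. It is a simplification under stated assumptions: one MS scan is a list of (m/z, intensity) pairs, DDA is reduced to picking the `top_n` most intense precursors, and targeted acquisition to filtering by a predefined m/z list. The function names, the tolerance, and the values are hypothetical.

```python
def select_dda(precursors, top_n):
    """Data-dependent selection: the top_n most intense precursors in the scan."""
    return sorted(precursors, key=lambda p: p[1], reverse=True)[:top_n]


def select_targeted(precursors, target_mz, tol=0.01):
    """Targeted selection: only precursors within tol of a predefined m/z."""
    return [p for p in precursors
            if any(abs(p[0] - t) <= tol for t in target_mz)]


# Hypothetical MS scan as (m/z, intensity) pairs.
scan = [(400.20, 5e4), (523.77, 9e5), (610.31, 2e5), (745.40, 1e3)]

dda = select_dda(scan, top_n=2)
targeted = select_targeted(scan, target_mz=[745.40])
```

In this example the low-abundance precursor at m/z 745.40 is never chosen by the abundance-based rule but is always chosen by the targeted one, which is the trade-off the slide contrasts.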

  25. Proteomics Informatics – Protein characterization I: post-translational modifications (Week 10)
      Peptide with two possible modification sites and a matching MS/MS spectrum (intensity vs. m/z).
      Which assignment does the data support: site 1, site 1 or 2, or sites 1 and 2?
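
One way to ask which site assignment a spectrum supports is to predict the fragments for each hypothesis and count how many are observed. The Python sketch below does this for single-site hypotheses only. The peptide, the phosphorylation mass shift (79.96633 Da), the shared-peak score, and all names are assumptions for illustration, and the "sites 1 and 2" case is not modeled.

```python
# Monoisotopic residue masses (Da) for the residues used in this example.
MONO = {"A": 71.0371, "S": 87.032, "G": 57.0215, "T": 101.048, "K": 128.095}
WATER = 18.0106    # Da
PROTON = 1.00728   # Da
PHOSPHO = 79.96633  # Da, example modification (HPO3); an assumption


def b_y_ions(peptide, mod_site=None, mod_mass=0.0):
    """Singly charged b and y ions, with an optional mass shift on one residue."""
    masses = [MONO[aa] for aa in peptide]
    if mod_site is not None:
        masses[mod_site] += mod_mass
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b + y


def matched(theoretical, observed, tol=0.02):
    """Number of theoretical fragments with an observed peak within tol."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def score_sites(peptide, sites, mod_mass, observed, tol=0.02):
    """Shared-peak count for each single-site hypothesis (0-based residue index)."""
    return {s: matched(b_y_ions(peptide, s, mod_mass), observed, tol) for s in sites}


# Hypothetical peptide with two candidate sites: S (index 1) and T (index 3).
peptide = "ASGTK"
observed = b_y_ions(peptide, mod_site=3, mod_mass=PHOSPHO)  # simulated spectrum
scores = score_sites(peptide, sites=[1, 3], mod_mass=PHOSPHO, observed=observed)
```

Only fragments that contain exactly one of the two candidate sites distinguish the hypotheses. If those peaks are missing from a real spectrum the scores tie, which corresponds to the "site 1 or 2" answer on the slide.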

  26. Proteomics Informatics – Protein Characterization II: protein interactions (Week 11)
      Diagram: interacting proteins labeled A to F, followed by Digestion, Mass spectrometry, Identification.

  27. Proteomics Informatics – Molecular Signatures (Week 12)

  28. Proteomics Informatics – Molecular Signatures (Week 12)

  29. Proteomics Informatics – Presentations of projects (Week 13)
      Select a published data set that has been made public and reanalyze it.
      Highlighted data sets: http://www.thegpm.org/
      10 min presentations

  30. Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information David@FenyoLab.org http://fenyolab.org/presentations/Proteomics_Informatics_2014/
