Mapping the sub-cellular proteome Laurent Gatto lg390@cam.ac.uk – @lgatt0 http://www.damtp.cam.ac.uk/user/lg390/ Slides @ https://zenodo.org/record/1063508 22 Nov 2017, Cambridge Computational Biology Institute
These slides are available under a Creative Commons CC-BY license. You are free to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) for any purpose, even commercially.
Plan
  Spatial proteomics
  The LOPIT pipeline
  Improving on LOPIT
    Experimental advances: hyperLOPIT
    Computational advances: Transfer learning
  Biological applications
    Dual-localisation
    Trans-localisation
  Open development: R/Bioconductor software
Regulations
Cell organisation
Spatial proteomics is the systematic study of protein localisations.
Image from Wikipedia http://en.wikipedia.org/wiki/Cell_(biology).
Spatial proteomics - Why?
Localisation is function
◮ The cellular sub-division allows cells to establish a range of distinct micro-environments, each favouring different biochemical reactions and interactions and, therefore, allowing each compartment to fulfil a particular functional role.
◮ Localisation and sequestration of proteins within sub-cellular niches is a fundamental mechanism for the post-translational regulation of protein function.
Re-localisation in
◮ Differentiation: Tfe3 in mouse ESC (Betschinger et al., 2013).
◮ Activation of biological processes. Examples later.
Spatial proteomics - Why?
Mis-localisation
Disruption of the targeting/trafficking process alters proper sub-cellular localisation, which in turn perturbs the cellular functions of the proteins.
◮ Abnormal protein localisation leads to the loss of functional effects in diseases (Laurila and Vihinen, 2009).
◮ Disruption of the nuclear/cytoplasmic transport (nuclear pores) has been detected in many types of carcinoma cells (Kau et al., 2004).
Spatial proteomics - How, experimentally
[Diagram. Population level, subcellular fractionation, by number of fractions: 1 fraction (pure fraction catalogue); 2 fractions, enriched and crude (subtractive proteomics, enrichment); n discrete fractions (invariant rich fraction, clustering); n continuous fractions, gradient approaches (LOPIT with PCA and PLS-DA; PCP with χ²). Single cell, direct observation: GFP, epitope, protein-specific antibody. Category labels: cataloguing, relative abundance, tagging. Quantitative mass spectrometry.]
Figure : Organelle proteomics approaches (Gatto et al., 2010)
Fusion proteins and immunofluorescence Figure : Targeted protein localisation.
Fusion proteins and immunofluorescence Figure : Example of discrepancies between IF and FPs as well as between FP tagging at the N and C termini (Stadler et al., 2013).
Spatial proteomics - How, experimentally
Figure : Organelle proteomics approaches (Gatto et al., 2010). Gradient approaches: Dunkley et al. (2006), Foster et al. (2006).
⇒ Explorative/discovery approaches, steady-state global localisation maps.
Cell lysis
Fractionation/centrifugation (e.g. Mitochondrion)
Quantitation/identification by mass spectrometry (e.g. Mitochondrion)
Quantitation data and organelle markers

       Fraction 1   Fraction 2   ...   Fraction m   markers
p_1    q_{1,1}      q_{1,2}      ...   q_{1,m}      unknown
p_2    q_{2,1}      q_{2,2}      ...   q_{2,m}      loc 1
p_3    q_{3,1}      q_{3,2}      ...   q_{3,m}      unknown
p_4    q_{4,1}      q_{4,2}      ...   q_{4,m}      loc i
...    ...          ...          ...   ...          ...
p_j    q_{j,1}      q_{j,2}      ...   q_{j,m}      unknown
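As a rough illustration of the table above, the sketch below holds one quantitation profile per protein plus a marker annotation, and splits the proteins into labelled markers and proteins of unknown localisation. The protein names, the values and the helper `split_by_annotation` are hypothetical and only serve the example; this is not the data structure used by any particular software.

```python
# Hypothetical toy data: one profile per protein, one value per fraction.
profiles = {
    "p1": [0.10, 0.20, 0.70],
    "p2": [0.60, 0.30, 0.10],
    "p3": [0.55, 0.35, 0.10],
    "p4": [0.05, 0.15, 0.80],
}

# Marker annotation: a known localisation, or "unknown".
markers = {"p1": "unknown", "p2": "loc1", "p3": "unknown", "p4": "loc2"}


def split_by_annotation(profiles, markers):
    """Return (labelled, unlabelled) dictionaries keyed by protein name."""
    labelled = {}
    unlabelled = {}
    for protein, profile in profiles.items():
        label = markers.get(protein, "unknown")
        if label == "unknown":
            unlabelled[protein] = profile
        else:
            labelled[protein] = (profile, label)
    return labelled, unlabelled


labelled, unlabelled = split_by_annotation(profiles, markers)
print(sorted(labelled), sorted(unlabelled))
```

The labelled proteins are the training examples for any later classification step; the unlabelled ones are the proteins whose localisation is to be predicted.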
Visualisation and classification
[Figure panels: correlation profiles across fractions (1, 2, 4, 5, 7, 8, 11, 12) for ER, Golgi, mit/plastid, PM and vacuole; principal component analysis (PC1 against PC2) with points labelled by organelle (ER, vacuole, Golgi, mit/plastid, PM) and by status (marker, PLS-DA, unknown).]
Figure : From Gatto et al. (2010), Arabidopsis thaliana data from Dunkley et al. (2006)
Data analysis

         Fraction 1   Fraction 2   ...   Fraction m   markers
prot 1   q_{1,1}      q_{1,2}      ...   q_{1,m}      unknown
prot 2   q_{2,1}      q_{2,2}      ...   q_{2,m}      organelle 1
prot 3   q_{3,1}      q_{3,2}      ...   q_{3,m}      unknown
prot 4   q_{4,1}      q_{4,2}      ...   q_{4,m}      organelle 2
...      ...          ...          ...   ...          ...
prot i   q_{i,1}      q_{i,2}      ...   q_{i,m}      organelle k
...      ...          ...          ...   ...          ...
prot n   q_{n,1}      q_{n,2}      ...   q_{n,m}      unknown

[The same matrix is also shown without the markers column.]
[Figure: Principal Component Analysis Plot, PC1 (64.36%) against PC2 (22.34%).]
Supervised machine learning: using labelled marker proteins to match unlabelled proteins (of unknown localisation) with similar profiles and classify them as residents of the markers' organelle class.
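The supervised step can be sketched with a very small k-nearest-neighbours classifier: each unlabelled profile takes the majority organelle among its closest marker profiles. k-nearest neighbours is used here only as a simple stand-in (the slides mention PLS-DA for LOPIT), and the profiles, protein names and the function `knn_classify` are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(profile, markers, k=3):
    """Assign the majority organelle among the k nearest marker profiles.

    markers is a list of (profile, organelle) pairs.
    """
    nearest = sorted(markers, key=lambda m: euclidean(profile, m[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical marker profiles (three fractions) for two organelles.
markers = [
    ([0.70, 0.20, 0.10], "ER"),
    ([0.65, 0.25, 0.10], "ER"),
    ([0.60, 0.30, 0.10], "ER"),
    ([0.10, 0.20, 0.70], "Golgi"),
    ([0.15, 0.15, 0.70], "Golgi"),
    ([0.10, 0.25, 0.65], "Golgi"),
]

# Hypothetical proteins of unknown localisation.
unknowns = {
    "protA": [0.68, 0.22, 0.10],
    "protB": [0.12, 0.18, 0.70],
}

predictions = {name: knn_classify(p, markers) for name, p in unknowns.items()}
print(predictions)
```

In practice a classifier would also report a score per assignment, so that low-confidence predictions can be left as unknown rather than forced into a class.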