Definitions Definitions Genome DNA Genomics Transcriptomics - - PDF document




SLIDE 1

Ron

Gérard, Daniel, Stephane, Sebastien Khaled, Christine, Vassilios

Automatic identification and characterisation of proteins

Pierre-Alain, Robin, David, Patricia H., Markus, Patricia P., Marc, Nadine

Definitions

DNA (Genome, « Genomics ») → mRNA (« Transcriptomics ») → proteins (Proteome, « Proteomics ») → Cell function

SLIDE 2

Definitions (cont.)

  • Proteome: the set of expressed genes (proteins) in a given organism, at a given point in time, in a given situation. The protein complement of the genome.
  • Proteomics: the analysis of proteome(s)

The big challenge

To observe the whole system. To understand the systems, instead of their components and their complexity.

SLIDE 3

Protein chemistry ≠ Proteomics

Protein chemistry

  • Individual proteins
  • Complete sequence analysis
  • Emphasis on structure and function
  • Structural biology

Proteomics

  • Complex mixtures
  • Partial sequence analysis
  • Emphasis on identification by DB matching
  • Systems biology

Question

If we can measure gene expression (mRNA), why bother with proteomics?

SLIDE 4

Proteome complexity

[Figure: a gene product drawn as segments a b c d and its variant forms: alternative splicing (a b c, a c d); truncations and fragments (a’ b c d); discrete and heterogeneous PTMs (a b’ c’ d).]

SLIDE 5

[Figure: annotated protein map. Labels: Albumin, α2-Macroglobulin, Ceruloplasmin, C1s, IgM µ chain, α1-B-Glycoprotein, Prothrombin, α1-Antitrypsin, Antithrombin III, α1-Antichymotrypsin, IgA α chain, IgD δ chain, α1-Antitrypsin, α2-Antiplasmin, Angiotensinogen, Hemopexin, ApoE3,3, ApoJ, Zn-α-Glycoprotein, ApoA-IV, NA3, Transthyretin (multimer), Fibrinogen γ chain, Fibrinogen cleaved γ chain, Haptoglobin β chain, Haptoglobin cleaved β chain, α2-HS-Glycoprotein, Actin, Gc-Globulin, Histidine-rich glycoprotein, ApoJ, ApoA-I, ApoA-II, Transthyretin, ApoD, Ig light chains, α-Microglobulin, Haptoglobin α1 chain, Haptoglobin α2 chain, ProapoA-I, SRBP, ApoC-II & C-III, ApoE (frag.), Ig J chain, C3, LRG, ApoA-I (frag.), ApoA-IV (frag.), Ac. Orosomucoid, Paraoxonase, C-reactive protein, Serum amyloid P, Hemopexin (frag.), Ig κ light chain, C3 α, IgA S chain, Vitronectin (frag.)]

Haptoglobin α1 chain

Haptoglobin α2 chain

Transthyretin

SLIDE 6

[Figure, modified from Electrophoresis 1997, 18, 533-537: log-log plot of the abundance of 19 gene products at the protein level (X-axis: protein abundance, CB stained 2D-PAGE gels) and at the mRNA level (Y-axis: message abundance, % of clones in RTI). R=0.48. Gene products shown: ACTG, ACTB, CPS, CYB5, ENPL, CRTC, HSP60, TBA1, LAMR, TBB1, HSP90, LAMB, TPM, HSP70, GR75, F1ATPB, PDI, HSC70, BIP.]
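A correlation coefficient such as the R=0.48 quoted for this plot is typically computed on log-transformed abundances. The sketch below is one way to do that with a Pearson coefficient; whether the original figure used exactly this procedure is an assumption, and the function names and input values are illustrative only, not data from the figure.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def log_log_correlation(protein_abundance, mrna_abundance):
    """Correlate abundances after a log10 transform (as on a log-log plot)."""
    log_protein = [math.log10(v) for v in protein_abundance]
    log_mrna = [math.log10(v) for v in mrna_abundance]
    return pearson_r(log_protein, log_mrna)

# Hypothetical values, chosen only to exercise the function.
protein = [1e3, 1e4, 1e5, 1e6]
mrna = [0.001, 0.01, 0.1, 1.0]
r = log_log_correlation(protein, mrna)
```

A value of R well below 1, as on the slide, means that mRNA abundance predicts protein abundance only loosely.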

Proteins are heterogeneous chemical entities

  • Huge variations in concentration
  • Wide range of molecular weight
  • Wide range of pI
  • Wide range of hydrophobicity
  • Modify over time
  • Differ from individual to individual
  • Various stabilities

SLIDE 7

Active proteins are not living alone

  • Interactions with other proteins
  • Interactions with DNA/RNA
  • Interactions with chemical entities (ligands, cofactors, metal ions, hormones, etc.)

How variable is the human proteome?

  • 1 proteome per tissue (250 tissues)
  • 1 proteome per sub-cellular fraction, etc.
  • 1000 diseases
  • Proteomes change with external influences (food, drugs, …)
  • Proteomes change with time
  • One genome per individual, but how many proteomes? >10^6?
  • 40’000 genes → 1’000’000 proteins?
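The last question on this slide can be turned into simple arithmetic: 40’000 genes giving 1’000’000 proteins means 25 protein forms per gene on average. The sketch below computes that ratio and, under a deliberately crude assumption that is not from the slides (each PTM site is independently either modified or not), shows how quickly splice variants and PTM sites multiply. All function names are illustrative.

```python
def average_forms_per_gene(n_genes, n_proteins):
    """Average number of distinct protein forms per gene."""
    return n_proteins / n_genes

def max_forms(n_splice_variants, n_ptm_sites):
    """Upper bound on protein forms for one gene.

    Assumption (hypothetical model): every PTM site is independently
    either modified or unmodified, giving 2**n_ptm_sites states per
    splice variant.
    """
    return n_splice_variants * 2 ** n_ptm_sites

ratio = average_forms_per_gene(40_000, 1_000_000)  # 25.0
bound = max_forms(2, 4)                            # 2 * 2**4 = 32
```

Under this toy model, two splice variants with four on/off PTM sites already exceed the average of 25 forms per gene.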

SLIDE 8

Definitions (end)

  • Proteome: the set of expressed genes (proteins) in a given organism, at a given point in time, in a given situation. The protein complement of the genome.
  • Proteomics: the analysis of proteome(s)
  • Proteomatics: bioinformatics for proteomics

Proteomics: an analytical and bioinformatic challenge

  • No equivalent of PCR
    – A small amount of a polypeptide must be detected and analysed without any amplification
  • No specific hybridisation
  • One gene produces n? proteins
  • Requires analytical methods and specific bioinformatics tools

SLIDE 9

Proteomics: a challenge

Integration of four important tools:

  • 1. Analytical protein separation technologies
  • 2. Databases (complete genome sequences, EST, proteins)
  • 3. Mass spectrometry (MS)
  • 4. Bioinformatics

1. Analytical protein separation technologies

  • LC
  • 1-D
  • 2-D

SLIDE 10

2. Databases

  • Complete genome sequences
  • EST
  • Proteins

→ Large but known index of possible proteins
→ Prediction (motifs, domains, structures)

3. Mass spectrometry (MS)

  • Precise analysis of bio-molecules (proteins and peptides):
    – Exact measure of intact protein masses → > 100 kDa
    – Exact measure of peptide masses from enzymatic digestion → high sensitivity and high precision
    – Peptide sequence analysis
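Measuring peptide masses from an enzymatic digestion has a direct computational counterpart: digesting a known sequence in silico and computing the expected masses. The sketch below assumes trypsin as the enzyme (cleavage after K or R, except before P), which the slides do not specify, and uses standard monoisotopic residue masses. Function names are illustrative, and modifications and missed cleavages are ignored.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added per free peptide

def tryptic_digest(sequence):
    """Split a sequence after K or R, unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides

def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER

# Arbitrary example sequence.
for pep in tryptic_digest("GQMRTNEKE"):   # -> GQMR, TNEK, E
    print(pep, round(peptide_mass(pep), 4))
```

The list of masses produced this way is what a measured peptide mass spectrum is compared against.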

SLIDE 11

4. Bioinformatics

  • Correlation of MS data and specific protein sequence databases.
  • De novo sequencing
  • Characterisation of proteins with MS data
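Correlating MS data with a sequence database can be sketched as peptide mass matching: each database protein is represented by its theoretical peptide masses, and the candidates are ranked by how many observed masses they explain within a tolerance. The scoring by simple hit count, the 50 ppm default, the function names and the two database entries below are all illustrative assumptions, not a description of any specific identification tool.

```python
def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count observed masses lying within tol_ppm of any theoretical mass."""
    hits = 0
    for mass in observed:
        if any(abs(mass - t) / t * 1e6 <= tol_ppm for t in theoretical):
            hits += 1
    return hits

def rank_proteins(observed, database, tol_ppm=50.0):
    """Rank database entries by number of matched peptide masses."""
    scores = {
        name: count_matches(observed, masses, tol_ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical database: protein name -> theoretical peptide masses (Da).
database = {
    "PROT_A": [1000.50, 1500.75, 2000.00],
    "PROT_B": [900.40, 1500.75],
}
observed = [1000.51, 1500.76, 2000.02]  # hypothetical measured masses
ranking = rank_proteins(observed, database)
```

Real tools use statistical scores rather than a raw hit count, since large proteins match more masses by chance.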

From A. Pandey, M. Mann, Nature 2000, 405, 837-846

Principal applications of proteomics

  • Micro-characterisation of proteins
    – Identification (catalogue of proteins)
    – Characterisation of post-translational modifications
  • Analysis of differential protein expression associated with a disease, different cell states, sample treatments → drug targets, disease markers
  • Study of protein/protein interactions

SLIDE 12

Proteomics pathway

[Figure: workflow linking Sample, Separation, Data analysis and selection of spot(s), Post-separation analysis, Data processing and Databases, together with sample and data tracking. Sequence labels in the figure: G Q M R T N E K E; ... NRTKGG ...]

Proteomics pathway

  • Sample: choice of sample, sample collection, sample pre-fractionation, sample pre-treatment
  • Separation: LC (CEX, affinity, etc.), 1-DE (CE, SDS), 2-DLC, 2-DE
  • Data analysis, selection of spot(s): samples comparison, statistical analysis, choice of fractions (LC), choice of gel spots (1-DE, 2-DE), systematic analysis
  • Post-separation analysis: Edman sequencing, AAA, endoproteolytic cleavage, mass spectrometry (MALDI-MS, ESI MS/MS)
  • Data processing: specific identification tools, specific characterisation tools, analysis tools, validation tools