Information Storage and Processing in Biological Systems: A seminar course for the Natural Sciences
Sept 11  Biological Information
Sept 16  DNA, Gene regulation
Sept 18  Translation and Proteins
Sept 23  Enzymes and Signal transduction
Sept 25  Biochemical Networks
Sept 30  Simple Genetic Networks (Dr. Jacob)
Oct 2
Background
• The Thread of Life. Susan Aldridge. Chapter 2
• Molecular Biology of the Cell. Alberts et al. Garland Press
Suggested further reading
• Protein molecules as computational elements in living cells. D. Bray. Nature. 1995 Jul 27;376(6538):307-12.
• Signaling complexes: biophysical constraints on intracellular communication. D. Bray. Annu Rev Biophys Biomol Struct. 1998;27:59-75.
• Metabolic modeling of microbial strains in silico. M. W. Covert, et al. Trends in Biochemical Sciences. 2001;26:179-186.
• Modelling cellular behaviour. D. Endy & R. Brent. Nature. 2001;409:391-395.
A - Introduction to Proteins / Translation
• The primary structure is defined as the sequence of amino acids in the protein. This is determined by, and is co-linear with, the sequence of bases (triplet codons) in the gene*.

DNA      5’---CTCAGCGTTACCAT---3’
         3’---GAGTCGCAATGGTA---5’
              (transcription)
RNA      5’---CUCAGCGUUACCAU---3’
              (translation)
PROTEIN  N---Leu-Ser-Val-Thr---C

* This is not strictly true in most eukaryotic genomes.
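The co-linearity between bases and amino acids can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes the standard genetic code, treats the upper (5’ to 3’) DNA strand on the slide as the coding strand, and the names `transcribe` and `translate` are hypothetical helpers.

```python
# Standard genetic code, built from the conventional UCAG ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_strand):
    """Return the RNA with the same sequence as the DNA coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(rna):
    """Translate successive triplets, starting at position 0, into one-letter codes.

    Stops at the first stop codon; a trailing partial codon is ignored.
    """
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("CTCAGCGTTACCAT")  # the fragment shown on the slide
protein = translate(rna)
print(rna)      # CUCAGCGUUACCAU
print(protein)  # LSVT, i.e. Leu-Ser-Val-Thr
```

The fragment is read from its first base here only because the slide draws it that way; in a real message the reading frame is set by the translation start.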
Structure of Genes In Eukaryotic Organisms
[Figure: a eukaryotic gene with control elements, introns, and exons. Transcription yields hnRNA (heterogeneous nuclear RNA); RNA splicing yields mRNA, and alternative RNA splicing can yield more than one mRNA.]
Structure of Genes In Eukaryotic Organisms
• The coding sequence can be discontinuous, and the gene can be composed of many introns and exons.
• The control regions (= operators) can be spread over a large region of DNA and exert action-at-a-distance.
• There can be many different regulators acting on a single gene, i.e. more signal integration than in bacteria.
• Alternative splicing can give rise to more than one protein product from a single ‘gene’.
• Predicting genes (introns, exons and proper splicing) is very challenging.
• Because the control elements can be spread over a large segment of DNA, predicting the important sites and their effects on gene expression is not very feasible at this time.
Translation
• Translation is the synthesis of a polypeptide (protein) chain using the mRNA template.
• Note that the mRNA has directionality and is read from the 5’ end towards the 3’ end.
• Note that many ribosomes can read one message, like beads on a string, generating many polypeptide chains simultaneously.
Translation
• The 5’ end of the message is defined at the DNA level by the promoter, but this does not define the translation start.
• The translation start sets the ‘register’ or reading frame for the message.
• The end is determined by the presence of a STOP codon (in the correct reading frame).
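The role of the reading frame can be made concrete with a short Python sketch. It assumes the three standard stop codons (UAA, UAG, UGA); the function name `read_frame` and the example message are hypothetical and chosen only for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_frame(mrna, start):
    """Return the codons read from `start` up to the first in-frame stop codon.

    Returns None if the message ends before an in-frame stop codon is found.
    """
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return None


message = "AUGCUGAGCUAA"  # made-up example message

# Read from position 0: the UGA at positions 4-6 is out of frame and ignored.
print(read_frame(message, 0))  # ['AUG', 'CUG', 'AGC']

# Shift the start by one base and the same UGA is now an in-frame stop.
print(read_frame(message, 1))  # ['UGC']
```

The same bases give different codons, and a different end point, depending on where reading starts, which is why the start position rather than the promoter determines the protein that is made.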
Schematic Illustration of Translation Protein Synthesis involves specialized RNA molecules called transfer RNA or tRNA.
Translation Start Position
The translation start is dependent on:
1) a sequence motif called a ribosome binding site (rbs)
2) an AUG start codon 5-10 nucleotides downstream from the rbs

16S rRNA, 3’ end   3’-AUUCCUCA-//-5’
                        ||||||
mRNA          5’-NNNNNNNAGGAGU-N(5-10)-AUG-//-3’
                        rbs            start
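The two requirements above can be turned into a simple pattern search. The sketch below is an illustration under stated assumptions: it requires an exact match to the AGGAGU motif drawn on the slide (real ribosome binding sites vary and need not match exactly), and `find_translation_starts` and the example message are hypothetical.

```python
import re

# rbs motif from the slide, then a spacer of 5-10 nucleotides, then AUG.
# The non-greedy spacer picks the AUG closest to the rbs within that window.
START_PATTERN = re.compile(r"AGGAGU[ACGU]{5,10}?(AUG)")


def find_translation_starts(mrna):
    """Return the 0-based positions of AUG codons preceded by an rbs."""
    return [match.start(1) for match in START_PATTERN.finditer(mrna)]


# Made-up message: 7 leader bases, the rbs, a 6-nucleotide spacer, then AUG.
message = "GGCCAUU" + "AGGAGU" + "ACCGUU" + "AUG" + "GCUAGCUAA"
print(find_translation_starts(message))  # [19]

# An AUG with no rbs upstream is not reported.
print(find_translation_starts("GGCCAUUACCGUUAUGGCUAGCUAA"))  # []
```

A scoring scheme that tolerates mismatches to the motif would be closer to how such sites are found in practice; exact matching is used here only to keep the example short.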
In bacteria a single mRNA molecule can code for several proteins. Such messages are said to be polycistronic. Since the messages for all genes in such a transcript are present at the same concentration (they are on the same molecule), one might predict that translation levels will be the same for all the genes. This is not the case: translation efficiency can vary for the different messages within a transcript.
[Figure: DNA with a promoter (start) and a terminator (stop) flanking Gene 1, Gene 2, Gene 3, and Gene 4, transcribed into a single mRNA: 4 genes, 1 message.]
Translation efficiency is an important part of gene expression
[Figure: translation of a polycistronic mRNA]

Protein                     Tar    Tap    R      B      Y      Z
Protein monomers per cell   5000   1000   <100   1000   18000  10000

A single mRNA may encode several proteins. The final level of each protein may vary significantly and is a function of:
1) translation efficiency
2) protein stability
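One way to see how these two factors combine is a first-order model, which is an assumption introduced here and not taken from the slides: protein P is made at a rate proportional to the mRNA level and lost at a rate proportional to P, so dP/dt = k_translation * mRNA - k_degradation * P. All names and numbers below are hypothetical and chosen only for illustration.

```python
def steady_state_protein(mrna_copies, k_translation, k_degradation):
    """Steady state of dP/dt = k_translation * mRNA - k_degradation * P.

    Setting dP/dt = 0 gives P = k_translation * mRNA / k_degradation.
    """
    return k_translation * mrna_copies / k_degradation


# Three genes on the same polycistronic message share one mRNA level.
# All rate constants are made-up values in arbitrary units.
mrna_copies = 2.0

gene_a = steady_state_protein(mrna_copies, k_translation=10.0, k_degradation=0.02)
gene_b = steady_state_protein(mrna_copies, k_translation=1.0, k_degradation=0.02)
gene_c = steady_state_protein(mrna_copies, k_translation=10.0, k_degradation=0.2)

# Gene B is translated 10x less efficiently than gene A;
# gene C is translated as well as gene A but its protein is 10x less stable.
print(round(gene_a), round(gene_b), round(gene_c))
```

In this sketch a tenfold drop in translation efficiency and a tenfold drop in stability lower the protein level by the same factor, so a measured level alone cannot tell the two apart.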
B – Introduction to Proteins / Characteristics
There are 20 naturally occurring amino acids in proteins, each with distinctive ‘side chains’ that give them characteristic chemical properties.
[Figure: the amino acid alanine, H2N-CH(CH3)-COOH, with the amino group, the carboxylic acid group, the α-carbon, and the -CH3 (methyl) side chain labeled.]
Amino acids differ in the side chains on the α-carbon.
Alanine (ala, A) + Tryptophan (trp, W) → Dipeptide (Ala-Trp) + H2O
[Figure: the carboxyl group of alanine and the amino group of tryptophan condense, releasing H2O, to form the peptide bond of the dipeptide Ala-Trp.]
By convention polypeptides are written from the N-terminus (amino) to the C-terminus (carboxy).
Amino acid      3-letter  1-letter
Alanine         ala       A
Arginine        arg       R
Asparagine      asn       N
Aspartic acid   asp       D
Cysteine        cys       C
Glutamine       gln       Q
Glutamic acid   glu       E
Glycine         gly       G
Histidine       his       H
Isoleucine      ile       I
Leucine         leu       L
Lysine          lys       K
Methionine      met       M
Phenylalanine   phe       F
Proline         pro       P
Serine          ser       S
Threonine       thr       T
Tryptophan      trp       W
Tyrosine        tyr       Y
Valine          val       V
[Figure: chemical structures of glycine, proline, and cysteine.]
The Newly Synthesized Polypeptide
• The information from DNA → RNA → Protein is linear and the final polypeptide synthesized will have a sequence of amino acids defined by the sequence of codons in the message.
• The sequence of amino acids is called the primary structure.
• Secondary structure refers to local regular/repeating structural elements.
• The folded three dimensional structure is referred to as tertiary structure.
Protein function depends on an ordered / defined three dimensional folding. The final three dimensional folded state of the protein is an intrinsic property of the primary sequence. How the primary sequence defines the final folded conformation is generally referred to as the Protein Folding Problem.
Primary structure of green fluorescent protein (single letter AA codes)
SEQUENCE 238 AA, 26886 MW
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKL
PVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNY
KTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKN
GIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNE
KRDHMVLLEFVTAAGITHGMDELYK
The primary sequence can be derived directly from the gene sequence, but going from sequence to structure or sequence to function is not possible unless there is a related protein for which structure or function is known. Likewise, the structure alone rarely provides information about function (only if the function of a related protein is known).
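As a small illustration of working with a primary sequence, the Python sketch below checks the stated length of the GFP sequence and spells out a numbered stretch of residues in three-letter codes. The helper name `residues` is hypothetical; positions are counted from 1 at the N-terminus, following the convention for writing polypeptides.

```python
# One-letter to three-letter codes for the 20 amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# GFP primary sequence as given above, N-terminus first.
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKL"
    "PVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNY"
    "KTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKN"
    "GIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNE"
    "KRDHMVLLEFVTAAGITHGMDELYK"
)


def residues(sequence, first, last):
    """Name residues first..last (1-based, inclusive) as e.g. 'Ile188-Gly189'."""
    segment = sequence[first - 1:last]
    return "-".join(
        f"{THREE_LETTER[aa]}{position}"
        for position, aa in enumerate(segment, start=first)
    )


print(len(GFP))                 # 238
print(residues(GFP, 188, 193))  # Ile188-Gly189-Asp190-Gly191-Pro192-Val193
```

Residues 188 to 193 are the stretch labeled in the structure figures for this protein, so the numbering in the sequence and in the figures can be checked against each other.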
Projections of the Tertiary Structure of Green Fluorescent Protein
[Figures: the same structure drawn in five ways.]
• Backbone tracing, with residues Ile188-Gly189-Asp190-Gly191-Pro192-Val193 labeled.
• “Ribbon diagram” showing secondary structures (α-helix, β-strand).
• “Wireframe” model showing all atoms and chemical bonds, with residues Ile188-Gly189-Asp190-Gly191-Pro192-Val193 labeled.
• “Stick” model showing all atoms and chemical bonds.
• “Space filling” model where each atom is represented as a sphere of its van der Waals radius.