Modeling the structure and function of proteins and macromolecular assemblies. Marc A. Marti-Renom, Structural Genomics Unit, Bioinformatics Department, Prince Felipe Research Center (CIPF), Valencia, Spain. http://bioinfo.cipf.es/sgu/
Principles of protein structure. [Figure: an example sequence (GFCHIKAYTRLIMVG…) with structures from Desulfovibrio vulgaris, Anacystis nidulans, Chondrus crispus and Anabaena 7120. Labels: Folding (physics), Evolution (rules), Ab initio prediction, Threading, Comparative Modeling.] D. Baker & A. Sali. Science 294, 93, 2001.
ModBase statistics: large-scale modeling of the TrEMBL-SWISSPROT databases. http://www.salilab.org/modbase/ Sequences (total): 1,930,692. Sequences (modeled): 1,084,784. Models: 3,094,542. University of California San Francisco.
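As a quick reading of the ModBase counts on this slide, the following minimal sketch computes the fraction of sequences with at least one model and the average number of models per modeled sequence. The variable names are illustrative assumptions and are not part of ModBase.

```python
# Counts copied from the ModBase statistics slide.
sequences_total = 1_930_692
sequences_modeled = 1_084_784
models = 3_094_542

# Share of all sequences that have at least one model.
modeled_fraction = sequences_modeled / sequences_total

# Average number of models per modeled sequence.
models_per_modeled_sequence = models / sequences_modeled

print(f"Modeled sequences: {100 * modeled_fraction:.1f}%")
print(f"Models per modeled sequence: {models_per_modeled_sequence:.2f}")
```

An average above one is consistent with several models being stored per sequence, although the slide does not say how they are distributed.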
Utility of protein structure models, despite errors
Example 1: Missense mutations in BRCT domains. [Figure: missense mutations in the BRCT domains, arranged by cancer association (cancer associated, not cancer associated, unknown) and by transcription activation (transcription activation, no transcription activation).]
Example 1: Putative binding site on BRCA1. Putative binding site predicted in 2003 and accepted for publication in March 2004. Williams et al. 2004, Nature Structural & Molecular Biology 11:519 (June 2004). Mirkovic et al. 2004, Cancer Research 64:3790 (June 2004).
Example II: For many protein structures (20%), function is unknown

                   Structural Genomics*   Traditional methods
Annotated**        654                    28,342
Not annotated      506 (43.6%)            6,815 (19.4%)
Total deposited    1,160                  35,157

* annotated as STRUCTURAL GENOMICS in the header of the PDB file
** annotated with either CATH, SCOP, Pfam or GO terms in the MSD database
36,317 protein structures, as of August 8th, 2006
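The percentages on this slide follow directly from the deposited counts. A minimal sketch of the arithmetic, using hypothetical dictionary keys and a hypothetical helper name, is:

```python
# Counts copied from the slide (PDB structures as of August 8th, 2006).
counts = {
    "Structural Genomics": {"annotated": 654, "not_annotated": 506},
    "Traditional methods": {"annotated": 28_342, "not_annotated": 6_815},
}

def percent_not_annotated(group):
    """Percentage of deposited structures in a group that lack annotation."""
    total = group["annotated"] + group["not_annotated"]
    return 100.0 * group["not_annotated"] / total

per_group = {name: percent_not_annotated(g) for name, g in counts.items()}

# Pool both groups to get the overall share without annotation.
total_structures = sum(g["annotated"] + g["not_annotated"] for g in counts.values())
total_not_annotated = sum(g["not_annotated"] for g in counts.values())
overall = 100.0 * total_not_annotated / total_structures

for name, pct in per_group.items():
    print(f"{name}: {pct:.1f}% not annotated")
print(f"Overall: {overall:.1f}% of {total_structures} structures not annotated")
```

The pooled value is what the "20%" figure on the slide appears to refer to; that reading is an inference from the counts, not something the slide states.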
Example II: Representation. [Figure panels: surface geometry, sequence conservation, structure conservation, solvent accessibility, electrostatics.]
Example II: Prediction accuracy. [Figure: bar chart of accuracy (%), axis 0 to 100, per ligand (FAD, NAD, NAP, HEM, ANP, NDP, ATP, FMN, ADP, GDP, AMP, HEC, CIT, GAL, GLC, BOG, NAG, MES, FUC, MAN), with series labeled Random and Minimized.]
Example III: S. cerevisiae ribosome. Fitting of comparative models into a 15 Å cryo-electron density map. 43 proteins could be modeled on 20-56% sequence identity to a known structure. The modeled fraction of the proteins ranges from 34% to 99%. C. Spahn, R. Beckmann, N. Eswar, P. Penczek, A. Sali, G. Blobel, J. Frank. Cell 107, 361-372, 2001.
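The slide does not describe how the fitting was done. As a generic illustration only, and not the procedure of Spahn et al., one common way to score the placement of a model in a density map is a normalized cross-correlation between the map and a density derived from the model. The toy sketch below searches integer translations on a small periodic grid; all function names, the grid size, and the data are hypothetical.

```python
from itertools import product
from math import sqrt

def correlation(a, b):
    """Mean-subtracted, normalized cross-correlation of two flat density lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0

def make_grid(size, filled):
    """Flat list for a size^3 grid; voxels listed in `filled` get density 1.0."""
    return [1.0 if voxel in filled else 0.0
            for voxel in product(range(size), repeat=3)]

def shift_grid(grid, size, shift):
    """Translate a grid by integer voxel steps, wrapping around periodically."""
    dx, dy, dz = shift
    def index(i, j, k):
        return (i * size + j) * size + k
    return [grid[index((i - dx) % size, (j - dy) % size, (k - dz) % size)]
            for i, j, k in product(range(size), repeat=3)]

def best_shift(target, probe, size, max_shift=2):
    """Try every integer shift up to max_shift per axis; return (score, shift)."""
    steps = range(-max_shift, max_shift + 1)
    return max((correlation(target, shift_grid(probe, size, s)), s)
               for s in product(steps, repeat=3))

# Toy data: an L-shaped blob as the "map", and a displaced copy as the "model".
size = 6
blob = {(1, 2, 3), (2, 2, 3), (1, 3, 3)}
target = make_grid(size, blob)
probe = shift_grid(target, size, (-1, 2, 0))

score, shift = best_shift(target, probe, size)
print(score, shift)
```

A real fitting problem would also search rotations and would blur the model to the resolution of the map before scoring; both are left out here to keep the sketch short.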
Acknowledgments Tropical Disease Initiative MODEL ASSESSMENT COMPARATIVE MODELING Stephen Maurer (UC Berkeley) Andrej Sali Francisco Melo (CU) Arti Rai (Duke U) Alejandro Panjkovich (CU) M. S. Madhusudhan Andrej Sali (UCSF) Thomas Kepler (Duke U) Narayanan Eswar Ginger Taylor (TSL) Min-Yi Shen STRUCTURAL GENOMICS Damien Devos Stephen Burley (SGX) EVA Nebojsa Mirkovic Burkhard Rost (Columbia U) John Kuriyan (UCB) Alfonso Valencia (CNIO) Ursula Pieper NY-SGXRC CAMP MODELING ASSEMBLIES Xavier Aviles (UAB) FUNCTIONAL ANNOTATION Hans-Peter Nester (SANOFI) Frank Alber Fatima Al-Shahrour Ernst Meinjohanns (ARPIDA) Fred P. Davis Joaquin Dopazo Boris Turk (IJS) Maya Topf Markus Gruetter (UE) Matthias Wilmanns (EMBL) Wolfram Bode (MPG) FUNCTIONAL ANNOTATION Andrea Rossi BIOLOGY Fred P. Davis Jeff Friedman (RU) James Hudsped (RU) Partho Ghosh (UCSD) Prince Felipe Research Center Marie Curie Reintegration Grant Alvaro Monteiro (Cornell U) STREP grant Steven Krillis (St.George H) http://bioinfo.cipf.es/sgu/