Premise: Most proteins that share a common evolutionary past --


  1. Premise: Most proteins that share a common evolutionary past -- homologs -- do not exhibit statistically significant amino acid sequence identity. Therefore, the ability to generate hypotheses of function by the exclusive use of methods dependent on statistical significance is limited and restricts assignment of function by combined computational and experimental strategies.

  2. Premise: Although high sequence identity and a low probability of random matching are good evidence of homology, it is inappropriate to assert that two proteins are not homologs simply because sequence identity is below 25% and the expectation value exceeds an arbitrary cutoff. Criteria developed for annotation are not necessarily optimal for recognition of homology or for generation of hypotheses of function. An opportunity exists to implement homology search algorithms that are based on a "forensic" (clues-based) analysis of genomic data.
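
A brief sketch of why a fixed expectation-value cutoff depends on the database, using the standard Karlin-Altschul statistics that underlie BLAST. This is not taken from the slides; the symbols $m$ (effective query length), $n$ (effective database length), $S$ (raw alignment score), and $K$, $\lambda$ (scoring-system constants) are introduced here for illustration. The expected number of chance alignments scoring at least $S$ is approximately

$$E \approx K\,m\,n\,e^{-\lambda S}$$

so $E$ grows linearly with $n$. For the same query and the same alignment score, doubling the database gives

$$E(2n) \approx K\,m\,(2n)\,e^{-\lambda S} = 2\,E(n).$$

Under these assumptions, an alignment can move across a fixed cutoff purely because the database grew, without any change in the alignment itself. In an iterated profile search such as Psi-BLAST the profile may also change between database versions, so this is only part of the picture.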

  3. Extreme Protein Homology Search (work in progress)

  4. Ig, Cupredoxins

     Psi-BLAST linked pseudoazurin to a chicken antibody light chain.

         PAZ   19  ....EPAYIKANPGDTVTFI---------------------PVDKGHNVESIKDMI   49
         GAL    5  AALTQPSSVSANPGETVKITCSGGSNSYG------VFQQKSPGSAPVTVIYWDDER   55
         MCG    1  SALTQPPSASGSLGQSVTISCTGTSSDVGGYNYVSWFQQHA-GKAPKVIIYEVNKR   56

         PAZ   50  PEGA-EKFKSKINENYVLTVTQPGA-----YLVKCTPHYAMGMIALIAVGDSPANL   99
         GAL   52  PSGIPSRFSGSKSGS-THTLTITGVQAEDEAVYYCGSIDSSSGYAGFGAGTTLTVL  110
         MCG   53  PSGVPDRFSGSKSGN-TASLTVSGLQAEDEADYYCSSYEGSDNFV-FGTGTKVTVL  109

         PAZ  1PAZ  Pseudoazurin
         GAL        Chicken [Gallus gallus] immunoglobulin lambda light chain
         MCG  2MCG  Human immunoglobulin lambda light chain

  6. Between Twilight and Midnight: The Ephemeral Zone of Apparent Homology

     The match between pseudoazurin and the chicken light chain was obtained in the 3rd round of Psi-BLAST: 20% identity, E = 1.3. These values are not conventionally considered to be consistent with homology. Current searches do not find this match, presumably due to changes in the composition of the database, which has doubled in size. However, matches are found to another member of the immunoglobulin superfamily:

         gi|266418 Macrophage colony-stimulating factor 1 receptor precursor (CSF-1-R) [Rattus rattus] Length=978
         Expect = 0.29, Identities = 14/76 (18%), Positives = 19/76 (25%)

         Query  19  EPAYIKANPGDTVTFIPVDKGHNVESIKDMIPEGAEKFKSKINENYVLT-VTQPGAYLVK  77
         Sbjct  26  SGPELVVEPGETVTLRCVSNG-SVEWDG-PISPYWTLDPESPGSTLTTRNATFKNTGTYR  83

         Query  78  CTPHYAMGMIALIAVG  93
         Sbjct  84  CTE-LEDPMAGSTTIH  98
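
The "Identities" figure reported for the CSF-1-R hit can be checked directly from the aligned rows. The following is a minimal sketch, not part of the original slides: the helper name `alignment_identity` is hypothetical, and it assumes that identity is counted as identical, non-gap residue pairs divided by the total number of alignment columns (gaps included), which is how the 14/76 figure above appears to be computed.

```python
def alignment_identity(query_rows, sbjct_rows):
    """Return (identical_columns, alignment_length) for paired alignment rows.

    Rows are concatenated in order; '-' marks a gap and never counts
    as an identity. (Hypothetical helper, for illustration only.)
    """
    query = "".join(query_rows)
    sbjct = "".join(sbjct_rows)
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have equal total length")
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    return identical, len(query)


# Aligned rows copied from the pseudoazurin vs. CSF-1-R hit on this slide.
query_rows = [
    "EPAYIKANPGDTVTFIPVDKGHNVESIKDMIPEGAEKFKSKINENYVLT-VTQPGAYLVK",
    "CTPHYAMGMIALIAVG",
]
sbjct_rows = [
    "SGPELVVEPGETVTLRCVSNG-SVEWDG-PISPYWTLDPESPGSTLTTRNATFKNTGTYR",
    "CTE-LEDPMAGSTTIH",
]

identical, length = alignment_identity(query_rows, sbjct_rows)
percent = 100.0 * identical / length
print(f"{identical}/{length} ({percent:.0f}%)")  # prints "14/76 (18%)"
```

Counting over all columns, rather than over ungapped columns only, is a design choice; other conventions give slightly different percentages for the same alignment.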

  7. 2D9Q Granulocyte Colony Stimulating Factor and its Receptor

  8. Laccase exhibits potential homology to heavy chain variable domains.

         LAC   355  IISGAQNA-QDLLPSGSVYVLPSNADIEISFPATAAAPGAPHPFHLHGHAFAVVRSAGSTV-------YNYDNPIFRDV  425
         PSP    15  DLMADEPP-SDLSKVTIAANVKNATYRNFVEIIFENREKTIQTYHLDGYSFFAVAIEPGKWSPEKRKNYNLVDAVSRHS   82
         VH1    63  ...........................................................................RKVT   65
         VH2    42  ...............................................GNKLEYMGYISFSGNTFYHPSLKSRISITRDT   73
         VL      2  ...................................................................SYELTQPP---S   10

         LAC   426  VSTGTPAAGDNVTIRF----RTDNPGPWFLHCHIDFHLE------AGFAVVFAEDIPDVA--SANPVPQAWSDLCPTYDARD  495
         PSP    83  IQVYP---NSWAAVMT----TLDNAGMWNLRSDMWEKFY------LGQQLYFSVLSPSGSLRDEYNLPDNHP-LCGIVKGMP  160
         VH     66  LTVDK---SSSTAYMQLSRLTSEDSA--VYYCARTNWERNYAMDYWGQGTSVTVSSAKTTAPSVYPL---AP-VCGGTTG..  137
         VH2    74  SKNQH---YLQLSSVT----TEDTA---TYYCANWDGTY------WGEGTLVTVSAAKTTAPSVYPL---AP-VCGDTTG..  133
         VL     11  VSVAP---GQTARIIC----GADNIGDKSVH---WYQQK------PGQAPVLVVYDDRDR-PSGIP................   59

         2PLT    3  ..TVKLG--ADSGALEFVPKTLTIKSGETV--NFVNNAGFPHNI---------------VFDEDAIPSGVNADAISRDD   61
         1KCW  970  ....................................NEIDLHTVHFHGHSF---------QYKHRGVY-------SSDV  996
         VH1    63  ...........................................................................RKVT   65
         VH2    42  ...............................................GNKLEYMGYISFSGNTFYHPSLKSRISITRDT   73
         VL      2  ...................................................................SYELTQPP---S   10

         2PLT   62  YLNAP---GETYSVKL----TA--GAEYGYYCE--PHQG----AGMVGKIIV..........................   97
         1KCW  997  FDIFP---GTYQTLEM----FPRTPGIWLLHCHVTDHIH----AGMETTYTVLQNE......................  1041
         VH     66  LTVDK---SSSTAYMQLSRLTSEDSA--VYYCARTNWERNYAMDYWGQGTSVTVSSAKTTAPSVYPLAPVCGGTTG..  137
         VH2    74  SKNQH---YLQLSSVT-----TEDTA--TYYCANWDGTY------WGEGTLVTVSAAKTTAPSVYPLAPVCGDTTG..  133
         VL     11  VSVAP---GQTARIIC----GADNIGDKSVH---WYQQK------PGQAPVLVVYDDRDR-PSGIP............   59

         LAC   20270770  Laccase 2 [Trametes pubescens (mushroom)] 520aa
         PSP   4105800   Pollen-specific protein [Petunia] 167aa 16%, 0.047
         VH1   41352204  Heavy chain variable domain [Mouse] Q: PSP 20%, 1.2
         VH2   31615669  Heavy chain variable domain [Mouse] 1NDG Q: PSP 16%, 0.45
         VL    62860980  Lambda light chain variable domain [Human] Q: PSP (questionable)
         2PLT  443425    2PLT Plastocyanin [Chlamydomonas reinhardtii (green algae)] 98aa Q: PSP
         1KCW  1942284   Ceruloplasmin [Human] 1046aa 18%, 7e-04 Q: PSP

  9. Evidence for an evolutionary link between cupredoxins and immunoglobulins

     1. Suggestive sequence similarity.
     2. Structural similarity: superposition of 1PAZ (25-33) with 2MCG (12-20) results in an r.m.s. deviation of 1.19 Å.
     3. Structural similarity: superposition juxtaposes Cys78 (1PAZ) and Cys90 (2MCG).
     4. Structural similarity: both proteins have a tyrosine corner near the Cys residues.
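
For reference, the r.m.s. deviation quoted above is conventionally defined as follows. This is a sketch of the standard definition, not a statement of the authors' exact procedure; the symbols $x_i$, $y_i$, $R$, $t$, and $N$ are introduced here, and the slide does not say which atoms were superposed. With $x_i$ and $y_i$ the coordinates of equivalent atoms in the two fragments, $R$ a rotation, and $t$ a translation,

$$\mathrm{RMSD} = \min_{R,\,t}\,\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert R\,x_i + t - y_i \right\rVert^{2}}$$

If one atom per residue were used (for example C$\alpha$, which is an assumption), the nine-residue segments 25-33 and 12-20 would give $N = 9$.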

  10. Hypothesis: Immunoglobulin and Cupredoxin Superfamilies are Evolutionarily Linked

      [Figure labels: CuXn, Ig]

  11. When the structure of Superoxide Dismutase was determined, its striking similarity to the structure of immunoglobulins was considered to be due to convergent evolution.

      [Figure panels: SOD, Cupredoxins, SOD]

          SOD1     3  ..AICVMS---GDVSGQVYFKQEGPQQPVSISGFLLNLPRGLHGFHVHEFG---------DTSNGCTSAG-EHFNPTNQD-HGAPDAAER   76
          SOD2    25  ..AKATLKNAEGTEIGTATLTESSKG--VTIKLALKGLPPGEHAFHIHAVG---------KCEPPFTSAG-GHFNPENKK-HGKMAEGGA  103
          LAC    278  ANPPYPLITIDRDSW-DENQFSLSTGSKPVWIDFIVNNLDEGPHPFHLHGHT-------FFILSLFESTIGWGSYNPHQPH-LNPSPYPPY  360
          1KCW   741  .....MHAINGRMFGNLQGLTMHVGDEVNWYLMGMGNEIDL-HTVHFHGHSFQ-----------------------------------YK  797
          LAC9   413  ........................................GPHPFHLHGHAFSVVR-----------SAGSSTYNYENP-----------  430
          LAC11  417  ..............................ISG......SGPHPFHLHGHAFSVVR-----------SAGNSSYNYVNPV-KRDVVSMG-  457
          SOD3   107  ....................VLTETPSGVEIVAQVQGLTAGLHGFHIHANGQCDPGPDAATGKTVPFGAAGGHFDPGMSHQHGQPGAPGA  173
          SOD4    24  ..TAPIYTTGPKPV-AIGKVTFTQTPYGVLITPDLTNLPEGPHGFHLHKN----------ADCGNHGMDAEGHYDPQNTNSHQGP-YGNG   99

          SOD1    77  HVG--DLGNVRSV---GCT-ALT-PIEMTDNVISLFG-PLSI----LGRSLVVHTDRDDLGLTDNPLSKITGNSGGRLACGIIAVCK  151
          SOD2   100  HAG--DMPNLDVP---ASG-ALS-IDVVNDAVTLAKGKPNSVFK-DGGTALVIHAKADDY------KSDPAGNAGDRIACGVIEEAK  172
          LAC    357  DFS--KALERDTVQIPRRG-HAV-LRLRADNPGVWLFHCHILWHLASGMAMLLEVM...............................  408
          1KCW   794  HRG---VYSSDVFDIFPGT-YQ--LEMFPRTPGIWLLHCHVTDHIHAGMETTYTVL...............................  840
          LAC9   431  -------VRRDVVDVGGAS-DNVTIRFTTDNPGPWFFHCHIEFHLVLGLAMVFM.................................  485
          LAC11  458  --GDSDLVTIRFV---TDNPGPW-FFH-----------CHIEPHLVGGLAIVFAEAMEDTAAAH.......................  504
          SOD3   174  PIDKAHAGELPNISVGADG-RGT-VRYLNTN........................................................  202
          SOD4   100  HLG--DLPVLYVTSNGKAM-IPT-LA.............................................................  121

          SOD1   86355642   SOD (Cu/Zn) [Hyphantria cunea nucleopolyhedrovirus]
          SOD2   39933302   putative superoxide dismutase (Cu/Zn) [Rhodopseudomonas palustris] aa=172
          LAC    76008508   laccase-like protein [Coccidioides posadasii] aa=408 (domain 3)
          1KCW   180249     ceruloplasmin 1KCW [Human] {query 76008508} 19%; 7e-06 1046aa (12/12/06)
          LAC9   115371531  laccase 9 [Coprinopsis okayama7] {query 76008508} 525 aa
          LAC11  115371535  laccase 11 [Coprinopsis cinerea okayama] Rnd 3 15%; 4.5 {query 86355642} (01/04/07)
          SOD3   111618878  superoxide dismutase [Acidovorax avenae] Rnd 2 22%; 7.2 {query 76008508} 260aa (01/04/07)
          SOD4   54298239   superoxide dismutase [Legionella pneumophila] Rnd 4 19%; 6.1 {query 76008508} 162 aa (01/04/07)
