

Mol2Net, 2015, 1 (Section A, B, C, etc.), pages 1-x, type of paper, doi: xxx-xxxx
http://sciforum.net/conference/mol2net-1
SciForum Mol2Net

Information Signatures of Viral Proteins: A Study of Influenza A Hemagglutinin and Neuraminidase


Daniel J. Graham 1,*, Samuel Barlow 2, Diego F. Cucalón 3, and Jordan C. Hauck 4

1 Department of Chemistry and Biochemistry, Loyola University Chicago, 6525 North Sheridan Road, Chicago, Illinois 60626, USA; dgraha1@luc.edu
2 Loyola University Chicago; sbarlow1@luc.edu
3 Loyola University Chicago; dcucalon@luc.edu
4 Loyola University Chicago; jhauck@luc.edu
* Author to whom correspondence should be addressed; E-Mail: dgraha1@luc.edu; Tel: 1-773-508-3169; Fax: 1-773-508-3086.

Received: / Accepted: / Published:

Abstract: Hemagglutinin (HA) and neuraminidase (NA) are glycoproteins encoded by several types of viral particles. Most notably, they exercise complementary chemical functions during infection and propagation of influenza A: infection of a host is initiated by HA while NA catalyzes the release of newly-made viral particles. The antibodies to the molecules form the means of classifying the influenza A subtypes: H1N1, H2N2, H3N2, etc. Given the risks of viral exposure to global host populations, intense effort is directed toward understanding the molecular mechanisms. Further, the design and formulation of drugs which subvert the mechanisms are on-going challenges. This research focuses on the primary structure information expressed by the two proteins, applying an information theoretic model from previous research. The amino acid sequences for HA and NA such as MKARLLILLCALSATD….. and MNPNQKIITIGSICMAI…… are parsed for their correlated information, both the total accumulation and fluctuations. Data for the HA and NA of multiple influenza A subtypes are illustrated via information signatures and phase plots. This enables sharp contrasts to be drawn between seasonal infectious proteins and ones with high pandemic potential. Overall, the analysis illuminates new ways of evaluating HA and NA molecules for their subtype and virulence based on information properties. Just as important, the results point to mutation strategies for re-directing and attenuating the protein functions.

Keywords: proteins; information; influenza; viruses; hemagglutinin; neuraminidase

1. Introduction

Hemagglutinin (HA) and neuraminidase (NA) are glycoproteins in the surface membrane of influenza particles [1]. Infection of a host is initiated by HA while NA catalyzes the release of newly-made viral particles [2]. The antibodies to the molecules form the means of classifying the influenza A subtypes: H1N1, H2N2, H3N2, etc. [3]. At present, there are at least 16 known subtypes for HA and 9 for NA. Given the risks of viral exposure to global populations, intense effort is directed toward understanding the molecular mechanisms. Further, the design and formulation of drugs which subvert the mechanisms are on-going challenges [4].

Influenza HA and NA have presented thousands of variants. For example, two HA sequences are:

MKARLLILLCALSATDADTICIGYHANNSTDTVDTVLEKNVTVTH
SVNLLEDSHNGKLCRLKGIAPLQLGKCNIAGWILGNPECESLLSNR
SWSYIAETPNSENGTCYPGDFADYEELREQLSSVSSFERFEIFPKER
SWPKHNITRGVTAACSHAKKSSFYKNLLWLTEANGSYPNLSKSY
VNNKEKEVLVLWGVHHPSNIEDQRTLYRKENAYVSVVSSNYNRR
FTPEIAERPKVRGQAGRMNYYWTLLEPGDKIIFEANGNLIAPWYA
FALSRGLGSGIITSNASMDECDTKCQTPQGAINSSLPFQNIHPVTIG
ECPKYVRSTKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMVD
GWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVIEKMNTQF
TAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERT
LDFHDSNVKNLYEKVKNQLRNNAKEIGNGCFEFYHKCDNECMES
VKNGTYDYPKYSEESKLNREKIDGVKLESMGVYQILAIYSTVASS
LVLLVSLGAISFWMCSNGSLQCRICI
Seq. (1)

MEARLLVLLCAFAATNADTICIGYHANNSTDTVDTVLEKNVTVT
HSVNLLEDSHNGKLCKLKGIAPLQLGKCNIAGWLLGNPECDLLLT
ASSWSYIVETSNSENGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKT
SSWPNHETTKGVTAACSYAGASSFYRNLLWLTKKGSSYPKLSKS
YVNNKGKEVLVLWGVHHPPTGTDQQSLYQNADAYVSVGSSKYN
RRFTPEIAARPKVRDQAGRMNYYWTLLEPGDTITFEATGNLIAPW
YAFALNRGSGSGIITSDAPVHDCNTKCQTPHGAINSSLPFQNIHPVT
IGECPKYVRSTKLRMATGLRNIPSIQSRGLFGAIAGFIEGGWTGMI
DGWYGYHHQNEQGSGYAADQKSTQNAIDGITNKVNSVIEKMNT
QFTAVGKEFNNLERRIENLNKKVDDGFLDIWTYNAELLVLLENER
TLDFHDSNVRNLYEKVKSQLKNNAKEIGNGCFEFYHKCDDACME
SVRNGTYDYPKYSEESKLNREEIDGVKLESMGVYQILAIYSTVASS
LVLLVSLGAISFWMCSNGSLQCRICI
Seq. (2)

Two NA sequences are:

MNPNQKIITIGSICMAIGTISLILQIGNIISIWVSHSIQTGSQNHTGICN
QRIITYENNTWVNQTYVNISNTNVVAGKDTTSMILAGNSSLCPIRG
WAIYSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDK
HSNGTVKDRSPYRALMSCPIGEAPSPYNSRFESVAWSASACHDGM
GWLTIGISGPDDGAVAVLKYNGIITEIIKSWRKQILRTQESECVCVN
GSCFTIMTDGPSDGPASYRIFKIEKGKITKSIELDAPNSHYEECSCYP
DTGKVMCVCRDNWHGSNRPWVSFNQNLDYQIGYICSGVFGDNP
RPKDGKGSCDPVNVDGADGVKGFSYRYGNGVWIGRTKSNSSRK
GFEMIWDPNGWTDTDGNFLVKQDVVAMTDWSGYSGSFVQHPEL
TGLDCMRPCFWVELIRGRPREKTTIWTSGSSISFCGVNSDTVNWS
WPDGAELPFTIDK
Seq. (3)

MNPNQKIITIGSICMVVGIISLILQIGNIISIWVSHSIQTGNQNHPETC
NQSIITYENNTWVNQTYVNISNTNVVAGQDATSVILTGNSSLCPIS
GWAIYSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLND
KHSNGTVKDRSPYRTLMSCPVGEAPSPYNSRFESVAWSASACHD
GMGWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECA
CVNGSCFTIMTDGPSNGQASYKILKIEKGKVTKSIELNAPNYHYEE
CSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYQIGYICSGVF
GDNPRPNDGTGSCGPVSSNGANGIKGFSFRYDNGVWIGRTKSTSS
RSGFEMIWDPNGWTETDSSFSVRQDIVAITDWSGYSGSFVQHPEL
TGLDCMRPCFWVELIRGQPKENTIWTSGSSISFCGVNSDTVGWSW
PDGAELPFSIDK
Seq. (4)

The sequences offer detailed information. Yet a computer-unassisted reading of them is bewildering. This is apparent because, among other things, one cannot distinguish the extraordinary from the ordinary. The above include sequences associated with the “Spanish flu” pandemic of 1918 [5]. But which ones are these? The correct answers are Seqs. (2) and (4). The reader’s uncertainty is understandable given the lengths and complexities of the sequences.

Our approach to proteins has looked for guidance from information theory [6-10]. Here we focus on the HA and NA primary structure information. The results draw contrasts between seasonal molecules and ones with high virulence potential. The data further point to mutation strategies for re-directing and attenuating the functions.

2. Proteins and Sequence Information

The approach builds on research from the mid-2000s. Work in this lab quantified the correlated information CI expressed by the naturally occurring amino acids based on their atom and covalent bond structure [6, 8]. An average $\langle CI \rangle$ and standard deviation $\sigma_{CI}$ were established, and a dimensionless quantity $Z_{CI}^{(i)}$ was based on each amino acid’s CI contribution relative to the average CI, e.g.

$Z_{CI}^{(W)} = 2.63$, $Z_{CI}^{(F)} = 0.691$, $Z_{CI}^{(M)} = 0.128$, $Z_{CI}^{(A)} = 0.476$

There are twenty amino acids, and thus sixteen more $Z_{CI}^{(i)}$ to note as in reference [6]. The superscript symbols refer to the amino acid while the numerical
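The standardization described in Section 2 can be sketched in a few lines. This is only an illustration under stated assumptions: the text says that the dimensionless quantity is based on each residue's CI relative to the average, with a standard deviation also established, so the sketch assumes the usual standard score, (CI - average) / standard deviation, taken over the residue set. The function name `z_scores`, the use of the population standard deviation, and the CI numbers in `toy_ci` are hypothetical placeholders, not values from reference [6].

```python
from statistics import mean, pstdev


def z_scores(ci_by_residue):
    """Return a standard score for each residue's CI value.

    Assumption: Z = (CI - <CI>) / sigma_CI, with the average and the
    (population) standard deviation taken over the supplied residues.
    """
    values = list(ci_by_residue.values())
    average = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all CI values are identical; Z is undefined")
    return {aa: (ci - average) / sigma for aa, ci in ci_by_residue.items()}


# Hypothetical CI values for four residues, for illustration only.
toy_ci = {"W": 9.0, "F": 7.0, "M": 5.0, "A": 3.0}
z = z_scores(toy_ci)

for residue, score in z.items():
    print(residue, round(score, 3))
```

By construction, scores computed this way average to zero and have unit spread over the residue set, so residues with above-average CI receive positive Z and those below average receive negative Z.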

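The Introduction's point that unaided reading cannot separate one variant from another is easy to illustrate with a position-by-position comparison. The sketch below uses only the first 45 residues of Seqs. (1) and (2) as printed above. The helper name `mismatches` is hypothetical, and the comparison assumes the two fragments are already aligned with no insertions or deletions. It is not the information-theoretic analysis of this paper.

```python
def mismatches(seq_a, seq_b):
    """List (position, residue_a, residue_b) where two aligned sequences differ.

    Positions are 1-indexed. Assumes equal length and no insertions/deletions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# First 45 residues of Seq. (1) and Seq. (2) as given in the text.
seq1_start = "MKARLLILLCALSATDADTICIGYHANNSTDTVDTVLEKNVTVTH"
seq2_start = "MEARLLVLLCAFAATNADTICIGYHANNSTDTVDTVLEKNVTVTH"

diffs = mismatches(seq1_start, seq2_start)
for position, res1, res2 in diffs:
    print(f"position {position}: {res1} -> {res2}")
```

For these fragments the comparison reports five substitutions, at positions 2, 7, 12, 13, and 16. A count of substitutions alone says nothing about which sequence is the 1918 variant, which is the gap the information measures are meant to address.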