Bioinformatics Institute (BII), A*STAR, Singapore
Frank Eisenhaber
www.bii.a-star.edu.sg
franke@bii.a-star.edu.sg
Singapore, 13th December 2017
New insights into TM proteins: sequence – structure – function
• Wong et al., 2010, PLoS Computational Biology, 6(7), doi:10.1371/journal.pcbi.1000867
• Wong et al., 2011, Biology Direct, 6(57), doi:10.1186/1745-6150-6-57
• Wong et al., 2012, Nucleic Acids Research, 40, W370–W375, doi:10.1093/nar/gks379
• Wong et al., 2014, BMC Bioinformatics, 15, 166, doi:10.1186/1471-2105-15-166
• Baker et al., 2017, BMC Biology, 15, 66, doi:10.1186/s12915-017-0404-4
Transmembrane helices. A “negative-not-inside/negative-outside rule” complements the “positive-inside rule”.
James Baker 1,2, Wing-Cheong Wong 1, Birgit Eisenhaber 1, Jim Warwicker 2*, Frank Eisenhaber 1*
1 BII at A*STAR, Singapore
2 MIB at Manchester, UK
Introduction
[Figure: a transmembrane helix in the lipid bilayer, with the regions inside and outside the cytoplasm and the two membrane interfaces labelled]
Introduction
[Figure: regions of a transmembrane helix: inside flank, intra-membrane helix, outside flank]
• Intra-membrane helix: non-polar (hydrophobic)
• Flanks: polar in both flanks
• Positive charge enrichment
• Tyrosine enrichment
• Tryptophan enrichment at both interfaces
Ulmschneider, M.B. and Sansom, M.S.P. (2001) Amino acid distributions in integral membrane protein structures. Biochim. Biophys. Acta - Biomembr., 1512, 1–14.
“Problems” in previous study
• Negative residues are especially rare, even in the flanks
[Figure: bar chart of residue count per amino acid type for 1709 human single-pass TMHs ± 5 residues; residues in descending order of count: L V A I G F S T R C K Y M W Q N H E D]
New methods for this study
• Segregate single-pass and multi-pass + other segregation
• Cross-reference experimental and predictive datasets
• Align from the center (removes bias)
• New normalisation: independent, percentage-based
• OLD: If we have a residue, where and what is it likely to be?
  p_{i,r} = \frac{a_{i,r}}{\max_r(a_r)}
• NEW: If we have a residue X, where is it likely to be?
  q_{i,r} = 100 \times \frac{a_{i,r}}{a_i}
where a = abundance, i = amino acid type, r = a certain sequence position
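The percentage-based normalisation q_{i,r} can be illustrated with a small sketch. It assumes that a_i is the total count of residue type i over all aligned positions, and that helices are aligned on their central residue; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter

def position_counts(helices, half_width=5):
    """Count a[i][r]: occurrences of residue type i at offset r from the helix centre."""
    counts = {}
    for seq in helices:
        centre = len(seq) // 2  # centre alignment, as on the slide
        for idx, aa in enumerate(seq):
            r = idx - centre
            if abs(r) <= half_width:
                counts.setdefault(aa, Counter())[r] += 1
    return counts

def relative_percentage(counts):
    """q[i][r] = 100 * a[i][r] / a[i], assuming a[i] is the sum of a[i][r] over r."""
    q = {}
    for aa, per_pos in counts.items():
        total = sum(per_pos.values())
        q[aa] = {r: 100.0 * n / total for r, n in per_pos.items()}
    return q

# Hypothetical toy helices (inside flank on the left), for illustration only.
helices = ["KKLLLLLDD", "RKLLALLLE", "KLLLGLLLD"]
q = relative_percentage(position_counts(helices, half_width=4))
print(q["K"])  # where lysine sits, as a percentage of all lysines
```

Because each residue type is normalised by its own total, a rare residue such as aspartate is not swamped by leucine, which is the point of asking "if we have a residue X, where is it likely to be?".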
Results
If we have a residue X, where is it likely to be?
q_{i,r} = 100 \times \frac{a_{i,r}}{a_i}
[Figure: relative percentage (y-axis) against distance from centre of helix (x-axis, -15 to 15); annotations: "Positive, inside" and "Negative, outside"]
Results
In which membrane proteins do negative charges follow the negative-not-inside/negative-outside rule?
• Single-pass: visible graphically.
• Multi-pass: not visible graphically, but statistically present in most cases.
[Figure: two panels, single-pass (1194 helices) and multi-pass (12331 helices from 2093 proteins); percentage distribution (y-axis) against distance from centre of helix (x-axis, -30 to 30); series: positive, negative, leucine]
Results
[Figure: two panels, single-pass and multi-pass, each divided into inner flank, inner leaflet, outer leaflet, outer flank]
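Our Findings
[Figure: helix regions from inside to outside: inside flank, inner leaflet, outer leaflet, outside flank; the two leaflets form the intra-membrane helix]
• Inside flank: suppression of negative charge
• Inner leaflet: higher leucine propensity
• Outer leaflet: lower leucine propensity
• Outside flank: preference for negative charge
• Increasing cysteine propensity*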
Conclusions
• A “negative-not-inside/negative-outside rule” complements the “positive-inside rule”.
• Leucine intra-helix propensity reflects leaflet asymmetry.
• Multi-pass helices are very different (on average) from single-pass helices.
Background considerations
Similarity measure as a proxy for homology, and its limitations
[Figure: similarity score axis (low, moderate, high, very high) with an E-value cutoff; labels: by chance; convergent evolution or common ancestry]
• Homology is a hypothesis about common evolutionary origin
• Similarity is a measurable fact
• Long stretches of similarity versus local resemblances (physiologically constrained to form rudimentary structure)
Background considerations
Issues with non-globular sequences
[Figure: hits for APMAP. Common ancestry: long stretches of similarity over a long globular segment; alignment of homologous structures (strictosidine synthase, diisopropylfluorophosphatase, serum paraoxonase, Drp35, regucalcin). Convergent evolution: local resemblance of a short non-globular segment; unrelated hits with a similar TM segment]
• The sequence homology concept is not directly applicable to non-globular sequences.
• Signal peptides and transmembrane helices (SP/TM) belong to this class.
• A match between such segments mimics the appearance of a hydrophobic core match.
Sequence complexity of SP/TM
Results of SEG (12/2.2/2.5): % of low-complexity segments
[Figure: bar chart of % low-complexity for α-helices, signal peptides, single-spanning TMs and multi-spanning TMs]
SP/TM segments are low-complexity far more often than α-helices (12~33% versus 3%)
Open-ended questions:
• Should all TMs be excluded? What about multi-spanning membrane proteins such as GPCRs?
• Should all single-spanning TMs be excluded?
• What about those with ‘a few’ TMs?
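To make the idea of a low-complexity segment concrete, the sketch below flags 12-residue windows whose Shannon entropy is at or below 2.2 bits. It assumes that the SEG setting 12/2.2/2.5 reads as window length, trigger complexity and extension complexity. This is only a simplified stand-in for SEG's trigger step, not the SEG algorithm itself (there is no extension stage and no segment refinement), and both example sequences are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(window):
    """Shannon entropy (bits per residue) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())

def low_complexity_windows(seq, window=12, trigger=2.2):
    """Start indices of windows whose entropy is at or below the trigger value."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if shannon_entropy(seq[start:start + window]) <= trigger
    ]

simple_tm = "LLLLLLALLLLLVLLLLLLL"    # hypothetical leucine-rich, TM-like segment
complex_seg = "MKTAYIAKQRQISFVKSHFS"  # hypothetical mixed-composition segment

print(low_complexity_windows(simple_tm))    # every window is flagged
print(low_complexity_windows(complex_seg))  # no window is flagged
```

A leucine-dominated stretch falls well below the trigger, while a segment with seven or more residue types per window cannot, which is why compositionally biased SP/TM segments are flagged so much more often than typical α-helices.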
Relationship among the TM helices, functional α-helices and low-complexity segments
[Figure: overlap of membrane anchors, functional TMs, α-helices and low-complexity segments]
• The overlap of functional α-helices and TM helices extends the sequence homology concept to membrane proteins.
• SEG samples only the low-hydrophobicity space and is hence insufficient to distinguish ‘simple’ from ‘complex’ TMs.
TM properties in multi-spanning membrane proteins
For 2202 TCDB sequences:
[Figure: workflow: original sequence, find simple TMs, simple TMs being masked, masked sequence]
[Figure: histogram of the mask ratio for 2202 sequences; x-axis: mask ratio (no. of masked TMs / total TMs); y-axis: count]
• On average, each sequence has 8 TM helices
• Multi-spanning membrane proteins can harbor simple TM helices
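The masking step and the mask ratio (number of masked TMs divided by total TMs) can be sketched as follows. The classification of a TM as simple is taken as given here; the segment coordinates, the flags and the function name are hypothetical, and the use of "X" as the masking character is an assumption.

```python
def mask_simple_tms(sequence, tm_segments):
    """Mask simple TM segments with 'X' and return (masked sequence, mask ratio).

    tm_segments is a list of (start, end, is_simple) tuples with 0-based,
    end-exclusive coordinates; is_simple is assumed to come from an upstream
    classifier that is not shown here.
    """
    masked = list(sequence)
    n_masked = 0
    for start, end, is_simple in tm_segments:
        if is_simple:
            masked[start:end] = "X" * (end - start)
            n_masked += 1
    ratio = n_masked / len(tm_segments) if tm_segments else 0.0
    return "".join(masked), ratio

# Hypothetical two-TM protein: the first TM is flagged as simple, the second as complex.
sequence = "MKT" + "L" * 10 + "GSDE" + "AVIFWCGTLM" + "RK"
segments = [(3, 13, True), (17, 27, False)]

masked_seq, ratio = mask_simple_tms(sequence, segments)
print(masked_seq)
print(ratio)
```

A ratio of 0 means that every TM helix is kept for the similarity search, and a ratio of 1 means that all of them are masked.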
Conclusions
• TMs are either simple (likely the result of convergent evolution) or complex (likely the result of common ancestry).
• Signal peptides and simple TMs can attract unrelated hits.
• Simple TMs should be quantitatively excluded from similarity searches using the z-score criterion.
• Complex TMs embody ancestry information, which justifies applying the sequence homology concept to them.
• Simple TMs are found in membrane proteins regardless of membrane topology. The caveat is that they occur more frequently in proteins with few TM helices.
BII Yearbook 2017 • Thanks to Betty and all contributors • Timeline of BII’s history
Bioinformatics Institute: Status in 2017 Thank you !!