
A novel index of protein-protein interface propensity improves interface residue recognition

A novel index of protein-protein interface propensity improves interface residue recognition. Wentao Dai (Email: wtdai AT scbit.org), Shanghai Center for Bioinformation Technology (SCBIT).


  1. A novel index of protein-protein interface propensity improves interface residue recognition
     Wentao Dai
     Email: wtdai AT scbit.org
     Shanghai Center for Bioinformation Technology (SCBIT)

  2. Outline
     I. Background and Motivation
     II. Protein-Protein Interface Datasets – Astral2.05-40-4506
     III. Characteristics of Interface – QIPI: Quantitative protein-protein Interface Propensity Index
     IV. Evaluation – SPR: Single domain based Patch Recognition
     V. Summary

  3. I. Background and Motivation

  4. I. Background and Motivation
     • protein-protein interaction
     • protein-protein interface properties
     • protein-protein interface prediction (residue recognition)

  5. II. Protein-Protein Interface Datasets
     • Comprehensive interface dataset
       – Training
       – Astral2.05-40-4506
     • Testing interface dataset
       – Docking Benchmark 2.0
       – CAPRI25 and Enz35

  6. II. Protein-Protein Interface Datasets
     • SCOPe: Structural Classification of Proteins - extended database (v2.05)
     • Astral2.05-40: a subset of SCOPe 2.05 with less than 40% sequence identity between any two domains
     • Astral2.05-40-4506: 4506 interfaces obtained from the Astral2.05-40 dataset

  7. III. Characteristics of Interface
     • Relative Interface Ratio (RIR) and Contact Preferences
     • Residue Composition and QIPI
     • Secondary Structure
     • Contact preference
     • Interface Size

  8. III. Characteristics of Interface --- Relative Interface Ratio (RIR)
     $$\mathrm{RIR}_i = \frac{w_i}{W_i}, \qquad w_i = \frac{f_i}{\sum_m f_m}, \qquad W_i = \frac{F_i}{\sum_m F_m}$$
     $f_i$: number of interface residues of type $i$
     $F_i$: number of non-interface surface residues of type $i$
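The RIR definition can be tried out on toy data. The sketch below is a minimal illustration, assuming RIR for residue type i is the fraction of interface residues of that type divided by the fraction of non-interface surface residues of that type. The function name `relative_interface_ratio` and the toy residue lists are hypothetical and not taken from the slides.

```python
from collections import Counter


def relative_interface_ratio(interface_residues, surface_residues):
    """Return {residue_type: RIR}, with RIR_i = w_i / W_i.

    w_i: fraction of interface residues that are of type i.
    W_i: fraction of non-interface surface residues that are of type i.
    """
    f = Counter(interface_residues)  # f_i: interface counts per type
    F = Counter(surface_residues)    # F_i: non-interface surface counts per type
    f_total = sum(f.values())
    F_total = sum(F.values())
    rir = {}
    # Only types seen in both sets get a value (avoids division by zero).
    for res_type in f.keys() & F.keys():
        w = f[res_type] / f_total
        W = F[res_type] / F_total
        rir[res_type] = w / W
    return rir


# Toy counts, invented purely for illustration.
interface = ["R"] * 3 + ["K"] * 1
non_interface = ["R"] * 2 + ["K"] * 6
rir = relative_interface_ratio(interface, non_interface)
print(rir)
```

A value above 1 means the residue type is enriched at the interface relative to the rest of the surface; a value below 1 means it is depleted.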

  9. III. Characteristics of Interface --- Contact Preferences
     $$\mathrm{ContactFreq}_{ij} = \frac{C_{ij}}{\sum_{m,n} C_{mn}}, \qquad \mathrm{ContactPref}_{ij} = \log_2 \frac{C_{ij} / \sum_{m,n} C_{mn}}{w_i \times w_j}$$
     $C_{ij}$: number of interface-crossing contacts between residues of types $i$ and $j$
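A small worked example may help with the contact preference. The sketch below assumes the reading ContactPref_ij = log2(ContactFreq_ij / (w_i × w_j)), where ContactFreq_ij is C_ij divided by the total number of interface-crossing contacts and w_i is the interface residue fraction used for RIR. The function name, the treatment of residue pairs as plain dictionary keys and the toy numbers are assumptions made for illustration.

```python
import math


def contact_preference(contact_counts, w):
    """Return {(i, j): log2(ContactFreq_ij / (w_i * w_j))}.

    contact_counts: {(i, j): C_ij}, interface-crossing contact counts.
    w: {i: w_i}, fraction of interface residues of each type.
    """
    total = sum(contact_counts.values())
    prefs = {}
    for (i, j), c in contact_counts.items():
        if c == 0:
            continue  # log2(0) is undefined, so unobserved pairs are skipped
        freq = c / total
        prefs[(i, j)] = math.log2(freq / (w[i] * w[j]))
    return prefs


# Toy numbers, invented purely for illustration.
counts = {("R", "D"): 6, ("R", "K"): 2}
w = {"R": 0.5, "D": 0.25, "K": 0.25}
prefs = contact_preference(counts, w)
print(prefs)
```

A positive value means the pair is seen across interfaces more often than its residue frequencies alone would suggest.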

  10-13. III. Characteristics of Interface --- QIPI
     [Figure: residue frequency at interface and non-interface surface positions, with RIR, for the 20 amino acids in the order H R K A V I L M P F W Y G C S T N Q D E. Axes: Frequency (0 to 0.1) and Ratio (0 to 1.6). Legend: Interface, Non-Inter, RIR. Residue groups marked: Basic, Hydrophobic, Polar, Acidic, Aromatic.]

  14. III. Characteristics of Interface --- QIPI
     • Interface preference:
       – hydrophobic residues
       – aromatic residues
       – residues with long side chains
     • High interface propensity
       – Arg, Phe, Met, Trp and Tyr
     Quantitative residue interface propensity index (QIPI)
     H      R      K      A      V      I      L      M      P      F
     1.147  1.346  0.784  0.841  0.994  1.084  1.144  1.451  1.109  1.334
     W      Y      G      C      S      T      N      Q      D      E
     1.284  1.368  0.823  1.172  0.873  0.966  0.958  0.909  0.830  0.805
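The QIPI values can be used directly as a lookup table. The sketch below stores the 20 values from the slide in a dictionary and ranks the residue types. The helper `mean_qipi` is a hypothetical convenience for averaging over a set of residues and is not the SPR scoring function.

```python
# QIPI values as listed on the slide (one-letter amino acid codes).
QIPI = {
    "H": 1.147, "R": 1.346, "K": 0.784, "A": 0.841, "V": 0.994,
    "I": 1.084, "L": 1.144, "M": 1.451, "P": 1.109, "F": 1.334,
    "W": 1.284, "Y": 1.368, "G": 0.823, "C": 1.172, "S": 0.873,
    "T": 0.966, "N": 0.958, "Q": 0.909, "D": 0.830, "E": 0.805,
}


def mean_qipi(residues):
    """Average QIPI over a string or list of one-letter residue codes.

    Hypothetical helper for illustration only.
    """
    return sum(QIPI[r] for r in residues) / len(residues)


# Residue types ordered from highest to lowest interface propensity.
ranked = sorted(QIPI, key=QIPI.get, reverse=True)
print(ranked[:5])
print(mean_qipi("MY"))
```

The five top-ranked types are the ones the slide names as having high interface propensity (Arg, Phe, Met, Trp and Tyr).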

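  15. III. Characteristics of Interface --- Secondary structure
     [Figure: interface and non-interface values with RIR for the secondary-structure classes H, E and C. Axis range: 0 to 1.2. Legend: Interface, Non-Inter, RIR.]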

  16-17. III. Characteristics of Interface --- Secondary structure
     [Figure: interface and non-interface frequency with RIR for combinations of residue type and secondary-structure class (for example A-H, A-C, R-E). Axes: Frequency (0 to 0.06) and Ratio (0 to 2.5). Legend: Interface, Non-Inter, RIR.]

  18. III. Characteristics of Interface --- Secondary structure
     • Strand (E) residues: negative interface propensity
     • Coil (C) residues: positive interface propensity
     • Residue type: the principal factor of interface propensity

  19. III. Characteristics of Interface --- Contact Preferences
     $$\mathrm{ContactFreq}_{ij} = \frac{C_{ij}}{\sum_{m,n} C_{mn}}, \qquad \mathrm{ContactPref}_{ij} = \log_2 \frac{C_{ij} / \sum_{m,n} C_{mn}}{w_i \times w_j}$$
     $C_{ij}$: number of interface-crossing contacts between residues of types $i$ and $j$

  20. III. Characteristics of Interface --- Contact preference
     • High preferences
       – Cys-Cys contacts
       – hydrophobic contacts (A-W)
       – aromatic contacts (P-Y: Phe-Cys, Phe-Phe, Phe-Trp, Phe-Tyr, Trp-Tyr, Tyr-His, Tyr-Lys and Tyr-Met)
       – contacts between oppositely charged residues (Arg-Asp, Arg-Glu)
     • Arg, Phe, Trp and Tyr have the highest interface propensity (RIR > 1.2), and these residues take part in many of the contacts with high contact preference (above 1.5, shown in pink).

  21. III. Characteristics of Interface --- Interface Size
     • Fig. A: The average interface size is about 800 Å², and about 86% of interfaces have a size in the range 0-2000 Å².
     • Fig. B: The number of interface residues follows a gamma distribution, with an average of about 20.
     • Fig. C: The average domain size is about 9000 Å², much larger than the average interface size.

  22. IV. Evaluation --- Interface residue recognition
     • Identification of surface residues
     • Generation of the residue side-chain distance matrix
     • Construction of candidate interface patches
     • Merging of the candidate interface patches
     • Selection of the top-ranked candidate interface patch

  23. IV. Evaluation --- SPR: Single domain based Patch Recognition
     Table 1. Patch generation thresholds
     A. ASA of a patch residue and its distance from the seed residue
     Distance (Å)   ASA (> Å²)
     (2,5)          0
     (5,7)          20
     (7,9)          40
     (9,11)         60
     (11,13)        80
     (13,15)        100
     B. Thresholds for patch merging
     Domain ASA (Å²)   Identity ratio
     (0,5000)          0.8
     (5000,7500)       0.7
     (7500,10000)      0.6
     (10000,+∞)        0.5
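The two threshold tables can be read as simple range lookups. The sketch below encodes Table 1A as a minimum ASA per distance range and Table 1B as an identity ratio per domain ASA range. The function names, the half-open treatment of range boundaries, and returning `None` for distances outside 2-15 Å are assumptions; the slide does not say how boundary or out-of-range values are handled.

```python
import math

# Table 1A rows: (distance lower bound, distance upper bound, minimum ASA in Å²).
ASA_BY_DISTANCE = [
    (2, 5, 0), (5, 7, 20), (7, 9, 40),
    (9, 11, 60), (11, 13, 80), (13, 15, 100),
]

# Table 1B rows: (domain ASA lower bound, upper bound, identity ratio).
MERGE_BY_DOMAIN_ASA = [
    (0, 5000, 0.8), (5000, 7500, 0.7),
    (7500, 10000, 0.6), (10000, math.inf, 0.5),
]


def min_asa_for_distance(distance):
    """ASA threshold for a residue `distance` Å from the seed, or None if out of range."""
    for lo, hi, asa in ASA_BY_DISTANCE:
        if lo <= distance < hi:  # assumption: ranges are half-open [lo, hi)
            return asa
    return None


def passes_patch_threshold(distance, asa):
    """True if a residue's ASA exceeds the threshold for its distance range."""
    threshold = min_asa_for_distance(distance)
    return threshold is not None and asa > threshold


def merge_identity_ratio(domain_asa):
    """Identity ratio threshold used for patch merging, chosen by domain ASA."""
    for lo, hi, ratio in MERGE_BY_DOMAIN_ASA:
        if lo <= domain_asa < hi:  # assumption: ranges are half-open [lo, hi)
            return ratio
    raise ValueError("domain ASA must be non-negative")


print(passes_patch_threshold(8.0, 45.0), merge_identity_ratio(6000))
```

Read this way, residues farther from the seed need more exposed surface to join a patch, and larger domains merge candidate patches at a lower identity ratio.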

  24. IV. Evaluation --- SPR: Single domain based Patch Recognition
     $$E_{\mathrm{patch}} = E_{\mathrm{res}} + w_1 E_{\mathrm{hydro}} + w_2 E_{\mathrm{cons}} + w_3 E_{\mathrm{sol}}$$
     $$E_{\mathrm{res}} = \sum_{i \in \mathrm{patch},\, r} \frac{\mathrm{ASA}_i \cdot \mathrm{RIR}_r}{\mathrm{REF}_r}, \qquad E_{\mathrm{hydro}} = \sum_{i \in \mathrm{patch},\, r} H_i$$
     $$E_{\mathrm{cons}} = \sum_{i \in \mathrm{patch},\, r} \left( C_{ir} - B_{rr} \right), \qquad E_{\mathrm{sol}} = \sum_{i \in \mathrm{patch}} \frac{V_{i,\mathrm{out}}}{V_{i,\mathrm{sphere}} - V_{i,\mathrm{out}}}$$
     • $\mathrm{ASA}_i$ is the relative accessible surface area of residue $r$ at sequence position $i$.
     • $\mathrm{RIR}_r$ for the 20 amino acid residues is obtained from QIPI.
     • $\mathrm{REF}_r$ is the element of JANJ780101 in AAindex for residue type $r$.
     • $H_i$ is the hydrophobic score in the CASG920101 matrix of AAindex for the residue type $r$ at sequence position $i$.
     • $C_{ir}$ is the self-substitution score in the position-specific substitution matrix produced by PSI-BLAST for the residue type $r$ at sequence position $i$.
     • $B_{rr}$ is the diagonal element of BLOSUM62 for residue type $r$.
     • Solvation term (Cyscore): $V_{i,\mathrm{sphere}}$ is defined as the sphere volume in the solvent accessible surface; $V_{i,\mathrm{out}}$ represents the volume out of the solvent accessible surface on residue $i$ in the patch.
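The patch score combines four per-residue sums. The sketch below follows one reading of the slide's formulas: E_res sums ASA_i · RIR_r / REF_r, E_hydro sums H_i, E_cons sums C_ir − B_rr, and E_sol sums V_i,out / (V_i,sphere − V_i,out), combined as E_patch = E_res + w1·E_hydro + w2·E_cons + w3·E_sol. The `PatchResidue` container, the function name, the weights and every number in the example are hypothetical; the slides do not give the weight values.

```python
from dataclasses import dataclass


@dataclass
class PatchResidue:
    asa: float       # ASA_i: accessible surface area of the residue
    rir: float       # RIR_r: QIPI value for the residue type
    ref: float       # REF_r: reference value for the residue type (JANJ780101)
    hydro: float     # H_i: hydrophobic score (CASG920101)
    c_self: float    # C_ir: PSSM self-substitution score at this position
    b_self: float    # B_rr: BLOSUM62 diagonal element for the residue type
    v_out: float     # V_i,out: volume outside the solvent accessible surface
    v_sphere: float  # V_i,sphere: sphere volume


def patch_energy(residues, w1, w2, w3):
    """E_patch = E_res + w1*E_hydro + w2*E_cons + w3*E_sol (weights supplied by caller)."""
    e_res = sum(r.asa * r.rir / r.ref for r in residues)
    e_hydro = sum(r.hydro for r in residues)
    e_cons = sum(r.c_self - r.b_self for r in residues)
    e_sol = sum(r.v_out / (r.v_sphere - r.v_out) for r in residues)
    return e_res + w1 * e_hydro + w2 * e_cons + w3 * e_sol


# Single-residue toy patch; all values are invented for illustration.
toy = [PatchResidue(asa=50.0, rir=1.346, ref=100.0, hydro=0.5,
                    c_self=6.0, b_self=5.0, v_out=1.0, v_sphere=3.0)]
score = patch_energy(toy, w1=2.0, w2=0.5, w3=4.0)
print(score)
```

Passing the weights as arguments keeps the sketch neutral about how they were fitted.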

  25-26. IV. Evaluation --- SPR: Single domain based Patch Recognition
     $$F = \mathrm{COV} \times \mathrm{ACC}$$
     • TP: true positive (real interface residue, correctly predicted)
     • FP: false positive (non-interface residue predicted as interface)
     • FN: false negative (real interface residue not predicted as interface)
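The evaluation measure can be checked numerically. In the sketch below, F = COV × ACC comes from the slide, while the formulas COV = TP / (TP + FN) and ACC = TP / (TP + FP) are an assumption based on the usual definitions, since they did not survive in the slide text. The function names and the example counts are hypothetical.

```python
def coverage(tp, fn):
    """Assumed definition: fraction of real interface residues that were predicted."""
    return tp / (tp + fn)


def accuracy(tp, fp):
    """Assumed definition: fraction of predicted residues that are real interface residues."""
    return tp / (tp + fp)


def f_score(cov, acc):
    """F as given on the slide: the product of coverage and accuracy."""
    return cov * acc


# Invented counts: 10 correct predictions, 30 false positives, 10 missed residues.
cov = coverage(tp=10, fn=10)
acc = accuracy(tp=10, fp=30)
print(cov, acc, f_score(cov, acc))
```

Because F is a product, a method has to do reasonably well on both coverage and accuracy to score highly.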

  27. IV. Evaluation --- Contribution of interface features to interface residue recognition
     Feature            Coverage  Accuracy  F
     QIPI               0.472     0.188     0.089
     Hydrophobic        0.321     0.238     0.076
     Conservation       0.266     0.191     0.051
     Solvation          0.147     0.160     0.023
     QIPI+Hydrophobic   0.467     0.186     0.087
     All-QIPI           0.312     0.239*    0.075
     All                0.475*    0.194     0.092*
     Note: * marks the best performance in each category (bold in the original slide).

  28. IV. Evaluation --- Performance of SPR
     Comparison of SPR with several popular interface prediction programs on the CAPRI25 and Enz35 datasets
     Program      CAPRI25 ACC  CAPRI25 COV  Enz35 ACC  Enz35 COV
     SPR          0.34*        0.4          0.36       0.58*
     Cons-PPISP   0.26         0.3          0.36       0.5
     Meta-PPISP   0.28         0.39         0.48*      0.55
     Promate      0.26         0.3          0.4        0.45
     PINUP        0.25         0.43*        0.47       0.53
     Note: * marks the best performance in each category (bold in the original slide).

  29. V. Summary
     • A large-scale comprehensive interface dataset, Astral2.05-40-4506, for analysis
     • A novel quantitative residue interface propensity index (QIPI)
     • An interface prediction method, Single domain based Patch Recognition (SPR)

  30. Acknowledgements
     • Shanghai Sailing Program (16YF1408600)
     • Shanghai Center for Bioinformation Technology
       – Prof. Yuan-Yuan Li, Prof. Yi-Xue Li and Liangxiao Ma
     • Suzhou Institute of Systems Medicine
       – Prof. Aiping Wu and Prof. Taijiao Jiang
