
Structural Comparison: Application to the study of Protein Binding - PowerPoint PPT Presentation



  1. Structural Comparison: Application to the Study of Protein Binding Patches. N. Malod-Dognin, Department of Computing, Imperial College London, UK. AlgoSB: Algorithms in Structural Bio-informatics.

  2. Outline 1) Function/Complex/Binding Patches 2) Mathematical Models 3) Divide and Conquer Strategies 4) Results Structural Comparison of Binding Patches 2/36

  3. Proteins perform their functions through binding. Related biological problems: measuring the specificity and affinity of the interaction; predicting the structure of a complex from the unbound structures (docking). (Figure: 2JEL, an antibody/antigen complex.) Answering these questions requires insight into the surface atoms accounting for the interaction.

  4. Protein Solvent Accessible Surface (SAS). The surface as seen by a water probe molecule (Lee & Richards, 1971): no hydrogen atoms, VdW radii + 1.4 Å (water probe). Atoms contributing to the SAS are found using the so-called α-shape.

  5. Interface and Binding Patches. (Figure: atoms a1 to a6 of one partner facing atoms b1 to b5 of the other.) Interface: all atoms participating in the interaction, i.e. whose SAS is intersected by the SAS of the other partner → exact computation using the Voronoï Interface Model (Cazals et al., 2006). Patch: the interface atoms restricted to a single partner.

  6. The Atom Shelling Tree Model. Binding patch as a pattern of nested shells (Malod-Dognin, Bansal and Cazals, 2012). (Figure: example tree; node labels as extracted: 1 43, 2 34, 3 1, 4 23, 5 17, 6 13, 7 1, 8 2.) Each BP face (spherical polygon) is associated with a Shelling Order (SO), that is, its distance, in number of faces, to the boundary of the patch. Connected components of faces having the same SO form shells. The inclusion relation between shells is represented by the Atom Shelling Tree.
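
The shelling order described on this slide can be sketched as a multi-source breadth-first search over the face adjacency graph. This is only an illustration under stated assumptions: the input format (an adjacency dictionary plus a list of boundary faces), the function names, and the convention that boundary faces receive SO 1 are all hypothetical choices, not taken from the slides.

```python
from collections import deque

def shelling_orders(adjacency, boundary):
    """Multi-source BFS: SO of a face = 1 + number of faces to the boundary.
    Assumption: boundary faces get SO 1; every face is reachable from the boundary."""
    so = {face: 1 for face in boundary}
    queue = deque(boundary)
    while queue:
        face = queue.popleft()
        for neighbour in adjacency[face]:
            if neighbour not in so:
                so[neighbour] = so[face] + 1
                queue.append(neighbour)
    return so

def shells(adjacency, so):
    """Group faces into shells: connected components of faces with equal SO."""
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:
            face = stack.pop()
            component.append(face)
            for neighbour in adjacency[face]:
                if neighbour not in seen and so[neighbour] == so[start]:
                    seen.add(neighbour)
                    stack.append(neighbour)
        result.append((so[start], sorted(component)))
    return result

# Toy patch: five faces arranged in a ring, two adjacent faces on the boundary.
faces = {"a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"],
         "d": ["c", "e"], "e": ["d", "a"]}
orders = shelling_orders(faces, ["a", "e"])
patch_shells = shells(faces, orders)
```

In this toy example the two boundary faces merge into a single shell, while the two faces at SO 2 are not adjacent to each other and therefore form two distinct shells.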

  7. Structural Comparison of Patches. Related questions: Functional prediction: do similar receptors bind similar ligands? Affinity prediction: do similar patches have similar binding affinities? Structural classification of interfaces? Do patches change morphology during docking (a hint towards rigid versus flexible docking)?

  8. Pitfalls. Algorithms for protein structure comparison cannot be used for comparing patches: there is no total ordering between atoms on the protein surface. Hardness of geometrical comparison, e.g. largest quasi-isometric subset of atoms (Brint & Willett, 1987) ↔ maximum clique problem: NP-complete (Karp, 1972), hard to approximate (Feige et al., 1991), fixed-parameter intractable (Chen et al., 2006).

  9. Low Resolution Level Methods. Amino-acid level comparison (Scoppi: Winter et al., 2006). Functional group level comparison (Probis: Konc & Janezic, 2010).

  10. Outline 1) Function/Complex/Binding Patches 2) Mathematical Models 3) Divide and Conquer Strategies 4) Results

  11. Contact Map Overlap Maximization Without Order. Nathalie (El-Kebir et al., 2011), a cost-split model for PPI alignment. (Figure: classical representation of the two contact maps CM1 and CM2, and the corresponding alignment graph, a 4 × 4 grid of vertices i.k from 1.1 to 4.4.) Variables: each vertex i.k is represented by a boolean variable $x_{ik}$. Each edge (i.k, j.l) is represented by two arcs, i.e. two boolean variables: $y_{ikjl}$ (from i.k to j.l, i < j) and $z_{jlik}$ (from j.l to i.k, i < j).

  12. Integer Programming Formulation. (Figure: alignment graph between CM1 and CM2 with example y- and z-arcs.) Objective:

      $$\max \; \sum_{i,k,j,l} \tfrac{1}{2}\, y_{ikjl} \;+\; \sum_{i,k,j,l} \tfrac{1}{2}\, z_{jlik}$$

      Subject to:

      $$\begin{aligned}
      \text{1/row:} \quad & \sum_{k} x_{ik} \le 1, && \forall i \\
      \text{1/col:} \quad & \sum_{i} x_{ik} \le 1, && \forall k \\
      \text{bind y-arcs \& tail-vertex:} \quad & \sum_{l} y_{ikjl} \le x_{ik}, && \forall i, k, j,\; i < j \\
      \text{bind z-arcs \& tail-vertex:} \quad & \sum_{k} z_{jlik} \le x_{jl}, && \forall i, j, l,\; i < j \\
      \text{edge equality:} \quad & y_{ikjl} = z_{jlik}, && \forall i, k, j, l,\; i < j,\; k \ne l
      \end{aligned}$$

  13. Lagrangian Relaxation Approach. When relaxing the edge equality $y_{ikjl} = z_{jlik}$: Local problem: for every vertex i.k, find the optimal sum of outgoing $y_{ikjl}$ and $z_{jlik}$ arcs, with at most one head vertex per row or per column. Global problem: find the maximum-cost (based on outgoing arcs) set of vertices, with at most one vertex per row or column. (Figure: example arc weights on the alignment graph, shown by rows and by columns.) Both are maximum-cost bipartite matchings between the rows and the columns. The time complexity of Local + Global is $O(n^4 \log n)$, versus $O(n^2)$ for the ordered case.
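
To make the "at most one vertex per row or column" subproblem concrete, here is a minimal sketch of a maximum-cost bipartite matching on a small cost matrix. It uses a plain bitmask dynamic program, which is exponential in the number of columns and is only a stand-in for illustration; it is not the matching algorithm behind the complexity quoted on the slide. The function name and the cost matrix are hypothetical.

```python
from functools import lru_cache

def max_cost_matching(cost):
    """Best total cost when picking at most one cell per row and per column.
    Rows may stay unmatched, so negative cells are never forced into the solution."""
    n_rows, n_cols = len(cost), len(cost[0])

    @lru_cache(maxsize=None)
    def best(row, used_cols):
        if row == n_rows:
            return 0
        # Option 1: leave this row unmatched.
        value = best(row + 1, used_cols)
        # Option 2: match this row with any still-free column.
        for col in range(n_cols):
            if not (used_cols >> col) & 1:
                value = max(value,
                            cost[row][col] + best(row + 1, used_cols | (1 << col)))
        return value

    return best(0, 0)

# Hypothetical vertex costs: rows index one contact map, columns the other.
example_cost = [[3, 1, 0],
                [2, 4, 1],
                [0, 2, -5]]
best_value = max_cost_matching(example_cost)
```

Here the best choice takes the cells of cost 3 and 4 and leaves the third row unmatched, since its only free column has a negative cost.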

  14. Outline 1) Function/Complex/Binding Patches 2) Mathematical Models 3) Divide and Conquer Strategies 4) Results

  15. Divide and Conquer Strategies: Probis (Konc & Janezic, 2010). Principle: use 10 Å spheres centered at each functional group.

  16. Divide and Conquer Strategies: Compatch. Principle: use the Atom Shelling Tree to localize the matching. Local atom matchings are computed between shells (using topology or geometry); the global matching is then reconstructed via the shell matching given by the tree edit distance. (Figure: two example atom shelling trees.)

  17. Using Tree Edit Distance. Using only topological information (patterns of nested shells). (Figure: example trees T1 and T2.) Edit operations and costs: insert or delete a shell i: $|i|$ (number of atoms in shell i); edit shell i into shell j: $\mathrm{abs}(|i| - |j|)$. The optimal edit script is the one of minimum cost, denoted $TED_t(T_1, T_2)$, computed by dynamic programming (Bille, 2005). Dissimilarity score: $DIS_t = \dfrac{TED_t(T_1, T_2)}{|T_1| + |T_2|}$, i.e. the % of non-isotopologic atoms.
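
The costs above can be plugged into the textbook forest recursion for tree edit distance. The sketch below assumes ordered, rooted trees and uses a memoized recursion rather than an optimized dynamic program; the tuple encoding `(size, children)`, the function names, and the two toy trees are hypothetical and are not the trees drawn on the slide.

```python
from functools import lru_cache

# A tree is (shell_size, (child_tree, ...)); a forest is a tuple of trees.

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Edit distance between two ordered forests, with the slide's costs:
    insert/delete a shell = its size, edit i into j = abs(|i| - |j|)."""
    if not f1 and not f2:
        return 0
    if not f1:   # only insertions remain
        size, kids = f2[-1]
        return size + forest_distance(f1, f2[:-1] + kids)
    if not f2:   # only deletions remain
        size, kids = f1[-1]
        return size + forest_distance(f1[:-1] + kids, f2)
    (s1, k1), (s2, k2) = f1[-1], f2[-1]
    delete = s1 + forest_distance(f1[:-1] + k1, f2)   # children move up
    insert = s2 + forest_distance(f1, f2[:-1] + k2)
    edit = (abs(s1 - s2)
            + forest_distance(k1, k2)                 # match the two subtrees
            + forest_distance(f1[:-1], f2[:-1]))      # match what is left
    return min(delete, insert, edit)

def n_atoms(tree):
    """|T|: total number of atoms over all shells of the tree."""
    size, kids = tree
    return size + sum(n_atoms(kid) for kid in kids)

def dissimilarity(t1, t2):
    """DIS_t = TED_t(T1, T2) / (|T1| + |T2|)."""
    return forest_distance((t1,), (t2,)) / (n_atoms(t1) + n_atoms(t2))

# Hypothetical trees: root shells of 50 and 40 atoms, T1 has one extra leaf.
tree_1 = (50, ((10, ()), (20, ((5, ()),))))
tree_2 = (40, ((10, ()), (20, ())))
ted = forest_distance((tree_1,), (tree_2,))
dis = dissimilarity(tree_1, tree_2)
```

For these two trees the cheapest script edits the root (cost 10) and deletes the 5-atom leaf (cost 5), so the distance is 15 out of 155 atoms in total.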

  18. Local matching between two shells. Objective: find the largest matching between $S_1$ and $S_2$ such that, for any two pairs of matched atoms i ↔ k and j ↔ l, $|d_{ij} - d_{kl}| < 2$ Å. (Figure: example shells S1 and S2 with their inter-atomic distances, and the corresponding alignment graph with S1 as rows and S2 as columns.) This is a maximum clique problem, solved using Cliquer (Östergård, 2002).
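
The reduction to maximum clique can be sketched as follows: every candidate pair i ↔ k becomes a vertex, and two vertices are joined when the matched atoms are distinct and their distances agree within the tolerance. The clique search below is a basic branch-and-bound written for this illustration; it stands in for Cliquer and does not reproduce its algorithm. Function names, the coordinate format, and the toy shells are assumptions.

```python
import math
from itertools import combinations

def compatibility_graph(shell_1, shell_2, eps=2.0):
    """Vertices are atom pairs (i, k); edges join pairs that can be matched together,
    i.e. i != j, k != l and |d_ij - d_kl| < eps."""
    nodes = [(i, k) for i in range(len(shell_1)) for k in range(len(shell_2))]
    adj = {node: set() for node in nodes}
    for (i, k), (j, l) in combinations(nodes, 2):
        if i == j or k == l:
            continue
        d_ij = math.dist(shell_1[i], shell_1[j])
        d_kl = math.dist(shell_2[k], shell_2[l])
        if abs(d_ij - d_kl) < eps:
            adj[(i, k)].add((j, l))
            adj[(j, l)].add((i, k))
    return adj

def max_clique(adj):
    """Exact maximum clique by simple branch-and-bound (exponential worst case)."""
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique
        # Stop as soon as the remaining candidates cannot beat the best clique.
        while candidates and len(clique) + len(candidates) > len(best):
            v = candidates.pop()
            expand(clique + [v], candidates & adj[v])

    expand([], set(adj))
    return best

# Toy shells (coordinates in Å): shell_2 is shell_1 translated, plus one far atom.
shell_1 = [(0, 0, 0), (4, 0, 0), (0, 4, 0)]
shell_2 = [(10, 0, 0), (14, 0, 0), (10, 4, 0), (40, 40, 40)]
matching = max_clique(compatibility_graph(shell_1, shell_2))
```

The result is a list of (i, k) index pairs; on the toy input all three atoms of the first shell are matched and the distant fourth atom of the second shell is left out.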

  19. Global matching of the shells using Tree Edit Distance (2/2). Using geometric information. (Figure: two shells i and j with their inter-atomic distances.) TED costs: insert or delete a shell i: $|i|$; edit shell i into shell j: symmetric difference $= |i| + |j| - 2\,|i \cap j|$, where $i \cap j$ is the largest quasi-isometric subset between i and j. Dissimilarity score: $DIS_g = \dfrac{TED_g(T_1, T_2)}{|T_1| + |T_2|}$.

  20. Outline 1) Function/Complex/Binding Patches 2) Mathematical Models 3) Divide and Conquer Strategies 4) Results

  21. Running time comparison: Dataset_1. 77 high resolution (≤ 2 Å) Immunoglobulin/Antigen complexes from IMGT-3D (Lefranc, 2003); 15 high resolution Protease/Inhibitor complexes from the Protein Docking Benchmark (Chen et al., 2003). These 92 complexes yield a total of 184 patches. The all-against-all comparison involves 17020 pairs of patches.
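
The pair count can be checked with a few lines of arithmetic. The figure of 17020 equals 184 × 185 / 2, which suggests that each patch is also compared with itself; this is an inference from the numbers, not something stated on the slide. The helper name is hypothetical.

```python
def n_comparisons(n_patches, include_self=True):
    """Number of unordered pairs among n_patches patches."""
    if include_self:
        return n_patches * (n_patches + 1) // 2
    return n_patches * (n_patches - 1) // 2

n_patches = 2 * 92                      # two patches per complex
n_pairs = n_comparisons(n_patches)      # self-comparisons included
n_pairs_strict = n_comparisons(n_patches, include_self=False)
```

With self-comparisons included the count matches the slide; without them it would be 16836.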

  22. Running Time Comparison (over 17020 instances). (Figure: three surface plots of the running time T (s) as a function of min(#BP1, #BP2) and max(#BP1, #BP2).) Total running times: top-left, $TED_t$: 315.6 s; top-right, $TED_g$ (ε = 2 Å): 9843.8 s; bottom, Clique (ε = 2 Å): 1166221.6 s.

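  23. Dataset_2. 116 high resolution (≤ 2 Å) Immunoglobulin/Antigen complexes from IMGT-3D; 133 Enzyme/Ligand complexes from the Affinity Benchmark (Kastritis et al., 2011), with resolution in [1.1 Å, 3.3 Å]. 249 complexes → 498 patches → 124251 pairs.

      | Family of complex | Sub-family of complex | Partner type | Class identifier | #patches |
      |---|---|---|---|---|
      | (A) Antibody | (Carb) Carbohydrate | (R) Receptor | A_Carb_R * | 9 |
      | (A) Antibody | (Carb) Carbohydrate | (L) Ligand | A_Carb_L * | 9 |
      | (A) Antibody | (Chem) Chemical | (R) Receptor | A_Chem_R * | 40 |
      | (A) Antibody | (Chem) Chemical | (L) Ligand | A_Chem_L * | 40 |
      | (A) Antibody | (DNA) DNA | (R) Receptor | A_DNA_R | 1 |
      | (A) Antibody | (DNA) DNA | (L) Ligand | A_DNA_L | 1 |
      | (A) Antibody | (Pept) Peptide | (R) Receptor | A_Pept_R * | 21 |
      | (A) Antibody | (Pept) Peptide | (L) Ligand | A_Pept_L * | 21 |
      | (A) Antibody | (Prot) Protein | (R) Receptor | A_Prot_R * | 53 |
      | (A) Antibody | (Prot) Protein | (L) Ligand | A_Prot_L * | 53 |
      | (E) Enzyme | (Inhi) Inhibitor | (R) Receptor | E_Inhi_R * | 40 |
      | (E) Enzyme | (Inhi) Inhibitor | (L) Ligand | E_Inhi_L * | 40 |
      | (E) Enzyme | (Regu) Regulator | (R) Receptor | E_Regu_R * | 11 |
      | (E) Enzyme | (Regu) Regulator | (L) Ligand | E_Regu_L * | 11 |
      | (E) Enzyme | (Subs) Substrate | (R) Receptor | E_Subs_R * | 10 |
      | (E) Enzyme | (Subs) Substrate | (L) Ligand | E_Subs_L * | 10 |
      | (OG) ? ? | non-available | non-available | OG | 34 |
      | (OR) ? ? | non-available | non-available | OR | 26 |
      | (OX) ? ? | non-available | non-available | OX | 68 |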
