
Lattice Models: The Simplest Protein Model


  1. Lattice Models: The Simplest Protein Model
     The HP-Model (Lau & Dill, 1989)
     • model only hydrophobic interaction
     • alphabet $\{H, P\}$; H/P = hydrophobic/polar
     • energy function favors HH-contacts
     • structures are discrete, simple, and originally 2D
     • model only backbone (C-α) positions
     • structures are drawn (originally) on a square lattice $\mathbb{Z}^2$ without overlaps: Self-Avoiding Walk
     Example: [Figure: lattice structure with monomers labeled H P P H P H and an HH-contact marked]
     S.Will, 18.417, Fall 2011

  2. HP-Model Definition
     Definition: The HP-model is a protein model, where
     • Sequence $s \in \{H, P\}^n$
     • Structure $\omega : [1..n] \to L$ (e.g. $L = \mathbb{Z}^2$, $L = \mathbb{Z}^3$), s.t.
       1. for all $1 \le i < n$: $d(\omega(i), \omega(i+1)) = d_{\min}(L)$, where $d_{\min}(\mathbb{Z}^2) = 1$
       2. for all $1 \le i < j \le n$: $\omega(i) \ne \omega(j)$
     • Energy function $E(s, \omega) = \sum_{1 \le i < j \le n} E_{s_i, s_j} \, \Delta(\omega(i), \omega(j))$, where
       $E = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}$ (rows and columns indexed by H, P) and
       $\Delta(p, q) = \begin{cases} 1 & \text{if } d(p, q) = d_{\min}(L) \\ 0 & \text{otherwise} \end{cases}$
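The definition above can be turned into a few lines of code. The following is a minimal sketch for $L = \mathbb{Z}^2$, assuming Manhattan distance as $d$ (so $d_{\min} = 1$); the function names and the example fold are illustrative choices, not taken from the slides. It follows the sum over all $i < j$ literally, so chain-adjacent HH pairs are counted as well. Since consecutive monomers are always at distance $d_{\min}$, they only add a constant that depends on the sequence, not on the structure.

```python
from itertools import combinations


def manhattan(p, q):
    """Distance d(p, q) on the square lattice (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_valid_structure(omega):
    """Check condition 1 (chain connectivity) and condition 2 (self-avoidance)."""
    chain_ok = all(manhattan(omega[i], omega[i + 1]) == 1
                   for i in range(len(omega) - 1))
    return chain_ok and len(set(omega)) == len(omega)


def hp_energy(s, omega):
    """E(s, omega): -1 for every pair i < j of H monomers at lattice distance 1."""
    if len(s) != len(omega) or not is_valid_structure(omega):
        raise ValueError("omega is not a valid structure for s")
    energy = 0
    for i, j in combinations(range(len(s)), 2):
        if s[i] == "H" and s[j] == "H" and manhattan(omega[i], omega[j]) == 1:
            energy -= 1
    return energy


# Hypothetical fold of HPPHPH, chosen for illustration only
fold = [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]
print(hp_energy("HPPHPH", fold))
```

In this fold the H monomers at positions 1 and 4 and at positions 1 and 6 of the sequence are lattice neighbors, which gives two HH-contacts.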

  4. Structures in the HP-Model
     Sequence HPPHPH

  5. How many structures are there?
     [Figure: Self-avoiding Walks of the Square Lattice (without Symmetry); x-axis from 1 to 20, logarithmic y-axis with ticks from 1 to 100.000.000]
     Naive enumeration is not feasible. The problem is even NP-complete:
     • B. Berger, T. Leighton. Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete. RECOMB'98
     • P. Crescenzi, D. Goldman, C. Papadimitriou, A. Piccolboni, and M. Yannakakis. On the complexity of protein folding. RECOMB'98
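To get a feeling for the growth, the number of self-avoiding walks can be counted by plain backtracking. This is a minimal sketch with a hypothetical function name; it counts every walk that starts at a fixed origin, including rotated and mirrored copies, so the values may differ from the plotted ones by a symmetry factor.

```python
def count_saws(steps, pos=(0, 0), visited=None):
    """Count self-avoiding walks with the given number of steps on Z^2."""
    if visited is None:
        visited = {pos}
    if steps == 0:
        return 1
    x, y = pos
    total = 0
    for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if nxt not in visited:
            visited.add(nxt)          # extend the walk
            total += count_saws(steps - 1, nxt, visited)
            visited.remove(nxt)       # backtrack
    return total


for k in range(1, 9):
    print(k, count_saws(k))
```

The counts start 4, 12, 36, 100, 284 and keep growing exponentially, which is why enumerating all structures stops being practical for longer sequences.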

  7. Constraint Programming (CP)
     • Model and solve hard combinatorial problems as CSP by search and propagation
     • cf. ILP, but CP offers more flexible modeling and differs in solving strategies
     Definition: A Constraint Satisfaction Problem (CSP) consists of
     • variables $X = \{X_1, \dots, X_n\}$,
     • the domain $D$ that associates finite domains $D_1 = D(X_1), \dots, D_n = D(X_n)$ to $X$,
     • a set of constraints $C$.
     A solution is an assignment of values from their domains to the variables that satisfies the constraints.

  8. Commercial Impact of Constraint Programming
     • Michelin and Dassault, Renault: production planning
     • Lufthansa, Swiss Air, ...: staff planning
     • Nokia: software configuration
     • Siemens: circuit verification
     • French National Railway Company: train schedule
     • ...

  9. CP Example: The N-Queens Problem
     4-Queens: place 4 queens on a 4 × 4 board without attacks

  14. Model 4-Queens as CSP (Constraint Model)
      • Variables $X_1, \dots, X_4$; $X_i = j$ means “queen in column $i$, row $j$”
      • Domains $D(X_i) = \{1, \dots, 4\}$ for $i = 1..4$
      • Constraints (for $i, i' = 1..4$ and $i \ne i'$)
        $X_i \ne X_{i'}$ (no horizontal attack)
        $i - X_i \ne i' - X_{i'}$ (no attack in first diagonal)
        $i + X_i \ne i' + X_{i'}$ (no attack in second diagonal)
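The constraint model can be checked directly by generating every assignment and filtering with the three constraints. This is a brute-force sketch with hypothetical function names, meant only to make the model concrete; it uses no propagation and is practical only for very small boards.

```python
from itertools import product


def satisfies(assignment):
    """assignment[i - 1] is the row X_i of the queen in column i (1-based)."""
    n = len(assignment)
    for i in range(1, n + 1):
        for k in range(i + 1, n + 1):
            xi, xk = assignment[i - 1], assignment[k - 1]
            if xi == xk:              # horizontal attack
                return False
            if i - xi == k - xk:      # first diagonal
                return False
            if i + xi == k + xk:      # second diagonal
                return False
    return True


def all_solutions(n):
    """Enumerate D(X_1) x ... x D(X_n) and keep the satisfying assignments."""
    return [a for a in product(range(1, n + 1), repeat=n) if satisfies(a)]


print(all_solutions(4))
```

For the 4 × 4 board this yields the two assignments (2, 4, 1, 3) and (3, 1, 4, 2).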

  15. Solving 4-Queens by Search and Propagation, $X_1 = 1$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_i) = \{1, \dots, 4\}$ for $i = 1..4$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  16. Solving 4-Queens by Search and Propagation, $X_1 = 1$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{1\}$, $D(X_i) = \{1, \dots, 4\}$ for $i = 2..4$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  17. Solving 4-Queens by Search and Propagation, $X_1 = 1$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{1\}$, $D(X_2) = \{3, 4\}$, $D(X_3) = \{2, 4\}$, $D(X_4) = \{2, 3\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  18. Solving 4-Queens by Search and Propagation, $X_1 = 1$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{1\}$, $D(X_2) = \{3, 4\}$, $D(X_3) = \{4\}$, $D(X_4) = \{2, 3\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  19. Solving 4-Queens by Search and Propagation, $X_1 = 1$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{1\}$, $D(X_2) = \{3, 4\}$, $D(X_3) = \{\}$, $D(X_4) = \{2, 3\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  20. Solving 4-Queens by Search and Propagation, $X_1 = 2$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_i) = \{1, \dots, 4\}$ for $i = 1..4$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  21. Solving 4-Queens by Search and Propagation, $X_1 = 2$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{2\}$, $D(X_i) = \{1, \dots, 4\}$ for $i = 2..4$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  22. Solving 4-Queens by Search and Propagation, $X_1 = 2$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{2\}$, $D(X_2) = \{4\}$, $D(X_3) = \{1, 3\}$, $D(X_4) = \{1, 3, 4\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  23. Solving 4-Queens by Search and Propagation, $X_1 = 2$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{2\}$, $D(X_2) = \{4\}$, $D(X_3) = \{1\}$, $D(X_4) = \{3, 4\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$

  24. Solving 4-Queens by Search and Propagation, $X_1 = 2$
      Variables: $X_1, \dots, X_4$
      Domains: $D(X_1) = \{2\}$, $D(X_2) = \{4\}$, $D(X_3) = \{1\}$, $D(X_4) = \{3\}$
      Constraints: $X_i \ne X_{i'}$, $i - X_i \ne i' - X_{i'}$, $i + X_i \ne i' + X_{i'}$
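The trace above alternates between choosing a value (search) and shrinking the remaining domains (propagation). A minimal sketch of that loop follows. It assumes a simple propagation rule, namely removing every value that has no compatible partner in some other domain and repeating until nothing changes; real CP solvers schedule propagators differently, so intermediate domains may not match the slides step by step. All names are illustrative.

```python
def compatible(i, vi, k, vk):
    """True if queens at (column i, row vi) and (column k, row vk) do not attack."""
    return vi != vk and i - vi != k - vk and i + vi != k + vk


def propagate(domains):
    """Prune unsupported values until a fixpoint; False if a domain gets empty."""
    changed = True
    while changed:
        changed = False
        for i in domains:
            for k in domains:
                if i == k:
                    continue
                supported = {vi for vi in domains[i]
                             if any(compatible(i, vi, k, vk) for vk in domains[k])}
                if supported != domains[i]:
                    domains[i] = supported
                    changed = True
                    if not supported:
                        return False
    return True


def solve(domains):
    """Depth-first search with propagation after every decision."""
    if not propagate(domains):
        return None                       # dead end, backtrack
    if all(len(d) == 1 for d in domains.values()):
        return {i: next(iter(d)) for i, d in domains.items()}
    i = min(v for v, d in domains.items() if len(d) > 1)
    for value in sorted(domains[i]):
        trial = {k: set(d) for k, d in domains.items()}
        trial[i] = {value}                # search decision X_i = value
        result = solve(trial)
        if result is not None:
            return result
    return None


def queens(n):
    return solve({i: set(range(1, n + 1)) for i in range(1, n + 1)})


print(queens(4))
```

As in the trace, the branch $X_1 = 1$ ends with an empty domain, and the branch $X_1 = 2$ leads to $X_2 = 4$, $X_3 = 1$, $X_4 = 3$ by propagation alone.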

  25. Constraint Optimization
      Definition: A Constraint Optimization Problem (COP) is a CSP together with an objective function $f$ on solutions. A solution of the COP is a solution of the CSP that maximizes/minimizes $f$.
      Solving by Branch & Bound Search
      Idea of B&B:
      • Backtrack & Propagate as for solving the CSP
      • Whenever a solution $s$ is found, add constraint “next solutions must be better than $f(s)$”.

  26. Exact Prediction in 3D cubic & FCC
      The problem
      IN: sequence $s$ in $\{H, P\}^n$, e.g. HHPPPHHPHHPPHHHPPHHPPPHPPHH
      OUT: self-avoiding walk $\omega$ on cubic/fcc lattice with minimal HP-energy $E_{HP}(s, \omega)$

  27. A First Constraint Model
      • Variables $X_1, \dots, X_n$, $Y_1, \dots, Y_n$, $Z_1, \dots, Z_n$ and HHContacts;
        $(X_i, Y_i, Z_i)^T$ is the position of the $i$th monomer, $\omega(i)$
      • Domains $D(X_i) = D(Y_i) = D(Z_i) = \{-n, \dots, n\}$
      • Constraints
        1. positions $i$ and $i + 1$ are neighbors (chain)
        2. all positions differ (self-avoidance)
        3. relate HHContacts to $X_i, Y_i, Z_i$
        4. $(X_1, Y_1, Z_1)^T = (0, 0, 0)^T$

  28. Solving the First Model
      • Model is a COP (Constraint Optimization Problem)
      • Branch and Bound Search for Minimizing Energy
      • (Add Symmetry Breaking)
      • How good is the propagation?
      • Main problem of propagation: bounds on contacts/energy
      From a partial solution, the solver cannot estimate the maximally possible number of HH-contacts well.
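The weak-bound problem can be seen in a small branch-and-bound search on the cubic lattice. The sketch below is not a CP solver and not the approach of the lecture; it is a hand-written backtracking search under stated assumptions: the first monomer is fixed at the origin, the first step is fixed to one direction as a simple form of symmetry breaking, the energy counts all HH pairs at distance 1 (chain neighbors included), and the bound assumes every unplaced H monomer could still gain six contacts. That bound is valid but very optimistic, so it prunes little.

```python
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def fold_branch_and_bound(s):
    """Return (minimal energy, one optimal walk) for an HP sequence on Z^3."""
    if not s or set(s) - {"H", "P"}:
        raise ValueError("s must be a non-empty string over {H, P}")
    n = len(s)
    # h_left[i] = number of H monomers in s[i:], used by the naive bound
    h_left = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        h_left[i] = h_left[i + 1] + (s[i] == "H")

    walk = [(0, 0, 0)]                    # first monomer fixed at the origin
    occupied = {(0, 0, 0): 0}             # lattice point -> monomer index
    best = {"energy": float("inf"), "walk": None}

    def extend(i, energy):
        if i == n:
            if energy < best["energy"]:   # "next solutions must be better"
                best["energy"], best["walk"] = energy, list(walk)
            return
        # Naive bound: each unplaced H gains at most 6 contacts
        if energy - 6 * h_left[i] >= best["energy"]:
            return
        x, y, z = walk[-1]
        moves = NEIGHBORS[:1] if i == 1 else NEIGHBORS   # fix the first step
        for dx, dy, dz in moves:
            p = (x + dx, y + dy, z + dz)
            if p in occupied:             # self-avoidance
                continue
            gained = 0
            if s[i] == "H":
                for ax, ay, az in NEIGHBORS:
                    j = occupied.get((p[0] + ax, p[1] + ay, p[2] + az))
                    if j is not None and s[j] == "H":
                        gained += 1
            walk.append(p)
            occupied[p] = i
            extend(i + 1, energy - gained)
            walk.pop()
            del occupied[p]

    extend(1, 0)
    return best["energy"], best["walk"]


print(fold_branch_and_bound("HPPHPH"))
```

Because the bound rarely rules out a branch, the search degenerates to near-complete enumeration for longer sequences, which motivates computing better bounds on the number of contacts.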

  29. The Advanced Approach: Cubic & FCC
      [Figure: pipeline diagram with labels HP-sequence, Number of Hs, Step 1, Step 2]
      Steps
      1. Core Construction
      2. Mapping

  30. The Advanced Approach: Cubic & FCC
      [Figure: pipeline diagram with labels HP-sequence, Number of Hs, Layer sequences, Step 1, Step 2, Step 3]
      Steps
      1. Bounds
      2. Core Construction
      3. Mapping

  31. Computing Bounds
      • Prepares the construction of cores
      • How many contacts are possible for $n$ monomers if they are freely distributed to lattice points?
      • Answering the question will give information for core construction
      • Main idea: split the lattice into layers and consider contacts
        • within layers
        • between layers
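The split into the two kinds of contacts can be illustrated for the cubic lattice. The sketch below assumes that a layer is a plane $x = \text{const}$, which is one natural choice and not necessarily the layering used in the lecture; the function name is hypothetical. It only counts the contacts of a given point set and does not compute the bound itself.

```python
def layer_contacts(points):
    """Count contacts of a point set on Z^3, split by layers x = const.

    Returns (within, between): contacts inside one layer and contacts
    between two adjacent layers.
    """
    pts = set(points)
    within = between = 0
    for x, y, z in pts:
        # Look only in positive directions so each contact is counted once
        if (x, y + 1, z) in pts:
            within += 1
        if (x, y, z + 1) in pts:
            within += 1
        if (x + 1, y, z) in pts:
            between += 1
    return within, between


# Eight points forming a 2 x 2 x 2 cube: two layers of four points each
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
print(layer_contacts(cube))
```

For the cube, each of the two layers contributes four contacts and four more contacts connect the layers, twelve in total.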

  32. Layers: Cubic & FCC Lattice
