Fitting Protein Chains to Lattices
Ján Maňuch, with Daya Gaur (1st part and partially 2nd part), Shirley Huang, Robert Benkoczi (2nd part)
Simon Fraser University
Proteins
Proteins are polymers constructed from linear sequences (chains) of amino acids. When placed into a solvent, they fold into 3D spatial structures that minimize the total energy.
Problem (Protein Folding Problem). How can the 3D structure of a protein be predicted from the linear sequence of its amino acids?
Simplified model
Too many degrees of freedom: it is impossible to compute the structure precisely for proteins with more than 7 amino acids.
Most folding algorithms first fold the backbone of a protein chain (figure).
Simplified models assume that the centers of amino acids ($C_\alpha$'s) are placed into vertices of a regular lattice with edge size equal to the distance between two consecutive $C_\alpha$'s in the protein (around 3.8 Å).
Example: into cubic lattices (figure).
Protein folding in simple models
The simplest protein folding model was introduced by Dill (1985): not only are the (centers of) residues placed at the vertices of a lattice, but the energy function is also simplified. Instead of considering all the different forces affecting the folding process, only hydrophobic interactions between amino acids neighboring in the lattice are considered. The model is called the HP (hydrophobic/polar) model.
Protein folding in the HP model was shown to be NP-complete in both the 3D cubic lattice (Berger, Leighton (1998)) and the 2D square lattice (Crescenzi, Goldman, Papadimitriou, Piccolboni, Yannakakis (1998)).
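As an illustration of the simplified energy function, the following minimal sketch scores a fold on the 2D square lattice. It assumes the common convention of energy -1 for every pair of H residues that are lattice neighbors but not consecutive in the chain; the slides do not state the exact value, and the function name `hp_energy` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def lattice_distance(a: Point, b: Point) -> int:
    """Manhattan distance between two square-lattice vertices."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hp_energy(sequence: str, fold: List[Point]) -> int:
    """Energy of a fold in the HP model (assumed convention: -1 per H-H contact).

    sequence: string over {'H', 'P'}; fold: one lattice vertex per residue.
    """
    if len(sequence) != len(fold):
        raise ValueError("sequence and fold must have the same length")
    if len(set(fold)) != len(fold):
        raise ValueError("fold is not self-avoiding")
    # Consecutive residues must occupy neighboring lattice vertices.
    for a, b in zip(fold, fold[1:]):
        if lattice_distance(a, b) != 1:
            raise ValueError("consecutive residues are not lattice neighbors")

    energy = 0
    for i in range(len(fold)):
        # Start at i + 2 so that chain neighbors are not counted as contacts.
        for j in range(i + 2, len(fold)):
            both_hydrophobic = sequence[i] == "H" and sequence[j] == "H"
            if both_hydrophobic and lattice_distance(fold[i], fold[j]) == 1:
                energy -= 1
    return energy
```

For example, the chain HPPH folded around a unit square brings its two H residues into contact, giving energy -1 under this convention.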
Accuracy of lattice models
Even though protein folding in lattice models is NP-complete, it is more computationally feasible than in the general model. However, even if we find the optimal fold in a certain lattice model, it could be quite far from the real fold. We would like to identify those lattice models that have the potential to produce folds close to real 3D structures.
Question. How can we measure the ability of lattices to represent proteins (or certain types of proteins)?
Take 3D structures of known proteins (PDB) and find their closest representations in a given lattice. Then measure the similarity between the original (PDB) structures and their lattice approximations.
Protein chain fitting (PCF) problem
Problem.
Instance: An equilateral lattice $L$ with side 1; a sequence of points $p = p_1, \dots, p_n$ such that
(P1) $d(p_i, p_{i+1}) = 1$ for every $1 \le i < n$, and
(P2) $d(p_i, p_j) \ge 1$ for every $n \ge j > i + 1 \ge 2$;
a distance measure $\delta$ on sequences of points; and a number $K$.
Question: Is there a path $l = l_1, \dots, l_n$ in $L$ such that $\delta(p, l) \le K$?
The two most common distance measures are:
coordinate root mean square deviation c-RMS: $\text{c-RMS}(p, l) = \sqrt{\frac{\sum_{i=1}^{n} d(p_i, l_i)^2}{n}}$
distance root mean square deviation d-RMS: $\text{d-RMS}(p, l) = \sqrt{\frac{\sum_{1 \le i < j \le n} [d(p_i, p_j) - d(l_i, l_j)]^2}{n(n-1)/2}}$
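The two distance measures and the chain conditions (P1) and (P2) can be sketched directly from their definitions. This is a minimal illustration, assuming points are given as coordinate tuples and $d$ is the Euclidean distance; the function names and the tolerance `tol` are hypothetical, and no superposition of the two chains is performed.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def c_rms(p: Sequence[Point], l: Sequence[Point]) -> float:
    """Coordinate RMS deviation: sqrt(sum_i d(p_i, l_i)^2 / n)."""
    if len(p) != len(l) or not p:
        raise ValueError("chains must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(p, l))
    return math.sqrt(total / len(p))


def d_rms(p: Sequence[Point], l: Sequence[Point]) -> float:
    """Distance RMS deviation over all n(n-1)/2 pairs i < j."""
    if len(p) != len(l) or len(p) < 2:
        raise ValueError("chains must have equal length of at least 2")
    pairs = list(combinations(range(len(p)), 2))
    total = sum(
        (math.dist(p[i], p[j]) - math.dist(l[i], l[j])) ** 2 for i, j in pairs
    )
    return math.sqrt(total / len(pairs))


def is_protein_chain(p: Sequence[Point], tol: float = 1e-9) -> bool:
    """Check (P1) consecutive points at distance 1 and (P2) others at least 1."""
    for i, j in combinations(range(len(p)), 2):
        dist = math.dist(p[i], p[j])
        if j == i + 1:
            if abs(dist - 1.0) > tol:  # (P1) violated
                return False
        elif dist < 1.0 - tol:  # (P2) violated
            return False
    return True
```

Note that d-RMS compares only internal distances, so it is unchanged when either chain is translated or rotated, whereas c-RMS as written depends on how the two chains are positioned relative to each other.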
Protein chain fitting (PCF) problem
[Figures: 1A0M protein fitted to cubic lattice; 1GUU protein fitted to truncated tetrahedron lattice.]
Applications of PCF problem
Measuring the accuracy of lattice models, i.e., their ability to represent protein chains.
Used in a genetic protein folding algorithm (into a lattice): the Cartesian combination operator generates a new chain from two existing lattice chains. As the new chain is most likely an off-lattice chain, a PCF algorithm has to be used. Rabow, Scheraga (1996).
Existing algorithms for PCF problem
An exponential backtracking algorithm was introduced by Covell, Jernigan (1990).
Dynamic programming approximation algorithms were presented in several papers, e.g., Rykunov, Reva, Finkelstein (1995).
A greedy approach keeping about 500 “best” lattice folds was used in Park, Levitt (1995).
Improved DP algorithm: Rabow, Scheraga (1996).
The self-consistent mean field procedure: Koehl, Delarue (1998).
These are mostly approximation algorithms. The complexity of the problem was so far unknown.
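To make the exponential flavor of exact search concrete, here is a toy exhaustive backtracking fitter for the cubic lattice under c-RMS. It is not the algorithm of Covell and Jernigan or of any other cited paper, only a sketch under stated assumptions: the lattice is the integer grid, the path must be self-avoiding, and the name `fit_to_cubic_lattice` is hypothetical. It is practical only for very short chains.

```python
import itertools
import math
from typing import List, Optional, Sequence, Tuple

STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_to_cubic_lattice(
    p: Sequence[Tuple[float, float, float]], K: float
) -> Optional[Tuple[List[Tuple[int, int, int]], float]]:
    """Return (best lattice path, its c-RMS) if some path has c-RMS <= K, else None."""
    n = len(p)
    budget = n * K * K  # largest allowed sum of squared deviations
    best = {"cost": budget + 1e-9, "path": None}

    def extend(path, used, cost):
        # Prune: the partial sum can only grow as the path gets longer.
        if cost >= best["cost"]:
            return
        i = len(path)
        if i == n:
            best["cost"], best["path"] = cost, list(path)
            return
        x, y, z = path[-1]
        for dx, dy, dz in STEPS:
            q = (x + dx, y + dy, z + dz)
            if q in used:  # keep the lattice path self-avoiding
                continue
            path.append(q)
            used.add(q)
            extend(path, used, cost + sq_dist(p[i], q))
            path.pop()
            used.remove(q)

    # Only lattice points within sqrt(budget) of p[0] can start a feasible path.
    r = math.sqrt(budget)
    ranges = [range(math.floor(c - r), math.ceil(c + r) + 1) for c in p[0]]
    for start in itertools.product(*ranges):
        extend([start], {start}, sq_dist(p[0], start))

    if best["path"] is None:
        return None
    return best["path"], math.sqrt(best["cost"] / n)
```

The bound `K` serves both as the decision threshold of the PCF problem and as the initial pruning budget; each improvement found tightens the budget further.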
Our results
(1) Determine the complexity of the PCF problem: we show that the problem is NP-complete for the c-RMS measure and the cubic lattice.
(2) Use integer programming (package CPLEX) to solve the PCF problem exactly for known proteins (PDB).
NP-completeness of PCF problem � with a set C clauses over a set We use reduction from a planar 3-SAT prob- X of variables in disjunctive normal form such that: lem proved by Lichtenstein (1982). Problem. Var-linked planar 3-SAT (VLP-3-SAT) Instance: A formula (S1) Every clause contains at most three variables. X of variables allows a linear order- x ; : : : ; x G = 1 n such that the graph � (S2) Every variable occurs in exactly three clauses, ( C [ X ; f x ; x 2 2 C or : x 2 2 once negated and twice positive. C g [ f x x ; i = 1 ; : : : ; n g ) is planar (here i i +1 (S3) The set x = x n +1 1 ). ing, say � satisfiable? Question: Is Fitting Protein Chains to Lattices. – p. 11/ ??
Reduction
Problem (PCF).
Instance: An equilateral lattice $L$ with side 1; a sequence of points $p = p_1, \dots, p_n$ such that
(P1) $d(p_i, p_{i+1}) = 1$ for every $1 \le i < n$, and
(P2) $d(p_i, p_j) \ge 1$ for every $n \ge j > i + 1 \ge 2$;
a distance measure $\delta$ on sequences of points; and a number $K$.
Question: Is there a path $l = l_1, \dots, l_n$ in $L$ such that $\delta(p, l) \le K$?
Set $K$ so that every protein point has to be mapped to one of the closest lattice points.
We say that a point is flexible if the set of closest lattice points (to this point) has at least two elements. For instance, centers of faces or cubes are flexible.
Basic building blocks: “wire” and “flipper” (figures).
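The notion of a flexible point can be illustrated for the unit cubic lattice, taken here to be the integer grid (an assumption for this sketch). Because squared Euclidean distance separates by coordinate, every closest lattice point is obtained by rounding each coordinate down or up; the function names and the tolerance `tol` are hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple


def closest_lattice_points(
    point: Sequence[float], tol: float = 1e-9
) -> List[Tuple[int, ...]]:
    """All integer-grid points at minimum Euclidean distance from `point`."""
    # Candidates: every floor/ceil combination of the coordinates.
    candidates = set(
        itertools.product(*[(math.floor(c), math.ceil(c)) for c in point])
    )

    def sq_dist(q):
        return sum((x - y) ** 2 for x, y in zip(point, q))

    best = min(sq_dist(q) for q in candidates)
    return sorted(q for q in candidates if sq_dist(q) <= best + tol)


def is_flexible(point: Sequence[float], tol: float = 1e-9) -> bool:
    """A point is flexible if it has at least two closest lattice points."""
    return len(closest_lattice_points(point, tol)) >= 2
```

Under these assumptions a face center such as (0.5, 0.5, 0) has four closest lattice points and a cube center such as (0.5, 0.5, 0.5) has eight, so both are flexible, while a point such as (0.2, 0, 0) has a single closest lattice point.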