Protein Structure Analysis with Sequential Monte Carlo Method
Jinfeng Zhang
Computational Biology Lab, Department of Statistics, Harvard University

Introduction
• Structure → Function & Interaction
  – The Protein Structure Initiative (PSI) is speeding up the information flow from sequence to structures.
  – Information does not readily flow from structures to structures.
  – Neither does it readily flow from structures to applications.
• What are the bottlenecks?
  – Sampling method.
  – Potential function.

Sampling Methods -- Folding & Growth
[Figure: folding method versus growth method. From http://www.bioinformatics.buffalo.edu/]

Sequential Monte Carlo (SMC) -- Step by Step
[Figure: samples grown step by step, followed by resampling]
• Each sample has a weight!

SMC -- Summary
• Short chains:
  – Exhaustive enumeration, useful for evaluating SMC performance.
• Long chains:
  – Sequential Monte Carlo, for estimating properties of interest.
• The main ingredients of SMC are:
  – A sequence of distributions “approaching” the target distribution π(x_1, …, x_n).
  – A sampling distribution g_{t+1}(x_{t+1} | x_1, …, x_t).
  – A resampling scheme.
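The three ingredients listed above can be sketched on a toy problem. The example below is an assumption-laden illustration, not the method used in the talk: it swaps the protein model for self-avoiding walks on a 2D square lattice, takes the uniform distribution over walks of length t as the intermediate distribution, uses a uniform choice among free neighbouring sites as the sampling distribution g, and resamples multinomially when the effective sample size drops. The names `smc_count_walks`, `effective_sample_size`, and `ess_fraction` are hypothetical.

```python
import random

# Toy stand-in for chain growth: self-avoiding walks on a 2D square lattice.
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def effective_sample_size(weights):
    """Return (sum w)^2 / sum w^2, a common diagnostic for weight degeneracy."""
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return total * total / sum(w * w for w in weights)


def smc_count_walks(n_steps, n_samples=10_000, ess_fraction=0.5, seed=0):
    """Estimate the number of self-avoiding walks with n_steps steps."""
    rng = random.Random(seed)
    chains = [[(0, 0)] for _ in range(n_samples)]
    weights = [1.0] * n_samples
    for _ in range(n_steps):
        for i, chain in enumerate(chains):
            if weights[i] == 0.0:
                continue  # this chain hit a dead end at an earlier step
            x, y = chain[-1]
            occupied = set(chain)
            free = [(x + dx, y + dy) for dx, dy in MOVES
                    if (x + dx, y + dy) not in occupied]
            if not free:
                weights[i] = 0.0
                continue
            # Sampling distribution g: uniform over the free neighbours.
            chain.append(rng.choice(free))
            # Incremental weight = 1 / g = number of free neighbours.
            weights[i] *= len(free)
        # Resampling scheme: multinomial, triggered by a low effective sample size.
        mean_w = sum(weights) / n_samples
        if mean_w > 0.0 and effective_sample_size(weights) < ess_fraction * n_samples:
            picks = rng.choices(range(n_samples), weights=weights, k=n_samples)
            chains = [list(chains[j]) for j in picks]
            # Giving every copy the mean weight keeps the count estimate unbiased.
            weights = [mean_w] * n_samples
    # The average weight estimates the number of walks.
    return sum(weights) / n_samples
```

Calling `smc_count_walks(10)` returns an estimate that can be compared with the exact count of 44,100 ten-step walks on this lattice. The same weight bookkeeping carries over to other growth models as long as the incremental weight matches the chosen sampling distribution.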

References for SMC
• J.S. Liu and R. Chen (1998). SMC for dynamic systems. J. Amer. Statist. Assoc. 93, 1032-45.
• J.S. Liu (2001). Monte Carlo Strategies in Scientific Computing. Springer-Verlag.
• J. Liang, J. Zhang, and R. Chen (2002). J. Chem. Phys. 117:7, 3511-3521.
• J. Zhang, R. Chen, C. Tang, and J. Liang (2003). J. Chem. Phys. 118:12, 6102-6109.
• J. Zhang, Y. Chen, R. Chen, and J. Liang (2004). J. Chem. Phys. 121:1, 592-603.

Near Native Structures of Proteins

Native State is an Ensemble of Structures
[Figure: Ca2+ ATPase pump; Lac repressor; 2BBN]
• Protein functions and interactions are determined by the near native structures.

Biological Problems
• Stability
  – Probability of near native structures (NNS) under the Boltzmann distribution.
• Function
  – Analysis of NNS to detect correlated structural changes.
• Interaction
  – Near native structures with diversified interfaces.
• Difficulty of protein structure prediction
  – Probability of NNS under the uniform distribution.
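The two probabilities named above can be written out explicitly. This is a sketch of the standard definitions under assumed notation that is not in the slides: Ω is the set of all conformations allowed by the model, N ⊂ Ω is the subset of near native structures, E(x) is whatever potential function is being evaluated, and k_B T is the thermal energy.

```latex
P_{\mathrm{uniform}}(\mathrm{NNS}) = \frac{|\mathcal{N}|}{|\Omega|},
\qquad
P_{\mathrm{Boltzmann}}(\mathrm{NNS}) =
  \frac{\sum_{x \in \mathcal{N}} e^{-E(x)/k_B T}}
       {\sum_{x \in \Omega} e^{-E(x)/k_B T}}
```

Under these definitions the uniform case is the special case of a constant potential, which is why the same sampled conformations can in principle serve both estimates.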

Methods for Studying NNS
• Experimental methods, such as NMR
  – Study one protein at a time. Limited to certain protein types.
• MD simulation
  – Computationally expensive. Applicable to small proteins.
• MCMC
  – Folding around the constrained native structure template is not efficient.
• NMR combined with MD
  – Vendruscolo M, et al. Nature (2005), 433:128-32

Near Native Structures -- Connecting Experimental Structures and Applications
[Figure: diagram labeled SMC]

Representation of Protein Structures
• Optimized discrete state model (ODSM).
  [Figure: chain geometry with atoms C_{i-2}, C_{i-1}, C_i, C_{i+1}, side chains SC_{i-1}, SC_i, and angles α_i, τ_i]
• Accuracy of ODSM.
  [Figure: cRMSD (axis 1.5 to 3.0) versus number of discrete states (3 to 10); labels ALA, PRO, GLY, HIS]

Sequential Monte Carlo for Sampling NNS
[Figure: SMC generating near native structures from the native structure]
• Definition of NNS:
  – Structures with RMSD < 3 Å to the native structure.
  – Other similarity measures are possible.
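As an illustration of the remark that other similarity measures are possible, the sketch below computes the distance RMSD (dRMSD), which compares all pairwise intra-structure distances and therefore needs no superposition. It is a swapped-in alternative, not the RMSD measure used in the talk, and dRMSD values are not numerically interchangeable with superposition-based RMSD, so the 3 Å default cutoff here is only a placeholder. The function names `drmsd` and `is_near_native` are hypothetical.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two structures given as equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    pairs = list(combinations(range(len(coords_a)), 2))
    if not pairs:
        raise ValueError("need at least two points per structure")
    squared = 0.0
    for i, j in pairs:
        # Compare the i-j distance in structure A with the same distance in B.
        dist_a = math.dist(coords_a[i], coords_a[j])
        dist_b = math.dist(coords_b[i], coords_b[j])
        squared += (dist_a - dist_b) ** 2
    return math.sqrt(squared / len(pairs))


def is_near_native(coords, native, cutoff=3.0):
    """Classify a structure as near native if its dRMSD to the native is below cutoff (Å)."""
    return drmsd(coords, native) < cutoff
```

Because only internal distances enter, a rigidly translated or rotated copy of the native structure has a dRMSD of zero. One known limitation is that dRMSD cannot distinguish a structure from its mirror image.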

Comparison with Enumeration I -- Estimation of Number of Conformations
[Figure: protein 1ail; ln(Number of Conformations) versus Length (11 to 15); series: 5 State Enum. and 5 State SMC; annotated values 1.042×10^9 and 1.039×10^9, apparently for enumeration and SMC respectively]
• Sample size: 10,000.
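The idea of checking a sampling estimate against exhaustive enumeration on short chains can be sketched on a toy model. The example below is an assumption: it counts self-avoiding walks on a 2D square lattice by depth-first search rather than enumerating the 5-state protein model from the slide, and the names `count_walks` and `relative_error` are hypothetical.

```python
def count_walks(n_steps):
    """Exactly count self-avoiding walks with n_steps steps from the origin."""
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(x, y, visited, remaining):
        if remaining == 0:
            return 1  # one complete walk
        total = 0
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], visited, remaining - 1)
                visited.remove(nxt)  # backtrack before trying the next move
        return total

    return extend(0, 0, {(0, 0)}, n_steps)


def relative_error(estimate, exact):
    """Relative deviation of a sampling estimate from the enumerated value."""
    return abs(estimate - exact) / exact
```

The cost of enumeration grows exponentially with chain length, which is why it is only practical as ground truth for short chains. For scale, the two values annotated on the slide, 1.042×10^9 and 1.039×10^9, differ by about 0.3%.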

Comparison with Enumeration II -- Estimation of NNS
• RMSD bins: 1: 1.0 Å - 1.5 Å; 2: 1.5 Å - 2.0 Å; 3: 2.0 Å - 2.5 Å; 4: 2.5 Å - 3.0 Å.
[Figure: ln(Probability) versus RMSD Bin (1 to 4); series: L 15 Enum. and L 15 SMC; annotated values 5.94×10^-8 and 5.60×10^-8]
• Sample size: 10,000.

Comparison with Enumeration III -- Estimation of Native Contacts
[Figure, panel a: 1nkd, RMSD Bin-2 (1.5 Å - 2.0 Å); Probability versus Native Contact (0 to 35); series: Enum. and SMC]
[Figure, panel b: 1nkd, RMSD Bin-4 (2.5 Å - 3.0 Å); Probability versus Native Contact (0 to 35); series: Enum. and SMC]

Probability of NNS -- How Difficult Is Protein Structure Prediction?
• Probability of NNS for 70 non-homologous proteins, grouped by length with 5 residues per interval.
[Figure: log10(Probability) versus Length (60 to 140); series: RMSD < 3 Å, RMSD < 4 Å, RMSD < 5 Å]

Probability of NNS -- Effect of Model Complexity
• Average probability of NNS for 8 proteins at partial length and full length.
[Figure, panel a: log10(N) versus Length (20 to 50); series: 4-state, 5-state, 6-state, 8-state]
[Figure, panel b: log10(P) versus Length (20 to 80); series: 4-state, 5-state, 6-state, 8-state]
• The 4-, 5-, 6-, and 8-state models all have the same probability of NNS.

Probability Under Boltzmann Distribution -- Contact Potentials
Piotr Pokarowski et al., PROTEINS, 59:49–57 (2005)

Probability of NNS Under Boltzmann Distributions
• Probability of NNS for 32 proteins with lengths from 31 to 90.
[Figure: log10(Probability) versus Length Bin (30 to 90); legend entries: Uniform distribution of 5-state model, Boltzmann distribution of 5-state model, Boltzmann distribution of 6-state model]
• Pair-wise contact potential functions stabilize NNS poorly.
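One way such a comparison could be computed from a single set of weighted samples is sketched below. It assumes, as a labeled assumption rather than the talk's stated procedure, that the sample weights target the uniform distribution over conformations and that each sample carries an energy from some contact potential; multiplying each weight by its Boltzmann factor then reweights the same samples to the Boltzmann distribution. The name `log10_prob_nns` and the parameter `kT` are hypothetical.

```python
import math


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log10_prob_nns(weights, energies, near_native, kT=None):
    """Estimate log10 P(NNS) from weighted samples.

    kT=None gives the probability under the uniform distribution;
    a positive kT reweights each sample by exp(-E / kT).
    """
    log_all, log_nns = [], []
    for w, energy, flag in zip(weights, energies, near_native):
        if w <= 0.0:
            continue  # dead samples contribute nothing
        log_w = math.log(w)
        if kT is not None:
            log_w -= energy / kT  # Boltzmann factor in log space
        log_all.append(log_w)
        if flag:
            log_nns.append(log_w)
    if not log_all:
        raise ValueError("no samples with positive weight")
    if not log_nns:
        return float("-inf")  # no near native sample was observed
    return (_log_sum_exp(log_nns) - _log_sum_exp(log_all)) / math.log(10)
```

Working in log space matters here because the probabilities on the slide span tens of orders of magnitude. A potential that stabilizes the native state well would give a Boltzmann value far above the uniform one for the same samples.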

Summary for NNS
• Sequential Monte Carlo (SMC) is used to study near native structures (NNS).
• The probability of NNS is estimated for proteins up to length 150.
• Models of different complexity have the same probability of NNS.
• A rigorous evaluation criterion for potential functions. Contact potentials do not stabilize native structures.

Side Chain Modeling

Introduction
• Side chain modeling is important for protein structure prediction, protein interaction, and protein design.
• Most current methods search for a single conformation with minimum potential energy.
• In structure prediction, the energy of a conformation is normally calculated without the side chain conformational entropy.

Questions
• Do structures with similar compactness have similar side chain conformational entropy?
• Do structures with similar fold have similar side chain conformational entropy?
• Do native structures have higher side chain entropy than random structures with similar compactness or similar fold?
We address these questions with our new side chain modeling method.
