Simulating biomolecular function from motions across multiple scales (I) Peter J. Bond (BII) peterjb@bii.a-star.edu.sg
Structural Biology: Why the Need for Simulation?
• Explosion in number of structures deposited to PDB over past ~15 years, due to:
- Post-genomics era: accessibility to numerous genomes, more stable proteomes etc.
- Automation in crystallization protocols, robotics.
- Structural biology consortia (and money!)
• Also improvements in NMR, cryoEM, & biophysical methods.
• So with all this structural data, why the need for simulation?
[Figure: no. of structures in the RCSB Protein Data Bank (https://www.rcsb.org/) by year, 1972 to 2017; axis range 0 to 125,000.]
The Importance of Dynamics and “Landscape”…
[Figure: labels: single “snapshot”; ligand binding.]
Methods & Associated (Typical) Scales
[Figure: methods placed by typical time and length scale, from smallest to largest: ab initio QM, semi-empirical QM, atomic res. simulation, coarse-grained simulation, continuum; biomolecules marked on the plot. TIME axis (s): 10^-15 (fs), 10^-12 (ps), 10^-9 (ns), 10^-6 (µs), 10^-3 (ms), 10^0. LENGTH axis (metres): 10^-10 to 10^-4, with 10^-9 (nm) and 10^-6 (µm) marked.]
Biomolecular Simulations: From Structure to Dynamics
FF used to calculate resultant forces $F_i$ (& acceleration $a_i$ via Newton’s 2nd law) on particle $i$ with mass $m_i$:
$F_i = -\nabla_i E_{system} = m_i a_i$
thus we can relate gradient of PE to changes in positions / velocities as a function of time:
$-\frac{\partial E_{system}}{\partial r_i} = m_i \frac{\partial v_i}{\partial t} = m_i \frac{\partial^2 r_i}{\partial t^2}$
o Static structure – in vitro conditions.
o Simulation: ~300 K, biological model ...
o 10^3 – 10^5 atoms …
o ~10^6 pair-wise interactions: “force field”
o Numerical integration of F=ma.
o Coordinates calculated every 0.000000000000001 sec (1 fs), ~ 1 CPU sec …
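The numerical integration of F=ma can be illustrated with a minimal sketch. It assumes a velocity Verlet scheme (the slide does not name an integrator) and a hypothetical 1D harmonic "spring" in reduced units; the function names are illustrative only.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force from E = 0.5*k*(x - x0)**2, i.e. F = -dE/dx."""
    return -k * (x - x0)


def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Propagate one particle with velocity Verlet (an assumed integrator)."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update, a = F/m
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass, k=1.0, x0=0.0):
    """Kinetic plus potential energy of the toy system."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x0) ** 2


if __name__ == "__main__":
    traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
    drift = abs(total_energy(*traj[-1], mass=1.0) - total_energy(*traj[0], mass=1.0))
    print(f"energy drift over {len(traj) - 1} steps: {drift:.2e}")
```

A small, bounded energy drift is a quick sanity check that the time step is short enough for the fastest motion in the system.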
Biomolecular Simulations: From Structure to Dynamics
[Figure: COMPUTATIONAL COST of solvent representations: real … explicit, implicit (e.g. ε, ± ξ).]
• Periodicity mimics infinite system (e.g. cube).
• Minimum image convention.
• Good rule of thumb: ≥ 2 nm between “images”.
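The minimum image convention can be sketched as follows. This assumes an orthorhombic (rectangular) box and uses hypothetical function names; production codes also handle triclinic cells, which are not covered here.

```python
import math


def minimum_image_delta(xi, xj, box_length):
    """Shortest signed displacement from xi to xj along one periodic axis."""
    d = xj - xi
    return d - box_length * round(d / box_length)


def minimum_image_distance(ri, rj, box):
    """Distance between two points using the nearest periodic image of rj."""
    deltas = [minimum_image_delta(a, b, length) for a, b, length in zip(ri, rj, box)]
    return math.sqrt(sum(d * d for d in deltas))


if __name__ == "__main__":
    box = (10.0, 10.0, 10.0)  # box edge lengths in nm (illustrative values)
    # The two points are 8 nm apart along each axis inside the box,
    # but only 2 nm apart through the periodic boundary.
    print(minimum_image_distance((1.0, 1.0, 1.0), (9.0, 9.0, 9.0), box))
```

The same displacement function gives a direct check of the ≥ 2 nm rule of thumb: the closest approach between a solute and its own periodic images should stay above that separation.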
Molecular Simulation – “Computational Microscope”
• Computational modelling – now an indispensable tool for complementing traditional experiments.
• Arieh Warshel: “… the best tool we have to see how molecules are working.” (awarded Nobel Prize in Chemistry, 2013, with Levitt & Karplus).
• Klaus Schulten coined the term “computational microscope”.
• Not simply an in silico “imaging” technique – not just for movies …
- dynamics, interactions, conformational changes, mechanisms!
- no limitations on spatio-temporal “zoom”!
- ability to carry out “alchemistry”!
- ability to do “thought experiments”!
- powerful tool: integrate model & experiment.
But... Potential Limitations:
• Accuracy of starting model / available experimental data …
• Accuracy of the underlying force field …
• Limited sampling in time / space …
Simulating (and waiting for) Motions …
[Figure: energy vs. conformation.]
Zwier & Chong. Current Opinion in Pharmacology. 2010. 10:745-752.
The increasing power of biomolecular simulation
• In under a decade: ~10^3-fold increase in simulation performance …
- thanks to algorithms, architectures, cost …
- also improves FF accuracy.
[Figure: supercomputing power; annotation: life cycle of E. coli.]
Schlick et al. Biomolecular modeling and simulation: a field coming of age. Q Rev Biophys. 2011. 44:191-228.
Describing Biomolecular Interactions
• Covalent: ~1-2 Å, ~100 kcal mol^-1.
• H-bonds (electrostatic …): H shared by 2x δ- atoms. ~1-5 kcal mol^-1, ~2-4 Å.
• Electrostatic: ~3 Å. ~1-5 kcal mol^-1 (ε=80), ~50 kcal mol^-1 (ε=2), i.e. medium dependent!
• “Hydrophobic interactions” (entropy driven).
• vdW: ~0.5-1 kcal mol^-1. Attractive - transient polarization (also repulsive - orbital overlap).
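The medium dependence of the electrostatic term can be checked with a short Coulomb's law sketch. It assumes two opposite unit point charges 3 Å apart in a uniform dielectric, and uses the conversion factor of about 332.06 kcal mol^-1 Å e^-2 that is common in molecular mechanics; the function name is illustrative.

```python
# Coulomb conversion factor in kcal mol^-1 Å e^-2 (commonly used value).
COULOMB_CONSTANT = 332.0637


def coulomb_energy(q1, q2, r_angstrom, epsilon_r):
    """Point-charge interaction energy (kcal/mol) in a uniform dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (epsilon_r * r_angstrom)


if __name__ == "__main__":
    for label, eps in (("water-like, eps=80", 80.0), ("protein/membrane-like, eps=2", 2.0)):
        energy = coulomb_energy(+1.0, -1.0, 3.0, eps)
        print(f"{label}: {energy:.1f} kcal/mol")
```

With these inputs the magnitudes come out at roughly 1.4 and 55 kcal mol^-1, in line with the ranges quoted on the slide.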
Describing Biomolecular Interactions: “Force Field”
[Figure: bond energy $E_{bond}$ vs. separation, r, around the equilibrium value, comparing quadratic, cubic and Morse forms.]
Dihedral term parameters:
- n = multiplicity (no. minima)
- φ = current angle
- γ = phase (minima position; x-axis)
- $V_n$ = barrier height (y-axis)
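The bonded terms named on this slide can be sketched with commonly used functional forms. These are assumptions, since the slide's own equations did not survive extraction: a quadratic bond k(r - r0)^2, a Morse bond, and a periodic dihedral (V_n/2)[1 + cos(nφ - γ)]. Prefactor conventions (for example a factor of 1/2 on the quadratic term) differ between force fields.

```python
import math


def quadratic_bond(r, r0, k):
    """Quadratic (harmonic) bond energy; convention without a 1/2 prefactor."""
    return k * (r - r0) ** 2


def morse_bond(r, r0, depth, a):
    """Morse bond energy; tends to `depth` at large r (bond dissociation)."""
    return depth * (1.0 - math.exp(-a * (r - r0))) ** 2


def periodic_dihedral(phi, v_n, n, gamma):
    """Periodic torsion energy with barrier height v_n, multiplicity n, phase gamma."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))


if __name__ == "__main__":
    # Illustrative parameters only; not taken from any published force field.
    k, r0, depth = 300.0, 1.5, 100.0
    a = math.sqrt(k / depth)  # matches the Morse curvature to the quadratic at r0
    for r in (1.4, 1.5, 1.6, 2.5):
        print(r, round(quadratic_bond(r, r0, k), 2), round(morse_bond(r, r0, depth, a), 2))
    # n = 3 gives three minima per full rotation of the dihedral.
    print([round(periodic_dihedral(math.radians(d), 2.0, 3, 0.0), 2) for d in (0, 60, 120, 180)])
```

The quadratic and Morse curves agree near the equilibrium value and diverge at large separation, where only the Morse form allows the bond to break.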
Describing Biomolecular Interactions: “Force Field”
Lennard-Jones (“6-12”) potential:
$E_{vdw} = 4\epsilon \left[ (\sigma/R)^{12} - (\sigma/R)^{6} \right]$
• Pair-wise sum of all possible interacting non-bonded atoms i and j … O(n^2)
• Electrostatics – decays slowly (i.e. 1/R) … many methods to treat this.
*** Stick with FF recommendation! ***
[Figure: Lennard-Jones energy E vs. separation R, with σ marked.]
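The pair-wise Lennard-Jones sum can be written out directly. This is a minimal sketch in reduced units with a single ε and σ for every pair and no cutoff, so it shows the O(n^2) scaling rather than how a production code organises the calculation; the function names are illustrative.

```python
import math
from itertools import combinations


def lennard_jones(r, epsilon, sigma):
    """6-12 potential: 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_lj_energy(coords, epsilon, sigma):
    """Sum over all unique pairs i < j: n*(n-1)/2 terms, hence O(n^2)."""
    return sum(
        lennard_jones(math.dist(a, b), epsilon, sigma)
        for a, b in combinations(coords, 2)
    )


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)  # separation of the energy minimum for sigma = 1
    # Three atoms on an equilateral triangle with side r_min.
    triangle = [
        (0.0, 0.0, 0.0),
        (r_min, 0.0, 0.0),
        (0.5 * r_min, 0.5 * math.sqrt(3.0) * r_min, 0.0),
    ]
    print(total_lj_energy(triangle, epsilon=1.0, sigma=1.0))
```

The energy is zero at R = σ and reaches its minimum of -ε at R = 2^(1/6) σ, so the triangle above sits at about -3ε.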
Energies & Force Fields (FFs)…
Describe total energy of the system such that there are penalties for deviations from reference values. Energies are calculated using an empirically derived force field (FF).
• $E_{TOTAL} = E_{BONDED} + E_{NON-BONDED}$
• “Balls & springs”: bonded (+ $f_c$ / $E_o$), non-bonded interactions (LJ), particle mass, size, partial charge.
Parameters from where?
• Fragment geometries – X-ray studies.
• Rotational barriers / vibrational frequencies from spectroscopy.
• Charges from e.g. QM calculations.
• van der Waals – trial and error, e.g. to match experimental densities.
• Thermodynamic properties …
• Biomolecules - highly specific refinements over the years (but cf. over-fitting, e.g. IDPs …)
• Many accurate FFs are now available!
Real Simulation Codes & Force Fields
CHARMM (Chemistry at Harvard Molecular Mechanics) www.charmm.org
♦ Interface through Fortran-like scripting language - tough!
♦ Very powerful, many different features. Slow.
♦ $600 (academic) but also free reduced-functionality version.
AMBER (Assisted Model Building with Energy Refinement) www.ambermd.org
♦ Suite of about 60 programs based around a few central ones.
♦ Slow on standard CPUs; fast with GPU-optimization.
♦ $500 (academic), $15-20,000 (industry).
GROMACS (Groningen Machine for Chemical Simulations) www.gromacs.org
♦ Simple interface (not scripting based).
♦ The fastest code on 100s of cores (CPU/GPU).
♦ GNU licensed (i.e. free!)
NAMD (Not just Another Molecular Dynamics program) www.ks.uiuc.edu/Research/namd
♦ Optimized for many 1000s of cores.
♦ Written in C++ with a Tcl-based scripting interface.
♦ Also free of charge.
Automated Simulations … but be wary …
• http://bio.demokritos.gr/gromita/ - Graphical User Interface for GROMACS v4+
• https://www.charmming.org - CHARMMing interface – preparation/submission/analysis.
• http://haddock.science.uu.nl/enmr/services/GROMACS/main.php - Web-based portal for automated GROMACS simulations, distributed European Grid network (10 ns sims).
• http://py-enmr.cerm.unifi.it - similar for AMBER-based NMR refinement.
• http://mmb.irbbarcelona.org/MDWeb/ - Setting up / running / analysis of simulations in Amber, NAMD, GROMACS and related …
• http://www.bevanlab.biochem.vt.edu/
Simulation Workflow
Early Steps: Know your system! (PDB “headers” & papers are your friend!)
• Obtain structure – X-ray / NMR / model
♦ missing atoms / residues / loops & mutations (Pymol, Modeller, Swiss-model etc.)
♦ oligomer state
♦ disulfides (assess via distance only?)
♦ ligands
• Add H’s, consider pKa, prepare topology (CGenFF, PRODRG, SwissParam, VMD QMTool – Gaussian.)
• Solvate + add ions: bulk / structural / crystal water / ions.
• Energy minimize (geometry): $F_i = -\nabla_i V$, e.g. steepest descents – follow gradient “downhill” until threshold (ΔE or $F_{max}$).
• Equilibration: aim to “relax” system, e.g. solvent/ion distribution, temperature, box size/density … Cf. ensemble (e.g. NPT). Restraints: $E_{restr} = k (r - r_0)^2$.
• Production
• Analyze
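The steepest descents step of the workflow can be sketched on a toy potential. This assumes a fixed step size and a simple quadratic restraint-like energy k(x - x0)^2 per coordinate; real minimizers adapt the step size, and every name below is hypothetical.

```python
def steepest_descent(coords, gradient, step=0.1, f_max_tol=1e-3, max_steps=10000):
    """Move along the force (minus the gradient) until the largest force is below f_max_tol."""
    coords = list(coords)
    for n_done in range(max_steps):
        forces = [-g for g in gradient(coords)]        # F = -grad V
        if max(abs(f) for f in forces) < f_max_tol:    # F_max convergence threshold
            return coords, n_done
        coords = [x + step * f for x, f in zip(coords, forces)]
    return coords, max_steps


def restraint_gradient(coords, targets=(1.0, 1.0), k=1.0):
    """Gradient of V = sum_i k*(x_i - x0_i)**2, a toy stand-in for a force field."""
    return [2.0 * k * (x - x0) for x, x0 in zip(coords, targets)]


if __name__ == "__main__":
    minimized, n_steps = steepest_descent([2.0, -1.0], restraint_gradient)
    print(f"converged in {n_steps} steps to {[round(x, 3) for x in minimized]}")
```

Steepest descents is robust far from a minimum but slow close to one, which is why it is typically used to remove bad contacts before equilibration rather than to locate precise minima.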
Assessing Errors & Convergence...
Sampling & Convergence
• Simple - look at it!
• Check distribution of properties against average – even distribution?
• Calculate block averages for a single trajectory: each $\tau_{block}$ should be > $\tau_{relax}$.
• Calculate multiple simulation replicas and compare … (Ergodic …)
[Figure: protein structural deviation, Cα RMSD (Å) vs. time (ns), with annotation “Take frames from here”; trajectory split into blocks of x no. steps.]
Comparison to Experiment
• e.g. RMSF vs B-factors: $B_i = \frac{8\pi^2}{3} RMSF_i^2$
• Care … this is a very limited indicator alone … remember experimental error!
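Block averaging over a single trajectory can be sketched as below. The series is synthetic (a hypothetical relaxing RMSD-like signal), and the error estimate is only meaningful when each block is longer than the relaxation time, as noted above.

```python
import math
import random
from statistics import mean, stdev


def block_average(series, n_blocks):
    """Return (mean of block means, standard error estimated from the block means)."""
    block_len = len(series) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least 2 blocks with at least 1 frame each")
    block_means = [
        mean(series[i * block_len:(i + 1) * block_len]) for i in range(n_blocks)
    ]
    return mean(block_means), stdev(block_means) / math.sqrt(n_blocks)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic signal: relaxes towards 2.0 Å, then fluctuates around it.
    series = [2.0 * (1.0 - math.exp(-t / 50.0)) + random.gauss(0.0, 0.1) for t in range(1000)]
    production = series[300:]  # discard the initial relaxation before averaging
    for n_blocks in (2, 5, 10):
        avg, err = block_average(production, n_blocks)
        print(f"{n_blocks:2d} blocks: mean = {avg:.3f}, std. error = {err:.3f}")
```

If the estimated error keeps changing as the number of blocks varies, the blocks are probably still correlated and more sampling is needed.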
Case Study: Theory vs Experiment & OmpA
[Figure: OmpA structures (X-ray, NMR) with loops L1-L4 labelled; other labels: insoluble, detergent, “?”.]
• Bacterial outer membrane protein (~100,000 per cell!)
• Flickering channel formation in lipid membranes, but no obvious pore in crystal.
• NMR – but gradient of flexibility along barrel in detergent micelle complex.
• 4 monomers per unit cell, space group C2.
• Detergent-mediated “protein fibre”.
• 24 x octyltetraoxyethylene (C8E4), 264 x H2O.
• Loops modelled, crystal water & detergent + bulk water and ions. NVT ensemble simulation.
Bond et al, PNAS (‘06) 103:9518-
[Figure: B-factors (Å^2), crystal vs. simulation, using $B_i = [8\pi^2/3] \cdot RMSF_i^2$, with loops L1-L4 and turns T1-T3 marked; RMSD (Å) vs. time (ns), 0 to 50 ns.]
Bond et al, PNAS (‘06) 103:9518-
• Detergent molecules dynamically cover protein fibre – membrane-like environment.
• β-barrel RMSD low. Higher for loops – low crystal density & inherent high mobility.
• B-factor correlation... Missing density - vibrations, fluctuations, and lattice disorder …
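The RMSF to B-factor comparison uses the relation B_i = [8π^2/3]·RMSF_i^2. A minimal sketch follows; the per-residue numbers are made up for illustration and are not taken from the OmpA study, and the helper names are hypothetical.

```python
import math


def rmsf_to_bfactor(rmsf_angstrom):
    """B = (8*pi^2/3) * RMSF^2, with RMSF in Å and B in Å^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


def bfactor_to_rmsf(b_factor):
    """Inverse relation: RMSF = sqrt(3*B / (8*pi^2))."""
    return math.sqrt(3.0 * b_factor / (8.0 * math.pi ** 2))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    simulated_rmsf = [0.5, 0.6, 1.2, 1.8, 0.7]       # Å, hypothetical values
    crystal_b = [10.0, 12.0, 35.0, 60.0, 15.0]       # Å^2, hypothetical values
    simulated_b = [rmsf_to_bfactor(r) for r in simulated_rmsf]
    print([round(b, 1) for b in simulated_b])
    print(f"correlation with crystal B-factors: {pearson(simulated_b, crystal_b):.2f}")
```

An RMSF of 1 Å corresponds to a B-factor of about 26.3 Å^2. Experimental B-factors also absorb lattice disorder and refinement error, so the correlation is a limited indicator on its own.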