Computational Biophysics in the Petascale Computing Era. Rommie E. Amaro, UC San Diego. Blue Waters Symposium, June 2018
Convergence of HPC, data science, & data enabling transformative advances at the intersection of observational and simulation sciences
[Figure: compute power versus time, with years 1993, 1997, 2002, 2007, and 2013 marked and exascale ahead. Machines: HP 735 (12 CPUs), SGI Origin (128 CPUs), LeMieux (3k CPUs), Ranger (60k CPUs), Anton, and 360,000 cores + GPU acceleration. Systems: protein (10k atoms, 100s ps), ion channel (100k atoms, 1 ns), ATPase (500k atoms, 10s ns), ribosome (2 mil atoms, 100s ns), enveloped virus (160 mil+ atoms). Longest timescale label: 1-100 μs.]
BW is a key component of the cyberinfrastructure ecosystem
NAMD, AMBER, GROMACS, MARTINI… Influenza Cancer Chlamydia Trypanosomiasis
Biophysics on Blue Waters
[Pie chart: Actual Usage by Discipline, 4/2013-5/2018. Largest slices: Biophysics 15%, Stellar Astronomy and Astrophysics 13%, Elementary Particle Physics 13%. Every other discipline shown is at 8% or less.]
Biophysics on Blue Waters
• DI, Data Intensive: uses large numbers of files, e.g. large disk space/bandwidth, or automated workflows/off-site transfers
• GA, GPU-Accelerated: written to run faster on XK nodes than on XE nodes
• TN, Thousand Node: scales to at least 1,000 nodes for production science
• MI, Memory Intensive: uses at least 50 percent of available memory on 1,000-node runs
• BW, Blue Waters: research only possible on Blue Waters
• MP, Multi-Physics/multi-scale: job spans multiple length/timescales or physical/chemical processes
• ML, Machine Learning: employs deep learning or other techniques, includes “big data”
• CI, Communication-Intensive: requires high-bandwidth/low-latency interconnect for frequent, tightly coupled messaging
• IA, Industry Applicable: researcher has private sector collaborators or results directly applicable to industry
Computational biophysics bridges gaps across scales. Blue Waters took us into and across these key “capability gaps”, engaging all-atom & coarse-grained MD to give unseen views into the inner workings of cells at the molecular level. E.g., can we understand the drug target in its real environment? Can we understand the molecular and chemical mechanisms underlying disease?
Tom Cheatham, University of Utah
/ OL15 is 0.44 Å from the NMR average of the Dickerson DNA dodecamer
• Reproducibility and convergence (ensembles, replica exchange): we can overcome the sampling problem for modest systems (tetraloops and other RNA motifs)
• Force field assessment, validation, and optimization
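The 0.44 Å figure on this slide is a root-mean-square deviation (RMSD) between a simulated structure and the NMR average. As a minimal sketch of that metric, assuming the two coordinate sets are already superimposed and atom-matched (the function name and toy coordinates are illustrative and not from the talk):

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD, in the input units, between two pre-aligned, atom-matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical toy coordinates in angstroms (not real DNA atoms).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model = [(0.3, 0.0, 0.0), (1.0, 0.4, 0.0), (0.0, 1.0, 0.0)]
print(f"RMSD = {rmsd(reference, model):.2f} A")
```

In practice the structures would first be optimally superimposed (for example with a least-squares fit); that step is omitted here to keep the sketch short.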
Allosteric Dynamics of C-type Inactivation in the KcsA Potassium Channel. Inner gate opening shifts the selectivity filter (SF) conformational preference from the conductive to the constricted conformation. Benoît Roux, Jing Li, Eduardo Perozo. [Figure panels: Pinched, Conductive]
These waters are now visible in a new high-resolution structure of the open-inactivated KcsA (Perozo & Cuello, private communication)
Hepatitis B: the viral capsid is a semi-permeable container with charge selectivity. Sodium (+) translocates five times faster than chloride (-) in HBV capsids. Analysis of 6 M solvent particles in parallel on Blue Waters. 230 Blue Waters XK nodes for 6 months.
HBV flexibility reveals complex dynamics Hepatitis B virus capsid Hadden, JA., Perilla, JR. et al. eLife (2018)
Bottom-up biology of an entire photosynthetic cell organelle! [Figure labels: electron dynamics, conformational dynamics, molecular dynamics, Brownian dynamics] JACS 138, 12077 (2016); eLife 5, e09541 (2016); JACS 139, 293 (2017)
Large-Scale Coarse-Grained Molecular Simulations of the Viral Lifecycle of HIV-1. Gregory A. Voth, University of Chicago
The immature HIV-1 assembly process is catalyzed by scaffolds:
• Membrane deformation co-localizes & promotes assembly
• RNA co-localizes protein & promotes assembly
Pak, Grime, … and Voth. PNAS 114:E10056 (2017)
[Figure panels: HIV-1 virion; HIV-1 capsid with 186 hexamers and 12 pentamers]
HIV capsid contains 186 hexamers and 12 pentamers. HIV capsid: 4.2 million atoms, 1300+ proteins.
• Ions permeate through capsid
• Contact points control curvature
• Acoustic analysis reveals allostery between distant sites
[Figure residue labels: A204, I201, E213, K203]
Perilla and Schulten. Nat. Commun. (2017), 15959
How membrane organization controls influenza infection: simulation & experiment Simulations yield a new molecular organizing principle for cholesterol that controls influenza binding and infection. Zawada…Kasson, 2016; Goronzy…Kasson, 2018.
3D structural data to build visible virtual cells
Serial Section EM: resin-embedded samples
Serial Block EM: resin-embedded tissues
• Routine dataset is 1.2 trillion pixels
• 100,000’s of structures in a single dataset
Extending Molecular Structure to Cellular Environments
Cell-centered, data-centric modeling framework
Moving from single protein to whole virus
Fully atomic reconstructions: PyMolecule, LipidWrapper, CellPACK. Alasdair Steven, NIH
• Improved sense of the physical arrangement of biological entities in complex biological milieu
• Enables simultaneous study of multiple components
• Mesoscale molecular models as a platform for other simulation approaches (e.g., Brownian dynamics, MCell, lattice Boltzmann MD)
… leads us to new avenues of investigation, not possible on the single protein scale
Johnson et al., Nature Methods (2014); Durrant & Amaro, PLOS Comp Bio (2014).
[Timeline, 2013 to 2016, with phases labeled: System & tool building, waiting; Equilibration, membrane; Prop. Rev.; Prod]
• 114,688 processors (16,384 Blue Waters nodes)
• 160 million atoms, 25.6 steps/s or ~4.5 ns/day
• 120 ns total, 12 TB with explicit solvent
Durrant and Amaro, unpublished (2017)
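The step rate and the ns/day figure on this slide are linked by the integration timestep, which the slide does not state. The sketch below assumes 2 fs, a common choice for all-atom MD, and checks that this assumption is consistent with the quoted ~4.5 ns/day. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400

def ns_per_day(steps_per_second, timestep_fs):
    """Convert an MD step rate into simulated nanoseconds per wall-clock day."""
    fs_per_day = steps_per_second * SECONDS_PER_DAY * timestep_fs
    return fs_per_day / 1e6  # 1 ns = 1e6 fs

def days_for(total_ns, rate_ns_per_day):
    """Wall-clock days of uninterrupted running needed to reach total_ns."""
    return total_ns / rate_ns_per_day

# The 2 fs timestep is an assumption, not a value taken from the slide.
rate = ns_per_day(25.6, timestep_fs=2.0)
print(f"{rate:.2f} ns/day")
print(f"{days_for(120, rate):.1f} days of continuous running for 120 ns")
```

Under that assumption, 120 ns corresponds to roughly four weeks of uninterrupted running on the full allocation, before queue waits and restarts are counted.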
Petascale MD of Fully Enveloped Flu Virus
• Largest biological system ever simulated (~165 million atoms)
• 4.5 ns/day using 114,688 CPUs
• 158 ns total simulation
• Saving every 20 ps → ~25 TB of data
• Collaboration with TCBG P41
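The frame count behind the data-volume bullet follows directly from the simulation length and save interval. The sketch below also estimates a coordinates-only trajectory size, assuming single-precision (4-byte) coordinates; that storage format is an assumption, and the slide's ~25 TB figure may include outputs not modeled here.

```python
def frame_count(total_ns, save_interval_ps):
    """Number of saved frames for a given simulation length and save interval."""
    return round(total_ns * 1000 / save_interval_ps)

def trajectory_bytes(n_atoms, n_frames, bytes_per_coord=4):
    """Coordinates-only size: x, y, z per atom per frame (precision is an assumption)."""
    return n_atoms * 3 * bytes_per_coord * n_frames

frames = frame_count(158, 20)
size_tb = trajectory_bytes(165_000_000, frames) / 1e12
print(f"{frames} frames, about {size_tb:.1f} TB of coordinates alone")
```

Even this lower bound shows why analysis of such trajectories has to run in parallel next to the data rather than on a workstation.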
Cell-scale Markov state models of protein dynamics [Figure states: Active, Inactive]
• Markov state models define metastable states and transitions between states
• Allows one to extract long-timescale dynamics from many short-timescale simulations
Swope, Pande, Schutte, Noe…
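To make the idea of combining many short simulations concrete, here is a minimal sketch that estimates a transition matrix from discretized trajectories by counting transitions at a fixed lag and normalizing each row. It is a plain count estimator, not the reversible maximum-likelihood estimators used in production MSM work, and the toy trajectories are invented for illustration (0 = active, 1 = inactive).

```python
def transition_matrix(trajectories, n_states, lag=1):
    """Row-normalized transition counts at the given lag, pooled over all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair each frame with the frame `lag` steps later.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical short trajectories of state labels (0 = active, 1 = inactive).
short_runs = [[0, 0, 0, 1, 1], [1, 1, 0, 0, 0], [0, 1, 1, 1, 1]]
T = transition_matrix(short_runs, n_states=2)
for row in T:
    print([round(p, 3) for p in row])
```

No single run here samples the full back-and-forth dynamics, yet the pooled counts still give an estimate of both transition probabilities, which is the property the slide highlights.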
MSMs characterize loop dynamics & druggable pockets
Virion has 30 NAs, 236 HAs: enough sampling to make a Markov state model (MSM) of NA loop dynamics
2-state macrostate model (open/closed). MFPT for the 150-loop:
• open to closed: 52.9 ns
• closed to open: 198.4 ns
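The two mean first-passage times (MFPTs) also fix the equilibrium balance between open and closed states, if one assumes simple two-state first-order kinetics in which each rate is the inverse of the corresponding MFPT. That kinetic model is an assumption made here for illustration; the slide reports only the MFPTs.

```python
def two_state_summary(mfpt_open_to_closed, mfpt_closed_to_open):
    """Equilibrium populations and relaxation time for a two-state first-order model."""
    k_close = 1.0 / mfpt_open_to_closed   # open -> closed rate
    k_open = 1.0 / mfpt_closed_to_open    # closed -> open rate
    p_open = k_open / (k_open + k_close)
    p_closed = k_close / (k_open + k_close)
    relaxation = 1.0 / (k_open + k_close)
    return p_open, p_closed, relaxation

p_open, p_closed, tau = two_state_summary(52.9, 198.4)
print(f"open: {p_open:.1%}, closed: {p_closed:.1%}, relaxation time: {tau:.1f} ns")
```

Because closing is faster than opening, the model puts the 150-loop in the closed state most of the time, with the open state still populated a substantial minority of the time.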
Molecular simulation at the mesoscale [Scale labels: 10 nm, 100 nm, 1000 nm (1 μm)]
Biophysics is ready for exascale!
Acknowledgements http://amarolab.ucsd.edu http://nbcr.ucsd.edu
Dedicated to Klaus Schulten, 1947-2016 “Why? … because we can.”