



  1. Neural Network Model Chemistries RIKEN 3/25/2017 John Parkhill Department of Chemistry and Biochemistry, Notre Dame

  2. Parkhill Group: 5 students (K. Yao). Research areas: non-equilibrium real-time quantum electronic dynamics; applications of neural networks to electronic structure theory. [Figure: transient absorbance (a.u.) vs. energy (eV) and vs. wavelength (nm); density-matrix equation of motion]

  3. Realtime Spectra. Realtime transient absorption spectra (experiment, Vauthey, left panel). [Figure: ΔA vs. wavelength at pump-probe delays from 0 to 20 ps]

  4. Begging for Breakthroughs. Electronic timescale ~4×10⁻⁴ fs; 1 ps of dynamics is ~10⁷ Fock builds, i.e. 114 days at one build per second. Dynamics is sequential, and parallelization in time is weak.

  5. Three Models. • Orbital-free DFT • Neural networks + many-body expansion • A diatomics-in-molecules NN. Today: force fields for 2000+ atoms, density functional for <100 atoms, ab initio for <40 atoms. Soon: the same tiers, with neural networks added.

  6. Orbital-Free DFT. Hemoglobin: 16000 Daltons ≈ 16000 orbitals, vs. a limit of ~3000 orbitals for most Kohn-Sham codes. With orbital-free DFT you only need one orbital and get a ~10x speedup, but you must know a mysterious 'functional' which maps the density to the kinetic energy: E_kin = ∫ T(n(r)) dr. [Figure: time per SCF vs. number of Al atoms, Kohn-Sham vs. orbital-free]

  7. Chemistry Starts in the 4th Digit. GGA-type kinetic functionals: T_GGA = ∫ τ_TF(n(r)) F(|∇n(r)| / n(r)^(4/3)) dr. Accuracy ~1%, no shell structure, and no bonding! [Figure: energy (hartree) vs. N-N distance (bohr) for the KS, TF, GE2, LLP, and TW4 functionals] D. García-Aldea, J. E. Alvarellos, J. Chem. Phys. 127, 144109 (2007).
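To make the Thomas-Fermi and gradient-expansion ideas concrete, here is a minimal numpy sketch (my illustration, not from the talk) that evaluates T_TF, the von Weizsäcker term, and the second-order gradient expansion T_GE2 = T_TF + (1/9) T_vW on a radial grid for the hydrogen-atom 1s density, where the exact kinetic energy is 0.5 hartree; the grid and density are illustrative assumptions.

```python
import numpy as np

# Hydrogen-atom 1s density n(r) = exp(-2r)/pi on a radial grid (atomic units).
r = np.linspace(1e-4, 20.0, 20000)
dr = r[1] - r[0]
n = np.exp(-2.0 * r) / np.pi
dn = np.gradient(n, dr)                       # |grad n| is purely radial for a spherical density
w = 4.0 * np.pi * r**2 * dr                   # spherical quadrature weights

c_tf = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)  # Thomas-Fermi constant (3/10)(3 pi^2)^(2/3)
tau_tf = c_tf * n ** (5.0 / 3.0)              # Thomas-Fermi kinetic energy density
t_tf = np.sum(tau_tf * w)                     # T_TF = int tau_TF dr  (~0.29 Ha here)

t_vw = np.sum(0.125 * dn**2 / n * w)          # von Weizsaecker term, exact for a one-orbital density
t_ge2 = t_tf + t_vw / 9.0                     # second-order gradient expansion (GE2)

print(f"T_TF  = {t_tf:.4f} Ha")
print(f"T_vW  = {t_vw:.4f} Ha  (exact KE of the H 1s density is 0.5 Ha)")
print(f"T_GE2 = {t_ge2:.4f} Ha")
```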

  8. A Kohn-Sham Kinetic Energy Density. Local kinetic energy: Thomas-Fermi and von Weizsäcker based enhancement factors. Four types of F; they all display the shell structure; the TF-based F diverge at long distance, while the VW-based ones converge. [Figure: enhancement factor F vs. distance from nucleus (bohr) for F_sch-TF, F_plus-TF, F_sch-VW, F_plus-VW]

  9. CNN Version 0.1. Pseudo-2D input, motivated by computational limitations: the density along lines through each quadrature point is fed into a convolutional neural network. ~1 million quadrature points per atom, ~2000 inputs per quadrature point: barely tractable.

  10. Our Network. Thomas-Fermi reference: T_TF = (3/10)(3π²)^(2/3) ∫ n(r)^(5/3) dr. Learned functional: T{n(r)} = ∫ F{n(r), r'} τ_TF(r') dr'. ~4000 samples per grid point. Convolutional layers with ReLU activation: y^i_mn = f(b^i + Σ_(a=0..p) Σ_(b=0..q) w^(i-1,i)_ab y^(i-1)_((m+a)(n+b))), with f(x) = max(0, x). Future thoughts: 10⁶ samples per atom, 3D convolutional networks, basis sets.
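As a concrete reading of the convolution formula above, here is a minimal numpy sketch (an illustration, not the talk's actual network) of one valid-mode 2D convolutional layer with a ReLU nonlinearity; the input shape and kernel size are arbitrary assumptions.

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x), the activation in the formula above."""
    return np.maximum(0.0, x)

def conv2d_layer(y_prev, w, b):
    """One convolutional layer:
    y[m, n] = f(b + sum_{a, c} w[a, c] * y_prev[m + a, n + c])  (valid convolution)."""
    p, q = w.shape
    H, W = y_prev.shape
    out = np.empty((H - p + 1, W - q + 1))
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            out[m, n] = np.sum(w * y_prev[m:m + p, n:n + q]) + b
    return relu(out)

# Toy usage: a pseudo-2D "density along lines" input of assumed shape.
rng = np.random.default_rng(0)
y0 = rng.random((50, 40))
w1 = rng.normal(scale=0.1, size=(3, 3))
y1 = conv2d_layer(y0, w1, b=0.0)   # shape (48, 38)
print(y1.shape)
```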

  11. Finding a Functional. Learn F as a function of n(r), given as a slice of the surface: T{n(r)} = ∫ F{n(r), r'} τ_TF(r') dr'. [Figure: Kohn-Sham vs. neural network]
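In practice the integral becomes a weighted sum over the quadrature grid; the sketch below (my illustration, with a stand-in model where the talk uses the CNN) shows how per-point predictions of F are assembled into a total kinetic energy.

```python
import numpy as np

def kinetic_energy(features, tau_tf, weights, predict_F):
    """T{n} ~= sum_k F(features_k) * tau_TF(r_k) * w_k over grid points k."""
    F = predict_F(features)              # one enhancement factor per grid point
    return np.sum(F * tau_tf * weights)

# Stand-in values; a constant F = 1 recovers plain Thomas-Fermi.
rng = np.random.default_rng(1)
n_pts = 1000
features = rng.random((n_pts, 8))        # assumed per-point density features
tau_tf = rng.random(n_pts)               # tau_TF evaluated on the grid (placeholder values)
weights = np.full(n_pts, 1e-3)           # quadrature weights (placeholder)
print(kinetic_energy(features, tau_tf, weights, predict_F=lambda x: np.ones(len(x))))
```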

  12. The Shape of the Error. Enhancement factor in the C-C bonding plane for C₂H₂, C₂H₄, and C₂H₆. [Figure panels: accurate (KS), neural net, and error (NN-KS)]

  13. Reproducing Potential Energy Surfaces

  14. Self-Consistent Densities. [Figure: errors of the NN's optimal density in hydrogen]

  15. Getting Serious. • Better density inputs. • Gradients (which require tight integration between the electronic structure code and the NN). • Some architecture for training data. [Workflow: PySCF and TensorMol: Train, Sample, Digest, Evaluate]

  16. What is TensorMol? A set of chemical routines (90% Python, 10% C++) on top of TensorFlow. Capabilities: • Behler-Parrinello, many-body expansion, etc. • Various network types (fully connected, convolutional, 3D). • A variety of digesters (Coulomb, symmetry functions, radial × spherical harmonics). • A variety of outputs (energy, force, probability). • Some gradients. • Integration with PySCF for ab initio energies, Coulomb integrals, etc.

  17. A Heretical Model. Take some crystal structures, define a Gō-type potential, sample its normal modes, and learn its force. Then optimize other molecules.

  18. A Heretical Model
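A minimal sketch of the data-generation step described above, under my own assumptions (a harmonic stand-in for the Gō-type potential and unit masses): diagonalize the Hessian at a reference geometry and draw thermal displacements along the normal modes to produce training geometries and forces.

```python
import numpy as np

def sample_normal_modes(x0, hessian, kT=0.001, n_samples=100, seed=0):
    """Sample geometries around x0 (shape (3N,)) from the harmonic potential
    V(x) = 0.5 (x - x0)^T H (x - x0), and return (geometries, forces)."""
    rng = np.random.default_rng(seed)
    evals, modes = np.linalg.eigh(hessian)
    keep = evals > 1e-8                      # drop translations/rotations (zero modes)
    evals, modes = evals[keep], modes[:, keep]
    # Classical thermal amplitudes along each mode: <q_i^2> = kT / omega_i^2
    q = rng.normal(size=(n_samples, evals.size)) * np.sqrt(kT / evals)
    geoms = x0 + q @ modes.T
    forces = -(geoms - x0) @ hessian         # F = -H (x - x0) for the harmonic model
    return geoms, forces

# Toy usage with a random positive-definite stand-in Hessian for 3 atoms (9 coordinates).
rng = np.random.default_rng(1)
A = rng.normal(size=(9, 9))
H = A @ A.T + np.eye(9)
geoms, forces = sample_normal_modes(np.zeros(9), H)
print(geoms.shape, forces.shape)
```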

  19. Physical Inputs with Invariance. How do you parameterize a molecule with the right invariances and retain invertibility? How do you express atomic-number differences while avoiding separate channels? Candidates: a Coulomb matrix? A depth map sorted by distance or atomic number? Symmetry functions G(r₁, r₂, r₃, ...)? [Figure: 128×128 depth-map image; Behler-Parrinello symmetry-function channels G¹_O, G¹_H, G²_OO, G²_HO, G²_HH, ...]
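For concreteness, here is a minimal numpy sketch (my illustration; the parameter values η, R_s, R_c are arbitrary assumptions) of Behler-Parrinello radial symmetry functions G² with the standard cosine cutoff, evaluated per atom from Cartesian coordinates.

```python
import numpy as np

def cutoff(r, rc):
    """Behler-Parrinello cosine cutoff f_c(r)."""
    return np.where(r < rc, 0.5 * (np.cos(np.pi * r / rc) + 1.0), 0.0)

def g2(coords, i, eta, rs, rc):
    """Radial symmetry function for atom i:
    G2_i = sum_{j != i} exp(-eta (r_ij - rs)^2) f_c(r_ij)."""
    rij = np.linalg.norm(coords - coords[i], axis=1)
    rij = np.delete(rij, i)
    return np.sum(np.exp(-eta * (rij - rs) ** 2) * cutoff(rij, rc))

# Toy usage: a water-like geometry (angstroms), one G2 value per atom.
coords = np.array([[0.00, 0.00, 0.00],     # O
                   [0.96, 0.00, 0.00],     # H
                   [-0.24, 0.93, 0.00]])   # H
descriptors = [g2(coords, i, eta=4.0, rs=1.0, rc=6.0) for i in range(len(coords))]
print(descriptors)
```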

  20. Physical Inputs with Invariance. f_nlm(x, y, z) = R_n Y_lm, with R_n = exp(-(r - r₀)² / (2σ²)). Real space vs. embedded space: this embedding is reversible, so one can go between the geometry and the embedded geometry in either direction.

  21. Generative Adversarial Models. Depth-of-field map for an MD trajectory of 3 methanols: a way to create a set of nonlinear modes with which to sample chemical space. [Figure: 128×128 depth maps]

  22. My Personal Favorite. Embedding for a given atom: |rlm⟩ = Σ_atoms f_r(x, y, z) Y_lm(x, y, z) × (atomic number).
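A minimal numpy sketch of this kind of atom-centered embedding (my reading of the formulas on slides 20 and 22, truncated to l = 0 and l = 1 real harmonics and Gaussian radial shells; the shell centers and width σ are assumed values): each channel is a sum over neighboring atoms of a radial function times a spherical harmonic, weighted by atomic number.

```python
import numpy as np

def embed_atom(center, coords, charges, r_shells=(1.0, 2.0, 3.0), sigma=0.5):
    """Return channels <r,l,m> = sum_atoms Z * R_n(r) * Y_lm(direction),
    with R_n(r) = exp(-(r - r_n)^2 / (2 sigma^2)) and real Y_lm up to l = 1."""
    d = coords - center
    r = np.linalg.norm(d, axis=1)
    mask = r > 1e-8                          # skip the central atom itself
    d, r, z = d[mask], r[mask], charges[mask]
    unit = d / r[:, None]
    # Real spherical harmonics up to l = 1 (unnormalized): 1, x/r, y/r, z/r.
    ylm = np.column_stack([np.ones_like(r), unit[:, 0], unit[:, 1], unit[:, 2]])
    channels = []
    for rn in r_shells:
        radial = np.exp(-(r - rn) ** 2 / (2.0 * sigma ** 2))
        channels.append(np.sum(z[:, None] * radial[:, None] * ylm, axis=0))
    return np.concatenate(channels)          # length: n_shells * 4

# Toy usage: embed the oxygen of a water-like geometry.
coords = np.array([[0.00, 0.00, 0.00], [0.96, 0.00, 0.00], [-0.24, 0.93, 0.00]])
charges = np.array([8.0, 1.0, 1.0])
print(embed_atom(coords[0], coords, charges))
```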

  23. Partitioning of the Energy. • Behler-Parrinello: E = Σ_atoms E_atom; backpropagates through per-element atom networks with only one total energy. • Many-body expansion: E = Σ_molecules E_mol + Σ_pairs E_pair + ...; uses separate monomer, dimer, etc. training data. • Diatomics-in-molecules NN: E = Σ_bonds E_bond; like Behler-Parrinello, but bond energies vary less.
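Here is a minimal numpy sketch (my illustration, with tiny random "networks" standing in for trained models) of the Behler-Parrinello partition: one network per element maps each atom's descriptor to an atomic energy, and the atomic energies sum to the single total energy that is fit.

```python
import numpy as np

def make_atom_net(n_in, n_hidden=8, seed=0):
    """A tiny fully connected atom network (random weights as a stand-in for training)."""
    rng = np.random.default_rng(seed)
    w1, b1 = rng.normal(scale=0.1, size=(n_in, n_hidden)), np.zeros(n_hidden)
    w2, b2 = rng.normal(scale=0.1, size=(n_hidden, 1)), np.zeros(1)
    def net(x):
        h = np.maximum(0.0, x @ w1 + b1)     # ReLU hidden layer
        return (h @ w2 + b2)[0]
    return net

# One network per element; the total energy is the sum of per-atom outputs.
nets = {"O": make_atom_net(4, seed=1), "H": make_atom_net(4, seed=2)}

def total_energy(elements, descriptors):
    """E = sum_atoms E_atom(element, descriptor)."""
    return sum(nets[el](d) for el, d in zip(elements, descriptors))

rng = np.random.default_rng(3)
elements = ["O", "H", "H"]
descriptors = rng.random((3, 4))             # e.g. symmetry-function vectors per atom
print(total_energy(elements, descriptors))
```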

  24. Neural Network PESs. E_total = Σ E_one-body + Σ E_two-body + Σ E_three-body + ... [Figures: wall time vs. number of atoms for the NN-MP2 MBE vs. the RI-MP2 MBE; histogram of many-body energy errors (a.u.); NN vs. MP2 three-body energies (a.u.)]
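The many-body expansion itself is easy to state in code; the sketch below (my illustration, with a placeholder pairwise potential in place of NN or MP2 fragment energies) assembles the total energy from monomer, dimer, and trimer fragments.

```python
from itertools import combinations
import numpy as np

def fragment_energy(coords):
    """Placeholder fragment energy: a simple pairwise -1/r attraction between sites.
    In the scheme above this would be a neural-network (or MP2) fragment energy."""
    e = 0.0
    for i, j in combinations(range(len(coords)), 2):
        e -= 1.0 / np.linalg.norm(coords[i] - coords[j])
    return e

def mbe_energy(monomers, order=3):
    """Many-body expansion: E = sum E_i + sum dE_ij + sum dE_ijk + ..."""
    corrections = {(i,): fragment_energy(m) for i, m in enumerate(monomers)}
    energy = sum(corrections.values())
    for n in range(2, order + 1):
        for combo in combinations(range(len(monomers)), n):
            e_frag = fragment_energy(np.vstack([monomers[i] for i in combo]))
            # Subtract every lower-order contribution contained in this fragment.
            d = e_frag - sum(corrections[sub]
                             for k in range(1, n)
                             for sub in combinations(combo, k))
            corrections[combo] = d
            energy += d
    return energy

# Toy usage: three single-site "monomers".
monomers = [np.array([[0.0, 0.0, 0.0]]),
            np.array([[3.0, 0.0, 0.0]]),
            np.array([[0.0, 3.0, 0.0]])]
print(mbe_energy(monomers))
```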

  25. Cluster Accuracy. [Figure: energy (a.u.) vs. distance (Å) for NN, MP2, and AMOEBA09]

  26. Cluster accuracy

  27. Cluster Accuracy. [Figure: errors (×10⁻⁴ a.u.) and total energies (a.u.) for MP2-MBE, NN-MBE, AMOEBA09, and RI-MP2]

  28. Cancellation of errors in large clusters

  29. Polarizable FFs on Notice. [Figure: wall time vs. number of water molecules; NN-MBE and AMOEBA09 in seconds vs. MP2-MBE in days]

  30. Forces: Provided by TensorFlow (vs. what we code). [Figure: water geometry optimization, energy (a.u.) vs. optimization step, converging from about -16488.2 to -16489.4 a.u.]
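A minimal TensorFlow sketch (my illustration, not TensorMol code) of the point being made: once the energy is a differentiable function of the coordinates, the forces come from automatic differentiation for free, here with tf.GradientTape and a toy pairwise energy in place of a trained network.

```python
import tensorflow as tf

def energy(coords):
    """Toy differentiable energy: a Lennard-Jones-like 1/r^12 - 1/r^6 sum over atom pairs.
    In TensorMol this would be the neural-network energy instead."""
    diff = coords[:, None, :] - coords[None, :, :]
    r2 = tf.reduce_sum(diff * diff, axis=-1)
    n = tf.shape(coords)[0]
    mask = 1.0 - tf.eye(n)                    # exclude self-pairs
    inv_r6 = mask / (r2 + tf.eye(n)) ** 3     # identity on the diagonal avoids divide-by-zero
    return 0.5 * tf.reduce_sum(inv_r6 ** 2 - inv_r6)

coords = tf.Variable([[0.0, 0.0, 0.0],
                      [1.2, 0.0, 0.0],
                      [0.0, 1.3, 0.0]])

with tf.GradientTape() as tape:
    e = energy(coords)
forces = -tape.gradient(e, coords)            # F = -dE/dx, provided by autodiff
print(e.numpy(), forces.numpy())
```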

  31. DIM-NN. Expresses the total molecular energy as a sum of bonds. Only requires total-energy training data. One network for each bond type.

  32. Accurate Total Energies. E_DFT: -939.6928 Ha; E_DIM-NN: -939.6927 Ha. Similar errors for vitamins B12, D3, etc. [Figure: molecular structure colored by bond energy, from -66 to -127 kcal/mol]

  33. The Space of Carbon-Carbon Bonds. 700,000 carbon-carbon bond energies.

  34. A Synthetic Chemist. [Figure: generated molecule]

  35. Conclusions. Because of the GPU dependencies and large datasets required, the most powerful ML PES methods are not yet in common use in chemistry. They will be soon. Over the next year, TensorMol and several other packages will appear in which users can "roll their own" ML PESs with minimal effort. These will compete heavily with DFT. The domain of chemical space which can be explored in a weekend is about to increase exponentially. THANKS!
