


  1. Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective
     Chengqiang Lu†, Qi Liu†*, Chao Wang†, Zhenya Huang†, Peize Lin‡, Lixin He‡
     †Anhui Province Key Lab. of Big Data Analysis and Application, University of Science and Technology of China
     ‡China Key Laboratory of Quantum Information, University of Science and Technology of China
     AAAI 2019

  2. Contents: 01 Introduction, 02 Related Work, 03 MGCN, 04 Experiment

  3. Introduction

  4. 01 Introduction: Material Discovery Paradigms
     [Figure: feedback cycle through material concept, molecular synthesis, device construction, and testing & characterization.]
     Source: Sanchez-Lengeling, et al. "Inverse molecular design using machine learning: Generative models for matter engineering." Science, 2018.

  5. Applications: material discovery, medicine design, food development

  6. 01 Introduction: The Most Time-consuming Step
     Material concept → molecular synthesis
     To find a molecule with the desired properties, we need to explore a molecule database (e.g., GDB-17) and predict molecular properties.

  7. 01 Introduction: Our Task
     Input (molecule) → Output (properties)
     Properties:
     • U0 (atomization energy at 0 K)
     • U (atomization energy at room temperature)
     • H (enthalpy at room temperature)
     • G (free energy of atomization)
     • ...
     Reference: Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J.-L. "Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17." J. Chem. Inf. Model., 2012.

  8. 01 Introduction: Challenges
     • Molecular quantum interactions are highly complex and hard to model.
     • Labeled molecule data is scarce, so the prediction approach must be generalizable.
     • The molecule data is imbalanced: most molecules are small and few are large, so the model should be transferable.

  9. Related Work

  10. 02 Related Work: DFT (Density Functional Theory)
      • A classic physical method dating back to the 1960s.
      • States that the quantum interactions between particles (e.g., atoms) create the correlation and entanglement of molecules, which are closely related to their inherent properties.
      • Pros: accurate; widely used.
      • Cons: extremely time-consuming.
      References:
      • Behler, Jörg. "Representing potential energy surfaces by high-dimensional neural network potentials." Journal of Physics: Condensed Matter, 2014.
      • Cubuk, Ekin D., et al. "Representations in neural network based empirical potentials." The Journal of Chemical Physics, 2017.

  11. 02 Related Work: Traditional ML Models
      Representations:
      • BOB (bag of bonds)
      • Coulomb matrix
      • HDAD (histogram of distance, angle and dihedral angle)
      Models:
      • Kernel ridge regression
      • Random forest
      • Elastic Net
      Cons:
      • Hand-crafted features require much domain expertise.
      • Restricted in practice.
      Reference: Faber, F. A.; Hutchison, L.; Huang, B.; Gilmer, J.; Schoenholz, S. S.; Dahl, G. E.; Vinyals, O.; Kearnes, S.; Riley, P. F.; and von Lilienfeld, O. A. 2017. "Prediction errors of molecular machine learning models lower than hybrid DFT error." Journal of Chemical Theory and Computation.

  12. 02 Related Work: Deep Neural Networks I
      Use grid-like data as input:
      1. Images
      2. Text
      3. Sphere
      • Can reuse models from CV/NLP.
      • The grid-like transformation usually causes information loss.
      References:
      1. KDD’18. ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction
      2. ACS’18. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
      3. NIPS’17. Spherical convolutions and their application in molecular modelling

  13. 02 Related Work: Deep Neural Networks II
      Use graph-like data as input:
      • Deep Tensor Neural Network
      • SchNet
      • Message Passing Neural Network
      Observations:
      • Implement the convolution operator on graphs.
      • Achieve some superior experimental results.
      • Do not utilize the multilevel property.
      • Poor generalizability and transferability.
      References:
      • Nature Comm’17. Quantum-chemical insights from deep tensor neural networks
      • NIPS’17. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
      • ICML’17. Neural Message Passing for Quantum Chemistry

  14. 02 Problem Definition

  15. Multilevel Graph Convolutional Network (MGCN)

  16. Potential Energy Surfaces • Behler, Jörg. "Representing potential energy surfaces by high-dimensional neural network potentials." Journal of Physics: Condensed Matter 26.18 (2014): 183001. • Cubuk, Ekin D., et al. "Representations in neural network based empirical potentials." The Journal of Chemical Physics 147.2 (2017): 024104.

  17. Atom-centered symmetry functions
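
      As a rough illustration of the idea named on this slide, the sketch below computes a radial atom-centered symmetry function in the form introduced in the Behler reference cited on the previous slide: a sum of Gaussians of the neighbor distances, damped by a cosine cutoff. The function names, the parameter values, and the coordinates are assumptions chosen for the example; they do not come from the slides.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, decays smoothly to 0 at r = r_c."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def radial_symmetry(i, coords, eta, r_s, r_c):
    """Radial symmetry function of atom i.

    Sums exp(-eta * (r_ij - r_s)^2) * cutoff(r_ij) over all other atoms j.
    eta (width), r_s (shift) and r_c (cutoff radius) are hyperparameters.
    """
    total = 0.0
    for j, p_j in enumerate(coords):
        if j == i:
            continue
        r = math.dist(coords[i], p_j)
        total += math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
    return total


# Hypothetical three-atom arrangement (angstroms), for illustration only.
example_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
g_atom0 = radial_symmetry(0, example_coords, eta=1.0, r_s=1.0, r_c=4.0)
```

      The value depends only on interatomic distances, so it does not change when the molecule is translated or rotated.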

  18. Overview

  19. Input
      Example: CH2O2, N = 5 (atoms)
      • Atom list: [C, H, H, O, O], 1×N
      • Edge matrix: N×N
      • Distance matrix: N×N
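
      The three inputs on this slide can be sketched as plain Python structures. This is a minimal illustration, not the authors' data pipeline: the coordinates are rough, hypothetical values for a formic-acid-like geometry, and encoding the edge matrix as bond orders is an assumption (the slide only gives its shape).

```python
import math

# Atom list from the slide, length N = 5.
atoms = ["C", "H", "H", "O", "O"]

# Hypothetical planar coordinates in angstroms (illustrative, not measured).
coords = [
    (0.00, 0.00, 0.0),    # 0: C
    (-0.60, -0.90, 0.0),  # 1: H attached to C
    (1.75, -0.60, 0.0),   # 2: H attached to the hydroxyl O
    (-0.60, 1.05, 0.0),   # 3: O (carbonyl)
    (1.30, 0.25, 0.0),    # 4: O (hydroxyl)
]

# Assumed edge encoding: bond order between atom indices, 0 means no bond.
bonds = {(0, 1): 1, (0, 3): 2, (0, 4): 1, (4, 2): 1}


def edge_matrix(n, bond_orders):
    """Build a symmetric N x N matrix of bond orders."""
    matrix = [[0] * n for _ in range(n)]
    for (i, j), order in bond_orders.items():
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


def distance_matrix(points):
    """Build the symmetric N x N matrix of Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


edges = edge_matrix(len(atoms), bonds)
dist = distance_matrix(coords)
```

      Both matrices are symmetric with one row and one column per atom, matching the N×N shapes listed on the slide.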

  20. Pre-processing
      Embedding layer: generates the initial representations of atoms and edges.
      • Atom embedding: $A^{0}$, of shape $N \times K$
      • Edge embedding: $E$, of shape $N \times N \times K$
      Radial basis function (RBF) layer: converts the distance matrix into a robust distance tensor.
      • $g$: the RBF function
      • Distance tensor: $D$, of shape $N \times N \times K$
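
      One common way to turn a distance matrix into a distance tensor is to expand every scalar distance over a set of Gaussian radial basis functions. The sketch below assumes that form; the slide does not state which RBF is used, and the evenly spaced centers, the cutoff, and the width `gamma` are illustrative choices.

```python
import math


def rbf_expand(d, num_centers=9, cutoff=4.0, gamma=10.0):
    """Expand one distance into a vector of Gaussian RBF responses.

    Centers are evenly spaced on [0, cutoff]; each entry measures how close
    d is to one center. All defaults are assumptions for this example.
    """
    step = cutoff / (num_centers - 1)
    return [math.exp(-gamma * (d - k * step) ** 2) for k in range(num_centers)]


def distance_tensor(dist):
    """Map an N x N distance matrix to an N x N x num_centers tensor."""
    return [[rbf_expand(d) for d in row] for row in dist]


# Tiny two-atom example.
tensor = distance_tensor([[0.0, 1.1], [1.1, 0.0]])
```

      Each distance becomes a smooth, fixed-length vector, so nearby distances get similar representations instead of a single raw number.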

  21. Interaction Layers
      In each interaction layer, the model generates the atomic representations at a higher level and updates the edge representations.
      The multilevel representations are then aggregated and passed to the Readout Layer.
      In detail: [slide content not preserved]

  22. Readout Layer
      Thanks to the additivity and locality of molecular properties, we can process the parts of the final molecular representation separately and then sum the results.
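
      The additive readout described here can be sketched as follows: each final representation is mapped to a scalar contribution by a shared function, and the contributions are summed. The linear map and its weights below are hypothetical stand-ins for whatever learned network the model actually uses.

```python
def contribution(rep, weights, bias=0.0):
    """Map one representation vector to a scalar contribution.

    A shared linear map is used here purely as a stand-in (assumption).
    """
    return sum(w * x for w, x in zip(weights, rep)) + bias


def readout(reps, weights, bias=0.0):
    """Sum the independent contributions into one molecular prediction."""
    return sum(contribution(rep, weights, bias) for rep in reps)


# Hypothetical final representations for three atoms, two features each.
reps = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]]
weights = [0.5, 0.25]
prediction = readout(reps, weights)
```

      Because the result is a plain sum, it does not depend on the order in which atoms are listed, and the prediction for two non-interacting fragments is the sum of their individual predictions.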

  23. Discussion
      Generalizability:
      • Coordinates → distance tensor: translation and rotation invariance.
      • Element-wise operations: index invariance.
      • Dropout.
      Transferability:
      • First-level knowledge is independent of structure and spatial arrangement.
      • Pre-trained embedding.
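
      The invariance claims can be checked numerically. The sketch below rotates and translates a small set of hypothetical coordinates and compares the distance matrices, then reorders the atoms to show that the matrix entries are only permuted. The helper names, the coordinates, and the transformation parameters are assumptions made for the example.

```python
import math


def distance_matrix(points):
    """Symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


def rotate_z(p, theta):
    """Rotate a 3D point about the z-axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)


def translate(p, shift):
    """Shift a 3D point by a constant vector."""
    return tuple(a + b for a, b in zip(p, shift))


def max_abs_diff(m1, m2):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


# Hypothetical coordinates (angstroms).
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.3)]
moved = [translate(rotate_z(p, 0.7), (2.0, -1.0, 0.5)) for p in coords]

original = distance_matrix(coords)
after_motion = distance_matrix(moved)

# Re-indexing the atoms permutes rows and columns but changes no distance.
perm = [2, 0, 1]
reindexed = distance_matrix([coords[k] for k in perm])
```

      Rigid motions leave `original` and `after_motion` equal up to floating-point error, while re-indexing only shuffles entries, which is why element-wise operations followed by a sum are insensitive to atom order.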

  24. Experiment

  25. Data sets
      QM9
      • The most well-known data set
      • Contains 134k stable molecules
      • 13 different properties
      ANI-1
      • Contains 20 million unstable molecules
      • Only one property

  26. Conclusion
      • We propose a well-designed Multilevel Graph Convolutional Network (MGCN) for predicting molecular properties.
      • It models quantum interactions from a multilevel view, using the molecular graph as input.
      • The MGCN model is transferable and generalizable.

  27. Thanks for listening.
