Trend in Computer-Aided Materials Discovery

Contents
• Trend in Computer-Aided Materials Discovery
• High-Throughput Computational Screening & Exhaustive Enumeration
• Deep-Learning-based Evolutionary Design
• Deep-Learning-based Inverse Design
• Efficacy of Computer-Aided Materials Discovery

Trend in Computer-Aided Materials Discovery
• For accelerated materials discovery
  – Conventional: Trial-and-Error (high cost); iterative experiments
  – [1st Gen.] Rationalization: First-principles Quantum Chemistry; Simulation (low throughput); pre-validation
  – [2nd Gen.] Efficiency: High-performance Computing; Virtual screening (low hit-rate); high throughput
  – [3rd Gen.] Intelligence: Machine Learning; Targeted design (high hit-rate); right solutions with minimum effort

Trend in Computer-Aided Materials Discovery
• Prediction of materials property based on machine learning
  – Build-up of Materials vs. Property DB → Materials Informatics
• Timeline (figure)
  – Introduction stage of cheminformatics
    • QSAR* ('62, Hansch & Fujita)
    • ANN** in Chemistry ('71)
    • SMILES*** ('87, Weininger)
  – Development stage of machine learning (Kernel methods, Bayesian approaches, Deep Learning)
    • Graph Kernels ('05 @ UC Irvine)
    • Bayesian Modeling ('09 @ MIT)
    • Deep Learning ('16 @ Stanford)
    • ('18 @ Harvard)
• Process of Machine Learning @ Materials Research: Descriptor → Training → Analysis
  – Descriptor types: vector, SMILES, fingerprint, graphs, images
  – SMILES: CC(C)NCC(O)COC1=CC(CC2=CC=CC=C2)=C(CC(N)=O)C=C1
  – Fingerprint: 011100011111101010010100100000101010001001010…
* QSAR: Quantitative Structure-Activity Relationship
** ANN: Artificial Neural Network
*** SMILES: Simplified Molecular-Input Line-Entry System

Trend in Computer-Aided Materials Discovery
• Materials design based on machine learning
  – Inverse QSAR → Inverse Design
• Timeline (figure)
  – Inverse QSAR (Late 80's~)
  – Genetic Algorithms ('92 @ Purdue)
  – Exhaustive Generation ('12 @ Tokyo)
  – Inverse Design ('16 @ SAIT)
  – SMILES Autoencoder ('16 @ Harvard)
  – GAN* for molecules ('17 @ Harvard)
• Approach labels in the figure: Combinatorial, Evolutionary, Autoencoder (Deep Learning / Generative Models)
• Focus on autonomous molecular generation
* GAN: Generative Adversarial Network

Trend in Computer-Aided Materials Discovery
• In-silico technologies for materials discovery
  – Elemental Technologies: Machine Learning, Materials Informatics, DB, Molecular Enumeration, Automated Simulation
  – Materials Discovery Methodologies: Inverse Design, Evolutionary Design, HTCS (High-Throughput Computational Screening)
  – [In] Targets → [Out] Molecules (target molecules)

High-Throughput Computational Screening & Exhaustive Enumeration
"Landscape of phosphorescent light-emitting energies of homoleptic Ir(III)-complexes predicted by a graph-based enumeration and deep learning", GI01.02.02, 2018 MRS Fall Meeting

High-Throughput Computational Screening
• Property prediction with high-performance computing for large-scale exploration of materials candidates
• Workflow (figure): Seed Fragments → Combination → Candidate Pool (large amounts of candidates) → Simulation → Database → Verification → Target Materials

High-Throughput Computational Screening
• ML (Machine Learning)-assisted HTCS for higher efficiency
  – (1) Simulation + ML
  – (2) Prioritizing calculation based on active learning
• Workflow (figure): Seed Fragments → Combination → Candidate Pool (large amounts of candidates) → Simulation + ML → Database → Verification → Target Materials
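A minimal sketch of step (2), prioritizing which candidates to simulate next. The slide does not say which active-learning criterion is used; this example assumes ensemble disagreement (standard deviation of predictions) as the priority score, and `prioritize`, `ensemble`, and `batch_size` are hypothetical names introduced for illustration.

```python
from statistics import pstdev

def prioritize(candidates, ensemble, batch_size):
    """Return the batch_size candidates on which the surrogate models disagree most.

    Assumption: disagreement across an ensemble stands in for model uncertainty.
    """
    scored = []
    for candidate in candidates:
        predictions = [model(candidate) for model in ensemble]
        scored.append((pstdev(predictions), candidate))
    # Highest disagreement first: these are simulated before the rest.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:batch_size]]

# Toy stand-ins: candidates are numbers, "models" are simple functions.
toy_ensemble = [lambda x: x, lambda x: 2 * x, lambda x: x + 1]
next_batch = prioritize([0, 1, 2, 3], toy_ensemble, batch_size=2)
```

In a real screening loop the selected batch would be simulated, added to the database, and the surrogate models retrained before the next round.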

High-Throughput Computational Screening
• Exhaustive enumeration based on graph theory
  – "Graphs"
    • Mathematical structures used to model pairwise relations between objects.
    • Made up of nodes and edges.
    • In chemistry, a graph is used to model a molecule, where nodes represent atoms and edges represent bonds.
※ Exhaustive enumeration: systematic enumeration of all possible molecules in the search for an optimal solution
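As a small illustration of the node/edge picture, the heavy atoms of ethanol (SMILES: CCO) can be written as a graph. The dictionary layout and the helper name `adjacency` are choices made for this sketch, not something taken from the slides.

```python
# Heavy-atom graph of ethanol (SMILES: CCO); hydrogens are omitted.
atoms = {0: "C", 1: "C", 2: "O"}   # nodes: atom index -> element symbol
bonds = [(0, 1), (1, 2)]            # edges: pairs of bonded atom indices

def adjacency(atoms, bonds):
    """Build an adjacency list: atom index -> list of bonded neighbours."""
    adj = {index: [] for index in atoms}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj

adj = adjacency(atoms, bonds)
# Number of edges at each node (the node degree).
degrees = {index: len(neighbours) for index, neighbours in adj.items()}
```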

High-Throughput Computational Screening
• Complete list of non-isomorphic graphs
  – Figure: table of graphs labeled by ID, number of edges, and number of edges at each node
  – Source: http://www.cadaeic.net/graphpics.htm

High-Throughput Computational Screening
• Landscape of phosphorescent light-emitting energies of homoleptic Ir(III)-complex core structures
  – Ir(III)-complexes
    • Widely used as phosphorescent OLED dopants.
    • Figuring out the full landscape of emission color is important for discovering high-performing molecules in target color regions.
References: New J. Chem., 39, 246 (2015); ACS Appl. Mater. Interfaces, 10, 1888–1896 (2018); Organic Electronics, 63, 244–249 (2018)

High-Throughput Computational Screening
• Approach
  – Treat the nodes of a graph as rings and the edges as ring connections.
  – Limit the total number of rings to between 3 and 5.
  – Exclude the non-planar type (5-21) and structures that are invalid as dopants.
  → Only 11 of the 29 graphs are valid.

High-Throughput Computational Screening
• Enumeration
  – For 5- and 6-membered rings.
  – Substitute some carbons of each molecule with nitrogen atoms (max. five).
  → Total 9,919,469 (~10M) core structures
• Steps (figure)
  1. Graphs
  2. Skeletons (total 405 EA)
  3. Set Iridium positions
  4. Substitute some carbon atoms with nitrogen atoms
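Step 4 can be sketched as choosing which carbon positions to replace with nitrogen, with at most five substitutions per skeleton. This is only a counting sketch under stated assumptions: it ignores symmetry-equivalent patterns and chemical validity, and it counts the unsubstituted skeleton as one pattern, so it gives an upper bound per skeleton rather than the slide's totals. The names `substitution_patterns` and `count_patterns` are hypothetical.

```python
from itertools import combinations
from math import comb

def substitution_patterns(n_sites, max_n=5):
    """Yield tuples of site indices to turn from C into N (0 to max_n sites)."""
    for k in range(max_n + 1):
        yield from combinations(range(n_sites), k)

def count_patterns(n_sites, max_n=5):
    """Closed-form count: sum of C(n_sites, k) for k = 0..max_n."""
    return sum(comb(n_sites, k) for k in range(max_n + 1))

# A skeleton with six substitutable carbons: 1 + 6 + 15 + 20 + 15 + 6 patterns.
patterns = list(substitution_patterns(6))
```

Summing such counts over all skeletons and iridium positions, then removing duplicates and invalid structures, is the kind of bookkeeping that leads to a total on the order of the ~10M structures reported on the slide.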

High-Throughput Computational Screening
• Property prediction
  – Trained a deep-neural-network model with simulated T1 data
    • Input: ECFP (Extended Connectivity FingerPrints) of molecular structures
    • Output: T1 energy (phosphorescent light-emitting wavelength)
  – Figure: mean absolute error of T1 of the DNN (eV, axis from 0 to 0.2) vs. size of the training dataset (10K to 80K)
  – With 80k training data, the average prediction error was less than 0.1 eV.
  – 80k / 10M = 0.8%
  → By simulating the properties of only 0.8% of the molecules, we can fully scan the chemical space of 10M!
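The 0.8% figure follows directly from the numbers on the slides. A short check, using the exact enumeration count as well as the rounded 10M; it counts simulations only and ignores the cost of training and running the DNN:

```python
n_total = 9_919_469   # enumerated core structures (Enumeration slide)
n_train = 80_000      # simulated molecules used as training data

fraction_exact = n_train / n_total         # about 0.0081
fraction_rounded = n_train / 10_000_000    # using the slide's rounded 10M
reduction_factor = n_total / n_train       # roughly 124x fewer simulations

print(f"{fraction_exact:.2%} of the pool simulated")
print(f"about {reduction_factor:.0f}x fewer simulations than simulating everything")
```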

High-Throughput Computational Screening
• Results
  – Distribution of T1 values
  – Blue-color emitting materials are rare compared with red and green
  – Figure: histogram of the number of molecules (×100,000) vs. predicted T1 (eV, 0.05 to 2.95); Red (18.4%), Green (4.3%), Blue (0.4%)

Conclusions
• In materials discovery, deep-learning-based HTCS is a good alternative to the conventional trial-and-error approach.
• Moreover, exhaustive enumeration makes it possible to systematically explore the whole chemical space.
• With the proposed exhaustive enumeration method based on graph theory and deep learning, the whole landscape of 10M phosphorescent Ir-dopants could be scanned at just 0.8% of the computational cost of a purely simulation-based approach.

Deep-Learning-based Evolutionary Design
"Evolutionary design of organic molecules based on deep learning and genetic algorithm", COMP, ACS Fall 2018 National Meeting

Evolutionary Design
• A generic population-based metaheuristic optimization technique
• Uses bio-inspired operators to reach near-optimal solutions: mutation, crossover, and selection in the case of a genetic algorithm
• Flowchart (figure): Initial population → Calculate fitness → Satisfy constraints? (Yes: Done; No: Selection → Crossover + Mutation → New population → Calculate fitness)
• Plot (figure): average fitness vs. generation
• Fitness landscape image: https://en.wikipedia.org/wiki/Fitness_landscape
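The loop in the flowchart can be sketched as a toy genetic algorithm on bit strings. Everything concrete here is an assumption made for illustration: the fitness function (number of 1 bits), the population size, one-point crossover, keeping the fitter half as parents, elitism, and stopping after a fixed number of generations instead of a constraint check.

```python
import random

def evolve(fitness, n_bits=16, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Toy genetic algorithm; returns the best individual and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)          # calculate fitness and rank
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]               # selection: keep the fitter half
        children = [pop[0][:]]                       # elitism: best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)           # one-point crossover
            child = a[:cut] + b[cut:]
            # mutation: flip each bit with probability p_mut
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children                               # new population
    return max(pop, key=fitness), history

best, history = evolve(fitness=sum)
```

Because the best individual is always copied into the next generation, the recorded best fitness never decreases from one generation to the next.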

Deep-Learning-Based Evolutionary Design
• Proposed approach

                        Conventional             Proposed
  Molecular Descriptor  Graph or ASCII string    Bit string (ECFP) + RNN
  Molecular Evolution   Heuristic                Random
  Fitness Evaluation    Simple assessment        DNN

• Expectations
  – Prevent heuristic bias
  – Secure chemical validity
  – Versatile evaluation is possible
• Workflow (figure): DB → Seed molecule (ECFP) → Mutation (n=50) → Decoding to SMILES (RNN) → Inspection of chemical validity → Fitness evaluation (DNN) → Selection → Parents → Evolution (Crossover → Mutation) → Decoding to SMILES (RNN) → Inspection of chemical validity → Fitness evaluation (DNN) → Iteration → Best-fit molecule
* ECFP (Extended Connectivity FingerPrint), DNN (Deep Neural Network), RNN (Recurrent Neural Network), SMILES (Simplified Molecular-Input Line-Entry System)
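One pass of the workflow (mutate the seed fingerprint, decode, check validity, score) might look like the sketch below. The RNN decoder, the validity check, and the DNN scorer are passed in as plain functions, and the toy stand-ins at the bottom are placeholders, not the models from the talk. Reading "n=50" as the number of mutated children is an assumption, as are the names `mutate` and `one_generation` and the number of bits flipped per mutation.

```python
import random

def mutate(bits, n_flips, rng):
    """Flip n_flips randomly chosen positions of a fingerprint bit vector."""
    child = bits[:]
    for i in rng.sample(range(len(bits)), n_flips):
        child[i] ^= 1
    return child

def one_generation(seed_bits, decode, is_valid, fitness, n_children=50, n_flips=1, rng=None):
    """Mutate, decode to SMILES, drop invalid molecules, and rank the rest by fitness."""
    rng = rng or random.Random(0)
    scored = []
    for _ in range(n_children):
        child = mutate(seed_bits, n_flips, rng)
        smiles = decode(child)                      # stands in for the RNN decoder
        if smiles is not None and is_valid(smiles): # stands in for the validity inspection
            scored.append((fitness(child), child, smiles))  # fitness stands in for the DNN
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored

# Toy placeholders so the sketch runs end to end.
toy_decode = lambda bits: "C" * sum(bits) or None
toy_is_valid = lambda smiles: len(smiles) > 0
ranked = one_generation([0] * 8, toy_decode, toy_is_valid, fitness=sum)
```

The top-ranked entries would then serve as parents for the crossover and mutation steps of the next iteration.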

Deep Learning-Based Evolutionary Design
• Deep learning models
  – [DNN] 3 hidden layers, 500 hidden units in each layer
  – [RNN] 3 hidden layers, 500 long short-term memory units
• DNN model (figure): Input (ECFP*) → Output (Properties)
• RNN model (figure): Input (ECFP*) and <start>; steps t=1, t=2, t=3, …, t=T+1 emit y1='CCC', y2='CCC', y3='CC(', …, yT=')=O', <end>
  – Output (SMILES): y = ('CCC', 'CCC', 'CC(', …, ')=O') → 'CCCC(N)=O'
* ECFP (dimension=5,000, neighbor size=6)
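To get a feel for the size of the DNN described above, the number of trainable parameters can be estimated from the layer widths. This assumes plain fully connected layers with bias terms and a single scalar output; the slide gives the input dimension and the hidden layers but not the output size or layer type, so `n_outputs` and `dense_param_count` are assumptions for this sketch.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

ecfp_dim = 5000            # ECFP dimension from the slide
hidden = [500, 500, 500]   # 3 hidden layers, 500 units each
n_outputs = 1              # assumption: one predicted property

layers = [ecfp_dim, *hidden, n_outputs]
total_params = dense_param_count(layers)
```

Under these assumptions the first layer dominates: 5,000 × 500 weights plus 500 biases account for most of the total.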
