
Graph Neural Networks for Drug Development (Marinka Zitnik)



  1. Graph Neural Networks for Drug Development
     Marinka Zitnik, Harvard (marinka@hms.harvard.edu)

  2. Drug Development
     - Step 1: Design and Discovery
     - Step 2: Preclinical Research
     - Step 3: Clinical Research
     - Step 4: FDA Review
     - Step 5: Post-Market and Safety Monitoring

  3. Opportunities for AI in Drug Development
     - Step 1, Design and Discovery: Support decision-making for a new drug in the laboratory
     - Step 2, Preclinical Research: Answer basic questions about safety and animal testing
     - Step 3, Clinical Research: Predict if drug is safe & effective to test on people, find new uses for drugs
     - Step 4, FDA Review: Automatic document review to make a decision to approve the drug or not
     - Step 5, Post-Market and Safety Monitoring: Detect adverse and safety issues in real time using electronic health data

  4. Why is it so challenging to realize this vision?
     [Figure labels: Heart disease; Asthma; Brain; Alzheimer’s disease]
     Finding drugs for disease treatments relies on several types of interactions, e.g., drug-target, protein-protein, drug-drug, drug-disease, disease-protein pairs
     Source: Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities, Information Fusion 2019

  5. Today’s Talk
     - Step 1, Design and Discovery: Support decision-making for a new drug in the laboratory
     - Step 2, Preclinical Research: Answer basic questions about safety and animal testing
     - Step 3, Clinical Research: Predict if drug is safe & effective to test on people, find new uses for drugs
     - Step 4, FDA Review: Automatic document review to make a decision to approve the drug or not
     - Step 5, Post-Market and Safety Monitoring: Detect adverse and safety issues in real time using electronic health data

  6. Goal: Find which diseases a drug (new molecule) could treat

  7. What drug treats what disease?
     Goal: Predict what diseases a new molecule might treat
     [Figure: Drugs linked to Diseases; legend: “Treats” relationship; “?” marks an unknown drug-disease relationship]
     Source: Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities, Information Fusion 2019

  8. Key Insight: Subgraphs
     - Disease: Subgraph of rich protein network defined on disease proteins
     - Drug: Subgraph of rich protein network defined on drug’s target proteins
     A drug likely treats a disease if it is close to the disease in pharmacological space [Paolini et al., Nature Biotech.’06; Menche et al., Science’15]
     Idea: Use the paradigm of embeddings to operationalize the concept of closeness in pharmacological space

  9. Predicting Links Between Drug and Disease Subgraphs
     Task: Given drug 𝐷 and disease 𝐸, predict if 𝐷 treats 𝐸
     Task:
     1) Learn embeddings for 𝐷’s and 𝐸’s subgraphs
     2) Use embeddings to predict probability that 𝐷 treats 𝐸
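
The two-step task on slide 9 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the slides do not say how subgraph embeddings are pooled or scored, so this sketch averages node vectors (`mean_pool`) and passes a dot product through a sigmoid (`treat_probability`). Both function names and all numbers are hypothetical.

```python
import math

def mean_pool(node_embeddings):
    """Average a list of equal-length node vectors into one subgraph vector."""
    dim = len(node_embeddings[0])
    n = len(node_embeddings)
    return [sum(vec[d] for vec in node_embeddings) / n for d in range(dim)]

def treat_probability(drug_vec, disease_vec):
    """Squash the dot product of two subgraph embeddings into (0, 1)."""
    score = sum(a * b for a, b in zip(drug_vec, disease_vec))
    return 1.0 / (1.0 + math.exp(-score))

# Made-up embeddings for the proteins in each subgraph (assumption).
drug_nodes = [[1.0, 0.0], [0.0, 1.0]]
disease_nodes = [[2.0, 2.0], [0.0, 2.0]]

# Step 1: one embedding per subgraph.
drug_vec = mean_pool(drug_nodes)
disease_vec = mean_pool(disease_nodes)

# Step 2: probability that the drug treats the disease.
prob = treat_probability(drug_vec, disease_vec)
print(round(prob, 3))
```

A larger dot product means the two subgraph embeddings are closer in this toy space, which is one simple way to read "closeness in pharmacological space"; a trained model would learn the embeddings rather than fix them by hand.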

  10. Neural Message Passing
      [Figure: model architecture, with these labels]
      - Aggregate information from neighbors (nodes 𝑗, 𝑘)
      - Subgraph encoder: Aggregate information from subgraphs
      - Edge decoder: p( , )
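
The two aggregation steps named on slide 10 can be sketched as follows. This is a simplified illustration, not the model from the talk: it uses plain averaging with no learned weights or nonlinearity, and the function names, graph, and feature values are all hypothetical.

```python
def message_passing_round(features, neighbors):
    """One round: each node averages its own vector with its neighbors' vectors."""
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[nb] for nb in neighbors[node]]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated

def encode_subgraph(features, members):
    """Subgraph encoder: average the vectors of the member nodes."""
    vecs = [features[m] for m in members]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy path graph a - b - c with made-up 1-dimensional features (assumption).
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [0.0], "b": [3.0], "c": [6.0]}

# Aggregate information from neighbors, then from the subgraph {a, b}.
h1 = message_passing_round(features, neighbors)
subgraph_vec = encode_subgraph(h1, ["a", "b"])
print(h1, subgraph_vec)
```

Stacking several rounds lets information travel further than one hop; a real graph neural network would also apply a learned transformation at each round.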

  11. We need a drug repurposing dataset
      - Protein-protein interaction network culled from 15 knowledge databases, with 19K nodes and 350K edges
      - Drug-protein and disease-protein links:
        - DrugBank, OMIM, DisGeNET, STITCH DB and others
        - 20K drug-protein links, 560K disease-protein links
      - Medical indications and contra-indications:
        - DrugBank, MEDI-HPS, DailyMed, Drug Central, RepoDB
        - 6K drug-disease indications
      - Side information on drugs, diseases, proteins, etc.:
        - Molecular pathways, disease symptoms, side effects
      [Figure labels: Disease subgraph; Drug subgraph; Protein interaction network]

  12. Predictive Performance
      Task: Given a disease and a drug, predict if the drug could treat the disease
      [Figure annotations: Up to 49% improvement; Up to 172% improvement]

  13. Drug Repurposing at Stanford
      Task: Predict if an existing drug can be repurposed for a new disease

      Drug               Disease                            Rank
      N-acetyl-cysteine  cystic fibrosis                    14/5000
      Xamoterol          neurodegeneration                  26/5000
      Plerixafor         cancer                             54/5000
      Sodium selenite    cancer                             36/5000
      Ebselen            C. difficile                       10/5000
      Itraconazole       cancer                             26/5000
      Bestatin           lymphedema                         11/5000
      Bestatin           pulmonary arterial hypertension    16/5000
      Ketoprofen         lymphedema                         28/5000
      Sildenafil         lymphatic malformation             26/5000
      Tacrolimus         pulmonary arterial hypertension    46/5000
      Benzamil           psoriasis                          114/5000
      Carvedilol         Chagas’ disease                    9/5000
      Benserazide        BRCA1 cancer                       41/5000
      Pioglitazone       interstitial cystitis              13/5000
      Sirolimus          dystrophic epidermolysis bullosa   46/5000

  14. Feedbacks for the AI Loop

  15. Feedbacks for the AI Loop

  16. Explaining GNN Predictions
      Key idea:
      - Summarize where in the data the model “looks” for evidence for its prediction
      - Find a small subgraph most influential for the prediction
      Source: GNN Explainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019

  17. GNNExplainer: Key Idea
      - Input: Given prediction 𝑔(𝑦) for node/link 𝑦
      - Output: Explanation, a small subgraph 𝑁* together with a small subset of node features:
        - 𝑁* is most influential for prediction 𝑔(𝑦)
      - Approach: Learn 𝑁* via counterfactual reasoning
        - Intuition: If removing 𝑤 from the graph strongly decreases the probability of prediction ⇒ 𝑤 is a good counterfactual explanation for the prediction
      Source: GNN Explainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019
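
The counterfactual intuition on slide 17 can be illustrated with a brute-force leave-one-out check. This is not the GNNExplainer algorithm itself, which learns the explanation subgraph rather than enumerating removals; it is a deliberately simpler stand-in for the intuition only. The predictor `toy_model`, the edge weights, and all names are hypothetical.

```python
import math

def toy_model(edges, target):
    """Stand-in predictor (assumption): sigmoid of summed weights of edges touching target."""
    score = sum(w for (u, v), w in edges.items() if target in (u, v))
    return 1.0 / (1.0 + math.exp(-score))

def leave_one_out_importance(edges, target, model):
    """Drop in predicted probability when each edge is removed on its own."""
    base = model(edges, target)
    drops = {}
    for edge in edges:
        reduced = {e: w for e, w in edges.items() if e != edge}
        drops[edge] = base - model(reduced, target)
    return drops

# Made-up weighted graph around the node "y" whose prediction we explain.
edges = {("y", "a"): 2.0, ("y", "b"): 0.1, ("a", "b"): 5.0}

drops = leave_one_out_importance(edges, "y", toy_model)
most_influential = max(drops, key=drops.get)
print(most_influential, drops)
```

Here removing the edge between `y` and `a` lowers the predicted probability the most, so it is the best single-edge counterfactual explanation in this toy setting. Enumerating removals scales poorly with graph size, which is one reason to learn the explanation instead.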

  18. GNNExplainer: Results
      “Why did you predict that this molecule will have a mutagenic effect on Gram-negative bacterium S. typhimurium?”
      [Figure: Explanation]
      Source: GNN Explainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019

  19. Today’s Talk
      - Step 1, Design and Discovery: Support decision-making for a new drug in the laboratory
      - Step 2, Preclinical Research: Answer basic questions about safety and animal testing
      - Step 3, Clinical Research: Predict if drug is safe & effective to test on people, find new uses for drugs
      - Step 4, FDA Review: Automatic document review to make a decision to approve the drug or not
      - Step 5, Post-Market and Safety Monitoring: Detect adverse and safety issues in real time using electronic health data

  20. Polypharmacy
      Patients take multiple drugs to treat complex or co-existing diseases
      - 46% of people over 65 years take more than 5 drugs
      - Many take more than 20 drugs to treat heart diseases, depression or cancer
      - 15% of the U.S. population affected by unwanted side effects
      - Annual costs in treating side effects exceed $177 billion in the U.S. alone
      [Ernst and Grizzle, JAPA’01; Kantor et al., JAMA’15]

  21. Unexpected Drug Interactions
      Task: How likely will a particular combination of drugs lead to a particular side effect?
      [Figure labels: Prescribed drugs; Co-prescribed drugs; Side Effects; 3% prob.; 2% prob.; “?”]

  22. Why is modeling polypharmacy hard?
      - Combinatorial explosion
        - >13 million possible combinations of 2 drugs
        - >20 billion possible combinations of 3 drugs
      - Non-linear & non-additive interactions
        - Different effect than the additive effect of individual drugs
      - Small subsets of patients
        - Side effects are interdependent
        - No info on drug combinations not yet used in patients
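
The combinatorial explosion on slide 22 can be checked with a short calculation. The slide does not state how many drugs were counted, so the sketch below only asks how large a drug catalog must be before the number of unordered pairs exceeds 13 million; the function name `smallest_catalog` is hypothetical.

```python
from math import comb

def smallest_catalog(pair_threshold):
    """Smallest n with more than pair_threshold unordered drug pairs."""
    n = 2
    while comb(n, 2) <= pair_threshold:
        n += 1
    return n

n_drugs = smallest_catalog(13_000_000)
pairs = comb(n_drugs, 2)    # unordered combinations of 2 drugs
triples = comb(n_drugs, 3)  # unordered combinations of 3 drugs
print(n_drugs, pairs, triples)
```

Under this assumption a catalog of 5,100 drugs is enough to pass 13 million pairs, and the same catalog yields about 22 billion triples, which is in line with the slide's ">20 billion" figure. Each additional drug in a combination multiplies the count by roughly the catalog size, so exhaustive experimental testing quickly becomes infeasible.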

  23. Setup: Multimodal Networks
      - Node types:
        - Mode 1, e.g., drugs
        - Mode 2, e.g., proteins
      - Edge type 𝑠_𝑗:
        - E.g., specific type of drug-drug interaction (𝑠_0)
        - E.g., drug-target interaction (𝑠_3)
        - E.g., protein-protein interaction (𝑠_4)
      [Figure: two-mode network with edges labeled 𝑠_0, 𝑠_1, 𝑠_2, 𝑠_3, 𝑠_4]
      Source: Modeling polypharmacy side effects with graph convolutional networks, Bioinformatics 2018

  24. Our Approach: Decagon
      1) Encoder: Take a multimodal network and learn an embedding for every node
      2) Decoder: Use the learned embeddings to predict typed edges between nodes
      [Figure: node embeddings and a candidate edge of type r_i marked “?”]
      Source: Modeling polypharmacy side effects with graph convolutional networks, Bioinformatics 2018
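
The decoder step on slide 24 can be sketched as a scorer that takes two node embeddings and an edge type. The slide does not give the decoder's form, so this sketch assumes a simple per-edge-type diagonal weighting of the dot product; the embeddings, edge-type names, and the function `typed_edge_probability` are all hypothetical.

```python
import math

def typed_edge_probability(z_u, z_v, relation_weights):
    """Probability of an edge of one type: sigmoid of a relation-weighted dot product."""
    score = sum(a * w * b for a, w, b in zip(z_u, relation_weights, z_v))
    return 1.0 / (1.0 + math.exp(-score))

# Made-up node embeddings, as an encoder might produce (assumption).
embeddings = {"drug_a": [1.0, 2.0], "drug_b": [1.0, -1.0]}

# One weight vector per edge type; here each type attends to one dimension.
relations = {"side_effect_1": [1.0, 0.0], "side_effect_2": [0.0, 1.0]}

probs = {
    name: typed_edge_probability(embeddings["drug_a"], embeddings["drug_b"], w)
    for name, w in relations.items()
}
print(probs)
```

The same pair of drugs gets a different probability for each edge type, which is the point of predicting typed edges: the embeddings are shared, and only the small per-type parameters change.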

  25. Encoder: Propagate Neighbors
      Generate embeddings based on local network neighborhoods separated by edge type
      1) Determine a node’s computation graph for each edge type
      2) Learn how to transform and propagate information across computation graph
      [Figure: example for edge type 𝑠_2, showing 1st order and 2nd order neighbors of 𝑤]
      Source: Modeling polypharmacy side effects with graph convolutional networks, Bioinformatics 2018
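
Propagation separated by edge type, as described on slide 25, can be sketched for a single node and a single layer. This is a simplification under stated assumptions: each edge type gets one scalar weight (`type_weights`) where a trained model would use learned weight matrices, and the graph, features, and function names are hypothetical.

```python
def relu(vec):
    """Elementwise max(0, x)."""
    return [max(0.0, x) for x in vec]

def propagate(node, features, typed_neighbors, type_weights):
    """Add, per edge type, the weighted mean of that type's neighbors to the node's own vector."""
    dim = len(features[node])
    total = list(features[node])  # self-connection
    for edge_type, nbrs in typed_neighbors[node].items():
        if not nbrs:
            continue
        w = type_weights[edge_type]
        for d in range(dim):
            mean_d = sum(features[nb][d] for nb in nbrs) / len(nbrs)
            total[d] += w * mean_d
    return relu(total)

# Made-up features: node "w" has one drug neighbor and two protein neighbors.
features = {"w": [1.0, 0.0], "d1": [2.0, 2.0], "p1": [0.0, 4.0], "p2": [0.0, -8.0]}
typed_neighbors = {"w": {"drug-drug": ["d1"], "drug-target": ["p1", "p2"]}}
type_weights = {"drug-drug": 0.5, "drug-target": 1.0}

h_w = propagate("w", features, typed_neighbors, type_weights)
print(h_w)
```

Keeping a separate neighbor list and weight per edge type is what lets the encoder treat a drug-drug neighbor differently from a drug-target neighbor; applying the function again on the updated vectors would reach 2nd order neighbors.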
