HW2 deadline postponed to next Thu, Oct 31! We are releasing an improved version of the starter code for HW2.Q4 -- keep an eye on Piazza!
CS224W: Machine Learning with Graphs
Jure Leskovec, Jiaxuan You, Stanford University
http://cs224w.stanford.edu
… Output: Node embeddings. We can also embed larger network structures, subgraphs, graphs 10/24/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, http://cs224w.stanford.edu 2
• Key idea: Generate node embeddings based on local network neighborhoods
[Figure: input graph (nodes A to F) and the neighborhood computation graph of target node A]
• Intuition: Nodes aggregate information from their neighbors using neural networks
[Figure: input graph (nodes A to F) and the computation graph of target node A, with neural networks at the aggregation steps]
• Intuition: Network neighborhood defines a computation graph
Every node defines a computation graph based on its neighborhood!
Key idea: Generate node embeddings based on local network neighborhoods
  - Nodes aggregate "messages" from their neighbors using neural networks
• Graph Convolutional Neural Networks:
  - Basic variant: Average neighborhood information and stack neural networks
• GraphSAGE:
  - Generalized neighborhood aggregation
[Figure: the layer-$k$ embedding $h_v^{k}$ of node $v$ is computed from the layer-$(k-1)$ embeddings of its neighbors, e.g. $h_u^{k-1}$]
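The "average neighborhood information and stack" idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's exact formulation: the function name `mean_aggregate_layer`, the separate neighbor/self weight matrices, the ReLU nonlinearity, and the 3-node toy graph are all assumptions made for the example.

```python
import numpy as np

def mean_aggregate_layer(adj, h, w_neigh, w_self):
    """One layer: average the neighbors' embeddings, combine with the node's own, apply ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh_mean = (adj @ h) / np.maximum(deg, 1)  # guard against isolated nodes
    return np.maximum(0.0, neigh_mean @ w_neigh + h @ w_self)

rng = np.random.default_rng(0)

# Hypothetical toy graph: node 0 is connected to nodes 1 and 2.
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)
h0 = np.eye(3)  # one-hot input features (an assumption)

# Random, untrained weights for two stacked layers.
w1_neigh, w1_self = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
w2_neigh, w2_self = rng.normal(size=(4, 2)), rng.normal(size=(4, 2))

h1 = mean_aggregate_layer(adj, h0, w1_neigh, w1_self)
h2 = mean_aggregate_layer(adj, h1, w2_neigh, w2_self)  # 2-hop information
print(h2.shape)
```

Stacking two layers means each node's final embedding depends on its 2-hop neighborhood, which is the computation graph described above.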
… Output: Vector embeddings
… Output: Graph Structure!
1. Problem of Graph Generation
2. ML Basics for Graph Generation
3. GraphRNN
4. Applications and Open Questions
• We want to generate realistic graphs
[Figure: given a large real graph, generate a synthetic graph]
• What is a good model?
• How can we fit the model and generate the graph using it?
• Generation: gives insight into the graph formation process
• Anomaly detection: abnormal behavior, evolution
• Predictions: predicting the future from the past
• Simulations of novel graph structures
• Graph completion: many graphs are partially observed
• "What if" scenarios
Task 1: Realistic graph generation
• Generate graphs that are similar to a given set of graphs [Focus of this lecture]
Task 2: Goal-directed graph generation
• Generate graphs that optimize given objectives/constraints
  - Drug molecule generation/optimization
Drug discovery
• Discover highly drug-like molecules
[Figure: a graph generative model outputs a molecule with drug_likeness=0.94]
Drug discovery
• Complete an existing molecule to optimize a desired property
[Figure: molecule completion; labels: Complete, Improve, Solubility=-5.55, Solubility=-1.78]
Discovering novel structures
[Figure: example graph families (Grid, Community, Ego) used to train GraphRNN]
Network Science
• Null models for realistic networks
[Figure: a graph drawn from Barabasi_Albert(n=50, m=2) and a graph drawn from NeuralNet_X(n=50, p=3, q=5)]
• Large and variable output space
  - For $n$ nodes we need to generate $n^2$ values
  - Graph size (nodes, edges) varies
[Figure: a 5-node graph and its adjacency matrix]
0 1 1 0 0
1 0 0 1 0
1 0 0 1 1
0 1 1 0 1
0 0 1 1 0
5 nodes: 25 values
1K nodes: 1M values
• Non-unique representations:
  - An $n$-node graph can be represented in $n!$ ways
  - Hard to compute/optimize objective functions (e.g., reconstruction error)
[Figure: the same 5-node graph under two node orderings. Same graph, very different representations!]
Ordering 1:
0 1 1 0 0
1 0 0 1 0
1 0 0 1 1
0 1 1 0 1
0 0 1 1 0
Ordering 2:
0 0 0 1 1
0 0 1 0 1
0 1 0 1 1
1 0 1 0 0
1 1 1 0 0
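A brute-force check makes the non-uniqueness concrete. The sketch below takes the two 5×5 adjacency matrices from this slide, enumerates all $5! = 120$ node orderings of the first, and looks for orderings that reproduce the second. The helper name `permute` is an assumption for illustration, and enumerating every ordering is only feasible for tiny graphs.

```python
from itertools import permutations

import numpy as np

# The two adjacency matrices from the slide (same graph, different node orderings).
A1 = np.array([[0, 1, 1, 0, 0],
               [1, 0, 0, 1, 0],
               [1, 0, 0, 1, 1],
               [0, 1, 1, 0, 1],
               [0, 0, 1, 1, 0]])
A2 = np.array([[0, 0, 0, 1, 1],
               [0, 0, 1, 0, 1],
               [0, 1, 0, 1, 1],
               [1, 0, 1, 0, 0],
               [1, 1, 1, 0, 0]])

def permute(adj, order):
    """Reorder rows and columns of adj according to a node ordering."""
    idx = np.array(order)
    return adj[np.ix_(idx, idx)]

n = A1.shape[0]
orderings = list(permutations(range(n)))  # n! candidate orderings

# Orderings of A1 that yield exactly A2 (non-empty iff the graphs are the same).
matching = [p for p in orderings if np.array_equal(permute(A1, p), A2)]

# Distinct adjacency matrices reachable by reordering; fewer than n! if the graph has symmetries.
distinct = {permute(A1, p).tobytes() for p in orderings}

print(len(orderings), len(matching), len(distinct))
```

Any loss that compares adjacency matrices entry by entry would treat `A1` and `A2` as different outputs, which is why objectives such as reconstruction error are hard to define for graphs.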
• Complex dependencies:
  - Edge formation has long-range dependencies
Example: Generate a ring graph on 6 nodes
[Figure: a partially generated ring; one candidate edge is marked "Should have edge!" and another "Shouldn't have edge!"]
Existence of an edge may depend on the entire graph!
2. ML Basics for Graph Generation
• Given: Graphs sampled from $p_{data}(G)$
• Goal:
  - Learn the distribution $p_{model}(G)$
  - Sample from $p_{model}(G)$
[Figure: learn $p_{model}(G)$ from $p_{data}(G)$, then sample from it]
Setup:
• Assume we want to learn a generative model from a set of data points (i.e., graphs) $\{\mathbf{x}_i\}$
  - $p_{data}(\mathbf{x})$ is the data distribution, which is never known to us, but we have sampled $\mathbf{x}_i \sim p_{data}(\mathbf{x})$
  - $p_{model}(\mathbf{x}; \theta)$ is the model, parametrized by $\theta$, that we use to approximate $p_{data}(\mathbf{x})$
• Goal:
  - (1) Make $p_{model}(\mathbf{x}; \theta)$ close to $p_{data}(\mathbf{x})$
  - (2) Make sure we can sample from $p_{model}(\mathbf{x}; \theta)$
    - We need to generate examples (graphs) from $p_{model}(\mathbf{x}; \theta)$
(1) Make $p_{model}(\mathbf{x}; \theta)$ close to $p_{data}(\mathbf{x})$
• Key Principle: Maximum Likelihood
• Fundamental approach to modeling distributions
  - Find parameters $\theta^*$ such that, for observed data points $\mathbf{x}_i \sim p_{data}$, $\sum_i \log p_{model}(\mathbf{x}_i; \theta^*)$ has the highest value among all possible choices of $\theta$
  - That is, find the model that is most likely to have generated the observed data $\mathbf{x}$
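As a toy illustration of the maximum likelihood principle, suppose (purely as an assumption for this sketch) that $p_{model}$ treats every node pair as an independent coin flip with edge probability $\theta$. Then $\sum_i \log p_{model}(\mathbf{x}_i; \theta)$ depends only on the edge counts, and $\theta^*$ can be found by scanning a grid. The edge counts below are made-up example data, not results from the lecture.

```python
import numpy as np

def log_likelihood(theta, edge_counts, num_pairs):
    """Sum over graphs of log p_model(x_i; theta) under the toy independent-edge model."""
    counts = np.asarray(edge_counts, dtype=float)
    return float(np.sum(counts * np.log(theta)
                        + (num_pairs - counts) * np.log(1.0 - theta)))

# Hypothetical observations: edge counts of four 5-node graphs (10 node pairs each).
edge_counts = [6, 5, 7, 6]
num_pairs = 10

# Grid search for theta* = argmax_theta sum_i log p_model(x_i; theta).
grid = np.linspace(0.01, 0.99, 99)
scores = [log_likelihood(t, edge_counts, num_pairs) for t in grid]
theta_star = float(grid[int(np.argmax(scores))])

# For this toy model the maximizer is also known in closed form: the observed edge fraction.
theta_closed_form = sum(edge_counts) / (num_pairs * len(edge_counts))

print(theta_star, theta_closed_form)
```

For deep models there is no closed form, so the same objective is maximized with gradient-based training instead of a grid.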
(2) Sample from $p_{model}(\mathbf{x}; \theta)$
• Goal: Sample from a complex distribution
• The most common approach:
  - (1) Sample from a simple noise distribution: $\mathbf{z}_i \sim N(0, 1)$
  - (2) Transform the noise $\mathbf{z}_i$ via $f(\cdot)$: $\mathbf{x}_i = f(\mathbf{z}_i; \theta)$
  Then $\mathbf{x}_i$ follows a complex distribution
• Q: How to design $f(\cdot)$?
• A: Use deep neural networks, and train them using the data we have!
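The two sampling steps can be sketched as follows. Here $f$ is a small two-layer network with random, untrained weights; the architecture, the dimensions, and the tanh nonlinearity are assumptions for illustration only. In practice $\theta$ would be learned from data.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(z, theta):
    """Transform noise z into samples x with a tiny two-layer network (illustrative only)."""
    w1, b1, w2, b2 = theta
    hidden = np.tanh(z @ w1 + b1)
    return hidden @ w2 + b2

noise_dim, hidden_dim, data_dim = 4, 16, 2  # hypothetical sizes

# theta: random, untrained parameters standing in for learned ones.
theta = (rng.normal(size=(noise_dim, hidden_dim)), np.zeros(hidden_dim),
         rng.normal(size=(hidden_dim, data_dim)), np.zeros(data_dim))

# (1) Sample from a simple noise distribution: z_i ~ N(0, 1).
z = rng.standard_normal(size=(1000, noise_dim))

# (2) Transform the noise: x_i = f(z_i; theta).
x = f(z, theta)

print(x.shape)
```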
Taxonomy of Deep Generative Models [Goodfellow, NeurIPS 2016]
This lecture: Auto-regressive models. An autoregressive (AR) model predicts future behavior based on past behavior.
Auto-regressive models
• $p_{model}(\mathbf{x}; \theta)$ is used for both density estimation and sampling (from the probability density)
  - (Other models, such as Variational Auto-Encoders (VAEs) and Generative Adversarial Nets (GANs), have 2 or more models, each playing one of the roles)
  - Apply the chain rule: the joint distribution is a product of conditional distributions:
    $$p_{model}(\mathbf{x}; \theta) = \prod_{t=1}^{n} p_{model}(x_t \mid x_1, \ldots, x_{t-1}; \theta)$$
  - E.g., $\mathbf{x}$ is a vector and $x_t$ is the $t$-th dimension; $\mathbf{x}$ is a sentence and $x_t$ is the $t$-th word
  - In our case, $x_t$ will be the $t$-th action (add node, add edge)
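A small sketch shows how one set of conditionals serves both roles. The toy conditional below (a logistic function of how many ones have been emitted so far) is an assumption chosen only to keep the example short; any model of $p_{model}(x_t \mid x_1, \ldots, x_{t-1}; \theta)$ would fit the same two functions.

```python
import itertools
import math
import random

def cond_prob_one(prefix, theta):
    """Toy conditional p_model(x_t = 1 | x_1..x_{t-1}; theta); the form is an assumption."""
    weight, bias = theta
    return 1.0 / (1.0 + math.exp(-(weight * sum(prefix) + bias)))

def log_prob(x, theta):
    """Density estimation: chain rule, summing log-conditionals over t."""
    total = 0.0
    for t in range(len(x)):
        p_one = cond_prob_one(x[:t], theta)
        total += math.log(p_one if x[t] == 1 else 1.0 - p_one)
    return total

def sample(n, theta, rng):
    """Sampling: draw x_t from the same conditionals, one step at a time."""
    x = []
    for _ in range(n):
        x.append(1 if rng.random() < cond_prob_one(x, theta) else 0)
    return x

theta = (0.8, -1.0)  # hypothetical parameters
n = 4

# The product of conditionals defines a valid joint: probabilities of all 2^n sequences sum to 1.
total_mass = sum(math.exp(log_prob(list(bits), theta))
                 for bits in itertools.product([0, 1], repeat=n))

x = sample(n, theta, random.Random(0))
print(total_mass, x, log_prob(x, theta))
```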
3. GraphRNN
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models. J. You, R. Ying, X. Ren, W. L. Hamilton, J. Leskovec. International Conference on Machine Learning (ICML), 2018.
Generating graphs via sequentially adding nodes and edges [You et al., ICML 2018]
[Figure: a graph $G$ on nodes 1 to 5 and its generation process $S^{\pi}$, which adds one node at a time together with its edges]
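One way to make "sequentially adding nodes and edges" concrete is to record, for each new node, a binary vector saying which earlier nodes it connects to. The sketch below converts an example 5-node graph into such a sequence under a given node ordering and rebuilds the graph from it. The function names and the example adjacency matrix are assumptions for illustration, and the code covers only the encoding, not the GraphRNN model itself.

```python
import numpy as np

def graph_to_sequence(adj, order):
    """For each node after the first (in the given order), list its edges to earlier nodes."""
    idx = np.array(order)
    a = adj[np.ix_(idx, idx)]
    return [a[i, :i].tolist() for i in range(1, len(order))]

def sequence_to_graph(seq):
    """Replay the sequence: add one node per step and connect it to earlier nodes."""
    n = len(seq) + 1
    adj = np.zeros((n, n), dtype=int)
    for i, edges in enumerate(seq, start=1):
        adj[i, :i] = edges  # edges from the new node to earlier nodes
        adj[:i, i] = edges  # mirror them to keep the graph undirected
    return adj

# Example undirected 5-node graph (illustrative).
adj = np.array([[0, 1, 1, 0, 0],
                [1, 0, 0, 1, 0],
                [1, 0, 0, 1, 1],
                [0, 1, 1, 0, 1],
                [0, 0, 1, 1, 0]])

seq = graph_to_sequence(adj, order=[0, 1, 2, 3, 4])
rebuilt = sequence_to_graph(seq)
print(seq)
```

Because the sequence depends on the chosen node ordering, a different `order` gives a different sequence for the same graph.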