CS224W: Machine Learning with Graphs Jure Leskovec, Weihua Hu, Stanford University http://cs224w.stanford.edu
… Output: Node embeddings. We can also embed larger network structures: subgraphs and entire graphs.
12/3/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, http://cs224w.stanford.edu
¡ Key idea: Generate node embeddings based on local network neighborhoods. [Figure: a target node and its local neighborhood in the input graph]
¡ Intuition: Nodes aggregate information from their neighbors using neural networks. [Figure: the target node's neighborhood, with neural networks aggregating along edges]
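The aggregation intuition can be sketched in a few lines (a minimal illustration; the toy graph, feature values, and the choice of mean as the aggregator are assumptions for the example):

```python
import numpy as np

def aggregate_neighbors(h, adj, v):
    """Average the embeddings of v's neighbors (one possible aggregator)."""
    neighbors = [u for u in range(len(adj)) if adj[v][u]]
    return np.mean([h[u] for u in neighbors], axis=0)

# Toy undirected graph: edges 0-1 and 0-2, so node 0 has neighbors {1, 2}
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
h = np.array([[1.0, 0.0],   # initial node features
              [0.0, 2.0],
              [0.0, 4.0]])

msg = aggregate_neighbors(h, adj, 0)  # mean of h[1] and h[2] -> [0.0, 3.0]
```

In a full GNN this aggregate would then be combined with the node's own embedding and passed through a neural network, once per layer.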
¡ Many model variants have been proposed with different choices of neural networks: Scarselli et al., 2009b; Battaglia et al., 2016; Defferrard et al., 2016; Duvenaud et al., 2015; Hamilton et al., 2017a; Kearnes et al., 2016; Kipf & Welling, 2017; Lei et al., 2017; Li et al., 2016; Velickovic et al., 2018; Verma & Zhang, 2018; Ying et al., 2018; Zhang et al., 2018. [Figure: the target node's computation graph with unknown aggregation modules] What's inside the box?
Graph Convolutional Networks [Kipf & Welling ICLR'2017]: each aggregation step is Mean pooling followed by a Linear transformation and ReLU. [Figure: the target node's computation graph with Mean + Linear + ReLU in each box]
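One GCN-style layer as described on this slide can be sketched as follows (a simplified NumPy illustration; the identity weights, zero bias, and toy graph are assumptions for the example, not values from the paper):

```python
import numpy as np

def gcn_layer(h, neighbors, W, b):
    """One GCN-style layer: h_v' = ReLU(W @ mean({h_u : u in N(v) U {v}}) + b)."""
    out = []
    for v in range(len(h)):
        nbh = neighbors[v] + [v]                 # include the node itself
        m = np.mean([h[u] for u in nbh], axis=0) # mean pooling over the multiset
        out.append(np.maximum(0.0, W @ m + b))   # linear map + ReLU
    return np.array(out)

neighbors = {0: [1, 2], 1: [0], 2: [0]}
h = np.array([[1.0, 1.0], [3.0, 1.0], [5.0, 1.0]])
W = np.eye(2)       # illustrative weights (learned in practice)
b = np.zeros(2)

h1 = gcn_layer(h, neighbors, W, b)  # h1[0] = mean([3,1],[5,1],[1,1]) = [3.0, 1.0]
```

Stacking several such layers lets each node's embedding depend on its multi-hop neighborhood.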
GraphSAGE [Hamilton+ NeurIPS'2017]: each aggregation step applies an MLP to each neighbor, then elementwise Max pooling. [Figure: the target node's computation graph with MLP + Max in each box]
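The GraphSAGE pool aggregator on this slide can be sketched similarly (an illustrative example; the one-layer ReLU "MLP" and the toy features are assumptions, not the paper's architecture):

```python
import numpy as np

def sage_max_aggregate(h, neighbors, v, mlp):
    """GraphSAGE pool aggregator: elementwise max over MLP-transformed neighbors."""
    transformed = [mlp(h[u]) for u in neighbors[v]]
    return np.max(transformed, axis=0)

mlp = lambda x: np.maximum(0.0, x)  # stand-in single-ReLU "MLP" for illustration
neighbors = {0: [1, 2]}
h = np.array([[0.0, 0.0], [1.0, -2.0], [-1.0, 3.0]])

# mlp(h[1]) = [1, 0], mlp(h[2]) = [0, 3]; elementwise max -> [1.0, 3.0]
agg = sage_max_aggregate(h, neighbors, 0, mlp)
```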
¡ Intuition: The network neighborhood defines a computation graph. Every node defines a computation graph based on its neighborhood!
¡ Obtain a node representation by neighbor aggregation.
¡ Obtain a graph representation by pooling node representations (e.g., Sum, Average).
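This readout step is a simple pooling over all node embeddings (a minimal sketch; the mode names and toy values are assumptions for the example):

```python
import numpy as np

def readout(node_embs, mode="sum"):
    """Pool all node embeddings into a single graph embedding."""
    if mode == "sum":
        return np.sum(node_embs, axis=0)
    if mode == "mean":
        return np.mean(node_embs, axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")

node_embs = np.array([[1.0, 2.0],
                      [3.0, 4.0]])

g_sum = readout(node_embs, "sum")    # [4.0, 6.0]
g_mean = readout(node_embs, "mean")  # [2.0, 3.0]
```

The resulting vector can then be fed to a classifier for graph-level prediction.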
Graph Neural Networks have achieved state-of-the-art performance on:
¡ Node classification [Kipf+ ICLR'2017]
¡ Graph classification [Ying+ NeurIPS'2018]
¡ Link prediction [Zhang+ NeurIPS'2018]
Graph Neural Networks have achieved state-of-the-art performance on node classification [Kipf+ ICLR'2017], graph classification [Ying+ NeurIPS'2018], and link prediction [Zhang+ NeurIPS'2018]. But are GNNs perfect? What are the limitations of GNNs?
¡ Some simple graph structures cannot be distinguished by conventional GNNs. Assume input node features are uniform (denoted by the same node color): GCN and GraphSAGE fail to distinguish the two graphs.
¡ GNNs are not robust to noise in graph data: (1) node feature perturbation, (2) edge addition/deletion. [Figure: perturbing node features or edges of the input graph changes the GNN's class prediction]
1. Limitations of conventional GNNs in capturing graph structure
2. Vulnerability of GNNs to noise in graph data
3. Open questions & future directions
¡ Given two different graphs, can GNNs map them into different graph representations? This is an important condition for the classification scenario. [Figure: two graphs fed into a GNN; are their representations different?]
¡ This is essentially the graph isomorphism test problem.
¡ No polynomial-time algorithm is known for the general case.
¡ GNNs may not perfectly distinguish all graphs!
How well can GNNs perform the graph isomorphism test? Answering this requires rethinking the mechanism of how GNNs capture graph structure.
¡ GNNs use different computational graphs to distinguish different graphs. [Figure: two graphs (nodes 1–4 and 1'–4') unrolled into their computation graphs]
¡ A node representation captures the rooted subtree structure around the node. [Figure: each computation graph corresponds to a rooted subtree]
¡ The most discriminative GNNs map different subtrees into different node representations (denoted by different colors).
¡ The most discriminative GNNs map different subtrees into different node representations (denoted by different colors). The key property: injectivity.
¡ A function is injective if it maps different elements to different outputs.
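On a finite domain, injectivity is easy to check directly (a small illustration; the example functions and domain are assumptions for the sketch):

```python
def is_injective(f, domain):
    """A function is injective iff no two distinct inputs share an output."""
    outputs = [f(x) for x in domain]
    return len(outputs) == len(set(outputs))

domain = [0, 1, 2, 3]
square = lambda x: x * x   # injective on non-negative integers
parity = lambda x: x % 2   # not injective: 0 and 2 collide on output 0

inj_square = is_injective(square, domain)  # True
inj_parity = is_injective(parity, domain)  # False
```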
¡ The entire neighbor aggregation is injective if every step of neighbor aggregation is injective. Injective → Injective → the entire function is injective!
¡ Neighbor aggregation is essentially a function over a multi-set (a set with repeating elements). [Figure: examples of multi-sets and their equivalent multi-set functions; the same color indicates the same node features]
¡ The discriminative power of GNNs can be characterized by that of multi-set functions. Next: analyzing GCN and GraphSAGE.
Recall: GCN uses mean pooling. [Figure: Mean pooling + Linear + ReLU]
Recall: GCN uses mean pooling (Mean pooling + Linear + ReLU). GCN will fail to distinguish proportionally equivalent multi-sets. Not injective!
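This failure is easy to demonstrate (a minimal sketch; the one-hot "color" vectors are assumptions for the example): two multi-sets whose elements appear in the same proportions have identical means.

```python
import numpy as np

def mean_agg(multiset):
    """Mean aggregator over a multi-set of feature vectors (as in GCN)."""
    return np.mean(multiset, axis=0)

blue = np.array([1.0, 0.0])  # one-hot stand-in for a node "color"

# Proportionally equivalent multi-sets: {blue, blue} vs. {blue}
a = mean_agg([blue, blue])
b = mean_agg([blue])

collision = np.allclose(a, b)  # True: mean pooling cannot tell them apart
```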
Recall: GraphSAGE uses max pooling (MLP + Max pooling).
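Max pooling has an analogous failure mode (a minimal sketch; the one-hot vectors are assumptions for the example): it sees only the set of distinct elements, so multi-sets that differ only in multiplicity collide.

```python
import numpy as np

def max_agg(multiset):
    """Elementwise max aggregator over a multi-set of feature vectors."""
    return np.max(multiset, axis=0)

blue = np.array([1.0, 0.0])
red = np.array([0.0, 1.0])

# Same distinct elements, different multiplicities
a = max_agg([blue, red])
b = max_agg([blue, blue, red])

collision = np.allclose(a, b)  # True: max pooling ignores multiplicity
```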