
Logical Expressiveness of Graph Neural Networks - Mikaël Monet - PowerPoint PPT Presentation



1. Logical Expressiveness of Graph Neural Networks
   Mikaël Monet
   October 10th, 2019
   Millennium Institute for Foundational Research on Data, Chile

2. Graph Neural Networks (GNNs)
   • With: Pablo Barceló, Egor Kostylev, Jorge Pérez, Juan Reutter, Juan Pablo Silva (ongoing work)
   • Graph Neural Networks (GNNs) [Merkwirth and Lengauer, 2005, Scarselli et al., 2009]: a class of NN architectures that has recently become popular for dealing with structured data
   → Goal: understand their theoretical properties

3. Neural Networks (NNs)
   [Figure: a fully connected neural network N: input vector x, L layers of neurons, output vector y = N(x)]
   • Weight w_{n′→n} between two consecutive neurons
   • Compute left to right: λ(n) := f( Σ_{n′} w_{n′→n} · λ(n′) )
   • Goal: find the weights that “solve” your problem (classification, clustering, regression, etc.)
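
To make the layer-by-layer computation λ(n) := f( Σ_{n′} w_{n′→n} · λ(n′) ) concrete, here is a minimal NumPy sketch of a fully connected forward pass. The layer sizes, the ReLU nonlinearity, and the random weights are illustrative assumptions, not taken from the slides.

    import numpy as np

    def relu(z):
        # the nonlinearity f applied to each neuron's weighted sum
        return np.maximum(z, 0.0)

    def forward(x, weights, biases):
        """Compute y = N(x) layer by layer, left to right."""
        a = x
        for W, b in zip(weights, biases):
            # each neuron n computes f(sum over n' of w_{n'->n} * lambda(n') + bias)
            a = relu(W @ a + b)
        return a

    # Illustrative network: input of size 4, two hidden layers of 8 neurons, output of size 2.
    rng = np.random.default_rng(0)
    sizes = [4, 8, 8, 2]
    weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(m) for m in sizes[1:]]

    x = rng.normal(size=4)            # input vector x
    y = forward(x, weights, biases)   # output vector y = N(x)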

4. Finding the weights
   • Goal: find the weights that “solve” your problem
   → minimize Dist(N(x), g(x)), where g is what you want to learn
   → use backpropagation algorithms
   • Problem: for fully connected NNs, when a layer has many neurons there are a lot of weights...
   → example: the input is a 250 × 250 pixel image, and we want to build a fully connected NN with 500 neurons per layer
   → between the first two layers alone we have 250 × 250 × 500 = 31,250,000 weights
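
As a rough illustration of “minimize Dist(N(x), g(x))”, below is a minimal gradient-descent loop for a single linear layer with a squared-error distance. The target function, the learning rate, and the hand-derived gradient are assumptions made for the sketch; real frameworks obtain the gradients automatically via backpropagation.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 4))                         # 100 training inputs x
    g = lambda x: x @ np.array([1.0, -2.0, 0.5, 3.0])     # the function we want to learn
    Y = np.array([g(x) for x in X])

    W = np.zeros(4)              # weights of a one-layer network N(x) = W . x
    lr = 0.05
    for step in range(200):
        pred = X @ W             # N(x) for every training input
        err = pred - Y           # residual used by the squared-error distance
        grad = X.T @ err / len(X)   # gradient of the mean squared error w.r.t. W
        W -= lr * grad           # gradient-descent update

    print(W)  # should end up close to [1, -2, 0.5, 3]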

5. Convolutional Neural Networks
   [Figure: a convolutional neural network; the input vector is an image]
   • Idea: use the structure of the data (here, a grid)
   → fewer weights to learn (e.g., 500 × 9 = 4,500 for the first layer)
   → other advantage: recognize patterns that are local
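
To see where the two weight counts come from, here is the small calculation spelled out. Treating the first convolutional layer as 500 filters with shared 3 × 3 kernels is an assumption used only to reproduce the 4,500 figure from the slide.

    # Fully connected first layer: every pixel connects to every neuron.
    pixels = 250 * 250
    neurons_per_layer = 500
    fc_weights = pixels * neurons_per_layer
    print(fc_weights)     # 31250000

    # Convolutional first layer: each of the 500 filters has only a 3x3 kernel,
    # and those 9 weights are shared across all positions of the image.
    kernel_weights = 3 * 3
    conv_weights = neurons_per_layer * kernel_weights
    print(conv_weights)   # 4500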

6. Graph Neural Networks (GNNs)
   [Figure: a (convolutional) graph neural network; the input vector is a molecule and the output is whether it is poisonous (e.g., [Duvenaud et al., 2015])]
   • Idea: use the structure of the data
   → GNNs generalize this idea to allow any graph as input
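
Before the formal definition on the next slide, here is one simple way to hand a graph such as a molecule to a GNN: an adjacency structure plus one feature vector per node. The specific atoms and the one-hot encoding are made up for illustration.

    import numpy as np

    # A toy "molecule" graph: nodes are atoms, edges are bonds (undirected).
    nodes = ["C", "C", "O", "H"]
    edges = [(0, 1), (1, 2), (0, 3)]

    # Node labels lambda(u): a one-hot encoding of the atom type (illustrative choice).
    atom_types = ["C", "O", "H"]
    features = np.array([[1.0 if a == t else 0.0 for t in atom_types] for a in nodes])

    # Neighbourhoods N_G(u), which is what every GNN layer looks at.
    neighbors = {u: [] for u in range(len(nodes))}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)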

7. Question: what can we do with graph neural networks? (from a theoretical perspective)

8. GNNs: formalisation
   • Simple, undirected, node-labeled graph G = (V, E, λ), where λ : V → R^d
   • Run of a GNN with L layers on G: iteratively compute x_u^(i) ∈ R^d for 0 ≤ i ≤ L as follows:
   → x_u^(0) := λ(u)
   → x_u^(i+1) := COMB^(i+1)( x_u^(i), AGG^(i+1)( {{ x_v^(i) | v ∈ N_G(u) }} ) )
   • where the AGG^(i) are called aggregation functions and the COMB^(i) combination functions
   • Let us call such a GNN an aggregate-combine GNN (AC-GNN)
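
As a concrete (and common) instantiation of the aggregate-combine scheme, the sketch below takes AGG to be the sum of the neighbours' vectors and COMB to be a one-layer perceptron with a ReLU. These particular choices of AGG, COMB, and weights are assumptions for illustration, not the definition used in the talk.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def ac_gnn_layer(x, neighbors, W_self, W_agg, b):
        """One AC-GNN layer: x_u' = COMB(x_u, AGG({{x_v | v in N_G(u)}})).

        Here AGG is the sum of the neighbour vectors and
        COMB(a, m) = relu(W_self @ a + W_agg @ m + b).
        """
        new_x = np.zeros_like(x)
        for u in range(len(x)):
            agg = sum((x[v] for v in neighbors[u]), np.zeros(x.shape[1]))
            new_x[u] = relu(W_self @ x[u] + W_agg @ agg + b)
        return new_x

    # Toy graph: a path 0 - 1 - 2, each node labeled with a 3-dimensional feature vector.
    neighbors = {0: [1], 1: [0, 2], 2: [1]}
    x = np.eye(3)                        # x_u^(0) = lambda(u)
    rng = np.random.default_rng(0)
    W_self, W_agg, b = rng.normal(size=(3, 3)), rng.normal(size=(3, 3)), np.zeros(3)

    for _ in range(2):                   # L = 2 layers (same weights reused, for brevity)
        x = ac_gnn_layer(x, neighbors, W_self, W_agg, b)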

9. AC-GNNs: what can they do? Related work (1/2)
   → x_u^(0) := λ(u)
   → x_u^(i+1) := COMB^(i+1)( x_u^(i), AGG^(i+1)( {{ x_v^(i) | v ∈ N_G(u) }} ) )
   • Recently, [Morris et al., 2019, Xu et al., 2019] established a link with the Weisfeiler-Lehman (WL) isomorphism test
   → Namely: WL works exactly like an AC-GNN with injective aggregation and combination functions
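
To make the link concrete, below is a small sketch of one-dimensional Weisfeiler-Lehman (colour refinement): in each round, a node's new colour is an injective relabeling of its current colour together with the multiset of its neighbours' colours, which mirrors an AC-GNN layer with injective AGG and COMB. The graph and the number of rounds are arbitrary choices for the example.

    def wl_refinement(neighbors, colors, rounds):
        """1-WL colour refinement on an undirected graph given as an adjacency dict."""
        for _ in range(rounds):
            # The pair (own colour, sorted multiset of neighbour colours) plays the role
            # of COMB(x_u, AGG({{x_v}})) with injective AGG and COMB.
            signatures = {u: (colors[u], tuple(sorted(colors[v] for v in neighbors[u])))
                          for u in neighbors}
            # Relabel the signatures with fresh integer colours (an injective renaming).
            palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
            colors = {u: palette[signatures[u]] for u in neighbors}
        return colors

    # Toy graph: a triangle 0-1-2 plus a pendant node 3 attached to node 0.
    neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
    colors = {u: 0 for u in neighbors}      # all nodes start with the same colour
    print(wl_refinement(neighbors, colors, rounds=2))
    # Node 0 (degree 3), nodes 1 and 2 (degree 2), and node 3 (degree 1)
    # end up with three distinct colours.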
