node2vec: Scalable Feature Learning for Networks - PowerPoint PPT Presentation


  1. node2vec: Scalable Feature Learning for Networks. Aditya Grover and Jure Leskovec. KDD 2016. Presented by Haoxiang Wang. Feb 26, 2020.

  2. Node Embeddings • Intuition: Find embeddings of nodes in a d-dimensional space so that “similar” nodes in the graph have embeddings that are close together. (Figure: input graph → output embedding space.)

  3. Setup • Assume we have a graph G: • V is the vertex set (i.e., node set). • A is the adjacency matrix (assume binary).
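As a concrete instance of this setup, here is a toy graph in the form the slide assumes (the node labels and edges are illustrative, not from the talk):

```python
import numpy as np

# Toy instance of the slide's setup: a graph G with vertex set V
# and a binary adjacency matrix A (illustrative values).
V = [0, 1, 2, 3]
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])

# For an undirected graph, A is symmetric with 0/1 entries.
assert (A == A.T).all()
```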

  4. Embedding Nodes • Goal: to encode nodes so that similarity in the embedding space (e.g., dot product) approximates similarity in the original network: similarity(u, v) ≈ z_v^T z_u

  5. Random Walk Embeddings: Basic Idea • z_u^T z_v ≈ probability that u and v co-occur on a random walk over the network. 1. Estimate the probability of visiting node v on a random walk starting from node u, using some random walk strategy R. 2. Optimize embeddings to encode these random walk statistics.
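The two steps above can be sketched as follows; the adjacency-list graph and helper names are illustrative, not from the slides:

```python
import random

# Hypothetical adjacency-list graph (illustrative only).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(graph, start, length, rng=random):
    """Sample one uniform random walk of `length` steps from `start`."""
    walk = [start]
    for _ in range(length):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

def estimate_visit_probs(graph, u, num_walks=1000, length=5, rng=random):
    """Sketch of steps 1-2: estimate how often each node v is visited
    on short random walks from u -- the statistic the embeddings are
    then optimized to encode."""
    counts = {v: 0 for v in graph}
    for _ in range(num_walks):
        for v in random_walk(graph, u, length, rng)[1:]:
            counts[v] += 1
    total = num_walks * length
    return {v: c / total for v, c in counts.items()}
```

A biased strategy R (as in node2vec, later slides) would only change how the next step in `random_walk` is drawn.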

  6. Algorithm/Optimization of Random Walk Embeddings 1. Run short random walks starting from each node on the graph using some strategy R. 2. For each node u, collect N_R(u), the multiset* of nodes visited on random walks starting from u. (*N_R(u) can have repeat elements since nodes can be visited multiple times on random walks.) 3. Optimize embeddings according to: L = −Σ_{u∈V} Σ_{v∈N_R(u)} log P(v | z_u), where P(v | z_u) = exp(z_u^T z_v) / Σ_{n∈V} exp(z_u^T z_n). In practice, the normalizing sum is approximated by random sampling based on some distribution over nodes.
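A minimal sketch of the objective in step 3, using the full softmax over V (function and variable names are mine, not the paper's):

```python
import numpy as np

def walk_loss(Z, walk_neighborhoods):
    """L = -sum_{u in V} sum_{v in N_R(u)} log P(v | z_u), with
    P(v | z_u) = exp(z_u . z_v) / sum_{n in V} exp(z_u . z_n).

    Z: (|V|, d) embedding matrix; walk_neighborhoods maps node u to
    the multiset N_R(u) collected from its random walks.
    """
    loss = 0.0
    for u, neighborhood in walk_neighborhoods.items():
        scores = Z @ Z[u]                                # z_u . z_n for all n
        log_softmax = scores - np.log(np.exp(scores).sum())
        for v in neighborhood:
            loss -= log_softmax[v]                       # -log P(v | z_u)
    return loss
```

The full-softmax denominator here is exactly the sum that negative sampling approximates in practice; a trainable version would minimize this loss over Z by gradient descent.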

  7. Node2vec: Biased Random Walks • Idea: use flexible, biased random walks that can trade off between local and global views of the network (Grover and Leskovec, 2016). • BFS (Breadth-First Search) and DFS (Depth-First Search): two classic strategies to define a neighborhood N(u) of a given node u. (Figure: a graph with nodes u and s_1, …, s_9.) N_BFS(u) = {s_1, s_2, s_3}: local, microscopic view. N_DFS(u) = {s_4, s_5, s_6}: global, macroscopic view.
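The two neighborhood strategies can be sketched as truncated BFS and DFS from u, keeping the first k distinct nodes reached (a simplified illustration; the helper names and graph are made up):

```python
from collections import deque

def bfs_neighborhood(graph, u, k):
    """First k distinct nodes reached breadth-first from u (local view)."""
    seen, order, frontier = {u}, [], deque([u])
    while frontier and len(order) < k:
        for x in graph[frontier.popleft()]:
            if x not in seen and len(order) < k:
                seen.add(x)
                order.append(x)
                frontier.append(x)
    return order

def dfs_neighborhood(graph, u, k):
    """First k distinct nodes reached depth-first from u (global view)."""
    seen, order, stack = {u}, [], [u]
    while stack and len(order) < k:
        for x in graph[stack.pop()]:
            if x not in seen and len(order) < k:
                seen.add(x)
                order.append(x)
                stack.append(x)
    return order

# A small tree-shaped graph: BFS stays near node 0, DFS wanders deeper.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
```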

  8. Combine BFS + DFS by a Ratio • Biased random walk R that, given a node u, generates a neighborhood N_R(u). Suppose the walker came to w from s_1: where to go next? Unnormalized transition probabilities: 1/q back to s_1, 1 to a neighbor at distance 1 from s_1 (e.g., s_2), and 1/r to a neighbor at distance 2 from s_1 (e.g., s_3). • Two parameters: • Return parameter q: return back to the previous node. BFS-like walk: low value of q. • Walk-away parameter r: moving outwards (DFS) vs. inwards (BFS). DFS-like walk: low value of r.
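A sketch of the unnormalized transition weights described above, using the slide's q (return) and r (walk-away) parameters; note the published paper calls these p and q. The function and graph names are illustrative:

```python
def transition_weights(graph, prev, curr, q, r):
    """Unnormalized node2vec transition weights out of `curr`, for a
    walker that just arrived from `prev` (q = return parameter,
    r = walk-away parameter, following the slide's notation)."""
    weights = {}
    for x in graph[curr]:
        if x == prev:                # return to the previous node
            weights[x] = 1.0 / q
        elif prev in graph[x]:       # distance 1 from prev (BFS-like step)
            weights[x] = 1.0
        else:                        # distance 2 from prev (DFS-like step)
            weights[x] = 1.0 / r
    return weights

# Illustrative graph: walker came to node 2 from node 0.
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
```

Normalizing these weights gives the next-step distribution; low q biases the walk back toward the start (BFS-like), low r pushes it outward (DFS-like).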

  9. Benchmarks: Node Classification & Link Prediction (Figure: a graph with unlabeled nodes “?” fed to machine learning for node classification, and candidate edges “?” fed to machine learning for link prediction.)

  10. Empirical Results (Figures: benchmark results for link prediction and node classification.)

  11. Advantages of Node2Vec • node2vec performs better on node classification compared with other node embedding methods. • Random walk approaches are generally more efficient (i.e., O(|E|) vs. O(|V|²)). • (Note: in general, one must choose a definition of node similarity that matches the application.)

  12. Other random walk node embedding works • Different kinds of biased random walks: • Based on node attributes (Dong et al., 2017). • Based on learned weights (Abu-El-Haija et al., 2017). • Alternative optimization schemes: • Directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE from Tang et al., 2015). • Network preprocessing techniques: • Run random walks on modified versions of the original network (e.g., Ribeiro et al. 2017’s struc2vec, Chen et al. 2016’s HARP).
