Representation Learning on Networks
Yuxiao Dong, Microsoft Research, Redmond
Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University), Hao Ma (MSR & Facebook AI), and Kuansan Wang (MSR)
Networks
Social networks; economic networks; biomedical networks; information networks; the Internet; networks of neurons.
Slide credit: Jure Leskovec
The Network & Graph Mining Paradigm
$y_{ij}$: node $v_i$'s $j$-th feature, e.g., $v_i$'s PageRank value.
Feature engineering: hand-crafted feature matrix $X$ → machine learning models.
Graph & network applications:
• Node label inference;
• Link prediction;
• User behavior;
• …
Representation Learning for Networks
Feature learning replaces feature engineering: a learned latent feature matrix $Z$ → machine learning models.
Graph & network applications:
• Node label inference;
• Node clustering;
• Link prediction;
• …
Input: a network $G = (V, E)$. Output: $Z \in \mathbb{R}^{|V| \times k}$, $k \ll |V|$, i.e., a $k$-dim vector $z_v$ for each node $v$.
Network Embedding: Random Walk + Skip-Gram
Skip-gram (word2vec) slides a context window $w_{t-2}, w_{t-1}, w_t, w_{t+1}, w_{t+2}$ over:
• sentences in NLP;
• vertex paths (random walks) in networks.
Perozzi et al. DeepWalk: Online learning of social representations. In KDD'14, pp. 701–710.
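To make the random walk + skip-gram pipeline concrete, here is a minimal DeepWalk-style sketch in Python: generate truncated random walks, then feed them to word2vec's skip-gram. The toy graph (networkx's karate club) and all parameter values are illustrative assumptions, not the paper's setup.

```python
import random

import networkx as nx
from gensim.models import Word2Vec  # gensim 4.x API

def random_walk(G, start, length):
    """Uniform random walk of at most `length` nodes starting from `start`."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = list(G.neighbors(walk[-1]))
        if not nbrs:
            break
        walk.append(random.choice(nbrs))
    return [str(v) for v in walk]  # word2vec expects string tokens

G = nx.karate_club_graph()
# 10 walks of length 40 per node, a common DeepWalk-style setting
walks = [random_walk(G, v, 40) for _ in range(10) for v in G.nodes()]

# sg=1 selects skip-gram; negative=5 sets the number of negative samples
model = Word2Vec(walks, vector_size=64, window=5, sg=1, negative=5, min_count=0)
z0 = model.wv["0"]  # 64-dimensional embedding of node 0
```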
Random Walk Strategies
• Random walk: DeepWalk (walk length > 1); LINE (walk length = 1)
• Biased random walk: 2nd-order random walk (node2vec; see the sketch below)
• Metapath-guided random walk (metapath2vec)
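For the 2nd-order walk, a hedged sketch of node2vec's transition bias: `p` and `q` are the return and in-out parameters from the node2vec paper, and this toy version skips the alias-table precomputation that real implementations use for efficiency.

```python
import random

def biased_step(G, prev, curr, p=1.0, q=1.0):
    """Pick the next node of a 2nd-order walk given the previous node,
    weighting return (1/p), stay-close (1), and move-away (1/q) moves."""
    nbrs = list(G.neighbors(curr))
    weights = []
    for x in nbrs:
        if x == prev:               # returning to the previous node
            weights.append(1.0 / p)
        elif G.has_edge(x, prev):   # distance 1 from the previous node
            weights.append(1.0)
        else:                       # moving farther away
            weights.append(1.0 / q)
    return random.choices(nbrs, weights=weights, k=1)[0]
```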
Application: Embedding the Heterogeneous Academic Graph with metapath2vec
Microsoft Academic Graph:
• https://academic.microsoft.com/
• https://www.openacademic.ai/oag/
• metapath2vec: Scalable representation learning for heterogeneous networks. In KDD 2017.
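A sketch of the metapath-guided walk behind metapath2vec, under an assumed heterogeneous graph whose nodes carry a hypothetical 'type' attribute; metapath2vec additionally makes skip-gram's negative sampling type-aware, which is omitted here.

```python
import random

def metapath_walk(G, start, metapath=("author", "paper", "author"), length=20):
    """Random walk constrained to follow node types along a metapath,
    e.g., Author -> Paper -> Author (APA) in an academic graph.
    Assumes the first and last metapath types match, so the path cycles."""
    walk = [start]
    for t in range(1, length):
        wanted = metapath[t % (len(metapath) - 1)]
        candidates = [x for x in G.neighbors(walk[-1])
                      if G.nodes[x].get("type") == wanted]
        if not candidates:
            break
        walk.append(random.choice(candidates))
    return walk
```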
Application 1: Related Venues
• https://academic.microsoft.com/
• https://www.openacademic.ai/oag/
• metapath2vec: Scalable representation learning for heterogeneous networks. In KDD 2017.
Application 2: Similarity Search (Institution)
Example institutions: Microsoft, Facebook, Stanford, Harvard, Johns Hopkins, UChicago, AT&T Labs, Google, MIT, Yale, Columbia, CMU.
• https://academic.microsoft.com/
• https://www.openacademic.ai/oag/
• metapath2vec: Scalable representation learning for heterogeneous networks. In KDD 2017.
Network Embedding
Input: adjacency matrix $A$ → random walk → skip-gram → Output: vectors $Z$.
• Random walk: DeepWalk (walk length > 1); LINE (walk length = 1)
• Biased random walk: 2nd-order random walk (node2vec)
• Metapath-guided random walk (metapath2vec)
Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization
Each of DeepWalk, LINE, PTE, and node2vec implicitly factorizes a closed-form matrix.
Notation: $A$, adjacency matrix; $D$, degree matrix; $b$, #negative samples; $T$, context window size; $\mathrm{vol}(G) = \sum_i \sum_j A_{ij}$.
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
Understanding Random Walk + Skip-Gram
Skip-gram with negative sampling implicitly factorizes the shifted PMI matrix $\log\left(\frac{\#(w,c)\,|\mathcal{D}|}{b\,\#(w)\,\#(c)}\right)$, where:
• $\#(w,c)$: co-occurrence count of $w$ and $c$;
• $\#(w)$: occurrence count of node $w$;
• $\#(c)$: occurrence count of context $c$;
• $\mathcal{D}$: the node–context pair $(w,c)$ multiset;
• $|\mathcal{D}|$: number of node–context pairs.
Network notation: adjacency matrix $A$; degree matrix $D$; volume of $G$: $\mathrm{vol}(G)$.
Levy and Goldberg. Neural word embeddings as implicit matrix factorization. In NIPS 2014.
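A numeric sketch of this matrix, assuming walks are lists of string tokens as in the earlier DeepWalk sketch: count (node, context) pairs within a window, then take the shifted PMI. For simplicity, unobserved pairs are left at 0 instead of $-\infty$.

```python
import numpy as np
from collections import Counter

def shifted_pmi(walks, window=5, b=5):
    """Build log(#(w,c)|D| / (b #(w) #(c))) from walk co-occurrences."""
    pair_cnt, w_cnt, c_cnt = Counter(), Counter(), Counter()
    for walk in walks:
        for i, w in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j == i:
                    continue
                pair_cnt[(w, walk[j])] += 1
                w_cnt[w] += 1
                c_cnt[walk[j]] += 1
    D = sum(pair_cnt.values())                 # |D|: total number of pairs
    nodes = sorted(set(w_cnt) | set(c_cnt))
    idx = {v: k for k, v in enumerate(nodes)}
    M = np.zeros((len(nodes), len(nodes)))
    for (w, c), n in pair_cnt.items():         # unobserved pairs stay 0
        M[idx[w], idx[c]] = np.log(n * D / (b * w_cnt[w] * c_cnt[c]))
    return M, nodes
```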
Understanding Random Walk + Skip-Gram
Partition the multiset $\mathcal{D}$ into sub-multisets according to the way in which each node and its context appear in a random-walk node sequence, distinguishing direction and distance. More formally, for $r = 1, 2, \cdots, T$, define $\mathcal{D}_{\overrightarrow{r}}$ (pairs where the context $c$ appears $r$ steps after $w$) and $\mathcal{D}_{\overleftarrow{r}}$ (pairs where $c$ appears $r$ steps before $w$).
Understanding Random Walk + Skip-Gram
As the length of the random walk $L \to \infty$, the co-occurrence statistics of the $(w,c)$ multiset $\mathcal{D}$ converge: $\frac{\#(w,c)_{\overrightarrow{r}}}{|\mathcal{D}_{\overrightarrow{r}}|} \xrightarrow{p} \frac{d_w}{\mathrm{vol}(G)}\left(P^r\right)_{w,c}$, where $P = D^{-1}A$ and $d_w$ is the degree of $w$.
Understanding Random Walk + Skip-Gram
DeepWalk is asymptotically and implicitly factorizing $\log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^r\right)D^{-1}\right)$, with $A$ the adjacency matrix, $D$ the degree matrix, $\mathrm{vol}(G) = \sum_i \sum_j A_{ij}$, $b$ the number of negative samples, and $T$ the context window size.
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
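As a sanity check, this matrix can be constructed directly (densely); `A` is assumed to be the dense numpy adjacency matrix of an undirected graph with no isolated nodes, so all degrees are positive.

```python
import numpy as np

def deepwalk_matrix(A, T=10, b=1):
    """Compute M = vol(G)/(bT) * (sum_{r=1..T} (D^-1 A)^r) * D^-1."""
    vol = A.sum()                    # vol(G) = sum_ij A_ij
    d = A.sum(axis=1)                # node degrees
    P = A / d[:, None]               # D^-1 A: random-walk transition matrix
    S = np.zeros_like(P)
    P_r = np.eye(A.shape[0])
    for _ in range(T):               # accumulate sum_{r=1}^{T} P^r
        P_r = P_r @ P
        S += P_r
    return vol / (b * T) * S / d[None, :]   # trailing D^-1 scales columns
```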
Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization
Analogous closed-form matrices hold for LINE, PTE, and node2vec; e.g., LINE (walk length = 1) factorizes $\log\left(\frac{\mathrm{vol}(G)}{b} D^{-1}AD^{-1}\right)$.
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18. The most cited WSDM'18 paper as of May 2019.
NetMF: Explicitly Factorizing the DeepWalk Matrix
Rather than sampling walks $w_{t-2}, \dots, w_{t+2}$ and running skip-gram, compute the matrix that DeepWalk implicitly and asymptotically factorizes, and factorize it directly.
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
The NetMF Algorithm
1. Construction: build the DeepWalk matrix $M = \frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^r\right)D^{-1}$ and take the truncated elementwise logarithm $\log(\max(M, 1))$.
2. Factorization: factorize the result with (truncated) SVD to obtain the embedding.
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
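A minimal NetMF sketch following these two steps, reusing the `deepwalk_matrix` helper defined above. It uses full SVD for brevity where the paper uses truncated SVD, so treat it as illustrative rather than the released implementation.

```python
import numpy as np

def netmf(A, dim=128, T=10, b=1):
    M = deepwalk_matrix(A, T=T, b=b)       # 1. construction
    M_log = np.log(np.maximum(M, 1.0))     # truncated elementwise logarithm
    U, s, _ = np.linalg.svd(M_log)         # 2. factorization
    return U[:, :dim] * np.sqrt(s[:dim])   # embedding = U_d * Sigma_d^{1/2}
```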
Results
• Predictive performance when varying the ratio of training data; the x-axis is the ratio of labeled data (%).
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
Results
Explicit matrix factorization (NetMF) offers performance gains over implicit matrix factorization (DeepWalk & LINE).
1. Qiu et al. Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In WSDM'18.
Network Embedding
Input: adjacency matrix $A$ → Output: vectors.
• Random walk + skip-gram: DeepWalk, LINE, node2vec, metapath2vec.
• (Dense) matrix factorization: $S = f(A)$, i.e., incorporate network structures $A$ into the similarity matrix $S$, and then factorize $S$: NetMF.
Challenges
The similarity matrix $S$ is dense, so constructing and factorizing it does not scale: NetMF is not practical for very large networks.
NetMF
1. (Dense) construction of $S$; 2. (Dense) factorization. How can we solve this issue?
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
NetSMF: Sparse
1. Sparse construction; 2. Sparse factorization.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
Sparsify $S$
For a random-walk matrix polynomial $L = D - \sum_{r=1}^{T} \alpha_r D (D^{-1}A)^r$, where $\sum_{r=1}^{T} \alpha_r = 1$ and the $\alpha_r$ are non-negative, one can construct a $(1+\epsilon)$-spectral sparsifier $\widetilde{L}$ with $O(n \log n / \epsilon^2)$ non-zeros in time $O(T^2 m \log n / \epsilon^2)$ for undirected graphs.
• Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng. Efficient sampling for Gaussian graphical models via spectral sparsification. COLT 2015.
• Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng. Spectral sparsification of random-walk matrix polynomials. arXiv:1502.03496.
Sparsify $S$
Applying this sparsifier to the random-walk matrix polynomial inside the DeepWalk matrix yields a sparse approximation of $S$.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
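A heavily simplified path-sampling sketch in the spirit of this sparse construction: sample an edge, extend it into a length-$r$ path by random walks, and record only the path's endpoints as a nonzero entry. The real NetSMF algorithm also reweights each sample to preserve the spectral guarantee; that reweighting is omitted here, so this is a structural illustration only.

```python
import random
from collections import Counter

def sample_sparse_entries(G, T=10, num_samples=100_000):
    """Return sampled (u, v) endpoint counts; assumes no isolated nodes."""
    edges = list(G.edges())
    entries = Counter()
    for _ in range(num_samples):
        u, v = random.choice(edges)
        r = random.randint(1, T)        # path length, uniform over 1..T
        k = random.randint(1, r)        # split the path around the edge
        for _ in range(k - 1):          # walk k-1 steps back from u
            u = random.choice(list(G.neighbors(u)))
        for _ in range(r - k):          # walk r-k steps forward from v
            v = random.choice(list(G.neighbors(v)))
        entries[(u, v)] += 1            # a nonzero of the sparsified matrix
    return entries
```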
NetSMF: Sparse Factorization
Factorize the constructed sparse matrix with truncated SVD.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
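The factorization step can be sketched with scipy's sparse truncated SVD, which computes only the top-$d$ singular triplets without densifying the matrix; the randomized SVD used in the paper would be a drop-in alternative.

```python
import numpy as np
from scipy.sparse.linalg import svds

def factorize_sparse(M_sparse, dim=128):
    """M_sparse: a scipy.sparse matrix (sparsified, log-transformed)."""
    U, s, _ = svds(M_sparse.astype(np.float64), k=dim)
    order = np.argsort(-s)               # svds returns ascending order
    return U[:, order] * np.sqrt(s[order])
```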
NetSMF: Bounded Approximation Error
The sparsified matrix approximates the exact NetMF matrix with a provable error bound.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
#non-zeros: ~4.5 quadrillion → ~45 billion.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
Effectiveness:
• NetSMF (sparse MF) ≈ NetMF (explicit MF) > DeepWalk/LINE (implicit MF).
Efficiency:
• Sparse MF can handle billion-scale network embedding.
1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019.
Embedding Dimension? 1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019
Network Embedding
Input: adjacency matrix $A$ → Output: vectors.
• Random walk + skip-gram: DeepWalk, LINE, node2vec, metapath2vec.
• (Dense) matrix factorization of $S = f(A)$: NetMF.
• Sparsify $S$ + (sparse) matrix factorization: NetSMF.
Incorporate network structures $A$ into the similarity matrix $S$, and then factorize $S$.
ProNE: Faster and More Scalable Network Embedding
1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019.
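A rough sketch of ProNE's second stage, spectral propagation, under simplifying assumptions: the paper enhances an initial embedding (obtained by fast sparse matrix factorization) with a band-pass graph filter expanded via Chebyshev polynomials, whereas this toy stand-in applies a few steps of plain low-pass propagation with the random-walk normalized Laplacian.

```python
import numpy as np
import scipy.sparse as sp

def propagate(A, Z, steps=3, mu=0.2):
    """A: scipy sparse adjacency; Z: n x d initial embedding matrix."""
    n = A.shape[0]
    deg = np.asarray(A.sum(axis=1)).ravel()
    D_inv = sp.diags(1.0 / np.maximum(deg, 1))
    L = sp.eye(n) - D_inv @ A        # random-walk normalized Laplacian
    for _ in range(steps):
        Z = Z - mu * (L @ Z)         # low-pass smoothing of the embedding
    return Z
```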