Heterogeneous Graph Transformer (WWW 2020)
Author • Second-year CS Ph.D. student, advised by Prof. Yizhou Sun • Bachelor's degree from Peking University, advised by Prof. Xuanzhe Liu • Publications and awards: WSDM 2018, WWW 2019, Best Paper Award, ICLR 2019 Workshop, ACL 2019, WWW 2020
Background • General GNN Framework • [Figure: source nodes s are aggregated (Aggr) at target node t]
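The aggregation step in the general GNN framework can be sketched as below. This is a minimal illustration, assuming a mean aggregator; the function name `aggregate_neighbors` and the toy graph are hypothetical and not taken from the talk.

```python
from collections import defaultdict

def aggregate_neighbors(edges, features):
    """Mean-aggregate the features of source nodes into each target node.

    edges: list of (source, target) pairs.
    features: dict mapping node -> list of floats.
    """
    incoming = defaultdict(list)
    for s, t in edges:
        incoming[t].append(features[s])
    out = {}
    for t, msgs in incoming.items():
        dim = len(msgs[0])
        out[t] = [sum(m[i] for m in msgs) / len(msgs) for i in range(dim)]
    return out

# Toy graph (assumed): three sources pointing at one target.
edges = [("s1", "t"), ("s2", "t"), ("s3", "t")]
features = {"s1": [1.0, 0.0], "s2": [0.0, 1.0], "s3": [2.0, 2.0], "t": [0.0, 0.0]}
h = aggregate_neighbors(edges, features)
```

A real layer would follow the aggregation with a learned update of the target representation; that part is left out here.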
Background • Graph Attention Network • [Figure: source nodes s are aggregated (Aggr) at target node t]
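In a graph attention network the aggregation weights are computed from the node features themselves. The sketch below assumes the common GAT scoring form, LeakyReLU(a · [h_t || h_s]) followed by a softmax over the neighbors; the shared linear projection is omitted, and the attention vector `a` and all values are placeholders.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_attention(h_t, h_sources, a):
    """Softmax-normalized attention of target t over its source neighbors.

    The shared projection W is treated as the identity to keep the sketch short.
    """
    scores = [leaky_relu(sum(w * x for w, x in zip(a, h_t + h_s)))
              for h_s in h_sources]
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

h_t = [1.0, 0.0]
h_sources = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [0.5, 0.5, 1.0, -1.0]                # hypothetical attention vector, length 2 * dim
alpha = gat_attention(h_t, h_sources, a)
```

The weights in `alpha` then replace the uniform 1/n factor of a mean aggregator.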
Background • Relational graph convolutional networks (R-GCN) • [Figure: neighbor representations h are transformed by relation-specific weight matrices W^(l), one per relation (r_1, r_2), before aggregation]
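The R-GCN idea of one weight matrix per relation can be sketched as follows. This assumes the usual formulation (per-relation normalization, a self-loop weight, ReLU); the helper names and toy values are hypothetical.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rgcn_layer(target, edges, h, W_rel, W_self):
    """One R-GCN update for `target`.

    edges: list of (source, relation, target) triples.
    W_rel: dict relation -> weight matrix; W_self: self-loop weight matrix.
    """
    out = matvec(W_self, h[target])
    by_rel = {}
    for s, r, t in edges:
        if t == target:
            by_rel.setdefault(r, []).append(s)
    for r, sources in by_rel.items():
        for s in sources:
            msg = matvec(W_rel[r], h[s])
            # Normalize by the number of neighbors under this relation.
            out = [o + m / len(sources) for o, m in zip(out, msg)]
    return [max(0.0, o) for o in out]    # ReLU

identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
h = {"t": [1.0, 1.0], "s1": [1.0, 0.0], "s2": [0.0, 2.0]}
edges = [("s1", "r1", "t"), ("s2", "r2", "t")]
h_t_new = rgcn_layer("t", edges, h, {"r1": identity, "r2": swap}, identity)
```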
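Background • Node classification • [Figure: example graph linking the film 西虹市首富 (Hello Mr. Billionaire), the label 喜剧 (comedy), the actor 沈腾 (Shen Teng), and the film 羞羞的铁拳 (Never Say Die), whose label is unknown (?)]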
Heterogeneous Information Networks (HIN) • Source: "Heterogeneous Graph Neural Network and its Applications in E-Commerce", Prof. Chuan Shi
OAG Graph
OAG Graph (continued)
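Tasks • Node classification: Paper-Field (L1) prediction, Paper-Field (L2) prediction, Paper-Venue prediction • Link prediction: author disambiguation • [Figure: papers p1, p2, p3 by authors sharing the name "zwn", affiliated with 哈工大 (Harbin Institute of Technology) and 上交 (Shanghai Jiao Tong University)]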
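Heterogeneous Graph • Directed graph • Type mapping functions $\tau(v)$ for nodes and $\phi(e)$ for edges • Example: $e = (\mathrm{HGT}, \mathrm{HAN})$, $v = \langle \text{Heterogeneous Graph Transformer} \rangle$, $\tau(v) = \text{paper}$, $\phi(e) = \text{cited}$
Meta Relation • For an edge $e = (s, t)$, e.g. $e = (\mathrm{HGT}, \mathrm{HAN})$: $\langle \tau(s), \phi(e), \tau(t) \rangle = \langle \text{paper}, \text{cited}, \text{paper} \rangle$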
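A meta relation is just a lookup through the two type mapping functions: the source node type, the edge type, and the target node type. The sketch below illustrates this on the HGT/HAN example; the `venue` node type and the `published_at` edge type are assumptions added to make the example slightly larger.

```python
from collections import Counter

# tau(v): node -> node type ("venue" is an assumed extra type)
node_type = {"HGT": "paper", "HAN": "paper", "WWW": "venue"}

# Edges as (source, target, phi(e)); "published_at" is an assumed edge type.
edges = [
    ("HGT", "HAN", "cited"),
    ("HGT", "WWW", "published_at"),
    ("HAN", "WWW", "published_at"),
]

def meta_relation(edge):
    """Return <tau(s), phi(e), tau(t)> for an edge e = (s, t)."""
    s, t, edge_type = edge
    return (node_type[s], edge_type, node_type[t])

meta_counts = Counter(meta_relation(e) for e in edges)
```

Grouping edges by meta relation like this is what lets a model keep separate parameters per relation while sharing them across all edges of the same kind.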
Model • Heterogeneous Mutual Attention • Heterogeneous Message Passing • Target-Specific Aggregation
Heterogeneous Mutual Attention • [Figure: source nodes s are aggregated (Aggr) at target node t]
Heterogeneous Mutual Attention (continued) • [Figure: target node t and source nodes s of type1 and type2, connected by edges of type edge1 and edge2]
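One way to picture heterogeneous mutual attention is as dot-product attention in which every parameter is indexed by a type. The single-head sketch below assumes the usual HGT formulation: key and query projections chosen by node type, a bilinear matrix chosen by edge type, and a scalar prior per meta relation. All names (`K_lin`, `Q_lin`, `W_att`, `mu`) and values are hypothetical.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def hetero_attention(target, neighbors, h, node_type, K_lin, Q_lin, W_att, mu):
    """Single-head attention of `target` over (source, edge_type) neighbors.

    K_lin / Q_lin are keyed by node type, W_att by edge type, and mu by the
    meta relation (source type, edge type, target type).
    """
    q = matvec(Q_lin[node_type[target]], h[target])
    d = len(q)
    scores = []
    for s, etype in neighbors:
        k = matvec(K_lin[node_type[s]], h[s])
        W = W_att[etype]
        bilinear = sum(k[i] * W[i][j] * q[j] for i in range(d) for j in range(d))
        prior = mu[(node_type[s], etype, node_type[target])]
        scores.append(bilinear * prior / math.sqrt(d))
    m = max(scores)                      # softmax over all neighbors of the target
    exps = [math.exp(x - m) for x in scores]
    z = sum(exps)
    return [e / z for e in exps]

I2 = [[1.0, 0.0], [0.0, 1.0]]
h = {"t": [1.0, 0.0], "s1": [1.0, 0.0], "s2": [0.0, 1.0]}
node_type = {"t": "paper", "s1": "paper", "s2": "author"}
neighbors = [("s1", "cited"), ("s2", "writes")]
alpha = hetero_attention(
    "t", neighbors, h, node_type,
    K_lin={"paper": I2, "author": I2}, Q_lin={"paper": I2},
    W_att={"cited": I2, "writes": I2},
    mu={("paper", "cited", "paper"): 1.0, ("author", "writes", "paper"): 1.0},
)
```

Note that the softmax runs over all neighbors regardless of type, so sources reached through different relations compete for the same attention mass.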
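Heterogeneous Message Passing • [Figure: source nodes s of type1 and type2 send messages to target node t over edges of type edge1 and edge2]
Target-Specific Aggregation • [Figure: messages from source nodes s of type1 and type2, over edges of type edge1 and edge2, are aggregated at target node t]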
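Message passing and target-specific aggregation can be sketched together: each source sends a message transformed by its node type and edge type, the messages are summed with the attention weights, and the result is projected by a matrix chosen by the target's type and added back to the target's previous representation. The names `M_lin`, `W_msg`, `A_lin` are hypothetical, and ReLU stands in for whichever nonlinearity the model uses.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def hetero_update(target, neighbors, alpha, h, node_type, M_lin, W_msg, A_lin):
    """Attention-weighted sum of typed messages, then a target-type projection.

    neighbors: list of (source, edge_type); alpha: one weight per neighbor.
    """
    agg = [0.0] * len(h[target])
    for (s, etype), a in zip(neighbors, alpha):
        # Message: source-type projection followed by an edge-type matrix.
        msg = matvec(W_msg[etype], matvec(M_lin[node_type[s]], h[s]))
        agg = [x + a * m for x, m in zip(agg, msg)]
    activated = [max(0.0, x) for x in agg]            # assumed nonlinearity (ReLU)
    projected = matvec(A_lin[node_type[target]], activated)
    return [p + x for p, x in zip(projected, h[target])]  # residual connection

I2 = [[1.0, 0.0], [0.0, 1.0]]
h = {"t": [1.0, 0.0], "s1": [1.0, 0.0], "s2": [0.0, 1.0]}
node_type = {"t": "paper", "s1": "paper", "s2": "author"}
neighbors = [("s1", "cited"), ("s2", "writes")]
h_t_new = hetero_update(
    "t", neighbors, [0.75, 0.25], h, node_type,
    M_lin={"paper": I2, "author": I2},
    W_msg={"cited": I2, "writes": I2},
    A_lin={"paper": I2},
)
```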
Overall Architecture
Dynamic Heterogeneous Graph • Nodes: $v = \mathrm{HGT}$, $v = \mathrm{HAN}$, $v = \mathrm{WWW}$ • Edges: $e = (\mathrm{HGT}, \mathrm{WWW})$ with timestamp 2020, $e = (\mathrm{HAN}, \mathrm{WWW})$ with timestamp 2019 • Node timestamps: HGT 2020, HAN 2019, WWW 2020 or 2019 depending on the edge
Relative Temporal Encoding • [Figure; labels: Transformer, Relative Temporal Encoding]
Relative Temporal Encoding (continued)
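Relative temporal encoding can be sketched as a positional encoding applied to the time gap between target and source rather than to a sequence position. The code below assumes the standard Transformer sinusoid and leaves out the learned linear projection that would normally follow it; function names are hypothetical.

```python
import math

def temporal_base(delta_t, dim):
    """Sinusoidal encoding of a time gap, in the style of Transformer positions."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(delta_t * freq))
        enc.append(math.cos(delta_t * freq))
    return enc[:dim]

def add_relative_time(h_source, t_target, t_source):
    """Add the encoding of the gap T(t) - T(s) to the source representation."""
    rte = temporal_base(t_target - t_source, len(h_source))
    return [x + r for x, r in zip(h_source, rte)]

# A 2019 source seen from a 2020 target, and a same-year source for comparison.
h_hat = add_relative_time([0.5, 0.5, 0.5, 0.5], t_target=2020, t_source=2019)
same_year = add_relative_time([0.5, 0.5, 0.5, 0.5], t_target=2020, t_source=2020)
```

Because only the gap enters the encoding, the same function covers time differences that never appeared during training.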
Overall Architecture (continued)
HGSampling • Keep a similar number of nodes and edges for each type • Keep the sampled sub-graph dense to minimize information loss and reduce sample variance
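The two goals above can be illustrated with a much simplified type-balanced sampler. This is not the HGSampling algorithm itself, only a sketch of its two stated goals: it draws the same number of nodes per type at each step, and only from neighbors of nodes already sampled. The function name and the toy graph are hypothetical.

```python
import random
from collections import defaultdict

def sample_by_type(adj, node_type, seeds, per_type, steps, rng):
    """Grow a subgraph from `seeds`, adding up to `per_type` nodes of each type per step."""
    sampled = set(seeds)
    for _ in range(steps):
        # Candidate pool per node type: unsampled neighbors of sampled nodes.
        budget = defaultdict(set)
        for u in sampled:
            for v in adj.get(u, ()):
                if v not in sampled:
                    budget[node_type[v]].add(v)
        if not budget:
            break
        for candidates in budget.values():
            pool = sorted(candidates)
            sampled.update(rng.sample(pool, min(per_type, len(pool))))
    return sampled

adj = {"p1": ["a1", "a2", "a3", "v1"], "a1": ["p1"], "a2": ["p1"], "a3": ["p1"], "v1": ["p1"]}
node_type = {"p1": "paper", "a1": "author", "a2": "author", "a3": "author", "v1": "venue"}
sub = sample_by_type(adj, node_type, ["p1"], per_type=2, steps=1, rng=random.Random(0))
```

Restricting candidates to neighbors of the current sample is what keeps the sub-graph connected; a uniform draw over all nodes would mostly return isolated nodes.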
Baselines • GCN • GAT • R-GCN • HetGNN (Heterogeneous Graph Neural Network, KDD 2019) • HAN (Heterogeneous Graph Attention Network, WWW 2019)
Input Features • Paper: use a pre-trained XLNet to get a representation of each word in the title, then average these weighted by each word's attention to get the paper's title representation • Author: average of the representations of the author's published papers • Field, venue, and institute: metapath2vec embeddings
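The paper and author features reduce to two averaging steps, sketched below. The word vectors and attention scores are placeholders standing in for XLNet outputs, which are not computed here; the function names are hypothetical.

```python
def weighted_average(vectors, weights):
    """Average `vectors` with the given non-negative weights."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) / total
            for i in range(dim)]

def mean(vectors):
    return weighted_average(vectors, [1.0] * len(vectors))

# Placeholder per-word embeddings and attention scores for one title.
word_vectors = [[1.0, 0.0], [0.0, 1.0]]
word_attention = [3.0, 1.0]

title_vec = weighted_average(word_vectors, word_attention)   # paper feature
author_vec = mean([title_vec, [0.25, 0.75]])                  # mean over the author's papers
```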
Results
Visualize Meta Relation Attention
Papers
Thanks!