
Multi-Hop RC, HotpotQA & GNNs – Select, Answer and Explain: PowerPoint PPT Presentation



  1. Multi-Hop RC, HotpotQA & GNNs Select, Answer and Explain: Interpretable Multi-hop Reading Comprehension over Multiple Documents – Tu et al., AAAI 2020 Presented By: Lovish Madaan

  2. References • HotpotQA - Peng Qi (Stanford) • GNNs - Jure Leskovec (Stanford), AAAI 2019 Tutorial by William Hamilton (McGill) • Some elements and images borrowed from Tu et al. (AAAI 2020), Yang et al. (EMNLP 2018), and Jay Alammar

  3. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews

  4. The Promise of Question Answering In which city was Facebook first launched? Cambridge, Massachusetts. This is because Mark Zuckerberg and his business partners launched it from his Harvard dormitory [1], and Harvard is located in Cambridge, Massachusetts [2]. [1] https://en.wikipedia.org/wiki/Mark_Zuckerberg [2] https://en.wikipedia.org/wiki/Harvard_University

  5. Reality: the same question and ideal explained answer as above, but this is not what current systems deliver. Sorry, folks from Google!

  6. The Promise of Question Answering (callout: Multi-hop reasoning). In which city was Facebook first launched? Cambridge, Massachusetts. This is because Mark Zuckerberg and his business partners launched it from his Harvard dormitory [1], and Harvard is located in Cambridge, Massachusetts [2]. [1] https://en.wikipedia.org/wiki/Mark_Zuckerberg [2] https://en.wikipedia.org/wiki/Harvard_University

  7. The Promise of Question Answering (callouts: Multi-hop reasoning; Text-based, diverse). Same example as above.

  8. The Promise of Question Answering (callouts: Multi-hop reasoning; Explainability; Text-based, diverse). Same example as above.

  9. HotpotQA: Multi-hop reasoning; Explainability; Text-based, diverse; Comparison questions

  10. Multi-hop Reasoning across Multiple Documents • Previous work (SQuAD, TriviaQA, etc.): "When was Chris Martin born?" • HotpotQA: "When was the lead singer of Coldplay born?" (Rajpurkar et al., 2016; Joshi et al., 2017; Dunn et al., 2017)

  11. Explainability • Previous work: Answer only • HotpotQA: Answer + Supporting fact 1 + Supporting fact 2

  12. Evaluation Settings • Distractor Setting • 2 gold paragraphs + 8 distractor paragraphs retrieved via information retrieval • Fullwiki Setting • Entire Wikipedia as context

  13. Types of Instances • Bridge Entity Questions • Comparison Questions

  14. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews

  15. Multi-hop RC – Previous Works • Adapt techniques from single-hop QA • Use Graph Neural Networks (GNNs) • Cao et al., 2018 – Build entity graph and realize multi-hop reasoning

  16. Shortcomings – Previous Works • Concatenate multiple documents / process documents separately • No document filtering • Current applications of GNNs • Entities as nodes – either pre-specified or extracted with NER • Further processing is needed if the answer is not an entity

  17. Select, Answer and Explain (SAE)

  18. Preprocessing & Inputs • Question and set of documents • Answer text • Set of labelled supporting sentences from each document • 0/1 label for each document D_j • Answer type: "Span" / "Yes" / "No"

  19. Select Module • [CLS] + Q + [SEP] + D + [SEP] • One Approach – Use BCE with [CLS] embeddings as features • Neglects inter-document interactions
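The baseline mentioned on this slide, scoring each question/document pair independently with binary cross-entropy over its [CLS] embedding, can be sketched as follows. The embeddings, labels, and weights here are toy placeholders, not the paper's actual features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def doc_relevance_bce(cls_embeddings, labels, w, b):
    """Score each (question, document) pair independently from its [CLS]
    embedding and compute binary cross-entropy against the 0/1 gold labels.
    Each document is scored in isolation, so inter-document interactions
    are ignored -- the shortcoming the MHSA layer addresses."""
    logits = cls_embeddings @ w + b          # one logit per document
    probs = sigmoid(logits)
    eps = 1e-12
    loss = -np.mean(labels * np.log(probs + eps)
                    + (1 - labels) * np.log(1 - probs + eps))
    return probs, loss

# Toy example: 4 candidate documents, 8-dim [CLS] embeddings, 2 gold docs.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
y = np.array([1.0, 0.0, 1.0, 0.0])
probs, loss = doc_relevance_bce(X, y, w=rng.normal(size=8), b=0.0)
print(probs.shape)   # (4,)
```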

  20. MHSA – Single Attention Head X – matrix of [CLS] embeddings of question/document pairs

  21. MHSA – Multiple Attention Heads Output is the matrix of modified [CLS] embeddings having contextual information
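A minimal numpy sketch of multi-head self-attention over the matrix X of [CLS] embeddings, assuming one row per question/document pair; the projection matrices are illustrative stand-ins for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Self-attention over the rows of X (one [CLS] embedding per
    question/document pair), so each document's representation is updated
    with contextual information from every other candidate document."""
    n, d = X.shape
    dh = d // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(dh)   # (n, n) pairwise scores
        heads.append(softmax(scores, axis=-1) @ V[:, s])
    return np.concatenate(heads, axis=1) @ Wo        # modified [CLS] matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 16))                        # 10 candidate documents
W = [rng.normal(size=(16, 16)) for _ in range(4)]
out = multi_head_self_attention(X, *W, n_heads=4)
print(out.shape)   # (10, 16)
```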

  22. Pairwise Bi-Linear Layer • S(D_j): score for each document (0 / 1 / 2) • m_{j,k} = 1 if S(D_j) > S(D_k), 0 if S(D_j) ≤ S(D_k) • L = − Σ_{j=0}^{n} Σ_{k=0, k≠j}^{n} [ m_{j,k} log P(D_j, D_k) + (1 − m_{j,k}) log(1 − P(D_j, D_k)) ] • R_j = Σ_k 1[P(D_j, D_k) > 0.5] – relevance score for each document • Take the top-k documents according to this relevance score
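A pairwise ranking loss of this kind can be sketched as below: the gold per-document scores induce a binary target for each ordered document pair, the model's pairwise probabilities are penalized with binary cross-entropy, and each document's relevance is the number of pairwise comparisons it "wins". Function and variable names are illustrative, not the paper's.

```python
import numpy as np

def pairwise_ranking(doc_scores, pair_probs, eps=1e-12):
    """doc_scores: gold per-document scores (e.g. 0/1/2).
    pair_probs[j, k]: model's probability that doc j outranks doc k.
    Returns the pairwise BCE loss and a per-document relevance count
    (how many comparisons each document wins with probability > 0.5)."""
    n = len(doc_scores)
    loss, relevance = 0.0, np.zeros(n)
    for j in range(n):
        for k in range(n):
            if j == k:
                continue
            m = 1.0 if doc_scores[j] > doc_scores[k] else 0.0
            p = pair_probs[j, k]
            loss -= m * np.log(p + eps) + (1 - m) * np.log(1 - p + eps)
            relevance[j] += p > 0.5
    return loss, relevance

# Toy example: document 0 is clearly ranked above the other two.
probs = np.array([[0.5, 0.9, 0.8],
                  [0.1, 0.5, 0.6],
                  [0.2, 0.4, 0.5]])
loss, rel = pairwise_ranking(np.array([2, 1, 0]), probs)
top_k = np.argsort(-rel)[:2]
print(rel, top_k)   # [2. 1. 0.] [0 1]
```

The relevance counts recover the gold ordering here, and the top-k cut keeps documents 0 and 1.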

  23. Answer Prediction • Gold documents extracted from the Select module • Input: [CLS] + Q + [SEP] + Context + [SEP] • BERT token embeddings H ∈ ℝ^{M×d} → 2-layer MLP (f_span) → Y ∈ ℝ^{M×2} (span start/end logits)

  24. Contextual Sentence Embeddings • Sentence representation • Self-attention weights • Weighted representation → 2-layer MLP (f_att) → 0/1 label

  25. Contextual Sentence Embeddings - 2 • Motivation for adding start and end span probabilities: the answer span indicates a supporting sentence • Final sentence embeddings:

  26. Sentence Graph • Construct a graph with the following properties: • Nodes represent the sentences • Each node has label 0/1 (supporting sentence) • 3 types of edges • Between nodes present in the same document (Type 1) • Between nodes of different documents if they have named entities / noun phrases (can be different) present in the question (Type 2) • Between nodes of different documents if they have the same named entity / noun phrase (Type 3)

  27. Sentence Graph
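The three edge types above can be sketched as follows. The sentence records (a document id plus a set of named entities / noun phrases) are an assumed illustrative structure, not the paper's exact preprocessing, and a pair satisfying several conditions simply gets all matching edges here.

```python
from itertools import combinations

def build_sentence_graph(sentences, question_entities):
    """Build typed edges over sentence nodes.
    Type 1: both sentences in the same document.
    Type 2: different documents, and each sentence mentions some (possibly
            different) entity that also appears in the question.
    Type 3: different documents sharing the same entity / noun phrase."""
    edges = []
    for i, j in combinations(range(len(sentences)), 2):
        a, b = sentences[i], sentences[j]
        if a["doc"] == b["doc"]:
            edges.append((i, j, 1))
        else:
            if (a["ents"] & question_entities) and (b["ents"] & question_entities):
                edges.append((i, j, 2))
            if a["ents"] & b["ents"]:
                edges.append((i, j, 3))
    return edges

# Toy example with two documents and three sentences.
sents = [
    {"doc": 0, "ents": {"Scott Derrickson", "Deliver Us from Evil"}},
    {"doc": 0, "ents": {"Scott Derrickson"}},
    {"doc": 1, "ents": {"Ed Wood", "Scott Derrickson"}},
]
edges = build_sentence_graph(sents, {"Scott Derrickson", "Ed Wood"})
print(edges)   # [(0, 1, 1), (0, 2, 2), (0, 2, 3), (1, 2, 2), (1, 2, 3)]
```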

  28. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews

  29. Images, Text/Speech: the modern deep learning toolbox is designed for simple sequences & grids. (Jure Leskovec, Stanford University)

  30. The Math • Average neighbor messages and apply a neural network. • Initial "layer 0" embeddings are equal to node features: h_v^0 = x_v • h_v^k = σ( W_k · Σ_{u ∈ N(v)} h_u^{k−1} / |N(v)| + B_k · h_v^{k−1} ), ∀ k > 0 • h_v^k: k-th layer embedding of v; σ: non-linearity (e.g., ReLU or tanh); the first term averages the neighbors' previous-layer embeddings. (Tutorial on Graph Representation Learning, AAAI 2019)
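The layer update above can be sketched in a few lines of numpy; tanh is an assumed choice of non-linearity, and W and B stand in for the trainable matrices W_k and B_k.

```python
import numpy as np

def gnn_layer(H, adj, W, B):
    """One layer of neighborhood aggregation: average the neighbors'
    previous-layer embeddings, transform with W, add the node's own
    transformed embedding (via B), and apply a non-linearity."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                       # guard isolated nodes
    neighbor_avg = (adj @ H) / deg            # (1/|N(v)|) * sum over N(v)
    return np.tanh(neighbor_avg @ W + H @ B)

# Toy graph: a 3-node path 0-1-2, with 4-dim node features.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
rng = np.random.default_rng(0)
H0 = rng.normal(size=(3, 4))                  # h^0 = node features x_v
H1 = gnn_layer(H0, adj, W=rng.normal(size=(4, 4)), B=rng.normal(size=(4, 4)))
print(H1.shape)   # (3, 4)
```

Stacking K such layers gives the K-layer embeddings z_v = h_v^K used later for training.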

  31. Graph Attention Networks • Augment the basic graph neural network model with attention: h_v^k = σ( Σ_{u ∈ N(v) ∪ {v}} α_{v,u} W_k h_u^{k−1} ) • σ: non-linearity; the sum runs over all neighbors (and the node itself). (Tutorial on Graph Representation Learning, AAAI 2019)
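A minimal sketch of such an attention layer: learned coefficients α_{v,u}, normalized over N(v) ∪ {v}, replace the uniform 1/|N(v)| weights of the mean-aggregation layer. The additive pair-scoring vector `a` is a simplified illustrative choice.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, adj, W, a):
    """One graph-attention layer: each node attends over itself and its
    neighbors; 'a' scores each pair of transformed embeddings."""
    n = H.shape[0]
    Z = H @ W
    out = np.zeros_like(Z)
    for v in range(n):
        nbrs = [u for u in range(n) if adj[v, u] or u == v]   # N(v) ∪ {v}
        scores = np.array([a @ np.concatenate([Z[v], Z[u]]) for u in nbrs])
        alpha = softmax(scores)               # attention weights over nbrs
        out[v] = np.tanh(sum(w * Z[u] for w, u in zip(alpha, nbrs)))
    return out

# Toy graph: two connected nodes with 4-dim features.
adj = np.array([[0., 1.], [1., 0.]])
rng = np.random.default_rng(0)
out = gat_layer(rng.normal(size=(2, 4)), adj,
                W=rng.normal(size=(4, 4)), a=rng.normal(size=8))
print(out.shape)   # (2, 4)
```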

  32. Training the Model (figure: node u in the input graph A is mapped to an embedding z_u). (Tutorial on Graph Representation Learning, AAAI 2019)

  33. Training the Model • h_v^0 = x_v • h_v^k = σ( W_k · Σ_{u ∈ N(v)} h_u^{k−1} / |N(v)| + B_k · h_v^{k−1} ), ∀ k ∈ {1, …, K} • z_v = h_v^K • W_k, B_k: trainable matrices (i.e., what we learn) • After K layers of neighborhood aggregation, we get output embeddings for each node. • Define a loss function on these embeddings and run stochastic gradient descent to train the aggregation parameters. (Tutorial on Graph Representation Learning, AAAI 2019)

  34. Training the Model • Directly train the model for a supervised task (e.g., node classification): L = Σ_{v ∈ V} [ y_v log σ(z_v^⊤ θ) + (1 − y_v) log(1 − σ(z_v^⊤ θ)) ] • z_v: output node embedding; y_v: node class label; θ: classification weights. (Tutorial on Graph Representation Learning, AAAI 2019)
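The supervised objective on this slide can be sketched as follows, written here as a negative log-likelihood to be minimized; the embeddings and labels are toy placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def node_classification_loss(Z, y, theta, eps=1e-12):
    """Binary cross-entropy for node classification on the final node
    embeddings z_v: each node's class probability is sigma(z_v . theta)."""
    p = sigmoid(Z @ theta)
    return -np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 8))                   # embeddings after K GNN layers
y = np.array([1., 0., 1., 1., 0.])
loss = node_classification_loss(Z, y, theta=rng.normal(size=8))
print(float(loss) > 0)   # True
```

Gradients of this loss flow back through z_v into the aggregation parameters W_k and B_k.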

  35. Overview of Model (figure: node u in the input graph mapped to embedding z_u). (Tutorial on Graph Representation Learning, AAAI 2019)

  36. Overview of Model (Tutorial on Graph Representation Learning, AAAI 2019)

  37. Overview of Model (Tutorial on Graph Representation Learning, AAAI 2019)

  38. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews

  39. Aggregation mechanism in SAE

  40. Graph Representation • Weighted sum of the embeddings of the nodes of the graph • The weights are given by
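A weighted-sum readout of this kind can be sketched as below, with a simple learned scorer producing softmax-normalized node weights; the scoring vector is an illustrative assumption, since the slide's weight formula is in the (missing) image.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_readout(node_embeddings, scorer_w):
    """Attention-style graph readout: score each node, normalize the
    scores with a softmax so they sum to 1, and return the weighted sum
    of node embeddings as a single graph-level vector."""
    weights = softmax(node_embeddings @ scorer_w)
    return weights, weights @ node_embeddings

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))                   # 6 sentence nodes, 8-dim each
w, g = graph_readout(H, rng.normal(size=8))
print(round(float(w.sum()), 6), g.shape)      # 1.0 (8,)
```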

  41. Answer and Explain Pipeline

  42. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews

  43. Dataset Details • Train – 90K • Validation/Dev – 7.4K • Test – 7.4K

  44. Results

  45. Ablation Study – Document Selection Module

  46. Ablation Study – Answer & Explain Module

  47. Ablation Study – Bridge / Comp. Questions

  48. Attention Heatmap Example Question - “Were Scott Derrickson and Ed Wood of the same nationality?”

  49. HotpotQA Leaderboard

  50. Topics • Introduction and HotpotQA • Select, Answer and Explain • GNNs • Answer and Explain • Results and Ablation Study • Reviews
