
Commonsense for Generative Multi-Hop Question Answering Tasks



  1. Commonsense for Generative Multi-Hop Question Answering Tasks. Lisa Bauer*, Yicheng Wang*, Mohit Bansal

  2. Outline
     • Reading Comprehension Task (Lisa)
     • Reading Comprehension Baseline (Yicheng)
     • Commonsense Extraction (Lisa)
     • Commonsense Model Integration (Yicheng)
     • Results on NarrativeQA & WikiHop (Yicheng)

  3. Reading Comprehension Task
     • Context:
       "Sir Leicester Dedlock and his wife Lady Honoria live on his estate at Chesney Wold.."
       "..Unknown to Sir Leicester, Lady Dedlock had a lover .. before she married and had a daughter with him.."
       ".. Lady Dedlock believes her daughter is dead. The daughter, Esther, is in fact alive.."
       "..Esther sees Lady Dedlock at church and talks with her later at Chesney Wold though neither woman recognizes their connection.."
     • Question: “What is the connection between Esther and Lady Dedlock?”
     • Answers (reached by multi-hop reasoning): “Mother and daughter.” / “Mother and illegitimate child.”

  4. NarrativeQA [Kočiský et al., 2018]
     • Motivation
       • Answer spans: 44.05%
       • Outside knowledge required: 42%
     • Challenges
       • Intricate event timelines: “Who leads Mickey back to boxing after the HBO documentary is released?”
       • Large number of characters: “Why did Sophia go to Russia with Alexei, instead of John?”
       • Complex structure: “Why did Mickey have reservations about his fight in Atlantic City?”

  5-9. Baseline Multi-Hop Pointer-Generator (one list, revealed bullet by bullet over slides 5 to 9)
     • Success on multi-hop reasoning QA datasets requires a model to have:
       • Strong NLU capabilities
       • Ability to extract disjoint pieces of information
       • Tools to process long/interconnected context
       • Strong generative modelling capabilities (rare words)

  10-15. Baseline Multi-Hop Pointer-Generator [Figure, built up over slides 10 to 15: architecture diagram. Embedding Layer (query embedding of words w^Q_1 ... w^Q_m; context embedding of words w^C_1 ... w^C_n); Reasoning Layer (k Reasoning Cells); Self-Attention Layer (Self-Attention combined via "+"); Decoding Layer (Context Representation, Attention Distribution, Context Vector, p_sel, Generative Distribution, Final Distribution; decoder inputs x_{t-3}, x_{t-2}, x_{t-1}).]
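The labels on the decoding-layer figure (attention distribution, generative distribution, p_sel, final distribution) suggest a pointer-generator style mixture. The following is a minimal sketch of such a mixture, assuming that p_sel weights the generative distribution and 1 - p_sel weights copying from the context; the slides do not state this direction. The function name, the toy vocabulary, and all numbers are hypothetical.

```python
from collections import defaultdict


def final_distribution(p_sel, gen_dist, attn_dist, context_tokens):
    """Mix a vocabulary distribution with a copy distribution.

    Assumption (not stated on the slide): p_sel is the probability of
    generating from the vocabulary, and 1 - p_sel is the probability of
    copying a context token according to the attention distribution.
    """
    final = defaultdict(float)
    for word, prob in gen_dist.items():
        final[word] += p_sel * prob
    # Attention mass on repeated context tokens accumulates on the same word.
    for token, weight in zip(context_tokens, attn_dist):
        final[token] += (1.0 - p_sel) * weight
    return dict(final)


# Hypothetical toy inputs: "Esther" is not in the generative vocabulary.
gen_dist = {"mother": 0.5, "and": 0.3, "daughter": 0.2}
context_tokens = ["Esther", "is", "alive"]
attn_dist = [0.8, 0.1, 0.1]

mixed = final_distribution(0.7, gen_dist, attn_dist, context_tokens)
print(mixed)
```

In this toy run a context-only word such as "Esther" still receives probability mass through the copy term, which is one way a model of this kind could handle the rare words mentioned in the requirements list.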

  16. Baseline Reasoning Cell [Figure: one cell of the Reasoning Layer, with inputs Query and Context and components labeled Bi-LSTM, Bi-LSTM, and BiDAF.]

  17. Baseline Ablations

      | Model               | BLEU-1 (∆)  | BLEU-4 (∆)  | METEOR (∆)  | ROUGE-L (∆)  | CIDEr (∆)     |
      |---------------------|-------------|-------------|-------------|--------------|---------------|
      | Baseline            | 42.3 (-)    | 18.9 (-)    | 18.3 (-)    | 44.9 (-)     | 151.6 (-)     |
      | Single-Hop Baseline | 32.5 (-9.8) | 11.7 (-7.2) | 12.9 (-5.4) | 32.4 (-12.5) | 95.7 (-55.9)  |
      | Without ELMo        | 32.8 (-9.5) | 12.7 (-6.2) | 13.6 (-4.7) | 33.7 (-11.2) | 103.1 (-48.5) |
      | Without Self-Attn   | 37.0 (-5.3) | 16.4 (-2.5) | 15.6 (-2.7) | 38.6 (-6.3)  | 125.6 (-26.0) |
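The ∆ values in the ablation results are each ablated model's score minus the full baseline's score. A short check of that arithmetic, using only the numbers reported on the slide (the variable and function names are illustrative):

```python
METRICS = ["BLEU-1", "BLEU-4", "METEOR", "ROUGE-L", "CIDEr"]

# Scores copied from the ablation slide.
baseline = [42.3, 18.9, 18.3, 44.9, 151.6]
ablations = {
    "Single-Hop Baseline": [32.5, 11.7, 12.9, 32.4, 95.7],
    "Without ELMo": [32.8, 12.7, 13.6, 33.7, 103.1],
    "Without Self-Attn": [37.0, 16.4, 15.6, 38.6, 125.6],
}


def deltas(scores, reference):
    """Per-metric difference to the reference, rounded to one decimal."""
    return [round(s - r, 1) for s, r in zip(scores, reference)]


delta_table = {name: deltas(scores, baseline) for name, scores in ablations.items()}
for name, row in delta_table.items():
    print(name, dict(zip(METRICS, row)))
```

By these numbers, removing multi-hop reasoning (the single-hop baseline) costs the most on every metric, followed by removing ELMo and then removing self-attention.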

  18-19. Commonsense Requirements (slide 18 repeats the earlier list; slide 19 adds the last bullet)
     • Success on multi-hop reasoning QA datasets requires a model to have:
       • Strong NLU capabilities
       • Ability to extract disjoint pieces of information
       • Tools to process long/interconnected context
       • Strong generative modelling capabilities (rare words)
       • Ability to reason with implicit relations not mentioned in the context

  20. Commonsense Addition [Figure: the reasoning cell (Query, Context, Bi-LSTM, Bi-LSTM, BiDAF) with an open slot marked "???" for the Commonsense Relations w^CS_1, w^CS_2, ..., w^CS_l; caption: NOIC Reasoning Cell.]

  21-23. Types of Commonsense (one example added per slide)
     • Taxonomic
       Context: Paula, like Charmian, is subject to insomnia, and Paula, like Charmian, is unable to bear children.
       Question: What physical disorders do Paul and Charmian have in common?
       Answer: Insomnia and the inability to have kids
     • Cause/Effect
       Context: Having recently received an offer to be the principal of the Summerside School in the fall, Anne is keeping herself occupied.
       Question: What position does Anne take at Summerside School?
       Answer: Principal
     • Colloquialisms
       Context: To make ends meet and against better judgement, he takes a job as a croupier.
       Question: Why did Jack take the job?
       Answer: Jack took the job to pay for necessities.

  24. ConceptNet [Speer and Havasi, 2012]
     • A knowledge graph of semantic relations between natural language concepts
     • Has 28 million edges
     • Each edge represents one of 37 types of semantic relationship, e.g., UsedFor, FormOf, CapableOf, etc.
     [Figure: example graph around the node "ConceptNet", with neighbors including knowledge graph, semantic network, common sense knowledge, natural language understanding, artificial intelligence, word embeddings, crowdsourced knowledge, lexicography, games with a purpose, linked data, the Semantic Web, Web API, JSON-LD, open content, multilingual, and domain-general; motivating goal: let computers understand what people already know.]
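Each edge described above can be thought of as a (start concept, relation, end concept) triple. A minimal in-memory sketch of that data structure follows; the `Edge` type, the `neighbors` helper, and the three example triples are illustrative assumptions, not entries read from ConceptNet.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    start: str
    relation: str
    end: str


# Illustrative triples using the relation names mentioned on the slide.
edges = [
    Edge("knife", "UsedFor", "cutting"),
    Edge("bird", "CapableOf", "fly"),
    Edge("cars", "FormOf", "car"),
]

# Index edges by their start concept for quick neighbor lookup.
by_start = defaultdict(list)
for edge in edges:
    by_start[edge.start].append(edge)


def neighbors(concept):
    """Return (relation, end concept) pairs leaving the given concept."""
    return [(e.relation, e.end) for e in by_start.get(concept, [])]


print(neighbors("bird"))
```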

  25. Commonsense Extraction [Speer and Havasi, 2012]
     • Context:
       "Sir Leicester Dedlock and his wife Lady Honoria live on his estate at Chesney Wold.."
       "..Unknown to Sir Leicester, Lady Dedlock had a lover .. before she married and had a daughter with him.."
       ".. Lady Dedlock believes her daughter is dead. The daughter, Esther, is in fact alive.."
       "..Esther sees Lady Dedlock at church and talks with her later at Chesney Wold though neither woman recognizes their connection.."
     • Question: “What is the connection between Esther and Lady Dedlock?”
     • ConceptNet concepts around "lady": church, wife, class, mother, UK, lord, person, historical
     • Answers (reached by multi-hop reasoning): “Mother and daughter.” / “Mother and illegitimate child.”
     • Extracted path: lady → mother → daughter → child
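The extracted path lady → mother → daughter → child chains several knowledge-graph edges. As a rough illustration of how such a chain can be found, here is a breadth-first search over a toy adjacency list. The graph is hypothetical (its edges were chosen to match the example), and plain breadth-first search is a stand-in for illustration, not the authors' extraction procedure.

```python
from collections import deque

# Hypothetical toy graph; real ConceptNet neighborhoods are far larger.
toy_graph = {
    "lady": ["church", "wife", "mother", "person"],
    "mother": ["daughter"],
    "wife": ["marry"],
    "daughter": ["child"],
}


def find_path(graph, start, goal):
    """Return a shortest path from start to goal, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


path = find_path(toy_graph, "lady", "child")
print(" → ".join(path))  # prints: lady → mother → daughter → child
```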

  26-30. Tree Construction [Figure, built up over slides 26 to 30: a tree rooted at a question concept, one level added per slide.]
     • Question concept: lady
     • Direct Interaction: church, wife, mother, person
     • Multi-Hop: house, marry, daughter, book, lover, help
     • Outside Knowledge: child, child
     • Context Grounding: their
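The four levels of the tree can be read as successive expansions from the question concept: a direct-interaction hop and a multi-hop step that both land on words from the context, an outside-knowledge hop with no context constraint, and a final hop grounded back in the context. The sketch below follows that reading, which is an interpretation of the level names rather than a confirmed description of the authors' code. The toy graph, the context word set, and the function name are hypothetical, and only complete four-hop paths are returned.

```python
def build_paths(graph, context_words, question_concept):
    """Expand four hops from the question concept.

    Hops 1, 2 and 4 must land on a context word; hop 3 (outside
    knowledge) is unconstrained. Returns only complete paths.
    """
    must_be_in_context = [True, True, False, True]
    paths = [[question_concept]]
    for in_context in must_be_in_context:
        extended = []
        for path in paths:
            for nxt in graph.get(path[-1], []):
                if nxt in path:
                    continue  # avoid cycles
                if in_context and nxt not in context_words:
                    continue  # this hop must stay grounded in the context
                extended.append(path + [nxt])
        paths = extended
    return paths


# Hypothetical toy graph and context vocabulary.
toy_graph = {
    "lady": ["church", "wife", "mother", "person"],
    "mother": ["daughter", "lover"],
    "wife": ["marry"],
    "daughter": ["child"],
    "child": ["their"],
}
context_words = {"church", "wife", "mother", "person",
                 "daughter", "lover", "marry", "their"}

full_paths = build_paths(toy_graph, context_words, "lady")
print(full_paths)
```

In this toy setup "child" is reachable only through the unconstrained third hop, because it is not among the context words.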

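  31. Initial Node Scoring: Term-Frequency [Figure: the tree rooted at "lady", with a term frequency on each first-level node: church freq = 1/1044, wife freq = 1/1044, mother freq = 3/1044, person freq = 1/1044. Lower levels as before: house, marry, daughter, book, lover, help; child, child; their.]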

  32. Initial Node Scoring: Softmax Normalization [Figure: the same tree, with the first-level frequencies replaced by softmax-normalized scores: church 0.249, wife 0.249, mother 0.250, person 0.249.]
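The two scoring steps (term frequency, then softmax normalization across sibling nodes) can be sketched as follows. This assumes the counts map to nodes in the order they appear on the slide and that 1044 is the context length in tokens; both are readings of the figure, and the variable names are illustrative.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


context_length = 1044  # assumed: number of tokens in the context
counts = {"church": 1, "wife": 1, "mother": 3, "person": 1}

# Step 1: term frequency of each node's concept in the context.
freqs = {node: count / context_length for node, count in counts.items()}

# Step 2: softmax normalization across the sibling nodes.
scores = dict(zip(freqs, softmax(list(freqs.values()))))

for node, score in scores.items():
    print(f"{node}: freq={freqs[node]:.6f} score={score:.4f}")
```

Because the raw frequencies are so small, the softmax leaves the four scores nearly uniform, with "mother" only slightly ahead, which is consistent with the near-equal values shown on the slide.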
