Machine Comprehension with Discourse Relations Karthik Narasimhan - PowerPoint PPT Presentation

Machine Comprehension with Discourse Relations Karthik Narasimhan Regina Barzilay CSAIL, Massachusetts Institute of Technology 1

Sally liked going outside. She put on her shoes. She went outside to walk. [...] Missy the cat meowed to Sally. Sally waved to Missy the cat. [...] Sally hears her name. "Sally, Sally, come home", Sally's mom calls out. Sally runs home to her Mom. Sally liked going outside.
Why did Sally put on her shoes?
A) To wave to Missy the cat
B) To hear her name
C) Because she wanted to go outside
D) To come home
Sample passage excerpt and question in a Machine Comprehension task 2

Reasoning over multiple sentences [Bar chart: accuracy of baseline systems (SWD, RTE, RTE+SWD) on MC500-test, split into Single and Multi questions.] Multi-sentential questions are significantly harder than single-sentence ones. We focus on modeling multi-sentence relations to improve Q&A performance. 3

Is there only a single relation? [Diagram: the clauses "She put on her shoes" and "She went outside to walk", labeled Causality for the question "Why did Sally put on her shoes?" and Temporality for the question "When did Sally put on her shoes?"] The relation between two clauses is question-dependent. 4

Key idea: Learn relations optimized for MC
Traditional approach: Use off-the-shelf discourse analyzers (Source: Feng and Hirst, 2012)
Training data: Q&A pairs. Example: passage "Sally liked going outside. [...]", question "Why did Sally put on her shoes?", answer "C) Because she wanted to go outside" ✓
Hypothesis: Task-based discourse relations can facilitate better Comprehension Q&A 5

Fully supervised case [Diagram: "She put on her shoes" and "She went outside to walk", labeled Causality.]
Why did Sally put on her shoes?
A) To wave to Missy the cat
B) To hear her name
C) Because she wanted to go outside ✓
D) To come home 6

"She put on her shoes" "She went outside to walk"
Why did Sally put on her shoes?
A) To wave to Missy the cat
B) To hear her name
C) Because she wanted to go outside ✓
D) To come home 7

Sally liked going outside. She put on her shoes. She went outside to walk. [...] Missy the cat meowed to Sally. Sally waved to Missy the cat. [...] Sally hears her name. "Sally, Sally, come home", Sally's mom calls out. Sally runs home to her Mom. Sally liked going outside.
Why did Sally put on her shoes?
A) To wave to Missy the cat
B) To hear her name
C) Because she wanted to go outside ✓
D) To come home 8

Key Steps
‣ Identify relevant sentences: "She put on her shoes", "She went outside to walk"
‣ Infer correct relation: Causality
‣ Select correct answer: Why did Sally put on her shoes? C) Because she wanted to go outside ✓ 9

Three models
‣ Identify most relevant sentence from passage
‣ Expand to a set of sentences
‣ Infer inter-sentential relations 10

Sally liked going outside. She put on her shoes. She went outside to walk. [...] Missy the cat meowed to Sally. Sally waved to Missy the cat. [...] Sally hears her name. "Sally, Sally, come home", Sally's mom calls out. Sally runs home to her Mom. Sally liked going outside. (Sentence - z)
Why did Sally put on her shoes? (Question - q)
A) To wave to Missy the cat
B) To hear her name
C) Because she wanted to go outside
D) To come home (Answer - a) 11

Identifying a relevant sentence (Model 1)
‣ Retrieve a single relevant sentence from passage.
‣ Joint model over sentence z and answer choice a, given question q.
$P(a, z \mid q) = P(z \mid q) \cdot P(a \mid z, q)$ 12
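
The Model 1 factorization can be illustrated with a small numeric sketch. The slides do not say how each factor is parameterized, so the softmax over raw scores used here is an assumption, and the function names and toy scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def model1_joint(sentence_scores, answer_scores):
    """Return joint[z][a] = P(z | q) * P(a | z, q).

    sentence_scores[z]  : assumed score of sentence z for the question
    answer_scores[z][a] : assumed score of answer a given sentence z
    """
    p_z = softmax(sentence_scores)
    joint = []
    for z, pz in enumerate(p_z):
        p_a_given_z = softmax(answer_scores[z])
        joint.append([pz * pa for pa in p_a_given_z])
    return joint

# Toy example: two candidate sentences, four answer choices.
joint = model1_joint([2.0, 0.5], [[0.1, 1.5, 0.2, 0.0], [0.3, 0.3, 0.3, 0.3]])
total = sum(sum(row) for row in joint)
```

Because each factor is normalized, the joint over all (z, a) pairs sums to one.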

Identifying a relevant sentence set (Model 2)
Extends model 1 to select a pair of relevant sentences from passage.
‣ Retrieve a second sentence $z_2$ conditioned on both question q and the first retrieved sentence $z_1$.
$P(a, z_1, z_2 \mid q) = P(z_1 \mid q) \cdot P(z_2 \mid z_1, q) \cdot P(a \mid z_1, z_2, q)$ 13

Incorporating Relations (Model 3)
Capture inter-sentential relations, modeled as hidden variables.
$P(a, r, z_1, z_2 \mid q) = P(z_1 \mid q) \cdot P(r \mid q) \cdot P(z_2 \mid z_1, r, q) \cdot P(a \mid z_1, z_2, r, q)$
Flexibility to induce relations between sentences conditioned on the question. 14
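
As a rough illustration of the Model 3 factorization, the sketch below enumerates the joint from four hand-written probability tables. The tables, relation names, and the function `model3_joint` are made up for the example; a real system would compute each factor from features.

```python
from itertools import product

def model3_joint(p_z1, p_r, p_z2, p_a):
    """Enumerate P(a, r, z1, z2 | q) from its four factors.

    p_z1[z1]            : P(z1 | q)
    p_r[r]              : P(r | q)
    p_z2[(z1, r)][z2]   : P(z2 | z1, r, q)
    p_a[(z1, z2, r)][a] : P(a | z1, z2, r, q)
    """
    joint = {}
    for z1, r in product(p_z1, p_r):
        for z2, pz2 in p_z2[(z1, r)].items():
            for a, pa in p_a[(z1, z2, r)].items():
                joint[(a, r, z1, z2)] = p_z1[z1] * p_r[r] * pz2 * pa
    return joint

# Toy tables (illustrative values only).
sentences = [0, 1]
relations = ["causal", "temporal"]
p_z1 = {0: 0.7, 1: 0.3}
p_r = {"causal": 0.6, "temporal": 0.4}
p_z2 = {(z1, r): {0: 0.5, 1: 0.5} for z1 in sentences for r in relations}
p_a = {(z1, z2, r): {"A": 0.25, "B": 0.75}
       for z1 in sentences for z2 in sentences for r in relations}

joint = model3_joint(p_z1, p_r, p_z2, p_a)
total = sum(joint.values())
```

Each entry of `joint` is a product of one value from each table, so the relation r influences both which second sentence is retrieved and which answer is preferred.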

Learning
‣ Supervision: question-answer pairs.
‣ Marginalize over hidden variables z and r to get P(a | q).
‣ Maximize the following objective (model 3):
$L_3(\theta; P_{\text{train}}) = \sum_{i,j} \log \sum_{m} \sum_{r \in R} \sum_{n \in [m-k,\, m+k]} P(a^*_{ij}, z_{im}, z_{in}, r \mid q_{ij}) - \lambda \lVert \theta \rVert^2$ 15
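
A minimal sketch of how such an objective could be evaluated, assuming it is the L2-regularized log-likelihood of the correct answers and that the second sentence index n is restricted to a window of k sentences around the first index m. The helper names are hypothetical, and the marginal probabilities are taken as given rather than computed from a model.

```python
import math

def candidate_pairs(num_sentences, k):
    """All (m, n) sentence index pairs with n in [m - k, m + k], clipped to the passage."""
    pairs = []
    for m in range(num_sentences):
        for n in range(max(0, m - k), min(num_sentences, m + k + 1)):
            pairs.append((m, n))
    return pairs

def objective(correct_answer_marginals, theta, lam):
    """Sum of log P(a* | q) over training questions minus lam * ||theta||^2."""
    log_likelihood = sum(math.log(p) for p in correct_answer_marginals)
    penalty = lam * sum(w * w for w in theta)
    return log_likelihood - penalty

pairs = candidate_pairs(4, 1)                          # passage of 4 sentences, window k = 1
value = objective([0.5, 0.25], [1.0, -2.0], lam=0.1)   # two training questions
```

The window keeps the number of sentence pairs linear in passage length instead of quadratic, which matters because every pair is summed over for every question.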

Prediction
For a given question q, simply choose the answer with the highest P(a | q).
‣ Marginalize over all hidden variables z and r.
$\hat{a}_j = \arg\max_k P(a_{jk} \mid q_j)$ 16
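
The prediction rule can be sketched as follows, assuming the joint probabilities are already available in a dictionary keyed by (answer, relation, first sentence, second sentence). The key layout and the toy numbers are assumptions made for illustration.

```python
from collections import defaultdict

def predict(joint):
    """Marginalize joint[(a, r, z1, z2)] over r, z1, z2; return (best answer, marginals)."""
    marginal = defaultdict(float)
    for (a, _r, _z1, _z2), p in joint.items():
        marginal[a] += p
    best = max(marginal, key=marginal.get)
    return best, dict(marginal)

# Toy joint distribution over two answers.
joint = {
    ("A", "causal", 0, 1): 0.10,
    ("A", "temporal", 0, 1): 0.15,
    ("C", "causal", 1, 2): 0.45,
    ("C", "temporal", 1, 2): 0.30,
}
best, marginal = predict(joint)
```

Here answer C wins because its probability mass, summed over all relation and sentence choices, is larger than that of A.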

Lexical Features Type 1 (q, z): ‣ Unigram and bigram matches + entity and action matches Type 2 (q, a, z1, [z2]): ‣ Capture interactions between a, q and sentence(s) (z1, z2). 17
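
A minimal sketch of the unigram and bigram match counts from Type 1, assuming simple lowercased word tokenization and counting distinct shared n-grams. Entity and action matches are left out because they would need a parser, and the function names are hypothetical.

```python
import re

def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())

def ngrams(words, n):
    """Set of distinct n-grams in a token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def match_features(question, sentence):
    """Count distinct unigrams and bigrams shared by question q and sentence z."""
    q, z = tokens(question), tokens(sentence)
    return {
        "unigram_matches": len(ngrams(q, 1) & ngrams(z, 1)),
        "bigram_matches": len(ngrams(q, 2) & ngrams(z, 2)),
    }

feats = match_features("Why did Sally put on her shoes?", "She put on her shoes.")
```

For this pair the shared unigrams are "put", "on", "her" and "shoes", and the shared bigrams are "put on", "on her" and "her shoes".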

Relational Features Type 3 (q, r, z1, z2) and Type 4 (q, r): ‣ Inter-sentence distance, presence of relation-specific markers (small seed list) in sentences. ‣ Second-order: cross of above features with entity and action match counts. ‣ Connect question word with relation type (Ex. why and Causality) 18
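
The first and last bullets above can be sketched roughly as below. The seed marker lists are invented for the example (the slides do not give the actual list), and the feature names are hypothetical.

```python
import re

# Hypothetical seed lists of relation-specific markers.
SEED_MARKERS = {
    "causal": {"because", "so", "since"},
    "temporal": {"before", "after", "then", "when"},
}

def tokens(text):
    return re.findall(r"\w+", text.lower())

def relational_features(question, relation, z1, z2, z1_index, z2_index):
    """Distance, marker presence, and a question-word/relation indicator."""
    words = set(tokens(z1)) | set(tokens(z2))
    q_tokens = tokens(question)
    question_word = q_tokens[0] if q_tokens else ""
    return {
        "distance": abs(z1_index - z2_index),
        "has_marker": int(bool(words & SEED_MARKERS[relation])),
        f"qword={question_word}|rel={relation}": 1,
    }

feats = relational_features(
    "Why did Sally put on her shoes?", "causal",
    "She put on her shoes.", "She went outside to walk.", 1, 2)
```

In this example neither sentence contains a causal marker, so the link between the question word "why" and the causal relation has to carry the signal.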

Discourse in Q&A Prior work has shown value of domain-independent discourse relations in Q&A. ‣ Chai and Jin (2004) incorporate discourse processing into context Q&A. ‣ Verberne et al. (2007) use Rhetorical Structure Theory (RST) to relate question topics and answers. ‣ Jansen et al. (2014) use discourse information to improve answer re-ranking for non-factoid Q&A. 19

Experiments
‣ Data: MCTest (Richardson et al., 2013)

         MC160                   MC500
Split    Passages   Questions    Passages   Questions
Train    70         280          300        1200
Dev      30         120          50         200
Test     60         240          150        600

‣ > 50% of questions require information from multiple sentences.
‣ Evaluation: Answering accuracy with partial credit for ties (as previously used). 20
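
One way to compute accuracy with partial credit for ties is sketched below. It assumes that when several answers share the top score and the correct one is among them, the question earns 1 divided by the number of tied answers; the slides do not spell out the exact rule, so treat this as an assumption.

```python
def question_credit(scores, correct_index, tol=1e-12):
    """Credit for one question: 1/t if the correct answer is among t tied top answers, else 0."""
    best = max(scores)
    tied = [i for i, s in enumerate(scores) if abs(s - best) <= tol]
    return 1.0 / len(tied) if correct_index in tied else 0.0

def accuracy(all_scores, correct_indices):
    """Mean credit over all questions."""
    credits = [question_credit(s, c) for s, c in zip(all_scores, correct_indices)]
    return sum(credits) / len(credits)

acc = accuracy(
    [[0.1, 0.2, 0.6, 0.1],   # correct answer is the unique best: credit 1
     [0.4, 0.4, 0.1, 0.1],   # correct answer tied with one other: credit 1/2
     [0.7, 0.1, 0.1, 0.1]],  # correct answer not on top: credit 0
    [2, 0, 1],
)
```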

Baselines Systems from Richardson et al. (2013) ‣ SWD: uses sliding window to count matches between passage words and words in answer. ‣ RTE: utilizes a textual entailment system to determine if answer is entailed by passage. ‣ RTE+SWD: weighted combination of systems above 21
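
To give a feel for the SWD idea, here is a simplified, unweighted sliding-window scorer. It only counts how many words in the best passage window also occur in the answer; any word weighting or other refinements of the original baseline are not reproduced, and the function names are hypothetical.

```python
import re

def tokens(text):
    return re.findall(r"\w+", text.lower())

def sliding_window_score(passage, answer):
    """Best count of answer words found in any passage window of |answer words| tokens."""
    target = set(tokens(answer))
    words = tokens(passage)
    size = max(1, len(target))
    best = 0
    for start in range(max(1, len(words) - size + 1)):
        window = words[start:start + size]
        best = max(best, sum(1 for w in window if w in target))
    return best

passage = "Sally liked going outside. She put on her shoes. She went outside to walk."
score_c = sliding_window_score(passage, "Because she wanted to go outside")
score_d = sliding_window_score(passage, "To come home")
```

On this excerpt the window "shoes she went outside to walk" shares three words with answer C, while answer D matches only "to".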

Comprehension Accuracy [Bar chart: accuracy of baselines (SWD, RTE, SWD+RTE) compared to our model (Model 3) on MC160 test and MC500 test.] 22

Accuracy by Question Type [Bar chart: comparison of our different model variants (Model 1, Model 2, Model 3) on MC500 test, by question type (Single, Multi, Overall).] 23

[Bar chart: RST-augmented Model 2 vs. Model 3, accuracy on MC500 test by question type (Single, Multi, Overall).] Task-based discourse relations can facilitate better Comprehension Q&A. 77% of the predicted RST relations are Elaboration! 24

Evaluation using Human judgements We annotated 240 questions from MC160 test set with most relevant sentence(s) in passage, and relations between sentence pairs. ‣ 103 sentence pairs with annotated relations ‣ 34% of these have relevant discourse markers occurring anywhere in sentences. ‣ Only 9% of sentences have a marker at an end. 25

Sentence Retrieval [Bar chart: Model 1, Model 2 and Model 3 compared on Single, Multi and Overall questions.] Table: Recall (@5) of relevant sentences retrieved by different models compared to human judgements. 26

Relation Prediction

Relation       R@1      R@2
Causal         56.25    75.00
Temporal       27.27    54.54
Explanation    16.66    33.33
Other          57.40    64.81
Overall        51.45    65.04

Table: Recall of annotated relations at various thresholds in ranking produced by Model 3 27
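
Recall at a rank threshold, as reported in the table above, can be computed along these lines. The sketch assumes each annotated sentence pair has one gold relation and a model-produced ranking of relations; the rankings below are invented and do not reproduce the reported numbers.

```python
def recall_at_k(ranked_predictions, gold_labels, k):
    """Percentage of items whose gold label appears in the top k of the ranking."""
    hits = sum(1 for ranked, gold in zip(ranked_predictions, gold_labels)
               if gold in ranked[:k])
    return 100.0 * hits / len(gold_labels)

# Toy rankings for four annotated sentence pairs.
ranked = [
    ["causal", "temporal", "other"],
    ["other", "causal", "temporal"],
    ["temporal", "other", "causal"],
    ["other", "temporal", "explanation"],
]
gold = ["causal", "causal", "causal", "other"]

r_at_1 = recall_at_k(ranked, gold, 1)
r_at_2 = recall_at_k(ranked, gold, 2)
```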

Conclusions
‣ Discourse relations help in the task of machine comprehension Q&A involving multiple sentences.
‣ A task-specific approach of incorporating discourse information does better than using off-the-shelf analyzers.
Code and data will be available at: http://people.csail.mit.edu/karthikn/mcdr/ 28
