Improving Domain Independent Question Parsing with Synthetic Treebanks
COLING 2018: LAW-MWE-CxG
Halim-Antoine Boukaram*, Nizar Habash†, Micheline Ziadee*, and Majd Sakr‡
*American University of Science and Technology, Lebanon · †New York University Abu Dhabi, UAE · ‡Carnegie Mellon University, USA
{hboukaram,mziadee}@aust.edu.lb, nizar.habash@nyu.edu, msakr@cs.cmu.edu
Problem & Solution ● Automatic parsers perform poorly on question constructions ○ Most treebanks used for training are in the news domain, which lacks question constructions ● Proposed solution: synthetically create syntactic trees of questions on which to train parsers ● We present results on Standard Arabic, a morphologically rich and relatively low-resource language
Example of Question Parsing Errors ● "To where do I go to submit the application?" (Arabic question; slide contrasts the automatically parsed tree with the human-annotated parse)
Example of Question Parsing Errors ● "What time will the celebration start?" (Arabic question; slide contrasts the automatically parsed tree with the human-annotated parse)
Research Questions ● We explore two effective, low-cost techniques for adding annotated questions to the training corpus ○ Automatically generating questions from existing treebanks ○ Automatically generating questions from question templates ● Research questions: ○ How do these techniques compare with manually annotating additional questions? ○ Do combinations of synthetic and manual data improve accuracy?
Technique #1: QGen ● Automatically transform an annotated sentence into a number of annotated questions (4.75 on average) ○ (S (NP-SBJ the boy) (VP ate (NP-OBJ the apple))) ○ → (SBARQ (WHNP who) (S (VP ate (NP-OBJ the apple)))) ○ → (SQ (VP did) (NP-SBJ the boy) (VP eat (NP-OBJ the apple)))
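The declarative-to-question transformation above can be sketched as a tree rewrite. The following is a minimal illustration, not the authors' implementation: trees are nested (label, children) tuples, and the single who-rule shown is a simplified stand-in for the paper's generation procedures.

```python
# Minimal sketch of a QGen-style tree rewrite (illustrative only).
# A tree is a (label, children) tuple; a leaf is a plain string.

def leaves(node):
    """Collect the terminal words of a tree in order."""
    if isinstance(node, str):
        return [node]
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words

def gen_who_question(sent):
    """Rewrite (S (NP-SBJ ...) (VP ...)) as (SBARQ (WHNP who) (S (VP ...)))."""
    _, children = sent
    subj = next(c for c in children if isinstance(c, tuple) and c[0] == "NP-SBJ")
    rest = [c for c in children if c is not subj]
    return ("SBARQ", [("WHNP", ["who"]), ("S", rest)])

sent = ("S", [("NP-SBJ", ["the", "boy"]),
              ("VP", ["ate", ("NP-OBJ", ["the", "apple"])])])
question = gen_who_question(sent)
print(" ".join(leaves(question)))  # who ate the apple
```

A full system would apply several such rules per input tree (hence the 4.75 questions per sentence on average).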
Technique #1: QGen ● Words of the input tree are modified morphologically depending on the type of generated question ○ Arabic who-questions ■ Sentences with gender- and number-specific verbs → questions with masculine-singular verbs
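The verb adjustment for who-questions can be illustrated with a toy lookup. The entries below are hypothetical forms in Buckwalter-style transliteration; a real system would use an Arabic morphological analyzer/generator rather than a table.

```python
# Toy normalization table (hypothetical forms, Buckwalter-style
# transliteration); a real system would use a morphological generator.
MASC_SG = {
    "Akalat": "Akala",   # 'she ate'       -> 'he ate'
    "AkaluwA": "Akala",  # 'they (m) ate'  -> 'he ate'
    "Akalna": "Akala",   # 'they (f) ate'  -> 'he ate'
}

def to_masc_sg(verb):
    # Leave unknown forms unchanged.
    return MASC_SG.get(verb, verb)

print(to_masc_sg("Akalat"))  # Akala
```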
QGen Examples ● Each example slide pairs the original phrase structure with a generated question structure: simple SQ, simple SBARQ, modified SQ, modified SBARQ, and modified SBARQ (who)
Limitations of QGen ● Overgeneration introduces errors: some synthetic questions are nonsensical ● Limited coverage of the modeled question structures ● The input domain may differ from the desired question domain
Technique #2: QTemp ● Generate question templates in a desired domain ○ Where is %place%? ● Annotate the question templates ● Fill in the placeholder elements to produce questions ○ Where is the bathroom? ○ Where is the dean’s office? ○ Where is the finance office? ○ Where is ...?
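Template filling can be sketched as a cartesian product over placeholder values. The template and fillers below are illustrative, taken from the slide's example; the paper's system additionally splices the annotated token's tree into the annotated template tree.

```python
import itertools

def fill_template(template, slots):
    """Expand a template over every combination of placeholder values."""
    keys = list(slots)
    questions = []
    for combo in itertools.product(*(slots[k] for k in keys)):
        q = template
        for key, value in zip(keys, combo):
            q = q.replace("%" + key + "%", value)
        questions.append(q)
    return questions

qs = fill_template("Where is %place%?",
                   {"place": ["the bathroom", "the dean's office",
                              "the finance office"]})
print(qs[0])  # Where is the bathroom?
```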
QTemp Examples ● Annotated question template + annotated token = annotated question (two tree-splicing examples shown)
Experimental Setup ● Baseline treebank: Penn Arabic Treebank (PATB) ● Two synthetic treebanks ○ QGen and QTemp ● Two manually annotated treebanks ○ TalkShow and Chatbot ● Test the accuracy of parsers trained on: ○ Synthetic vs. manual data ○ Combined vs. synthetic or manual alone
Data Sets

| Treebank | Domain | Train: # sentences (# words) | Test: # sentences (# words) |
| PATB (part 3) | News articles | 10,836 (320,998) | 794 (12,884) |
| PATBQ | News articles | N/A | 67 (1,054) |
| TalkShow | Political talk show | 544 (2,691) | 143 (692) |
| Chatbot | Conversational | 239 (1,505) | 62 (441) |
| QGenPATB | News articles (synthetic) | 962 (8,140) | N/A |
| QTemp | Conversational (synthetic) | 1,607 (13,099) | N/A |
Results (parser accuracy; columns indicate the training data)

| Test corpus | Baseline (PATB) | +Synthetic (QGenPATB + QTemp) | +Manual (TalkShow + Chatbot) | All |
| PATB | 80.6 | 80.6 | 80.6 | 80.9 |
| PATBQ | 73.8 | 74.0 | 74.9 | 75.9 |
| TalkShow | 88.2 | 87.3 | 91.4 | 92.9 |
| Chatbot | 90.5 | 93.6 | 93.3 | 94.1 |
| Macro Average Q | 84.2 | 84.9 | 86.5 | 87.6 |
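The "Macro Average Q" row is the unweighted mean over the three question test sets (PATBQ, TalkShow, Chatbot), excluding the PATB test set. A quick sanity check, reproducing the reported row to within rounding:

```python
# Per-column scores on the question test sets: PATBQ, TalkShow, Chatbot.
scores = {
    "Baseline":          [73.8, 88.2, 90.5],
    "+QGenPATB+QTemp":   [74.0, 87.3, 93.6],
    "+TalkShow+Chatbot": [74.9, 91.4, 93.3],
    "All":               [75.9, 92.9, 94.1],
}
macro = {name: sum(vals) / len(vals) for name, vals in scores.items()}
# Reported row: 84.2, 84.9, 86.5, 87.6 (agrees to within rounding).
```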
Conclusions and Future Work ● Synthetic question treebanks are useful for improving question parsing ● The domain of the synthetic treebank must match the target question domain ● Future work: investigate how well the synthetic techniques transfer to other languages, and write more question-generation procedures ● The manual and synthetic treebanks will be published through the Linguistic Data Consortium
Thank You