Learning Sentence Planning Rules with Bayesian Methods
David M. Howcroft
Department of Language Science and Technology
Saarland Informatics Campus, Saarland University, Germany
24 October 2018
Howcroft (UdS) Learning Sentence Planning 24 Oct 2018 1 / 23

Natural Language Generation

How do we transform non-linguistic input ...

name          price  cuisine
Due Fratelli  $$     Italian
Andalucia     $$$    Spanish, Seafood

... into a natural language text?

Due Fratelli is an Italian restaurant, while Andalucia is a Spanish seafood restaurant. However, Due Fratelli’s price is average, while Andalucia’s price is more expensive.

Templates and traditional approaches to NLG

name          price  cuisine
Due Fratelli  $$     Italian
Andalucia     $$$    Spanish, Seafood

Simplest case: use templates!

inform(NAME, CUISINE)
  → “NAME is a CUISINE restaurant”
  → “NAME serves CUISINE food”

But this doesn’t generalize; only good for limited applications!

Solution: modularity for reusability
  ◮ Factor out language-general elements
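
To make the template idea concrete, here is a minimal sketch in Python. The two templates are the ones on the slide; the `TEMPLATES` table, the `realize` function, and its `choose` parameter are hypothetical names introduced only for illustration.

```python
import random

# Hypothetical template table: dialogue act -> surface templates.
# The two template strings come from the slide; everything else is assumed.
TEMPLATES = {
    "inform": [
        "{name} is a {cuisine} restaurant",
        "{name} serves {cuisine} food",
    ],
}

def realize(act, slots, choose=random.choice):
    """Pick one template for `act` and fill in the slot values."""
    template = choose(TEMPLATES[act])
    return template.format(**slots)

# Pick the second template deterministically for the example.
print(realize("inform",
              {"name": "Due Fratelli", "cuisine": "Italian"},
              choose=lambda options: options[1]))
```

Choosing the first template instead yields "Due Fratelli is a Italian restaurant": the template cannot adjust "a" to "an", which is one small instance of why plain templates do not generalize.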

‘The’ NLG Pipeline

Meaning Representation (MR): DB records, slot-value pairs, etc.
  ↓
Text Planning: Content Selection, Document Structuring
  ↓
Sentence Planning: Lexicalization, Aggregation, Referring Expression Generation
  ↓
Surface Realization: Linearization, Morphosyntactic Agreement, Punctuation & Capitalization
  ↓
Natural Language Text

More reusable components, but still requires a lot of human attention.
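
The modular pipeline can be pictured as a chain of functions, one per stage. The sketch below is a toy under strong assumptions: all function names are hypothetical, each stage is a hand-written stub standing in for a full module, and the cuisine value "Spanish seafood" is simplified from the table's "Spanish, Seafood".

```python
def text_planning(records):
    # Content selection + document structuring (stub):
    # keep only name and cuisine, preserve the input order.
    return [{"name": r["name"], "cuisine": r["cuisine"]} for r in records]

def sentence_planning(messages):
    # Lexicalization + aggregation (stub): one clause of lemmas per message,
    # clauses joined by the connective "while".
    clauses = [[m["name"], "be", "a", m["cuisine"], "restaurant"] for m in messages]
    plan = clauses[0]
    for clause in clauses[1:]:
        plan = plan + [",", "while"] + clause
    return plan

def surface_realization(plan):
    # Morphosyntactic agreement + punctuation (stub):
    # inflect "be", choose a/an, attach commas, add a final period.
    words = []
    for i, tok in enumerate(plan):
        if tok == "be":
            tok = "is"
        elif tok == "a" and i + 1 < len(plan) and plan[i + 1][0].lower() in "aeiou":
            tok = "an"
        words.append(tok)
    return " ".join(words).replace(" ,", ",") + "."

def generate(records):
    return surface_realization(sentence_planning(text_planning(records)))

records = [
    {"name": "Due Fratelli", "price": "$$", "cuisine": "Italian"},
    {"name": "Andalucia", "price": "$$$", "cuisine": "Spanish seafood"},
]
print(generate(records))
```

Even in this toy, the a/an choice lives in the realization stage rather than in each template, which is the kind of language-general element the pipeline factors out. Every stage is still hand-written, which is the human effort the slide points to.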

End-to-end machine learning approaches to NLG

Represent meanings as collections of slot-value pairs...

inform(name=DueFratelli, cuisine=Italian)
inform(name=Andalucia, cuisine=Spanish,Seafood)

Represent texts as sequences of words...

Due Fratelli serves Italian food.
Andalucia is a Spanish, seafood restaurant.

Apply your favorite sequence-to-sequence model
  ◮ Bayesian networks for gen. w/active learning (Mairesse et al. 2010; BAGEL)
  ◮ Semantically-Conditioned LSTM (Wen et al. 2015)
  ◮ TGen (Dušek & Jurčíček 2016)
  ◮ Neural Checklist Model (Kiddon et al. 2016)

Problem: shallow representations and relative opacity
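
As an illustration of how flat this input representation is, the sketch below parses the slot-value notation shown above and linearizes it into a token sequence of the kind a sequence-to-sequence model could consume. The function names `parse_mr` and `linearize` and the exact token layout are assumptions for this example, not the preprocessing of any of the cited systems.

```python
import re

def parse_mr(mr):
    """Parse 'act(slot=value, slot=v1,v2)' into (act, {slot: [values]})."""
    m = re.fullmatch(r"(\w+)\((.*)\)", mr.strip())
    if m is None:
        raise ValueError(f"not a dialogue act: {mr!r}")
    act, body = m.groups()
    slots = {}
    # Split only on commas that start a new 'slot=' pair, so that
    # multi-valued slots such as 'cuisine=Spanish,Seafood' stay together.
    for pair in re.split(r",\s*(?=\w+=)", body):
        slot, value = pair.split("=", 1)
        slots[slot.strip()] = value.strip().split(",")
    return act, slots

def linearize(act, slots):
    """Flatten a parsed MR into a token sequence (assumed layout)."""
    tokens = [act]
    for slot, values in slots.items():
        for value in values:
            tokens += [slot, value]
    return tokens

act, slots = parse_mr("inform(name=Andalucia, cuisine=Spanish,Seafood)")
print(linearize(act, slots))
```

Nothing in the resulting sequence records discourse relations or syntactic structure, which is the shallowness the slide refers to.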

Can we have the best of both worlds?

From the rule-based approach:
  ◮ existing resources for NLG;
  ◮ richer semantic & syntactic structure;
  ◮ inspectability; and
  ◮ modularity where it’s helpful.

From the ML approach:
  ◮ reduced development effort

Let’s find out!

Narrowing our focus:
  ◮ assume text planning already done
  ◮ learn sentence planning rules
  ◮ use existing systems for surface realization

Text Plans with Discourse Structure

Due Fratelli is an Italian restaurant, while Andalucia is a Spanish seafood restaurant. However, Due Fratelli’s price is average, while Andalucia’s price is more expensive.

Morphosyntactic Reps for Surface Realization

[Figure: a text plan (contrast node over two price nodes, with arguments SoniaRose, 51 and Bienvenue, 35) beside its ‘logical form’, a graph over the lemmas be, while, price, dollar, the, at, Sonia Rose, Bienvenue, 51 and 35, with edges labeled Arg0, Arg1, Arg2, Mod and Det]

◮ OpenCCG for surface realization
◮ morphosyntactic rep: logical forms
◮ think ‘lemmatized dependency trees’
◮ based on CCGbank (Hockenmaier 2006) ⇒ WSJ coverage

How to represent tree-to-tree mappings?
Synchronous Tree Substitution Grammars

[Figure: text plan tree rooted at contrast (Arg1 and Arg2 edges to price nodes over SoniaRose, 51 and Bienvenue, 35) aligned with a logical form rooted at but (First and Next edges to be nodes)]

Example rule pair:
contrast(Arg1: x, Arg2: y)  ↔  but(First: x, Next: y)
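
The contrast/but rule pair on this slide can be sketched as a toy synchronous rewrite. This is a minimal illustration under assumptions, not the talk's implementation: trees are encoded as `(label, {edge: subtree})`, variables are strings beginning with "?", and the names `RULES`, `match`, `build`, and `transduce` are hypothetical. Subtrees with no matching rule are copied unchanged, which is a placeholder fallback rather than part of the formalism.

```python
# A tree is (label, {edge_label: subtree}); a variable is a string like "?x".
RULES = [
    # (source elementary tree, target elementary tree) sharing variables
    (("contrast", {"Arg1": "?x", "Arg2": "?y"}),
     ("but", {"First": "?x", "Next": "?y"})),
]

def match(pattern, tree, bindings):
    """Match a source pattern against a tree, recording substitution sites."""
    if isinstance(pattern, str):          # variable: bind the whole subtree
        bindings[pattern] = tree
        return True
    label, kids = pattern
    if label != tree[0] or kids.keys() != tree[1].keys():
        return False
    return all(match(kids[e], tree[1][e], bindings) for e in kids)

def build(pattern, bindings):
    """Instantiate a target pattern, transducing each bound subtree."""
    if isinstance(pattern, str):
        return transduce(bindings[pattern])
    label, kids = pattern
    return (label, {e: build(k, bindings) for e, k in kids.items()})

def transduce(tree):
    for source, target in RULES:
        bindings = {}
        if match(source, tree, bindings):
            return build(target, bindings)
    # Fallback (assumption): no rule applies, so copy the node and recurse.
    label, kids = tree
    return (label, {e: transduce(k) for e, k in kids.items()})

text_plan = ("contrast", {
    "Arg1": ("price", {"Arg1": ("SoniaRose", {}), "Arg2": ("51", {})}),
    "Arg2": ("price", {"Arg1": ("Bienvenue", {}), "Arg2": ("35", {})}),
})
logical_form = transduce(text_plan)
print(logical_form[0], sorted(logical_form[1]))
```

Adding further rule pairs to `RULES`, for instance one rewriting the price subtrees, would extend the mapping one elementary tree at a time.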

How to represent tree-to-tree mappings?
Synchronous Tree Substitution Grammars

[Figure: the same aligned text plan and logical form, with a second rule pair highlighted: a price fragment (Arg1 and Arg2 edges to Bienvenue and 35) paired with a be fragment (Arg0 and Arg1 edges) over the nodes price, dollar, the, at, 35 and Bienvenue, with edges labeled Mod, Det and Arg1]