Natural Language Generation (mostly) for Spoken Dialogue Systems

Ondřej Dušek
Charles University in Prague, Faculty of Mathematics and Physics,
Institute of Formal and Applied Linguistics
May 11th, 2016


Standard NLG Pipeline (Textbook)

[Inputs]
  ↓ Content/text planning (“what to say”)
    • content selection, basic ordering
[Content plan]
  ↓ Sentence planning/microplanning (“middle ground”)
    • aggregation, lexical choice, referring expressions
[Sentence plan(s)]
  ↓ Surface realization (“how to say it”)
    • linearization, conforming to the rules of the target language
[Text]

Standard NLG Pipeline (Textbook): Inputs and Content Planning

Inputs:
• Communication goal (e.g. “inform user about search results”)
• Knowledge base (e.g. list of matching entries in a database, weather report numbers, etc.)
• User model (constraints, e.g. the user wants short answers)
• Dialogue history (referring expressions, repetition)

Content planning:
• Content selection according to the communication goal
• Basic structuring (ordering)

Standard NLG Pipeline (Textbook): Sentence Planning and Surface Realization

Sentence planning (micro-planning):
• Word and syntax selection (e.g. choosing templates)
• Dividing content into sentences
• Aggregation (merging simple sentences)
• Lexicalization
• Referring expressions

Surface realization:
• Creating linear text from (typically) structured input
• Ensuring grammatical correctness

Real NLG Systems

Few systems implement the whole pipeline:
• Systems focused on content planning with trivial surface realization
• Surface-realization-only and word-order-only systems
• One-step (holistic) approaches
• SDS: content planning is done by the dialogue manager → only sentence planning and realization remain

Approaches: templates, grammars, rules, statistics, or a mix thereof
Data representations: varied, custom-tailored, non-compatible

Two-Step or One-Step Architecture?

Why go two-step:
• Dividing makes the tasks simpler
  • no need to worry about morphology in sentence planning
• Surface realization can be rule-based
  • you can hardcode the grammar; it is more straightforward to fix
• A surface realizer is (relatively) easy to implement
  • and you can use a third-party one

Why go one-step:
• Problem of all pipelines: error propagation
  • the more steps, the more chances to make an error
• Need to provide training sentence plans (for statistical planners)
  • sometimes you may use existing analysis tools

NLG Systems Examples

• Divided by NLG stage:
  1. Sentence planning
  2. Surface realization
  3. One-step approaches to NLG
• Each stage:
  1. History
  2. Current state of the art / our work

Sentence Planning Examples

• Various input/output formats, not very comparable
• Actually typically handcrafted or non-existent
  • one-step approaches or simplistic systems
• Here we focus on trainable approaches
  • …and especially on our own ☺

Trainable Sentence Planning: SPoT

• Spoken dialogue system in the flight information domain
• Handcrafted generator + overgeneration
• Statistical reranker (RankBoost) trained on hand-annotated sentence plans

Trainable Sentence Planning: Parameter Optimization

• Requires a flexible handcrafted planner
  • no overgeneration
  • adjusting its parameters “somehow”
• Examples:
  • Paiva & Evans: linguistic features annotated in a corpus generated with many parameter settings, correlation analysis
  • PERSONAGE-PE: personality traits connected to linguistic features via machine learning

Our Approach to Sentence Planning

Our A*/Perceptron sentence planner (TGen):
1. Requires no handcrafted module
2. Learns from unaligned data
  • Typical NLG training either:
    a) requires alignment of MR elements and words/phrases, or
    b) uses a separate alignment step
  • Our sentence planner learns alignments jointly
    • training from pairs: MR + sentence

MR:   inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)
Text: X is an Italian restaurant in the riverside area.
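The MR format shown above (a dialogue act type with slot-value pairs) is simple enough to parse mechanically. A minimal sketch, assuming this exact textual format (`parse_da` is a hypothetical helper, not part of TGen):

```python
import re

def parse_da(mr: str):
    """Parse a dialogue-act MR string such as
    'inform(name=X, food=Italian)' into (act_type, slot-value dict)."""
    match = re.fullmatch(r"(\w+)\((.*)\)", mr.strip())
    act_type, body = match.group(1), match.group(2)
    slots = {}
    for pair in body.split(","):
        slot, _, value = pair.partition("=")
        slots[slot.strip()] = value.strip()
    return act_type, slots

act, slots = parse_da("inform(name=X, type=placetoeat, eattype=restaurant, "
                      "area=riverside, food=Italian)")
# act == 'inform', slots['food'] == 'Italian'
```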

Our Approach: Input and Output

• Input: an MR
  • dialogue acts: “inform” + slot-value pairs,
    e.g. inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)
  • other formats possible
• Output: deep-syntax dependency trees
  • based on TectoMT's t-layer, but with very simplified I/O formats
  • two attributes per tree node: t-lemma + formeme
  • using surface word order
  [t-tree example: be/v:fin → X-name/n:subj, restaurant/n:obj; restaurant → italian/adj:attr, area/n:in+X; area → riverside/n:attr]
• Conversion to plain text sentences (“X is an Italian restaurant in the riverside area.”) –
  • Treex/TectoMT English synthesis (rule-based, covered later)

Overall Structure of Our Sentence Planner

• A*-style search – “finding the path” from an empty tree to a full sentence plan tree
  • always expand the most promising candidate sentence plan
  • stop when candidates don't improve for a while
• Using two subcomponents:
  • candidate generator
    • churns out candidate sentence plan trees
    • given an incomplete candidate tree, adds node(s)
  • scorer/ranker for the candidates
    • influences which candidate trees will be expanded (selects the most promising)

[Diagram: MR → sentence planner (candidate generator + A* search + scorer) → sentence plan (deep syntax tree)]
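The search loop described above can be sketched with a priority queue. This is a minimal illustration, not TGen's actual code: `successors` and `score` are hypothetical callables standing in for the candidate generator and the trained scorer, and the `patience` cutoff is one simple reading of “stop when candidates don't improve for a while”:

```python
import heapq
import itertools

def astar_plan(mr, successors, score, patience=20):
    """A*-style sentence-plan search: start from an empty plan tree,
    always expand the highest-scoring candidate, stop once candidates
    have not improved for `patience` expansions."""
    tick = itertools.count()                # tie-breaker for equal scores
    empty = ()
    heap = [(-score(empty, mr), next(tick), empty)]   # heapq is a min-heap
    best_score, best, since_improved = float("-inf"), empty, 0
    while heap and since_improved < patience:
        neg_score, _, tree = heapq.heappop(heap)
        if -neg_score > best_score:         # new best candidate found
            best_score, best, since_improved = -neg_score, tree, 0
        else:
            since_improved += 1
        for succ in successors(tree, mr):   # expand the popped candidate
            heapq.heappush(heap, (-score(succ, mr), next(tick), succ))
    return best
```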

Candidate Generator

• Given a candidate plan tree, generate its successors by adding 1 node (at every possible place)
• “Possible places” must be limited in practice
  • using combinations of things seen in the training data:
    • parent–child pairs
    • t-lemma + formeme
    • number of children, tree size, …

[Diagram: successive one-node expansions of a t-tree, e.g. t-tree → be/v:fin, then adding X-name/n:subj, restaurant/n:obj, bar/n:obj, …]
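A toy sketch of the successor generation under the parent–child restriction described above (the edge-list tree encoding and `seen_children` table are simplifying assumptions made for illustration):

```python
def make_successors(seen_children):
    """Candidate-generator sketch: a partial plan tree is a tuple of
    (parent label, child label) edges, where a label stands for a
    t-lemma/formeme pair. Each successor adds one node; the "possible
    places" are limited to parent-child combinations seen in training
    data (`seen_children`: parent label -> allowed child labels)."""
    def successors(tree):
        parents = {"t-tree"} | {child for _, child in tree}
        return [tree + ((parent, child),)
                for parent in sorted(parents)
                for child in seen_children.get(parent, ())]
    return successors

# Toy "seen in training data" table, mirroring the example trees above:
succ = make_successors({"t-tree": ["be/v:fin"],
                        "be/v:fin": ["X-name/n:subj", "restaurant/n:obj"]})
step1 = succ(())         # one successor: the root verb "be"
step2 = succ(step1[0])   # successors adding one more node each
```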

Scorer

• A function: sentence plan tree t + MR m → real-valued score
  • describes the fitness of t for m
• How to describe the fitness? Features:
  • occurrence of input DA slots + t-lemmas/formemes
  • tree shape
  • tree edges (parent–child)
  • …
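A feature extractor over the feature types listed above might look as follows. The feature names and the edge-list tree encoding are illustrative assumptions, not the feature set actually used:

```python
def feat(tree, mr):
    """Feature-extraction sketch for the scorer: `tree` is a tuple of
    (parent, child) node-label pairs, `mr` a dict of DA slot-value pairs.
    Produces a sparse feature dict covering edge, slot co-occurrence,
    and tree-shape features."""
    features = {}
    for parent, child in tree:                    # tree edges (parent-child)
        key = f"edge:{parent}>{child}"
        features[key] = features.get(key, 0.0) + 1.0
        for slot in mr:                           # DA slot + node co-occurrence
            features[f"slot:{slot}&node:{child}"] = 1.0
    features["shape:size"] = float(len(tree))     # crude tree-shape feature
    return features
```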

Perceptron Scorer

Basic form:
• score = w⊤ · feat(t, m)
• Training:
  • given m, generate the best tree t_top with the current weights
  • update the weights if t_top ≠ t_gold (the gold standard)
  • update: w = w + α · (feat(t_gold, m) − feat(t_top, m))

Our improvements – trying to guide the search on incomplete trees:
• updates based on partial trees
• estimating the future value of the trees

  56. Our Approach to Sentence Planning: Evaluation of Our Sentence Planner (NLG Systems Examples)
      Data:
      • Restaurant recommendations from the BAGEL generator
        • restaurant location, food type, etc.
        • just 404 utterances for 202 DAs
      Results:
      • basic setup 54.24% BLEU, best version 59.89%
      • less than BAGEL's ~67% BLEU… but:
        • we do not use alignments
        • our generator decides itself what to include
      • outputs mostly fluent and meaningful
      • problems:
        • repeated/missing/irrelevant information in the output
        • slooooow, doesn't scale very well
      (slide 19/40)
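As a reminder of what the BLEU percentages above measure, here is a minimal sketch of modified n-gram precision, the core of BLEU (without the brevity penalty, smoothing, or multi-reference handling that real evaluation scripts apply; the sentences are taken from the example-outputs slide).

```python
# Toy modified n-gram precision, the building block of BLEU.
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Candidate n-gram counts clipped by reference counts, over total candidate n-grams."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    matched = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return matched / total if total else 0.0

gen = "X is a restaurant in the riverside area"
ref = "X restaurant is near X on the riverside"
print(ngram_precision(gen, ref, 1))  # unigram precision: 5 of 8 words matched
print(ngram_precision(gen, ref, 2))  # bigram precision: only "the riverside" matched
```

BLEU then combines such precisions for n = 1..4 (geometric mean) with a brevity penalty.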


  63. Our Approach to Sentence Planning: Example Outputs (NLG Systems Examples)

      Input DA:  inform(name=X-name, type=placetoeat, area=riverside, near=X-near, eattype=restaurant)
      Reference: X restaurant is near X on the riverside.
      Generated: X is a restaurant in the riverside area near X.

      Input DA:  inform(name=X-name, type=placetoeat, area=X-area, pricerange=moderate, eattype=restaurant)
      Reference: X is a moderately priced restaurant in X.
      Generated: X is a restaurant in the X area.

      Input DA:  inform(name=X-name, type=placetoeat, eattype=restaurant, area=riverside, food=French)
      Reference: X is a French restaurant on the riverside.
      Generated: X is a French restaurant in the riverside area which serves French food.

      Input DA:  inform(name=X-name, type=placetoeat, eattype=restaurant, area=citycentre, near=X-near, food="Chinese takeaway", food=Japanese)
      Reference: X is a Chinese takeaway and Japanese restaurant in the city centre near X.
      Generated: X is a Japanese restaurant in the centre of town near X and X.

      (slide 20/40)
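The input DAs on this slide follow a simple act(slot=value, …) surface form. A small parser for that form might look like the following sketch; it assumes only the shape visible on the slide (quoted values may contain commas, slots like food may repeat) and is not part of the presented system.

```python
import re

def parse_da(da):
    """Parse a dialogue act string such as
    inform(name=X-name, area=riverside, food="Chinese takeaway")
    into (act_type, list of (slot, value) pairs); slots may repeat."""
    act_type, _, rest = da.partition("(")
    rest = rest.rstrip(")")
    # Capture slot=value pairs; a quoted value may contain commas.
    parts = re.findall(r'(\w+)=("[^"]*"|[^,]+)', rest)
    return act_type.strip(), [(s, v.strip().strip('"')) for s, v in parts]

act, slots = parse_da('inform(name=X-name, type=placetoeat, '
                      'food="Chinese takeaway", food=Japanese)')
assert act == "inform"
assert slots == [("name", "X-name"), ("type", "placetoeat"),
                 ("food", "Chinese takeaway"), ("food", "Japanese")]
```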


  67. . . . .. . . .. . . .. . . .. . . .. . .. .. . . .. . . .. . NLG Systems Examples Surface Realization Surface Realization Examples Treex / TectoMT realizer 21/ 40 Ondřej Dušek .. . . .. . . .. . . .. . . .. . . .. . . . . . .. . . .. . . .. . . .. . . .. Natural Language Generation • Also various input formats, at least output is always text • From handcrafued to different trainable realizers • Also including our own (developed here at ÚFAL): • actually handcrafued for the most part

  68. Surface Realization: Grammar-based Realizers (90's): KPML, FUF/SURGE (NLG Systems Examples)
      KPML:
      • General purpose, multilingual
      • Systemic Functional Grammar
      FUF/SURGE:
      • General purpose
      • Functional Unification Grammar

      Example input ("It is raining cats and dogs."):
      (EXAMPLE :NAME EX-SET-1
        :TARGETFORM "It is raining cats and dogs."
        :LOGICALFORM
          (A / AMBIENT-PROCESS :LEX RAIN
             :TENSE PRESENT-CONTINUOUS
             :ACTEE (C / OBJECT :LEX CATS-AND-DOGS :NUMBER MASS)))
      (slide 22/40)


  70. Surface Realization: Grammar-based Realizer: OpenCCG (NLG Systems Examples)
      • General purpose, multilingual
      • Combinatory Categorial Grammar
      • Used in several projects
      • With statistical enhancements
      (slide 23/40)

  71. Surface Realization: Procedural Realizer: SimpleNLG (NLG Systems Examples)
      • General purpose (procedural)
      • English, adapted to several other languages
      • Java implementation:

      Lexicon lexicon = new XMLLexicon("my-lexicon.xml");
      NLGFactory nlgFactory = new NLGFactory(lexicon);
      Realiser realiser = new Realiser(lexicon);
      SPhraseSpec p = nlgFactory.createClause();
      p.setSubject("Mary");
      p.setVerb("chase");
      p.setObject("the monkey");
      p.setFeature(Feature.TENSE, Tense.PAST);
      String output = realiser.realiseSentence(p);
      System.out.println(output);
      >>> Mary chased the monkey.
      (slide 24/40)

  72. Surface Realization: Trainable Realizers: Overgenerate and Rank (NLG Systems Examples)
      • Require a handcrafted realizer, e.g. the CCG realizer
      • Input underspecified → more outputs possible
      • Overgenerate, then use a statistical reranker
      • Ranking according to:
        • n-gram models (NITROGEN, HALOGEN)
        • Tree models (XTAG grammar – FERGUS)
        • Predicted Text-to-Speech quality (Nakatsu and White)
        • Personality traits (extraversion, agreeableness… – CRAG)
        • + alignment (repeating words uttered by dialogue counterpart)
      • Provides variance, but at a greater computational cost
      (slide 25/40)
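A toy illustration of the overgenerate-and-rank idea: score several candidate realizations with a bigram language model and keep the best. The candidates, the tiny training corpus, and the add-alpha smoothing are all hypothetical; real systems used a chart realizer plus far larger n-gram models.

```python
# Toy overgenerate-and-rank: pick the candidate a bigram LM likes best.
import math
from collections import Counter

def bigram_logscore(sentence, bigrams, unigrams, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram log-probability of a tokenized sentence."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + alpha) /
                        (unigrams[a] + alpha * vocab_size))
               for a, b in zip(toks, toks[1:]))

# "Train" on a few in-domain sentences (toy corpus).
corpus = ["X is a restaurant in the riverside area",
          "X is a French restaurant on the riverside"]
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    unigrams.update(toks[:-1])
    bigrams.update(zip(toks, toks[1:]))
vocab = len(set(t for s in corpus for t in s.split())) + 2  # + <s>, </s>

# Overgenerate: several surface variants for the same input (hypothetical).
candidates = ["X is a restaurant in the riverside area",
              "X a restaurant riverside is the in area"]
best = max(candidates, key=lambda c: bigram_logscore(c, bigrams, unigrams, vocab))
assert best == candidates[0]  # the fluent variant ranks highest
```

The rankers on the slide swap this LM score for tree-model, TTS-quality, or personality-based scores.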


  75. Surface Realization: Trainable Realizers: Syntax-Based (NLG Systems Examples)
      • StuMaBa: general realizer based on SVMs
      • Pipeline: deep syntax/semantics → surface syntax → linearization → morphologization
      (slide 26/40)

  76. Our Surface Realizer: Treex / TectoMT Surface Realizer (NLG Systems Examples)
      • Domain-independent
      • We use it for our experiments (TGEN1)
        • analysis → synthesis on BAGEL data = 89.79% BLEU
      • Pipeline approach
        • Mostly simple, single-purpose, rule-based modules (blocks)
        • Word inflection: statistical (Flect)
      • Gradual transformation of deep trees into surface dependency trees
      • Surface trees are then simply linearized
      (slide 27/40)


  80. Our Surface Realizer: Treex / TectoMT Surface Realization Example (NLG Systems Examples)
      • Realizer steps (simplified):
        • Copy the deep tree (sentence plan)
        • Determine morphological agreement
        • Add prepositions and conjunctions
        • Add articles
        • Compound verb forms (add auxiliaries)
        • Punctuation
        • Word inflection
        • Capitalization
      Example deep tree (t-tree, zone=en_gen): jump v:fin, with children cat n:subj and window n:through+X
      (slide 28/40)
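The step sequence above can be mimicked as a pipeline of small single-purpose functions. This is a drastic simplification in the spirit of Treex blocks: real blocks transform dependency trees, not token lists, and all the rules and names below are made up for the jump/cat/window example.

```python
# Toy pipeline of single-purpose "blocks" over a token list, loosely mirroring
# the realizer steps above (real Treex blocks operate on dependency trees).

def add_preposition(tokens):       # e.g. the n:through+X formeme adds "through"
    out = []
    for t in tokens:
        if t == "window":
            out.append("through")
        out.append(t)
    return out

def add_articles(tokens):          # naive: put "the" before known nouns
    out = []
    for t in tokens:
        if t in {"cat", "window"}:
            out.append("the")
        out.append(t)
    return out

def inflect(tokens):               # toy agreement: 3rd person singular verb
    return ["jumps" if t == "jump" else t for t in tokens]

def capitalize_and_punctuate(tokens):
    sent = " ".join(tokens)
    return sent[0].upper() + sent[1:] + "."

BLOCKS = [add_preposition, add_articles, inflect, capitalize_and_punctuate]

plan = ["cat", "jump", "window"]   # linearized deep tree: jump(cat, window)
result = plan
for block in BLOCKS:
    result = block(result)
assert result == "The cat jumps through the window."
```

The appeal of the pipeline design is the same as on the slide: each block stays trivial, and the hard, irregular part (word inflection) can be swapped for a statistical module.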

