
NLG: Specific Components

Scott Farrar, CLMA, University of Washington, farrar@u.washington.edu


  6. Example text (personal ads)
     "I am a single girl new to area who would like to meet someone to hang out with and get a taste of local flavor. I am well educated with a great career ... full-figured. I’m looking for mutual stimulating conversation with dating potential ... Don’t reply without picture, and be single – no married or attached guys please."
     There’s more than just facts to be reported. In NLG, communicative intention is crucial.

  8. From knowledge to language
     An important first step in NLG concerns planning the information needed to produce natural-sounding, coherent text.
     From a non-linguistic knowledge base, the system needs to identify the information of interest and combine it in a way consistent with the way humans package their beliefs, desires, and intentions as language.

  9. Today’s lecture
     1 Texts
     2 NLG Systems
     3 Architecture modules: Textplanner, Microplanner, Surface realizer, SimpleNLG realizer
     4 Hw7

  10. NLG three-step systems
      Last time we said that a more fine-tuned approach to the notion of choice was needed for better control over the NLG process. We compared a two-step and a three-step approach.
      Three-step architectures like WeatherReporter or the KNIGHT System are more flexible and modular. There’s more control over the output.

  11. Main components in a three-step system
      Module             Content task                                      Structure task
      Text Planner       content determination                             document structuring
      Microplanner       lexicalization; referring expression generation   aggregation
      Surface Realizer   linguistic realization                            structure realization
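
The division of labour in this table can be pictured as a pipeline of three functions. The following is only a minimal sketch under stated assumptions: the function names, the dictionary-based knowledge base, the `DeathMessage` type, and the tiny past-tense table are all invented for illustration and do not come from the lecture or from any particular NLG system.

```python
# Hypothetical three-stage NLG pipeline; every name here is illustrative only.

def text_planner(kb, goal):
    """Content determination + document structuring: pick facts and order them."""
    person = kb[goal["person"]]
    # Messages are non-linguistic: plain predicate/argument records.
    return [
        {"type": "BirthMessage", "person": goal["person"], "year": person["born"]},
        {"type": "DeathMessage", "person": goal["person"], "year": person["died"]},
    ]

def microplanner(messages):
    """Lexicalization: choose an uninflected lexeme for each message."""
    lexemes = {"BirthMessage": "be born", "DeathMessage": "die"}
    return [
        {"subject": m["person"], "verb": lexemes[m["type"]], "year": m["year"]}
        for m in messages
    ]

def surface_realizer(specs):
    """Linguistic realization: inflect for past tense and add punctuation."""
    past = {"be born": "was born", "die": "died"}
    return " ".join(
        f"{s['subject']} {past[s['verb']]} in {s['year']}." for s in specs
    )

kb = {"George": {"born": 1864, "died": 1933}}
text = surface_realizer(microplanner(text_planner(kb, {"person": "George"})))
print(text)
```

Each stage only sees the output of the previous one, which is what makes the architecture modular: the realizer, for instance, could be swapped out without touching content determination.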

  12. NLG three-step architecture

  16. Text planner: Purpose
      The text planner decides what chunks of information to include (content determination), and how to structure them (text structuring).
      Definition: The basic unit of information produced by the text planner is the message: a configuration of significant predications from the knowledge source. Messages correspond to major textual units in the output (e.g., a full sentence or group of sentences).
      The text planner deals with information that is not yet packaged in a linguistically suitable format, and makes no reference to a NL grammar or lexicon.

  17. Message types (genealogy vs. personal ads)
      Genealogy: MarriageMessage, BirthMessage, OccupationMessage
      Personal ads: PhysicalTraitMessage, RelationshipStatusMessage, LikesMessage, DislikesMessage, OccupationMessage

  18. Text planner: Input
      knowledge source: instances, classes, relations (possibly expressed in FOL, in a relational database, etc.)
      communicative goals: these are domain-specific, though there are generalizations for all domains. E.g.:
        (comparePerson Sam Fred)
        (describeAll KB)
        (queryAge Fred)

  19. Text planner: Output
      The text planner outputs a text plan: a non-linguistic data structure that contains messages structured according to rhetorical relations. It does not specify any grammatical or lexical information.
      message: significant predications from the domain. E.g., ∃e BirthEvent(e) ∧ actor(e, GEORGE) ∧ location(e, SOMERSET CO MD), etc.
      rhetorical structure: relations between the chunks. E.g., elaboration, contrast, purpose, etc.
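
One way to picture a text plan is as a tree whose leaves are messages and whose internal nodes are rhetorical relations. The sketch below is an assumption-laden illustration: the class names `Message` and `Relation`, the `DeathEvent` and `Origin` predicates, and the particular tree shape are hypothetical and are not taken from the lecture's own text plan figure.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Message:
    predicate: str   # e.g. "BirthEvent"; a KB predicate, not a word
    args: dict       # role name -> KB constant

@dataclass
class Relation:
    name: str        # rhetorical relation, e.g. "elaboration" or "contrast"
    children: List[Union["Relation", Message]] = field(default_factory=list)

def leaves(node):
    """Return the messages of a text plan in left-to-right order."""
    if isinstance(node, Message):
        return [node]
    out = []
    for child in node.children:
        out.extend(leaves(child))
    return out

# Hypothetical plan for the George Melvin Phillips example.
origin = Message("Origin", {"actor": "GEORGE", "location": "SOMERSET CO MD"})
birth = Message("BirthEvent", {"actor": "GEORGE", "year": 1864})
death = Message("DeathEvent", {"actor": "GEORGE", "year": 1933})
plan = Relation("elaboration", [origin, Relation("contrast", [birth, death])])

order = [m.predicate for m in leaves(plan)]
print(order)
```

Note that nothing in the structure mentions words, tense, or word order within a sentence; those decisions are deferred to later modules.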

  20. Target text elements
      George Melvin Phillips was from Somerset Co., MD.
      George Melvin Phillips was born in 1864.
      But George Melvin Phillips died in 1933.
      He was married to Martha Hastings.
      George and Martha had four children.

  21. Output of text planner: text plan

  26. Rhetorical Structure Theory (RST)
      RST is a theory of the structure of texts that emphasizes textual function and the relationships between textual units. Consider several relation types among textual units T1 and T2:
      evidence - T1 is proven by T2. The defendant killed Smith. He had Smith’s blood on his hands.
      elaboration - the content of T1 is elaborated in T2. Mary had a little lamb, and she had it with mint sauce.
      sequence - T1 precedes T2 in the narrative. John picked up his iPhone, then fired his boss with a text message.
      contrast - shows that two elements T1 and T2 are contrasting with each other. The bride’s dress was red, while the bridesmaids’ were white.
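
The examples above suggest that each relation can surface as a cue word, or as plain juxtaposition. As a rough sketch only, the mapping below pairs each of the four relations with the cue used in its example sentence; the table `CUE_WORDS` and the function `join_units` are hypothetical names, and a real system would choose cues with far more context than this.

```python
# Hypothetical cue-word table built from the four example sentences.
CUE_WORDS = {
    "evidence": None,        # left implicit: the two units are simply juxtaposed
    "elaboration": "and",
    "sequence": "then",
    "contrast": "while",
}

def join_units(relation, t1, t2):
    """Combine textual units T1 and T2 according to a rhetorical relation."""
    cue = CUE_WORDS[relation]
    if cue is None:
        return f"{t1}. {t2}."
    return f"{t1}, {cue} {t2}."

contrast = join_units(
    "contrast", "The bride's dress was red", "the bridesmaids' were white"
)
evidence = join_units(
    "evidence", "The defendant killed Smith", "He had Smith's blood on his hands"
)
print(contrast)
print(evidence)
```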

  27. Final structured text
      George Melvin Phillips was from Somerset Co., MD. He was born in 1864, but died in 1933. He was married to Martha Hastings. Later, George and Martha had four children ...

  30. Significant predications
      Significant predications refer to the important knowledge to be linguistically packaged in the generated text. These must be determined in a domain-specific manner, i.e., according to communicative goals.
      Genealogy, encyclopedia entry, obituary, etc.: Roosevelt died of a cerebral hemorrhage on April 12, 1945.
      Buddy Holley’s coroner’s report: There was bleeding from both ears, and the face showed multiple lacerations. The consistency of the chest was soft due to extensive crushing injury to the bony structure. The left forearm was fractured 1/3 the way up from the wrist and the right elbow was fractured.

  31. A BirthMessage from hw7
      This structured object represents the information needed to generate: FDR was born on January 30th, 1882 in Hyde Park, NY.
      <BirthMessage>
        <person>
          <firstname>Franklin</firstname>
          <middlename>Delano</middlename>
          <lastname>Roosevelt</lastname>
          <gender>male</gender>
        </person>
        <date>
          <month>January</month>
          <day>30</day>
          <year>1882</year>
        </date>
        <location>
          <city>Hyde Park</city>
          <state>New York</state>
        </location>
      </BirthMessage>
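
To make the message concrete, here is a small sketch that reads the XML above with Python's standard `xml.etree.ElementTree` module and fills a fixed template. The template is an assumption made purely for illustration: it skips microplanning entirely (no choice between "FDR" and the full name, no "30th", no "NY" abbreviation) and is not the approach hw7 asks for.

```python
import xml.etree.ElementTree as ET

# The BirthMessage from the slide, reproduced as a string.
XML = """<BirthMessage>
  <person>
    <firstname>Franklin</firstname>
    <middlename>Delano</middlename>
    <lastname>Roosevelt</lastname>
    <gender>male</gender>
  </person>
  <date><month>January</month><day>30</day><year>1882</year></date>
  <location><city>Hyde Park</city><state>New York</state></location>
</BirthMessage>"""

root = ET.fromstring(XML)
person = root.find("person")
date = root.find("date")
loc = root.find("location")

# Template-based shortcut: concatenate the fields in a fixed order.
name = " ".join(person.findtext(t) for t in ("firstname", "middlename", "lastname"))
sentence = (
    f"{name} was born on {date.findtext('month')} {date.findtext('day')}, "
    f"{date.findtext('year')} in {loc.findtext('city')}, {loc.findtext('state')}."
)
print(sentence)
```

The gap between this output and the target sentence (full name vs. "FDR", "New York" vs. "NY") is exactly the kind of decision that lexicalization and referring expression generation are meant to handle.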

  35. Challenges to two-level generation
      Early generation systems (pre-1990) consisted of only two modules, were brittle, and did not perform well in terms of generating the overall textual structure.
      1 Strategic generation: determining the significant predications and organizing them into a text plan; research focused on AI planning techniques (e.g., the STRIPS planner).
      2 Tactical generation: grammatical selection; lexical choice (i.e., sentence planning).

  39. Challenges to two-level generation
      In the early 1990s, researchers recognized that some functions of the strategic and tactical components overlapped:
      1 The bridge between common-sense and linguistic knowledge is often taken to be the lexicon. There was much debate as to where exactly lexical selection belonged.
      2 In attempting complex text generation, the mapping of propositions/messages directly onto sentences resulted in choppy, robotic-sounding texts. Combining information is especially difficult when the input data is not specially designed for NLG.
      3 Consider the problem of how to refer to domain entities. In the best systems, control had to be switched back and forth between strategic and tactical components.

  44. Referring expressions
      AIG received another bailout ... The insurance giant underwent little scrutiny ... A spokesman for the company ... Congressmen berated the institution as ...
      Two-step system: Control has to be switched back and forth between strategic and tactical components.
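
The AIG example shows why reference is awkward in a two-step system: the choice depends both on knowledge about the entity (it is an insurance giant, a company, an institution) and on the discourse so far (has it been mentioned yet?). A toy sketch of that interaction follows; the function `refer`, the description table, and the "cycle through descriptions" policy are all assumptions for illustration, not an algorithm from the lecture.

```python
# Hypothetical KB of alternative descriptions for each entity.
DESCRIPTIONS = {
    "AIG": ["the insurance giant", "the company", "the institution"],
}

def refer(entity, mentioned):
    """Pick a referring expression: proper name first, then a description."""
    count = mentioned.get(entity, 0)
    mentioned[entity] = count + 1            # update the discourse history
    if count == 0:
        return entity                        # first mention: use the name
    options = DESCRIPTIONS[entity]
    return options[(count - 1) % len(options)]   # later mentions: vary the form

history = {}
refs = [refer("AIG", history) for _ in range(4)]
print(refs)
```

The `mentioned` dictionary stands in for discourse state, while `DESCRIPTIONS` stands in for domain knowledge; a module that needs both at once is what motivates the intermediate microplanner.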

  47. Microplanner: Purpose
      These issues suggest the need for a third, intermediate component, often called a microplanner. This has been a major focus of research in the NLG community for the past decade or so.
      The microplanning component receives input from the text planner and determines the deep linguistic structure and content.
      The main point: the microplanner is an intermediate stage that has access to both non-linguistic information (numerical database, etc.) and linguistic knowledge (grammar and lexicon).

  48. Microplanner

  49. Microplanner input

  52. Microplanner output
      a phrase specification: a data structure that, along with the grammar, gives a full recipe for a particular phrase (e.g., clause or noun phrase), but is not the phrase itself.
      Or, for generation of entire texts, a text specification: an abstract structure representing the text without committing to certain surface forms.

  59. Phrase specifications
      Consider a phrase specification: we need two things in order to have a full recipe for the resulting NL:
      1 content: lexemes (linguistic counterparts of “concepts”)
      2 structure:
        phrasal categories,
        features (whatever is grammaticalized in the language),
        semantic role information for mapping semantic to syntactic structure
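
The two ingredients listed above, content and structure, map naturally onto a recursive record. The class below is a hypothetical sketch: the name `PhraseSpec`, its field names, and the role label `experiencer` are assumptions chosen for the example and do not correspond to SimpleNLG's classes or to any specific formalism.

```python
from dataclasses import dataclass, field

@dataclass
class PhraseSpec:
    category: str                                   # phrasal category, e.g. "S", "NP"
    head: str                                       # content: an uninflected lexeme
    features: dict = field(default_factory=dict)    # e.g. tense, definiteness
    roles: dict = field(default_factory=dict)       # semantic role -> PhraseSpec

def lexemes(spec):
    """Collect every lexeme in a phrase specification, head first."""
    out = [spec.head]
    for child in spec.roles.values():
        out.extend(lexemes(child))
    return out

# A recipe for something like "George died": not yet the phrase itself.
spec = PhraseSpec(
    category="S",
    head="DIE",
    features={"tense": "past"},
    roles={"experiencer": PhraseSpec("NP", "GEORGE", {"proper": True})},
)
content = lexemes(spec)
print(content)
```

The specification contains no inflected form such as "died"; producing that from `DIE` plus `tense: past` is left to the surface realizer and its grammar.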

  61. Lexemes
      Definition: A lexeme is an uninflected abstraction of a content word, e.g., IDEA, SLEEP, GREEN (cf. ideas, slept, greener). Dictionary entries are based on the idea of a lexeme.
      The exact specification of a lexeme takes various forms in different theoretical traditions. But in general, lexemes are abstractions over a set of word forms.

  63. Lexicon
      Definition: The lexicon is the collection of lexemes for a given language. It provides a bridge from non-linguistic (common-sense) knowledge to linguistic knowledge (the grammar).
      The lexicon enumerates the psychologically and culturally salient concepts in a language.

  69. Lexical semantics
      A key aim of lexical semantics is to investigate the mapping between language and commonsense knowledge.
      Break or squash or crush? John broke/squashed/crushed the ball.
      lexeme   semantic features
      BREAK    object: +artifact, +rigid; result: -functional
      SQUASH   object: -rigid; result: +flat, +smaller
      CRUSH    object: +rigid; result: +smaller
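The feature table above can drive a simple lexical choice step. The following is a minimal sketch, assuming the features are stored as boolean constraints; the dictionary layout and the names `LEXICON`, `matches`, and `choose_verbs` are illustrative and not part of any NLG library.

```python
# Toy lexicon transcribed from the slide's feature table.
# True stands for "+feature", False for "-feature" (an assumed encoding).
LEXICON = {
    "break": {"object": {"artifact": True, "rigid": True},
              "result": {"functional": False}},
    "squash": {"object": {"rigid": False},
               "result": {"flat": True, "smaller": True}},
    "crush": {"object": {"rigid": True},
              "result": {"smaller": True}},
}


def matches(constraints, facts):
    """True if every constrained feature has the required value in facts."""
    return all(facts.get(name) == value for name, value in constraints.items())


def choose_verbs(obj, result):
    """Return every lexeme whose object and result constraints are satisfied."""
    return [
        lexeme
        for lexeme, entry in LEXICON.items()
        if matches(entry["object"], obj) and matches(entry["result"], result)
    ]


rigid_ball = {"artifact": True, "rigid": True}
soft_ball = {"artifact": True, "rigid": False}

print(choose_verbs(rigid_ball, {"functional": False}))           # ['break']
print(choose_verbs(rigid_ball, {"smaller": True}))               # ['crush']
print(choose_verbs(soft_ball, {"flat": True, "smaller": True}))  # ['squash']
```

The common-sense facts about the ball and the outcome select the verb, which is the bridge from non-linguistic to linguistic knowledge that the lexicon is meant to provide.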

  70. Grammatical constituents
      Structuring: Once the lexicalization is complete, the information in the message needs to be transformed into a grammatical form:
      - the syntactic category (e.g., NP)
      - the syntactic role (e.g., head of phrase, complement of phrase)
      - the features relevant for the grammar (e.g., definiteness, tense, etc.)
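As a rough illustration of the three pieces of grammatical information listed above, the sketch below maps a lexicalized message to a list of constituents. The message format, the `Constituent` class, and the `grammaticalize` function are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Constituent:
    category: str   # syntactic category, e.g. "NP"
    role: str       # syntactic role, e.g. "head" or "complement"
    lexeme: str     # lexical item chosen during lexicalization
    features: dict = field(default_factory=dict)  # e.g. definiteness, tense


def grammaticalize(message):
    """Map a lexicalized message (assumed dict format) to constituents."""
    return [
        Constituent("NP", "subject", message["agent"], {"definite": True}),
        Constituent("V", "head", message["event"], {"tense": message["tense"]}),
        Constituent("NP", "complement", message["patient"], {"definite": True}),
    ]


message = {"agent": "dog", "event": "chase", "patient": "ball", "tense": "past"}
for constituent in grammaticalize(message):
    print(constituent)
```

Nothing here produces words yet; the result is a structured specification that a later realization step could turn into a string.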

  71. Microplanning sub-tasks
      In general, we can identify four subtasks for the microplanner:
      - grammaticalization: committing to specific grammatical structures (NPs, VPs, features)
      - lexicalization: committing to specific lexical items (lexemes for message content, cue words for textual relations)
      - aggregation: repackaging message content into a form that is more language-like, and less data-like
      - referring expression generation: generating specific forms for KB entities
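Of the four subtasks, aggregation is perhaps the easiest to show in a few lines. The sketch below assumes messages are simple (subject, verb, object) triples and merges those that share a subject and verb; the triple format and the function names are assumptions made for this example.

```python
def aggregate(messages):
    """Merge triples that share subject and verb by collecting their objects."""
    grouped = {}
    for subject, verb, obj in messages:
        grouped.setdefault((subject, verb), []).append(obj)
    return [(subject, verb, objs) for (subject, verb), objs in grouped.items()]


def coordinate(items):
    """Join one or more items into an English coordination."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + ", and " + items[-1]


def render(messages):
    """Produce one sentence per aggregated message."""
    return [
        f"{subject} {verb} {coordinate(objs)}."
        for subject, verb, objs in aggregate(messages)
    ]


facts = [
    ("Kim", "likes", "hiking"),
    ("Kim", "likes", "jazz"),
    ("Kim", "owns", "a dog"),
]
print(render(facts))  # ['Kim likes hiking and jazz.', 'Kim owns a dog.']
```

Three data-like facts become two sentences, which is the "more language-like, less data-like" repackaging described above.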

  72. Surface realizer
      Purpose: To generate natural language strings from a fully specified input (deterministic); the inverse of certain kinds of parsing processes.
      - determines the surface form of the text;
      - adds inflectional endings of words;
      - orders constituents;
      - misc. markup (e.g., lists, paragraphs, punctuation)
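A toy realizer makes these duties concrete. This is a minimal sketch under strong assumptions: the input is a flat dictionary with `subject`, `verb`, `tense`, and `object` keys, the subject is third person singular, and the irregular-verb table holds only two entries. None of these names come from an existing realizer.

```python
# Tiny illustrative table of irregular past-tense forms.
IRREGULAR_PAST = {"break": "broke", "go": "went"}


def inflect_verb(lemma, tense):
    """Add an inflectional ending, assuming a third person singular subject."""
    if tense == "past":
        if lemma in IRREGULAR_PAST:
            return IRREGULAR_PAST[lemma]
        return lemma + "d" if lemma.endswith("e") else lemma + "ed"
    # Present tense: sibilant-final stems take -es, everything else -s.
    if lemma.endswith(("s", "sh", "ch", "x")):
        return lemma + "es"
    return lemma + "s"


def realize(spec):
    """Order constituents, inflect the verb, then capitalize and punctuate."""
    words = [spec["subject"], inflect_verb(spec["verb"], spec["tense"]), spec["object"]]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


print(realize({"subject": "John", "verb": "break", "tense": "past", "object": "the ball"}))
# John broke the ball.
print(realize({"subject": "John", "verb": "squash", "tense": "present", "object": "the ball"}))
# John squashes the ball.
```

The function is deterministic in the sense used above: the same fully specified input always yields the same string.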

  73. Surface realizer
      Inputs: phrase specifications, or for an entire text, a text specification
      Outputs: linearized sentences, texts

  76. Details of the realizer
      The surface realizer, in general, can be separated from the rest of the NLG system. It hides the idiosyncrasies of grammar from the rest of the system.
      In theory, the output language (e.g., Spanish) could be changed by swapping out this component. It’s the most language-specific of the three components. (But this really depends on how language-neutral the other NLG components are.)
      Cutting-edge research does not focus on the realizer. Surface realization is largely a solved problem, and there are a couple of robust open-source systems.

  83. SimpleNLG
      Purpose: To take an underspecified input object (a text specification) and create a linearized string of words as output.
      Features of the system:
      - programmatic lexicon access
      - morphological component (e.g., adds -s to dog, -ren to child)
      - an inventory of morphological features
      - various output formats: HTML, txt, etc.
      - structured objects representing text hierarchy
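The morphological component mentioned above can be pictured as a lookup in the lexicon for irregular forms, with a fallback to regular rules. The sketch below is written in Python for illustration and does not use or reproduce the SimpleNLG API; the table and the `pluralize` function are assumptions.

```python
# Irregular plurals would normally come from the lexicon; this table is a stand-in.
IRREGULAR_PLURALS = {"child": "children", "ox": "oxen", "mouse": "mice"}


def pluralize(noun):
    """Return the plural of an English noun using a lexicon lookup plus rules."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun.endswith(("s", "sh", "ch", "x", "z")):
        return noun + "es"
    # Consonant + y becomes -ies (city -> cities), vowel + y just adds -s.
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"


for noun in ["dog", "child", "box", "city", "day"]:
    print(noun, "->", pluralize(noun))
```

Checking the exception table first is what lets the same function produce both dogs and children.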

  85. Elements
      “Realisation in SimpleNLG revolves around a tree structure. Each node in the tree is represented by a NLGElement, which in turn may have child nodes.”
      Direct subclasses of NLGElement. These are the primary elements:
      - DocumentElement: used to define elements that form part of the textual structure
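The tree idea in the quotation can be sketched with a handful of classes. The code below is a Python analogy only: the classes `Element`, `Word`, `Sentence`, and `Paragraph` are hypothetical and are not the SimpleNLG classes, which are written in Java and have a much richer interface.

```python
class Element:
    """A tree node that realizes itself by realizing its children in order."""

    def __init__(self, children=None):
        self.children = list(children or [])

    def realize(self):
        return " ".join(child.realize() for child in self.children)


class Word(Element):
    """A leaf node holding an already inflected word form."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    def realize(self):
        return self.text


class Sentence(Element):
    """Capitalizes the first word and adds sentence-final punctuation."""

    def realize(self):
        text = super().realize()
        return text[0].upper() + text[1:] + "."


class Paragraph(Element):
    """A structural node, comparable in spirit to a document-level element."""


paragraph = Paragraph([
    Sentence([Word("the"), Word("dog"), Word("barks")]),
    Sentence([Word("the"), Word("children"), Word("laugh")]),
])
print(paragraph.realize())  # The dog barks. The children laugh.
```

Each node only knows how to combine its own children, so the textual structure and the linguistic content stay in separate layers of the tree.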
