Question Answering – Advanced Natural Language Processing


Lecture 25 – Lexical Semantics 2: Question Answering and Word Sense Disambiguation
Johanna Moore (some slides by Philipp Koehn)
20 November 2012

Question Answering

• We would like to build
  – a machine that answers questions in natural language
  – may have access to knowledge bases, dictionaries, thesauri
  – may have access to vast quantities of English text
• Basically, a smarter Google
• This task is typically called Question Answering
• What will we need to be able to do this?

Example Question

• Question
  When was Barack Obama born?
• Text available to the machine
  Barack Obama was born on August 4, 1961.
• This is easy – just phrase a Google query properly:
  "Barack Obama was born on *"
• The syntactic rules that convert questions into statements are straightforward (a toy sketch follows after the next example).

Example Question (2)

• Question
  What kind of plants grow in Scotland?
• Text available to the machine
  A new chemical plant was opened in Scotland.
  Heather is just one of the many plants that grow in Scotland.
• What is hard?
  – words may have different meanings
  – we need to be able to disambiguate them
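To make the "straightforward syntactic rules" point concrete, here is a minimal sketch of one such question-to-statement rewrite rule. This is not from the lecture; the function name and the single hard-coded pattern are invented for illustration, and a real system would use a battery of such transformation rules:

```python
import re

def birth_question_to_query(question):
    """Toy rewrite rule: 'When was X born?' -> a wildcard search query."""
    m = re.match(r'When was (.+) born\?', question)
    return f'"{m.group(1)} was born on *"' if m else None

print(birth_question_to_query('When was Barack Obama born?'))
# "Barack Obama was born on *"
```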

Example Question (3)

• Question
  Do the police use dogs to sniff for drugs?
• Text available to the machine
  The police use canines to sniff for drugs.
• What is hard?
  – words may have the "same" meaning (synonyms, hyponyms)
  – we need to be able to match them

Example Question (4)

• Question
  Which animals love to swim?
• Text available to the machine
  Ice bears love to swim in the freezing waters of the Arctic.
• What is hard?
  – some words belong to groups which are referred to by other words
  – we need a database of such "A is-a B" relationships, such as the WordNet object hierarchy

Example Question (5)

• Question
  What is the name of George Bush's poodle?
• Text available to the machine
  President George Bush has a terrier called Barney.
• What is hard?
  – we need to know that poodle and terrier are related: they share a common ancestor in a taxonomy such as the WordNet object hierarchy
  – words need to be grouped together into semantically related classes

Example Question (6)

• Question
  Did Poland reduce its carbon emissions since 1989?
• Text available to the machine
  Due to the collapse of the industrial sector after the end of communism in 1989, all countries in Central Europe saw a fall in carbon emissions.
  Poland is a country in Central Europe.
• What is hard?
  – we need to do logical inference to relate the two sentences
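The is-a lookups these examples call for are exactly what WordNet provides. Below is a minimal sketch using NLTK's WordNet interface (not part of the original slides; synset names such as canine.n.02 assume the standard WordNet 3.0 sense inventory):

```python
from nltk.corpus import wordnet as wn  # assumes nltk and its 'wordnet' data are installed

dog = wn.synset('dog.n.01')
canine = wn.synset('canine.n.02')  # the animal sense, not the tooth

# Is a dog a kind of canine? Walk the transitive closure of hypernyms.
print(canine in dog.closure(lambda s: s.hypernyms()))  # should print: True

# Poodle and terrier share a common ancestor in the taxonomy.
poodle = wn.synset('poodle.n.01')
terrier = wn.synset('terrier.n.01')
print(poodle.lowest_common_hypernyms(terrier))  # expected: [Synset('dog.n.01')]
```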

Word Sense Disambiguation (WSD)

An important capability for automated question answering is word sense disambiguation, i.e., the ability to select the correct sense for each word in a given context:

  What types of plants grow in Scotland?

There are many approaches to this problem:
• Constraint satisfaction approaches
• Dictionary approaches
• Supervised ML
• Unsupervised ML

Constraint Satisfaction

Three cases:
• Disambiguate an argument by using the selection restrictions from an unambiguous predicate.
• Disambiguate a predicate by using the selection restrictions from an unambiguous argument.
• Mutual disambiguation of an argument and a predicate.

Constraint Satisfaction Examples

Disambiguating arguments using predicates:

  "In our house, everybody has a career and none of them includes washing dishes," he says.

  In her tiny kitchen at home, Ms. Chen works efficiently, stir-frying several simple dishes, including braised pig's ears and chicken livers with green peppers.

Disambiguate dishes using the selectional restrictions that the predicates washing and stir-fry place on their arguments.

Constraint Satisfaction Examples

Disambiguating predicates using arguments:
1. Well, there was the time they served green-lipped mussels from New Zealand.
2. Which airlines serve Denver?
3. Which ones serve breakfast?

• The sense of serve in 1 requires its patient to be edible.
• The sense of serve in 2 requires its patient to be a geographical entity.
• The sense of serve in 3 requires its patient to be a meal designator.
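One way to operationalise such selectional restrictions is to encode each verb sense's restriction as a WordNet class and accept an argument only if one of its senses falls under that class. The sketch below is illustrative only: the sense labels (serve-food, etc.) and the chosen restriction synsets are my own encoding of the slide's three senses, not an established resource.

```python
from nltk.corpus import wordnet as wn

# Hypothetical encoding of the three senses of "serve" from the slide,
# each restricting its patient to a WordNet class.
RESTRICTIONS = {
    'serve-food':  wn.synset('food.n.01'),      # patient must be edible
    'serve-place': wn.synset('location.n.01'),  # patient is a geographical entity
    'serve-meal':  wn.synset('meal.n.01'),      # patient is a meal designator
}

def is_a(synset, ancestor):
    """True if synset falls under ancestor in the WordNet hierarchy."""
    hypers = lambda s: s.hypernyms() + s.instance_hypernyms()
    return ancestor == synset or ancestor in synset.closure(hypers)

def compatible_senses(patient_word):
    """Senses of 'serve' whose restriction some noun sense of the patient satisfies."""
    return [sense for sense, restr in RESTRICTIONS.items()
            if any(is_a(s, restr) for s in wn.synsets(patient_word, 'n'))]

print(compatible_senses('breakfast'))  # expected: ['serve-food', 'serve-meal']
```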

Constraint Satisfaction Examples

Mutual disambiguation:

  I'm looking for a restaurant that serves vegetarian dishes.

Assuming 3 senses of serve and 2 of dishes gives 6 possible sense combinations, but only 1 satisfies all selectional restrictions.

Problems with the Constraint Satisfaction Approach

• The need to parse to get the verb-argument information needed to make it work
• Scaling up to large numbers of words (WordNet helps with this)
• Getting the details of all selectional restrictions correct
• Wider context can sanction violation of a selectional restriction:
  But it fell apart in 1931, perhaps because people realized you can't eat gold for lunch.
• Dealing with metaphorical uses that violate the constraints:
  If you want to kill the Soviet Union, get it to try to eat Afghanistan.

WSD as a Classification Problem

Assume a corpus of texts with words labeled with their senses:
• She pays 3% interest/INTEREST-MONEY on the loan.
• He showed a lot of interest/INTEREST-CURIOSITY in the painting.

Similar to POS tagging:
• given a corpus tagged with senses
• identify features that indicate one sense over another
• learn a model that predicts the correct sense given the features

We can apply similar supervised learning methods:
• Naive Bayes (a toy sketch follows below)
• Decision lists
• Decision trees, etc.

What are useful features for WSD?

  "If one examines the words in a book, one at a time as through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of the words. . . But if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then if N is large enough one can unambiguously decide the meaning of the central word. . . The practical question is: 'What minimum value of N will, at least in a tolerable fraction of cases, lead to the correct choice of meaning for the central word?'"

  Warren Weaver, "Translation" (memorandum, 1949; published 1955)
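As a concrete illustration of the supervised setup just described, here is a minimal Naive Bayes sense classifier over bag-of-context-words features, using NLTK. The training sentences are invented toy data; a real system would train on a sense-tagged corpus such as SemCor.

```python
from nltk import NaiveBayesClassifier

def features(sentence, target='interest'):
    """Bag-of-words features: which context words surround the target."""
    return {w.lower(): True for w in sentence.split() if w.lower() != target}

# Toy sense-tagged examples (invented for illustration).
train = [
    (features('she pays 3% interest on the loan'), 'INTEREST-MONEY'),
    (features('the bank raised its interest rate'), 'INTEREST-MONEY'),
    (features('he showed a lot of interest in the painting'), 'INTEREST-CURIOSITY'),
    (features('her interest in modern art grew'), 'INTEREST-CURIOSITY'),
]

classifier = NaiveBayesClassifier.train(train)
print(classifier.classify(features('the loan carries high interest')))
# 'loan' only occurs in INTEREST-MONEY examples, so we expect INTEREST-MONEY
```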

Example

  An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps.

Example collocations for the target word plant:
• plant life
• plant closure
• manufacturing plant
• assembly plant

Feature Extraction: Collocational Features

Collocational features: information about words in specific positions to the left or right of the target word.

Collocational feature vector extracted from a window of 2 words (+ POS tags) to the right and left of the target word bass:

  [guitar, NN1, and, CJC, player, NN1, stand, VVB]

Features extracted for context words:
• the word itself
• root form
• POS

Feature Extraction: Bag of Words Features

Bag of words features: all content words in an N-word window. E.g., a vector of binary features indicating whether word w, from vocabulary V, occurs in the context window.

  An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps.

  V = [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]
  Window = 10
  Bag of words feature vector: [0,0,0,1,0,0,0,0,0,0,1,0]

Other Useful Features

Of course, many other features may be included:
• Syntactically related words
• Syntactic role in the sentence
• Topic of the text
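Both feature extractors on these slides are easy to sketch. The code below reproduces the slide's collocational vector for bass and its bag-of-words vector; the padding behaviour and helper names are my own choices, not from the lecture.

```python
def collocational_features(tagged, i, window=2):
    """Words and POS tags at fixed offsets around the target at index i."""
    feats = []
    for off in list(range(-window, 0)) + list(range(1, window + 1)):
        j = i + off
        word, tag = tagged[j] if 0 <= j < len(tagged) else ('<pad>', '<pad>')
        feats += [word, tag]
    return feats

def bag_of_words_vector(context_words, vocab):
    """Binary vector: 1 if the vocabulary word occurs in the context window."""
    context = {w.lower() for w in context_words}
    return [1 if v in context else 0 for v in vocab]

# Target "bass" with the slide's CLAWS-style POS tags:
tagged = [('guitar', 'NN1'), ('and', 'CJC'), ('bass', 'NN1'),
          ('player', 'NN1'), ('stand', 'VVB')]
print(collocational_features(tagged, 2))
# ['guitar', 'NN1', 'and', 'CJC', 'player', 'NN1', 'stand', 'VVB']

vocab = ['fishing', 'big', 'sound', 'player', 'fly', 'rod', 'pound',
         'double', 'runs', 'playing', 'guitar', 'band']
print(bag_of_words_vector(['guitar', 'and', 'player', 'stand'], vocab))
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```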
