Ambiguity and the Lexicon in Natural Language
Informatics 2A: Lecture 12

Bonnie Webber
School of Informatics, University of Edinburgh
bonnie@inf.ed.ac.uk
17 October 2008

Outline

1 Ambiguity in Language
    Derivations and Structural Ambiguity
    Dealing with Ambiguity
2 The Lexicon
    Word Classes
    Parts of Speech
    Part of Speech Ambiguity
    Word Frequency

Readings: J&M (2nd edition) ch. 5 (intro, sec 5.1), ch. 13 (sec 13.2)
NLTK Tutorial: Words
Reminder: NLTK labs start next week (Week 5)

Review: Derivations

Recall from Lecture 4 that equivalent derivations are ones that only differ in the order of non-terminal expansion.
Recall also that the set of equivalent derivations of a string from a context-free (CF) phrase structure grammar (PSG) can be represented as a tree.
A tree makes no commitment as to the order in which non-terminals are expanded.
However, not all derivations of a given string from a given grammar are equivalent.

Example

    NP → NP VBG
    NP → N PP
    NP → N
    PP → about NP
    N → complaints | referees
    VBG → multiplying

Consider the string:
    complaints about referees multiplying
How many non-equivalent sets of derivations (i.e., different trees) are there for this string?
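
One way to check an answer to the question above is to enumerate the trees mechanically. The following is a minimal sketch, not the method used in the course: it assumes the grammar exactly as written on the Example slide, and the names GRAMMAR, parse_trees, derive and expand are made up for this illustration. It tries every way of splitting a span of words among the symbols on the right-hand side of a rule, and prints each tree as a bracketed string.

```python
from functools import lru_cache
from itertools import product

# Grammar from the Example slide; each right-hand side is a tuple of symbols.
# Symbols without an entry here ("about", "complaints", ...) are terminals.
GRAMMAR = {
    "NP": [("NP", "VBG"), ("N", "PP"), ("N",)],
    "PP": [("about", "NP")],
    "N": [("complaints",), ("referees",)],
    "VBG": [("multiplying",)],
}


def parse_trees(words, start="NP"):
    """Return every parse tree of `words` rooted in `start`, as bracketed strings."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # A terminal must match exactly one word of the input.
        if symbol not in GRAMMAR:
            return (symbol,) if j - i == 1 and words[i] == symbol else ()
        found = []
        for rhs in GRAMMAR[symbol]:
            for parts in expand(rhs, i, j):
                found.append("[" + symbol + " " + " ".join(parts) + "]")
        return tuple(found)

    def expand(rhs, i, j):
        # Yield one tuple of subtrees per way of dividing words[i:j] among rhs.
        if len(rhs) == 1:
            for tree in derive(rhs[0], i, j):
                yield (tree,)
            return
        # Each remaining symbol needs at least one word, so stop early enough.
        for k in range(i + 1, j - len(rhs) + 2):
            rest_options = tuple(expand(rhs[1:], k, j))
            for head, rest in product(derive(rhs[0], i, k), rest_options):
                yield (head,) + rest

    return list(derive(start, 0, len(words)))


if __name__ == "__main__":
    for tree in parse_trees("complaints about referees multiplying".split()):
        print(tree)
```

Because the left-recursive rule NP → NP VBG is only ever applied to a strictly shorter span, the recursion terminates. Each printed bracketing stands for one set of equivalent derivations, since a tree does not record the order in which non-terminals were expanded.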

Complaints about referees multiplying

[Figures: parse trees for "complaints about referees multiplying", with nodes labelled NP, PP, N and VBG]

Derivations and structural ambiguity

Given a grammar, those strings that can be associated with more than one tree (i.e., non-equivalent derivations) are called structurally ambiguous.
Even if a string is structurally ambiguous, the agent producing it usually only has one meaning in mind, so only one of the structures corresponds to what s/he intended.

Example: Newspaper Headlines
    stolen painting found by tree
    lung cancer in women mushrooms
    dealers will hear car talk at noon
    miners refuse to work after death
    juvenile court to try shooting defendant

Avoiding Ambiguity

The designers of formal languages (e.g., XML) or programming languages try to eliminate or reduce structural ambiguity.

Example
    Python's use of indentation to indicate embedding and of no indentation to indicate sequence.

When we talk, we can use speech rate, pauses and emphasis to indicate what we intend.

Example
    lung cancer in WOMEN | mushrooms
    dealers will hear CAR TALK at noon

This is one reason why we don't normally notice that NL strings can have multiple analyses (and multiple meanings!).

Handling Ambiguity

Given a string from a language, the role of a parser is to deliver either its most likely structure or all its possible structures (for another procedure to examine further).
In weeks 5-7, we'll look at various techniques that parsers use to do this efficiently.
Fortunately, NLTK Lite (Python add-on) will allow us to study parsers without having to build them ourselves.
But structural ambiguity is not the only form of ambiguity in Natural Language that causes problems for parsers.
To understand part-of-speech ambiguity, we need to look at word classes (aka "parts of speech") in Natural Language.

Word Classes in Formal/Programming Languages

Every grammar for describing a language contains
    a set of non-terminal symbols
    a set of terminal symbols (Σ) that appear in strings in the language.
But within Σ, we can distinguish:
    those symbols that convey information about the structure of a string and the roles that other symbols play.

    Example
        FOL:    S → (∀|∃) Variable Formula
        Python: S → for Var in ListOrDictionary : S+
                S → from Module import Namelist

    all other symbols

Lexicon in Natural Languages

Words (and punctuation) comprise the terminal symbols in (the written form of) a Natural Language.
But NL grammars are most often largely specified in terms of the classes that words belong to.
Several word classes are found in all Indo-European languages and in other language families as well: nouns, verbs, adjectives, adverbs.
Other word classes are more specific to particular languages: prepositions, particles, determiners, conjunctions, interjections
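
The split of terminal symbols into structure-bearing symbols and all other symbols, described above for formal and programming languages, can be seen by tokenising a line of Python. The sketch below is only an illustration: treating keywords and operators as the "structural" symbols is an assumption made here, and classify_tokens is a made-up helper name. The modules tokenize, keyword and io are part of the Python standard library.

```python
import io
import keyword
import tokenize


def classify_tokens(source):
    """Label each token of a Python snippet as 'structural' or 'other'.

    Assumption for this sketch: keywords and operators count as structural;
    identifiers, numbers and string literals count as other symbols.
    """
    labels = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            kind = "structural" if keyword.iskeyword(tok.string) else "other"
        elif tok.type == tokenize.OP:
            kind = "structural"
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            kind = "other"
        else:
            # Skip layout tokens such as NEWLINE, INDENT and ENDMARKER.
            continue
        labels.append((tok.string, kind))
    return labels


if __name__ == "__main__":
    for token, kind in classify_tokens("from module import name\n"):
        print(token, kind)
```

In the rule S → from Module import Namelist, the words from and import come out as structural, while the module and the imported name fall into the class of all other symbols.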

Parts of Speech

How do we tell what word class (part of speech) a word belongs to?
At least three different criteria can be used:
    Notional (semantic) criteria: What does the word refer to?
    Distributional (syntactic) criteria: Where is the word found?
    Formal (morphological) criteria: What does the word look like?
We will look at different parts of speech (POS) using these criteria.

Nouns

Notionally, nouns generally refer to living things (mouse), places (Scotland), things (projector), or concepts (intelligence).
Distributionally, nouns appear after determiners like the or before relative pronouns like that.
    Example: the blob/mouse/university that ate Chicago
Formally, words ending in -ness, -tion, -ity, and -ance tend to be nouns.
    Example: happiness, exertion, levity, significance

Verbs

Notionally, verbs refer to actions (sleep, wash, give).
Distributionally, verbs can be classified by the number of arguments they co-occur with:
    intransitive verbs (1 arg): Smoke rises.
    transitive verbs (2 args): John washed the glass, The cat groomed itself.
    ditransitive verbs (3 args): John served us steak, Mary gave Fred a toothpick.
    verbs with 4 args: Fred transferred the glass from the table to the shelf.
Formally, words that end in -ate or -ize tend to be verbs, and ones that end in -ing are often the present participle of a verb.
    Example: automate, calibrate, equalize, modernize; rising, washing, grooming.

Adjectives

Notionally, adjectives describe things that are nouns (small, wee, salubrious, excellent).
Distributionally, adjectives usually appear before a noun or after a form of be.
    Example: wee drop; The food is excellent.
Formally, words that end in -al, -ble, and -ous tend to be adjectives.
    Example: formal, invisible, capable, salubrious, parlous
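
The formal (morphological) criteria for nouns, verbs and adjectives can be tried out directly. The sketch below is a rough illustration rather than a real part-of-speech tagger: the suffix lists are copied from the slides, while SUFFIX_CLASSES and guess_classes are names invented for this example.

```python
# Suffixes taken from the Nouns, Verbs and Adjectives slides.
SUFFIX_CLASSES = [
    ("noun", ("ness", "tion", "ity", "ance")),
    ("verb", ("ate", "ize", "ing")),
    ("adjective", ("al", "ble", "ous")),
]


def guess_classes(word):
    """Return the word classes whose typical suffixes match `word`.

    The result is only a guess: the suffixes mark tendencies, not rules,
    and a word with no listed suffix gets an empty list.
    """
    word = word.lower()
    # str.endswith accepts a tuple and is true if any suffix matches.
    return [cls for cls, suffixes in SUFFIX_CLASSES if word.endswith(suffixes)]


if __name__ == "__main__":
    for w in ["happiness", "modernize", "salubrious", "mouse", "chocolate"]:
        print(w, guess_classes(w))
```

The slide examples come out as expected, but mouse matches no suffix and chocolate is guessed to be a verb, which is why the slides say only that such words "tend to be" of a given class. The notional and distributional criteria are needed as well.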