Ambiguity and the Lexicon in Natural Language
Informatics 2A: Lecture 12

Bonnie Webber (revised by Frank Keller)
School of Informatics
University of Edinburgh
keller@inf.ed.ac.uk

12 October 2007

Informatics 2A: Lecture 11 Ambiguity and the Lexicon in Natural Language

1 Ambiguity in Language
    Structural Ambiguity
    Types of Ambiguity
2 The Lexicon
    Closed vs. Open Classes
    Parts of Speech
    Lexical Ambiguity
3 Semantic Ambiguity: Scope and Reference

Readings:
    J&M (1st edition) ch. 8 (pp. 287–298), ch. 10 (pp. 372–376) or
    J&M (2nd edition) ch. 5 (pp. 1–11), ch. 13 (pp. 7–8);
    NLTK Tutorial: Elementary Language Processing.

Reminder: NLTK labs start next week.


Structural Ambiguity

In both formal and natural languages, meaning is derived (in part) from the structure underlying its strings (Weeks 8 & 9).

Parsers are procedures for recovering this structure, as a basis for computing meaning and/or mapping to another form (Weeks 5–7).

Recall from Week 2 that for CFGs, the structure of a string can be represented as a tree diagram.

The tree diagram represents the set of derivations used in producing a string from S (the start symbol of the grammar).


Structural Ambiguity

Given a grammar, some strings can be associated with more than one structure (i.e., non-equivalent derivations). Such strings are structurally ambiguous.

Even if a string is structurally ambiguous, the agent producing it usually only has one meaning in mind, so only one of the structures corresponds to what s/he intended.

Example: Newspaper Headlines
    stolen painting found by tree
    lung cancer in women mushrooms
    dealers will hear car talk at noon
    miners refuse to work after death
    juvenile court to try shooting defendant
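
A minimal sketch of what "more than one structure" means in practice, using the headline "miners refuse to work after death". The toy grammar, the category names (INF, TO, PP, ...) and the function name parses are assumptions made for illustration; they are not taken from the lecture or from NLTK.

```python
from functools import lru_cache

# Toy grammar (hypothetical, for illustration only).
# Each binary rule reads: parent -> left right
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "INF"),    # refuse [to work]
    ("VP", "VP", "PP"),    # a verb phrase modified by a PP
    ("INF", "TO", "VP"),   # to [work]
    ("PP", "P", "NP"),     # after [death]
]
# Lexicon: word -> categories it can have in this toy grammar.
LEXICON = {
    "miners": ["NP"], "refuse": ["V"], "to": ["TO"],
    "work": ["VP"], "after": ["P"], "death": ["NP"],
}


def parses(words, goal="S"):
    """Return every bracketed tree of category `goal` covering all words."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(cat, i, j):
        # All trees of category `cat` spanning words[i:j].
        found = []
        if j - i == 1 and cat in LEXICON.get(words[i], []):
            found.append(f"({cat} {words[i]})")
        for parent, left, right in RULES:
            if parent != cat:
                continue
            # Try every split point; both halves are non-empty.
            for k in range(i + 1, j):
                for lt in trees(left, i, k):
                    for rt in trees(right, k, j):
                        found.append(f"({cat} {lt} {rt})")
        return tuple(found)

    return list(trees(goal, 0, len(words)))


for tree in parses("miners refuse to work after death".split()):
    print(tree)
```

Under this toy grammar the headline receives two trees: in one, "after death" attaches to "work"; in the other, it attaches to "refuse to work". Only one of them is what the headline writer intended.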
Structural Ambiguity

The designers of formal languages (e.g., XML) or programming languages try to eliminate or reduce structural ambiguity.

Example
    Python's use of indentation to indicate embedding and of no indentation to indicate sequence.

Spoken Natural Language has means of disambiguating structural ambiguity that are absent in written language.

Example
    lung cancer in WOMEN | mushrooms
    dealers will hear CAR TALK at noon

Intonation is one of the reasons we don't normally see how ambiguous natural language is.


Structural Ambiguity

Some languages produced by CFGs are inherently ambiguous.

    G1                      G2
    S  -> S' C              S   -> A S''
    S' -> a S' b | a b      S'' -> b S'' c | b c
    C  -> c C | c           A   -> a A | a

Consider the language produced by G1 ∪ G2:

    a^n b^n c^m ∪ a^j b^k c^k,    m, n, j, k ≥ 1

How many distinct non-equivalent derivations (tree structures) are there for the simple string abc?
How many for the string aabbcc?
How about any string of the form a^n b^n c^n?


Types of Ambiguity

Given a string from a language, we want a parser to deliver either its most likely structure or all possible structures (for another procedure to examine further).

If a string doesn't belong to the language, what the parser should do depends on the application.

There are several types of ambiguity we will discuss:
    global structural ambiguity (see examples above)
    local structural ambiguity (important for human parsing: Lecture 30)
    lexical ambiguity (examples coming up)
    scope ambiguity (examples coming up)
    referential ambiguity (examples coming up)


Types of Ambiguity

Scope ambiguity and referential ambiguity are interpretation issues (Lecture 23). Parsers ignore them.

Parsers can't ignore lexical and structural ambiguity: They must have strategies for dealing with them.

To understand why ambiguity is so central to parsing, we'll look first at the lexical and structural properties of Natural Language (NL) and how they are rife with ambiguity.

All parsers for NL are based on those for CFG. However, it is known that NL is not context-free: It has more complex dependencies than simple embedding (Lecture 29).

NLTK Lite (Python add-on) will allow us to study parsers without having to build them ourselves.
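
The questions posed earlier about G1 ∪ G2 can be explored mechanically. The sketch below counts the distinct parse trees a string has under the union grammar. The names S1 and S2 stand in for S' and S'', and the function name count_trees is hypothetical; the code also assumes that no rule has an empty right-hand side, which holds for G1 and G2.

```python
from functools import lru_cache

# G1 ∪ G2, written with S1 for S' and S2 for S'' (renaming is ours).
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "S":  [("S1", "C"), ("A", "S2")],       # G1 start rule, G2 start rule
    "S1": [("a", "S1", "b"), ("a", "b")],   # a^n b^n
    "C":  [("c", "C"), ("c",)],             # c^m
    "A":  [("a", "A"), ("a",)],             # a^j
    "S2": [("b", "S2", "c"), ("b", "c")],   # b^k c^k
}


def count_trees(string, start="S"):
    """Count the distinct parse trees of `string` rooted in `start`."""

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        # Number of ways `sym` derives string[i:j].
        if sym not in GRAMMAR:  # terminal symbol
            return 1 if j - i == 1 and string[i] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives string[i:j].
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        # Every symbol covers at least one character, so leave
        # len(rest) characters for the remaining symbols.
        return sum(
            count_symbol(first, i, k) * count_seq(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return count_symbol(start, 0, len(string))


for s in ["abc", "aabbcc", "aabc", "abcc"]:
    print(s, count_trees(s))
```

Under these assumptions, abc and aabbcc each get two trees, one from G1 and one from G2, whereas aabc (only in G2's language) and abcc (only in G1's) get one each.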
Closed vs. Open Classes

Σ: set of terminal symbols

Even in formal languages, we can distinguish two subsets within Σ:

Closed class symbols associated with particular productions and their meaning.

    FOL:     S → (∀|∃) Variable Formula
    Python:  S → for Var in ListOrDictionary : S+
             S → from Module import Namelist

Any language has relatively few closed class tokens, and users rarely, if ever, introduce new ones.

Open class symbols, which are fully productive: users continually introduce new instances.


Lexicon in Natural Languages

NL lexicons contain tens of thousands of words, with new ones constantly being created.

Grammars or models for NL can be largely specified in terms of the classes that words belong to, rather than words themselves.

Word classes found in all Indo-European languages and in many other language families:
    nouns
    verbs
    adjectives
    adverbs

All are open classes. New open class words enter English all the time, e.g., blogger (N), ping (V), google for (V), entrepreneurial (Adj).


Parts of Speech

How do we tell what part of speech (word class) a word belongs to?

To make this decision, linguists have developed a set of criteria:
    Notional (semantic) criteria: What does it refer to?
    Distributional (syntactic) criteria: Where is it found?
    Formal (morphological) criteria: What does it look like?

We will look at each of the parts of speech (POS) in turn.


Nouns

Notional criterion: nouns generally refer to living things (mouse), places (Scotland), things (projector), or concepts (intelligence).

Distributional criterion: nouns can appear after determiners like the or before relative pronouns like that.
    Example: the blob/mouse/university that ate Chicago

Formal criterion: words that end in -ness, -tion, -ity, and -ance tend to be nouns.
    Example: happiness, exertion, levity, significance

Formal and distributional criteria help people (and machines) recognize the class of unknown words.
    Example: Within the conurbation, open countryside is limited to a network of corridors.
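
A rough sketch of how a machine might apply the formal and distributional criteria for nouns to an unknown word such as conurbation. The function name noun_evidence is hypothetical, and the determiner set goes slightly beyond the lecture, which mentions only the; the suffix list is the one given above. These are hints rather than a reliable tagger.

```python
# Suffixes from the formal criterion in the lecture.
NOUN_SUFFIXES = ("ness", "tion", "ity", "ance")
# Assumption: a tiny determiner list; the lecture only names "the".
DETERMINERS = {"the", "a", "an"}


def noun_evidence(tokens, index):
    """List which criteria suggest that tokens[index] is a noun."""
    word = tokens[index].lower().strip(".,;:!?")
    evidence = []
    # Formal (morphological) criterion: what does the word look like?
    if word.endswith(NOUN_SUFFIXES):
        evidence.append("formal")
    # Distributional (syntactic) criterion: does it follow a determiner?
    if index > 0 and tokens[index - 1].lower() in DETERMINERS:
        evidence.append("distributional")
    return evidence


sentence = ("Within the conurbation, open countryside is limited "
            "to a network of corridors.")
tokens = sentence.split()
for i, token in enumerate(tokens):
    print(token, noun_evidence(tokens, i))
```

Both criteria fire for conurbation, which is how a reader who has never seen the word can still classify it. The checks also misfire: an adjective directly after a determiner would pass the distributional test, which is why the lecture says such words only tend to be nouns.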