Taaltheorie en Taalverwerking
BSc Artificial Intelligence
Raquel Fernández
Institute for Logic, Language, and Computation
Winter 2012, lecture 2b
Raquel Fernández TtTv 2012 - lecture 2b

Announcements
• Presentations website and work plan: https://sites.google.com/site/projectstttv2012/
• Guest lecture on Machine Translation by Gideon Wenniger: http://staff.science.uva.nl/~gemaille/
• Homework

Overview
Last lecture:
• Introduction to grammars
• Chomsky hierarchy
• Grammars for natural language: not regular, mostly context-free.
Today:
• Grammars for syntactic analysis of natural languages
• Implementation of CFGs in Prolog as DCGs

Syntax
Syntax is the area of linguistics which studies the internal structure of natural language sentences.
• Not all sequences of words are well-formed sentences:
    US of criticisms policy Chomsky threats has because of death received his foreign.
    Chomsky has received death threats because of his criticisms of US foreign policy.
• Speakers have intuitions about which sentences of a language are well-formed, even if they don't know what a sentence means:
    Colourless green ideas sleep furiously (Noam Chomsky)
    The gostak distims the doshes (Andrew Ingraham)
• The same sequence of words can convey different meanings:
    The tourist saw the astronomer with the telescope

Syntactic Ambiguity
We can account for some of the different meanings of a sentence by assigning more than one possible internal structure to it.
[S [NP [Det the] [N tourist]]
   [VP [VP [V saw] [NP [Det the] [N astronomer]]]
       [PP [P with] [NP [Det the] [N telescope]]]]]

Syntactic Ambiguity
[S [NP [Det the] [N tourist]]
   [VP [V saw]
       [NP [NP [Det the] [N astronomer]]
           [PP [P with] [NP [Det the] [N telescope]]]]]]
In linguistic terminology, in one case the prepositional phrase (PP) modifies the verb phrase (VP), while in the other case it modifies the noun phrase (NP).
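
The two structures can be reproduced mechanically. The sketch below is only an illustration: the toy grammar, the lexicon, and the names `BINARY`, `LEXICON`, and `parses` are assumptions made for this example, not material from the lecture. It enumerates every tree the toy grammar assigns to the ambiguous sentence by trying all ways of splitting a span of words in two.

```python
from functools import lru_cache

# Toy grammar and lexicon: illustrative assumptions, not taken from the lecture.
BINARY = {            # rules of the form A -> B C
    "S":  [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {           # rules of the form A -> word
    "Det": {"the"},
    "N": {"tourist", "astronomer", "telescope"},
    "V": {"saw"},
    "P": {"with"},
}


def parses(words, start="S"):
    """Return every parse tree for `words`, each as a bracketed string."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(cat, i, j):
        # All trees of category `cat` that span words[i:j].
        trees = []
        if j - i == 1 and words[i] in LEXICON.get(cat, ()):
            trees.append(f"[{cat} {words[i]}]")
        for left, right in BINARY.get(cat, ()):
            # Both children cover strictly shorter spans, so the
            # left-recursive rules (NP -> NP PP, VP -> VP PP) terminate.
            for k in range(i + 1, j):
                for left_tree in build(left, i, k):
                    for right_tree in build(right, k, j):
                        trees.append(f"[{cat} {left_tree} {right_tree}]")
        return tuple(trees)

    return list(build(start, 0, len(words)))


SENTENCE = "the tourist saw the astronomer with the telescope".split()
trees = parses(SENTENCE)
for tree in trees:
    print(tree)
```

Under these assumptions the sentence receives one tree in which the PP is a sister of the inner VP and one in which it is a sister of the inner NP, matching the two readings.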

Formal Grammars for Syntactic Analysis
The number of well-formed sentences of a language is potentially infinite.
• Formal grammars allow us to generate an infinite number of sentences with finite means (a finite set of rules).
• For a grammar to be able to generate an infinite language it must be recursive.
  ∗ we can have direct recursion within a rule, or
  ∗ recursion across rules
      NP → Det N
      NP → NP PP
      PP → P NP
  ∗ there is recursion when the same non-terminal symbol appears both on the left-hand side of a rule and on the right-hand side of a (possibly different) rule that can be reached from it, so that the symbol can be rewritten into a string containing itself.
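
A minimal sketch of this notion of recursion, using the three rules from the slide. The dictionary encoding and the function names `reachable`, `recursive_nonterminals`, and `noun_phrase` are assumptions made for the example. A non-terminal counts as recursive if it can be reached from itself in one or more rewriting steps; the last function shows how repeated use of NP → NP PP yields ever longer strings.

```python
# Rules from the slide; each non-terminal maps to its right-hand sides.
RULES = {
    "NP": [["Det", "N"], ["NP", "PP"]],
    "PP": [["P", "NP"]],
}

# A grammar without recursion, for contrast (an illustrative assumption).
FLAT_RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}


def reachable(symbol, rules):
    """Non-terminals that occur in some string derived from `symbol`
    in one or more rewriting steps."""
    seen, stack = set(), [symbol]
    while stack:
        current = stack.pop()
        for rhs in rules.get(current, []):
            for sym in rhs:
                if sym in rules and sym not in seen:
                    seen.add(sym)
                    stack.append(sym)
    return seen


def recursive_nonterminals(rules):
    """Non-terminals that can be rewritten into a string containing themselves."""
    return {nt for nt in rules if nt in reachable(nt, rules)}


def noun_phrase(depth):
    """Apply NP -> NP PP `depth` times, expanding each PP as P Det N
    and closing the innermost NP with NP -> Det N."""
    if depth == 0:
        return ["Det", "N"]
    return noun_phrase(depth - 1) + ["P", "Det", "N"]


print(recursive_nonterminals(RULES))       # NP directly, PP across rules
print(recursive_nonterminals(FLAT_RULES))  # no recursive non-terminal
print(" ".join(noun_phrase(2)))
```

NP is recursive directly (NP → NP PP) and PP is recursive across rules (PP → P NP, NP → NP PP), so there is no upper bound on the length of the noun phrases the three rules generate.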

Formal Grammars for Syntactic Analysis
• A grammar can be used to decide whether a sentence is a well-formed sentence of a given language or not.
  ∗ If the sentence can be derived from the grammar rules, then it is well-formed (grammatical), i.e. part of the language specified by the grammar.
• A grammar can also be used to assign structure to a sentence.
  ∗ The computational process of assigning a syntactic structure to a sentence is called parsing, and the trees resulting from a derivation are called parse trees.
→ We'll discuss parsing next week.
(Recall our discussion of FSAs as recognizers and generators)
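
The first use, deciding well-formedness, can be sketched as a recognizer that answers only yes or no and builds no trees. The code below is one possible way to do this, a CYK-style table over a toy grammar whose rules all have the form A → B C or A → word; the grammar, the lexicon, and the name `recognize` are assumptions for this example, and the lecture does not prescribe this algorithm.

```python
# Toy grammar in binary form (an illustrative assumption).
BINARY_RULES = [      # (parent, left child, right child)
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]
LEXICAL_RULES = {
    "Det": {"the"},
    "N": {"tourist", "astronomer", "telescope"},
    "V": {"saw"},
    "P": {"with"},
}


def recognize(words, binary=BINARY_RULES, lexicon=LEXICAL_RULES, start="S"):
    """Return True if `words` can be derived from `start`, else False."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that derive words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = {cat for cat, vocab in lexicon.items() if word in vocab}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in binary:
                    if left in table[i][k] and right in table[k][j]:
                        table[i][j].add(parent)
    return start in table[0][n]


print(recognize("the tourist saw the astronomer".split()))
print(recognize("tourist the saw astronomer the".split()))
```

With respect to this toy grammar, the first word sequence is grammatical and the scrambled one is not. A parser would additionally have to record how each table entry was built.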

Grammar Equivalence
What grammar should we choose?
• Two grammars are strongly equivalent if they generate the same language (the same set of strings) and assign the same parse tree to each sentence, up to renaming of the non-terminal symbols.
• Two grammars are weakly equivalent if they generate the same language but assign different parse trees to some sentences.
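
A small sketch of weak equivalence. The two one-symbol grammars and the function `language` are assumptions made for this example. Both grammars generate the strings a, a a, a a a, and so on, but one builds right-branching trees and the other left-branching trees. The code only compares the two languages up to a length bound, which is evidence of equivalence rather than a proof.

```python
# Two hypothetical grammars over the single terminal "a".
G_RIGHT = {"S": [["a", "S"], ["a"]]}   # right-branching: [S a [S a [S a]]]
G_LEFT = {"S": [["S", "a"], ["a"]]}    # left-branching:  [S [S [S a] a] a]


def language(rules, start, max_len):
    """All terminal strings of at most `max_len` symbols derivable from `start`.

    Assumes no rule has an empty right-hand side, so a sentential form
    never shrinks and forms longer than `max_len` can be discarded.
    """
    results, seen = set(), set()
    frontier = [(start,)]
    while frontier:
        form = frontier.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        # Index of the leftmost non-terminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in rules), None)
        if idx is None:
            results.add(" ".join(form))
            continue
        for rhs in rules[form[idx]]:
            frontier.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return results


right = language(G_RIGHT, "S", 4)
left = language(G_LEFT, "S", 4)
print(sorted(right, key=len))
print(right == left)
```

The string sets agree up to the bound, while the trees for a a a differ in shape, as the comments next to the two grammars show. Under these assumptions the pair is weakly but not strongly equivalent.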