
Phrase Structures and Syntax (ANLP: Lecture 11)



  1. Phrase Structures and Syntax
     ANLP: Lecture 11
     Shay Cohen, School of Informatics, University of Edinburgh
     8 October 2019

     Until now...
     ◮ Focused mostly on regular languages
     ◮ Finite state machines and transducers
     ◮ n-gram models
     ◮ Hidden Markov Models
     ◮ Viterbi search and friends
     ◮ ...
     Next: going up one level in the Chomsky hierarchy

     Recap: The Chomsky hierarchy
     [Figure: nested language classes, innermost to outermost: Regular, Context-free, Context-sensitive, Recursively enumerable]

     Side note: Is English Regular?
     Centre-embedding:
       [The cat_1 likes tuna fish_1].
       [The cat_1 [the dog_2 chased_2] likes tuna fish_1].
       [The cat_1 [the dog_2 [the rat_3 bit_3] chased_2] likes tuna fish_1].
     Consider
       L = { (the N)^n TV^m likes tuna fish | n, m ≥ 0 }
     where
       N = { cat, dog, rat, elephant, kangaroo, ... }
       TV = { chased, bit, admired, ate, befriended, ... }
     Clearly L is regular. However, L ∩ English is the language
       { (the N)^n TV^(n−1) likes tuna fish | n ≥ 1 }
     The pumping lemma can be used to show that this intersection is not regular. Since L is regular and regular languages are closed under intersection, English cannot be regular either.
     Assumption 1. “(the N)^n TV^m likes tuna fish” is ungrammatical for m ≠ n − 1.
     Assumption 2. “(the N)^n TV^(n−1) likes tuna fish” is grammatical for all n ≥ 1.
     (Is this reasonable? You decide!)
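The contrast between L and L ∩ English can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary is cut down to the nouns and verbs named on the slide, Assumptions 1 and 2 are taken as given, and the function names `in_L` and `in_L_and_english` are hypothetical. Membership in L needs nothing beyond a regular expression, while membership in the intersection also requires comparing the two counts, which no finite-state device can do for unbounded n.

```python
import re

# Assumed toy vocabulary, taken from the examples on the slide.
NOUNS = ["cat", "dog", "rat", "elephant", "kangaroo"]
VERBS = ["chased", "bit", "admired", "ate", "befriended"]

N = "|".join(NOUNS)
TV = "|".join(VERBS)

# L = (the N)^n TV^m likes tuna fish, with n, m >= 0.
# Group 1 captures the noun phrases, group 2 the transitive verbs.
L_PATTERN = re.compile(rf"^((?:the (?:{N}) )*)((?:(?:{TV}) )*)likes tuna fish$")


def in_L(sentence):
    """Membership in the regular language L (a regex is enough)."""
    return L_PATTERN.match(sentence) is not None


def in_L_and_english(sentence):
    """Membership in L ∩ English under Assumptions 1 and 2: m must equal n - 1."""
    match = L_PATTERN.match(sentence)
    if match is None:
        return False
    n = match.group(1).split().count("the")  # number of "the N" phrases
    m = len(match.group(2).split())          # number of transitive verbs
    return n >= 1 and m == n - 1


print(in_L("the cat the dog likes tuna fish"))                     # True
print(in_L_and_english("the cat the dog likes tuna fish"))         # False
print(in_L_and_english("the cat the dog chased likes tuna fish"))  # True
```

The counting step in `in_L_and_english` sits outside the regular expression on purpose: it is the part that the pumping-lemma argument says cannot be folded into any finite-state recogniser.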

  2. The NLP Pipeline
     [Figure: the NLP pipeline]

     Grammar Writing Exercise
     Date: October 25 (Friday during lecture time)
     ◮ You will write a grammar for the English language.
     ◮ There will be a competition between the grammars for “precision” and “recall”.
     ◮ You should be able to start working on your grammar by the end of this class.
     ◮ More details here: http://www.inf.ed.ac.uk/teaching/courses/anlp/cgw
     ◮ There will be prizes!

     Computing meaning
     A well-studied, difficult, and unsolved problem.
     Fortunately, we know enough to have made partial progress (Watson won).
     Over the next few weeks, we will work up to the study of systems that can assign logical forms that mathematically state the meaning of a sentence, so that they can be processed by machines.
     Our first stop will be natural language syntax.

     Natural language syntax
     Syntax provides the scaffolding for semantic composition.
       The brown dog on the mat saw the striped cat through the window.
       The brown cat saw the striped dog through the window on the mat.
     Do the two sentences above mean the same thing? What is the process by which you computed their meanings?

  3. Constituents
     Words in a sentence often form groupings that can combine with other units to produce meaning. These groupings, called constituents, can often be identified by substitution tests (much like parts of speech!).
       Kim [read a book], [gave it to Sandy], and [left].
       You said I should read the book and [read it] I did.
       Kim read [a very interesting book about grammar].

     Heads and Phrases
       Noun (N): Noun Phrase (NP)
       Adjective (A): Adjective Phrase (AP)
       Verb (V): Verb Phrase (VP)
       Preposition (P): Prepositional Phrase (PP)
     ◮ So far we have looked at terminals (words or POS tags).
     ◮ Today, we’ll look at non-terminals, which correspond to phrases.
     ◮ The part of speech that a word belongs to is closely linked to the type of constituent that it is associated with.
     ◮ In an X-phrase (e.g. NP), the key occurrence of X (e.g. N) is called the head, and controls how the phrase interacts (both syntactically and semantically) with the rest of the sentence.
     ◮ In English, the head tends to appear in the middle of a phrase.

     Constituents have structure
     English NPs are commonly of the form:
       (Det) Adj* Noun (PP | RelClause)*
       NP: the angry duck that tried to bite me, head: duck.
     VPs are commonly of the form:
       (Aux) Adv* Verb Arg* Adjunct*
       Arg → NP | PP
       Adjunct → PP | AdvP | ...
       VP: usually eats artichokes for dinner, head: eat.
     In Japanese, Korean, Hindi, Urdu, and other head-final languages, the head is at the end of its associated phrase.
     In Irish, Welsh, Scots Gaelic and other head-initial languages, the head is at the beginning of its associated phrase.

     WALS - Subject Verb Object order
     [Figure: world map of subject, verb and object order]
     Taken from https://wals.info/feature/81A#2/5.6/172.8
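The NP template (Det) Adj* Noun (PP | RelClause)* is itself a regular pattern over category symbols, so it can be checked with a regular expression. The sketch below assumes the phrase has already been segmented into a list of category labels, writes Noun as `N`, and uses the hypothetical name `matches_np_template`. It says nothing about how PPs or relative clauses are recognised internally.

```python
import re

# (Det) Adj* Noun (PP | RelClause)* over space-separated category labels.
NP_TEMPLATE = re.compile(r"^(Det )?(Adj )*N( (PP|RelClause))*$")


def matches_np_template(categories):
    """Check a sequence of category labels against the NP template."""
    return NP_TEMPLATE.match(" ".join(categories)) is not None


# "the angry duck that tried to bite me": Det Adj N RelClause
print(matches_np_template(["Det", "Adj", "N", "RelClause"]))  # True
# An adjective before the determiner does not fit the template.
print(matches_np_template(["Adj", "Det", "N"]))               # False
```

The template is flat only because PP and RelClause are treated as atomic labels here. Once a PP is allowed to contain an NP again, a regular expression no longer suffices and a phrase structure grammar is needed.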

  4. Desirable Properties of a Grammar
     Chomsky specified two properties that make a grammar “interesting and satisfying”:
     ◮ It should be a finite specification of the strings of the language, rather than a list of its sentences.
     ◮ It should be revealing, in allowing strings to be associated with meaning (semantics) in a systematic way.
     We can add another desirable property:
     ◮ It should capture structural and distributional properties of the language. (E.g. where heads of phrases are located; how a sentence transforms into a question; which phrases can float around the sentence.)

     Desirable Properties of a Grammar
     ◮ Context-free grammars (CFGs) provide a pretty good approximation.
     ◮ Some features of NLs are more easily captured using mildly context-sensitive grammars, as we’ll see later in the course.
     ◮ There are also more modern grammar formalisms that better capture structural and distributional properties of human languages. (E.g. combinatory categorial grammar.)
     ◮ Programming language grammars (such as the ones used with compilers, like LL(1)) aren’t enough for NLs.

     A Tiny Fragment of English
     Let’s say we want to capture in a grammar the structural and distributional properties that give rise to sentences like:
       A duck walked in the park.          NP,V,PP
       The man walked with a duck.         NP,V,PP
       You made a duck.                    Pro,V,NP
       You made her duck.                  ? Pro,V,NP
       A man with a telescope saw you.     NP,PP,V,Pro
       A man saw you with a telescope.     NP,V,Pro,PP
       You saw a man with a telescope.     Pro,V,NP,PP
     We want to write grammatical rules that generate these phrase structures, and lexical rules that generate the words appearing in them.

     Grammar for the Tiny Fragment of English
     Grammar G1 generates the sentences on the previous slide:
       Grammatical rules      Lexical rules
       S  → NP VP             Det  → a | the | her                   (determiners)
       NP → Det N             N    → man | park | duck | telescope   (nouns)
       NP → Det N PP          Pro  → you                             (pronoun)
       NP → Pro               V    → saw | walked | made             (verbs)
       VP → V NP PP           Prep → in | with | for                 (prepositions)
       VP → V NP
       VP → V
       PP → Prep NP
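To experiment with G1, the rules can be written down as a Python dictionary and paired with a small top-down recogniser. This is a minimal sketch, not the lecture's implementation: the dictionary encoding, the name `recognizes`, and the normalisation of the input (lower-casing and dropping the final full stop) are all assumptions made here. The recogniser relies on G1 having no left-recursive rules.

```python
from functools import lru_cache

# G1: each non-terminal maps to a list of right-hand sides (tuples of symbols).
G1 = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "N", "PP"), ("Pro",)],
    "VP": [("V", "NP", "PP"), ("V", "NP"), ("V",)],
    "PP": [("Prep", "NP")],
    "Det": [("a",), ("the",), ("her",)],
    "N": [("man",), ("park",), ("duck",), ("telescope",)],
    "Pro": [("you",)],
    "V": [("saw",), ("walked",), ("made",)],
    "Prep": [("in",), ("with",), ("for",)],
}


def recognizes(sentence, grammar=G1, start="S"):
    """Return True if the grammar derives the sentence from the start symbol."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        """All positions j such that symbol derives words[i:j]."""
        if symbol not in grammar:  # terminal: must match the next word
            if i < len(words) and words[i] == symbol:
                return frozenset({i + 1})
            return frozenset()
        result = set()
        for rhs in grammar[symbol]:
            positions = {i}
            for child in rhs:  # thread the end positions through the rule
                positions = {j for p in positions for j in ends(child, p)}
            result |= positions
        return frozenset(result)

    return len(words) in ends(start, 0)


print(recognizes("You saw a man with a telescope."))  # True
print(recognizes("A man with a telescope saw you."))  # True
print(recognizes("Duck a walked."))                   # False
```

One caveat when trying the other example sentences: with the rules exactly as transcribed here (VP → V, but no VP → V PP), this recogniser rejects "A duck walked in the park." Accepting it would need an extra rule such as VP → V PP, which is an assumption and does not appear in the rule list above.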

  5. Context-free grammars: formal definition
     A context-free grammar (CFG) G consists of
     ◮ a finite set N of non-terminals,
     ◮ a finite set Σ of terminals, disjoint from N,
     ◮ a finite set P of productions of the form X → α, where X ∈ N, α ∈ (N ∪ Σ)*,
     ◮ a choice of start symbol S ∈ N.

     A sentential form is any sequence of terminals and nonterminals that can appear in a derivation starting from the start symbol.
     Formal definition: The set of sentential forms derivable from G is the smallest set S(G) ⊆ (N ∪ Σ)* such that
     ◮ S ∈ S(G)
     ◮ if αXβ ∈ S(G) and X → γ ∈ P, then αγβ ∈ S(G).
     The language associated with a grammar is the set of sentential forms that contain only terminals.
     Formal definition: The language associated with G is defined by
       L(G) = S(G) ∩ Σ*
     A language L ⊆ Σ* is defined to be context-free if there exists some CFG G such that L = L(G).

     Assorted remarks
     ◮ X → α_1 | α_2 | ··· | α_n is simply an abbreviation for a bunch of productions X → α_1, X → α_2, ..., X → α_n.
     ◮ These grammars are called context-free because a rule X → α says that an X can always be expanded to α, no matter where the X occurs.
       This contrasts with context-sensitive rules, which might allow us to expand X only in certain contexts, e.g. bXc → bαc.
     ◮ Broad intuition: context-free languages allow nesting of structures to arbitrary depth. E.g. brackets, begin-end blocks, if-then-else statements, subordinate clauses in English, ...

     Grammar for the Tiny Fragment of English
     Recall grammar G1, given earlier.
     Does G1 produce a finite or an infinite number of sentences?
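The definitions of S(G) and L(G) translate almost directly into code. The sketch below is illustrative only. It uses a hypothetical toy grammar `TOY` (S → a S b | a b, not from the lecture) and bounds the search by a maximum form length. That pruning is sound only because the toy grammar has no empty productions, so sentential forms never shrink.

```python
from collections import deque


def sentential_forms(productions, start, max_len):
    """S(G) restricted to forms of length <= max_len (assumes no empty productions)."""
    nonterminals = set(productions)
    seen = {(start,)}              # S is in S(G)
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        for i, symbol in enumerate(form):
            if symbol not in nonterminals:
                continue
            # If alpha X beta is in S(G) and X -> gamma is in P,
            # then alpha gamma beta is in S(G).
            for gamma in productions[symbol]:
                new_form = form[:i] + tuple(gamma) + form[i + 1:]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return seen


def language(productions, start, max_len):
    """L(G) = S(G) ∩ Σ*, again restricted to strings of length <= max_len."""
    nonterminals = set(productions)
    return {form for form in sentential_forms(productions, start, max_len)
            if all(symbol not in nonterminals for symbol in form)}


# Hypothetical toy grammar generating a^n b^n for n >= 1.
TOY = {"S": [("a", "S", "b"), ("a", "b")]}

print(sorted("".join(w) for w in language(TOY, "S", 6)))
# ['aaabbb', 'aabb', 'ab']
```

The toy language also illustrates the nesting intuition from the remarks above: each application of S → a S b wraps one more matched pair around the string.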

  6. Recursion
     Recursion in a grammar makes it possible to generate an infinite number of sentences.
     In direct recursion, a non-terminal on the LHS of a rule also appears on its RHS. The following rules add direct recursion to G1:
       VP → VP Conj VP
       Conj → and | or
     In indirect recursion, some non-terminal can be expanded (via several steps) to a sequence of symbols containing that non-terminal:
       NP → Det N PP
       PP → Prep NP

     Structural Ambiguity
     You saw a man with a telescope.
     Parse 1 (the PP attaches to the VP):
       [S [NP [Pro You]] [VP [V saw] [NP [Det a] [N man]] [PP [Prep with] [NP [Det a] [N telescope]]]]]
     Parse 2 (the PP attaches to the NP):
       [S [NP [Pro You]] [VP [V saw] [NP [Det a] [N man] [PP [Prep with] [NP [Det a] [N telescope]]]]]]
     This illustrates attachment ambiguity: the PP can be a part of the VP or of the NP. Note that there’s no POS ambiguity here.
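The attachment ambiguity can be checked mechanically by counting the parse trees G1 assigns to a sentence. The following is a minimal sketch with assumptions of its own: the dictionary encoding of G1 and the name `count_parses` are hypothetical, the input is lower-cased with the final full stop removed, and the counting scheme relies on the grammar having no empty productions and no cycles of unary rules.

```python
from functools import lru_cache

# G1: each non-terminal maps to a list of right-hand sides (tuples of symbols).
G1 = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "N", "PP"), ("Pro",)],
    "VP": [("V", "NP", "PP"), ("V", "NP"), ("V",)],
    "PP": [("Prep", "NP")],
    "Det": [("a",), ("the",), ("her",)],
    "N": [("man",), ("park",), ("duck",), ("telescope",)],
    "Pro": [("you",)],
    "V": [("saw",), ("walked",), ("made",)],
    "Prep": [("in",), ("with",), ("for",)],
}


def count_parses(sentence, grammar=G1, start="S"):
    """Count the distinct parse trees the grammar assigns to the sentence."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        """Number of trees rooted in symbol whose yield is words[i:j]."""
        if symbol not in grammar:  # terminal: covers exactly one matching word
            return int(j == i + 1 and words[i] == symbol)
        return sum(count_seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        """Number of ways the symbol sequence rhs can yield words[i:j]."""
        if not rhs:
            return int(i == j)
        first, rest = rhs[0], rhs[1:]
        # Every symbol covers at least one word, so leave room for the rest.
        return sum(count(first, i, k) * count_seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return count(start, 0, len(words))


print(count_parses("You saw a man with a telescope."))  # 2
print(count_parses("You saw a man."))                   # 1
```

The two trees counted for the first sentence correspond to the two attachments: one uses VP → V NP PP, the other uses VP → V NP together with NP → Det N PP.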
