Statistical Parsing
Grammars
Çağrı Çöltekin
University of Tübingen, Seminar für Sprachwissenschaft
October 27, 2016

This course is about …
[Figure: the phrase 'statistical constituency and dependency parsing of natural languages', shown with a constituency tree (node labels NP, PP, JJ, NN, CC, IN) and with dependency arcs (labels amod, nmod, conj, cc, case).]

Why do we need syntactic parsing?
• Often, syntactic analysis is an intermediate step that helps the (semantic) interpretation of sentences, hence it is useful for applications like question answering and information extraction
• (Statistical) parsers are also used as language models for applications like speech recognition and machine translation
• It can be used for grammar checking, and can be a useful tool for linguistic research

Ingredients of a parser
• A grammar
• An algorithm for parsing
• A method for ambiguity resolution

Plan of the lecture
• Constituency grammars
• Dependency grammars
• Brief notes on some major grammar formalisms

Grammars and grammar formalisms
The term grammar is used for
• a description of the whole system/structure of a language, as in a 'grammar (book) of English'
• a grammar formalism, often developed as a theory of language, as in HPSG, LFG, CCG
• a formal (finite) specification of a language as a possibly infinite set of strings (not necessarily a natural language)

Constituency grammars
• Constituency grammars are probably the most studied grammars, both in linguistics and in computer science
• The main idea is that groups of words form natural units, or 'constituents', like noun phrases or verb phrases
• 'Phrase structure grammar' and 'context-free grammar' are often used as synonyms
Example tree, in bracket notation: [S [NP John] [VP [V saw] [NP Mary]]]
Note: many grammar formalisms use constituency grammars in some way; we will not focus on a particular grammar formalism here.

What is a constituent?
Linguists offer a number of tests for constituency, such as
• Constituents can answer questions: Q: 'What did John do?' → A: 'saw Mary' (but, presumably, there is no question with the answer 'John saw')
• Substitution with a pro-form: 'John [read the book] last week.' → 'John [did that] last week.'
• Fronting, topicalization: 'John likes [reading books]' → '[Reading books], John likes'
• Coordination: 'John [saw Mary] and [said 'hi']'
• …
Note, however, that these tests are leaky, e.g., '[John saw] and [Peter greeted] Mary' (see Müller 2016, for more examples).
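The constituency tree for the example sentence 'John saw Mary' can be sketched as a small data structure. The snippet below is only an illustration: the nested-tuple encoding and the helper names `words` and `constituents` are assumptions made here, not part of the lecture. It lists every labeled node together with the words it covers.

```python
# Hypothetical encoding (an assumption for this sketch):
# a node is (label, child, child, ...); a leaf is a plain word string.
tree = ("S",
        ("NP", "John"),
        ("VP", ("V", "saw"), ("NP", "Mary")))


def words(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    covered = []
    for child in node[1:]:
        covered.extend(words(child))
    return covered


def constituents(node):
    """Return (label, covered words) for every labeled node, top-down."""
    if isinstance(node, str):
        return []
    found = [(node[0], " ".join(words(node)))]
    for child in node[1:]:
        found.extend(constituents(child))
    return found


for label, span in constituents(tree):
    print(label, "->", span)
```

In this tree 'saw Mary' is covered by a VP node, while no node covers exactly 'John saw', which matches the question test for constituency.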
Formal definition
A phrase structure grammar is a tuple $(\Sigma, N, S, R)$, where
• $\Sigma$ is a set of terminal symbols
• $N$ is a set of non-terminal symbols
• $S \in N$ is a distinguished start symbol
• $R$ is a set of rules of the form $\alpha A \beta \to \gamma$ for $A \in N$ and $\alpha, \beta, \gamma \in (\Sigma \cup N)^*$

Example derivation
The example grammar:
S → NP VP
VP → V NP
NP → John | Mary
V → saw
• Phrase structure grammars derive a sentence by successive application of rewrite rules:
  S ⇒ NP VP ⇒ John VP ⇒ John V NP ⇒ John saw NP ⇒ John saw Mary
  or, $S \Rightarrow^{*}$ John saw Mary
• The grammar accepts a sentence if it can be derived from S with the rewrite rules R
• The intermediate forms that contain non-terminals are called sentential forms
Resulting tree, in bracket notation: [S [NP John] [VP [V saw] [NP Mary]]]

Chomsky hierarchy of grammars
type 0: Recursively enumerable, recognized by Turing machines (HPSG, LFG). Rules: $\alpha A \beta \to \gamma$
type 1: Context sensitive, recognized by linear-bounded automata. Rules: $\alpha A \beta \to \alpha \gamma \beta$, $\gamma \neq \epsilon$
type 2.1: Mildly context sensitive (TAG, CCG)
type 2: Context free, recognized by push-down automata. Rules: $A \to \alpha$
type 3: Regular, recognized by finite-state automata. Rules: $A \to aB$ or $A \to Ba$
In all of the above, $A$ and $B$ are non-terminals, $a$ is a terminal symbol, $\alpha$, $\beta$, $\gamma$ are sequences of terminals and non-terminals, and $\epsilon$ is the empty string.

Some examples
• Regular grammars (finite-state automata) do not have any memory
  They can represent $a^* b^*$, but not $a^n b^n$
• Finite-state automata are used in many tasks in CL, including morphological analysis and partial parsing
• Context-free grammars (push-down automata) use a stack
  They can represent $a^n b^n$ and $a^n b^m c^m d^n$, but not $a^n b^m c^n d^m$
• Context-free grammars form the basis of most natural language parsers
• Context-sensitive grammars can do all of the above, but they are too powerful, hence too expensive
• Some level of context sensitivity seems to be necessary for some syntactic phenomena

Expressiveness of grammar classes
• Which class of grammars is adequate for formally describing natural languages has been an important question for (computational) linguistics
• For the most part, context-free grammars are enough, but there are some counterexamples, e.g., from Swiss German (Shieber 1985):
  Jan säit das… …mer em Hans es huss hälfed aastriiche
  …we Hans (dat) house (acc) helped paint
Note that this resembles $a^n b^m c^n d^m$.

Chomsky hierarchy: the picture
[Figure: nested regions labeled Regular, Context Free, Context Sensitive and Recursively Enumerable, with a dashed ellipse and a shaded region as described below.]
• The language classes of the Chomsky hierarchy form a hierarchy (with some care about the empty language)
• It is often claimed that mildly context-sensitive grammars (dashed ellipse) are adequate for representing natural languages
• Note, however, that not even every regular language is a potential natural language (e.g., $a^* b b c^*$). The possible natural languages probably cross-cut this hierarchy (shaded region)

Constituency grammars and parsing
• Context-free grammars can be parsed in $O(n^3)$ time using dynamic programming algorithms
• Mildly context-sensitive grammars can also be parsed in polynomial time ($O(n^6)$)
• Often greedy search algorithms are used (even for CFG or equivalent classes of grammars)

Constituency grammars summary
• Constituency, or phrase structure, grammars build on the idea that some words form constituents (non-terminals in a formal grammar)
• They are well studied, both in linguistics and in computer science
• Context-free grammars are the most common class of phrase structure grammars used in parsing natural or programming languages (maybe with some extensions)
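As a rough illustration of the $O(n^3)$ dynamic programming idea, here is a minimal CYK-style recognizer for the four-rule example grammar. This is a sketch under assumptions: the slides do not name a specific algorithm, CYK is chosen here only because the example grammar happens to be in Chomsky normal form already, and the names `BINARY`, `LEXICAL` and `cyk_recognize` are made up for this example.

```python
from itertools import product

# The example grammar, split into binary and lexical rules.
# (Dictionary layout is an assumption of this sketch.)
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
LEXICAL = {"John": {"NP"}, "Mary": {"NP"}, "saw": {"V"}}


def cyk_recognize(tokens, start="S"):
    """Return True if the grammar derives the token sequence from `start`."""
    n = len(tokens)
    if n == 0:
        return False
    # chart[i][j] holds the non-terminals that derive tokens[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, token in enumerate(tokens):
        chart[i][i + 1] = set(LEXICAL.get(token, ()))
    # Three nested loops over span width, start and split point: O(n^3).
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return start in chart[0][n]


print(cyk_recognize("John saw Mary".split()))
print(cyk_recognize("saw John Mary".split()))
```

The recognizer only answers whether a sentence is in the language; a parser would additionally store back-pointers in each chart cell, and a statistical parser would attach scores to resolve ambiguity.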