Statistical Parsing: Paper Presentation


  1. Statistical Parsing: Paper presentation
  Eugene Charniak and Mark Johnson (2005). “Coarse-to-fine N-best Parsing and MaxEnt Discriminative Reranking”. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. ACL ’05. Ann Arbor, Michigan: Association for Computational Linguistics, pp. 173–180. doi: 10.3115/1219840.1219862. url: http://dx.doi.org/10.3115/1219840.1219862
  Çağrı Çöltekin, University of Tübingen, Seminar für Sprachwissenschaft, December 2016

  2. The general idea
  • A two-stage parsing process
    – n-best generative parser with limited/local features
    – discriminative re-ranker with lots of global features
  • The problems/issues
    – Efficient n-best parsing is non-trivial
    – The features/methods for re-ranking

  3. N-best parsing: the problem
  • Beam search (n-best parsing) is tricky with dynamic programming:
    – Space complexity becomes an issue; theoretical complexity for bi-lexical grammars: O(nm^3)
  • Potential solutions:
    – Abandon dynamic programming, use a backtracking parser (slow)
    – Keep dynamic programming with (clever) tricks (potentially resulting in approximate solutions)

  4. Coarse-to-fine n-best parsing
  • First parse with a coarse (non-lexicalized) PCFG
  • Prune the parse forest, removing the branches with probability less than a threshold (about 10^-4); see the sketch below
  • Lexicalize the pruned parse forest
    + Conditions on information that the non-lexicalized PCFG does not have
    − Increases the number of dynamic programming states, but space complexity seems to stay sub-quadratic (ad-hoc calculation: the observed number of states stays below 100 · L^1.5)
  [Figure: observed number of dynamic programming states plotted against average sentence length L, staying below the curve 100 · L^1.5]
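
  A minimal sketch of the pruning step, assuming the coarse pass produces a forest of edges annotated with inside and outside probabilities; the `Edge` class and `prune_forest` function are illustrative names, not the paper's code. An edge survives when its posterior probability (inside times outside, normalized by the sentence probability) clears the threshold:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    """A constituent in the coarse parse forest (hypothetical representation)."""
    label: str      # non-terminal category
    start: int      # span start
    end: int        # span end
    inside: float   # inside probability of the edge
    outside: float  # outside probability of the edge

def prune_forest(edges, sentence_prob, threshold=1e-4):
    """Keep only edges whose posterior probability clears the threshold.

    The posterior inside * outside / P(sentence) is the probability that
    the constituent occurs in a parse of the sentence under the coarse PCFG.
    """
    return [e for e in edges
            if e.inside * e.outside / sentence_prob >= threshold]
```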

  5. Getting the n-best parses with dynamic programming
  • For each span and non-terminal (CKY chart entry), keep only the n-best analyses
  • Note: if the lists are sorted by probability, combination does not require n^2 time (see the sketch below)
  • Space efficiency does not seem to be a problem in practice (only a few MB)
  • N-best oracle results (cf. 89.7% F-score of the base parser):

    n       | 1     | 2     | 10    | 25    | 50
    F-score | 0.897 | 0.914 | 0.948 | 0.960 | 0.968
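
  The note about sorted lists can be made concrete with the standard lazy k-best enumeration trick (in the spirit of Huang and Chiang's k-best algorithms; the paper does not necessarily use this exact procedure): starting from the best pair, only the successors of popped candidates are explored, so producing the top n combinations of two sorted lists costs about O(n log n) heap operations rather than n^2 multiplications:

```python
import heapq

def combine_nbest(left, right, n):
    """Top-n products of two probability-sorted lists without n^2 work.

    `left` and `right` are probabilities in decreasing order. The best
    combination is left[0] * right[0]; whenever a pair (i, j) is popped,
    only its two successors (i+1, j) and (i, j+1) need to be considered.
    """
    if not left or not right:
        return []
    heap = [(-left[0] * right[0], 0, 0)]  # max-heap via negated scores
    seen = {(0, 0)}
    out = []
    while heap and len(out) < n:
        neg, i, j = heapq.heappop(heap)
        out.append(-neg)
        for i2, j2 in ((i + 1, j), (i, j + 1)):
            if i2 < len(left) and j2 < len(right) and (i2, j2) not in seen:
                seen.add((i2, j2))
                heapq.heappush(heap, (-left[i2] * right[j2], i2, j2))
    return out
```

  For example, `combine_nbest([0.5, 0.3, 0.1], [0.4, 0.2], 3)` returns `[0.2, 0.12, 0.1]` after exploring only five of the six possible pairs.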

  6. Re-ranking
  • Having 50-best parses from the base parser, the idea now is to re-rank them
  • Each parse tree is converted to a numeric vector of features (see the sketch below)
  • The first feature is the log probability assigned by the base parser
  • Other features are assigned based on templates
    – For example, f_{eat pizza}(y) counts the number of times the head of a parse tree was ‘eat’ with complement ‘pizza’
    – Note: they distinguish between ‘lexical’ and ‘functional’ heads
  • After discarding rare features, the total number of features is 1 148 697
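
  A sketch of what the feature mapping and re-ranking amount to; `base_parser_prob` and `template_feature_occurrences()` are hypothetical accessors standing in for the paper's actual feature extraction, and template features are simple occurrence counts:

```python
import math
from collections import Counter

def features(parse):
    """Map a candidate parse to a sparse feature vector (illustrative)."""
    vec = Counter()
    vec["log_prob"] = math.log(parse.base_parser_prob)  # the first feature
    for feat in parse.template_feature_occurrences():   # e.g. ('head', 'eat', 'pizza')
        vec[feat] += 1                                   # count occurrences
    return vec

def rerank(nbest, weights):
    """Pick the candidate with the highest linear score w . f(y)."""
    def score(parse):
        return sum(weights.get(f, 0.0) * v for f, v in features(parse).items())
    return max(nbest, key=score)
```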

  7. Feature templates
  • LexFunHeads: POS tags of lexical and functional heads
  • Heads: head-to-head dependencies between preterminal heads and their terminal heads
  • NGram: ngrams (bigrams) of the siblings
  • Rule: whether nodes are annotated with their ancestors’ categories
  • CoPar: conjunct parallelism
  • CoLenPar: length difference between conjuncts, including a flag indicating final conjuncts
  • Neighbors: preterminals before/after the node
  • Heavy: categories and their lengths, including whether they are final or they follow a punctuation
  • RightBranch: number of non-terminals that (do not) lie on the path between the root and the rightmost terminal

  8. Feature templates (cont.)
  • WProj: preterminals with the categories of their closest ℓ maximal projection ancestors
  • Word: lexical items with their closest ℓ maximal projection ancestors
  • HeadTree: tree fragments consisting of the local trees consisting of the projections of a preterminal node and the siblings of such projections
  • NGramTree: subtrees rooted in the least common ancestor of ℓ contiguous preterminal nodes
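
  As a concrete illustration of one of these templates, here is a sketch of the RightBranch counts, assuming trees are encoded as nested `(label, child, ...)` tuples with strings as terminals; this encoding, and ignoring the special treatment of punctuation, are simplifying assumptions rather than the paper's exact definition:

```python
def right_branch(tree):
    """Count non-terminals on and off the spine from the root to the
    rightmost terminal (punctuation handling omitted for simplicity)."""
    on_spine, off_spine = 0, 0

    def count(node, spine):
        nonlocal on_spine, off_spine
        if isinstance(node, str):   # terminal: nothing to count
            return
        if spine:
            on_spine += 1
        else:
            off_spine += 1
        children = node[1:]
        for i, child in enumerate(children):
            # only the last child of a spine node stays on the spine
            count(child, spine and i == len(children) - 1)

    count(tree, True)
    return on_spine, off_spine

# For ("S", ("NP", "dogs"), ("VP", ("V", "bark"))) this yields (3, 1):
# S, VP, V lie on the right spine, NP does not.
```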

  9. Results/Conclusions
  • State-of-the-art parsing of PTB with a generative n-best parser, followed by discriminative re-ranking

    Parser  | F-score
    Collins | 0.9037
    New     | 0.9102

  • Also better than 0.907 reported by Bod (2003), but more efficient
  • 13% error reduction over the base parser (or maybe even 18%, considering PTB is not perfect)
  • The parser is publicly available

  10. Parameter estimation for re-ranking
  • They use a maximum-entropy model (= logistic regression)
  • Feature weights are calculated by minimizing the L2-regularized negative log-likelihood (see the sketch below)
  • A slight divergence: the gold-standard parse is not always in the n-best list
    – Pick the tree(s) that are most similar to the gold-standard tree (with the best F-score)
    – In case of ties (multiple best trees), prefer the solution maximizing the log likelihood of all
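
  A sketch of the objective being minimized, under the simplifying assumption of a single oracle candidate per sentence (the tie-handling over multiple best trees described above is omitted) and with an assumed regularization constant `c`:

```python
import math

def nll_l2(weights, nbest_lists, oracle_indices, c=1.0):
    """L2-regularized negative log-likelihood of the re-ranker (a sketch).

    Each element of `nbest_lists` holds the sparse feature dicts of one
    sentence's candidate parses; `oracle_indices[i]` marks the candidate
    closest to the gold tree (best F-score), standing in for the gold
    parse when it is absent from the n-best list.
    """
    loss = c * sum(w * w for w in weights.values())  # L2 penalty
    for cands, oracle in zip(nbest_lists, oracle_indices):
        scores = [sum(weights.get(f, 0.0) * v for f, v in feats.items())
                  for feats in cands]
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        loss -= scores[oracle] - log_z  # negative conditional log-likelihood
    return loss
```

  In practice the weights would typically be fit by handing this objective (and its gradient) to a quasi-Newton optimizer such as L-BFGS.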

  11. Summary
  • Accurate generative parser that breaks down rules
  • Breaking down the rules has good properties (can use rules that were not seen in the training data)
  • Either conditioning on adjacency or subcategorization is needed for good accuracy
  • The models work well with flat dependencies
  • Does well on ‘core’ dependencies; adjuncts and coordination are the main sources of error

  12. Bibliography
  Bod, Rens (2003). “An Efficient Implementation of a New DOP Model”. In: Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics - Volume 1. EACL ’03. Budapest, Hungary: Association for Computational Linguistics, pp. 19–26. isbn: 1-333-56789-0. doi: 10.3115/1067807.1067812. url: http://dx.doi.org/10.3115/1067807.1067812
  Charniak, Eugene and Mark Johnson (2005). “Coarse-to-fine N-best Parsing and MaxEnt Discriminative Reranking”. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. ACL ’05. Ann Arbor, Michigan: Association for Computational Linguistics, pp. 173–180. doi: 10.3115/1219840.1219862. url: http://dx.doi.org/10.3115/1219840.1219862
  Collins, Michael and Terry Koo (2005). “Discriminative Reranking for Natural Language Parsing”. In: Computational Linguistics 31.1, pp. 25–70. issn: 0891-2017. doi: 10.1162/0891201053630273. url: http://dx.doi.org/10.1162/0891201053630273
