Evaluating and Extending the Coverage of HPSG Grammars: A Case Study

Evaluating and Extending the Coverage of HPSG Grammars: A Case Study for German
Jeremy Nicholson, Valia Kordoni, Yi Zhang, Timothy Baldwin, Rebecca Dridan
Department of Computational Linguistics, Saarland University & DFKI GmbH
Department of Computer Science and Software Engineering & NICTA Victoria Research Labs, University of Melbourne
28 May 2008

Deep Lexical Grammars
Deep grammars provide a full analysis and more semantic information than shallower tools
Their tendency to emphasise precision over recall can cause poor coverage
We use HPSG grammars and parsing tools from the DELPH-IN initiative
Our aim: take a “snapshot” of the grammar and examine its potential for expansion

Analysis of a “Broad-Coverage” Grammar
“Beauty and the Beast” (2004):
  Use the ERG (English Resource Grammar) to parse 20K sentences from the BNC (British National Corpus)
  Analyse sources of parse failures
“Evaluating and Extending” (today):
  Use GG (the German HPSG grammar) to parse 612K sentences from Frankfurter Rundschau
  Evaluate errors over 1K sentences
  Use lexical type prediction to increase coverage

Corpus Analysis of the Grammar
Ran a large grammar out-of-the-box on a very different corpus
Lexical span: ERG 32%; GG 28%
Sentences with correct reading attested: ERG 83%; GG 85%

       No span   Span, no parse   ≥ 1 parse
ERG    68%       14%              18%
GG     72%       16%              12%
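The figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes that lexical span is the sum of the two "span" columns, and that the "correct reading attested" figure is measured over sentences with at least one parse. The names `COVERAGE` and `summarise` are invented for this example.

```python
# Percentages copied from the slide; the dictionary layout and key names are illustrative.
COVERAGE = {
    "ERG": {"no_span": 68, "span_no_parse": 14, "parsed": 18, "correct_of_parsed": 83},
    "GG": {"no_span": 72, "span_no_parse": 16, "parsed": 12, "correct_of_parsed": 85},
}


def summarise(row):
    """Return (lexical span, share of all sentences with an attested correct reading)."""
    # Assumption: lexical span counts every sentence whose tokens are all in the
    # lexicon, whether or not a parse was found.
    lexical_span = row["span_no_parse"] + row["parsed"]
    # Assumption: the "correct reading" figure is a proportion of parsed sentences.
    correct_overall = row["parsed"] * row["correct_of_parsed"] / 100
    return lexical_span, correct_overall


for grammar, row in COVERAGE.items():
    span, correct = summarise(row)
    print(f"{grammar}: lexical span {span}%, correct reading for {correct:.1f}% of all sentences")
```

Under these assumptions the three columns sum to 100% for each grammar and reproduce the stated lexical span of 32% (ERG) and 28% (GG).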

Lexical Gaps for GG
Lexical gaps:

Error type         Proportion
lexical entries    33%
proper nouns       22%
noun compounds     30%
tokenisation       12%
garbage strings    2%

Lexical Gaps for GG
Lexical entry: Aufgrund des ruhigeren Geschäftsverlaufs rechnet Maier für 1992 mit einem “leicht rückläufigen” Ergebnis.
Noun compound: Das Türelement läßt sich hinter die Verkleidung schieben und wird damit unsichtbar.
Sophisticated tokenisation could account for proper nouns, noun compounds and tokenisation errors.

Parsing Errors for GG

Error type               Proportion
constructional gap       39%
lexical item gap         47%
multi-word expression    7%
spelling                 4%
fragment                 3%

Parsing Errors for GG
Constructional gap: BREMEN, 4. Februar.
Lexical item gap: Beginn ist um 19 Uhr in der Stadthalle.
Multi-word expression: Der Opfer dieser Verbrechen der Nationalsozialisten gedachte die Stadt Bad Homburg gestern abend.
A similar distribution was observed in “Beauty and the Beast”.

Lexical Acquisition
Baldwin (2005): use a range of morphological, syntactic and semantic features to predict the lexical type class of an unknown token/type
e.g. Katze in “Die Katze ist schwarz.” is one of count-noun-le, mass-noun-le, count-noun-mass-unit-le, deverbal-noun-le, ...

Lexical Acquisition
Feature set from Zhang and Kordoni (2006): prefixes/suffixes, 2 tokens of context, 2 types of context
Token-wise prediction on the GG treebank (MaxEnt, cross-validation)
Limit evaluation to “unknown words” (type-wise)
Accuracy approaches 60%
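To make the feature set concrete, here is a minimal sketch of a feature extractor for one token. It is not the authors' implementation. The function name `extract_features`, the affix length limit, the padding symbol and the feature-string format are all assumptions. "2 tokens of context, 2 types of context" is read here as one neighbouring token and one neighbouring lexical type on each side, and the original may use a wider window. The context type names in the example are invented.

```python
def extract_features(tokens, lexical_types, i, max_affix=4, window=1):
    """Build a binary feature dict for the token at position i.

    lexical_types[j] is the (known or predicted) lexical type of tokens[j];
    the entry for the target token itself is never read.
    """
    word = tokens[i]
    features = {}
    # Prefixes and suffixes of the target word (the length limit is an assumption).
    for n in range(1, min(max_affix, len(word)) + 1):
        features[f"prefix={word[:n]}"] = 1
        features[f"suffix={word[-n:]}"] = 1
    # Neighbouring tokens and their lexical types, padded at sentence edges.
    for offset in list(range(-window, 0)) + list(range(1, window + 1)):
        j = i + offset
        inside = 0 <= j < len(tokens)
        features[f"token[{offset:+d}]={tokens[j] if inside else '<pad>'}"] = 1
        features[f"type[{offset:+d}]={lexical_types[j] if inside else '<pad>'}"] = 1
    return features


# Hypothetical example: the context type names are made up for illustration.
tokens = ["Die", "Katze", "ist", "schwarz", "."]
types = ["det-le", None, "copula-le", "adj-le", "punct-le"]
katze_features = extract_features(tokens, types, 1)
print(sorted(katze_features))
```

Feature dicts of this shape could then be passed to any maximum entropy (multinomial logistic regression) learner, with the lexical type as the class label.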

Lexicon Extension
Using the MaxEnt model from the treebank, predict lexical types for unknown tokens within Frankfurter Rundschau
Intrinsic evaluation not possible
Thresholding MaxEnt at 10% likelihood, add 1130 lexemes to the lexicon
Further 9% coverage, 83% of these had at least one parse
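The thresholding step might look roughly like the sketch below. It is a guess at the general idea rather than the procedure actually used. The function name `types_above_threshold` is hypothetical, the probabilities are invented, and both the use of `>=` and the choice to add every type above the threshold (rather than only the best one) are assumptions.

```python
def types_above_threshold(distribution, threshold=0.10):
    """Keep every lexical type whose predicted probability reaches the threshold."""
    kept = [(lex_type, p) for lex_type, p in distribution.items() if p >= threshold]
    # Most probable type first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Invented model output for the unknown word "Katze".
predicted = {
    "count-noun-le": 0.62,
    "mass-noun-le": 0.21,
    "deverbal-noun-le": 0.09,
    "count-noun-mass-unit-le": 0.08,
}

# Each surviving (word, type) pair would become a candidate lexical entry.
new_entries = [("Katze", lex_type) for lex_type, _ in types_above_threshold(predicted)]
print(new_entries)  # [('Katze', 'count-noun-le'), ('Katze', 'mass-noun-le')]
```

A lower threshold admits more candidate entries and raises coverage at the risk of adding wrong types. A higher threshold does the opposite.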

Summary
Coverage changes from 12% of sentences parsed at 85% precision to about 20% at 84% precision
This means getting more “easy” sentences
Scope for improving the grammar and the parsing strategy (shallow methods to improve deep parsing)
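One reading that is consistent with the reported figures is sketched below. It assumes that the "further 9% coverage" is a share of all sentences and that 83% of those newly covered sentences receive at least one parse. This interpretation and the function name `parsed_after_extension` are assumptions, not something the slides state.

```python
def parsed_after_extension(parsed_before, extra_coverage, parse_rate_new):
    """Percentage of sentences with at least one parse after extending the lexicon."""
    # Newly covered sentences only count if they also receive a parse.
    return parsed_before + extra_coverage * parse_rate_new


# 12% parsed before, 9% further coverage, 83% of which parse.
after = parsed_after_extension(12.0, 9.0, 0.83)
print(f"{after:.1f}%")  # 19.5%
```

Under this reading the result, 12 + 9 × 0.83 ≈ 19.5, matches the "about 20%" quoted in the summary.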
