SLIDE 1
Evaluating grammar formalisms for applications to natural language processing and biological sequence analysis
David Chiang
28 June 2004
SLIDE 2
Applications of grammars
- Statistical parsing (Charniak, 1997; Collins, 1997)
- Language modeling (Chelba and Jelinek, 1998)
- Statistical machine translation (Wu, 1997; Yamada and Knight,
2001)
- Prediction or modeling of RNA/protein structure (Searls, 1992)
SLIDE 3
Applications of grammars
- Grammars are a convenient way to...
  - encode bits of theories (subcategorization, SVO/SOV/VSO)
  - structure algorithms (searching through word alignments, chain foldings)
- A difficulty of using grammars: we don't know what kind to use
SLIDE 4
The overarching question
What makes one grammar better than another?
- Weak generative capacity (WGC): what strings does a grammar
generate?
- Strong generative capacity (SGC): what structural descriptions (SDs) does a grammar generate?
  - specifies whatever is needed to determine how the sentence is used and understood (Chomsky)
  - not just phrase-structure trees
SLIDE 5
Weak vs. strong generative capacity
- Chomsky:
  - WGC is the only area in which substantial results of a mathematical character have been achieved
  - SGC is by far the more interesting notion
- Theory focuses on WGC because it's easier to compare strings than
to compare SDs
- Applications are concerned with SGC because SDs contain the
information that eventually gets used
- Occasional treatments of SGC (Kuroda, 1976; Miller, 1999), but nothing directed towards computational applications
SLIDE 6
Objective
- Ask the right questions: refine SGC so that it is rigorous (unlike before) and relevant (unlike WGC) to applications
- Answer the questions and see what the consequences are for
applications
- Three areas:
  - Statistical natural language parsing
  - Natural language translation
  - Biological sequence analysis
SLIDE 7
Historical example: cross-serial dependencies
- Example from Dutch:
  dat   Jan   Piet   de kinderen   zag   helpen   zwemmen
  that  Jan   Piet   the children  saw   help     swim
  'that Jan saw Piet help the children swim'
- Looks like the non-context-free {ww}, but is actually context-free, like {a^n b^n} (Pullum and Gazdar, 1982)
- How to express intuition that this is beyond the power of CFG?
SLIDE 8
Historical example: a solution
Two things had to happen to show this was beyond CFG but within TAG (Joshi, 1985):
- 1. A different notion of generative capacity: not strings, but strings with links representing dependencies (derivational generative capacity)
  dat Jan Piet de kinderen zag helpen zwemmen
  [links: Jan-zag, Piet-helpen, de kinderen-zwemmen]
- 2. A locality constraint on how grammars generate these objects: links must be confined to a single elementary structure
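To make the linked-strings idea concrete, here is a minimal sketch (mine, not the dissertation's): a clause is a token list plus index pairs for the NP-verb dependencies, and a simple test distinguishes the crossing (cross-serial) pattern from the nested (context-free-style) one.

```python
# Minimal sketch (not from the dissertation): a "linked string" is a token
# list plus (i, j) index pairs; links_cross tests for interleaved links.

def links_cross(links):
    """True if some links (i, j) and (k, l) interleave as i < k < j < l."""
    return any(i < k < j < l
               for (i, j) in links for (k, l) in links)

tokens = ["dat", "Jan", "Piet", "de", "kinderen", "zag", "helpen", "zwemmen"]
# NP-verb dependencies: Jan-zag, Piet-helpen, de kinderen-zwemmen
dutch_links = [(1, 5), (2, 6), (4, 7)]
print(links_cross(dutch_links))   # True: cross-serial

# The nested pattern a CFG's local links would naturally produce:
nested_links = [(1, 7), (2, 6), (4, 5)]
print(links_cross(nested_links))  # False
```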
SLIDE 9
Historical example: a solution
- CFG can't do this
  S → Piet S helpen
  S → de kinderen S zwemmen
- TAG can
  [Tree diagrams: TAG elementary trees anchored by "Piet ... helpen" and "de kinderen ... zwemmen", each containing its linked NP and verb]
SLIDE 10
Historical example: a solution
  [Tree diagrams: the two elementary trees combine by adjunction, deriving the cross-serial word order while each link stays inside one elementary tree]
SLIDE 11
Miller (1999): relativized SGC
- Generalize from DGC to many notions of SGC
- Miller: SGC should not compare SDs, but interpretations of SDs in
various domains
  [Diagram: structural descriptions interpreted into various domains: strings, trees, linked strings, weighted parse trees, translated strings]
SLIDE 12
Joshi et al.: Local grammar formalisms
- Generalize from TAG to many formalisms, retaining the idea of locality:
  - SDs built out of a finite set of elementary structures
  - Interpretation functions factor into local interpretation functions defined on elementary structures
- Linear context-free rewriting systems (Weir, 1988) or simple literal
movement grammar (Groenink, 1997)
SLIDE 13
Combined framework
- Choose interpretation domains to measure SGC in a sense suitable
for applications
- Define how interpretation functions should respect locality of grammars
- Show how various formalisms compare
- Test them by experiments (or thought experiments!)
SLIDE 14
Overview of comparisons: statistical parsing
  [Diagram: in both the tree and weighted-tree domains, TIG sits above CFG = TSG = RF-TAG = clMC-CFG]
SLIDE 15
Overview of comparisons: translation
  [Diagram: for tree relations, RF-TAG, clMC-CFG, TIG, TSG, CFG, and 2CFG form a hierarchy; for string relations, RF-TAG and clMC-CFG sit above CFG = TSG = TIG, which sits above 2CFG]
SLIDE 16
Overview of comparisons: biological sequence analysis
  [Diagram: for weighted linked strings, RF-TAG, clMC-CFG, and CFG ∩ FSA sit above CFG = TSG = TIG]
SLIDE 17
First application: statistical parsing
- Measuring statistical-modeling power of grammars
- A negative result leads to a reconceptualization of some current
parsers
- Experiments on a stochastic TAG-like model
SLIDE 18
Measuring modeling power
- Statistical parsers use probability distributions over parse structures (trees)
- Statistical parsing models map from parse structures to products of parameters
  - History-based: event sequences
  - Maximum-entropy: feature vectors
- Right notion of SGC: parse structures with generalized weights
SLIDE 19
Measuring modeling power
- Locality constraint: weights must be decomposed so that each elementary structure gets a fixed weight
- History-based: each elementary structure gets a single event (e.g.,
PCFG) or event sequence, combine by concatenation
- Maximum-entropy: each elementary structure gets a feature vector
(Chiang, 2003; Miyao and Tsujii, 2002), combine by addition
- Grammars with semiring weights
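A sketch of the common pattern, with illustrative names of my own: each elementary structure contributes one weight, and a single associative operation folds them together, concatenation for event sequences and componentwise addition for feature vectors.

```python
# Illustrative sketch (my own names): one fold covers both weight regimes.

from operator import add

def score_derivation(weights, combine, identity):
    """Fold per-elementary-structure weights into a derivation weight."""
    total = identity
    for w in weights:
        total = combine(total, w)
    return total

# History-based: event sequences combine by concatenation.
events = [["S->NP VP"], ["NP->Qintex"], ["VP->would VP"]]
print(score_derivation(events, add, []))

# Maximum-entropy: feature vectors combine by componentwise addition.
def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]

features = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
print(score_derivation(features, vec_add, [0, 0, 0]))  # [4, 1, 3]
```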
SLIDE 20
Modeling power for free?
- We might hope that there are formalisms with the same parsing
complexity as, say, CFG that have greater modeling power than PCFG
- Often a weakly CF formalism has a parsing algorithm which
dynamically compiles the grammar G down to a CFG (a cover grammar)
- Easy to show that weights can be chosen for the cover to give the
same weights as G
SLIDE 21
Modeling power for free?
  [Diagram, repeated from above: in both domains, TIG sits above CFG = TSG = RF-TAG = clMC-CFG]
- Not very promising
- However, we may still learn something. . .
SLIDE 22
Example: cover grammar of a TSG
- A tree-substitution grammar
  (NP (NNP Qintex))
  (S NP↓ (VP (MD would) VP↓))
  (VP (VB sell) NP↓)
  (NP (NNS assets))
- Constructing a cover grammar, step 1: label each node with its elementary tree (substitution sites get ∗)

  (NP(α) (NNP(α) Qintex(α)))
  (S(β) NP(∗) (VP(β) (MD(β) would(β)) VP(∗)))
  (VP(γ) (VB(γ) sell(γ)) NP(∗))
  (NP(δ) (NNS(δ) assets(δ)))
SLIDE 23
Example: cover grammar of a TSG
- Constructing a cover grammar, step 2:
  NP(α) → NNP(α)
  NNP(α) → Qintex(α)
  S(β) → NP(∗) VP(β)
  VP(β) → MD(β) VP(∗)
  MD(β) → would(β)
  VP(γ) → VB(γ) NP(∗)
  VB(γ) → sell(γ)
  NP(δ) → NNS(δ)
  NNS(δ) → assets(δ)
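A toy sketch of the construction under my own tree encoding (nested (label, children) tuples; None marks a substitution site): decorate each node with its elementary tree's index and read off one cover rule per internal node.

```python
# Toy sketch (my own encoding, not the dissertation's): emit cover-CFG rules
# from a decorated TSG elementary tree.

def cover_rules(tree, idx, rules):
    label, children = tree
    rhs = []
    for child in children:
        if isinstance(child, str):      # lexical anchor
            rhs.append(f"{child}({idx})")
        elif child[1] is None:          # substitution site
            rhs.append(f"{child[0]}(*)")
        else:                           # internal node: recurse
            rhs.append(f"{child[0]}({idx})")
            cover_rules(child, idx, rules)
    rules.append(f"{label}({idx}) -> {' '.join(rhs)}")
    return rules

beta = ("S", [("NP", None),
              ("VP", [("MD", ["would"]), ("VP", None)])])
for rule in cover_rules(beta, "β", []):
    print(rule)
# MD(β) -> would(β)
# VP(β) -> MD(β) VP(*)
# S(β) -> NP(*) VP(β)
```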
SLIDE 24
Example: cover grammar of a TSG
- But this is almost identical to the PCFGs many current parsers use (Charniak, 1997, 2000; Collins, 1997, 1999):

  NP(Qintex) → NNP(Qintex)
  NNP(Qintex) → Qintex
  S(would) → NP(∗) VP(would)
  VP(would) → MD(would) VP(∗)
  MD(would) → would
  VP(sell) → VB(sell) NP(∗)
  VB(sell) → sell
  NP(assets) → NNS(assets)
  NNS(assets) → assets
- Think of these PCFGs as a compiled version of something with
richer SDs, like a TSG
SLIDE 25
Lexicalized PCFG
Train from the Treebank by using heuristics (head rules, argument rules) to create lexicalized trees:

  (S(would)
    (NP(Qintex) (NNP(Qintex) Qintex))
    (VP(would)
      (MD(would) would)
      (VP(sell)
        (VB(sell) sell)
        (PRT(off) (RP(off) off))
        (NP(assets) (NNS(assets) assets)))))
SLIDE 26
Lexicalized PCFG as a cover grammar
- Conventional wisdom: propagation of head words rearranges
lexical information in trees to bring pairs of words together
- But experiments show that bilexical statistics not as important as
lexico-structural statistics (Gildea, 2001; Bikel, 2004)
- These lexico-structural statistics live in the propagation paths and subcategorization frames
- New view: what matters is the structural information reconstructed
heuristically
SLIDE 27
A stochastic TIG model (Chiang, 2000)
- Direct implementation of the new view. Why?
- Sometimes better not to use head word as a proxy
- Greater flexibility (e.g., multi-headed elementary trees)
- Alternative training method
SLIDE 28
A stochastic TIG model (Chiang, 2000)
  [Tree diagrams: TIG elementary trees anchored by Qintex, would, sell (with particle), and assets, combining into the derived tree for the full sentence]
  Pi(α): probability of starting with initial tree α
  Ps(α | η): probability of substituting α at node η
  Psa(α | η, i): probability of sister-adjoining α under η between the ith and (i+1)st children
  Pa(β | η): probability of adjoining β at node η (β's foot node must be at its left or right corner)
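A sketch of how a derivation is scored under such a model, with invented tree names and made-up probabilities (the real tables are estimated from data; sister-adjunction is omitted for brevity): the derivation's probability is the product of one parameter per event.

```python
# Sketch with invented tables: a derivation scores as a product of events.

import math

P_init = {"alpha_sell": 0.4}
P_subst = {("alpha_Qintex", "NP.subj"): 0.3, ("alpha_assets", "NP.obj"): 0.2}
P_adjoin = {("beta_would", "VP"): 0.1}
tables = {"init": P_init, "subst": P_subst, "adjoin": P_adjoin}

derivation = [
    ("init", "alpha_sell"),
    ("subst", ("alpha_Qintex", "NP.subj")),
    ("subst", ("alpha_assets", "NP.obj")),
    ("adjoin", ("beta_would", "VP")),
]

log_prob = sum(math.log(tables[kind][event]) for kind, event in derivation)
print(math.exp(log_prob))  # 0.4 * 0.3 * 0.2 * 0.1 = 0.0024
```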
SLIDE 29
First training method: extraction heuristics (Chiang, 2000)
- Use heuristics (head rules, argument rules) to reconstruct TAG
derivations from training data
- Do relative-frequency estimation on resulting derivations
- Advantages: fast, simple
- Disadvantages:
  - handwritten rules don't always work perfectly
  - relies on reconstructed data
SLIDE 30
Second training method: EM (Hwa, 1998; Chiang and Bikel, 2002)
- Start with model from previous method
- Iteratively maximize likelihood of observed data by Expectation-Maximization
- Advantages: more data-driven
- Disadvantages: slow
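The shape of the loop, schematically (this is not the actual system: the real E-step computes expected event counts with an inside-outside pass over TAG derivation forests; here e_step is an assumed callback):

```python
# Schematic EM skeleton (hypothetical interfaces, for illustration only).

from collections import defaultdict

def m_step(counts):
    """Renormalize expected counts into conditional probabilities per context."""
    totals = defaultdict(float)
    for (context, outcome), c in counts.items():
        totals[context] += c
    return {(context, outcome): c / totals[context]
            for (context, outcome), c in counts.items()}

def em(sentences, params, e_step, iterations=10):
    """e_step(sentence, params) -> {(context, outcome): expected count}."""
    for _ in range(iterations):
        counts = defaultdict(float)
        for sentence in sentences:
            for event, c in e_step(sentence, params).items():
                counts[event] += c
        params = m_step(counts)
    return params
```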
SLIDE 31
Results (English)
Training on WSJ sections 02-21, testing on section 23, sentences ≤40 words:

  Model            | Labeled recall | Labeled precision | F-measure
  Rules            | 87.7           | 87.8              | 87.7
  Rules+EM         | 87.2           | 87.5              | 87.3
  Magerman (1995)  | 84.6           | 84.9              | 84.7
  Charniak (2000)  | 90.1           | 90.1              | 90.1

  Rules = head rules adapted from Magerman; argument rules from Collins
- Same level of accuracy as lexicalized PCFG
- Reestimation doesn't help
SLIDE 32
Results (Chinese)
Training on Xinhua sections 001-270, testing on sections 271-300, sentences ≤40 words:

  Model           | Corpus | LR   | LP   | F
  Rules           | Xinhua | 78.4 | 80.0 | 79.2
  Rules+EM        | Xinhua | 78.8 | 81.1 | 79.9
  Bikel (2002)    | Xinhua | 77.0 | 81.6 | 79.2
  Rules (English) | Xinhua | 76.4 | 82.3 | 79.2

  Rules = head/argument rules adapted from Xia
- Slightly behind current best parser
- Reestimation seems to edge accuracy past the current best parser
SLIDE 33
Statistical parsing: conclusion
- Shouldn't hope to get (much) statistical-modeling power for free
- Models like lexicalized PCFG can be thought of as compiled
versions of richer models
- Made explicit in a stochastic TIG model with comparable accuracy
to lexicalized PCFG models
- Future work:
  - Model and both training methods have room for improvement
  - Maximum-entropy models
SLIDE 34
Second application: translation
- Measuring translation power of grammars
- Comparing translation power
- Implications for syntax-based machine translation
SLIDE 35
Measuring translation power
- Right notion of SGC: string relations or tree relations
- Locality constraint: define mapping on elementary structures
- Synchronous grammar:
  - Set of pairs of elementary structures
  - Grammar specifies mapping between paired structures
  - But parallel derivations must be isomorphic
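A sketch of paired rewriting using a synchronous CFG stand-in (the slide's example is synchronous TAG, but the linked-site mechanism is analogous; the encoding is my own): each rule pairs two right-hand sides, and linked nonterminals are rewritten together.

```python
# Toy synchronous CFG (my own encoding): each nonterminal maps to a pair of
# right-hand sides; rewriting proceeds in parallel on linked sites.

rules = {
    "S":    (["NP#1", "misses", "NP#2"], ["NP#2", "manque", "à", "NP#1"]),
    "NP#1": (["John"], ["John"]),
    "NP#2": (["Mary"], ["Mary"]),
}

def rewrite(symbols, side):
    out = []
    for sym in symbols:
        if sym in rules:                 # linked nonterminal: expand
            out.extend(rewrite(rules[sym][side], side))
        else:                            # terminal
            out.append(sym)
    return out

print(" ".join(rewrite(rules["S"][0], 0)))  # John misses Mary
print(" ".join(rewrite(rules["S"][1], 1)))  # Mary manque à John
```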
SLIDE 36
Example: synchronous TAG
- Pairs of elementary structures with linked rewriting sites
  [Tree diagrams: paired elementary trees for "misses" and "manque à", with linked NP substitution sites swapped between the two trees]

  John misses Mary ↔ Mary manque à John
- Rewriting operations take place simultaneously at linked sites
SLIDE 37
Translation power of various formalisms
  [Diagram, repeated from above: for tree relations, RF-TAG, clMC-CFG, TIG, TSG, CFG, and 2CFG form a hierarchy; for string relations, RF-TAG and clMC-CFG sit above CFG = TSG = TIG, which sits above 2CFG]
SLIDE 38
Toy example
- RF-TAG: adjunction into middle of spines is restricted (foot
unrestricted)
- Synchronous RF-TAG can still stretch reorderings
  [Tree diagrams: paired elementary trees for "to miss"/"manquer à" plus auxiliary trees for "seems"/"semble" adjoining into the spines]
- A double contrast with parsing
SLIDE 39
Conclusion: statistical parsing vs. MT
- Statistical parsing: we can and should use CFG to simulate
grammars with richer SDs
- Machine translation: we can't use CFG to simulate richer grammars,
so we should use richer grammars
- Synchronous RF-TAG would be a conservative extension of a model
like (Yamada and Knight, 2001)
- Greater flexibility without dramatic(?) increase in computation
SLIDE 40
Third application: biological sequence analysis
- Background
- Measuring structure-modeling power of grammars
- Testing extra structure-modeling power
SLIDE 41
Background: RNAs
- Strings of nucleotides: A, U, C, G
- Bonds form between complementary pairs (A-U, C-G), bending the chain into a secondary/tertiary structure
- Messenger RNA is for information storage, but transfer RNA and
ribosomal RNA form the machinery used for assembling proteins
SLIDE 42
Background: proteins
- Sequences of amino acids: 20 types, encoded in triples of DNA
bases
- Again, bonds form between amino acids, bending the chain into a
secondary/tertiary structure
  [Images: α-helix, β-sheet]
- Proteins are used for many different purposes: catalyzing reactions, providing physical structure, etc.
SLIDE 43
Some objectives
- Want to accurately model relationship between sequences and
possible structures
- Also want to model dynamics:
  - folding process
  - transitions under temperature changes
  - fluctuations from native structure, which determine function
- Potential to improve understanding of biochemical processes
- Potential to facilitate applications like drug design
SLIDE 44
Grammars for secondary/tertiary structures
- Just as grammars can relate sentences to syntactic structures,
maybe they can relate genetic sequences to molecular structures
- Searls (1992): RNA secondary structures ↔ CFG derivation trees
  [Tree diagram: a CFG derivation tree whose nested structure mirrors the base pairs of an RNA secondary structure]
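A toy version of the correspondence (my own simplification of Searls's idea, in which every base is paired): the grammar S → a S u | u S a | c S g | g S c | S S | ε derives exactly the strings with a pseudoknot-free complementary pairing, checked here by memoized recursion.

```python
# Toy RNA grammar in the spirit of Searls (1992), simplified so that every
# base must pair: S -> a S u | u S a | c S g | g S c | S S | epsilon.

from functools import lru_cache

PAIRS = {("a", "u"), ("u", "a"), ("c", "g"), ("g", "c")}

def derivable(seq):
    @lru_cache(maxsize=None)
    def S(i, j):                        # can S derive seq[i:j]?
        if i == j:                      # S -> epsilon
            return True
        if (seq[i], seq[j - 1]) in PAIRS and S(i + 1, j - 1):
            return True                 # S -> x S y, x and y complementary
        return any(S(i, k) and S(k, j) for k in range(i + 1, j))
    return S(0, len(seq))

print(derivable("gcaugc"))  # True: nested pairing g-c, c-g, a-u
print(derivable("gca"))     # False
```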
SLIDE 45
Measuring structure-modeling power
- Right notion of SGC: represent folded structures with linked strings
- Moreover, want to model the relative importance of structures: weighted linked strings
- Partition function (unnormalized probability distribution):

  Q = ∑_j Ω_j e^(−E_j/kT)

  where E_j is the energy and Ω_j the number of conformations of structure class j
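Computed directly, with made-up numbers and energies measured in units of kT:

```python
# Direct computation of Q = sum_j Omega_j * exp(-E_j / kT), illustrative only.

import math

# (Omega_j, E_j): conformation count and energy for each structure class j.
classes = [(1, -5.0), (10, -3.0), (100, -1.0)]

def partition(classes, kT=1.0):
    return sum(omega * math.exp(-E / kT) for omega, E in classes)

Q = partition(classes)
# Boltzmann probability of class j is its term divided by Q.
probs = [omega * math.exp(-E) / Q for omega, E in classes]
print(Q, probs)
```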
SLIDE 46
Grammars for secondary/tertiary structures
- Locality constraint: restrict self-contacts to elementary structures
- Generalize beyond CFG; with stretching we might lose nice drawings, but the modeled structure is still the same

  [Tree diagrams: an elementary tree redrawn with stretched links]
- Most previous approaches (informally) follow these principles
SLIDE 47
Grammars for partition functions
- Decompose each term Ω_j e^(−E_j/kT) into factors ω e^(−ΔE/kT), one for each elementary structure
- Grammar must be designed properly:
  - energies ΔE should be approximately independent
  - conformation counts ω should be approximately independent
- Then the parser can give us the total Q or various subtotals of Q
- The model of Chen and Dill (1995, 1998), recast as a CFG
SLIDE 48
Structure-modeling power of various formalisms
  [Diagram, repeated from above: for weighted linked strings, RF-TAG, clMC-CFG, and CFG ∩ FSA sit above CFG = TSG = TIG]
SLIDE 49
Squeezing DGC out of CFG
- CFG can basically only handle nested dependencies
- RF-TAG and clMC-CFG can handle limited crossing dependencies
(Chiang, 2002)
- clMC-CFG: can simultaneously rewrite sister nodes
  [Diagram: grammar rules generating a chain of h's whose contact links cross in a limited, helix-like way]
SLIDE 50
Intersection
- Idea: analyze a string with two different grammars, or two different parts of a grammar, and merge their SDs
- Largely overlooked in NLP
- For biomolecules: Brown and Wilson (1996) tried to intersect CFLs for a type of RNA structure with crossing links, but their approach was flawed
SLIDE 51
A new problem: helix bundles
- Chen and Dill's model captures nested links
- Well-established theory of partition functions of α-helices (Zimm-Bragg)
- Want to combine to form a theory of helix bundles
SLIDE 52
Intersecting a CFG and a finite-state automaton
- Chen and Dill's model is a CFG
- α-helices: our grammar is coverable by a finite-state machine; Zimm-Bragg (a Markov chain) supplies the weights

  [Diagram: the helix grammar from above, covered by a finite-state machine]
- Combine the two by intersection
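A sketch of the classical weighted intersection construction (Bar-Hillel style) for a CFG in Chomsky normal form; the grammar and automaton below are toy stand-ins, not the actual helix model or Zimm-Bragg chain.

```python
# Sketch of weighted CFG/FSA intersection (Bar-Hillel style) for a CFG in
# Chomsky normal form. Toy grammar and automaton, for illustration only.

from itertools import product

def intersect(binary_rules, lexical_rules, fsa_trans, states):
    """binary_rules: {(A,B,C): w} for A -> B C; lexical_rules: {(A,a): w};
    fsa_trans: {(p,a,q): v}. Returns rules of the intersected grammar, whose
    nonterminals are (state, symbol, state) triples."""
    out = {}
    # A -> B C becomes (p,A,r) -> (p,B,q) (q,C,r), keeping the CFG weight.
    for (A, B, C), w in binary_rules.items():
        for p, q, r in product(states, repeat=3):
            out[((p, A, r), ((p, B, q), (q, C, r)))] = w
    # A -> a picks up the weight of a matching FSA transition p -a-> q.
    for (A, a), w in lexical_rules.items():
        for (p, sym, q), v in fsa_trans.items():
            if sym == a:
                out[((p, A, q), a)] = w * v
    return out

g = intersect({("S", "X", "Y"): 1.0},
              {("X", "h"): 0.5, ("Y", "p"): 0.5},
              {(0, "h", 1): 0.9, (1, "p", 2): 0.8},
              states=[0, 1, 2])
print(g[((0, "X", 1), "h")])  # 0.45: CFG weight times transition weight
```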
SLIDE 53
Comparison against exact enumeration
  [Plot: average number of contacts (y, 3-11) vs. temperature (x, 100-600), comparing exact enumerator and parser]
Sequence: hpphhpphhpphhpphhpph
SLIDE 54
A further problem: larger helix bundles, β-sheets
- Above approach, because based on CFG, can only handle bundles of two antiparallel helices
- Can we do better?
- Similar to β-sheets
SLIDE 55
Multicomponent TAG for β-sheets?
- Could use an MC-TAG (Abe and Mamitsuka, 1997)
  [Tree diagrams: an MC-TAG tree set with strand segments σ1-σ5 distributed across components]
- But parsing complexity is exponential in number of strands
- Prone to spurious ambiguity? (many derivations, one structure)
SLIDE 56
Simple literal movement grammar
- Closely related to range concatenation grammar (Boullier, 2000)
- Basic idea: the CFG rule S → NP VP becomes the clause S(xy) :− NP(x), VP(y)
- Allows intersection: A(x) :− B(x), C(x)
- And partial intersection: A(xyz) :− B(x, y), C(y, z)
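A minimal illustration of what these clauses compute, in my own encoding: predicates hold of (i, j) substring ranges of a fixed input, so A(x) :− B(x), C(x) is set intersection of range sets, and the concatenation in S(xy) requires adjacent ranges.

```python
# Minimal illustration (my own encoding): predicates hold of substrings of a
# fixed input w, represented as (i, j) index ranges.

w = "abab"
spans = [(i, j) for i in range(len(w) + 1) for j in range(i, len(w) + 1)]

B = {s for s in spans if w[s[0]:s[1]].startswith("a")}   # toy predicate
C = {s for s in spans if w[s[0]:s[1]].endswith("b")}     # toy predicate

A = B & C      # A(x) :- B(x), C(x): intersection of range sets
print(sorted(A))  # ranges whose substring starts with 'a' and ends with 'b'

# S(xy) :- NP(x), VP(y): concatenation means adjacent ranges.
NP = {(0, 2)}
VP = {(2, 4)}
S = {(i, k) for (i, j) in NP for (j2, k) in VP if j == j2}
print(S)  # {(0, 4)}
```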
SLIDE 57
An sLMG analysis of β-sheets
- Generating pairs of antiparallel strands (hairpin) or parallel strands
is easy
- Use intersection to combine them into a sheet
- Essentially, build a sheet by merging the last strand of a sheet with one strand of a hairpin

  [Diagram: sheet = existing sheet + hairpin, merged along a shared strand]
SLIDE 58
An sLMG analysis of β-sheets
- Faster than MC-TAG analysis (O(n^5) for any number of strands)
- Permuting the strands makes complexity go up; no advantage in the worst case:

  O(n^5), O(n^7), O(n^12), ...
- Computational complexity seems to correlate with folding difficulty
- Certain inter-hairpin dependencies could make the problem NP-hard
SLIDE 59
Biological sequence analysis: conclusion
- Synthesized and formalized existing approaches
- Recast Chen and Dill's model as a weighted CFG, opening the door
to richer models
- Limited crossing dependencies can be modeled by clMC-CFG or
RF-TAG without any extra cost
- Intersection allows modeling of helix bundles and maybe β-sheets
SLIDE 60
Conclusion
- What makes one grammar formalism better than another? Introduced machinery for giving rigorous answers
- Demonstrated a new view of recent statistical parsers as compiled
versions of grammars with richer SDs
- Argued that machine translation stands to gain much more from
richer grammars
- Synthesized previous grammatical models of biomolecules and
demonstrated some new approaches
SLIDE 61
Future work
- Statistical parsing: maximum-entropy models
- Translation: implement an RF-TAG version of some existing CFG
model
- Biological sequence analysis: extend CFG parser, compare MC-TAG
analysis to sLMG analysis
- New application areas