
Transforming Projective Bilexical Dependency Grammars into efficiently-parsable CFGs with Unfold-Fold
Mark Johnson
Microsoft Research
Brown University
ACL 2007

Motivation and summary
◮ What's the relationship between CKY parsing and the Eisner/Satta O(n^3) PBDG parsing algorithm? (cf. McAllester 1999)
    ◮ split-head encoding, collecting left and right dependents separately
    ◮ unfold-fold transform reorganizes grammar for efficient CKY parsing
◮ Approach generalizes to 2nd-order dependencies
    ◮ predict argument given governor and sibling (McDonald 2006)
    ◮ predict argument given governor and governor's governor
◮ In principle can use any CFG parsing or estimation algorithm for PBDGs
    ◮ transformed grammars typically too large to enumerate
    ◮ my CKY implementations transform grammar on the fly

Outline
    Projective Bilexical Dependency Grammars
    Simple split-head encoding
    O(n^3) split-head CFGs via Unfold-Fold
    Transformations capturing 2nd-order dependencies
    Conclusion

Projective Bilexical Dependency Grammars
◮ Projective Bilexical Dependency Grammar (PBDG): a set of dependencies, written here as governor → dependent
    0 → gave, gave → Sandy, gave → dog, dog → the, gave → bone, bone → a
◮ A dependency parse generated by the PBDG
    [Figure: the dependency arcs above drawn over "0 Sandy gave the dog a bone"]
◮ Weights can be attached to dependencies (and preserved in CFG transforms)

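A naive encoding of PBDGs as CFGs
    S   → X_u        where u is a dependent of the root 0
    X_u → u
    X_u → X_v X_u    where v is a left dependent of u
    X_u → X_u X_v    where v is a right dependent of u

Parse of "Sandy gave the dog a bone":
    (S (X_gave (X_Sandy Sandy)
               (X_gave (X_gave (X_gave gave)
                               (X_dog (X_the the) (X_dog dog)))
                       (X_bone (X_a a) (X_bone bone)))))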
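The schema above can be instantiated mechanically from a set of dependencies. The following is a minimal sketch, not the author's implementation; the triple representation `(governor, dependent, side)`, the function name `naive_cfg`, and the `X_...` string labels are assumptions made for illustration.

```python
# Hypothetical representation: each dependency is (governor, dependent, side),
# where side is "L" if the dependent precedes its governor and "R" otherwise.
ROOT = "0"

def naive_cfg(dependencies, vocabulary):
    """Instantiate the naive CFG schema as a set of (lhs, rhs-tuple) rules."""
    rules = set()
    for u in vocabulary:
        rules.add((f"X_{u}", (u,)))                      # X_u -> u
    for gov, dep, side in dependencies:
        if gov == ROOT:
            rules.add(("S", (f"X_{dep}",)))              # S -> X_u
        elif side == "L":
            rules.add((f"X_{gov}", (f"X_{dep}", f"X_{gov}")))  # X_u -> X_v X_u
        else:
            rules.add((f"X_{gov}", (f"X_{gov}", f"X_{dep}")))  # X_u -> X_u X_v
    return rules

deps = {
    ("0", "gave", "R"), ("gave", "Sandy", "L"), ("gave", "dog", "R"),
    ("dog", "the", "L"), ("gave", "bone", "R"), ("bone", "a", "L"),
}
vocab = {"Sandy", "gave", "the", "dog", "a", "bone"}
rules = naive_cfg(deps, vocab)
for lhs, rhs in sorted(rules):
    print(lhs, "->", " ".join(rhs))
```

For the example sentence this yields one root rule, six lexical rules, and one binary rule per non-root dependency.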

Spurious ambiguity in naive encoding
◮ Naive encoding allows dependencies on different sides of head to be freely reordered
  ⇒ Spurious ambiguity in CFG parses (not present in PBDG parses)

Two CFG parses of the same dependency parse (left dependent Sandy attached last vs. first):
    (S (X_gave (X_Sandy Sandy)
               (X_gave (X_gave (X_gave gave)
                               (X_dog (X_the the) (X_dog dog)))
                       (X_bone (X_a a) (X_bone bone)))))

    (S (X_gave (X_gave (X_gave (X_Sandy Sandy) (X_gave gave))
                       (X_dog (X_the the) (X_dog dog)))
               (X_bone (X_a a) (X_bone bone))))
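The amount of spurious ambiguity can be checked by counting CFG parses with a CKY-style chart. This is a sketch under stated assumptions: the rule lists are typed in by hand for the example sentence, every word is assumed to occur once (so `X_word` labels are unambiguous), and `count_parses` is a hypothetical helper name.

```python
from collections import defaultdict

def count_parses(words, root_rules, binary_rules, start="S"):
    """Count naive-CFG parse trees of `words` with a CKY chart of counts."""
    n = len(words)
    chart = defaultdict(int)          # (i, k, symbol) -> number of subtrees
    for i, w in enumerate(words):
        chart[i, i + 1, f"X_{w}"] = 1  # X_u -> u
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for parent, left, right in binary_rules:
                    chart[i, k, parent] += chart[i, j, left] * chart[j, k, right]
    return sum(chart[0, n, child] for lhs, child in root_rules if lhs == start)

root_rules = [("S", "X_gave")]
binary_rules = [
    ("X_gave", "X_Sandy", "X_gave"),   # Sandy is a left dependent of gave
    ("X_gave", "X_gave", "X_dog"),     # dog is a right dependent of gave
    ("X_gave", "X_gave", "X_bone"),    # bone is a right dependent of gave
    ("X_dog", "X_the", "X_dog"),
    ("X_bone", "X_a", "X_bone"),
]
n_parses = count_parses("Sandy gave the dog a bone".split(), root_rules, binary_rules)
print(n_parses)
```

The count is 3 rather than 1: the left dependent Sandy can attach before, between, or after the two right dependents of gave, and all three trees encode the same dependency parse.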

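Parsing naive CFG encoding takes O(n^5) time
◮ A production schema such as X_u → X_u X_v has 5 variables (string positions i, j, k and head positions u, v), and so can match input in O(n^5) different ways
    [Figure: X_u spanning i..k, built from X_u over i..j with head at u and X_v over j..k with head at v]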
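The five-variable count can be made concrete by brute-force enumeration. In this sketch the function name is hypothetical, i, j, k range over the string positions 0..n of an n-word sentence, and u, v range over word positions inside the left and right child spans.

```python
def naive_schema_instantiations(n):
    """Count the ways X_u -> X_u X_v can match an n-word input.

    i < j < k are string positions; u is a word inside [i, j) and
    v is a word inside [j, k).
    """
    count = 0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            for k in range(j + 1, n + 1):
                for u in range(i, j):
                    for v in range(j, k):
                        count += 1
    return count

for n in (2, 3, 4, 8, 16):
    print(n, naive_schema_instantiations(n))
```

Doubling n multiplies the count by close to 2^5 = 32 once n is large, which is the O(n^5) behaviour the slide describes.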

Simple split-head encoding
◮ Replace input word u with a left variant u_ℓ and a right variant u_r (can be avoided in practice with fancy book-keeping)
    Sandy gave the dog a bone
      ⇓
    Sandy_ℓ Sandy_r gave_ℓ gave_r the_ℓ the_r dog_ℓ dog_r a_ℓ a_r bone_ℓ bone_r
◮ PCFG separately collects left dependencies and right dependencies
    S   → X_u        where u is a dependent of the root 0
    X_u → L_u _uR    where u ∈ Σ
    L_u → u_ℓ
    L_u → X_v L_u    where v is a left dependent of u
    _uR → u_r
    _uR → _uR X_v    where v is a right dependent of u
    [Figure: partial split-head parse of the example sentence, with X_gave → L_gave _gaveR at the top]
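A minimal sketch of the split-head encoding, assuming the same hypothetical `(governor, dependent, side)` triples as input. The labels `R_u` stand in for the slide's left-subscripted _uR, and the suffixes `_l` / `_r` for u_ℓ / u_r; these spellings are illustrative choices, not the author's.

```python
ROOT = "0"

def split_sentence(words):
    """Replace each word u by its left and right variants u_l, u_r."""
    tokens = []
    for w in words:
        tokens.extend([f"{w}_l", f"{w}_r"])
    return tokens

def split_head_cfg(dependencies):
    """Instantiate the simple split-head schema as (lhs, rhs-tuple) rules."""
    rules = set()
    words = {dep for _, dep, _ in dependencies}
    for u in words:
        rules.add((f"X_{u}", (f"L_{u}", f"R_{u}")))      # X_u -> L_u uR
        rules.add((f"L_{u}", (f"{u}_l",)))               # L_u -> u_l
        rules.add((f"R_{u}", (f"{u}_r",)))               # uR  -> u_r
    for gov, dep, side in dependencies:
        if gov == ROOT:
            rules.add(("S", (f"X_{dep}",)))              # S -> X_u
        elif side == "L":
            rules.add((f"L_{gov}", (f"X_{dep}", f"L_{gov}")))  # L_u -> X_v L_u
        else:
            rules.add((f"R_{gov}", (f"R_{gov}", f"X_{dep}")))  # uR -> uR X_v
    return rules

deps = {
    ("0", "gave", "R"), ("gave", "Sandy", "L"), ("gave", "dog", "R"),
    ("dog", "the", "L"), ("gave", "bone", "R"), ("bone", "a", "L"),
}
tokens = split_sentence("Sandy gave the dog a bone".split())
rules = split_head_cfg(deps)
print(" ".join(tokens))
```

Because left dependents are only ever collected by L_u and right dependents by the R nonterminal, the two sides can no longer be interleaved, which removes the spurious ambiguity of the naive encoding.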

Simple split-head CFG parse
    (S (X_gave
         (L_gave (X_Sandy (L_Sandy Sandy_ℓ) (_SandyR Sandy_r))
                 (L_gave gave_ℓ))
         (_gaveR (_gaveR (_gaveR gave_r)
                         (X_dog (L_dog (X_the (L_the the_ℓ) (_theR the_r))
                                       (L_dog dog_ℓ))
                                (_dogR dog_r)))
                 (X_bone (L_bone (X_a (L_a a_ℓ) (_aR a_r))
                                 (L_bone bone_ℓ))
                         (_boneR bone_r)))))

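L_u and _uR heads are phrase-peripheral ⇒ O(n^4)
◮ The head of L_u is always at its right edge, and the head of _uR at its left edge
    S   → X_u        where u is a dependent of the root 0
    X_u → L_u _uR    where u ∈ Σ
    L_u → u_ℓ
    L_u → X_v L_u    where v is a left dependent of u
    _uR → u_r
    _uR → _uR X_v    where v is a right dependent of u
    [Figure: X_u over L_u _uR, with dependents X_v1, X_v2 collected inside L_u and X_v3, X_v4 inside _uR, around the heads u_ℓ u_r]
◮ X_u → L_u _uR takes O(n^3)
◮ _uR → _uR X_v takes O(n^4)
    [Figure: _uR over positions i = u .. j combined with X_v over j .. k, head at v; four free positions i, j, v, k]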

The Unfold-Fold transform
◮ Unfold-fold originally proposed for transforming recursive programs; used here to transform CFGs into new CFGs
◮ Unfolding a nonterminal replaces it with its expansion
    A → α B γ            A → α β1 γ
    B → β1               A → α β2 γ
    B → β2        ⇒      B → β1
    ...                  B → β2
                         ...
◮ Folding is the inverse of unfolding (replace RHS with nonterminal)
    A → α β γ            A → α B γ
    B → β         ⇒      B → β
    ...                  ...
◮ Transformed grammar generates same language (Sato 1992)
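The two operations can be sketched on a CFG stored as a list of `(lhs, rhs-tuple)` rules. This is an illustrative sketch, not the author's code: the function names and signatures are assumptions, and `fold` is deliberately conservative, only folding when B has exactly one production B → β (a simple sufficient condition under which folding is the exact inverse of unfolding).

```python
def unfold(rules, rule, position):
    """Replace the nonterminal at rule's RHS `position` by each of its expansions."""
    lhs, rhs = rule
    target = rhs[position]
    expansions = [body for head, body in rules if head == target]
    if not expansions:
        raise ValueError(f"{target} has no productions to unfold")
    result = [r for r in rules if r != rule]
    for body in expansions:
        result.append((lhs, rhs[:position] + body + rhs[position + 1:]))
    return result

def fold(rules, rule, start, end, nonterminal):
    """Replace rhs[start:end] of `rule` by `nonterminal`, whose only body must be that span."""
    lhs, rhs = rule
    beta = rhs[start:end]
    bodies = [body for head, body in rules if head == nonterminal]
    if bodies != [beta] or lhs == nonterminal:
        raise ValueError("fold requires a single production B -> beta, with B != A")
    folded = (lhs, rhs[:start] + (nonterminal,) + rhs[end:])
    return [folded if r == rule else r for r in rules]

# Unfolding B in A -> a B c.
grammar = [("A", ("a", "B", "c")), ("B", ("b1",)), ("B", ("b2",))]
unfolded = unfold(grammar, grammar[0], 1)

# Folding "b c" into B in A -> a b c d.
grammar2 = [("A", ("a", "b", "c", "d")), ("B", ("b", "c"))]
folded = fold(grammar2, grammar2[0], 1, 3, "B")
print(unfolded)
print(folded)
```

Note that unfolding keeps the productions of B in the grammar; they may become unreachable afterwards, but leaving them in does not change the language generated from the start symbol.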

Unfold-fold converts O(n^4) to O(n^3) grammar
◮ Unfold X_v, which is responsible for O(n^4) parse time
    L_u → u_ℓ                L_u → u_ℓ
    L_u → X_v L_u     ⇒      L_u → L_v _vR L_u
    X_v → L_v _vR            X_v → L_v _vR
◮ Introduce new non-terminals _xM_y (doesn't change language)
    _xM_y → _xR L_y
◮ Fold two children of L_u into _xM_y
    L_u → u_ℓ                  L_u → u_ℓ
    L_u → L_v _vR L_u    ⇒     L_u → L_v _vM_u
    _xM_y → _xR L_y            _xM_y → _xR L_y
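The transformed schema can be instantiated for a concrete dependency set and run through a generic CKY recognizer. This is a sketch under several assumptions: the right-hand-side rules (`R_u -> M_u_v R_v`) and the unfolded root rule (`S -> L_u R_u`) are the mirror image of the left-hand derivation shown on the slide, the labels `R_u` and `M_x_y` stand in for _uR and _xM_y, and the recognizer is a plain textbook CKY that enumerates the grammar rather than the on-the-fly implementation mentioned in the talk.

```python
from itertools import product

def transformed_cfg(dependencies):
    """Return (unary, binary) rules; dependencies are (governor, dependent, side)."""
    unary, binary = set(), set()
    words = {dep for _, dep, _ in dependencies}
    for u in words:
        unary.add((f"L_{u}", f"{u}_l"))                       # L_u -> u_l
        unary.add((f"R_{u}", f"{u}_r"))                       # uR  -> u_r
    for x, y in product(words, repeat=2):
        binary.add((f"M_{x}_{y}", f"R_{x}", f"L_{y}"))        # xMy -> xR L_y
    for gov, dep, side in dependencies:
        if gov == "0":
            binary.add(("S", f"L_{dep}", f"R_{dep}"))         # S   -> L_u uR
        elif side == "L":
            binary.add((f"L_{gov}", f"L_{dep}", f"M_{dep}_{gov}"))  # L_u -> L_v vMu
        else:
            binary.add((f"R_{gov}", f"M_{gov}_{dep}", f"R_{dep}"))  # uR  -> uMv vR
    return unary, binary

def recognizes(tokens, unary, binary, start="S"):
    """Plain CKY recognition over the split-head token sequence."""
    n = len(tokens)
    chart = {(i, i + 1): {lhs for lhs, t in unary if t == tok}
             for i, tok in enumerate(tokens)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            chart[i, k] = {lhs for j in range(i + 1, k)
                           for lhs, left, right in binary
                           if left in chart[i, j] and right in chart[j, k]}
    return start in chart[0, n]

def split_sentence(words):
    return [f"{w}_{side}" for w in words for side in ("l", "r")]

deps = {
    ("0", "gave", "R"), ("gave", "Sandy", "L"), ("gave", "dog", "R"),
    ("dog", "the", "L"), ("gave", "bone", "R"), ("bone", "a", "L"),
}
unary, binary = transformed_cfg(deps)
ok = recognizes(split_sentence("Sandy gave the dog a bone".split()), unary, binary)
bad = recognizes(split_sentence("gave Sandy the dog a bone".split()), unary, binary)
print(ok, bad)
```

Every nonterminal here has its head words at the edges of its span (L_v ends at v, the R nonterminal starts at its head, and _xM_y starts at x and ends at y), so a specialised parser needs only three free positions per rule.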

Transformed grammar collects left and right dependencies separately
    [Figure: before the transform, L_u → X_v L_u and _uR → _uR X_v′, with X_v → L_v _vR and X_v′ → L_v′ _v′R; after it, L_u → L_v _vM_u with _vM_u → _vR L_u, and _uR → _uM_v′ _v′R with _uM_v′ → _uR L_v′]
◮ X_v constituents (which cause O(n^4) parse time) no longer used
◮ Head annotations now all phrase peripheral ⇒ O(n^3) parse time
◮ Dependencies can be recovered from parse tree
◮ Basically same as Eisner and Satta O(n^3) algorithm
    ◮ explains why Inside-Outside sanity check fails for Eisner/Satta
    ◮ two copies of each terminal ⇒ each terminal's Outside probability is double the Inside sentence probability

Parse using O(n^3) transformed split-head grammar
    (S (L_gave (L_Sandy Sandy_ℓ)
               (_SandyM_gave (_SandyR Sandy_r) (L_gave gave_ℓ)))
       (_gaveR (_gaveM_bone
                 (_gaveR (_gaveM_dog (_gaveR gave_r)
                                     (L_dog (L_the the_ℓ)
                                            (_theM_dog (_theR the_r) (L_dog dog_ℓ))))
                         (_dogR dog_r))
                 (L_bone (L_a a_ℓ)
                         (_aM_bone (_aR a_r) (L_bone bone_ℓ))))
               (_boneR bone_r)))
    [Figure: the corresponding dependency arcs over "0 Sandy gave the dog a bone"]
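Dependencies can be read back off a transformed-grammar parse: each branching L node contributes a left dependent, each branching R node a right dependent, and the S node the root dependency. The sketch below hand-codes the parse of "Sandy gave the dog a bone" as nested Python tuples; the node constructors `S`, `L`, `R`, `M` and the function `recover_dependencies` are hypothetical names introduced for illustration.

```python
# A node is (label, children); a leaf is a plain string.
def S(*kids): return (("S",), list(kids))
def L(u, *kids): return (("L", u), list(kids))
def R(u, *kids): return (("R", u), list(kids))
def M(x, y, *kids): return (("M", x, y), list(kids))

def recover_dependencies(tree):
    """Return (governor, dependent) pairs encoded by a transformed-grammar parse."""
    label, children = tree
    kids = [c for c in children if not isinstance(c, str)]
    deps = []
    if label[0] == "S":
        deps.append(("0", kids[0][0][1]))        # S -> L_u uR : u depends on root
    elif label[0] == "L" and len(kids) == 2:
        deps.append((label[1], kids[0][0][1]))   # L_u -> L_v vMu : v depends on u
    elif label[0] == "R" and len(kids) == 2:
        deps.append((label[1], kids[1][0][1]))   # uR -> uMv vR : v depends on u
    for kid in kids:
        deps.extend(recover_dependencies(kid))
    return deps

tree = S(
    L("gave",
      L("Sandy", "Sandy_l"),
      M("Sandy", "gave", R("Sandy", "Sandy_r"), L("gave", "gave_l"))),
    R("gave",
      M("gave", "bone",
        R("gave",
          M("gave", "dog",
            R("gave", "gave_r"),
            L("dog",
              L("the", "the_l"),
              M("the", "dog", R("the", "the_r"), L("dog", "dog_l")))),
          R("dog", "dog_r")),
        L("bone",
          L("a", "a_l"),
          M("a", "bone", R("a", "a_r"), L("bone", "bone_l")))),
      R("bone", "bone_r")))

deps = recover_dependencies(tree)
print(sorted(deps))
```

The M nodes contribute no dependencies of their own; they only glue the right half of one word to the left half of the next head.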

Parsing time of CFG encodings of same PBDG

    CFG schemata                          sentences parsed / second
    Naive O(n^5) CFG                                           45.4
    O(n^4) simple split-head CFG                              406.2
    O(n^3) transformed split-head CFG                        3580.0

◮ Weighted PBDG; all pairs of heads have some dependency weight
◮ Dependency weights precomputed before parsing begins
◮ Timing results on a 3.6GHz Pentium 4 machine parsing section 24 of the PTB
◮ CKY parsers with grammars hard-coded in C (no rule lookup)
◮ Dependency accuracy of Viterbi parses = 0.8918 for all grammars
◮ Feature extraction is much slower than even naive CFG

Predict argument based on governor and sibling
    [Figure: parse tree of "Sandy gave the dog a bone" under the transformed grammar whose M nonterminals additionally record the sibling dependent]
◮ Very similar to second-order algorithm given by McDonald (2006)

Predict argument based on governor and governor's governor
    [Figure: parse tree of "Sandy gave the dog a bone" under the transformed grammar whose nonterminals additionally record the governor's governor]
◮ Because left and right dependencies are assembled separately, only captures 2nd-order dependencies where one dependency is leftward and the other is rightward

Conclusion and future work
◮ Presented a reduction from PBDGs to O(n^3) parsable CFGs
    ◮ split-head CFG representation of PBDGs
    ◮ Unfold-fold transform
◮ CKY algorithm on resulting CFG simulates Eisner/Satta algorithm on original PBDG
◮ Makes CFG techniques applicable to PBDGs
    ◮ max marginal parsing (Goodman 1996) and other CFG parsing and estimation algorithms
◮ Can capture different dependencies, yielding different PDG models
    ◮ 2nd-order "horizontal" dependencies (McDonald 2006)
    ◮ what other combinations of dependencies can we capture? (if we permit O(n^4) parse time?)
    ◮ do any of these improve parsing accuracy?
