Dependency Parsing

Joakim Nivre
Uppsala University
Department of Linguistics and Philology
joakim.nivre@lingfil.uu.se
Overview

1. Dependency Trees
2. Arc-Factored Models
3. Online Learning
4. Eisner's Algorithm
5. Spanning Tree Parsing
Dependency Trees

◮ Input sentence: x = x_1, ..., x_n
◮ Dependency graph: G = (V_x, A)
  ◮ V_x = {0, 1, ..., n} is a set of nodes, one for each word x_i, plus node 0 for an artificial root
  ◮ A ⊆ V_x × L × V_x is a set of labeled arcs (i, l, j)
◮ Dependency tree = dependency graph satisfying:
  1. Root: no arcs into node 0.
  2. Single-Head: at most one incoming arc per node.
  3. Connected: the graph is weakly connected.
(A sketch of this well-formedness check follows.)
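As a concrete illustration (not from the slides), here is a minimal Python sketch of the three constraints, assuming arcs are represented as (head, label, dependent) triples over nodes 0..n:

```python
from collections import defaultdict, deque

def is_dependency_tree(n, arcs):
    """Check the Root, Single-Head, and Connected constraints.

    n: number of words; nodes are 0 (root) .. n
    arcs: iterable of (head, label, dependent) triples
    """
    arcs = list(arcs)
    # Root: no arcs into node 0
    if any(d == 0 for (_, _, d) in arcs):
        return False
    # Single-Head: at most one incoming arc per node
    deps = [d for (_, _, d) in arcs]
    if len(deps) != len(set(deps)):
        return False
    # Connected: weak connectivity, checked by BFS over undirected edges
    adj = defaultdict(set)
    for (h, _, d) in arcs:
        adj[h].add(d)
        adj[d].add(h)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == n + 1

# "John saw Mary": root -> saw, saw -> John, saw -> Mary
arcs = {(0, "root", 2), (2, "subj", 1), (2, "obj", 3)}
assert is_dependency_tree(3, arcs)
```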
Dependency Trees

Projectivity: for every arc (i, l, j), there is a directed path from i to every word k such that min(i, j) < k < max(i, j); that is, the head i dominates every word between i and j. (A sketch of this check follows.)
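A direct transcription of this definition into Python, assuming heads[j] gives the head of word j in a well-formed tree (labels play no role here; the function name is illustrative):

```python
def is_projective(heads):
    """heads[j] = head of word j; heads[0] is a dummy entry for the root.

    An arc (i, j) is projective iff i dominates every k with
    min(i, j) < k < max(i, j).
    """
    def dominated_by(i, k):
        # follow head pointers upward from k; True if we reach i
        # (assumes a well-formed tree, so the walk terminates at 0)
        while k != 0:
            k = heads[k]
            if k == i:
                return True
        return False

    for j in range(1, len(heads)):
        i = heads[j]
        for k in range(min(i, j) + 1, max(i, j)):
            if not dominated_by(i, k):
                return False
    return True

# "John saw Mary" (root -> saw -> {John, Mary}) is projective
assert is_projective([0, 2, 0, 2])
# crossing arcs 1 -> 3 and 2 -> 4 make a tree non-projective
assert not is_projective([0, 0, 3, 1, 2])
```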
Dependency Trees

Non-projectivity in treebank data (percentage of non-projective trees and arcs):

Language   Trees    Arcs
Arabic     11.2%    0.4%
Basque     26.2%    2.9%
Czech      23.2%    1.9%
Danish     15.6%    1.0%
Greek      20.3%    1.1%
Russian    10.6%    0.9%
Slovene    22.2%    1.9%
Turkish    11.6%    1.5%
Dependency Trees

Parsing problem:
◮ Input: x = x_1, ..., x_n
◮ Output: dependency tree y for x

Equivalent formulations:
◮ Assign a head i and a label l to every node j (1 ≤ j ≤ n) under the tree constraint
◮ Find a directed spanning tree in the complete graph G_x = (V_x, V_x × L × V_x)
Arc-Factored Models

The score of a tree factors over its arcs:

  Score(x, y) = Σ_{(i,l,j) ∈ A_y} Score(i, l, j, x)

  GEN(x) = { y | y is a spanning tree in G_x = (V_x, V_x × L × V_x) }

  EVAL(x, y) = Score(x, y) = Σ_{(i,l,j) ∈ A_y} Score(i, l, j, x)
Arc-Factored Models

Each arc is scored by a linear model over K features:

  Score(i, l, j, x) = Σ_{k=1}^{K} f_k(i, l, j, x) · w_k

  y* = argmax_{y ∈ GEN(x)} Σ_{(i,l,j) ∈ A_y} Σ_{k=1}^{K} f_k(i, l, j, x) · w_k
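A minimal sketch of this linear scoring, assuming binary features represented as strings and the weight vector as a sparse dict (the names score_arc and score_tree are illustrative, not from the slides):

```python
def score_arc(i, lab, j, x, feats, w):
    """Score(i, l, j, x) = sum_k f_k(i, l, j, x) * w_k, with binary f_k."""
    return sum(w.get(f, 0.0) for f in feats(i, lab, j, x))

def score_tree(arcs, x, feats, w):
    """Score(x, y): the tree score decomposes over the arcs of y."""
    return sum(score_arc(i, lab, j, x, feats, w) for (i, lab, j) in arcs)

# toy feature function: a single feature per (head word, label, dep word)
feats = lambda i, lab, j, x: [f"hw={x[i]}|l={lab}|dw={x[j]}"]
w = {"hw=saw|l=obj|dw=Mary": 1.5, "hw=<root>|l=root|dw=saw": 1.0}
x = ["<root>", "John", "saw", "Mary"]
print(score_tree({(0, "root", 2), (2, "subj", 1), (2, "obj", 3)}, x, feats, w))  # 2.5
```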
Arc-Factored Models

Feature templates (w = word form, p = PoS tag; b ranges over the words between x_i and x_j):

Unigram:
  x_i-w, x_i-p
  x_i-w
  x_i-p
  x_j-w, x_j-p
  x_j-w
  x_j-p

Bigram:
  x_i-w, x_i-p, x_j-w, x_j-p
  x_i-p, x_j-w, x_j-p
  x_i-w, x_j-w, x_j-p
  x_i-w, x_i-p, x_j-p
  x_i-w, x_i-p, x_j-w
  x_i-w, x_j-w
  x_i-p, x_j-p

In-Between PoS:
  x_i-p, b-p, x_j-p

Surrounding PoS:
  x_i-p, x_{i+1}-p, x_{j-1}-p, x_j-p
  x_{i-1}-p, x_i-p, x_{j-1}-p, x_j-p
  x_i-p, x_{i+1}-p, x_j-p, x_{j+1}-p
  x_{i-1}-p, x_i-p, x_j-p, x_{j+1}-p
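Here is a sketch implementing a subset of these templates (all unigram and bigram templates, the in-between template, and one of the four surrounding templates), assuming x is a list of (word, PoS) pairs with a dummy root entry at index 0. Conjoining every template with the arc label is my assumption, not stated in the table:

```python
def arc_features(i, lab, j, x):
    """Fire string-valued features for the arc i -> j with label lab."""
    wi, pi = x[i]
    wj, pj = x[j]
    feats = [
        # unigram templates
        f"wi={wi}|pi={pi}", f"wi={wi}", f"pi={pi}",
        f"wj={wj}|pj={pj}", f"wj={wj}", f"pj={pj}",
        # bigram templates
        f"wi={wi}|pi={pi}|wj={wj}|pj={pj}",
        f"pi={pi}|wj={wj}|pj={pj}",
        f"wi={wi}|wj={wj}|pj={pj}",
        f"wi={wi}|pi={pi}|pj={pj}",
        f"wi={wi}|pi={pi}|wj={wj}",
        f"wi={wi}|wj={wj}",
        f"pi={pi}|pj={pj}",
    ]
    # in-between PoS: one feature per word strictly between i and j
    lo, hi = min(i, j), max(i, j)
    feats += [f"pi={pi}|b={x[k][1]}|pj={pj}" for k in range(lo + 1, hi)]
    # one surrounding-PoS template, guarding the sentence boundaries
    def pos(k):
        return x[k][1] if 0 <= k < len(x) else "<None>"
    feats.append(f"pi={pi}|pi+1={pos(i+1)}|pj-1={pos(j-1)}|pj={pj}")
    # conjoin everything with the arc label (assumed, see above)
    return [f + f"|l={lab}" for f in feats]

x = [("<root>", "ROOT"), ("John", "NNP"), ("saw", "VBD"), ("Mary", "NNP")]
print(arc_features(2, "obj", 3, x)[:3])
```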
Online Learning

Training data: T = {(x_i, y_i)}_{i=1}^{|T|}

  w ← 0
  for n : 1..N
    for i : 1..|T|
      y* ← Parse(x_i, w)
      if y* ≠ y_i
        w ← Update(w, x_i, y*, y_i)
  return w
Online Learning

  Parse(x, w)
    return argmax_{y ∈ GEN(x)} Σ_{(i,l,j) ∈ A_y} Σ_{k=1}^{K} f_k(i, l, j, x) · w_k

  Update(w, x, y*, y)
    for k : 1..K
      for (i, l, j) ∈ A_{y*}
        w_k ← w_k − f_k(i, l, j, x)
      for (i, l, j) ∈ A_y
        w_k ← w_k + f_k(i, l, j, x)
    return w
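Putting the loop and the update together, here is a sketch of this structured perceptron in Python, assuming a parse(x, w) decoder (e.g. the Eisner or spanning-tree decoders below) that returns the highest-scoring arc set under the current weights, and a feats(i, l, j, x) feature function like the one sketched above. Weights are a sparse dict rather than a dense K-dimensional vector:

```python
from collections import defaultdict

def train(data, parse, feats, epochs=10):
    """data: list of (x, gold_arcs) pairs; returns learned weights."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, gold_arcs in data:
            pred_arcs = parse(x, w)
            if pred_arcs != gold_arcs:
                # penalize the predicted arcs, reward the gold arcs
                for (i, lab, j) in pred_arcs:
                    for f in feats(i, lab, j, x):
                        w[f] -= 1.0
                for (i, lab, j) in gold_arcs:
                    for f in feats(i, lab, j, x):
                        w[f] += 1.0
    return w
```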
Eisner’s Algorithm

[Figure: chart items in CKY parsing vs. Eisner’s algorithm]
Eisner’s Algorithm

  for i : 0..n and all d, c
    C[i][i][d][c] ← 0.0
  for m : 1..n
    for i : 0..n−m
      j ← i + m
      C[i][j][←][0] ← max_{i ≤ k < j} C[i][k][→][1] + C[k+1][j][←][1] + Score(j, i)
      C[i][j][→][0] ← max_{i ≤ k < j} C[i][k][→][1] + C[k+1][j][←][1] + Score(i, j)
      C[i][j][←][1] ← max_{i ≤ k < j} C[i][k][←][1] + C[k][j][←][0]
      C[i][j][→][1] ← max_{i < k ≤ j} C[i][k][→][0] + C[k][j][→][1]
  return C[0][n][→][1]

(d ∈ {←, →} is the direction of the arc over the span; c = 0 marks incomplete items with an open arc, c = 1 complete subtrees.)
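A runnable version of this recurrence, assuming unlabeled arcs and a precomputed score matrix (the labeled case just maximizes Score over labels per arc). The chart follows the pseudocode, with d = 0/1 for ←/→ and c = 0/1 for incomplete/complete; a sketch, not Nivre's code:

```python
import numpy as np

NEG = float("-inf")

def eisner(score):
    """Maximum projective tree score, O(n^3) time, O(n^2) space.

    score[h, m]: score of the arc h -> m over nodes 0..n (0 = root).
    """
    n = score.shape[0] - 1
    C = np.full((n + 1, n + 1, 2, 2), NEG)
    for i in range(n + 1):
        C[i, i, :, :] = 0.0
    for m in range(1, n + 1):
        for i in range(n - m + 1):
            j = i + m
            # incomplete items: build an arc between i and j on top of
            # two adjacent complete halves
            best = max(C[i, k, 1, 1] + C[k + 1, j, 0, 1] for k in range(i, j))
            C[i, j, 0, 0] = best + score[j, i]   # arc j -> i
            C[i, j, 1, 0] = best + score[i, j]   # arc i -> j
            # complete items: extend an open arc with a complete subtree
            C[i, j, 0, 1] = max(C[i, k, 0, 1] + C[k, j, 0, 0] for k in range(i, j))
            C[i, j, 1, 1] = max(C[i, k, 1, 0] + C[k, j, 1, 1] for k in range(i + 1, j + 1))
    return C[0, n, 1, 1]

# scores from the "John saw Mary" example on the next slide
# (missing arcs = -inf); nodes: 0=<root> 1=John 2=saw 3=Mary
S = np.full((4, 4), NEG)
S[0, 1], S[0, 2] = 9, 10
S[1, 2], S[1, 3] = 20, 3
S[2, 1], S[2, 3] = 30, 30
S[3, 1], S[3, 2] = 11, 0
print(eisner(S))  # 70.0: <root> -> saw -> {John, Mary}
```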
Spanning Tree Parsing

[Figure: weighted dependency graph for "John saw Mary" over ROOT, saw, John, Mary (arc scores 9, 10, 30, 30, 20, 0, 11, 3), shown next to its maximum spanning tree]
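The figure illustrates non-projective parsing as finding a maximum spanning tree in the dense arc-score graph; the standard decoder for this is the Chu-Liu-Edmonds algorithm. Below is a sketch of its greedy-plus-contract formulation (all function names are illustrative), assuming a dense score matrix with -inf on the diagonal, on missing arcs, and on arcs into the root:

```python
import numpy as np

NEG = float("-inf")

def find_cycle(heads):
    """Return a list of nodes forming a cycle in the head graph, or None."""
    n = len(heads)
    color = [0] * n  # 0 = unvisited, 1 = on current path, 2 = done
    color[0] = 2     # node 0 is the root
    for start in range(1, n):
        if color[start]:
            continue
        path, v = [], start
        while color[v] == 0:
            color[v] = 1
            path.append(v)
            v = heads[v]
        if color[v] == 1:                     # ran into the current path
            return path[path.index(v):]
        for u in path:
            color[u] = 2
    return None

def chu_liu_edmonds(score):
    """Maximum spanning arborescence rooted at node 0.

    score[h, m] = score of arc h -> m. Returns the heads array,
    with heads[0] = 0 as a dummy entry.
    """
    n = score.shape[0]
    heads = score.argmax(axis=0)              # greedy best head per node
    heads[0] = 0
    cycle = find_cycle(heads)
    if cycle is None:
        return heads

    cyc = set(cycle)
    rest = [v for v in range(n) if v not in cyc]   # includes the root
    con = len(rest)                                # index of contracted node
    sub = np.full((con + 1, con + 1), NEG)
    # arcs among non-cycle nodes keep their scores
    for a, u in enumerate(rest):
        for b, v in enumerate(rest):
            sub[a, b] = score[u, v]
    # arcs into the contracted node: entering the cycle at c pays
    # score[u, c] minus the broken cycle arc into c (cycle weight
    # itself is a constant and can be dropped)
    enter = {}
    for a, u in enumerate(rest):
        best, c_best = max((score[u, c] - score[heads[c], c], c) for c in cycle)
        sub[a, con] = best
        enter[u] = c_best
    # arcs out of the contracted node: best cycle-internal source
    leave = {}
    for b, v in enumerate(rest):
        best, c_best = max((score[c, v], c) for c in cycle)
        sub[con, b] = best
        leave[v] = c_best

    sub_heads = chu_liu_edmonds(sub)          # recurse on the smaller graph

    heads_out = heads.copy()
    for b, v in enumerate(rest):
        if v == 0:
            continue
        h = sub_heads[b]
        heads_out[v] = rest[h] if h < con else leave[v]
    u = rest[sub_heads[con]]                  # the arc chosen into the cycle
    heads_out[enter[u]] = u                   # breaks the cycle arc
    return heads_out

# same "John saw Mary" scores as above: 0=<root> 1=John 2=saw 3=Mary
S = np.full((4, 4), NEG)
S[0, 1], S[0, 2] = 9, 10
S[1, 2], S[1, 3] = 20, 3
S[2, 1], S[2, 3] = 30, 30
S[3, 1], S[3, 2] = 11, 0
print(chu_liu_edmonds(S))  # [0 2 0 2]: <root> -> saw -> {John, Mary}
```

The greedy step here first picks John and saw as each other's heads (scores 30 and 20), forming a cycle; contracting it and recursing selects ROOT → saw as the entry arc, which breaks the cycle and yields the tree of score 70.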