Dependency Parsing as Head Selection
Xingxing Zhang, Jianpeng Cheng, Mirella Lapata
Institute for Language, Cognition and Computation
University of Edinburgh
x.zhang@ed.ac.uk
April 6, 2017
Zhang et al. (Univ. of Edinburgh), DeNSe: Dependency Neural Selection, April 6, 2017

Dependency Parsing
Dependency parsing is the task of transforming a sentence S = (root, w_1, w_2, ..., w_N) into a directed tree originating out of root.
Parsing Algorithms
- Transition-based Parsing
- Graph-based Parsing
Our parser is neither transition-based nor graph-based (during training).

Transition-based Parsing
Data structures: Buffer, Stack, Arc Set
Parsing: choose an action from
- SHIFT
- REDUCE-Left
- REDUCE-Right
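The three actions above can be sketched as operations on a buffer, a stack and an arc set. This is a minimal illustration, assuming arc-standard semantics (REDUCE-Left makes the top of the stack the head of the item below it; REDUCE-Right makes the item below the head of the top), which the slide does not spell out. The names `Configuration`, `step` and `parse` are hypothetical.

```python
from dataclasses import dataclass, field

SHIFT, REDUCE_LEFT, REDUCE_RIGHT = "SHIFT", "REDUCE-Left", "REDUCE-Right"

@dataclass
class Configuration:
    buffer: list                                       # word indices not yet read
    stack: list = field(default_factory=lambda: [0])   # index 0 is root
    arcs: set = field(default_factory=set)             # (head, dependent) pairs

def step(config, action):
    """Apply one action in place (arc-standard semantics, an assumption)."""
    if action == SHIFT:
        # Move the next buffer word onto the stack.
        config.stack.append(config.buffer.pop(0))
    elif action == REDUCE_LEFT:
        # The stack top becomes the head of the item just below it.
        dependent = config.stack.pop(-2)
        config.arcs.add((config.stack[-1], dependent))
    elif action == REDUCE_RIGHT:
        # The item below the stack top becomes the head of the top.
        dependent = config.stack.pop()
        config.arcs.add((config.stack[-1], dependent))
    else:
        raise ValueError(f"unknown action: {action}")
    return config

def parse(n_words, actions):
    """Run a fixed action sequence over words 1..n_words and return the arcs."""
    config = Configuration(buffer=list(range(1, n_words + 1)))
    for action in actions:
        step(config, action)
    return config.arcs

# "root kids love candy": root=0, kids=1, love=2, candy=3
arcs = parse(3, [SHIFT, SHIFT, REDUCE_LEFT, SHIFT, REDUCE_RIGHT, REDUCE_RIGHT])
```

In a real transition-based parser a classifier would choose each action from the current configuration; here the sequence is supplied by hand to keep the sketch short.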

Graph-based Parsing
A sentence → a directed complete graph (graphs from Kübler et al., 2009)
Parsing: finding the maximum spanning tree
- Chu-Liu-Edmonds algorithm (Chu and Liu, 1965)
- Eisner algorithm (Eisner, 1996)

Recent Advances
Mostly replacing discrete features with neural network features.
Transition-based parsers
- Feed-forward NN features (Chen and Manning, 2014)
- Bi-LSTM features (Kiperwasser and Goldberg, 2016)
- Stack-LSTM: buffer, stack and action sequences modeled by Stack-LSTMs (Dyer et al., 2015)
Graph-based parsers
- Tensor decomposition features (Lei et al., 2014)
- Feed-forward NN features (Pei et al., 2015)
- Bi-LSTM features (Kiperwasser and Goldberg, 2016)

Do we need a transition system or graph algorithm?
Example: root kids love candy
An important fact: every word has only one head!
Why not just learn to select the head?

Dependency Parsing as Head Selection
DeNSe: Dependency Neural Selection
$$P_{head}(\text{root} \mid \text{love}, S) = \frac{\exp(\mathrm{MLP}(a_{\text{root}}, a_{\text{love}}))}{\sum_{k=0}^{3} \exp(\mathrm{MLP}(a_k, a_{\text{love}}))}$$
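The head-selection probability on this slide is a softmax over every position in the sentence, computed separately for each word. The sketch below illustrates that computation in plain Python. The slide only says "MLP", so the one-hidden-layer form `v . tanh(U a_head + W a_dep)`, the parameter names `U`, `W`, `v`, and the random vectors standing in for the representations `a_k` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random

def mlp_score(a_head, a_dep, U, W, v):
    """Score a candidate head as v . tanh(U a_head + W a_dep) (assumed form)."""
    hidden = [
        math.tanh(sum(u * x for u, x in zip(u_row, a_head))
                  + sum(w * x for w, x in zip(w_row, a_dep)))
        for u_row, w_row in zip(U, W)
    ]
    return sum(vi * hi for vi, hi in zip(v, hidden))

def head_distribution(a, dep, U, W, v):
    """Softmax over all positions k = 0..N as candidate heads of word `dep`."""
    scores = [mlp_score(a_k, a[dep], U, W, v) for a_k in a]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_heads(a, U, W, v):
    """Pick the most probable head independently for every non-root word."""
    heads = {}
    for dep in range(1, len(a)):
        probs = head_distribution(a, dep, U, W, v)
        heads[dep] = max(range(len(a)), key=probs.__getitem__)
    return heads

# Toy setup: random vectors stand in for the states of "root kids love candy".
rng = random.Random(0)
dim, hidden_dim = 4, 5
a = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(4)]
U = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden_dim)]
W = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden_dim)]
v = [rng.uniform(-1, 1) for _ in range(hidden_dim)]

p_love = head_distribution(a, 2, U, W, v)  # P_head(k | love, S) for k = 0..3
heads = greedy_heads(a, U, W, v)           # one head index per word
```

Because each word chooses its head independently, nothing in `greedy_heads` forces the result to be a tree; with untrained random weights it often will not be.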

Decoding
Greedy decoding: the output may not be a (projective) tree!
Greedy Decoding
Dataset         #Sent (Dev)   Tree (%)   Proj (%)
PTB (English)   1,700         95.1       86.6
CTB (Chinese)   803           87.0       73.1
Czech           374           87.7       65.5
German          367           96.7       67.3
Decoding with a maximum spanning tree algorithm (relatively rare)
- Projective parsing: Eisner algorithm
- Non-projective parsing: Chu-Liu-Edmonds algorithm
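Whether a greedy output needs the spanning-tree fallback can be decided with two simple checks. The sketch below is an illustration rather than the authors' code: `heads[i]` holds the selected head of word `i`, position 0 is root (its entry is unused), and the function names are hypothetical.

```python
def is_tree(heads):
    """True if every word reaches root (index 0) by following head links."""
    for word in range(1, len(heads)):
        seen = set()
        node = word
        while node != 0:
            if node in seen:      # revisited a node, so there is a cycle
                return False
            seen.add(node)
            node = heads[node]
    return True

def is_projective(heads):
    """True if no two arcs cross; meaningful when is_tree(heads) holds."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if d > 0]
    return not any(l1 < l2 < r1 < r2
                   for l1, r1 in arcs
                   for l2, r2 in arcs)

# "root kids love candy": love heads kids and candy, root heads love.
good = [None, 2, 0, 2]
# kids and love select each other, so no path leads back to root.
cyclic = [None, 2, 1, 2]
# A tree whose arcs (1,3) and (0,2) cross.
crossing = [None, 3, 0, 2, 2]
```

Only outputs that fail the relevant check would be passed to the Eisner or Chu-Liu-Edmonds algorithm.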

Labelled Parser
A two-layer rectifier network (Glorot et al., 2011)
Dependent word:
- Bi-LSTM feature
- Word embedding
- PoS embedding
Head word:
- Bi-LSTM feature
- Word embedding
- PoS embedding
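The label classifier can be pictured as a small feed-forward network over the concatenated features of a dependent and its selected head. The sketch below reads "two-layer" as one rectified hidden layer followed by a softmax output layer, which is an assumption; the label set, dimensions and parameter names are likewise hypothetical, and random numbers stand in for the Bi-LSTM, word and PoS features.

```python
import math
import random

def affine(W, b, x):
    """Compute W x + b for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def softmax(xs):
    top = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def label_distribution(dep_feats, head_feats, params):
    """Distribution over arc labels for one (dependent, head) pair."""
    x = dep_feats + head_feats                        # concatenate both words
    h = relu(affine(params["W1"], params["b1"], x))   # rectified hidden layer
    return softmax(affine(params["W2"], params["b2"], h))

LABELS = ["nsubj", "dobj", "root"]   # illustrative label set
rng = random.Random(1)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

feat_dim = 6   # stands for Bi-LSTM feature + word embedding + PoS embedding
hidden_dim = 8
params = {
    "W1": rand_matrix(hidden_dim, 2 * feat_dim), "b1": [0.0] * hidden_dim,
    "W2": rand_matrix(len(LABELS), hidden_dim), "b2": [0.0] * len(LABELS),
}
dep_feats = [rng.uniform(-1, 1) for _ in range(feat_dim)]
head_feats = [rng.uniform(-1, 1) for _ in range(feat_dim)]

probs = label_distribution(dep_feats, head_feats, params)
predicted = LABELS[probs.index(max(probs))]
```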

Experiments

Projective Parsing Results (PTB; English)
Systems compared: NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); SynNet (Andor et al., 2016)

Projective Parsing Results (CTB; Chinese)
Systems compared: NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); 3rd-cubic (Zhang & McDonald, 2014)

Non-projective Parsing Results (German)
Systems compared: MST-1st, MST-2nd (McDonald et al., 2005); Turbo-1st, Turbo-3rd (Martins et al., 2013); RBG-1st, RBG-3rd (Lei et al., 2014)

Non-projective Parsing Results (Czech)
Systems compared: MST-1st, MST-2nd (McDonald et al., 2005); Turbo-1st, Turbo-3rd (Martins et al., 2013); RBG-1st, RBG-3rd (Lei et al., 2014)

Unlabeled Exact Match
            PTB             CTB
Parser      Dev     Test    Dev     Test
C&M14       43.35   40.93   32.75   32.20
Dyer15      51.94   50.70   39.72   37.23
DeNSe       51.24   49.34   34.74   33.66
DeNSe+E     52.47   50.79   36.49   35.13
Table: UEM results on PTB and CTB.

UAS vs. Length
[Figure: UAS (%) against PTB sentence length for C&M14, Dyer15 and DeNSe+E]
[Figure: UAS (%) against CTB sentence length for C&M14, Dyer15 and DeNSe+E]

Conclusions
- We formulate dependency parsing as greedily selecting the head of each word in a sentence.
- Combining the greedy model with an MST algorithm can further improve performance.
- Code available: https://github.com/XingxingZhang/dense parser

Thanks
Q & A
