NLP Programming Tutorial 12 – Dependency Parsing
Graham Neubig
Nara Institute of Science and Technology (NAIST)
Interpreting Language is Hard!
I saw a girl with a telescope
● “Parsing” resolves structural ambiguity in a formal way
Two Types of Parsing
● Dependency: focuses on relations between words
  I saw a girl with a telescope  [figure: arcs link each word to its head]
● Phrase structure: focuses on identifying phrases and their recursive structure
  (S (NP (PRP I)) (VP (VBD saw) (NP (DT a) (NN girl)) (PP (IN with) (NP (DT a) (NN telescope)))))
Dependencies Also Resolve Ambiguity
I saw a girl with a telescope  [“with a telescope” attaches to “saw”]
I saw a girl with a telescope  [“with a telescope” attaches to “girl”]
Dependencies
● Typed: a label indicates the relationship between words
  I saw a girl with a telescope  [arcs labeled nsubj, dobj, det, prep, pobj]
● Untyped: only which words depend on which
  I saw a girl with a telescope  [unlabeled arcs]
Dependency Parsing Methods
● Shift-reduce
  ● Predict actions from left to right
  ● Fast (linear time), but slightly less accurate?
  ● MaltParser
● Spanning tree
  ● Calculate the full tree at once
  ● Slightly more accurate, slower
  ● MSTParser, Eda (Japanese)
● Cascaded chunking
  ● Chunk words into phrases, find heads, delete non-heads, repeat
  ● CaboCha (Japanese)
Maximum Spanning Tree
● Each dependency is an edge in a directed graph
● Assign each edge a score (with machine learning)
● Keep the tree with the highest score (Chu-Liu-Edmonds algorithm)
[figure: graph over “I saw girl” → scored graph → highest-scoring dependency tree]
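As a rough illustration of the idea, the sketch below shows only the greedy first step of Chu-Liu-Edmonds: every word picks its highest-scoring incoming edge. The full algorithm additionally contracts any cycles that appear and recurses; the edge scores here are invented for illustration.

```python
def best_heads(n_words, score):
    """For each word 1..n_words, pick the head (0 = ROOT) with the
    highest edge score; score maps (head, dependent) -> float."""
    heads = {}
    for dep in range(1, n_words + 1):
        heads[dep] = max(
            (h for h in range(0, n_words + 1) if h != dep),
            key=lambda h: score.get((h, dep), float("-inf")),
        )
    return heads

# Invented scores for "I saw girl" (1=I, 2=saw, 3=girl)
score = {(0, 2): 6, (2, 1): 6, (2, 3): 7, (1, 2): 4,
         (3, 2): -1, (0, 1): 1, (0, 3): 2, (1, 3): -2, (3, 1): 5}
heads = best_heads(3, score)
print(heads)  # {1: 2, 2: 0, 3: 2} -- "saw" heads "I" and "girl", ROOT heads "saw"
```

With these scores the greedy choice already yields a valid tree; in general it can produce cycles, which is exactly what the contraction step of Chu-Liu-Edmonds repairs.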
Cascaded Chunking
● Works for Japanese, which is strictly head-final
● Divide the sentence into chunks; the head is the rightmost word
私 は 望遠鏡 で 女 の 子 を 見た
[figure: chunks are repeatedly attached to a chunk on their right until only 見た remains]
Shift-Reduce
● Process words one by one, left to right
● Two data structures
  ● Queue: unprocessed words
  ● Stack: partially processed words
● At each point, choose one of
  ● shift: move one word from the queue to the stack
  ● reduce left: the top word on the stack is the head of the second word
  ● reduce right: the second word on the stack is the head of the top word
● Learn how to choose each action with a classifier
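The three actions can be simulated directly. This small sketch replays a fixed (gold) action sequence for “I saw a girl” and records each word's head as it is reduced; heads[i] holds the head of word i.

```python
from collections import deque

def run(words, actions):
    queue = deque(enumerate(words, start=1))   # (id, word) pairs
    stack = []
    heads = {}
    for act in actions:
        if act == "shift":
            stack.append(queue.popleft())
        elif act == "left":    # top of stack is head of second word
            dep = stack.pop(-2)
            heads[dep[0]] = stack[-1][0]
        elif act == "right":   # second word on stack is head of top word
            dep = stack.pop(-1)
            heads[dep[0]] = stack[-1][0]
    return heads

heads = run(["I", "saw", "a", "girl"],
            ["shift", "shift", "left", "shift", "shift", "left", "right"])
print(heads)  # {1: 2, 3: 4, 4: 2}: "saw" heads "I" and "girl", "girl" heads "a"
```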
Shift-Reduce Example
  stack: []            queue: [I saw a girl]  → shift
  stack: [I]           queue: [saw a girl]    → shift
  stack: [I saw]       queue: [a girl]        → reduce left (“saw” heads “I”)
  stack: [saw]         queue: [a girl]        → shift
  stack: [saw a]       queue: [girl]          → shift
  stack: [saw a girl]  queue: []              → reduce left (“girl” heads “a”)
  stack: [saw girl]    queue: []              → reduce right (“saw” heads “girl”)
  stack: [saw]         queue: []
Classification for Shift-Reduce
● Given a state:
  stack: [saw a]  queue: [girl]  (“I” already attached to “saw”)
● Which action do we choose?
  ● reduce left?  → “a” heads “saw”
  ● reduce right? → “saw” heads “a”
  ● shift?        → stack: [saw a girl]  queue: []
● Correct actions → correct tree
Classification for Shift-Reduce
● We have a weight vector for each action: “shift” w_s, “reduce left” w_l, “reduce right” w_r
● Calculate feature functions from the queue and stack: φ(queue, stack)
● Multiply the weights by the feature functions to get scores:
  s_s = w_s * φ(queue, stack)
● Take the action with the highest score:
  s_s > s_l && s_s > s_r → do shift
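With sparse features, the score is just a dot product between a weight dict and a feature dict, and the chosen action is the argmax over the three scores. The feature names and weight values below are invented for illustration.

```python
def dot(w, feats):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def choose(w_s, w_l, w_r, feats):
    """Score all three actions and return the highest-scoring one."""
    scores = {"shift": dot(w_s, feats),
              "left": dot(w_l, feats),
              "right": dot(w_r, feats)}
    return max(scores, key=scores.get)

feats = {"W-1a,W0girl": 1, "P-1DET,P0NN": 1}   # invented example features
w_s = {"P-1DET,P0NN": 2.0}
w_l = {"W-1a,W0girl": 0.5}
w_r = {}
action = choose(w_s, w_l, w_r, feats)
print(action)  # "shift" (score 2.0 beats 0.5 and 0.0)
```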
Features for Shift-Reduce
● Features should generally cover at least the last two stack entries and the first queue entry:
  stack[-2] (second-to-last)   stack[-1] (last)   queue[0] (first)
  Word:  saw   a     girl
  POS:   VBD   DET   NN
● Example features (all with value 1):
  φ_{W-2saw,W-1a}   φ_{W-2saw,P-1DET}   φ_{P-2VBD,W-1a}    φ_{P-2VBD,P-1DET}
  φ_{W-1a,W0girl}   φ_{W-1a,P0NN}       φ_{P-1DET,W0girl}  φ_{P-1DET,P0NN}
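A minimal MakeFeats along these lines pairs the words and POS tags of stack[-2], stack[-1], and queue[0], exactly the eight features shown above; entries are assumed to be (id, word, pos) tuples.

```python
def make_feats(stack, queue):
    """Pairwise word/POS features over the top two stack entries
    and the first queue entry."""
    feats = {}
    if len(stack) >= 2:
        _, w2, p2 = stack[-2]
        _, w1, p1 = stack[-1]
        feats[f"W-2{w2},W-1{w1}"] = 1
        feats[f"W-2{w2},P-1{p1}"] = 1
        feats[f"P-2{p2},W-1{w1}"] = 1
        feats[f"P-2{p2},P-1{p1}"] = 1
    if len(stack) >= 1 and len(queue) >= 1:
        _, w1, p1 = stack[-1]
        _, w0, p0 = queue[0]
        feats[f"W-1{w1},W0{w0}"] = 1
        feats[f"W-1{w1},P0{p0}"] = 1
        feats[f"P-1{p1},W0{w0}"] = 1
        feats[f"P-1{p1},P0{p0}"] = 1
    return feats

stack = [(2, "saw", "VBD"), (3, "a", "DET")]
queue = [(4, "girl", "NN")]
feats = make_feats(stack, queue)
print(sorted(feats))  # the eight features from the slide
```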
Algorithm Definition
● The algorithm ShiftReduce takes as input:
  ● weights w_s, w_l, w_r
  ● a queue = [(1, word_1, POS_1), (2, word_2, POS_2), …]
● It starts with a stack holding the special ROOT symbol:
  ● stack = [(0, “ROOT”, “ROOT”)]
● It processes the queue and returns:
  ● heads = [-1, head_1, head_2, …]
Shift-Reduce Algorithm
ShiftReduce(queue)
  make list heads
  stack = [(0, “ROOT”, “ROOT”)]
  while |queue| > 0 or |stack| > 1:
    feats = MakeFeats(stack, queue)
    s_s = w_s * feats  # score for “shift”
    s_l = w_l * feats  # score for “reduce left”
    s_r = w_r * feats  # score for “reduce right”
    if s_s >= s_l and s_s >= s_r and |queue| > 0:
      stack.push(queue.popleft())       # do the shift
    elif s_l >= s_r:                    # do the reduce left
      heads[stack[-2].id] = stack[-1].id
      stack.remove(-2)
    else:                               # do the reduce right
      heads[stack[-1].id] = stack[-2].id
      stack.remove(-1)
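The pseudocode can be rendered as runnable Python. This sketch is self-contained: it uses a deliberately tiny feature function (POS pairs only), adds a guard so the stack is never reduced below two entries, and the weights in the example are hand-crafted, not learned, so that the toy sentence parses correctly.

```python
from collections import deque

def make_feats(stack, queue):
    """Tiny feature sketch: POS pairs of stack[-2]/stack[-1] and stack[-1]/queue[0]."""
    feats = {}
    if len(stack) >= 2:
        feats[f"P-2{stack[-2][2]},P-1{stack[-1][2]}"] = 1
    if stack and queue:
        feats[f"P-1{stack[-1][2]},P0{queue[0][2]}"] = 1
    return feats

def dot(w, feats):
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def shift_reduce(words, w_s, w_l, w_r):
    queue = deque(words)                 # (id, word, pos) tuples
    stack = [(0, "ROOT", "ROOT")]
    heads = {}
    while queue or len(stack) > 1:
        feats = make_feats(stack, queue)
        ss, sl, sr = dot(w_s, feats), dot(w_l, feats), dot(w_r, feats)
        if len(stack) < 2 or (queue and ss >= sl and ss >= sr):
            stack.append(queue.popleft())        # shift
        elif sl >= sr:                           # reduce left
            heads[stack[-2][0]] = stack[-1][0]
            del stack[-2]
        else:                                    # reduce right
            heads[stack[-1][0]] = stack[-2][0]
            del stack[-1]
    return heads

# Hand-crafted weights for "I saw girl" (invented for this demo)
w_s = {"P-1VBD,P0NN": 2}
w_l = {"P-2PRP,P-1VBD": 5}
w_r = {"P-2VBD,P-1NN": 1, "P-2ROOT,P-1VBD": 1}
heads = shift_reduce([(1, "I", "PRP"), (2, "saw", "VBD"), (3, "girl", "NN")],
                     w_s, w_l, w_r)
print(heads)  # {1: 2, 3: 2, 2: 0}: "saw" heads "I" and "girl", ROOT heads "saw"
```

A real implementation would use the full feature set from the earlier slide and weights learned by the perceptron.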
Training Shift-Reduce
● Can be trained with the perceptron algorithm
● Do parsing; if the correct action corr differs from the classifier's action ans, update the weights
● e.g. if ans = SHIFT and corr = LEFT:
  w_s -= φ(queue, stack)
  w_l += φ(queue, stack)
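With sparse dict weights, the update above is a few lines: subtract the current state's features from the wrongly chosen action's weights and add them to the correct action's weights.

```python
def update(weights, ans, corr, feats):
    """Perceptron update: penalize the chosen action, reward the correct one."""
    if ans != corr:
        for f, v in feats.items():
            weights[ans][f] = weights[ans].get(f, 0) - v
            weights[corr][f] = weights[corr].get(f, 0) + v

weights = {"shift": {}, "left": {}, "right": {}}
feats = {"P-1DET,P0NN": 1}            # invented example feature
update(weights, "shift", "left", feats)
print(weights["shift"], weights["left"])
# {'P-1DET,P0NN': -1} {'P-1DET,P0NN': 1}
```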
Keeping Track of the Correct Answer (Initial Attempt)
● Assume we know the correct head of each stack entry:
  stack[-1].head == stack[-2].id (left is head of right) → corr = RIGHT
  stack[-2].head == stack[-1].id (right is head of left) → corr = LEFT
  else → corr = SHIFT
● Problem: too greedy for right-branching dependencies
  “go to school”: id 1 2 3, head 0 1 2
  stack: [go to]  queue: [school]  → this rule says RIGHT, but “to” still needs its child “school”
Keeping Track of the Correct Answer (Revised)
● Count each word's number of unprocessed children (unproc)
● stack[-1].head == stack[-2].id (left is head of right)
  and stack[-1].unproc == 0 (right has no unprocessed children) → corr = RIGHT
● stack[-2].head == stack[-1].id (right is head of left)
  and stack[-2].unproc == 0 (left has no unprocessed children) → corr = LEFT
● else → corr = SHIFT
● Increment unproc for each child when reading in the tree; when we reduce, decrement the head's unproc:
  corr == RIGHT → stack[-2].unproc -= 1
  corr == LEFT → stack[-1].unproc -= 1
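A sketch of this revised oracle, assuming stack entries are dicts with the fields used on the slide (id, head, unproc), shows how the unproc check fixes the greedy failure on “go to school”:

```python
def correct_action(stack, queue):
    """Oracle: reduce only when the dependent has no unprocessed children."""
    if len(stack) >= 2:
        s2, s1 = stack[-2], stack[-1]    # second-to-top, top
        if s1["head"] == s2["id"] and s1["unproc"] == 0:
            return "right"               # left word heads the right word
        if s2["head"] == s1["id"] and s2["unproc"] == 0:
            return "left"                # right word heads the left word
    return "shift"

# "go to school": go(id 1, head 0) -> to(id 2, head 1) -> school(id 3, head 2)
stack = [{"id": 1, "head": 0, "unproc": 1},   # "go" still awaits child "to"
         {"id": 2, "head": 1, "unproc": 1}]   # "to" still awaits child "school"
queue = [{"id": 3, "head": 2, "unproc": 0}]
act = correct_action(stack, queue)
print(act)  # "shift" -- not the greedy "right", because "to".unproc > 0
```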
Shift-Reduce Training Algorithm
ShiftReduceTrain(queue)
  make list heads
  stack = [(0, “ROOT”, “ROOT”)]
  while |queue| > 0 or |stack| > 1:
    feats = MakeFeats(stack, queue)
    calculate ans   # same as ShiftReduce
    calculate corr  # previous slides
    if ans != corr:
      w_ans -= feats
      w_corr += feats
    perform the action according to corr
CoNLL File Format
● Standard format for dependencies
● Tab-separated columns; sentences separated by a blank line

ID  Word     Base     POS  POS2  ?  Head  Type
1   ms.      ms.      NNP  NNP   _  2     DEP
2   haag     haag     NNP  NNP   _  3     NP-SBJ
3   plays    plays    VBZ  VBZ   _  0     ROOT
4   elianti  elianti  NNP  NNP   _  3     NP-OBJ
5   .        .        .    .     _  3     DEP
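A minimal reader for this format splits on tabs and on blank lines, keeping the columns the parser needs (the seventh column is the head id). The sample data below is the sentence from the table above.

```python
def read_conll(text):
    """Parse CoNLL-style text into a list of sentences; each word is a dict."""
    sentences, sent = [], []
    for line in text.strip().split("\n"):
        if not line.strip():                 # blank line ends a sentence
            if sent:
                sentences.append(sent)
                sent = []
            continue
        cols = line.split("\t")
        sent.append({"id": int(cols[0]), "word": cols[1],
                     "pos": cols[3], "head": int(cols[6])})
    if sent:
        sentences.append(sent)
    return sentences

sample = "1\tms.\tms.\tNNP\tNNP\t_\t2\tDEP\n" \
         "2\thaag\thaag\tNNP\tNNP\t_\t3\tNP-SBJ\n" \
         "3\tplays\tplays\tVBZ\tVBZ\t_\t0\tROOT\n" \
         "4\telianti\telianti\tNNP\tNNP\t_\t3\tNP-OBJ\n" \
         "5\t.\t.\t.\t.\t_\t3\tDEP\n"
sents = read_conll(sample)
print([(w["word"], w["head"]) for w in sents[0]])
# [('ms.', 2), ('haag', 3), ('plays', 0), ('elianti', 3), ('.', 3)]
```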
Exercise
● Write train-sr.py and test-sr.py
● Train the program on: data/mstparser-en-train.dep
● Run the program on actual data: data/mstparser-en-test.dep
● Measure accuracy with script/grade-dep.py
● Challenge:
  ● think of better features to use
  ● use a better classification algorithm than the perceptron
  ● analyze the common mistakes
Thank You!