

  1. Combining Global Models for Parsing Universal Dependencies Team C2L2 — Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng Cornell University

  2. Overview — Scope of Our System
     What we did: • Word Segmentation • Sentence Boundary Detection • Projective Parsing • Dependency Arc Labeling • Delexicalized Parsing
     What we didn’t do: • POS Tagging • Morphology Analysis • Non-projective Parsing • Unlabeled data

  3. Overview — Highlights • Global transition-based models (exact argmax over all trees 𝑧 ∈ 𝒵) • Bi-LSTM-powered compact features • Delexicalized syntactic transfer (e.g., fi → sme) • High efficiency, low resource demand • Rankings: 1st on small treebanks and surprise languages, 2nd overall

  4. Overview — System Pipeline: I. UDPipe Pre-process → II. Feature Extraction → III. Unlabeled Parsing → IV. Arc Labeling

  5. Stage I: UDPipe Pre-processing — raw text in; sentence-delimited and tokenized text out

  6. Stage I: UDPipe Pre-processing — out-of-vocabulary (OOV) word rates, measured on the development sets (highest first):
     ko – Korean   43.68%
     la – Latin    41.22%
     sk – Slovak   36.51%
     …             …
     Average       14.4%
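
The OOV rates above follow the standard definition: the fraction of development-set tokens whose word form never appears in the training data. A minimal sketch (the tiny token lists are made up for illustration):

```python
def oov_rate(train_tokens, dev_tokens):
    """Fraction of dev tokens whose word form never occurs in training."""
    train_vocab = set(train_tokens)
    unseen = sum(1 for tok in dev_tokens if tok not in train_vocab)
    return unseen / len(dev_tokens)

train = ["the", "parser", "reads", "the", "text"]
dev = ["the", "parser", "segments", "sentences"]
print(oov_rate(train, dev))  # 0.5: "segments" and "sentences" are unseen
```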

  7. Stage II: Feature Extraction — word representations are built from characters: the character sequence (p, a, r, s, i, n, g) is fed through a bi-directional LSTM to produce a vector for the word “parsing”
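
The character-level idea on slide 7 can be sketched as follows. This is not the team's implementation: a simplified Elman RNN stands in for the LSTM, the dimensions and random weights are arbitrary, and only the two final hidden states are kept and concatenated into the word vector.

```python
import numpy as np

rng = np.random.default_rng(0)
CHARS = "abcdefghijklmnopqrstuvwxyz"
char2id = {c: i for i, c in enumerate(CHARS)}
EMB, HID = 8, 16

E = rng.normal(scale=0.1, size=(len(CHARS), EMB))   # char embeddings
Wf = rng.normal(scale=0.1, size=(HID, EMB + HID))   # forward RNN weights
Wb = rng.normal(scale=0.1, size=(HID, EMB + HID))   # backward RNN weights

def run_rnn(W, xs):
    """Simple tanh RNN; returns the final hidden state."""
    h = np.zeros(HID)
    for x in xs:
        h = np.tanh(W @ np.concatenate([x, h]))
    return h

def word_vector(word):
    """Compose a word vector from its characters, both directions."""
    xs = [E[char2id[c]] for c in word]
    fwd = run_rnn(Wf, xs)        # left-to-right pass
    bwd = run_rnn(Wb, xs[::-1])  # right-to-left pass
    return np.concatenate([fwd, bwd])  # 2*HID-dimensional representation

print(word_vector("parsing").shape)  # (32,)
```

Because the vector is a function of the spelling alone, unseen words (the high OOV rates on slide 6) still get informative representations.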

  8. Stage II: Feature Extraction — a second bi-directional LSTM runs over the word vectors of the whole sentence (“Universal dependency parsing”) to produce context-sensitive word representations

  9. Stage III: Unlabeled Parsing — three parsers (global arc-eager, global arc-hybrid, Eisner’s) share the same bi-LSTM features; their outputs are combined by reparsing with Eisner’s algorithm (Sagae and Lavie, 2006)
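
The reparsing step can be sketched as follows: each member parser's predicted tree votes for its arcs, and Eisner's algorithm then finds the highest-scoring projective tree under the summed votes. This is a minimal sketch of the Sagae and Lavie (2006) scheme, not the team's code; uniform vote weights are an assumption.

```python
import numpy as np

def eisner(scores):
    """Eisner's O(n^3) algorithm for the best projective dependency tree.
    scores[h][m] is the score of arc h -> m; index 0 is the artificial root.
    Returns heads, where heads[m] is the head of word m (heads[0] = -1)."""
    n = scores.shape[0]
    NEG = float("-inf")
    # [s][t][d]: span s..t; d = 0 means head on the right, d = 1 on the left
    I = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]  # incomplete
    C = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]  # complete
    bI = [[[0, 0] for _ in range(n)] for _ in range(n)]     # backpointers
    bC = [[[0, 0] for _ in range(n)] for _ in range(n)]
    for s in range(n):
        C[s][s][0] = C[s][s][1] = 0.0
    for length in range(1, n):
        for s in range(n - length):
            t = s + length
            # incomplete spans: both directions share the same split point
            best, arg = max((C[s][r][1] + C[r + 1][t][0], r)
                            for r in range(s, t))
            I[s][t][0] = best + scores[t][s]  # arc t -> s
            I[s][t][1] = best + scores[s][t]  # arc s -> t
            bI[s][t][0] = bI[s][t][1] = arg
            # complete spans: an incomplete half plus a complete half
            C[s][t][0], bC[s][t][0] = max((C[s][r][0] + I[r][t][0], r)
                                          for r in range(s, t))
            C[s][t][1], bC[s][t][1] = max((I[s][r][1] + C[r][t][1], r)
                                          for r in range(s + 1, t + 1))
    heads = [-1] * n

    def backtrack(s, t, d, complete):
        if s == t:
            return
        if complete:
            r = bC[s][t][d]
            if d == 0:
                backtrack(s, r, 0, True)
                backtrack(r, t, 0, False)
            else:
                backtrack(s, r, 1, False)
                backtrack(r, t, 1, True)
        else:
            heads[s if d == 0 else t] = t if d == 0 else s
            r = bI[s][t][d]
            backtrack(s, r, 1, True)
            backtrack(r + 1, t, 0, True)

    backtrack(0, n - 1, 1, True)
    return heads

def reparse(trees, n):
    """Sum each member parser's arcs into a vote matrix, then re-decode."""
    scores = np.zeros((n, n))
    for heads in trees:
        for m, h in enumerate(heads):
            if h >= 0:
                scores[h][m] += 1.0
    return eisner(scores)

# Two parsers agree, one disagrees; reparsing recovers the majority tree.
trees = [[-1, 0, 1, 1], [-1, 0, 1, 1], [-1, 2, 0, 2]]
print(reparse(trees, 4))  # [-1, 0, 1, 1]
```

With real-valued parser scores in place of unit votes, the same decoder serves both as a single-model parser and as the ensemble combiner.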

  10. Stage III: Global Transition-based Parsing* • Exact O(n³) decoders • Arc-eager and arc-hybrid systems • Large-margin global training • Dynamic programming (Huang and Sagae, 2010; Kuhlmann, Gómez-Rodríguez and Satta, 2011) — *Shi, Huang and Lee (2017, EMNLP)
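
For readers unfamiliar with the transition systems named above, here is a sketch of the arc-hybrid mechanics: a stack, a buffer, and three actions that incrementally build the arc set. The deck's parsers search over all action sequences exactly via dynamic programming; this sketch only applies one given sequence, and the example sentence is made up.

```python
def parse_arc_hybrid(n_words, actions):
    """Apply an arc-hybrid action sequence to a sentence of n_words tokens.
    Words are 1..n_words; 0 is the artificial root. Returns heads[m] = h.
      shift:  move the front of the buffer onto the stack
      left:   attach the stack top to the front of the buffer, pop it
      right:  attach the stack top to the element below it, pop it
    """
    stack, buffer = [0], list(range(1, n_words + 1))
    heads = {}
    for a in actions:
        if a == "shift":
            stack.append(buffer.pop(0))
        elif a == "left":       # arc buffer-front -> stack-top
            heads[stack.pop()] = buffer[0]
        elif a == "right":      # arc second-on-stack -> stack-top
            m = stack.pop()
            heads[m] = stack[-1]
    return heads

# Toy sentence "She reads books": subject 2->1, root 0->2, object 2->3.
actions = ["shift", "left", "shift", "shift", "right", "right"]
print(sorted(parse_arc_hybrid(3, actions).items()))  # [(1, 2), (2, 0), (3, 2)]
```

Note that left-arc decisions look at the stack top and the buffer front only, which is exactly why the compact two-position feature set on the next slide suffices.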

  11. Stage III: Compact Feature Set — each decision is scored from only two positions: Eisner’s: head, modifier; Arc-eager: stack top, buffer top; Arc-hybrid: stack top, buffer top. Scoring function: deep bi-affine (Dozat and Manning, 2017)
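
The bi-affine scoring function named above combines a bilinear term over the two position vectors with a linear term over their concatenation. A minimal sketch with random toy vectors; in the Dozat and Manning (2017) setup the head and modifier vectors would first pass through separate MLPs over the bi-LSTM states (omitted here), and the dimension is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # toy dimension for the head/modifier vectors

U = rng.normal(size=(D, D))   # bilinear interaction term
w = rng.normal(size=2 * D)    # linear term over [head; modifier]
b = 0.0                       # scalar bias

def biaffine_score(head_vec, mod_vec):
    """score(h, m) = h^T U m + w^T [h; m] + b."""
    return (head_vec @ U @ mod_vec
            + w @ np.concatenate([head_vec, mod_vec]) + b)

h = rng.normal(size=D)
m = rng.normal(size=D)
print(biaffine_score(h, m))
```

Because only two vectors enter the score, all candidate decisions can be scored with a pair of batched matrix products, which is part of the efficiency story on slide 21.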

  12. Stage III: Ensembling — LAS: each single parser (arc-eager, arc-hybrid, Eisner’s) scores between 73.75 and 74.32; the full ensemble reaches 75.00

  13. Stage IV: Arc Labeling — a multi-layer perceptron over concat(head, modifier) predicts the dependency label (nsubj, obj, …)
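
The labeler on slide 13 can be sketched as a one-hidden-layer MLP that maps the concatenated head and modifier vectors to label scores. The label set, dimensions, and random weights below are illustrative only; a trained model would learn W1 and W2.

```python
import numpy as np

rng = np.random.default_rng(2)
LABELS = ["nsubj", "obj", "obl"]   # tiny illustrative label set
D, H = 4, 8                        # vector dimension, hidden layer size

W1 = rng.normal(scale=0.5, size=(H, 2 * D))
W2 = rng.normal(scale=0.5, size=(len(LABELS), H))

def label_arc(head_vec, mod_vec):
    """One-hidden-layer MLP over concat(head, modifier) -> best label."""
    x = np.concatenate([head_vec, mod_vec])
    hidden = np.tanh(W1 @ x)
    logits = W2 @ hidden
    return LABELS[int(np.argmax(logits))]

print(label_arc(rng.normal(size=D), rng.normal(size=D)))
```

Keeping labeling as a separate stage lets the unlabeled parsers stay small and fast, with labels filled in after the tree structure is fixed.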

  14. Stage IV: Effect of Ensembling the Labeler — LAS improves from 74.69 (single labeler) to 75.00 (full ensemble)

  15. Results — Official Ranking:
     Big Treebanks       2
     Small Treebanks     1
     PUD Treebanks       2
     Surprise Languages  1
     Overall             2

  16. Strategies — Small Treebanks: first train one combined model on {fr, fr_partut, fr_sequoia} for all tasks; then fine-tune copies of it on fr, fr_partut and fr_sequoia separately, yielding a treebank-specific model for each
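
The two-stage protocol above can be sketched abstractly. The "model" and "training" below are deliberate placeholders (a single parameter nudged toward the data mean), standing in for the parser and its training loop; the per-treebank data is made up. The point is only the control flow: train once on the union, then branch into per-treebank fine-tuning from the shared starting point.

```python
def train(model, data, epochs, lr):
    """Placeholder 'training': nudge the parameter toward the data mean."""
    for _ in range(epochs):
        target = sum(data) / len(data)
        model["w"] += lr * (target - model["w"])
    return model

# Hypothetical per-treebank data standing in for fr, fr_partut, fr_sequoia.
banks = {"fr": [1.0, 2.0], "fr_partut": [4.0], "fr_sequoia": [2.0, 3.0]}

# Stage 1: one combined model on the union of all treebanks.
combined = train({"w": 0.0}, [x for d in banks.values() for x in d], 50, 0.5)

# Stage 2: copy the combined model, then fine-tune on each treebank.
finetuned = {name: train(dict(combined), data, 10, 0.5)
             for name, data in banks.items()}
print(finetuned["fr_partut"]["w"])  # pulled from the shared value toward 4.0
```

As slide 17 shows, the combined model already beats single-treebank training, and the fine-tuned copies improve further on every treebank.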

  17. Results — Small Treebanks (UAS on dev sets, using gold segmentation):
     Train \ Test     fr      fr_partut  fr_sequoia
     itself           84.09   79.53      84.65
     Combined         87.57   85.57      82.80
     + Finetune       87.87   86.65      86.37

  18. Strategies — Surprise Languages: train a delexicalized parser on a source language selected via WALS; each word is represented as concat(UPOS tag embedding, bag of morphology tags via max pooling), fed into the bi-directional LSTM
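
The delexicalized representation above replaces word forms with universal annotations, so a parser trained on the source language transfers to the target. A minimal sketch; the UPOS/morphology inventories, dimensions, and random embeddings are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 4
UPOS = {"NOUN": 0, "VERB": 1, "PRON": 2}
MORPH = {"Case=Nom": 0, "Number=Sing": 1, "Person=3": 2, "Tense=Pres": 3}

E_upos = rng.normal(size=(len(UPOS), D))    # UPOS tag embeddings
E_morph = rng.normal(size=(len(MORPH), D))  # morphology-tag embeddings

def delex_features(upos, morph_tags):
    """concat(UPOS embedding, max-pooled bag of morphology embeddings)."""
    pooled = (np.max([E_morph[MORPH[t]] for t in morph_tags], axis=0)
              if morph_tags else np.zeros(D))
    return np.concatenate([E_upos[UPOS[upos]], pooled])

print(delex_features("VERB", ["Person=3", "Tense=Pres"]).shape)  # (8,)
```

Max pooling makes the morphology bag order-independent and fixed-size, so a word with any number of morphological features maps to the same input dimensionality.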

  19. Results — Surprise Languages:
     Target          Source*   Ranking
     Buryat          Hindi     2
     Upper Sorbian   Czech     1
     Kurmanji        Persian   1
     North Sámi      Finnish   1
     Average                   1
     *selected via WALS

  20. Implementation • Neural networks • Parsing algorithms • Hardware × 2 • Training time: approx. 1 week

  21. Efficiency (not benchmark results):
     Team                 Runtime (h)  LAS    CPUs  RAM (GB)
     Stanford (Stanford)  26.17        76.30  4     16
     C2L2 (Ithaca)        16.27        75.00  2     8
     IMS (Stuttgart)      8.88         74.42  12    64
     HIT-SCIR (Harbin)    5.96         72.11  1     8
     LATTICE (Paris)      4.64         70.93  8     32

  22. Combining Global Models for Parsing Universal Dependencies — Summary: • Global transition-based models (exact argmax over trees 𝑧 ∈ 𝒵) • Ensemble • Two-stage fine-tuning — Code: https://github.com/CoNLL-UD-2017/C2L2 — Team C2L2: Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng
