

  1. Combining Global Models for Parsing Universal Dependencies Team C2L2 — Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng Cornell University

  2. Overview — Scope of Our System
     What we did: • Word Segmentation • Sentence Boundary Detection • Projective Parsing • Dependency Arc Labeling • Delexicalized Parsing
     What we didn’t do: • POS Tagging • Morphology Analysis • Non-projective Parsing • Unlabeled data

  3. Overview — Highlights • Global transition-based models (exact argmax over all trees 𝑧 ∈ 𝒵) • Bi-LSTM-powered compact features • Delexicalized syntactic transfer (e.g., fi → sme) • High efficiency, low resource demand • Rankings: 1st on small treebanks and surprise languages, 2nd overall

  4. Overview — System Pipeline: I. UDPipe Pre-process → II. Feature Extraction → III. Unlabeled Parsing → IV. Arc Labeling

  5. Stage I: UDPipe Pre-processing — raw text in; sentence-delimited and tokenized text out

  6. Stage I: UDPipe Pre-processing — out-of-vocabulary (OOV) word rates, measured on the development sets (highest first):
     ko – Korean   43.68%
     la – Latin    41.22%
     sk – Slovak   36.51%
     …             …
     Average       14.4%
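
The OOV rates above follow the standard definition: the fraction of development-set tokens whose word form never appears in the training data. A minimal sketch (the tiny token lists are made up for illustration):

```python
def oov_rate(train_tokens, dev_tokens):
    """Fraction of dev tokens whose word form never occurs in training."""
    train_vocab = set(train_tokens)
    unseen = sum(1 for tok in dev_tokens if tok not in train_vocab)
    return unseen / len(dev_tokens)

train = ["the", "parser", "reads", "the", "text"]
dev = ["the", "parser", "segments", "sentences"]
print(oov_rate(train, dev))  # 0.5: "segments" and "sentences" are unseen
```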

  7. Stage II: Feature Extraction — word representations are built from characters: the character sequence (p, a, r, s, i, n, g) is fed through a bi-directional LSTM to produce a vector for the word “parsing”
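
The character-level idea on slide 7 can be sketched as follows. This is not the team's implementation: a simplified Elman RNN stands in for the LSTM, the dimensions and random weights are arbitrary, and only the two final hidden states are kept and concatenated into the word vector.

```python
import numpy as np

rng = np.random.default_rng(0)
CHARS = "abcdefghijklmnopqrstuvwxyz"
char2id = {c: i for i, c in enumerate(CHARS)}
EMB, HID = 8, 16

E = rng.normal(scale=0.1, size=(len(CHARS), EMB))   # char embeddings
Wf = rng.normal(scale=0.1, size=(HID, EMB + HID))   # forward RNN weights
Wb = rng.normal(scale=0.1, size=(HID, EMB + HID))   # backward RNN weights

def run_rnn(W, xs):
    """Simple tanh RNN; returns the final hidden state."""
    h = np.zeros(HID)
    for x in xs:
        h = np.tanh(W @ np.concatenate([x, h]))
    return h

def word_vector(word):
    """Compose a word vector from its characters, both directions."""
    xs = [E[char2id[c]] for c in word]
    fwd = run_rnn(Wf, xs)        # left-to-right pass
    bwd = run_rnn(Wb, xs[::-1])  # right-to-left pass
    return np.concatenate([fwd, bwd])  # 2*HID-dimensional representation

print(word_vector("parsing").shape)  # (32,)
```

Because the vector is a function of the spelling alone, unseen words (the high OOV rates on slide 6) still get informative representations.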

  8. Stage II: Feature Extraction — a second bi-directional LSTM runs over the word vectors of the whole sentence (“Universal dependency parsing”) to produce context-sensitive word representations

  9. Stage III: Unlabeled Parsing — three parsers (global arc-eager, global arc-hybrid, Eisner’s) share the same bi-LSTM features; their outputs are combined by reparsing with Eisner’s algorithm (Sagae and Lavie, 2006)
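
The reparsing step can be sketched as follows: each member parser's predicted tree votes for its arcs, and Eisner's algorithm then finds the highest-scoring projective tree under the summed votes. This is a minimal sketch of the Sagae and Lavie (2006) scheme, not the team's code; uniform vote weights are an assumption.

```python
import numpy as np

def eisner(scores):
    """Eisner's O(n^3) algorithm for the best projective dependency tree.
    scores[h][m] is the score of arc h -> m; index 0 is the artificial root.
    Returns heads, where heads[m] is the head of word m (heads[0] = -1)."""
    n = scores.shape[0]
    NEG = float("-inf")
    # [s][t][d]: span s..t; d = 0 means head on the right, d = 1 on the left
    I = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]  # incomplete
    C = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]  # complete
    bI = [[[0, 0] for _ in range(n)] for _ in range(n)]     # backpointers
    bC = [[[0, 0] for _ in range(n)] for _ in range(n)]
    for s in range(n):
        C[s][s][0] = C[s][s][1] = 0.0
    for length in range(1, n):
        for s in range(n - length):
            t = s + length
            # incomplete spans: both directions share the same split point
            best, arg = max((C[s][r][1] + C[r + 1][t][0], r)
                            for r in range(s, t))
            I[s][t][0] = best + scores[t][s]  # arc t -> s
            I[s][t][1] = best + scores[s][t]  # arc s -> t
            bI[s][t][0] = bI[s][t][1] = arg
            # complete spans: an incomplete half plus a complete half
            C[s][t][0], bC[s][t][0] = max((C[s][r][0] + I[r][t][0], r)
                                          for r in range(s, t))
            C[s][t][1], bC[s][t][1] = max((I[s][r][1] + C[r][t][1], r)
                                          for r in range(s + 1, t + 1))
    heads = [-1] * n

    def backtrack(s, t, d, complete):
        if s == t:
            return
        if complete:
            r = bC[s][t][d]
            if d == 0:
                backtrack(s, r, 0, True)
                backtrack(r, t, 0, False)
            else:
                backtrack(s, r, 1, False)
                backtrack(r, t, 1, True)
        else:
            heads[s if d == 0 else t] = t if d == 0 else s
            r = bI[s][t][d]
            backtrack(s, r, 1, True)
            backtrack(r + 1, t, 0, True)

    backtrack(0, n - 1, 1, True)
    return heads

def reparse(trees, n):
    """Sum each member parser's arcs into a vote matrix, then re-decode."""
    scores = np.zeros((n, n))
    for heads in trees:
        for m, h in enumerate(heads):
            if h >= 0:
                scores[h][m] += 1.0
    return eisner(scores)

# Two parsers agree, one disagrees; reparsing recovers the majority tree.
trees = [[-1, 0, 1, 1], [-1, 0, 1, 1], [-1, 2, 0, 2]]
print(reparse(trees, 4))  # [-1, 0, 1, 1]
```

With real-valued parser scores in place of unit votes, the same decoder serves both as a single-model parser and as the ensemble combiner.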

  10. Stage III: Global Transition-based Parsing* • Exact O(n³) decoders • Arc-eager and arc-hybrid systems • Large-margin global training • Dynamic programming (Huang and Sagae, 2010; Kuhlmann, Gómez-Rodríguez and Satta, 2011) — *Shi, Huang and Lee (2017, EMNLP)
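
For readers unfamiliar with the transition systems named above, here is a sketch of the arc-hybrid mechanics: a stack, a buffer, and three actions that incrementally build the arc set. The deck's parsers search over all action sequences exactly via dynamic programming; this sketch only applies one given sequence, and the example sentence is made up.

```python
def parse_arc_hybrid(n_words, actions):
    """Apply an arc-hybrid action sequence to a sentence of n_words tokens.
    Words are 1..n_words; 0 is the artificial root. Returns heads[m] = h.
      shift:  move the front of the buffer onto the stack
      left:   attach the stack top to the front of the buffer, pop it
      right:  attach the stack top to the element below it, pop it
    """
    stack, buffer = [0], list(range(1, n_words + 1))
    heads = {}
    for a in actions:
        if a == "shift":
            stack.append(buffer.pop(0))
        elif a == "left":       # arc buffer-front -> stack-top
            heads[stack.pop()] = buffer[0]
        elif a == "right":      # arc second-on-stack -> stack-top
            m = stack.pop()
            heads[m] = stack[-1]
    return heads

# Toy sentence "She reads books": subject 2->1, root 0->2, object 2->3.
actions = ["shift", "left", "shift", "shift", "right", "right"]
print(sorted(parse_arc_hybrid(3, actions).items()))  # [(1, 2), (2, 0), (3, 2)]
```

Note that left-arc decisions look at the stack top and the buffer front only, which is exactly why the compact two-position feature set on the next slide suffices.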

  11. Stage III: Compact Feature Set — each decision is scored from only two positions: Eisner’s: head, modifier; Arc-eager: stack top, buffer top; Arc-hybrid: stack top, buffer top. Scoring function: deep bi-affine (Dozat and Manning, 2017)
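
The bi-affine scoring function named above combines a bilinear term over the two position vectors with a linear term over their concatenation. A minimal sketch with random toy vectors; in the Dozat and Manning (2017) setup the head and modifier vectors would first pass through separate MLPs over the bi-LSTM states (omitted here), and the dimension is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # toy dimension for the head/modifier vectors

U = rng.normal(size=(D, D))   # bilinear interaction term
w = rng.normal(size=2 * D)    # linear term over [head; modifier]
b = 0.0                       # scalar bias

def biaffine_score(head_vec, mod_vec):
    """score(h, m) = h^T U m + w^T [h; m] + b."""
    return (head_vec @ U @ mod_vec
            + w @ np.concatenate([head_vec, mod_vec]) + b)

h = rng.normal(size=D)
m = rng.normal(size=D)
print(biaffine_score(h, m))
```

Because only two vectors enter the score, all candidate decisions can be scored with a pair of batched matrix products, which is part of the efficiency story on slide 21.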

  12. Stage III: Ensembling — LAS: each single parser (arc-eager, arc-hybrid, Eisner’s) scores between 73.75 and 74.32; the full ensemble reaches 75.00

  13. Stage IV: Arc Labeling — a multi-layer perceptron over concat(head, modifier) predicts the dependency label (nsubj, obj, …)
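
The labeler on slide 13 can be sketched as a one-hidden-layer MLP that maps the concatenated head and modifier vectors to label scores. The label set, dimensions, and random weights below are illustrative only; a trained model would learn W1 and W2.

```python
import numpy as np

rng = np.random.default_rng(2)
LABELS = ["nsubj", "obj", "obl"]   # tiny illustrative label set
D, H = 4, 8                        # vector dimension, hidden layer size

W1 = rng.normal(scale=0.5, size=(H, 2 * D))
W2 = rng.normal(scale=0.5, size=(len(LABELS), H))

def label_arc(head_vec, mod_vec):
    """One-hidden-layer MLP over concat(head, modifier) -> best label."""
    x = np.concatenate([head_vec, mod_vec])
    hidden = np.tanh(W1 @ x)
    logits = W2 @ hidden
    return LABELS[int(np.argmax(logits))]

print(label_arc(rng.normal(size=D), rng.normal(size=D)))
```

Keeping labeling as a separate stage lets the unlabeled parsers stay small and fast, with labels filled in after the tree structure is fixed.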

  14. Stage IV: Effect of Ensembling the Labeler — LAS improves from 74.69 (single labeler) to 75.00 (full ensemble)

  15. Results — Official Ranking:
     Big Treebanks       2
     Small Treebanks     1
     PUD Treebanks       2
     Surprise Languages  1
     Overall             2

  16. Strategies — Small Treebanks: first train one combined model on {fr, fr_partut, fr_sequoia} for all tasks; then fine-tune copies of it on fr, fr_partut and fr_sequoia separately, yielding a treebank-specific model for each
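
The two-stage protocol above can be sketched abstractly. The "model" and "training" below are deliberate placeholders (a single parameter nudged toward the data mean), standing in for the parser and its training loop; the per-treebank data is made up. The point is only the control flow: train once on the union, then branch into per-treebank fine-tuning from the shared starting point.

```python
def train(model, data, epochs, lr):
    """Placeholder 'training': nudge the parameter toward the data mean."""
    for _ in range(epochs):
        target = sum(data) / len(data)
        model["w"] += lr * (target - model["w"])
    return model

# Hypothetical per-treebank data standing in for fr, fr_partut, fr_sequoia.
banks = {"fr": [1.0, 2.0], "fr_partut": [4.0], "fr_sequoia": [2.0, 3.0]}

# Stage 1: one combined model on the union of all treebanks.
combined = train({"w": 0.0}, [x for d in banks.values() for x in d], 50, 0.5)

# Stage 2: copy the combined model, then fine-tune on each treebank.
finetuned = {name: train(dict(combined), data, 10, 0.5)
             for name, data in banks.items()}
print(finetuned["fr_partut"]["w"])  # pulled from the shared value toward 4.0
```

As slide 17 shows, the combined model already beats single-treebank training, and the fine-tuned copies improve further on every treebank.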

  17. Results — Small Treebanks (UAS on dev sets, using gold segmentation):
     Train \ Test     fr      fr_partut  fr_sequoia
     itself           84.09   79.53      84.65
     Combined         87.57   85.57      82.80
     + Finetune       87.87   86.65      86.37

  18. Strategies — Surprise Languages: train a delexicalized parser on a source language selected via WALS; each word is represented as concat(UPOS tag embedding, bag of morphology tags via max pooling), fed into the bi-directional LSTM
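
The delexicalized representation above replaces word forms with universal annotations, so a parser trained on the source language transfers to the target. A minimal sketch; the UPOS/morphology inventories, dimensions, and random embeddings are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 4
UPOS = {"NOUN": 0, "VERB": 1, "PRON": 2}
MORPH = {"Case=Nom": 0, "Number=Sing": 1, "Person=3": 2, "Tense=Pres": 3}

E_upos = rng.normal(size=(len(UPOS), D))    # UPOS tag embeddings
E_morph = rng.normal(size=(len(MORPH), D))  # morphology-tag embeddings

def delex_features(upos, morph_tags):
    """concat(UPOS embedding, max-pooled bag of morphology embeddings)."""
    pooled = (np.max([E_morph[MORPH[t]] for t in morph_tags], axis=0)
              if morph_tags else np.zeros(D))
    return np.concatenate([E_upos[UPOS[upos]], pooled])

print(delex_features("VERB", ["Person=3", "Tense=Pres"]).shape)  # (8,)
```

Max pooling makes the morphology bag order-independent and fixed-size, so a word with any number of morphological features maps to the same input dimensionality.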

  19. Results — Surprise Languages:
     Target          Source*   Ranking
     Buryat          Hindi     2
     Upper Sorbian   Czech     1
     Kurmanji        Persian   1
     North Sámi      Finnish   1
     Average                   1
     *selected via WALS

  20. Implementation • Neural networks • Parsing algorithms • Hardware × 2 • Training time: approx. 1 week

  21. Efficiency (not benchmark results):
     Team                 Runtime (h)  LAS    CPUs  RAM (GB)
     Stanford (Stanford)  26.17        76.30  4     16
     C2L2 (Ithaca)        16.27        75.00  2     8
     IMS (Stuttgart)      8.88         74.42  12    64
     HIT-SCIR (Harbin)    5.96         72.11  1     8
     LATTICE (Paris)      4.64         70.93  8     32

  22. Combining Global Models for Parsing Universal Dependencies — Summary: • Global transition-based models (exact argmax over trees 𝑧 ∈ 𝒵) • Ensemble • Two-stage fine-tuning — Code: https://github.com/CoNLL-UD-2017/C2L2 — Team C2L2: Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng
