Soft Cross-lingual Syntax Projection for Dependency Parsing
Zhenghua Li, Min Zhang, Wenliang Chen
{zhli13, minzhang, wlchen}@suda.edu.cn
Soochow University, China
Dependency parsing: a bilingual example
[Figure: dependency trees for the English sentence "$0 I eat the fish with a fork" (arcs labeled subj, root, det, obj, pmod) and the parallel Chinese sentence "$0 我 用 叉子 吃 鱼" (arcs labeled subj, obj, root, vv), with word alignments such as eat-吃 and fish-鱼.]
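A minimal sketch (ours, not from the slides) of how the running example can be encoded, assuming the usual head-array representation of dependency trees:

# Hypothetical encoding of the running example: a dependency tree over n
# words is a head array; index 0 is the artificial root "$".
en_words = ["$", "I", "eat", "the", "fish", "with", "a", "fork"]
en_heads = [-1, 2, 0, 4, 2, 2, 7, 5]   # I<-eat, eat<-$ (root), the<-fish, ...

# Chinese: 我 (I) 用 (use/with) 叉子 (fork) 吃 (eat) 鱼 (fish)
zh_words = ["$", "我", "用", "叉子", "吃", "鱼"]
zh_heads = [-1, 4, 4, 2, 0, 4]         # 吃 is the root; 用 (a verb) heads 叉子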
Big picture (semi-supervised)
[Figure: pipeline. An English Treebank trains an English Parser, which parses the English side of a bitext (e.g. "I love this game" / "我 爱 这 运动"). The English parse trees are projected into Chinese, yielding Chinese sentences with partial trees; together with the Chinese Treebank these form larger training data for a Chinese Parser.]
Syntax projection
[Figure: the running example; the English arc eat -> fish is projected through the word alignments (eat-吃, fish-鱼) to the Chinese arc 吃 -> 鱼.]
Challenges
- Syntactic non-isomorphism across languages
- Different annotation choices (guidelines)
- Partial (incomplete) parse trees resulting from projection
- Parsing errors on the source side
- Word alignment errors
Cross-language non-isomorphism
[Figure: in English, "with" is a preposition attached to "eat"; the corresponding Chinese word 用 ("use") is a verb, so the two trees are not isomorphic around this construction.]
Different annotation choices
[Figure: coordination structure as an example. Several alternative dependency annotations of "fish and bird", differing in whether a conjunct or the conjunction heads the structure.]
Challenges (revisited)
- Syntactic non-isomorphism across languages
- Different annotation choices (guidelines)
- Partial (incomplete) parse trees resulting from projection
- Parsing errors on the source side
- Word alignment errors
All these factors can lead to bad projections!
Why is it called soft projection?
- Project fewer but more reliable dependencies: put quality before quantity
- A careful/gentle/conservative projection
- Wrong projections -> training noise
Big picture (semi-supervised)
[Figure: the pipeline again, now with a filtering step: a baseline Chinese Parser filters the projected partial trees before they are combined with the Chinese Treebank into larger training data for the new Chinese Parser.]
Step 1: word alignment and English parsing on bitext
[Figure: the English Treebank trains an English Parser; the bitext (e.g. "I eat the fish with a fork" / "我 用 叉子 吃 鱼") is word-aligned and its English side is parsed.]
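A small sketch (ours; the input format is an assumption) of what Step 1 produces per sentence pair. We assume word alignments in the common Pharaoh "srcIdx-tgtIdx" format, e.g. from a tool such as GIZA++, and shift them to the 1-based, $0-rooted indexing used above:

def read_alignment(line):
    """Parse Pharaoh-style '0-0 1-3 3-4 4-1 6-2' (0-based English-Chinese
    index pairs) into 1-based (en, zh) pairs matching the $0-rooted trees."""
    pairs = []
    for token in line.split():
        e, z = token.split("-")
        pairs.append((int(e) + 1, int(z) + 1))
    return pairs

# I-我, eat-吃, fish-鱼, with-用, fork-叉子 ("the" and "a" stay unaligned)
align = read_alignment("0-0 1-3 3-4 4-1 6-2")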
Step 2: project the English trees into Chinese (direct correspondence assumption)
[Figure: each English dependency whose head and modifier are both word-aligned is copied onto the corresponding Chinese word pair, e.g. eat -> fish becomes 吃 -> 鱼; unaligned words are left unattached, so the result is a partial Chinese tree.]
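A sketch (our own simplification, assuming one-to-one alignments) of Step 2 under the direct correspondence assumption: an English arc is copied to Chinese whenever both its endpoints are aligned, and everything else is left unattached:

def project(en_heads, align):
    """en_heads: English head array; align: 1-based (en, zh) pairs.
    Returns a partial Chinese tree as a {modifier: head} dict."""
    en2zh = dict(align)
    en2zh[0] = 0                          # the two roots $0 correspond
    zh_heads = {}
    for m, h in enumerate(en_heads):
        if m in en2zh and h in en2zh:     # both endpoints must be aligned
            zh_heads[en2zh[m]] = en2zh[h]
    return zh_heads

zh_partial = project(
    [-1, 2, 0, 4, 2, 2, 7, 5],                   # English head array
    [(1, 1), (2, 4), (4, 5), (5, 2), (7, 3)])    # I-我 eat-吃 fish-鱼 with-用 fork-叉子
# -> {1: 4, 4: 0, 5: 4, 2: 4, 3: 2}; with sparser alignments the
#    result would cover only part of the Chinese sentence.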
Step 3: filter the projected structures with a baseline Chinese Parser
[Figure: the pipeline with the filtering step highlighted: the baseline Chinese Parser, trained on the Chinese Treebank, scores the projected partial trees.]
Relationship between probability and accuracy
[Figure: accuracy of dependencies plotted against their marginal probability under the baseline Chinese parser, motivating probability-based filtering.]
Step 3: filter the projected structures with the baseline Chinese Parser
[Figure: step-by-step animation on the running example: the baseline Chinese Parser scores each projected arc, and arcs with low marginal probability are removed, leaving a smaller but more reliable partial tree.]
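A sketch (ours; the function name and threshold value are illustrative, not the paper's) of the filtering rule: keep a projected arc only if the baseline Chinese parser also gives it a high marginal probability.

def filter_projection(zh_heads, arc_marginals, threshold=0.5):
    """zh_heads: partial tree {modifier: head}; arc_marginals[(h, m)] is the
    baseline parser's marginal probability of the arc h -> m. Arcs below
    the (illustrative) threshold are discarded."""
    return {m: h for m, h in zh_heads.items()
            if arc_marginals.get((h, m), 0.0) >= threshold}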
Step 4: combine the data to train a new Chinese Parser
[Figure: the filtered projected data (Chinese sentences with partial trees) is combined with the Chinese Treebank into larger training data for the new Chinese Parser.]
How to handle data with partial tree annotation
- Convert partial tree annotation into forest annotation (ambiguous labelings)
- For an unattached word, add links from all other words to it (see the sketch below)
[Figure: on the running example, an unattached word receives candidate arcs from every other word, including $0.]
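A sketch (ours) of the conversion: an attached word keeps its single projected head, while an unattached word gets every other word (and $0) as a candidate head.

def partial_tree_to_forest(zh_heads, n):
    """zh_heads: partial tree {modifier: head}; n: length excluding $0.
    Returns forest annotation {modifier: set of candidate heads}."""
    forest = {}
    for m in range(1, n + 1):
        if m in zh_heads:
            forest[m] = {zh_heads[m]}     # attached: single projected head
        else:                             # unattached: links from all
            forest[m] = {h for h in range(n + 1) if h != m}  # other words
    return forest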
How to handle data with partial tree annotation
- Maximize the mixed likelihood of manually labeled data with tree annotation and auto-collected data with forest annotation
- Tree annotation can be understood as a special case of forest annotation (a forest containing a single tree)
- How to train a parser using data with forest annotation?
Train with ambiguous labelings
- See Tackstrom+ 13 and several earlier papers
- Maximize the likelihood of the data: the probability of a forest is the sum of the probabilities of all the trees in the forest
- The training problem can be solved with the inside-outside algorithm
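In our own notation (symbols ours, not the slides'): with treebank data D_tree and projected forest data D_forest, the mixed objective can be written as

\mathcal{L}(\theta)
  = \sum_{(x,y) \in \mathcal{D}_{\mathrm{tree}}} \log p(y \mid x; \theta)
  + \sum_{(x,\mathcal{F}) \in \mathcal{D}_{\mathrm{forest}}} \log \sum_{y \in \mathcal{F}} p(y \mid x; \theta)

The first term is the special case |F| = 1; the gradient of the second term needs arc marginals restricted to the trees in F, which the inside-outside algorithm computes.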
Experiments
- Data statistics [table not recoverable]
- Parser: second-order dependency parser (McDonald & Pereira 06), CRF-based and probabilistic
- SGD training (20K + 1M training sentences)
Relationship between probability and accuracy
[Figure: accuracy of projected dependencies vs. their marginal probability under the baseline Chinese parser.]
Effect of filtering threshold
[Figure: parsing accuracy as a function of the filtering threshold; the projection ratio drops from 44% to 31% to 26% as the threshold increases.]
Supplement the projected structures with the baseline Chinese parser
- Even after filtering, the projected structures may still contain wrong dependencies
- Use the baseline Chinese Parser to add more high-prob dependencies (multiple candidate heads for a single word, decreasing potential noise); see the sketch after the figure below
Supplement the projected structures with the baseline Chinese parser
[Figure: on the running example, a word keeps its projected head and additionally receives another high-probability candidate head suggested by the baseline parser.]
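A sketch (ours; names and threshold illustrative) of the supplement rule from the backup slide: if the baseline parser suggests another high-probability head for a word, insert it as an additional candidate, so a wrong projection is less likely to be forced on the trainer.

def supplement(forest, arc_marginals, threshold=0.5):
    """forest: {modifier: set of candidate heads} after filtering.
    Adds any further head h that the baseline parser scores above the
    (illustrative) threshold, giving the word multiple candidate heads."""
    for (h, m), p in arc_marginals.items():
        if p >= threshold and m in forest:
            forest[m].add(h)
    return forest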
Effect of supplement threshold
[Figure: parsing accuracy as a function of the supplement threshold.]
Final results on CTB5 test
[Table/figure: final parsing accuracies on the CTB5 test set.]
Comparison with (Jiang+ 10) on CTB5X test
[Table/figure: parsing accuracies on the CTB5X test set compared with Jiang+ 10.]
Recent work on multilingual dependency parsing
- Semi-supervised: bilingual word reordering info (Huang & Sagae 09); project to build a local classifier (Jiang & Liu 10)
- Unsupervised: projection (Ganchev+ 09); delexicalized (McDonald+ 11; Tackstrom+ 12, 13); hybrid (McDonald+ 11; Ma & Xia 14)
Conclusions
- We propose a simple semi-supervised framework to derive high-quality labeled training data from bitext
- Use target-language marginal probabilities to control the quality of the projected structures (simple and effective)
- Use a forest-based training method to make use of partial annotations (a very general framework)
Future directions
- Project more dependencies from source-language parse trees?
  - When two target-language words align to the same source-language word?
  - More complex correspondences between source and target trees?
Future directions
- More elegant ways to handle:
  - word alignment errors (word alignment prob?)
  - source-language parsing errors (parsing prob?)
  - cross-lingual non-isomorphism (very difficult!)
  - annotation guideline differences
- Universal dependency parsing? (earlier invited talk by Prof. Nivre)
- Joint word alignment and bilingual dependency parsing? (handle all of the above issues in a unified framework)
Thanks for your time! Questions?
Build local classifiers via projection (Jiang & Liu 10)
- Semi-supervised; projects edges
- Step 1: projection to obtain dependency/non-dependency classification instances
- Step 2: build a target-language local dependency/non-dependency classifier
- Step 3: feed the outputs of the classifier into a supervised parser as extra weights at test time
Supplement the projected structures with the baseline Chinese parser
- If a word obtains a head from projection (and survives filtering) and the baseline Chinese parser suggests another high-prob candidate head,
- then insert the candidate head into the projected structure.
Multilingual dependency parsing has become a hot topic
- Pioneered by Hwa+ 05
- Motivations:
  - A more accurate parser for one language may help a less accurate one for another language (this paper)
  - A difficult syntactic ambiguity in one language may be easy to resolve in another
  - Rich labeled resources in one language can be transferred to build parsers for another (unsupervised)