
Soft Cross-lingual Syntax Projection for Dependency Parsing - PowerPoint PPT Presentation



  1. Soft Cross-lingual Syntax Projection for Dependency Parsing Zhenghua Li, Min Zhang, Wenliang Chen {zhli13, minzhang, wlchen}@suda.edu.cn Soochow University, China

  2. Dependency parsing  A bilingual example (figure): the English sentence "$0 I1 eat2 the3 fish4 with5 a6 fork7" is word-aligned with the Chinese sentence "$0 我1 用2 叉子3 吃4 鱼5" ("I eat fish with a fork"), with dependency arcs labeled root, subj, obj, det, pmod

  3. Big picture (semi-supervised) (diagram): an English Treebank trains an English Parser; the English side of the Bitext (e.g., "I love this game" / "我 爱 这 运动") is parsed, and the English parse trees are projected into Chinese, yielding labeled data with partial Chinese trees; combined with the Chinese Treebank, this gives larger training data for the Chinese Parser

  4. Syntax projection (figure): the English tree for "$0 I1 eat2 the3 fish4 with5 a6 fork7" is projected through word alignments (e.g., eat↔吃, fish↔鱼) onto "$0 我1 用2 叉子3 吃4 鱼5"

  5. Challenges  Syntactic non-isomorphism across languages  Different annotation choices (guidelines)  Partial (incomplete) parse trees resulting from projection  Parsing errors on the source side  Word alignment errors

  6. Cross-language non-isomorphism (figure): in "$0 I1 eat2 the3 fish4 with5 a6 fork7" the instrument is a prepositional phrase headed by with5, while in "$0 我1 用2 叉子3 吃4 鱼5" it is headed by the verb 用2 (use), so the two trees are not isomorphic

  7. Different annotation choices  Coordination structure as an example (figure: five different dependency annotations of the phrase "fish and bird")

  8. Challenges  Syntactic non-isomorphism across languages  Different annotation choices (guidelines)  Partial (incomplete) parse trees resulting from projection  Parsing errors on the source side  Word alignment errors. All these factors can lead to bad projections!

  9. Why is it called soft projection?  Project fewer but more reliable dependencies; put quality before quantity  Careful/gentle/conservative projection  Wrong projections become training noise

  10. Big picture (semi-supervised) (diagram): the same pipeline, with a filtering step added; the projected partial Chinese trees are filtered before being combined with the Chinese Treebank into larger training data for the new Chinese Parser

  11. Step 1: word alignment and English parsing on bitext (figure): the English Parser, trained on the English Treebank, parses the English side of the Bitext; word alignments link "$0 I1 eat2 the3 fish4 with5 a6 fork7" to "$0 我1 用2 叉子3 吃4 鱼5"

  12. Step 2: project the English trees into Chinese (direct correspondence assumption) (diagram): the English parse trees on the Bitext are projected through the word alignments into Chinese labeled data with partial Chinese trees

  13. Step 2: project the English trees into Chinese (direct correspondence assumption) (figure): the dependencies of "$0 I1 eat2 the3 fish4 with5 a6 fork7" are carried across the alignments onto "$0 我1 用2 叉子3 吃4 鱼5"
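The projection under the direct correspondence assumption can be sketched as follows (an illustrative sketch, not the paper's code; the function names and the one-to-one alignment representation are my assumptions):

```python
def project(en_heads, align):
    """en_heads: {en_dependent: en_head} English arcs ($ at index 0).
    align: {en_index: zh_index} one-to-one word alignments (0 -> 0).
    Under direct correspondence, an English arc is projected onto
    Chinese iff both of its endpoints are aligned."""
    zh_heads = {}
    for en_dep, en_head in en_heads.items():
        if en_dep in align and en_head in align:
            zh_heads[align[en_dep]] = align[en_head]
    return zh_heads

# Toy version of the running example (alignments I<->我, eat<->吃, fish<->鱼):
en_arcs = {1: 2, 2: 0, 4: 2, 3: 4}   # I<-eat, eat<-$, fish<-eat, the<-fish
align = {0: 0, 1: 1, 2: 4, 4: 5}     # $->$, I->我1, eat->吃4, fish->鱼5
print(project(en_arcs, align))       # {1: 4, 4: 0, 5: 4}
```

Unaligned words (here the3) contribute no arc, which is one way the projected Chinese trees end up partial.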

  14. Step 3: filter projected structures with the baseline Chinese Parser (diagram): the baseline Chinese Parser, trained on the Chinese Treebank, filters the projected partial Chinese trees

  15. Relationship between prob and accuracy

  16. Step 3: filter projected structures with the baseline Chinese Parser (figure): the baseline Chinese Parser scores the dependencies projected onto "$0 我1 用2 叉子3 吃4 鱼5", where the 用 (use) / eat mismatch appears

  17. Step 3: filter projected structures with the baseline Chinese Parser (figure: the filtering example, continued)

  18. Step 3: filter projected structures with the baseline Chinese Parser (figure: the filtering example, continued)
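The filtering in Step 3 can be sketched as follows (a minimal sketch under my own naming; the threshold value is hypothetical, and the real system obtains marginal probabilities from a CRF-based dependency parser rather than a lookup table):

```python
# Illustrative sketch of Step 3: keep a projected dependency only if the
# baseline Chinese parser assigns it a high marginal probability.

PROB_THRESHOLD = 0.5  # hypothetical filtering threshold

def filter_projected(projected_heads, marginal_prob):
    """projected_heads: {dependent: head} arcs obtained by projection.
    marginal_prob: callable (head, dep) -> the baseline parser's marginal
    probability of that arc. Returns the surviving partial tree."""
    return {dep: head
            for dep, head in projected_heads.items()
            if marginal_prob(head, dep) >= PROB_THRESHOLD}

# Toy example over "$0 我1 用2 叉子3 吃4 鱼5" (probabilities invented):
projected = {1: 4, 4: 0, 5: 4}               # 我<-吃, 吃<-$, 鱼<-吃
probs = {(4, 1): 0.9, (0, 4): 0.95, (4, 5): 0.3}
partial = filter_projected(projected, lambda h, d: probs.get((h, d), 0.0))
print(partial)                               # {1: 4, 4: 0} -- low-prob arc dropped
```

Raising the threshold projects fewer but more reliable arcs, which is exactly the quality-before-quantity trade-off the slides describe.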

  19. Step 4: combine the data to train a new Chinese Parser (diagram): the filtered projected data and the Chinese Treebank are combined into larger training data for the new Chinese Parser

  20. How to handle data with partial tree annotation  Convert the partial tree annotation into forest annotation (ambiguous labelings)  For an unattached word, add links from all other words to it (figure: unattached words in "$0 我1 用2 叉子3 吃4 鱼5" receive candidate arcs from every other word)
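This conversion can be sketched as follows (function and variable names are my own, not from the paper): each attached word keeps its single projected head, while each unattached word receives every other word, including the root $, as a candidate head.

```python
def to_forest(n, partial_heads):
    """Convert a partial tree over words 1..n (root $ at index 0) into
    forest annotation (ambiguous labelings): a set of candidate heads per
    word. Unattached words get links from all other words."""
    return {dep: ({partial_heads[dep]} if dep in partial_heads
                  else {h for h in range(n + 1) if h != dep})
            for dep in range(1, n + 1)}

# "$0 我1 用2 叉子3 吃4 鱼5": suppose only 吃<-$ and 我<-吃 survived filtering
forest = to_forest(5, {4: 0, 1: 4})
print(forest[2])   # unattached 用2: candidate heads {0, 1, 3, 4, 5}
print(forest[4])   # attached 吃4: {0}
```

Every tree compatible with the surviving arcs is contained in this forest, so no hard (possibly wrong) commitment is made for the unattached words.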

  21. How to handle data with partial tree annotation  Maximize the mixed likelihood of manually labeled data (tree annotation) and automatically collected data (forest annotation)  Tree annotation can be understood as a special case of forest annotation  How to train a parser using data with forest annotation?

  22. Train with ambiguous labelings  Refer to Tackstrom+ 13 and several earlier papers  Maximize the likelihood of the data: maximize the probability of each forest, i.e., the sum of the probabilities of all the trees in the forest  The training problem can be solved with the inside-outside algorithm
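Written out explicitly (a reconstruction from the slide text; here $\mathcal{F}_i$ denotes the forest for sentence $x_i$ and $y$ ranges over dependency trees):

```latex
L(\theta) = \sum_{i} \log p(\mathcal{F}_i \mid x_i; \theta)
          = \sum_{i} \log \sum_{y \in \mathcal{F}_i} p(y \mid x_i; \theta)
```

For a fully annotated sentence the forest contains a single tree, so the term reduces to the ordinary log-likelihood; the inner sum and its gradient can be computed with the inside-outside algorithm.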

  23. Experiments  Data statistics  Parser: second-order dependency parser (McDonald & Pereira 06), CRF-based and probabilistic  SGD training (20K + 1M training data)

  24. Relationship between prob and accuracy

  25. Effect of filtering threshold (figure; projection ratios: 44%, 31%, 26%)

  26. Supplement the projected structures with the baseline Chinese parser  Even after filtering, the projected structures may still contain wrong dependencies  Use the baseline Chinese Parser to add more high-probability dependencies (multiple candidate heads for a single word decrease the potential noise)

  27. Supplement the projected structures with the baseline Chinese parser (figure: the running example "$0 我1 用2 叉子3 吃4 鱼5" before supplementing)

  28. Supplement the projected structures with the baseline Chinese parser (figure: extra high-probability candidate heads added to the partial tree)

  29. Effect of supplement threshold

  30. Effect of supplement threshold

  31. Effect of supplement threshold

  32. Final results on CTB5 test

  33. Comparison with (Jiang+ 10) on CTB5X test

  34. Recent works on multilingual dependency parsing  Semi-supervised  Bilingual word reordering info (Huang & Sagae 09)  Project to build a local classifier (Jiang & Liu 10)  Unsupervised  Projection (Ganchev+ 09)  Delexicalized (McDonald+ 11; Tackstrom+ 12, 13)  Hybrid (McDonald+ 11; Ma & Xia 14)

  35. Conclusions  We propose a simple semi-supervised framework to derive high-quality labeled training data from bitext  Use target-language marginal probabilities to control the quality of the projected structures (quite simple and effective)  Use a forest-based training method to make use of partial annotations (a very general framework)

  36. Future directions  Project more dependencies from source-language parse trees?  When two target-language words align to the same source-language word?  More complex correspondences between source and target trees?

  37. Future directions  More elegant ways to handle  word alignment errors (word alignment prob?)  source-language parsing errors (parsing prob?)  cross-lingual non-isomorphism (very difficult!)  annotation guideline differences  Universal dependency parsing? (earlier invited talk by Prof. Nivre)  Joint word alignment and bilingual dependency parsing?  handle all of the above issues in a unified framework

  38. Thanks for your time! Questions?

  39. Build local classifiers via projection (Jiang & Liu 10)  Semi-supervised; project edges  Step 1: projection to obtain dependency/non-dependency classification instances  Step 2: build a target-language local dependency/non-dependency classifier  Step 3: feed the outputs of the classifier into a supervised parser as extra weights during the test phase

  40. Supplement the projected structures with the baseline Chinese parser  If a word obtains a head from projection (and survives filtering) and the baseline Chinese parser suggests another high-probability candidate head, then insert that candidate head into the projected structure
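This if/then rule can be sketched as follows (names and the threshold are illustrative assumptions, not the paper's implementation): a word keeps its projected head and may gain additional baseline-parser candidates, so a single word can end up with multiple candidate heads in the forest.

```python
def supplement(partial_heads, candidate_arcs, threshold=0.5):
    """partial_heads: {dep: head} arcs surviving projection + filtering.
    candidate_arcs: {(head, dep): prob} scores from the baseline Chinese
    parser. Each filtered word keeps its projected head and additionally
    gains any high-probability candidate heads, yielding a small forest
    (multiple candidate heads) for that word."""
    heads = {dep: {h} for dep, h in partial_heads.items()}
    for (head, dep), p in candidate_arcs.items():
        if dep in heads and p >= threshold:
            heads[dep].add(head)
    return heads

# Toy example: 用2 kept head 吃4 from projection; the baseline parser also
# rates $0 as a plausible head for it (probabilities invented).
print(supplement({2: 4}, {(0, 2): 0.7, (3, 2): 0.1}))   # {2: {0, 4}}
```

Keeping both candidates instead of overruling the projection is what lets the forest-based training decide between them later.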

  41. Multilingual dependency parsing has become a hot topic  Pioneered by Hwa+ 05  Motivations:  A more accurate parser for one language may help a less accurate one for another language (this paper)  A difficult syntactic ambiguity in one language may be easy to resolve in another  Rich labeled resources in one language can be transferred to build parsers for another language (unsupervised)
