Discontinuous Statistical Machine Translation with Target-Side Dependency Syntax


  1. Discontinuous Statistical Machine Translation with Target-Side Dependency Syntax. Nina Seemann, Andreas Maletti. University of Stuttgart, Institute for Natural Language Processing, Pfaffenwaldring 5b, 70569 Stuttgart. September 17, 2015. WMT 2015.

  2. Outline: Introduction, Transformation Process, Discontinuous Translation Model, Experiments, Conclusion

  3. Syntax-based Machine Translation
     [Figure: English and foreign sides related at the phrase, syntax and semantics levels.]
     ◮ Source language side is a string
     ◮ Target language side requires syntactic annotations

  4. Discontinuous Target Languages. We want to translate from English to Russian and Polish:
     ◮ morphologically rich
     ◮ free word order languages
     ◮ grammatically agreeing parts spread out over the whole sentence
     ◮ syntax difficult to express in terms of constituency structure
     ◮ not parseable by constituency parsers, but by dependency parsers

  5. Outline: Introduction, Transformation Process, Discontinuous Translation Model, Experiments, Conclusion

  6. Dependency Parsing
     [Figure: dependency parse of the Polish phrase "konwencja haska w sprawie obligacji ( głosowanie )" with edge labels ROOT, COMP, PAR, MWE and ADJUNCT, and part-of-speech tags S A P S S I S I.]

  7. Non-projective Dependency Parse
     ◮ An edge h → d is projective iff h dominates all nodes in the linear span between h and d.
     ◮ A dependency parse is projective iff all its edges are projective.
     [Figure: dependency parse of "konwencja haska w sprawie obligacji ( głosowanie )" as on slide 6.]
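
The two definitions on this slide can be checked mechanically. The following is a minimal sketch, assuming a parse is stored as a list `heads` in which `heads[i]` is the index of token i's head and the root has head -1. This representation, the function names and the toy parses are illustrative assumptions, not taken from the slides, and the toy parses are not the Polish example from the figure.

```python
from typing import List


def dominates(heads: List[int], ancestor: int, node: int) -> bool:
    """True if `ancestor` is `node` itself or lies on node's path to the root.

    Assumes `heads` encodes a tree (no cycles); the root has head -1.
    """
    while node != -1:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def is_projective_edge(heads: List[int], h: int, d: int) -> bool:
    """Edge h -> d is projective iff h dominates every token strictly between h and d."""
    lo, hi = min(h, d), max(h, d)
    return all(dominates(heads, h, k) for k in range(lo + 1, hi))


def is_projective(heads: List[int]) -> bool:
    """A parse is projective iff all of its edges are projective."""
    return all(
        is_projective_edge(heads, h, d)
        for d, h in enumerate(heads)
        if h != -1
    )


# Hypothetical 4-token parses (token 0 is the root in both).
projective_parse = [-1, 0, 1, 0]
crossing_parse = [-1, 3, 0, 0]   # edge 3 -> 1 spans token 2, which 3 does not dominate

print(is_projective(projective_parse))
print(is_projective(crossing_parse))
```

In the second parse, token 2 sits between head 3 and dependent 1 but attaches to the root, so the edge 3 → 1 violates the definition.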

  8-11. Lifting [Kahane et al., 1998]
     [Figure: lifting shown in four animation steps.]

  12. Lifting [Nivre and Nilsson, 2005]. Refines the lifting process: the same operation is performed, but the lifting is documented in the edge labels (⇒ path).
     [Figure: lifted parse of "konwencja haska w sprawie obligacji ( głosowanie )" with edge labels ROOT, MWE↑, COMP, PAR, ADJUNCT and MWE↓, and part-of-speech tags S A P S S I S I.]
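
A rough sketch of path-marked lifting may help make the idea concrete. It assumes the same `heads` list representation as above (root has head -1), repeatedly reattaches the dependent of the shortest non-projective edge to its head's head, marks the lifted edge with ↑ and marks the edge that was passed over with ↓. The selection order, the function names and the toy input are assumptions for illustration; this is not claimed to be the exact procedure used by the authors or by any parser implementation.

```python
from typing import List, Optional, Tuple

UP, DOWN = "↑", "↓"


def dominates(heads: List[int], ancestor: int, node: int) -> bool:
    """True if `ancestor` is `node` or one of its ancestors (root has head -1)."""
    while node != -1:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def shortest_nonprojective(heads: List[int]) -> Optional[int]:
    """Return the dependent of the shortest non-projective edge, or None if there is none."""
    best: Optional[Tuple[int, int]] = None
    for d, h in enumerate(heads):
        if h == -1:
            continue
        lo, hi = min(h, d), max(h, d)
        if not all(dominates(heads, h, k) for k in range(lo + 1, hi)):
            if best is None or hi - lo < best[0]:
                best = (hi - lo, d)
    return None if best is None else best[1]


def lift_with_path(heads: List[int], labels: List[str]) -> Tuple[List[int], List[str]]:
    """Lift edges until the parse is projective, recording each lift in the labels."""
    heads, labels = list(heads), list(labels)
    while (d := shortest_nonprojective(heads)) is not None:
        h = heads[d]              # h is never the root: root edges are always projective
        if UP not in labels[d]:
            labels[d] += UP       # the lifted edge
        if DOWN not in labels[h]:
            labels[h] += DOWN     # the edge on the path down to the original head
        heads[d] = heads[h]       # reattach d one level higher
    return heads, labels


# Hypothetical input: token 1 depends on token 3 across token 2.
new_heads, new_labels = lift_with_path(
    [-1, 3, 0, 0], ["ROOT", "MWE", "ADJUNCT", "MWE"]
)
print(new_heads)
print(new_labels)
```

Each lift moves a dependent strictly closer to the root, so the loop terminates; in the worst case every token ends up attached to the root, which is trivially projective.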

  13-18. Conversion from dependency to constituency tree
     [Figure: constituency tree for "konwencja haska w sprawie obligacji ( głosowanie )", shown in several animation steps, with inner nodes ROOT, ADJUNCT, COMP, PAR, MWE↓ and MWE↑ and preterminals S, A, P and I above the words.]
     Preserves discontinuities!
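
One plausible reading of the figure is that each token's incoming edge label becomes an inner node, under which sit the token itself (below its part-of-speech tag) and the subtrees of its dependents. The sketch below implements only that reading. It is an assumption drawn from the node labels visible on the slide, not a confirmed description of the authors' conversion, and the words, tags and labels in the example are hypothetical.

```python
from typing import List, Optional, Tuple

# A tree is (label, children); a preterminal is (pos_tag, [word]).
Tree = Tuple[str, list]


def to_constituency(
    words: List[str],
    tags: List[str],
    heads: List[int],
    labels: List[str],
    node: Optional[int] = None,
) -> Tree:
    """Turn the dependency subtree rooted at `node` into a labelled constituent."""
    if node is None:
        node = heads.index(-1)  # start at the root token
    # The head word becomes a preterminal; each dependent contributes a subtree.
    parts = [(node, (tags[node], [words[node]]))]
    for d, h in enumerate(heads):
        if h == node:
            parts.append((d, to_constituency(words, tags, heads, labels, d)))
    parts.sort(key=lambda p: p[0])  # order children by surface position of their head word
    return (labels[node], [tree for _, tree in parts])


def bracket(tree: Tree) -> str:
    """Render a tree in bracket notation."""
    label, children = tree
    inner = " ".join(c if isinstance(c, str) else bracket(c) for c in children)
    return f"({label} {inner})"


# Hypothetical lifted parse: all three dependents attach to the root token.
tree = to_constituency(
    words=["w0", "w1", "w2", "w3"],
    tags=["S", "P", "A", "S"],
    heads=[-1, 0, 0, 0],
    labels=["ROOT", "MWE↑", "ADJUNCT", "MWE↓"],
)
print(bracket(tree))
```

Under this reading the ↑ and ↓ marks introduced by lifting survive as part of the nonterminal labels.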

  19. Outline: Introduction, Transformation Process, Discontinuous Translation Model, Experiments, Conclusion

  20. String-to-Tree Multi Bottom-up Tree Transducer. Example rules (target-side tree fragments not recoverable; their root labels include P, ADJUNCT, S, I, IMP and PUNCT):
     ◮ lexical continuous rule: motivated by → motywowane
     ◮ lexical discontinuous rule: this is not something that → nie jest to coś co
     ◮ structural continuous rule: technologies X → technologii MWE
     ◮ structural discontinuous rules: there are X that X → są MWE; it needs to X → musi MWE

  21. Translation Model. Standard log-linear model with the following 8 features:
     ◮ ...
     ◮ gap penalty $100^{1-c}$ (c is the number of target tree fragments)
     We use the MBOT-Moses decoder [Braune et al. 2013]:
     ◮ standard Moses syntax-based decoder
     ◮ extended to handle target-side discontinuities
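
The gap penalty is easy to tabulate. The sketch below computes 100^(1-c) and shows how such a feature would enter a generic log-linear score, that is, a weighted sum of log feature values. The function names, the weights and the second feature are hypothetical placeholders; the slide does not list the other seven features, and nothing here reproduces the decoder's actual code.

```python
import math
from typing import Dict


def gap_penalty(c: int) -> float:
    """Gap penalty 100^(1 - c), where c is the number of target tree fragments."""
    return 100.0 ** (1 - c)


def log_linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Generic log-linear score: sum over features of weight * log(feature value)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


# A continuous rule (c = 1) is not penalised; each extra fragment costs a factor of 100.
for c in (1, 2, 3):
    print(c, gap_penalty(c))

# Hypothetical feature values and weights, for illustration only.
features = {"gap_penalty": gap_penalty(2), "hypothetical_feature": 0.5}
weights = {"gap_penalty": 0.1, "hypothetical_feature": 1.0}
print(log_linear_score(features, weights))
```

In log space the penalty contributes weight · (1 − c) · log 100, so it is linear in the number of gaps.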

  22. Outline: Introduction, Transformation Process, Discontinuous Translation Model, Experiments, Conclusion

  23. Setup
                        English to Polish            English to Russian
      training data     7th EuroParl corpus          WMT 2014
      language model    5-gram SRILM                 5-gram SRILM
      tuning data       cut from EuroParl (≈ 3k)     WMT 2014
      test data         cut from EuroParl (≈ 3k)     WMT 2014

  24. Training Pipeline
     Target side:
     ◮ TreeTagger [Schmid 1996]
     ◮ MaltParser [Nivre et al. 2006, Sharoff & Nivre 2011]
     ◮ Path-Lifting
     ◮ Conversion into constituency tree
     Parallel Data:
     ◮ tokenized and lowercased
     ◮ length-ratio filtered up to length 80
     ◮ word alignments by GIZA++ [Och & Ney 2003] with grow-diag-final-and
     Tuning: Minimum error rate training [Och 2003]
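
The parallel-data cleaning steps can be sketched as follows, assuming the input is already tokenized with whitespace between tokens. The maximum length of 80 comes from the slide; the ratio threshold of 9.0 and the function names are assumptions made for this illustration, since the slides do not state which ratio or which tool was used.

```python
from typing import Iterable, Iterator, Tuple


def keep_pair(src: str, tgt: str, max_len: int = 80, max_ratio: float = 9.0) -> bool:
    """Keep a sentence pair if both sides are non-empty, short enough and of similar length."""
    s, t = src.split(), tgt.split()
    if not s or not t:
        return False
    if len(s) > max_len or len(t) > max_len:
        return False
    # max_ratio is an assumed threshold, not a value given on the slide.
    return max(len(s), len(t)) / min(len(s), len(t)) <= max_ratio


def preprocess(pairs: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Lowercase both sides and drop pairs that fail the length and ratio checks."""
    for src, tgt in pairs:
        src, tgt = src.lower().strip(), tgt.lower().strip()
        if keep_pair(src, tgt):
            yield src, tgt


corpus = [
    ("Hello World", "Witaj Świecie"),   # kept
    ("a", " ".join(["b"] * 20)),        # dropped: length ratio 20 exceeds the threshold
    ("", "x"),                          # dropped: empty source side
]
print(list(preprocess(corpus)))
```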

  25. Experimental Results
      Translation task      System          BLEU
      English-to-Polish     Baseline        21.29
                            MBOT            23.43
                            GHKM            23.31
                            Phrase-based    24.35
                            Hiero           24.56
      English-to-Russian    Baseline        24.66
                            MBOT            26.13
                            GHKM            25.97
                            Phrase-based    27.90
                            Hiero           27.72

  26. Losses across the systems
