Introduction Transformation Process Discontinuous Translation Model Experiments Conclusion Discontinuous Statistical Machine Translation with Target-Side Dependency Syntax Nina Seemann Andreas Maletti University of Stuttgart – Institute for Natural Language Processing – Pfaffenwaldring 5b 70569 Stuttgart September 17, 2015 Nina Seemann Discontinuous SMT with Target-Side Dependency Syntax WMT 2015 1 ·
Outline
◮ Introduction
◮ Transformation Process
◮ Discontinuous Translation Model
◮ Experiments
◮ Conclusion
Syntax-based Machine Translation
[Figure: diagram relating English and the foreign language, with levels labeled semantics, syntax, phrase.]
◮ Source language side is a string
◮ Target language side requires syntactic annotations
Discontinuous Target Languages
We want to translate from English to Russian and Polish:
◮ morphologically rich
◮ free word order languages
◮ grammatically agreeing parts spread out over the whole sentence
◮ syntax difficult to express in terms of constituency structure
◮ not parseable by constituency parsers
◮ but by dependency parsers
Dependency Parsing
[Figure: dependency parse of the Polish phrase "konwencja haska w sprawie obligacji ( głosowanie )". POS tags in word order: S A P S S I S I. Edge labels used: ROOT, COMP, PAR, MWE, ADJUNCT.]
Non-projective Dependency Parse
◮ An edge h → d is projective iff h dominates all nodes in the linear span between h and d.
◮ A dependency parse is projective iff all its edges are projective.
[Figure: the dependency parse of "konwencja haska w sprawie obligacji ( głosowanie )" shown again.]
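The projectivity definition can be checked mechanically. The following is a minimal sketch, assuming a parse is stored as a list `heads` in which `heads[d]` is the head of word `d`, words are numbered from 1, and index 0 is an artificial root. The function names and the two toy parses are illustrative only and are not taken from the talk.

```python
from typing import List


def dominates(heads: List[int], h: int, n: int) -> bool:
    """Return True if h is n itself or an ancestor of n (0 is the artificial root)."""
    while n != 0:
        if n == h:
            return True
        n = heads[n]
    return h == 0


def edge_is_projective(heads: List[int], d: int) -> bool:
    """Check the edge heads[d] -> d: h must dominate every word strictly between h and d."""
    h = heads[d]
    lo, hi = min(h, d), max(h, d)
    return all(dominates(heads, h, k) for k in range(lo + 1, hi))


def is_projective(heads: List[int]) -> bool:
    """A parse is projective iff all of its edges are projective."""
    return all(edge_is_projective(heads, d) for d in range(1, len(heads)))


# Toy parses (hypothetical); heads[0] is an unused placeholder.
projective_parse = [-1, 0, 1, 1]         # words 2 and 3 both depend on word 1
nonprojective_parse = [-1, 0, 4, 1, 1]   # edge 4 -> 2 spans word 3, which 4 does not dominate
```

With these toy inputs, `is_projective(projective_parse)` holds and `is_projective(nonprojective_parse)` does not, because only the edge into word 2 violates the condition.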
Lifting [Kahane et al., 1998]
[Figure: step-by-step illustration of lifting; no text recoverable.]
Lifting [Nivre and Nilsson, 2005]
Refines the lifting process: the same operation is performed, but the lifting is documented in the edge labels (⇒ path).
[Figure: the lifted parse of "konwencja haska w sprawie obligacji ( głosowanie )". POS tags in word order: S A P S S I S I. Edge labels used: ROOT, COMP, PAR, ADJUNCT, MWE↑, MWE↓.]
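A minimal sketch of lifting with label marking follows, assuming the same `heads` representation as in the projectivity definition (index 0 is an artificial root) plus a parallel list `labels`. A non-projective edge is repaired by reattaching the dependent to its head's head; the lifted edge's label receives "↑" and the label of the former head's own edge receives "↓", mirroring the markers on the slide. Processing the shortest non-projective edge first, the function names, and the toy parse are assumptions for illustration; the exact encoding in the cited work may differ.

```python
from typing import List, Tuple

UP, DOWN = "↑", "↓"


def dominates(heads: List[int], h: int, n: int) -> bool:
    """Return True if h is n itself or an ancestor of n (0 is the artificial root)."""
    while n != 0:
        if n == h:
            return True
        n = heads[n]
    return h == 0


def nonprojective_arcs(heads: List[int]) -> List[int]:
    """Dependents d whose incoming edge heads[d] -> d is non-projective."""
    bad = []
    for d in range(1, len(heads)):
        h = heads[d]
        lo, hi = min(h, d), max(h, d)
        if not all(dominates(heads, h, k) for k in range(lo + 1, hi)):
            bad.append(d)
    return bad


def lift_with_path(heads: List[int], labels: List[str]) -> Tuple[List[int], List[str]]:
    """Lift non-projective edges until the parse is projective, marking labels."""
    heads, labels = list(heads), list(labels)
    while True:
        bad = nonprojective_arcs(heads)
        if not bad:
            return heads, labels
        d = min(bad, key=lambda x: abs(heads[x] - x))  # shortest edge first (assumption)
        h = heads[d]              # never 0: edges from the root are always projective
        if UP not in labels[d]:
            labels[d] += UP       # mark the lifted edge
        if DOWN not in labels[h]:
            labels[h] += DOWN     # mark the edge along the lifting path
        heads[d] = heads[h]       # reattach d one level up


# Toy parse (hypothetical): the edge 4 -> 2 is non-projective.
toy_heads = [-1, 0, 4, 1, 1]
toy_labels = ["", "ROOT", "MWE", "ADJUNCT", "MWE"]
lifted_heads, lifted_labels = lift_with_path(toy_heads, toy_labels)
```

On the toy parse, word 2 is reattached to word 1, its label becomes MWE↑, and the label of word 4 becomes MWE↓, so the original attachment can still be traced from the labels.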
Conversion from dependency to constituency tree
[Figure: constituency tree built from the lifted parse. Nonterminals: ROOT, ADJUNCT, COMP, PAR, MWE↓, MWE↑. Preterminals and words: S konwencja, A haska, P w, S sprawie, S obligacji, I (, S głosowanie, I ).]
Preserves discontinuities!
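One plausible reading of the conversion, sketched below, is that every word becomes a subtree whose root is the label of its incoming edge, with the word's POS tag as a preterminal and the subtrees of its dependents as siblings, all in surface order. This reading is an assumption based on the labels visible in the figure, not a confirmed description of the authors' procedure. The function name, the bracketed output format, and the two-word example are illustrative.

```python
from typing import List, Optional


def to_constituency(heads: List[int], labels: List[str], tags: List[str],
                    words: List[str], node: Optional[int] = None) -> str:
    """Render the subtree of `node` as a bracketed constituency tree.

    Assumes a single word attached to the artificial root 0 and a
    projective (already lifted) parse, so surface order is preserved.
    """
    if node is None:
        node = heads.index(0)  # the word whose head is the artificial root
    deps = [d for d in range(1, len(heads)) if heads[d] == node]
    parts = []
    for k in sorted(deps + [node]):
        if k == node:
            parts.append(f"({tags[k]} {words[k]})")  # POS preterminal over the word
        else:
            parts.append(to_constituency(heads, labels, tags, words, k))
    return f"({labels[node]} " + " ".join(parts) + ")"


# Two-word fragment; index 0 is a placeholder for the artificial root.
heads = [-1, 0, 1]
labels = ["", "ROOT", "ADJUNCT"]
tags = ["", "S", "A"]
words = ["", "konwencja", "haska"]
tree = to_constituency(heads, labels, tags, words)
# tree == "(ROOT (S konwencja) (ADJUNCT (A haska)))"
```

Because edge labels are copied unchanged into nonterminals, labels carrying ↑ or ↓ from lifting would survive the conversion under this reading.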
String-to-Tree Multi Bottom-up Tree Transducer
[The rule diagrams are flattened in this transcript. Source sides, target labels, and target words are listed as extracted; the assignment of words to target tree fragments is not recoverable.]
◮ lexical continuous rule: motivated by → one target fragment; label P; target word: motywowane
◮ lexical discontinuous rule: this is not something that → target fragments labeled ADJUNCT, S, I; target words: nie jest to coś co (and a comma)
◮ structural continuous rule: technologies X → one target fragment; label ADJUNCT; target side: technologii MWE
◮ structural discontinuous rules: there are X that X → … and it needs to X → …; target labels: ADJUNCT, IMP, ADJUNCT, PUNCT; target sides include: są MWE, musi MWE
Translation Model
Standard log-linear model with the following 8 features:
◮ . . .
◮ gap penalty $100^{1-c}$ (c is the number of target tree fragments)
We use the MBOT-Moses decoder [Braune et al. 2013]:
◮ standard Moses syntax-based decoder
◮ extended to handle target-side discontinuities
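As a small worked illustration of the gap penalty, the sketch below computes $100^{1-c}$ and shows how such a feature would enter a log-linear score, assuming the usual form in which each feature value contributes its logarithm times a weight. The function names and the feature name "gap_penalty" are hypothetical; this is not the decoder's code.

```python
import math
from typing import Dict


def gap_penalty(c: int) -> float:
    """Gap penalty 100^(1 - c), where c is the number of target tree fragments."""
    if c < 1:
        raise ValueError("a rule has at least one target tree fragment")
    return 100.0 ** (1 - c)


def log_linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of log feature values (standard log-linear form)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


continuous = gap_penalty(1)      # one fragment: no penalty
discontinuous = gap_penalty(3)   # three fragments: two gaps are penalized
score = log_linear_score({"gap_penalty": discontinuous}, {"gap_penalty": 1.0})
```

A continuous rule (c = 1) has penalty 1 and contributes nothing to the log score, while each additional target fragment divides the feature value by 100.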
Setup
                 English to Polish           English to Russian
training data    7th EuroParl corpus         WMT 2014
language model   5-gram SRILM                5-gram SRILM
tuning data      cut from EuroParl (≈ 3k)    WMT 2014
test data        cut from EuroParl (≈ 3k)    WMT 2014
Training Pipeline
Target side:
◮ TreeTagger [Schmid 1996]
◮ MaltParser [Nivre et al. 2006, Sharoff & Nivre 2011]
◮ Path-Lifting
◮ Conversion into constituency tree
Parallel Data:
◮ tokenized and lowercased
◮ length-ratio filtered up to length 80
◮ word alignments by GIZA++ [Och & Ney 2003] with grow-diag-final-and
Tuning: Minimum error rate training [Och 2003]
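The parallel-data preparation listed above can be sketched as follows. This is a rough illustration, not the authors' scripts: whitespace splitting stands in for a real tokenizer, and `max_ratio` is a hypothetical parameter, since the slide states only the length limit of 80.

```python
from typing import Iterable, Iterator, Tuple


def clean_corpus(pairs: Iterable[Tuple[str, str]],
                 max_len: int = 80,
                 max_ratio: float = 9.0) -> Iterator[Tuple[str, str]]:
    """Lowercase sentence pairs and drop pairs that are empty, too long, or badly length-mismatched."""
    for src, tgt in pairs:
        s, t = src.lower().split(), tgt.lower().split()  # whitespace split as stand-in tokenizer
        if not s or not t:
            continue  # empty side
        if len(s) > max_len or len(t) > max_len:
            continue  # longer than the length limit
        if max(len(s), len(t)) / min(len(s), len(t)) > max_ratio:
            continue  # length ratio too extreme (threshold is an assumption)
        yield " ".join(s), " ".join(t)


raw = [
    ("This is a test .", "To jest test ."),
    ("a", "b " * 20),                # ratio 20 exceeds max_ratio
    ("x " * 81, "y " * 81),          # 81 tokens exceed max_len
]
cleaned = list(clean_corpus(raw))
```

Only the first pair survives in this example, lowercased on both sides.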
Experimental Results
Translation task      System          BLEU
English-to-Polish     Baseline        21.29
                      MBOT            23.43
                      GHKM            23.31
                      Phrase-based    24.35
                      Hiero           24.56
English-to-Russian    Baseline        24.66
                      MBOT            26.13
                      GHKM            25.97
                      Phrase-based    27.90
                      Hiero           27.72
Losses across the systems
[Figure not recoverable.]