PhD Thesis: Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation
Arianna Bisazza
Advisor: Marcello Federico
Fondazione Bruno Kessler / Università di Trento

PSMT decoding overview
Source: E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali
Partial translations built during decoding: "Freedom of movement must be encouraged", then "while ensuring that career paths …"
Each expansion is scored by the translation model (TM scores), the language model (LM scores) and the reordering model (ReoM scores).
Arianna Bisazza – PhD Thesis – 19 April 2013

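The way the three score types on this slide combine can be sketched as a weighted sum of log-scores, as in the standard log-linear formulation of phrase-based SMT. This is a minimal sketch, not the thesis implementation: the function name `hypothesis_score`, the feature names, the weights and the score values are all hypothetical and chosen only for illustration.

```python
def hypothesis_score(feature_scores, weights):
    """Weighted sum of log-domain feature scores for one partial hypothesis."""
    return sum(weights[name] * score for name, score in feature_scores.items())


# Hypothetical weights and log-scores, for illustration only.
weights = {"tm": 1.0, "lm": 0.5, "reom": 0.3}

monotone = {"tm": -2.0, "lm": -4.0, "reom": -1.0}
reordered = {"tm": -2.0, "lm": -3.0, "reom": -3.0}

monotone_score = hypothesis_score(monotone, weights)
reordered_score = hypothesis_score(reordered, weights)

# The decoder keeps the higher-scoring hypothesis.
best = max(("monotone", monotone_score), ("reordered", reordered_score), key=lambda p: p[1])
print(best[0])
```

In this toy setting the reordering model is only one of several voices: a hypothesis with a poor ReoM score can still win if the language model prefers it strongly enough.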
Reordering Models
Many solutions have been proposed, with different reordering classes, features, training modes, etc.:
• Tillman 04, Zens & Ney 06
• Al Onaizan & Papineni 06
• Galley & Manning 08
• Green & al. 10, Feng & al. 10
• …
No matter what reordering model is used, the permutation search space must be limited!
The power of all reordering models is bound to the reordering constraints in use.

Reordering Constraints
Source: E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali
#perm = |w|! ≈ 40,000,000

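The figure of roughly 40 million follows directly from the sentence length. A small check, assuming plain whitespace tokenization (which yields the 11 words w0 … w10 used on the following slides):

```python
import math

# Example sentence from the slides, split on whitespace into w0 .. w10.
sentence = "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali"
words = sentence.split()

num_words = len(words)
num_permutations = math.factorial(num_words)

print(num_words)         # 11
print(num_permutations)  # 39916800, i.e. roughly 40,000,000
```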
Reordering Constraints
Source: E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali
#perm = |w|! ≈ 40,000,000
D(w_x, w_y) = |y − x − 1|
DL: distortion limit
DL=3: #perm ≈ 7,000

Source-to-Source distortion (rows: previously translated word w_x; columns: next word w_y)
     w0  w1  w2  w3  w4  w5  w6  w7  w8  w9 w10
<s>   0   1   2   3   4   5   6   7   8   9  10
w0    -   0   1   2   3   4   5   6   7   8   9
w1    2   -   0   1   2   3   4   5   6   7   8
w2    3   2   -   0   1   2   3   4   5   6   7
w3    4   3   2   -   0   1   2   3   4   5   6
w4    5   4   3   2   -   0   1   2   3   4   5
w5    6   5   4   3   2   -   0   1   2   3   4
w6    7   6   5   4   3   2   -   0   1   2   3
w7    8   7   6   5   4   3   2   -   0   1   2
w8    9   8   7   6   5   4   3   2   -   0   1
w9   10   9   8   7   6   5   4   3   2   -   0
w10  11  10   9   8   7   6   5   4   3   2   -

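The source-to-source distortion matrix can be regenerated from the formula D(w_x, w_y) = |y − x − 1|. In this sketch the function names are hypothetical, and the sentence-start symbol <s> is represented by x = −1, an encoding chosen here because it reproduces the first row of the matrix.

```python
def distortion(x, y):
    """D(w_x, w_y) = |y - x - 1|; x = -1 stands for the sentence start <s>."""
    return abs(y - x - 1)


def distortion_matrix(n):
    """Map each row label (<s>, w0, ..., w{n-1}) to its distortion costs.

    The diagonal (jumping from a word to itself) is undefined and stored as None.
    """
    rows = {}
    for x in range(-1, n):
        label = "<s>" if x == -1 else f"w{x}"
        rows[label] = [None if y == x else distortion(x, y) for y in range(n)]
    return rows


matrix = distortion_matrix(11)
for label, row in matrix.items():
    print(f"{label:>3} " + " ".join("  -" if d is None else f"{d:3d}" for d in row))
```

Monotone steps (translating w_{x+1} right after w_x) cost 0, and the cost grows with the length of the jump in either direction. A distortion limit DL simply forbids every cell whose value exceeds DL.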
The problem with DL…
[Figure: Arabic-English example shown against the source-to-source distortion matrix; panels labeled EN and AR]
[Figure: German-English example shown against the source-to-source distortion matrix; panels labeled EN and DE]

Current solution: increasing the distortion limit!
#perm = |w|! ≈ 40,000,000
D(w_x, w_y) = |y − x − 1|
DL=3: #perm ≈ 7,000
DL=7: #perm ≈ 7,000,000
Coarse reordering space definition:
• slower decoding
• worse translations
[Figure: source-to-source distortion matrix]

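How quickly the search space grows with the distortion limit can be explored with a small dynamic program. This is a sketch under a simplifying assumption: a word order is allowed when every single jump, including the first one from <s>, satisfies D ≤ DL. The function name is hypothetical, and decoders often apply stricter variants of the constraint, so the counts from this sketch are not claimed to match the figures on the slide.

```python
from functools import lru_cache


def count_permutations(n, dl):
    """Count orderings of n source words in which every jump has distortion <= dl."""
    full = (1 << n) - 1  # bitmask with all n words covered

    @lru_cache(maxsize=None)
    def count(covered, last):
        if covered == full:
            return 1
        total = 0
        for y in range(n):
            already_covered = covered & (1 << y)
            if not already_covered and abs(y - last - 1) <= dl:
                total += count(covered | (1 << y), y)
        return total

    return count(0, -1)  # last = -1 stands for <s>


# Growth of the search space for the 11-word example sentence.
for dl in (0, 3, 7, 11):
    print(dl, count_permutations(11, dl))
```

With DL = 0 only the monotone order survives, and once DL reaches the largest value in the matrix (11 for this sentence) all |w|! permutations are allowed again.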
Observations
Word reordering is difficult!
• The existing word reordering models are not perfect, but they are expected to guide search over huge search spaces
One way to go:
• design a perfect model
• problem: many have already tried and failed
Our way:
• simplify the task for the existing reordering models

Working hypotheses
• A better definition of the reordering search space (i.e. constraints) can simplify the task of the reordering model
• (Shallow) linguistic knowledge can help us to refine the reordering search space for a given language pair

Outline
o The problem
o The solutions:
• verb reordering lattices
  Bisazza and Federico, Chunk-based Verb Reordering in VSO Sentences for Arabic-English, WMT 2010
  Bisazza, Pighin, Federico, Chunk-Lattices for Verb Reordering in Arabic-English Statistical Machine Translation, MT Journal 2012
• modified distortion matrices
• dynamically pruning the reordering space
o Comparative evaluation & conclusions

Idea: keep a low distortion limit and modify the input to allow only specific long reorderings
#perm = |w|! ≈ 40,000,000
D(w_x, w_y) = |y − x − 1|
DL=3: #perm ≈ 7,000
DL=7: #perm ≈ 7,000,000
[Figure: source-to-source distortion matrix]

Reordering patterns in Arabic-English
Example of VSO sentences: the Arabic verb comes earlier than it does in the English order.
Typical PSMT outputs:
*The Moroccan monarch King Mohamed VI __ his support to…
*He renewed the Moroccan monarch King Mohamed VI his support to…

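One way to "modify the input" for VSO sentences is to generate alternative chunk orders in which the verb chunk is moved to the right, so that the decoder can pick a verb position close to the English one while keeping a low distortion limit. The sketch below only illustrates that idea: the function name, the `max_shift` parameter and the use of English glosses in place of Arabic chunks are assumptions made here, not the procedure from the cited papers.

```python
def verb_movement_variants(chunks, verb_index, max_shift):
    """Return chunk orders with the verb chunk moved right by 1..max_shift chunks."""
    variants = []
    verb = chunks[verb_index]
    for shift in range(1, max_shift + 1):
        target = verb_index + shift
        if target >= len(chunks):
            break  # cannot move the verb past the end of the sentence
        reordered = (
            chunks[:verb_index]                   # chunks before the verb
            + chunks[verb_index + 1:target + 1]   # chunks the verb jumps over
            + [verb]
            + chunks[target + 1:]                 # remaining chunks
        )
        variants.append(reordered)
    return variants


# English glosses standing in for the Arabic chunks of the slide example (VSO order).
vso_chunks = ["renewed", "the Moroccan monarch King Mohamed VI", "his support to…"]
variants = verb_movement_variants(vso_chunks, verb_index=0, max_shift=2)
for variant in variants:
    print(" | ".join(variant))
```

The original order and the generated variants could then be packed into a lattice given to the decoder, which is left out of this sketch.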
Working hypothesis
Uneven distribution of long and short-range word movements:
• few long: verb-subject-object sentences. We try to model them explicitly!
• many short: adjective-noun, head-initial genitive constructions (idafa). We assume they are well handled in standard PSMT.