
Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation



  1. PhD Thesis: Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation. Arianna Bisazza. Advisor: Marcello Federico. Fondazione Bruno Kessler / Università di Trento.

  2. PSMT (phrase-based statistical machine translation) decoding overview. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali". Arianna Bisazza – PhD Thesis – 19 April 2013

  3. PSMT decoding overview. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali". Partial translation: "Freedom of movement must be encouraged". Scores shown on the slide: reordering model (ReoM) scores, translation model (TM) scores, language model (LM) scores.

  4. PSMT decoding overview. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali". Partial translation: "Freedom of movement must be encouraged"; next translated phrase: "while ensuring that career paths …". Scores shown on the slide: ReoM scores, TM scores, LM scores.

  5. PSMT decoding overview. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali". Partial translation: "… Freedom of movement must be encouraged while ensuring that career paths". Scores shown on the slide: ReoM scores, TM scores, LM scores.

  6. Reordering Models. Many solutions have been proposed, with different reordering classes, features, training modes, etc.: Tillmann 04; Zens & Ney 06; Al-Onaizan & Papineni 06; Galley & Manning 08; Green et al. 10; Feng et al. 10; … Each of them contributes ReoM scores while decoding the source sentence "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali".

  7. Reordering Models (continued). No matter what reordering model is used, the permutation search space must be limited! The power of every reordering model is therefore bounded by the reordering constraints in use.

  8. ReoM scores for the source sentence "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali".

  9. Reordering Constraints. The source sentence "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali" has 11 words, so the number of possible word permutations is #perm = |w|! = 11! ≈ 40,000,000.

  10. Reordering Constraints. Source-to-source distortion: D(w_x, w_y) = |y - x - 1|. #perm = |w|! ≈ 40,000,000. [Figure: source-to-source distortion matrix for "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali"; rows are w_x (the sentence start <s>, then w_0 … w_10), columns are w_y (w_0 … w_10), and each cell holds D(w_x, w_y).]
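
The distortion function on this slide can be reproduced in a few lines of Python. This is a minimal sketch rather than code from the thesis: the function names are hypothetical, and using the index -1 for the sentence start <s> is an assumption made here so that the <s> row comes out as 0 … 10, as in the matrix.

```python
import math


def distortion(x, y):
    """Source-to-source distortion D(w_x, w_y) = |y - x - 1|."""
    return abs(y - x - 1)


def distortion_matrix(n):
    """One row per w_x (x = -1 stands for <s>, an assumption), one column per w_y."""
    matrix = []
    for x in range(-1, n):
        # The diagonal (x == y) is left empty: a word is not translated twice.
        row = [None if y == x else distortion(x, y) for y in range(n)]
        matrix.append(row)
    return matrix


sentence = ("E' necessario incoraggiare tale mobilità garantendo "
            "la sicurezza dei percorsi professionali").split()
n = len(sentence)
matrix = distortion_matrix(n)
total_permutations = math.factorial(n)  # |w|!

labels = ["<s>"] + [f"w{i}" for i in range(n)]
for label, row in zip(labels, matrix):
    print(label.ljust(4), " ".join("." if d is None else str(d) for d in row))
print("#perm =", total_permutations)
```

Adjacent words in their original order have distortion 0, and the cost grows with the length of the jump in either direction.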

  11. Reordering Constraints. DL: distortion limit. Source-to-source distortion: D(w_x, w_y) = |y - x - 1|. #perm = |w|! ≈ 40,000,000; with DL=3, #perm ≈ 7,000. [Figure: source-to-source distortion matrix for "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali".]

  12. The problem with DL… Arabic-English. [Figure: source-to-source distortion matrix shown with an Arabic-English (AR, EN) example.]

  13. The problem with DL… German-English. [Figure: source-to-source distortion matrix shown with a German-English (DE, EN) example.]

  14. Current solution: increasing the distortion limit! D(w_x, w_y) = |y - x - 1|. #perm = |w|! ≈ 40,000,000; with DL=3, #perm ≈ 7,000. [Figure: source-to-source distortion matrix.]

  15. Current solution: increasing the distortion limit! D(w_x, w_y) = |y - x - 1|. #perm = |w|! ≈ 40,000,000; with DL=3, #perm ≈ 7,000; with DL=7, #perm ≈ 7,000,000. Coarse reordering space definition: slower decoding, worse translations. [Figure: source-to-source distortion matrix.]
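
How quickly the search space grows with the distortion limit can be explored with a small counting sketch. It rests on a simplifying assumption that the slides do not state: a word order is allowed if every jump between consecutively translated words satisfies D(w_x, w_y) <= DL. A real decoder may apply further conditions, so the counts produced here need not match the figures quoted on the slide. The function name is hypothetical.

```python
from functools import lru_cache


def count_permutations(n, dl):
    """Count word orders of an n-word sentence in which every jump has D <= dl.

    Assumption: the only constraint is |y - x - 1| <= dl for each pair of
    consecutively translated words, starting from <s> at index -1.
    """
    full = (1 << n) - 1  # bitmask with all n source words covered

    @lru_cache(maxsize=None)
    def extend(covered, last):
        if covered == full:
            return 1  # a complete permutation has been built
        total = 0
        for y in range(n):
            uncovered = not covered & (1 << y)
            if uncovered and abs(y - last - 1) <= dl:
                total += extend(covered | (1 << y), y)
        return total

    return extend(0, -1)


for dl in (0, 3, 7, 11):
    print(f"DL={dl}: #perm = {count_permutations(11, dl)}")
```

With DL=0 only the monotone order survives, and once DL is at least the sentence length the count equals |w|!. Memoising on the pair (covered words, last position) keeps the computation small for an 11-word sentence.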

  16. Observations • Word reordering is difficult! • The existing word reordering models are not perfect, but they are expected to guide search over huge search spaces. One way to go: design a perfect model (problem: many have already tried and failed). Our way: simplify the task for the existing reordering models.

  17. Working hypotheses • A better definition of the reordering search space (i.e. constraints) can simplify the task of the reordering model • (Shallow) linguistic knowledge can help us to refine the reordering search space for a given language pair

  18. Outline o The problem o The solutions: • verb reordering lattices • modified distortion matrices • dynamically pruning the reordering space o Comparative evaluation & conclusions

  19. Outline o The problem o The solutions: • verb reordering lattices (Bisazza and Federico, "Chunk-based Verb Reordering in VSO Sentences for Arabic-English", WMT 2010; Bisazza, Pighin, Federico, "Chunk-Lattices for Verb Reordering in Arabic-English Statistical Machine Translation", MT Journal 2012) • modified distortion matrices • dynamically pruning the reordering space o Comparative evaluation & conclusions

  20. Idea: keep a low distortion limit and … modify the input to allow only specific long reorderings. D(w_x, w_y) = |y - x - 1|. #perm = |w|! ≈ 40,000,000; with DL=3, #perm ≈ 7,000; with DL=7, #perm ≈ 7,000,000. [Figure: source-to-source distortion matrix.]

  21. Reordering patterns in Arabic-English. Example of VSO (verb-subject-object) sentences: the Arabic verb comes before its subject, i.e. earlier than in the English order. Typical PSMT outputs: *The Moroccan monarch King Mohamed VI __ his support to… *He renewed the Moroccan monarch King Mohamed VI his support to…
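
The idea of offering the decoder only a few specific long reorderings can be illustrated with a toy sketch that moves a verb chunk rightwards past the following chunks. Everything here is an assumption made for illustration: the chunk labels ("VC" for verb chunk, "NC" for noun chunk), the function name, the `max_shift` parameter and the English glosses standing in for Arabic chunks are hypothetical, and this is not the procedure used in the thesis.

```python
def verb_movement_variants(chunks, max_shift):
    """Return the original chunk order plus variants with a verb chunk moved right.

    chunks: list of (label, text) pairs; label "VC" marks a verb chunk
    (hypothetical tag set). Each variant moves one verb chunk past
    1..max_shift of the chunks that follow it.
    """
    variants = [list(chunks)]
    for i, (label, _) in enumerate(chunks):
        if label != "VC":
            continue
        for shift in range(1, max_shift + 1):
            if i + shift >= len(chunks):
                break  # nothing left to jump over
            moved = (chunks[:i]
                     + chunks[i + 1:i + shift + 1]
                     + [chunks[i]]
                     + chunks[i + shift + 1:])
            variants.append(moved)
    return variants


# English glosses used in place of the Arabic source chunks (assumption).
chunks = [("VC", "renewed"),
          ("NC", "the Moroccan monarch"),
          ("NC", "King Mohamed VI"),
          ("NC", "his support")]
variants = verb_movement_variants(chunks, max_shift=2)
for variant in variants:
    print(" ".join(text for _, text in variant))
```

Each variant is one alternative input order; the other words keep their positions, so a low distortion limit can still be used for the remaining local reorderings.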

  22. Working hypothesis: uneven distribution of long- and short-range word movements • few long: verb-subject-object sentences. We try to model them explicitly! • many short: adjective-noun, head-initial genitive constructions (idafa). We assume they are well handled in standard PSMT.
