
An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation

Spence Green, with Daniel Cer and Chris Manning (Stanford University). WMT, 27 June 2014.


  1. An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation
     Spence Green, with Daniel Cer and Chris Manning
     Stanford University
     WMT, 27 June 2014

  5. Recap: ACL13 Results
     SGD-based, n-best learning
     L1 feature selection
     BOLT-scale Zh–En on NIST data:
       System            BLEU    Δ
       MERT              48.4
       SGD               48.1
       SGD + Features    49.9    +1.5  :-)

  6. Motivation #1: WMT13 Shared Task :-(
     [Line plot: BLEU on newtest2008−2011 (y-axis, 29 to 32) versus training epoch (x-axis, 1 to 10) for two models, dense and feature-rich.]

  7. Motivation #1: WMT13 Shared Task
     En–Fr news2012 (dev):
       System            BLEU    Δ
       Dense             31.1
       SGD + Features    31.5    +0.4

  9. Motivation #2: Practical Issues
     Q1: Which phrase-based features should I use?
     Q2: Why don’t my features help?

  13. My Frustrating Summer...
      What’s wrong with feature-rich MT?
      1. Loss Function
      2. References and scoring functions
      3. Representation: Features
      This paper as a pain reliever...

  14. Loss Function

  15. ACL13: Online PRO
      Sensitive to length
      Doesn’t optimize top-k
      Slow to compute (sampling)

  17. This work: Online Expected Error
      Expected BLEU:
      \[ \ell_t(w_{t-1}) = \mathbb{E}_{p_{w_{t-1}}}\left[-\mathrm{BLEU}(d)\right] = -\sum_{d \in H} p_{w_{t-1}}(d) \cdot \mathrm{BLEU}(d) \]
      Smooth, non-convex
      Fast, less sensitive to length
      ...but still doesn’t prefer top-k
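
A minimal sketch of the expected BLEU loss for one n-best list H may help make the formula concrete. The slide does not say how the distribution p is parameterized, so this sketch assumes a softmax over linear model scores w · f(d); the function name `expected_bleu_loss`, the sparse dict representation, and the argument names are all hypothetical, and the sentence-level BLEU scores are taken as given.

```python
import math


def expected_bleu_loss(weights, feats, bleu):
    """Return (loss, gradient) of negative expected BLEU over one n-best list.

    weights: dict mapping feature name -> weight (the current weight vector)
    feats:   list of dicts, one sparse feature vector per derivation d in H
    bleu:    list of sentence-level BLEU scores, one per derivation d in H

    Assumption (not from the slides): p(d) is a softmax over the linear
    model scores of the derivations in H.
    """
    # Linear model score for each derivation.
    scores = [sum(weights.get(k, 0.0) * v for k, v in f.items()) for f in feats]

    # Numerically stable softmax over the n-best list.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]

    # loss = -sum_d p(d) * BLEU(d)
    loss = -sum(p * b for p, b in zip(probs, bleu))

    # Gradient of the loss: -sum_d p(d) * (BLEU(d) - E[BLEU]) * f(d).
    # Since loss = -E[BLEU], the centered score is BLEU(d) + loss.
    grad = {}
    for p, b, f in zip(probs, bleu, feats):
        coeff = -p * (b + loss)
        for k, v in f.items():
            grad[k] = grad.get(k, 0.0) + coeff * v
    return loss, grad
```

An online learner would call this once per tuning example and take a gradient step on the returned sparse gradient. Because the loss is a smooth function of the weights, no sampling of derivation pairs is needed, which is consistent with the "fast" claim on the slide.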

  18. References and Scoring

  21. Single vs. Multiple References
      Experiment: Compute BLEU+1 for each reference
      Baseline MT system
      Ar–En NIST MT05 has five (5) references

  22. MT05: Max. vs. Min. BLEU+1
      [Scatter plot: maximum BLEU+1 (y-axis, 0 to 100) versus minimum BLEU+1 (x-axis, 0 to 100).]
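
The single-reference experiment (score a hypothesis against each reference separately, then compare the best and worst scores) can be sketched as follows. The slides do not give the exact BLEU+1 variant, so this sketch assumes the common add-one smoothing of the n-gram precisions for n > 1; the function names `bleu_plus_one` and `max_min_bleu` are hypothetical, and inputs are pre-tokenized lists of strings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_plus_one(hyp, ref, max_n=4):
    """Sentence-level BLEU against ONE reference, on a 0-100 scale.

    Assumption (not from the slides): add-one smoothing is applied to the
    clipped n-gram precisions for n > 1, and unigram precision is unsmoothed.
    """
    if not hyp:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        match = sum(min(c, r[g]) for g, c in h.items())  # clipped matches
        total = sum(h.values())
        if n > 1:
            match, total = match + 1, total + 1
        if match == 0:  # only possible for unigrams
            return 0.0
        log_prec += math.log(match / total) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1.0 - len(ref) / len(hyp)))
    return 100.0 * bp * math.exp(log_prec)


def max_min_bleu(hyp, refs):
    """Score hyp against each reference separately; return (max, min)."""
    scores = [bleu_plus_one(hyp, r) for r in refs]
    return max(scores), min(scores)
```

Calling `max_min_bleu` once per sentence of a multi-reference set yields the (minimum, maximum) pairs that a plot like the one above would display. A wide gap between the two values for the same hypothesis indicates how strongly the score depends on which reference happens to be available.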
