Computer Aided Translation, Philipp Koehn, 30 April 2015


  1. Computer Aided Translation Philipp Koehn 30 April 2015 Philipp Koehn Machine Translation: Computer Aided Translation 30 April 2015

  2. Why Machine Translation?
     Assimilation: reader initiates translation, wants to know content
       • user is tolerant of inferior quality
       • focus of majority of research
     Communication: participants don’t speak same language, rely on translation
       • users can ask questions when something is unclear
       • chat room translations, hand-held devices
       • often combined with speech recognition
     Dissemination: publisher wants to make content available in other languages
       • high demands for quality
       • currently almost exclusively done by human translators

  3. Why Machine Translation?
     Assimilation: reader initiates translation, wants to know content
       • user is tolerant of inferior quality
       • focus of majority of research
     Communication: participants don’t speak same language, rely on translation
       • users can ask questions when something is unclear
       • chat room translations, hand-held devices
       • often combined with speech recognition
     Dissemination (OUR FOCUS): publisher wants to make content available in other languages
       • high demands for quality
       • currently almost exclusively done by human translators

  4. Goal: Helping Human Translators
     If you can’t beat them, join them.
     → How can machine translation help human translators?

  5. Post-Editing Machine Translation (source: Autodesk)

  6. Machine Translation Quality Matters
     Experiment: post-editing with different machine translation systems (English–German, news stories) [Koehn and Germann, 2014]
       • UEDIN SYNTAX: 5.38 sec/word
       • ONLINE B: 5.46 sec/word
       • UEDIN PHRASE: 5.45 sec/word
       • OTHER WMT 13: 6.35 sec/word

  7. Overview
       • Interactivity
       • Choices
       • Confidence
       • Adaptation

  8. Interactivity
       • Traditional professional translation approaches
           – translation from scratch
           – post-editing translation memory match
           – post-editing machine translation output
       • More interactive collaboration between machine and professional?

  9. Interactive Machine Translation
     Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
     Professional Translator: |

  10. Interactive Machine Translation
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: | He

  11. Interactive Machine Translation
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He | has

  12. Interactive Machine Translation
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He has | for months

  13. Interactive Machine Translation
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He planned |

  14. Interactive Machine Translation
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He planned | for months

  15. Prediction from Search Graph
      [Figure: search graph over words including he, it, has, planned, for months, since]
      Search for best translation creates a graph of possible translations

  16. Prediction from Search Graph
      [Figure: same search graph]
      One path in the graph is the best (according to the model)
      This path is suggested to the user

  17. Prediction from Search Graph
      [Figure: same search graph]
      The user may enter a different translation for the first words
      We have to find it in the graph

  18. Prediction from Search Graph
      [Figure: same search graph]
      We can predict the optimal completion (according to the model)
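
The four search graph slides above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy graph, its log scores, and the function names are hypothetical and are not taken from the talk or from any decoder. It only handles a prefix that matches the graph exactly; a prefix that is not in the graph would need approximate matching, which this sketch does not attempt.

```python
from functools import lru_cache

# Hypothetical toy search graph: node -> list of (phrase, next_node, log_score).
# Scores are made up; higher (closer to 0) is better.
GRAPH = {
    0: [("he", 1, -0.2), ("it", 1, -1.5)],
    1: [("has", 2, -0.3), ("planned", 3, -0.9)],
    2: [("for months", 4, -0.4), ("since months", 4, -1.8)],
    3: [("for months", 4, -0.5)],
    4: [],  # final node
}
START, FINAL = 0, 4


@lru_cache(maxsize=None)
def best_completion(node):
    """Return (score, phrases) of the best-scoring path from node to FINAL."""
    if node == FINAL:
        return 0.0, ()
    best = (float("-inf"), ())
    for phrase, nxt, score in GRAPH[node]:
        tail_score, tail = best_completion(nxt)
        total = score + tail_score
        if total > best[0]:
            best = (total, (phrase,) + tail)
    return best


def predict(prefix):
    """Find the user's prefix in the graph, then return the best completion.

    Returns None when the prefix does not match any path exactly.
    """
    states = [(START, 0.0)]  # (node, score of the matched prefix)
    for token in prefix:
        states = [
            (nxt, acc + score)
            for node, acc in states
            for phrase, nxt, score in GRAPH[node]
            if phrase == token
        ]
        if not states:
            return None
    # Pick the matched state whose prefix plus completion scores best.
    node, _ = max(states, key=lambda s: s[1] + best_completion(s[0])[0])
    return list(best_completion(node)[1])


print(predict([]))                  # suggestion before the user types anything
print(predict(["he", "planned"]))   # completion after the user overrides "has"
```

With an empty prefix the sketch returns the model's best path; once the user types "he planned", it follows that path in the graph and completes from the node it reaches.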

  19. Run Time
      [Figure: plot of time (0ms to 80ms) against prefix (5 to 40), with one curve each for 0 to 8 edits]

  20. Word Alignment Visualization
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He planned for months to give a lecture in Baltimore | in

  22. Shading off Translated Material
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He planned for months to give a lecture in Baltimore | in

  23. Some Observations
       • How can we do this?
           – word alignments as a by-product of matching against the search graph
           – automatic word alignments (as used in training)
       • User feedback
           – users like interactive machine translation
           – ... but they may be slower than with post-editing machine translation
           – users like mouse-over word alignment highlighting
           – users do not like at-cursor word alignment highlighting

  24. Overview
       • Interactivity
       • Choices
       • Confidence
       • Adaptation

  25. Choices
       • Trigger the passive vocabulary
       • Display multiple translations for words and phrases
           er hat seit Monaten geplant , im März einen Vortrag ...
           he has for months the plan in March a lecture ...
           it has for months now planned , in March a presentation ...
           he was for several months planned to in the March a speech ...
           he has made since months the pipeline in March of a statement ...
           he did for many months scheduled the March a general ...
       • Rank and color-highlight by probability of each translation
       • Prefer diversity
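
The slide does not say how ranking by probability and preferring diversity are combined. One possible reading, sketched in Python below, is a greedy selection that picks the most probable option first and then discounts options that share words with those already picked. The probabilities, the `penalty` weight, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical translation options for one source phrase, with made-up probabilities.
OPTIONS = {
    "a lecture": 0.40,
    "lecture": 0.30,
    "presentation": 0.20,
    "speech": 0.10,
}


def word_overlap(a, b):
    """Jaccard overlap between the word sets of two options."""
    wa, wb = set(a.split()), set(b.split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def rank_diverse(options, k=3, penalty=0.5):
    """Greedily pick k options, trading probability against redundancy.

    Each step picks the option maximizing
    probability - penalty * (largest overlap with an already chosen option).
    """
    remaining = dict(options)
    chosen = []
    while remaining and len(chosen) < k:
        def gain(text):
            redundancy = max((word_overlap(text, c) for c in chosen), default=0.0)
            return remaining[text] - penalty * redundancy

        best = max(remaining, key=gain)
        chosen.append(best)
        del remaining[best]
    return chosen


print(rank_diverse(OPTIONS))             # diversity-aware ranking
print(rank_diverse(OPTIONS, penalty=0))  # plain ranking by probability
```

With `penalty=0` the function reduces to plain ranking by probability, so the two printed lists show the effect of the diversity term on this toy input.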

  26. Alternative Translations
      Input Sentence: Er hat seit Monaten geplant, im April einen Vortrag in Baltimore zu halten.
      Professional Translator: He planned for months to give a lecture in Baltimore in April.
      Alternatives offered:
       • give a presentation
       • present his work
       • give a speech
       • speak
      User requests alternative translations for parts of sentence.

  27. Bilingual Concordancer

  28. Overview
       • Interactivity
       • Choices
       • Confidence
       • Adaptation

  29. Confidence
       • Machine translation engine indicates where it is likely wrong (also known as quality estimation, Lucia Specia)
       • Different levels of granularity
           – document-level (SDL’s "TrustScore")
           – sentence-level
           – word-level
       • What are we predicting?
           – how useful is the translation, on a scale of (say) 1–5
           – indication if post-editing is worthwhile
           – estimation of post-editing effort
           – pinpointing errors

  30. Sentence-Level Confidence
       • Translators are used to "Fuzzy Match Score"
           – used in translation memory systems
           – roughly: ratio of words that are the same between input and TM source
           – if less than 70%, then not useful for post-editing
       • We would like to have a similar score for machine translation
       • Even better
           – estimation of post-editing time
           – estimation of from-scratch translation time
           → can also be used for pricing
       • Active research question, see also shared task at WMT 2013
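
The fuzzy match score is only described "roughly" on the slide, and commercial translation memory tools each use their own formula. As a minimal sketch of one common formulation (an assumption, not the definition used by any specific tool), the Python below scores a match as one minus the word-level edit distance divided by the length of the longer sentence, and applies the 70% threshold from the slide. The example sentences and function names are hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two word lists (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete wa
                cur[j - 1] + 1,            # insert wb
                prev[j - 1] + (wa != wb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def fuzzy_match_score(input_sentence, tm_source):
    """Score in [0, 1]; 1.0 means the input equals the TM source word for word."""
    a, b = input_sentence.split(), tm_source.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / longest


THRESHOLD = 0.70  # from the slide: below 70% is not useful for post-editing

score = fuzzy_match_score("the printer is out of paper",
                          "the printer is out of ink")
print(f"{score:.2f}", "useful" if score >= THRESHOLD else "not useful")
```

Here one of six words differs, so the match scores above the threshold; a sentence pair sharing only its first word out of four would fall well below it.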
