
Automatic Translation Error Analysis
or how to brute-force through exponential complexity algorithms by abusing beam search
Mark Fishel, TÜ ATI
Feb. 5, 2011, Theory Days at Nelijärve

Outline
- Approaches to MT evaluation
- Automatic analysis of translation errors
  - alignment
  - error detection
  - error summarization
- Meta-evaluation
- First results
- Future work

Translation
Source: "Была у Мэри маленькая овечка и большая собака."
- "Mary had a little lamb and a big dog."
- "Mary was a little lamb and a large dog."
- "Maryl was small ovine species and a dog."

Evaluation
Mostly done by comparing the produced translation (hypothesis) with a correct one (reference).

            Manual                          Automatic
Score       Adequacy/fluency, rank, HTER    WER, BLEU, NIST, METEOR, TER, SemPOS, LRscore, ...
Analysis    Vilar et al. (2006)             Our work

- Score: good for comparison, but not informative
- Manual: expensive

Translation errors by Vilar et al. (2006):
- Punctuation
- Missing words (in the reference)
  - Content word
  - Functional word
- Incorrect words (in the hypothesis)
  - Incorrect sense/form
  - Extra word
  - Style, idioms
- Unknown words (in the hypothesis)
  - Unknown stem/form
- Word order (in the hypothesis)
  - Short/long range
  - Word/phrase

Automatic error analysis
- Alignment between the hypothesis and the reference
- Error detection and classification
- Error summarization
Result: roughly equivalent to Vilar et al.'s error classification

Alignment
Almost trivial, except for ambiguous alignment pairs:
- repeating words (esp. punctuation, articles, etc.)
- surface forms of one lemma
- synonyms

Alignment solution
- Align using lemmas/synonym sets
- Alignment modelled as an HMM
  - observed variables: hypothesis words
  - hidden variables: reference words
  - emission probabilities allow matching words to align
  - transition probabilities penalize long-distance reordering
- We want only 1-to-1 alignments
  - this makes the search cost exponential
  - so do a beam search
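The alignment search above can be sketched roughly as follows. This is a minimal illustration, not the presented system: exact string equality stands in for lemma/synonym matching, and the additive penalties `null_penalty` and `jump_penalty` are assumed stand-ins for the log emission and transition probabilities. Tracking the set of already-used reference positions is what enforces 1-to-1 alignments and what makes an exact search exponential, so only the `beam_size` best partial alignments are kept at each step.

```python
def align(hyp, ref, beam_size=50, null_penalty=3.0, jump_penalty=0.5):
    """Return a list a where a[i] is the reference index aligned to hyp[i], or None.

    All parameter names and penalty values are illustrative assumptions.
    """
    # Each beam item: (score, alignment so far, used ref positions, last ref position)
    beam = [(0.0, (), frozenset(), -1)]
    for word in hyp:
        candidates = []
        for score, ali, used, last in beam:
            # Option 1: leave the hypothesis word unaligned.
            candidates.append((score - null_penalty, ali + (None,), used, last))
            # Option 2: align to any matching, still unused reference word.
            for j, ref_word in enumerate(ref):
                if ref_word == word and j not in used:
                    jump = abs(j - (last + 1))  # 0 for a monotone step
                    candidates.append(
                        (score - jump_penalty * jump, ali + (j,), used | {j}, j)
                    )
        # Beam pruning: keep only the best-scoring partial alignments.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return list(beam[0][1])


hyp = "Mary had a lamb and a dog".split()
ref = "Mary had a little lamb and a big dog".split()
print(align(hyp, ref))
```

With a small beam the result is not guaranteed to be optimal; that is the price paid for avoiding the exponential search.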

Lexical error detection
- unaligned ref words: missing
- unaligned hyp words
  - present in src? untranslated
  - else, extra word
- aligned, different surface form
  - synonyms or wrong surface form
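The decision rules on this slide could be written down roughly like this. The function name, the category labels, and the `lemma` callback are assumptions made for the sketch; the alignment is given as a list mapping each hypothesis position to a reference position or `None`.

```python
def lexical_errors(src, hyp, ref, alignment, lemma=str.lower):
    """Classify lexical errors from a 1-to-1 alignment.

    alignment[i] is the reference index aligned to hyp[i], or None.
    `lemma` is a hypothetical lemmatizer; the default only lowercases.
    """
    errors = []
    aligned_ref = {j for j in alignment if j is not None}
    # Unaligned reference words are missing from the translation.
    for j, word in enumerate(ref):
        if j not in aligned_ref:
            errors.append(("missing", word))
    src_words = set(src)
    for i, word in enumerate(hyp):
        j = alignment[i]
        if j is None:
            # Unaligned hypothesis word: copied from the source, or simply extra.
            errors.append(("untranslated" if word in src_words else "extra", word))
        elif word != ref[j]:
            # Aligned, but the surface forms differ.
            same_lemma = lemma(word) == lemma(ref[j])
            errors.append(("wrong form" if same_lemma else "synonym", word))
    return errors


toy_lemma = lambda w: {"had": "have"}.get(w, w)
print(lexical_errors(
    src=["Maria", "hatte", "ein", "kleine", "Lamm"],
    hyp=["Mary", "have", "a", "kleine", "lamb", "too"],
    ref=["Mary", "had", "a", "little", "lamb"],
    alignment=[0, 1, 2, None, 4, None],
    lemma=toy_lemma,
))
```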

Order error detection

Order error detection
The alignment can be used to:
- calculate permutation distance
  - Hamming distance
  - Kendall's τ distance
  - Ulam's distance
  - Spearman's rank correlation coefficient
- find misplaced words and phrases
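As an illustration of the measures listed here, the sketch below computes each one for a permutation of reference ranks `0..n-1` (the order in which the aligned hypothesis words appear), measured against the identity permutation. The function names are chosen for this example only.

```python
from bisect import bisect_left


def hamming(perm):
    """Number of positions whose rank differs from the identity."""
    return sum(1 for i, r in enumerate(perm) if i != r)


def kendall_tau(perm):
    """Number of pairs that appear in the wrong relative order (inversions)."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def ulam(perm):
    """n minus the length of the longest increasing subsequence."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k+1
    for r in perm:
        k = bisect_left(tails, r)
        if k == len(tails):
            tails.append(r)
        else:
            tails[k] = r
    return len(perm) - len(tails)


def spearman_rho(perm):
    """Spearman's rank correlation between the permutation and the identity."""
    n = len(perm)
    if n < 2:
        return 1.0
    d2 = sum((i - r) ** 2 for i, r in enumerate(perm))
    return 1 - 6 * d2 / (n * (n * n - 1))


perm = [2, 0, 1, 3]
print(hamming(perm), kendall_tau(perm), ulam(perm), spearman_rho(perm))
```

The quadratic inversion count is kept for readability; a merge-sort based count would do the same in O(n log n).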

Misplaced units
- Breadth-first search for a minimum number of unit shifts
  - vertices: permutations of the hypothesis ranks
  - edge present if the two permutations differ by a swap of two adjacent symbols that are in the wrong order
  - edge weight is 0 for a block shift continuation, or 1 otherwise
- Avoid exponential cost with beam search
- Here: 1 word shift and 1 phrase shift
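A simplified sketch of this search is given below. It is an assumption-laden reduction of the idea, not the presented algorithm: a swap costs 0 only when it keeps moving the same single symbol in the same direction as the previous swap, so it counts word shifts and does not model whole-phrase (block) shifts. Because every allowed swap removes exactly one inversion, all paths to the sorted order have the same length, and the search can proceed level by level while keeping only the `beam_size` cheapest states.

```python
def min_word_shifts(perm, beam_size=100):
    """Approximate minimum number of single-word shifts that sort `perm`.

    A state is (permutation, last move); a move is (symbol, direction).
    Names and the continuation rule are illustrative assumptions.
    """
    beam = {(tuple(perm), None): 0}
    while True:
        next_beam = {}
        for (p, last), cost in beam.items():
            for i in range(len(p) - 1):
                if p[i] <= p[i + 1]:
                    continue  # only adjacent symbols in the wrong order may be swapped
                q = list(p)
                q[i], q[i + 1] = q[i + 1], q[i]
                q = tuple(q)
                # The swap is either the larger symbol moving right
                # or the smaller symbol moving left.
                for move in ((p[i], "R"), (p[i + 1], "L")):
                    new_cost = cost + (0 if move == last else 1)
                    key = (q, move)
                    if new_cost < next_beam.get(key, float("inf")):
                        next_beam[key] = new_cost
        if not next_beam:
            # No wrong-order neighbours remain, so every state is sorted.
            return min(beam.values())
        # Beam pruning: keep the cheapest states of this level.
        kept = sorted(next_beam.items(), key=lambda kv: kv[1])[:beam_size]
        beam = dict(kept)


print(min_word_shifts([2, 0, 1]))     # one word moved across two others
print(min_word_shifts([1, 0, 3, 2]))  # two independent local swaps
```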

Error summarization
Can be performed on different levels:
- keep a list of errors for every translated sentence
  - usable for examining errors sentence-by-sentence
- summarize the total number of errors, per category
  - apply part-of-speech tagging to classify content/functional words
  - present error numbers as a percentage of total words in ref/hyp
  - usable for overall system weakness comparison
- linear combination of the ratios of different error types: a score!
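The corpus-level summary and the linear-combination score might look like the sketch below. The data layout (a list of `(category, word)` pairs per sentence), the single `total_words` denominator, and the weights are assumptions for illustration; the slides only say that error counts are reported relative to the number of words in the reference or hypothesis.

```python
from collections import Counter


def error_rates(errors_per_sentence, total_words):
    """Percentage of errors per category, relative to `total_words`."""
    counts = Counter(
        category for errors in errors_per_sentence for category, _ in errors
    )
    return {category: 100.0 * n / total_words for category, n in counts.items()}


def combined_score(rates, weights):
    """Linear combination of per-category error rates (weights are hypothetical)."""
    return sum(weights.get(category, 0.0) * rate for category, rate in rates.items())


errors = [
    [("missing", "little"), ("extra", "too")],
    [("missing", "big")],
]
rates = error_rates(errors, total_words=20)
print(rates)
print(combined_score(rates, {"missing": 1.0, "extra": 0.5}))
```

In practice the weights would be tuned, for example to fit a development set.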

Summary
- Fast
- Inexpensive
- Language-independent, but can benefit from linguistic analysis

Meta-evaluation
- For scores: correlation with human judgements
- For analysis: precision/recall of error detection
- Both require manual labor
- Manual analysis requires a lot of labor

First results
- 2656 sentences from http://masintolge.ut.ee/ input, manually translated into English
- translated automatically with Google and 2 UT systems

              UT-Base   UT-Newer   Google
Missing       54.29%    41.52%     51.79%
Untranslated  10.08%     8.77%      2.40%
Extra         33.96%    38.77%     30.23%
Wrong form     2.40%     2.83%      3.05%
Misplaced      6.89%     7.09%      7.45%
Rho            0.905     0.904      0.921

Future work
- Improve alignment
- Structural order error detection, with syntactic analysis
- Perform meta-evaluation
- Scoring, tuning weights to fit dev set

Thank you!
