Machine Translation Proposal
Pilot Objective:
● Cost: PEMT 27% savings over human translation
● Efficiency: PEMT 25% faster than human translation
● Quality: an acceptable score under 30 according to the harmonized TAUS Dynamic Quality Framework (DQF) and Multidimensional Quality Metrics (MQM) error typology
Error Type                           Minor  Major  Critical
Accuracy
  omission                           1      2      3
  mistranslation                     1      2      3
  untranslated                       1      2      3
Terminology
  inconsistent with termbase         1      2      3
  inconsistent use of terminology    1      2      3
Fluency
  grammar                            1      2      3
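The severity weights in the typology (minor 1, major 2, critical 3) can be turned into a quality score along these lines. This is a minimal sketch that assumes the score is a plain sum of weights over one evaluated sample; the deck does not spell out the exact formula, and the function names and sample annotations are hypothetical.

```python
# Severity weights taken from the error typology: minor=1, major=2, critical=3.
SEVERITY_WEIGHTS = {"minor": 1, "major": 2, "critical": 3}

# Pilot objective: a score under 30 counts as acceptable.
ACCEPTABLE_BELOW = 30


def error_score(errors):
    """Sum the severity weights over (error_type, severity) annotations.

    Assumption: the score is an unnormalized sum over one sample.
    """
    return sum(SEVERITY_WEIGHTS[severity] for _error_type, severity in errors)


def is_acceptable(score):
    """Return True when the score is strictly below the pilot threshold."""
    return score < ACCEPTABLE_BELOW


# Hypothetical annotations for illustration only (not data from the pilot).
sample_errors = [
    ("mistranslation", "major"),
    ("omission", "minor"),
    ("grammar", "minor"),
    ("inconsistent with termbase", "critical"),
]

score = error_score(sample_errors)
print(score, is_acceptable(score))
```

A fractional score such as 31.5 could arise if the scores of the two post-editors are averaged, but that is also an assumption.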
Pilot Project Processes & Problems
Step 1: File Preparation
❖ Convert PDF to DOCX
❖ Delete unnecessary content
❖ Delete extra spaces
❖ Result: cleaner files and easier alignment
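Part of this cleanup can be scripted once the text has been extracted from the DOCX. The sketch below only covers the "extra space" part (collapsing whitespace and dropping empty lines); deciding which content is unnecessary remains a manual task. The function names and the sample string are hypothetical, and the pilot may well have done this by hand.

```python
import re


def clean_line(line):
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", line).strip()


def clean_text(text):
    """Clean every line and drop the lines that end up empty."""
    cleaned = (clean_line(line) for line in text.splitlines())
    return "\n".join(line for line in cleaned if line)


# Hypothetical extracted text with stray spaces, a tab, and blank lines.
raw = "  World   Economic\tOutlook  \n\n\n Growth   is  projected  to rise. "
print(clean_text(raw))
```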
Step 2: Testing and Training
❖ Round A: TMX only
❖ Round B: adding related PDFs
❖ Result: BLEU score increased
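For readers unfamiliar with BLEU, the sketch below shows what the metric measures: clipped n-gram precision (1-grams to 4-grams) combined with a brevity penalty. It is a simplified illustration with whitespace tokenization, one reference per segment, and no smoothing. The deck does not say which tool produced its BLEU scores, so numbers from this sketch are not comparable with the ones reported here. The example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clipping: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a missing n-gram order gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Hypothetical MT output and reference translation.
mt_output = ["the fund revised its forecast"]
reference = ["the fund revised its growth forecast"]
print(round(corpus_bleu(mt_output, reference), 2))
```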
Step 3: First Round of Human Evaluation
❖ 2 post-editors
❖ A sample of 1000 words extracted from one of the systems
❖ Analyzed the output and assigned the quality score before post-editing
❖ Average PE time: 53 minutes
Step 4: Tuning
❖ Failed to train the system
❖ Put them into the training data: it works :)
Step 5: Adding a Dictionary
❖ 512-page IMF glossary
❖ Converted from PDF to DOCX
❖ Cleaned up formats and terms
❖ Added it into the dictionary data
❖ Trained two systems
Problem: Time-consuming glossary clean-up
Problem: Failed to add the glossary into the dictionary data
Problem: Lower BLEU score after adding the glossary
Step 6: Final Round of Human Evaluation
❖ 2 post-editors
❖ A sample of 1000 words extracted from the system with the highest BLEU score (44.03)
❖ Analyzed the output and assigned the quality score before post-editing
❖ Average PE time: 0.68 hours (40.8 minutes)
Problem: Mistranslated and untranslated MT output due to incomplete manual cleanup
Pilot Project Results
85% Efficiency
71% Cost
HT:
❖ Translation: $0.12/word
❖ Editing: $0.05/word
PEMT:
❖ Post-editing: $0.05/word
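The 71% figure is consistent with a per-word comparison of the rates on this slide: human translation plus editing against post-editing alone. The sketch below reproduces that arithmetic. It assumes the MT system itself adds no per-word cost, since the slide lists none; the variable and function names are hypothetical.

```python
# Per-word rates in USD, as listed on the slide.
HT_TRANSLATION = 0.12
HT_EDITING = 0.05
PEMT_POST_EDITING = 0.05


def cost_savings(ht_rate, pemt_rate):
    """Fraction of the human-translation cost saved by post-editing MT."""
    return (ht_rate - pemt_rate) / ht_rate


# Assumption: the HT workflow pays for translation and editing,
# while the PEMT workflow pays for post-editing only.
ht_rate = HT_TRANSLATION + HT_EDITING
savings = cost_savings(ht_rate, PEMT_POST_EDITING)

print(f"HT: ${ht_rate:.2f}/word")
print(f"PEMT: ${PEMT_POST_EDITING:.2f}/word")
print(f"Savings: {savings:.0%}")
```

On these rates the saving is 0.12 / 0.17, which rounds to 71% and is well above the 27% objective.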
31.5 Quality
Comparison of two rounds of human evaluation
Quality                   First round   Final round
QA Error Score            49            31.5
PE time for 1000 words    53 mins       40.8 mins
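The change between the two evaluation rounds can be expressed as relative reductions. The sketch below uses only the figures reported in the deck (error scores 49 and 31.5, PE times of 53 minutes and 0.68 hours per 1000 words); the dictionary layout and function name are hypothetical.

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


first_round = {"error_score": 49, "pe_minutes": 53}
# Step 6 reports 0.68 hours, which is 40.8 minutes.
final_round = {"error_score": 31.5, "pe_minutes": 0.68 * 60}

score_drop = relative_reduction(
    first_round["error_score"], final_round["error_score"]
)
time_drop = relative_reduction(
    first_round["pe_minutes"], final_round["pe_minutes"]
)
# Post-editing throughput in the final round, in words per hour.
words_per_hour = 1000 / (final_round["pe_minutes"] / 60)

print(f"Error score reduced by {score_drop:.1%}")
print(f"PE time reduced by {time_drop:.1%}")
print(f"Final-round throughput: {words_per_hour:.0f} words/hour")
# The objective was a score under 30, so 31.5 is just short of it.
print("Meets quality objective:", final_round["error_score"] < 30)
```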
Lessons Learned
PDF formatting cleanup
When in doubt, check the system ➔ The system IS objective ➔ TMX IS better than PDF
Content relevance is key
➔ “Terminology” and “Fluency” performance improved
➔ Further data needs to be collected to assess accuracy
Thank you